-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update documentation for LightGBM and add missing binary references to console app. #452
Conversation
…d transform including native code to console command line binary.
<Project> | ||
|
||
|
||
<Target Name="CopyNativeAssemblies" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can move this up to a common Directory.Build.targets
instead of duplicating it with the test\Directory.Build.targets
. #Resolved
|
||
if (useLightGBM) | ||
{ | ||
Assert.True(predictions.ElementAt(0).Sentiment.IsTrue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong. Why would we get different predictions using different algorithms? Wouldn't we want to get the same prediction? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily, they are different algorithms and results can vary. Even evaluation metrics should be different. I would be very surprised if the evaluation metrics were the same.
In reply to: 199031606 [](ancestors = 199031606)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would myself be utterly shocked if the two algorithms yielded identical predictions.
In reply to: 199031723 [](ancestors = 199031723,199031606)
[TlcModule.EntryPoint(Name = "Trainers.LightGbmRanker", Desc = "Train an LightGBM ranking model", UserName = LightGbmRankingTrainer.Summary, ShortName = LightGbmRankingTrainer.ShortName)] | ||
[TlcModule.EntryPoint( | ||
Name = "Trainers.LightGbmRanker", | ||
Desc = "Train an LightGBM ranking model. This API requires Microsoft.ML.LightGBM nuget.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Train an LightGBM
=> Train a LightGBM
#Resolved
[TlcModule.EntryPoint(Name = "Trainers.LightGbmRanker", Desc = "Train an LightGBM ranking model", UserName = LightGbmRankingTrainer.Summary, ShortName = LightGbmRankingTrainer.ShortName)] | ||
[TlcModule.EntryPoint( | ||
Name = "Trainers.LightGbmRanker", | ||
Desc = "Train a LightGBM ranking model. This API requires Microsoft.ML.LightGBM nuget.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
requires Microsoft.ML.LightGBM nuget [](start = 61, length = 36)
Entry-points are not just for the C# API. (Indeed, in a few months they won't be for the C# API at all.) This description will also show up in the other language bindings. This note is specific to .NET, so it does not belong here.
Rather, you must make the change in the C# API itself. Exploit the fact that these are partial classes. If you must put, the specific logic in the C# API generator. This is not the right place for it. #Resolved
[TlcModule.EntryPoint(Name = "Trainers.LightGbmRegressor", Desc = LightGbmRegressorTrainer.Summary, UserName = LightGbmRegressorTrainer.UserNameValue, ShortName = LightGbmRegressorTrainer.ShortName)] | ||
[TlcModule.EntryPoint( | ||
Name = "Trainers.LightGbmRegressor", | ||
Desc = LightGbmRegressorTrainer.Summary + ". This API requires Microsoft.ML.LightGBM nuget.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API requires Microsoft.ML.LightGBM nuget.", [](start = 57, length = 48)
Same comment. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🕐
@@ -163,38 +175,71 @@ public void CrossValidateSentimentModelTest() | |||
Assert.True(predictions.ElementAt(1).Sentiment.IsTrue); | |||
} | |||
|
|||
private void ValidateBinaryMetrics(BinaryClassificationMetrics metrics) | |||
private void ValidateBinaryMetrics(BinaryClassificationMetrics metrics, bool useLightGBM = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
, bool useLightGBM = false [](start = 78, length = 26)
This is really weird design.
Why can't you just have two different methods? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test will go away after a month when the old API is removed.
In reply to: 199234671 [](ancestors = 199234671)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it go away, or be replaced with something calling the new API?
Are we saying we don't care about quality and comprehensibility of the tests over the existing API, because the plan is to do away with the existing API? Would that be a sensible plan given that we are still on the hook to support the old API?
I agree that this test looks "odd." It doesn't quite resemble any other test we have for learners, which will make the time when we want to migrate it to the new API harder to do. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test will have to be rewritten. There is no concept of Pipeline in the new API. I have just added an option to switch between LightGBM and FastTree trainer in a function that will need to be deleted. #Resolved
|
||
<NativeAssemblyReference Include="FastTreeNative" /> | ||
<NativeAssemblyReference Include="CpuMathNative" /> | ||
<NativeAssemblyReference Include="FactorizationMachineNative" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about LdaNative
? https://github.com/dotnet/machinelearning/tree/master/src/Native/LdaNative. Is that necessary? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. We will need that if someone uses that via command line.
In reply to: 199242696 [](ancestors = 199242696)
…to console # Conflicts: # test/BaselineOutput/Common/EntryPoints/core_manifest.json
using System.Text; | ||
// Licensed to the .NET Foundation under one or more agreements. | ||
// The .NET Foundation licenses this file to you under the MIT license. | ||
// See the LICENSE file in the project root for more information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's usually header first, using second. #Resolved
On the subject of fixes for LightGBM, I noticed in #392 there was a comment from @Ivanidzo4ka about the catalog not being updated. See here. Since you're fixing bugs or oversights in the initial checkin, now is also a good time to add the reference to LightGBM in that test project. |
Which test project are you referring to? This is already added to Microsoft.ML.Core.Tests. In reply to: 401545453 [](ancestors = 401545453) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LightGBM support looks very promising, are there any plans to leverage the GPU training features it has? |
The Microsoft CNTK nuget package is a fully self-contained package that includes the necessary GPU libraries, is there any reason the same couldn't be done here? Failing that, can we add a configuration parameter to allow for (GPU enabled) locally built versions of |
…o console app. (dotnet#452) * Add LightGBM to the entrypoint manifest and references to trainers and transform including native code to console command line binary. * Add documentation to use LightGBM nuget for LightGBM APIs.
fixes #450
fixes #451