-
Notifications
You must be signed in to change notification settings - Fork 842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: spark.executor.cores' default value based on master when counting workers #855
fix: spark.executor.cores' default value based on master when counting workers #855
Conversation
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
will need to test this out a bit since there have been a lot of issues in the past around this specific code section |
Codecov Report
@@ Coverage Diff @@
## master #855 +/- ##
==========================================
+ Coverage 84.56% 85.22% +0.66%
==========================================
Files 189 189
Lines 8709 8733 +24
Branches 544 543 -1
==========================================
+ Hits 7365 7443 +78
+ Misses 1344 1290 -54
Continue to review full report at Codecov.
|
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
the build was all green but it's not updating in github, weird, looks like some github bug |
@imatiach-msft I wrote new unittest and pushed. However, It is hard to tested mmlspark's unittest in my local environment . Please check it :) |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@imatiach-msft Can I see why the build failed? |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@ocworld the style checker is failing on: [error] /home/vsts/work/1/s/src/test/scala/com/microsoft/ml/spark/core/utils/VerifyClusterUtil.scala: Header does not match expected text It looks like you need to add this copyright message at the top of the file VerifyClusterUtil.scala to pass the style checker:
|
src/main/scala/com/microsoft/ml/spark/core/utils/ClusterUtil.scala
Outdated
Show resolved
Hide resolved
@ocworld I tried this PR out on a cluster (azure databricks) with Higgs dataset and it didn't change the training time for me (compared to previous runs), so it seems safe to me at least in that scenario. I think they use spark standalone mode for databricks clusters. |
@imatiach-msft Copyright is added in the file |
@imatiach-msft Thanks for your test |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
it looks like @mhamilton723 is required to review as a code owner (can't seem to merge), can you please review this PR @mhamilton723 ? |
This pr is fixing spark.executor.cores' default value based on master when counting workers.
The default value of spark.executor.cores is different as what master is.
It is 1 in YARN mode. In other case, It is all the available cores on the worker in standalone and Mesos coarse-grained modes.
(https://spark.apache.org/docs/latest/configuration.html)
Currently, I've tried to use mmlspark based on spark on k8s. When I did not set spark.executor.cores, error was occured. It is because not matched expected num of mmlspark workers with actual value.