-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Settings documentation #429
Settings documentation #429
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## smartsim-docs-refresh #429 +/- ##
=========================================================
+ Coverage 90.23% 90.42% +0.19%
=========================================================
Files 60 60
Lines 3859 3738 -121
=========================================================
- Hits 3482 3380 -102
+ Misses 377 358 -19 |
doc/run_settings.rst
Outdated
:widths: 25 55 25 | ||
:header-rows: 1 | ||
|
||
* - ``SrunSettings`` func |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice that we can use CSV files as the source of table information in the sphinx guide: https://sublime-and-sphinx-guide.readthedocs.io/en/latest/tables.html#csv-files
I wonder if we might benefit by generating code documentation and transforming to csv. That could ensure that the arguments are always up to date and documentation is updated automagically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So instead I am going to lead the user to the list of funcs in the SmartSimAPI instead of having a table here. I think it might make things less lengthy
|
||
.. code-block:: python | ||
|
||
exp = Experiment("name-of-experiment", launcher="local") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- We say we're going to use srun settings but the launcher is local. perhaps it should be
launcher="slurm"
- where is the allocation specified. the intro text states that it must be present. I know slurm looks for the
SLURM_JOB_ID
env var but i think we can also specify usingrun_settings.alloc = "12345" # slurm job id here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a .. note:: stating the env var lookup behavior w/WLMs and an example passing it to the run settings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am adding this information into a note!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of suggested changes. I think it would be good to also have @ashao to quickly look over the examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the a fast-follow review. I didn't realize RunSettings were in the same PR as BatchSettings. Like before, I think @ashao should double check these examples after revisions have been made to check for errors.
doc/run_settings.rst
Outdated
|
||
When the user initializes the ``Experiment`` at the beginning of the Python driver script, | ||
a `launcher` argument may be specified. SmartSim will register or detect the `launcher` and return the supported class | ||
upon a call to ``Experiment.create_run_settings()``. A user may also specify the launcher when initializing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that this is true: "A user may also specify the launcher when initializing the run settings object". create_run_settings
doesn't take in the launcher. It is true that the RunSettings
constructor has the launcher
parameter, but I'm not sure we want users calling the constructor directly. Are you seeing something different in other examples? (I see this through your example so we really need to verify)
If I end up being right, I think we should show Experiment
creation and then Experiment. create_run_settings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took this from the test_run_settings.py file where we pass in the launcher to create_run_settings() - but I agree that I shouldnt expose this since it will be specified by the user in Exp creation.
doc/run_settings.rst
Outdated
# Initialize a RunSettings object | ||
run_settings = exp.create_run_settings(launcher="local", "echo", exe_args="Hello World", run_command=None) | ||
# Set the number of nodes | ||
run_settings.set_nodes(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that these examples make a lot of sense. If something like mpiexec isn't being used, what does it mean to set_nodes and tasks_per_node, etc. Are we assuming mpiexec would be auto detected. I could be wrong, but I don't think it would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of creating a base run settings class without the run command specified and then further configuring the run settings object was to demonstrate how u have access to the run settings functions after init, I was thinking here that hello would be echoed 10 times, 5 times per node
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the local launcher doesn't really have a concept of nodes, or multiple tasks when run_command is None, does it? is it just ignoring these parameters at launch time? @al-rigazzi or @ashao can you chime in on this example? Maybe I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was very incorrect, they throw logger warning messages :O
doc/run_settings.rst
Outdated
|
||
The Slurm `launcher` supports the :ref:`SrunSettings API <srun_api>` as well as the :ref:`MpirunSettings API <openmpi_run_api>`, | ||
:ref:`MpiexecSettings API <openmpi_exec_api>` and :ref:`OrterunSettings API <openmpi_orte_api>` that each can be used to run executables | ||
with arbitrary launch binaries like `mpirun`, `mpiexec` and `orterun`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove "arbitrary"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just some changes and comments which add to what MattE wrote.
doc/run_settings.rst
Outdated
When the user initializes the ``Experiment`` at the beginning of the Python driver script, | ||
a `launcher` argument may be specified. SmartSim will register or detect the `launcher` and return the supported class | ||
upon a call to ``Experiment.create_run_settings()``. A user may also specify the launcher when initializing | ||
the run settings object. Below we demonstrate creating and configuring the base ``RunSettings`` object for local launches |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess "launches" -> "launchers", but if it's intentional, just disregard this comment (as it can still make sense)
doc/run_settings.rst
Outdated
When the user initializes the ``Experiment`` at the beginning of the Python driver script, | ||
a `launcher` argument may be specified. SmartSim will register or detect the `launcher` and return the supported class | ||
upon a call to ``Experiment.create_run_settings()``. A user may also specify the launcher when initializing | ||
the run settings object. Below we demonstrate creating and configuring the base ``RunSettings`` object for HPC launches |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above, do we want to write "launches"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of small changes and one question
doc/run_settings.rst
Outdated
# Initialize a RunSettings object | ||
run_settings = exp.create_run_settings(launcher="local", "echo", exe_args="Hello World", run_command=None) | ||
# Set the number of nodes | ||
run_settings.set_nodes(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the local launcher doesn't really have a concept of nodes, or multiple tasks when run_command is None, does it? is it just ignoring these parameters at launch time? @al-rigazzi or @ashao can you chime in on this example? Maybe I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very small typos but otherwise LGTM!
This PR merges in the run settings and batch settings page for the SmartSim API docs