Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Settings documentation #429

Conversation

amandarichardsonn
Copy link
Contributor

@amandarichardsonn amandarichardsonn commented Dec 4, 2023

This PR merges in the run settings and batch settings page for the SmartSim API docs

@amandarichardsonn amandarichardsonn changed the title Run Settings PR Run Settings documentation Dec 4, 2023
@amandarichardsonn amandarichardsonn marked this pull request as draft December 4, 2023 18:37
Copy link

codecov bot commented Dec 4, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (79e8ce5) 90.23% compared to head (47975f9) 90.42%.
Report is 8 commits behind head on smartsim-docs-refresh.

Additional details and impacted files

Impacted file tree graph

@@                    Coverage Diff                    @@
##           smartsim-docs-refresh     #429      +/-   ##
=========================================================
+ Coverage                  90.23%   90.42%   +0.19%     
=========================================================
  Files                         60       60              
  Lines                       3859     3738     -121     
=========================================================
- Hits                        3482     3380     -102     
+ Misses                       377      358      -19     

see 29 files with indirect coverage changes

@amandarichardsonn amandarichardsonn marked this pull request as ready for review December 5, 2023 05:20
@amandarichardsonn amandarichardsonn changed the title Run Settings documentation Settings documentation Dec 6, 2023
:widths: 25 55 25
:header-rows: 1

* - ``SrunSettings`` func
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that we can use CSV files as the source of table information in the sphinx guide: https://sublime-and-sphinx-guide.readthedocs.io/en/latest/tables.html#csv-files

I wonder if we might benefit by generating code documentation and transforming to csv. That could ensure that the arguments are always up to date and documentation is updated automagically

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So instead I am going to lead the user to the list of funcs in the SmartSimAPI instead of having a table here. I think it might make things less lengthy


.. code-block:: python

exp = Experiment("name-of-experiment", launcher="local")
Copy link
Contributor

@ankona ankona Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • We say we're going to use srun settings but the launcher is local. perhaps it should be launcher="slurm"
  • where is the allocation specified. the intro text states that it must be present. I know slurm looks for the SLURM_JOB_ID env var but i think we can also specify using run_settings.alloc = "12345" # slurm job id here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a .. note:: stating the env var lookup behavior w/WLMs and an example passing it to the run settings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am adding this information into a note!

Copy link
Contributor

@mellis13 mellis13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of suggested changes. I think it would be good to also have @ashao to quickly look over the examples.

Copy link
Contributor

@mellis13 mellis13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the a fast-follow review. I didn't realize RunSettings were in the same PR as BatchSettings. Like before, I think @ashao should double check these examples after revisions have been made to check for errors.


When the user initializes the ``Experiment`` at the beginning of the Python driver script,
a `launcher` argument may be specified. SmartSim will register or detect the `launcher` and return the supported class
upon a call to ``Experiment.create_run_settings()``. A user may also specify the launcher when initializing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that this is true: "A user may also specify the launcher when initializing the run settings object". create_run_settings doesn't take in the launcher. It is true that the RunSettings constructor has the launcher parameter, but I'm not sure we want users calling the constructor directly. Are you seeing something different in other examples? (I see this through your example so we really need to verify)

If I end up being right, I think we should show Experiment creation and then Experiment. create_run_settings

Copy link
Contributor Author

@amandarichardsonn amandarichardsonn Jan 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took this from the test_run_settings.py file where we pass in the launcher to create_run_settings() - but I agree that I shouldnt expose this since it will be specified by the user in Exp creation.

# Initialize a RunSettings object
run_settings = exp.create_run_settings(launcher="local", "echo", exe_args="Hello World", run_command=None)
# Set the number of nodes
run_settings.set_nodes(2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that these examples make a lot of sense. If something like mpiexec isn't being used, what does it mean to set_nodes and tasks_per_node, etc. Are we assuming mpiexec would be auto detected. I could be wrong, but I don't think it would be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of creating a base run settings class without the run command specified and then further configuring the run settings object was to demonstrate how u have access to the run settings functions after init, I was thinking here that hello would be echoed 10 times, 5 times per node

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the local launcher doesn't really have a concept of nodes, or multiple tasks when run_command is None, does it? is it just ignoring these parameters at launch time? @al-rigazzi or @ashao can you chime in on this example? Maybe I'm missing something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was very incorrect, they throw logger warning messages :O


The Slurm `launcher` supports the :ref:`SrunSettings API <srun_api>` as well as the :ref:`MpirunSettings API <openmpi_run_api>`,
:ref:`MpiexecSettings API <openmpi_exec_api>` and :ref:`OrterunSettings API <openmpi_orte_api>` that each can be used to run executables
with arbitrary launch binaries like `mpirun`, `mpiexec` and `orterun`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove "arbitrary"

Copy link
Collaborator

@al-rigazzi al-rigazzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just some changes and comments which add to what MattE wrote.

When the user initializes the ``Experiment`` at the beginning of the Python driver script,
a `launcher` argument may be specified. SmartSim will register or detect the `launcher` and return the supported class
upon a call to ``Experiment.create_run_settings()``. A user may also specify the launcher when initializing
the run settings object. Below we demonstrate creating and configuring the base ``RunSettings`` object for local launches
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess "launches" -> "launchers", but if it's intentional, just disregard this comment (as it can still make sense)

When the user initializes the ``Experiment`` at the beginning of the Python driver script,
a `launcher` argument may be specified. SmartSim will register or detect the `launcher` and return the supported class
upon a call to ``Experiment.create_run_settings()``. A user may also specify the launcher when initializing
the run settings object. Below we demonstrate creating and configuring the base ``RunSettings`` object for HPC launches
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above, do we want to write "launches"?

Copy link
Contributor

@mellis13 mellis13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of small changes and one question

# Initialize a RunSettings object
run_settings = exp.create_run_settings(launcher="local", "echo", exe_args="Hello World", run_command=None)
# Set the number of nodes
run_settings.set_nodes(2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the local launcher doesn't really have a concept of nodes, or multiple tasks when run_command is None, does it? is it just ignoring these parameters at launch time? @al-rigazzi or @ashao can you chime in on this example? Maybe I'm missing something.

Copy link
Contributor

@mellis13 mellis13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very small typos but otherwise LGTM!

@amandarichardsonn amandarichardsonn merged commit 5a74666 into CrayLabs:smartsim-docs-refresh Jan 12, 2024
@amandarichardsonn amandarichardsonn deleted the run_batch_settings branch January 12, 2024 00:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants