
Commit b5acabf

Update documentation (ufs-community#1398)
* update/condense Ch2 Code Overview, Ch2 links, formatting, add Glossary terms, etc.
1 parent df5da59 commit b5acabf

12 files changed (+1434 -959 lines)

doc/UsersGuide/source/AutomatedTesting.rst

+74 -82 lines changed
@@ -4,69 +4,71 @@
Automated Testing
*****************

The UFS Weather Model repository on GitHub employs two types of automated testing:

#. CI/CD (Continuous Integration/Continuous Development) testing on the cloud
#. AutoRT on NOAA R&D platforms

Both are application level tests and utilize the regression testing framework
discussed in :numref:`Section %s <UsingRegressionTest>`.

=====
CI/CD
=====

The UFS Weather Model (:term:`WM`) uses GitHub Actions (GHA), a GitHub-hosted continuous integration service,
to perform CI/CD testing. Build jobs are done on GHA-provided virtual machines. Test jobs are
performed on the Amazon Web Services (AWS) cloud platform using a number of EC2 instances.
Builds and tests are carried out in a Docker container. The container includes a pre-installed version of the
:term:`HPC-Stack`, which includes all prerequisite libraries. Input data needed to run the tests
are stored as a separate Docker container.

When a developer makes a pull request (PR) to the UFS WM repository, a code
manager may add the ``run-ci`` label, which triggers the CI/CD workflow.
The CI/CD workflow then executes the following steps:

#. A check is performed to make sure the UFS Weather Model and its first level
   subcomponents are up to date with the top of the ``develop`` branch (see the
   sketch after this list).

#. If the check is successful, build jobs are started on GHA-provided virtual machines
   by downloading the hpc-stack Docker container stored in Docker Hub.

#. Once all build jobs are successful, the created executable files are stored as
   artifacts in GHA.

#. A number of AWS EC2 instances are started.

#. Test jobs are started on AWS after downloading the hpc-stack Docker container,
   the executable file from the build job, and the input-data Docker container.

#. When all tests are complete, EC2 instances are stopped. Test results are reported
   on GitHub.

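The up-to-dateness check in the first step amounts to verifying that the head of the ``develop``
branch is already contained in the PR branch. A minimal sketch of that idea in Python, using
plain ``git`` commands through ``subprocess`` (the function below is hypothetical and for
illustration only; the repository's own scripts remain the authoritative implementation):

.. code-block:: python3

    import subprocess

    def branch_is_up_to_date(repo_dir, base="origin/develop", head="HEAD"):
        """Return True if `base` is an ancestor of `head`, i.e., the PR branch
        already contains the top of develop. Illustrative sketch only."""
        result = subprocess.run(
            ["git", "-C", repo_dir, "merge-base", "--is-ancestor", base, head]
        )
        # `git merge-base --is-ancestor` exits with code 0 when base is an ancestor of head.
        return result.returncode == 0
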
The GHA-related ``yaml`` scripts are located in the ``.github/workflows/`` directory.
``build_test.yml`` is the main workflow file, and ``aux.yml`` is an auxiliary
file responsible for (1) checking that the PR branch is up-to-date and
(2) starting/stopping the EC2 instances.

Other CI-related scripts are located in the ``tests/ci/`` directory. ``ci.sh`` is the main script that
invokes Docker build and run. ``Dockerfile`` is used to build the UFS Weather Model.
Other shell and python scripts help with various tasks. For example:

* ``repo_check.sh`` checks that the PR branch is up-to-date.
* ``check_status.py`` checks the status of EC2 instances (see the sketch below).
* ``setup.py`` and ``ci.test`` configure the test cases to execute in the CI/CD workflow.

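As an illustration of the kind of query ``check_status.py`` performs, the sketch below asks AWS
for the state of a set of EC2 instances through ``boto3``. The function name and layout are
hypothetical, not the actual script in ``tests/ci/``, and the example assumes AWS credentials
and a region are already configured in the environment:

.. code-block:: python3

    import boto3

    def instance_states(instance_ids, region="us-east-1"):
        """Map each EC2 instance ID to its current state name (e.g., 'running',
        'stopped'). Hypothetical sketch only; the region is an assumption."""
        ec2 = boto3.client("ec2", region_name=region)
        response = ec2.describe_instance_status(
            InstanceIds=instance_ids, IncludeAllInstances=True
        )
        return {
            status["InstanceId"]: status["InstanceState"]["Name"]
            for status in response["InstanceStatuses"]
        }
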
.. COMMENT: It sounds like aux.yml and repo_check.sh do the same thing... What's the difference?

=======
Auto RT
=======

The Automated Regression Testing (AutoRT) system is a python program that automates the process
of regression testing on NOAA HPC platforms.
It contains the files in :numref:`Table %s <autoRT-files>` below:

.. _autoRT-files:

.. table:: *Files for Automated Regression Testing (AutoRT) system*

   +-------------------+-----------------------------------------------------+

@@ -81,56 +83,46 @@ The Automated Regression Testing (AutoRT) system:

   | jobs/rt.py        | Functions for the regression test job               |
   +-------------------+-----------------------------------------------------+

-----------------
AutoRT Workflow
-----------------

On supported HPC systems, a :term:`cron job` runs the ``start_rt_auto.sh`` bash script every 15 minutes.
This script checks the HPC name and sets certain python paths. Then, it runs ``rt_auto.py``,
which uses the GitHub API (through pyGitHub) to check the labels on pull requests to
``ufs-weather-model``. If a PR label matches the HPC name
(e.g., ``hera-intel-RT`` or ``cheyenne-gnu-BL``), the label provides the HPC
with the compiler and job information to run a test or task on the machine.
If no PR label matches the HPC name, the script exits.

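The labels follow the pattern ``<machine>-<compiler>-<job>``. A minimal sketch of how such a
label could be matched against the local machine name (the function below is hypothetical;
the parsing in ``rt_auto.py`` itself may differ):

.. code-block:: python3

    def parse_label(label_name, machine):
        """Split a PR label such as 'gaea-intel-BL' into its parts and return
        (compiler, job) if it targets this machine, or None otherwise."""
        parts = label_name.split("-")
        if len(parts) != 3:
            return None                   # not a machine-compiler-job label
        label_machine, compiler, job = parts
        if label_machine.lower() != machine.lower():
            return None                   # label is meant for a different HPC
        return compiler, job              # e.g., ('intel', 'BL')

For instance, ``parse_label('hera-intel-RT', 'hera')`` would return ``('intel', 'RT')``, while
``parse_label('cheyenne-gnu-BL', 'hera')`` would return ``None``.
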
For example, a PR labeled ``gaea-intel-BL`` will be recognized by the HPC machine 'Gaea'.
It will set the ``RT_COMPILER`` variable to 'intel' and run the baseline creation script (``bl.py``).
This script creates a job class that contains all information from the machine that the job will need to run.
That information is sent into the ``jobs/rt[bl].py`` script.

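Conceptually, the job class can be pictured as a plain container for the machine, compiler, and
PR information that ``jobs/rt.py`` or ``jobs/bl.py`` needs. The sketch below is hypothetical; the
real class in ``rt_auto.py`` may carry different fields:

.. code-block:: python3

    from dataclasses import dataclass

    @dataclass
    class Job:
        """Hypothetical stand-in for the AutoRT job object."""
        machine: str      # e.g., 'gaea'
        compiler: str     # e.g., 'intel' (exported as RT_COMPILER)
        job_type: str     # 'RT' for regression test, 'BL' for baseline creation
        pr_number: int    # pull request that carried the triggering label
        workdir: str      # scratch directory on the HPC where the PR repo is cloned
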
``rt.py`` sets directories for storage, gets repo information, runs the regression test, and
completes any required post processing.

.. code-block:: python3

    def run(job_obj):
        logger = logging.getLogger('RT/RUN')
        workdir = set_directories(job_obj)
        branch, pr_repo_loc, repo_dir_str = clone_pr_repo(job_obj, workdir)
        run_regression_test(job_obj, pr_repo_loc)
        post_process(job_obj, pr_repo_loc, repo_dir_str, branch)

``bl.py`` is similar to ``rt.py``, but adds functionality to create baselines before running the
regression test.

.. code-block:: python3
    :emphasize-lines: 5,6,7

    def run(job_obj):
        logger = logging.getLogger('BL/RUN')
        workdir, rtbldir, blstore = set_directories(job_obj)
        pr_repo_loc, repo_dir_str = clone_pr_repo(job_obj, workdir)
        bldate = get_bl_date(job_obj, pr_repo_loc)
        bldir = f'{blstore}/develop-{bldate}/{job_obj.compiler.upper()}'
        bldirbool = check_for_bl_dir(bldir, job_obj)
        run_regression_test(job_obj, pr_repo_loc)
        post_process(job_obj, pr_repo_loc, repo_dir_str, rtbldir, bldir)
