Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logutils accumulator API #677

Merged
merged 7 commits into from
Aug 30, 2022
Merged

logutils accumulator API #677

merged 7 commits into from
Aug 30, 2022

Conversation

carns
Copy link
Contributor

@carns carns commented Mar 3, 2022

  • Adds a "darshan_accumulator" API to the logutils library
    • _create(), _inject(), _emit(), and _destroy()
    • generalizes the mechanism for producing summation records and derived metrics for sets of records from a given module
    • refactors darshan-parser to use new API
    • implements support for accumulators in POSIX, STDIO, and MPIIO modules
  • Integrates the µnit Testing Framework in darshan-util
    • implements unit tests for darshan_accumlator API; they can be invoked with make check at the top level build directory

Note that this does not include Python bindings; those need to be added separately after this is merged so that these routines can be used withing Python utilities.

Fixes #642

@carns
Copy link
Contributor Author

carns commented Mar 8, 2022

Pausing work on this PR until refactoring some underlying code. First step is in #684

@carns carns force-pushed the dev-issue-642-accumulator branch from b04294f to 3966283 Compare March 14, 2022 14:46
@carns carns force-pushed the dev-issue-642-accumulator branch from 74b52a5 to 1922d8a Compare April 4, 2022 19:45
@carns carns force-pushed the dev-issue-642-accumulator branch from 1922d8a to f3c0c5d Compare July 26, 2022 15:29
@carns
Copy link
Contributor Author

carns commented Jul 26, 2022

Current status:

  • rebased PR against origin/main following 3.4.0 release (clean, all tests pass)
  • munit testing in place
  • api stubbed out (see bottom of darshan-logutils.h)
  • implementation of record aggregation in place for mpi-io, posix, and stdio, with test cases (see darshan-logutils-accumulator.c)
  • backported fixes to agg_record functions to origin/main

TODO:

  • evaluate if any of the above can be factored into darshan-parser yet
  • implement initial metrics struct calculation for the three primary module types
  • pull logic from calc_perf, calc_file, and file_list functions in darshan-parser into accumulator

@carns
Copy link
Contributor Author

carns commented Jul 26, 2022

Next development target should probably be to replace the posix_accum_file() (and subsequently, stdio and mpiio versions of that same function) in darshan-parser. Right now it calls the agg_record functions directly and does some metric accumulation. It is used twice; once per unique file and once globally.

@carns
Copy link
Contributor Author

carns commented Jul 26, 2022

The primary complexity with replacing/refactoring the posix_accum_file() path is that it is used for two cases: once for summary metrics, and once for per-file metrics. The latter is (in part) used for the --file-list and --file-list-detailed options to darshan-parser.

I'm not sure how important these options are, but their existence means that (regardless of accumulator API capabilities) the darshan-parser has to continue to maintain it's own explicit hash table to track distinct files. If it weren't for those options then the hash table could be hidden within the accumulator implementation to produce derived values (like in the --file option) to greatly simplify the darshan-parser code base.

@carns
Copy link
Contributor Author

carns commented Jul 26, 2022

rebased atop the PR changes that remove --file-list and --file-list-detailed to take another look at how to replace the posix_accum_file() path.

@carns carns force-pushed the dev-issue-642-accumulator branch from 86ca5b1 to fb7cb17 Compare July 27, 2022 13:58
@carns
Copy link
Contributor Author

carns commented Jul 29, 2022

Accumulator API now implements functionality needed for --total and --file arguments to darshan-parser. Next steps:

  • integrate into darshan-parser
  • validate darshan-parser output
  • implement unit tests

After that, the same process will need to be done for the --perf capability.

@carns
Copy link
Contributor Author

carns commented Jul 29, 2022

Final steps are for performance metrics:

  • implement --perf functionality in accumulator
  • integrate into darshan-parser
  • validate darshan-parser output
  • implement unit tests

@carns carns force-pushed the dev-issue-642-accumulator branch from 7262e90 to d115e3e Compare August 17, 2022 17:39
@carns
Copy link
Contributor Author

carns commented Aug 18, 2022

Functionality complete. Need to make any necessary doc/changelog updates and squash.

- generalized, reusable functions for summing and deriving metrics from
  arbitrary sets of records within a module
- logutils plugins implemented for STDIO, POSIX, and MPIIO
- refactored darshan-parser accordingly and validated results for a
  range of logs
- integrated µnit Testing Framework
- implemented unit test for darshan_accumulator API at the C level
@carns carns force-pushed the dev-issue-642-accumulator branch from 8c05d3a to 0cdf237 Compare August 18, 2022 19:42
@carns carns changed the title WIP: logutils accumulator API logutils accumulator API Aug 18, 2022
carns and others added 6 commits August 30, 2022 09:53
- copy and paste from posix case where it was needed; not necessary here
- also improve clarity of comments describing them
@shanedsnyder
Copy link
Contributor

Looks good, thanks!

@shanedsnyder shanedsnyder merged commit 56efe4a into main Aug 30, 2022
tylerjereddy added a commit to tylerjereddy/darshan that referenced this pull request Oct 27, 2022
* turn on `make check` in CI to run the `darshan-util`
unit tests Phil added in darshan-hpcgh-677 -- they run very quickly
so I don't see why we shouldn't just run them

* we could eventually turn on `gcov` to show which lines
are being traced through in the C code, but that's a bit
more work to fuse into the Python cov report, so delaying
that for now...
tylerjereddy added a commit to tylerjereddy/darshan that referenced this pull request Oct 27, 2022
* early draft of Python/CFFI interface to
derived metrics/accumulators described in:
- darshan-hpcgh-642
- darshan-hpcgh-677

* for now this definitely doesn't work, and feels like I'm basically
reconstituting a C control flow in CFFI/Python instead of using a sensible
exposure point between C and Python to pull out populated structs from
a single entry point

* perhaps folks can just help me sort out the current issues noted
in the source changes rather than providing a convenient API, though
once thorough regression tests are in place that might be something
to consider in the future... (or even just maintaining it in
`pandas`/Python someday if the perf is ~similar)
tylerjereddy added a commit to tylerjereddy/darshan that referenced this pull request Nov 4, 2022
* early draft of Python/CFFI interface to
derived metrics/accumulators described in:
- darshan-hpcgh-642
- darshan-hpcgh-677

* for now this definitely doesn't work, and feels like I'm basically
reconstituting a C control flow in CFFI/Python instead of using a sensible
exposure point between C and Python to pull out populated structs from
a single entry point

* perhaps folks can just help me sort out the current issues noted
in the source changes rather than providing a convenient API, though
once thorough regression tests are in place that might be something
to consider in the future... (or even just maintaining it in
`pandas`/Python someday if the perf is ~similar)
tylerjereddy added a commit to tylerjereddy/darshan that referenced this pull request Nov 21, 2022
* early draft of Python/CFFI interface to
derived metrics/accumulators described in:
- darshan-hpcgh-642
- darshan-hpcgh-677

* for now this definitely doesn't work, and feels like I'm basically
reconstituting a C control flow in CFFI/Python instead of using a sensible
exposure point between C and Python to pull out populated structs from
a single entry point

* perhaps folks can just help me sort out the current issues noted
in the source changes rather than providing a convenient API, though
once thorough regression tests are in place that might be something
to consider in the future... (or even just maintaining it in
`pandas`/Python someday if the perf is ~similar)
tylerjereddy added a commit to tylerjereddy/darshan that referenced this pull request Nov 30, 2022
* early draft of Python/CFFI interface to
derived metrics/accumulators described in:
- darshan-hpcgh-642
- darshan-hpcgh-677

* for now this definitely doesn't work, and feels like I'm basically
reconstituting a C control flow in CFFI/Python instead of using a sensible
exposure point between C and Python to pull out populated structs from
a single entry point

* perhaps folks can just help me sort out the current issues noted
in the source changes rather than providing a convenient API, though
once thorough regression tests are in place that might be something
to consider in the future... (or even just maintaining it in
`pandas`/Python someday if the perf is ~similar)
tylerjereddy added a commit to tylerjereddy/darshan that referenced this pull request Dec 16, 2022
* early draft of Python/CFFI interface to
derived metrics/accumulators described in:
- darshan-hpcgh-642
- darshan-hpcgh-677

* for now this definitely doesn't work, and feels like I'm basically
reconstituting a C control flow in CFFI/Python instead of using a sensible
exposure point between C and Python to pull out populated structs from
a single entry point

* perhaps folks can just help me sort out the current issues noted
in the source changes rather than providing a convenient API, though
once thorough regression tests are in place that might be something
to consider in the future... (or even just maintaining it in
`pandas`/Python someday if the perf is ~similar)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

public logutils API for accumulated / derived metrics
2 participants