Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle memory limits reported from Docker as floats #159

Merged
merged 4 commits into from
Mar 9, 2022

Conversation

huddlej
Copy link
Contributor

@huddlej huddlej commented Mar 5, 2022

Context

The fix in #153 produced a new error for me with Docker 4.5.0 on OS X 10.14.6:

$ python3 -m pip install --upgrade nextstrain-cli
$ nextstrain --version
nextstrain.cli 3.1.1
$ cat ~/.nextstrain/config
[docker]
image = nextstrain/base:build-20220215T000459Z

[core]
runner = native
$ nextstrain check-setup --set-default
nextstrain-cli is up to date!

Testing your setup…
Traceback (most recent call last):
  File "/Users/jlhudd/miniconda3/envs/nextstrain/bin/nextstrain", line 8, in <module>
    sys.exit(main())
  File "/Users/jlhudd/miniconda3/envs/nextstrain/lib/python3.8/site-packages/nextstrain/cli/__main__.py", line 10, in main
    return cli.run( argv[1:] )
  File "/Users/jlhudd/miniconda3/envs/nextstrain/lib/python3.8/site-packages/nextstrain/cli/__init__.py", line 33, in run
    return opts.__command__.run(opts)
  File "/Users/jlhudd/miniconda3/envs/nextstrain/lib/python3.8/site-packages/nextstrain/cli/command/check_setup.py", line 63, in run
    runner_tests = [
  File "/Users/jlhudd/miniconda3/envs/nextstrain/lib/python3.8/site-packages/nextstrain/cli/command/check_setup.py", line 64, in <listcomp>
    (runner, runner.test_setup())
  File "/Users/jlhudd/miniconda3/envs/nextstrain/lib/python3.8/site-packages/nextstrain/cli/runner/docker.py", line 226, in test_setup
    *test_memory_limit(),
  File "/Users/jlhudd/miniconda3/envs/nextstrain/lib/python3.8/site-packages/nextstrain/cli/runner/docker.py", line 172, in test_memory_limit
    limit = min(filter(None, map(int_or_none, limits)))
ValueError: min() arg is an empty sequence

Inspecting the values of limits in pdb, I find:

ipdb> limits
['4.1257e+09', 'max']
ipdb> print(list(map(int_or_none, limits)))
[None, None]

Since the OS reports the memory limit with an exponential value, int cannot interpret the string and returns a ValueError. However, float can interpret this string, so int(float("4.1257e+09")) produces 4125700000, as expected.

After changing int_or_none accordingly, I ran the nextstrain command again (this time from a local dev installation with my changes).

$ nextstrain check-setup --set-default
$ cat ~/.nextstrain/config
[docker]
image = nextstrain/base:build-20220215T000459Z

[core]
runner = docker

And the command ran as expected.

Description of proposed changes

Fixes the issue described above on Mac OS X with Docker where check-setup --set-default fails when the memory limits reported from the Docker OS are in floating point values (e.g., "") that cannot be cast to integers by Python. This commit addresses the issue by first casting the string to float before casting to int.

Related issue(s)

Related to #153

Testing

  • Tested locally to confirm the issue was fixed for me
  • Tested by CI

Fixes an issue on Mac OS X with Docker where `check-setup --set-default`
fails when the memory limits reported from the Docker OS are in floating
point values (e.g., "") that cannot be cast to integers by Python. This
commit addresses the issue by first casting the string to float before
casting to int.

Related to #153
@huddlej huddlej requested a review from tsibley March 5, 2022 00:06
@tsibley
Copy link
Member

tsibley commented Mar 8, 2022

Ugh, sorry about that. Thanks for this fix. I think we should also address the underlying ValueError: min() arg is an empty sequence issue so it doesn't remain latent until a new reason causes it. Will push another commit to this branch.

@tsibley
Copy link
Member

tsibley commented Mar 8, 2022

I was incredulous that /proc/meminfo reported this in exponential notation, so I dug a bit deeper. Indeed, proc(7) documents MemTotal to be a long unsigned int (%lu), like most fields in this file. The culprit is awk (!):

nextstrain:/nextstrain/build $ awk '/^MemTotal:/ { print $2 * 1024 }' /proc/meminfo
1.657e+10
nextstrain:/nextstrain/build $ awk '/^MemTotal:/ { print $2 }' /proc/meminfo
16181632

Locally this doesn't happen, so presumably there's some newer/different version of awk in the container than I have or maybe some locale-dependent thing. But this as-documented:

The predefined variable OFMT contains the format specification that print uses with sprintf() when it wants to convert a number to a string for printing. The default value of OFMT is "%.6g".

The upshot is we could fix this by instead doing the right thing in awk:

printf "%lu\n", $2 * 1024

But int(float(…)) in Python is also fine.

@tsibley
Copy link
Member

tsibley commented Mar 8, 2022

printf "%lu\n", $2 * 1024

Nevermind! This hits the limit of the (signed) long int (2³¹ - 1), which I didn't notice earlier:

nextstrain:/nextstrain/build $ grep ^MemTotal: /proc/meminfo
MemTotal:       16181632 kB
nextstrain:/nextstrain/build $ awk '/^MemTotal:/ { print $2 }' /proc/meminfo
16181632
nextstrain:/nextstrain/build $ awk '/^MemTotal:/ { print $2 * 1024 }' /proc/meminfo
1.657e+10
nextstrain:/nextstrain/build $ awk '/^MemTotal:/ { printf("%lu\n", $2 * 1024) }' /proc/meminfo
2147483647
nextstrain:/nextstrain/build $ perl -anE 'say $F[1] * 1024 if /^MemTotal:/' /proc/meminfo
16569991168

Maybe worth ditching the vagaries of awk's numeric handling, even if the exponential notation it produces seems ok on the face of it? I dunno.

tsibley added 2 commits March 8, 2022 10:10
…k-setup

Previously if no memory limits could be found, then we effectively
called min([]), which is a ValueError.  @huddlej's previous commit
addressed an issue with exponential notation that could lead to no
memory limits being found, but it's still good to guard min() against
similar issues in the future.

Diff best viewed with whitespace ignored.
Helpful for troubleshooting to see what the found limit was even when
it's higher than 2 GiB.
@tsibley
Copy link
Member

tsibley commented Mar 8, 2022

@huddlej Pushed two additional commits addressing my other concern. Would you take a look?

@huddlej
Copy link
Contributor Author

huddlej commented Mar 8, 2022

Maybe worth ditching the vagaries of awk's numeric handling, even if the exponential notation it produces seems ok on the face of it?

Agreed about avoiding awk vagaries where possible. It's been ages since I used Perl; will it avoid creating exponential values for the conversion to bytes? The other solution that seems fine is to make two separate system calls to parse /proc/meminfo and the cgroups info and handle the math in Python.

Pushed two additional commits addressing my other concern. Would you take a look?

These changes look good to me! Also agree it's better to protect against the empty list in the future.

Mypy really doesn't like redefinitions¹, and we start with List[str] but
then convert to List[int].  The limited support for redefinition
(--allow-redefinition) doesn't apply here because two different code
blocks are involved.

This particular shift of code into the try block doesn't impact the
handling of exceptions.

¹ python/mypy#1174
@tsibley
Copy link
Member

tsibley commented Mar 9, 2022

Agreed about avoiding awk vagaries where possible. It's been ages since I used Perl; will it avoid creating exponential values for the conversion to bytes? The other solution that seems fine is to make two separate system calls to parse /proc/meminfo and the cgroups info and handle the math in Python.

I'm going to leave it as-is with awk for now to avoid accidentally breaking it another way. I didn't mean to suggest replacing awk with Perl here. Perl will generally avoid creating exponential notation… but I believe only up to a certain point (that's larger than awk's).

Two separate run_bash() calls aren't enticing because they're a bit slow, but yeah, that's an option. We could also run Python in the container instead and have it spit out the final limit. But I'll save one of those for another day and merge this now (once CI passes) to fix the immediate issue.

@tsibley tsibley merged commit ddefaa0 into master Mar 9, 2022
@tsibley tsibley deleted the handle-float-memory-limits branch March 9, 2022 21:25
@tsibley
Copy link
Member

tsibley commented Mar 9, 2022

@huddlej Released with 3.2.0, so you can install that and make check-setup work again!

@huddlej
Copy link
Contributor Author

huddlej commented Mar 11, 2022

@tsibley It works! Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

2 participants