-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Conda runtime to our CI integration tests #223
Conversation
Hmm. All the CI jobs for macOS and Linux seem to have stalled out during the new |
I've been debugging this more this morning. Re-running the GitHub Actions jobs hasn't worked. Issue is reliably reproducible every time. I tried reproducing locally first by setting
This unfortunately worked fine, although revealed that the slightly different Thinking about the issue a bit more, I thought maybe the issue was that in CI we're running
and then running:
The \o/ This ended up stalling out during "Linking openmpi…"! Process tree:
top showed PID 7510 pegging a CPU, so I straced it to see what it was doing:
It's waiting (in a thread 7511) for a The content of the shell wrapper script in the process tree is: export MAMBA_EXE='/tmp/stall/nextstrain/runtimes/conda/micromamba/bin/micromamba'
eval "$($MAMBA_EXE 'shell' 'hook' '-s' 'bash' '-p' '/tmp/stall/nextstrain/runtimes/conda/micromamba')"
micromamba activate "/tmp/stall/nextstrain/runtimes/conda/env"
. /tmp/stall/nextstrain/runtimes/conda/env/bin/.openmpi-post-link.sh Observations:
Will dig more after lunch. |
I think the thread waiting on SIGINT is part of libmamba's normal thread handling and not indicative of anything amiss. The thread that's blocked in The First, the wrapper script (
This shell hook is emitted and evaled just fine under all conditions I've tried. It's not the proximate issue. Next, the wrapper activates the prefix path with:
Here, Shell hook source# Copyright (C) 2012 Anaconda, Inc
# SPDX-License-Identifier: BSD-3-Clause
__add_sys_prefix_to_path() {
# In dev-mode MAMBA_EXE is python.exe and on Windows
# it is in a different relative location to condabin.
if [ -z "${MAMBA_ROOT_PREFIX}" ]; then
return 0
fi;
if [ -n "${WINDIR+x}" ]; then
PATH="${MAMBA_ROOT_PREFIX}/bin:${PATH}"
PATH="${MAMBA_ROOT_PREFIX}/Scripts:${PATH}"
PATH="${MAMBA_ROOT_PREFIX}/Library/bin:${PATH}"
PATH="${MAMBA_ROOT_PREFIX}/Library/usr/bin:${PATH}"
PATH="${MAMBA_ROOT_PREFIX}/Library/mingw-w64/bin:${PATH}"
PATH="${MAMBA_ROOT_PREFIX}:${PATH}"
else
PATH="${MAMBA_ROOT_PREFIX}/bin:${PATH}"
fi
\export PATH
}
__conda_hashr() {
if [ -n "${ZSH_VERSION:+x}" ]; then
\rehash
elif [ -n "${POSH_VERSION:+x}" ]; then
: # pass
else
\hash -r
fi
}
__mamba_activate() {
\local cmd="$1"
shift
\local ask_conda
CONDA_INTERNAL_OLDPATH="${PATH}"
__add_sys_prefix_to_path
ask_conda="$(PS1="$PS1" "/tmp/stall/nextstrain/runtimes/conda/micromamba/bin/micromamba" shell --shell bash "$cmd" "$@")" || \return $?
rc=$?
PATH="${CONDA_INTERNAL_OLDPATH}"
\eval "$ask_conda"
if [ $rc != 0 ]; then
\export PATH
fi
__conda_hashr
}
__mamba_reactivate() {
\local ask_conda
CONDA_INTERNAL_OLDPATH="${PATH}"
__add_sys_prefix_to_path
ask_conda="$(PS1="$PS1" "/tmp/stall/nextstrain/runtimes/conda/micromamba/bin/micromamba" shell --shell bash reactivate)" || \return $?
PATH="${CONDA_INTERNAL_OLDPATH}"
\eval "$ask_conda"
__conda_hashr
}
micromamba() {
if [ "$#" -lt 1 ]; then
"/tmp/stall/nextstrain/runtimes/conda/micromamba/bin/micromamba"
else
\local cmd="$1"
shift
case "$cmd" in
activate|deactivate)
__mamba_activate "$cmd" "$@"
;;
install|update|upgrade|remove|uninstall)
CONDA_INTERNAL_OLDPATH="${PATH}"
__add_sys_prefix_to_path
"/tmp/stall/nextstrain/runtimes/conda/micromamba/bin/micromamba" "$cmd" "$@"
\local t1=$?
PATH="${CONDA_INTERNAL_OLDPATH}"
if [ $t1 = 0 ]; then
__mamba_reactivate
else
return $t1
fi
;;
*)
CONDA_INTERNAL_OLDPATH="${PATH}"
__add_sys_prefix_to_path
"/tmp/stall/nextstrain/runtimes/conda/micromamba/bin/micromamba" "$cmd" "$@"
\local t1=$?
PATH="${CONDA_INTERNAL_OLDPATH}"
return $t1
;;
esac
fi
}
if [ -z "${CONDA_SHLVL+x}" ]; then
\export CONDA_SHLVL=0
# In dev-mode MAMBA_EXE is python.exe and on Windows
# it is in a different relative location to condabin.
if [ -n "${_CE_CONDA+x}" ] && [ -n "${WINDIR+x}" ]; then
PATH="${MAMBA_ROOT_PREFIX}/condabin:${PATH}"
else
PATH="${MAMBA_ROOT_PREFIX}/condabin:${PATH}"
fi
\export PATH
# We're not allowing PS1 to be unbound. It must at least be set.
# However, we're not exporting it, which can cause problems when starting a second shell
# via a first shell (i.e. starting zsh from bash).
if [ -z "${PS1+x}" ]; then
PS1=
fi
fi
if [ -n "${ZSH_VERSION:+x}" ]; then
if ! command -v compinit > /dev/null; then
autoload -U +X compinit && if [[ "${ZSH_DISABLE_COMPFIX-}" = true ]]; then
compinit -u
else
compinit
fi
fi
autoload -U +X bashcompinit && bashcompinit
_umamba_zsh_completions()
{
COMPREPLY=($($MAMBA_EXE completer "${(@s: :)${(@s: :)COMP_LINE}:1}"))
}
complete -o default -F _umamba_zsh_completions micromamba
fi
if [ -n "${BASH_VERSION:+x}" ]; then
_umamba_bash_completions()
{
COMPREPLY=($($MAMBA_EXE completer "${COMP_WORDS[@]:1}"))
}
complete -o default -F _umamba_bash_completions micromamba
fi That shell function ends up calling:
in order to capture output and Crucially, two things must be true for me to reproduce the spinning:
Wild. I tested unsetting all other Conda-related env vars I could think of/find and none else seemed to matter. Here's what's in the already-activated Conda env:
Confirming the issue is related to the prompt-modification code in libmamba, I can also set The possible code paths this implicates in libmamba/micromamba are a bit sprawling and complicated, as it's part of the intricate activation/deactivation/reactivation/env stacking/unstacking logic. I want a debugger attached and a backtrace, but I don't know how easy that will be. |
|
Backtrace lands right in the code I'd been looking at related to prompt modification, but a few function calls deeper than I'd gotten. Nice. I need to spend some more time with the debugger, but I'm pretty sure I found the infinite loop. I believe that this …leads to this implementation with and that ends up in an infinite loop never advancing |
Yep, infinite loop confirmed with a tiny C++ program exercising those same methods. Compile it with
|
The empty string is found at the start of the string, pos = 0, and replaced over and over again. If the replacement string is also empty, the size of the data string will not change. If the replacement string is not empty, it will be repeatedly prepended to the data string. While most callers are careful to ensure the search string is not empty, PosixActivator::update_prompt() lost this property in "Micromamba: Substitute env vars in .condarc (mamba-org#1423)" (9e5f8fd). Rather than restore the check in the caller, fix the root of the issue in replace_all() itself. I ran into this bug as a Micromamba user in a particular set of circumstances where the CONDA_PROMPT_MODIFIER env var was the empty string.¹ It manifested to me the user as a `micromamba create` invocation hanging at the linking phase for any package with a post-link script (such as openmpi). The generated post-link script wrapper invoked, indirectly, `micromamba shell --shell bash activate <path>` which was hanging during update_prompt(). ¹ nextstrain/cli#223 (comment)
Submitted a PR to fix libmamba. A little closer over there to understanding why In any case, my fix for our runtime here allowed CI to get past …and now test-dist jobs are failing during the zika-tutorial build with Progress! |
ed66080
to
d07fde6
Compare
I adjusted last week's fix to increase isolation.
These are now resolved. Recapping Slack this morning… So these were running into this Snakemake bug because the bug was only fixed in Snakemake ≥7.15.2 but we were getting 6.8.0 because that's the upper limit required by pangolin. The Pangolin Conda recipe also requires exactly tabulate 0.8.10 which should in theory avoid that bug on Snakemake 6.8.0, but during env resolution somehow tabulate 0.9.0 is getting installed instead. Ah! The pin to tabulate 0.8.10 was only added yesterday. So it works now. The issue was the pin was missing from pangolin's recipe last week. |
d07fde6
to
151de50
Compare
Repushed to rebase onto latest master given merge of #222 and then added a changelog commit. |
Splits up the big initial command block to setup runtimes into single-command steps so we can condition runtime setup on host OS support (instead of trying setup on all OSes and ignoring errors).
…ivated environments Filter out all CONDA_* and MAMBA_* env vars and set the root prefix and no rc option for descendant processes of our micromamba invocation. The CONDA_* vars are set during Conda environment activation and can impact the operation and behaviour of Micromamba. We don't need or want that impact, as we don't want to make use of any of the activation features of Conda/Mamba/Micromamba.
151de50
to
f69c8f8
Compare
Rebased onto latest master again for merge of #224. |
Going to merge this pre-review as it's important to be testing the Conda runtime in CI, esp. for other work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looked ugly! Glad you got it resolved. Post-merge approval 👍
Splits up the big initial command block to setup runtimes into single-command steps so we can condition runtime setup on host OS support (instead of trying setup on all OSes and ignoring errors).
Related issue(s)
Based on #222, because
nextstrain setup docker
is handy there.Testing