Failing submission to GPU partition in CI run #242
Comments
@laraPPr Can you easily obtain the contents of the job script, and the exact
I did some further digging, and the problem can't be with the job script itself, the test suite, or ReFrame: if I run ReFrame outside of the CI, the error does not occur and the test suite runs fine. So it must be something picked up from the CI environment that causes the error, but I haven't tracked it down yet.
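One way to narrow down what the CI environment picks up is to dump the environment in both places (for example with `env`) and compare the two dumps. The snippet below is only a minimal sketch of that comparison; `parse_env_dump` and `diff_env` are hypothetical helper names, and it assumes one `NAME=value` pair per line, so multi-line values such as exported shell functions would not be parsed correctly.

```python
def parse_env_dump(text):
    """Parse `env`-style output (one NAME=value per line) into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        # Skip lines that are not NAME=value (e.g. continuation lines).
        if sep and name:
            env[name] = value
    return env


def diff_env(ci_env, local_env):
    """Return (only in CI, only local, changed) for two environment dicts."""
    only_ci = {k: v for k, v in ci_env.items() if k not in local_env}
    only_local = {k: v for k, v in local_env.items() if k not in ci_env}
    changed = {
        k: (ci_env[k], local_env[k])
        for k in ci_env.keys() & local_env.keys()
        if ci_env[k] != local_env[k]
    }
    return only_ci, only_local, changed
```

Variables that show up only in the CI dump, or with a different value there, would be the first candidates to check against the GPU partition submission.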
Yes, you can find the job script in the stage directory and the run command in the logs, but the problem is not with either of those.
I tried to find out what the environment looks like right before the job gets submitted by ReFrame by adding this to the following file: reframe/reframe/core/schedulers/slurm.py.
But then it complained that there is no such file or directory named '>'. Maybe I should look at what ReFrame purges from the environment when
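The '>' complaint would be consistent with a shell redirection being passed as a literal argument to a command that is run without a shell, although that is a guess. A way to sidestep it is to write the environment to a file directly from Python. This is a minimal sketch; `dump_environment` is a hypothetical helper and the output path is an assumption, not something ReFrame provides.

```python
import os


def dump_environment(path, env=None):
    """Write NAME=value lines, sorted by name, to `path` and return `path`.

    Uses the current process environment when `env` is not given.
    """
    env = os.environ if env is None else env
    with open(path, "w") as fh:
        for name in sorted(env):
            fh.write(f"{name}={env[name]}\n")
    return path
```

Calling this right before the submission step, once inside the CI and once outside, would give two files that can be compared with `diff`.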
I'm able to reproduce it without ReFrame by doing this
As with ReFrame, it only causes trouble when submitting to the GPU partition and not to the CPU partition.
All rfm_job.sh scripts fail to be submitted by ReFrame on the GPU partitions, with the following error.
However, when I go to the stage directory and run the command that failed, the job gets submitted and runs. There must be something I'm missing that is set by ReFrame but is not in the job script or the launch command, and that causes the job not to get submitted. I've already investigated it a little and I can't find what the difference might be.
ReFrame version: 4.7.3 & 4.7.2
Test-suite version: 0.5.1