Commit

disable ap
sirutBuasai committed Feb 14, 2025
1 parent 136918c commit 50bfe82
Showing 3 changed files with 6 additions and 2 deletions.
2 changes: 1 addition & 1 deletion tensorflow/training/buildspec-2-18-sm.yml
@@ -5,7 +5,7 @@ framework: &FRAMEWORK tensorflow
version: &VERSION 2.18.0
short_version: &SHORT_VERSION "2.18"
arch_type: x86
-autopatch_build: "True"
+# autopatch_build: "True"

repository_info:
  training_repository: &TRAINING_REPOSITORY
1 change: 1 addition & 0 deletions tensorflow/training/docker/2.18/py3/cu125/Dockerfile.gpu
@@ -396,6 +396,7 @@ RUN rm -rf /tmp/*
# Copy workaround script for incorrect hostname
COPY start_cuda_compat.sh /usr/local/bin/start_cuda_compat.sh
COPY dockerd-entrypoint.py /usr/local/bin/dockerd-entrypoint.py
RUN chmod +x /usr/local/bin/start_cuda_compat.sh
RUN chmod +x /usr/local/bin/dockerd-entrypoint.sh

RUN HOME_DIR=/root \
@@ -19,7 +19,10 @@
import subprocess

# run compat mounting by default
-subprocess.run(["bash", "-m", "/usr/local/bin/start_cuda_compat.sh"])
+try:
+    subprocess.run(["bash", "-m", "/usr/local/bin/start_cuda_compat.sh"])
+except Exception as e:
+    print(f"Error running script: {e}")

if not os.path.exists("/opt/ml/input/config"):
    subprocess.call(["python", "/usr/local/bin/deep_learning_container.py", "&>/dev/null", "&"])
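
A note on the entrypoint change, offered as a sketch and not as the project's implementation. `subprocess.run` without `check=True` does not raise when the script exits with a non-zero status, so the `try`/`except` above would mainly catch failures to launch `bash` itself. Likewise, when a command is given as a list, no shell is involved, so `"&>/dev/null"` and `"&"` are passed to the program as literal arguments. The helper names `run_script` and `start_in_background` below are hypothetical and do not appear in the commit.

```python
import subprocess
import sys


def run_script(cmd):
    """Run cmd and return True on success, False on any failure.

    check=True turns a non-zero exit status into CalledProcessError.
    Without it, subprocess.run only raises when the program cannot be
    started at all (for example, a missing executable).
    """
    try:
        subprocess.run(cmd, check=True)
    except OSError as e:
        # The executable could not be launched.
        print(f"Could not start {cmd[0]}: {e}", file=sys.stderr)
        return False
    except subprocess.CalledProcessError as e:
        # The program ran but reported failure.
        print(f"Command exited with status {e.returncode}", file=sys.stderr)
        return False
    return True


def start_in_background(cmd):
    """Start cmd without waiting for it, discarding stdout and stderr.

    This is the list-argv counterpart of the shell idiom
    `cmd &>/dev/null &`.
    """
    return subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

Under these assumptions, the entrypoint could call `run_script(["bash", "-m", "/usr/local/bin/start_cuda_compat.sh"])` and continue regardless of the result, which keeps the commit's intent of not aborting container startup when the compat script fails.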