diff --git a/docker/README.md b/docker/README.md
index 0c8cb0ed9a..0a39b7a496 100644
--- a/docker/README.md
+++ b/docker/README.md
@@ -1,24 +1,114 @@
 # icefall dockerfile
 
-We provide a dockerfile for some users, the configuration of dockerfile is : Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8-python3.8. You can use the dockerfile by following the steps:
+Two configurations are provided: (a) Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8, and (b) Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8.
+
+If your NVIDIA driver supports CUDA 11.3 or later, use case (a) Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8.
+
+Otherwise, since the older PyTorch images are not updated with the [apt-key rotation by NVIDIA](https://developer.nvidia.com/blog/updating-the-cuda-linux-gpg-repository-key), you have to use case (b) Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8. Ensure that your NVIDIA driver supports at least CUDA 11.0.
+
+You can check the highest CUDA version that your NVIDIA driver supports with the `nvidia-smi` command below. In this example, the highest CUDA version is 11.0, so case (b) Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8 applies.
+
+```bash
+$ nvidia-smi
+Tue Sep 20 00:26:13 2022
++-----------------------------------------------------------------------------+
+| NVIDIA-SMI 450.119.03   Driver Version: 450.119.03   CUDA Version: 11.0     |
+|-------------------------------+----------------------+----------------------+
+| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
+| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
+|                               |                      |               MIG M. |
+|===============================+======================+======================|
+|   0  TITAN RTX           On   | 00000000:03:00.0 Off |                  N/A |
+| 41%   31C    P8     4W / 280W |     16MiB / 24219MiB |      0%      Default |
+|                               |                      |                  N/A |
++-------------------------------+----------------------+----------------------+
+|   1  TITAN RTX           On   | 00000000:04:00.0 Off |                  N/A |
+| 41%   30C    P8    11W / 280W |      6MiB / 24220MiB |      0%      Default |
+|                               |                      |                  N/A |
++-------------------------------+----------------------+----------------------+
+
++-----------------------------------------------------------------------------+
+| Processes:                                                                  |
+|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
+|        ID   ID                                                   Usage      |
+|=============================================================================|
+|    0   N/A  N/A      2085      G   /usr/lib/xorg/Xorg                  9MiB |
+|    0   N/A  N/A      2240      G   /usr/bin/gnome-shell                4MiB |
+|    1   N/A  N/A      2085      G   /usr/lib/xorg/Xorg                  4MiB |
++-----------------------------------------------------------------------------+
+
+```
 
 ## Building images locally
+If your environment requires a proxy to access the Internet, remember to add that information to the Dockerfile directly.
+In most cases, you can uncomment these lines in the Dockerfile and fill in your proxy details.
+
+```dockerfile
+ENV http_proxy=http://aaa.bb.cc.net:8080 \
+    https_proxy=http://aaa.bb.cc.net:8080
+```
+Then, proceed with these commands.
+
+### If you are case (a), i.e. your NVIDIA driver supports CUDA version >= 11.3:
+
+```bash
+cd docker/Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8
+docker build -t icefall/pytorch1.12.1 .
+```
+
+### If you are case (b), i.e. your NVIDIA driver can only support CUDA versions 11.0 <= x < 11.3:
 
 ```bash
 cd docker/Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8
-docker build -t icefall/pytorch1.7.1:latest -f ./Dockerfile ./
+docker build -t icefall/pytorch1.7.1 .
 ```
 
-## Using built images
-Sample usage of the GPU based images:
+## Running your built local image
+Sample usage of the GPU based images. These commands are written with case (a) in mind, so please change the image name accordingly if you are case (b).
 Note: use [nvidia-docker](https://github.com/NVIDIA/nvidia-docker) to run the GPU images.
 
 ```bash
-docker run -it --runtime=nvidia --name=icefall_username --gpus all icefall/pytorch1.7.1:latest
+docker run -it --runtime=nvidia --shm-size=2gb --name=icefall --gpus all icefall/pytorch1.12.1
 ```
 
-Sample usage of the CPU based images:
+### Tips:
+1. Since your data and models most probably won't be inside the container, you must use the -v flag to mount a directory from the host machine. Do this by specifying `-v {/path/in/host/machine}:{/path/in/docker}`.
+
+2. Also, if your environment requires a proxy, this would be a good time to add it in too: `-e http_proxy=http://aaa.bb.cc.net:8080 -e https_proxy=http://aaa.bb.cc.net:8080`.
+
+Overall, your docker run command should look like this.
+
+```bash
+docker run -it --runtime=nvidia --shm-size=2gb --name=icefall --gpus all -v {/path/in/host/machine}:{/path/in/docker} -e http_proxy=http://aaa.bb.cc.net:8080 -e https_proxy=http://aaa.bb.cc.net:8080 icefall/pytorch1.12.1
+```
+
+You can explore more docker run options [here](https://docs.docker.com/engine/reference/commandline/run/) to suit your environment.
+
+### Linking to icefall in your host machine
+
+If you already have icefall downloaded onto your host machine, you can use that repository instead so that changes in your code are visible inside and outside of the container.
+
+Note: Remember to set the -v flag above during the first run of the container, as that is the only way for your container to access your host machine.
+Warning: Check that the icefall in your host machine is visible from within your container before proceeding to the commands below.
+
+Use these commands once you are inside the container.
+
+```bash
+rm -r /workspace/icefall
+ln -s {/path/in/docker/to/icefall} /workspace/icefall
+```
+
+## Starting another session in the same running container.
+```bash
+docker exec -it icefall /bin/bash
+```
+
+## Restarting a killed container that has been run before.
+```bash
+docker start -ai icefall
+```
+## Sample usage of the CPU based images:
 
 ```bash
-docker run -it icefall/pytorch1.7.1:latest /bin/bash
-```
\ No newline at end of file
+docker run -it icefall/pytorch1.12.1 /bin/bash
+```
diff --git a/docker/Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8/Dockerfile b/docker/Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8/Dockerfile
new file mode 100644
index 0000000000..db4dda8647
--- /dev/null
+++ b/docker/Ubuntu18.04-pytorch1.12.1-cuda11.3-cudnn8/Dockerfile
@@ -0,0 +1,72 @@
+FROM pytorch/pytorch:1.12.1-cuda11.3-cudnn8-devel
+
+# ENV http_proxy=http://aaa.bbb.cc.net:8080 \
+#     https_proxy=http://aaa.bbb.cc.net:8080
+
+# install normal source
+RUN apt-get update && \
+    apt-get install -y --no-install-recommends \
+        g++ \
+        make \
+        automake \
+        autoconf \
+        bzip2 \
+        unzip \
+        wget \
+        sox \
+        libtool \
+        git \
+        subversion \
+        zlib1g-dev \
+        gfortran \
+        ca-certificates \
+        patch \
+        ffmpeg \
+        valgrind \
+        libssl-dev \
+        vim \
+        curl
+
+# cmake
+RUN wget -P /opt https://cmake.org/files/v3.18/cmake-3.18.0.tar.gz && \
+    cd /opt && \
+    tar -zxvf cmake-3.18.0.tar.gz && \
+    cd cmake-3.18.0 && \
+    ./bootstrap && \
+    make && \
+    make install && \
+    rm -rf cmake-3.18.0.tar.gz && \
+    find /opt/cmake-3.18.0 -type f \( -name "*.o" -o -name "*.la" -o -name "*.a" \) -exec rm {} \; && \
+    cd -
+
+# flac
+RUN wget -P /opt https://downloads.xiph.org/releases/flac/flac-1.3.2.tar.xz && \
+    cd /opt && \
+    xz -d flac-1.3.2.tar.xz && \
+    tar -xvf flac-1.3.2.tar && \
+    cd flac-1.3.2 && \
+    ./configure && \
+    make && make install && \
+    rm -rf flac-1.3.2.tar && \
+    find /opt/flac-1.3.2 -type f \( -name "*.o" -o -name "*.la" -o -name "*.a" \) -exec rm {} \; && \
+    cd -
+
+RUN pip install kaldiio graphviz && \
+    conda install -y -c pytorch torchaudio
+
+#install k2 from source
+RUN git clone https://github.com/k2-fsa/k2.git /opt/k2 && \
+    cd /opt/k2 && \
+    python3 setup.py install && \
+    cd -
+
+# install lhotse
+RUN pip install git+https://github.com/lhotse-speech/lhotse
+
+RUN git clone https://github.com/k2-fsa/icefall /workspace/icefall && \
+    cd /workspace/icefall && \
+    pip install -r requirements.txt
+
+ENV PYTHONPATH /workspace/icefall:$PYTHONPATH
+
+WORKDIR /workspace/icefall
\ No newline at end of file
diff --git a/docker/Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8/Dockerfile b/docker/Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8/Dockerfile
index 746c2c4f3c..7a14a00ad8 100644
--- a/docker/Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8/Dockerfile
+++ b/docker/Ubuntu18.04-pytorch1.7.1-cuda11.0-cudnn8/Dockerfile
@@ -1,7 +1,13 @@
 FROM pytorch/pytorch:1.7.1-cuda11.0-cudnn8-devel
 
-# install normal source
+# ENV http_proxy=http://aaa.bbb.cc.net:8080 \
+#     https_proxy=http://aaa.bbb.cc.net:8080
 
+RUN rm /etc/apt/sources.list.d/cuda.list && \
+    rm /etc/apt/sources.list.d/nvidia-ml.list && \
+    apt-key del 7fa2af80
+
+# install normal source
 RUN apt-get update && \
     apt-get install -y --no-install-recommends \
         g++ \
@@ -21,20 +27,25 @@ RUN apt-get update && \
         patch \
         ffmpeg \
         valgrind \
-        libssl-dev \
-        vim && \
-    rm -rf /var/lib/apt/lists/*
-
-
-RUN mv /opt/conda/lib/libcufft.so.10 /opt/libcufft.so.10.bak && \
+        libssl-dev \
+        vim \
+        curl
+
+# Add new keys and reupdate
+RUN curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub | apt-key add - && \
+    curl -fsSL https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub | apt-key add - && \
+    echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list && \
+    echo "deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/nvidia-ml.list && \
+    rm -rf /var/lib/apt/lists/* && \
+    mv /opt/conda/lib/libcufft.so.10 /opt/libcufft.so.10.bak && \
     mv /opt/conda/lib/libcurand.so.10 /opt/libcurand.so.10.bak && \
     mv /opt/conda/lib/libcublas.so.11 /opt/libcublas.so.11.bak && \
     mv /opt/conda/lib/libnvrtc.so.11.0 /opt/libnvrtc.so.11.1.bak && \
-    mv /opt/conda/lib/libnvToolsExt.so.1 /opt/libnvToolsExt.so.1.bak && \
-    mv /opt/conda/lib/libcudart.so.11.0 /opt/libcudart.so.11.0.bak
+    # mv /opt/conda/lib/libnvToolsExt.so.1 /opt/libnvToolsExt.so.1.bak && \
+    mv /opt/conda/lib/libcudart.so.11.0 /opt/libcudart.so.11.0.bak && \
+    apt-get update && apt-get -y upgrade
 
 # cmake
-
 RUN wget -P /opt https://cmake.org/files/v3.18/cmake-3.18.0.tar.gz && \
     cd /opt && \
     tar -zxvf cmake-3.18.0.tar.gz && \
@@ -45,11 +56,7 @@ RUN wget -P /opt https://cmake.org/files/v3.18/cmake-3.18.0.tar.gz && \
     rm -rf cmake-3.18.0.tar.gz && \
     find /opt/cmake-3.18.0 -type f \( -name "*.o" -o -name "*.la" -o -name "*.a" \) -exec rm {} \; && \
     cd -
-
-#kaldiio
-
-RUN pip install kaldiio
-
+
 # flac
 RUN wget -P /opt https://downloads.xiph.org/releases/flac/flac-1.3.2.tar.xz && \
     cd /opt && \
@@ -62,15 +69,8 @@ RUN wget -P /opt https://downloads.xiph.org/releases/flac/flac-1.3.2.tar.xz &&
     find /opt/flac-1.3.2 -type f \( -name "*.o" -o -name "*.la" -o -name "*.a" \) -exec rm {} \; && \
     cd -
 
-# graphviz
-RUN pip install graphviz
-
-# kaldifeat
-RUN git clone https://github.com/csukuangfj/kaldifeat.git /opt/kaldifeat && \
-    cd /opt/kaldifeat && \
-    python setup.py install && \
-    cd -
-
+RUN pip install kaldiio graphviz && \
+    conda install -y -c pytorch torchaudio=0.7.1
 
 #install k2 from source
 RUN git clone https://github.com/k2-fsa/k2.git /opt/k2 && \
@@ -79,14 +79,13 @@ RUN git clone https://github.com/k2-fsa/k2.git /opt/k2 && \
     cd -
 
 # install lhotse
-RUN pip install torchaudio==0.7.2
-RUN pip install git+https://github.com/lhotse-speech/lhotse
-#RUN pip install lhotse
+RUN pip install git+https://github.com/lhotse-speech/lhotse
+
+RUN git clone https://github.com/k2-fsa/icefall /workspace/icefall && \
+    cd /workspace/icefall && \
+    pip install -r requirements.txt
+
+ENV PYTHONPATH /workspace/icefall:$PYTHONPATH
 
-# install icefall
-RUN git clone https://github.com/k2-fsa/icefall && \
-    cd icefall && \
-    pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
-
-ENV PYTHONPATH /workspace/icefall:$PYTHONPATH
+WORKDIR /workspace/icefall