Failed to build TensorRT-LLM: error related to building ucxx #2826

Open
tjliupeng opened this issue Feb 26, 2025 · 1 comment
Labels
bug (Something isn't working) · triaged (Issue has been triaged by maintainers)

Comments

@tjliupeng

System Info

I am trying to build TensorRT-LLM on x86. Below is the host information:
OS: Ubuntu 20.04
CPU: x86-64
Docker: 24.0.2
GPU Driver: 535.183.01
CUDA Version: 12.2

I followed the guide at https://nvidia.github.io/TensorRT-LLM/installation/build-from-source-linux.html.

I cloned the TensorRT-LLM git repo, used the main branch, and ran the command: make -C docker release_build

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I followed the guide at https://nvidia.github.io/TensorRT-LLM/installation/build-from-source-linux.html.

I cloned the TensorRT-LLM git repo, used the main branch, and ran the command:

make -C docker release_build
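For reference, a minimal sketch of the full sequence I ran (the repository URL is the standard NVIDIA GitHub location and is assumed here; the rest restates the steps above):

  # clone the repository and stay on the main branch
  git clone https://github.com/NVIDIA/TensorRT-LLM.git
  cd TensorRT-LLM
  # start the containerized release build described in the guide
  make -C docker release_build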

Expected behavior

The build completes successfully.

Actual behavior

After a long installation, the build fails with the error below:

#0 1015.1 ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
#0 1015.1 dask-cuda 24.10.0 requires pynvml<11.5,>=11.0.0, but you have pynvml 12.0.0 which is incompatible.
#0 1015.1 Successfully installed StrEnum-0.4.15 accelerate-1.4.0 aenum-3.1.15 bandit-1.7.7 build-1.2.2.post1 cfgv-3.4.0 click_option_group-0.5.6 colored-2.3.0 coverage-7.6.12 datasets-2.19.2 diffusers-0.32.2 dill-0.3.8 distlib-0.3.9 distro-1.9.0 evaluate-0.4.3 fastapi-0.115.4 flashinfer-python-0.2.2 fsspec-2024.3.1 graphviz-0.20.3 h5py-3.12.1 huggingface-hub-0.29.1 identify-2.6.8 jieba-0.42.1 jiter-0.8.2 jsonlines-4.0.0 lark-1.2.2 mako-1.3.9 multiprocess-0.70.16 mypy-1.15.0 nltk-3.9.1 nodeenv-1.9.1 nvidia-cuda-nvrtc-cu12-12.8.61 nvidia-ml-py-12.570.86 nvidia-modelopt-0.23.2 nvidia-modelopt-core-0.23.2 nvidia-nccl-cu12-2.25.1 onnx_graphsurgeon-0.5.5 openai-1.64.0 optimum-1.24.0 ordered-set-4.1.0 parameterized-0.9.0 pbr-6.1.1 peft-0.14.0 pillow-10.3.0 pre-commit-4.1.0 py-1.11.0 pyarrow-hotfix-0.6 pybind11-stubgen-2.5.3 pynvml-12.0.0 pyproject_hooks-1.2.0 pytest-8.3.4 pytest-asyncio-0.25.3 pytest-cov-6.0.0 pytest-csv-3.0.0 pytest-forked-1.6.0 pytest-split-0.10.0 pytest-timeout-2.3.1 rouge-1.0.1 rouge_score-0.1.2 sentencepiece-0.2.0 starlette-0.41.3 stevedore-5.4.1 tokenizers-0.21.0 transformers-4.47.1 uvicorn-0.34.0 virtualenv-20.29.2 xxhash-3.5.0
#0 1015.1 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
#0 1016.2
#0 1016.2 [notice] A new release of pip is available: 24.3.1 -> 25.0.1
#0 1016.2 [notice] To update, run: python3 -m pip install --upgrade pip
#0 1017.4 DEPRECATION: Loading egg at /usr/local/lib/python3.12/dist-packages/nvfuser-0.2.23a0+6627725-py3.12-linux-x86_64.egg is deprecated. pip 25.1 will enforce this behaviour change. A possible replacement is to use pip for package installation. Discussion can be found at pypa/pip#12330
#0 1017.4 DEPRECATION: Loading egg at /usr/local/lib/python3.12/dist-packages/opt_einsum-3.4.0-py3.12.egg is deprecated. pip 25.1 will enforce this behaviour change. A possible replacement is to use pip for package installation. Discussion can be found at pypa/pip#12330
#0 1017.4 DEPRECATION: Loading egg at /usr/local/lib/python3.12/dist-packages/looseversion-1.3.0-py3.12.egg is deprecated. pip 25.1 will enforce this behaviour change. A possible replacement is to use pip for package installation. Discussion can be found at pypa/pip#12330
#0 1017.4 DEPRECATION: Loading egg at /usr/local/lib/python3.12/dist-packages/lightning_utilities-0.12.0.dev0-py3.12.egg is deprecated. pip 25.1 will enforce this behaviour change. A possible replacement is to use pip for package installation. Discussion can be found at pypa/pip#12330
#0 1017.4 DEPRECATION: Loading egg at /usr/local/lib/python3.12/dist-packages/lightning_thunder-0.2.0.dev0-py3.12.egg is deprecated. pip 25.1 will enforce this behaviour change. A possible replacement is to use pip for package installation. Discussion can be found at pypa/pip#12330
#0 1018.1 -- The CXX compiler identification is GNU 13.3.0
#0 1018.1 -- Detecting CXX compiler ABI info
#0 1018.2 -- Detecting CXX compiler ABI info - done
#0 1018.3 -- Check for working CXX compiler: /usr/bin/c++ - skipped
#0 1018.3 -- Detecting CXX compile features
#0 1018.3 -- Detecting CXX compile features - done
#0 1018.3 -- NVTX is disabled
#0 1018.3 -- Importing batch manager
#0 1018.3 -- Importing executor
#0 1018.3 -- Importing nvrtc wrapper
#0 1018.3 -- Importing internal cutlass kernels
#0 1018.3 -- Building PyTorch
#0 1018.3 -- Building Google tests
#0 1018.3 -- Building benchmarks
#0 1018.3 -- Not building C++ micro benchmarks
#0 1018.4 -- TensorRT-LLM version: 0.18.0.dev2025021800
#0 1018.4 -- Looking for a CUDA compiler
#0 1021.4 -- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
#0 1021.4 -- CUDA compiler: /usr/local/cuda/bin/nvcc
#0 1021.5 -- GPU architectures: 80-real;86-real;89-real;90-real;100-real;120-real
#0 1021.6 -- The C compiler identification is GNU 13.3.0
#0 1023.2 -- The CUDA compiler identification is NVIDIA 12.8.61 with host compiler GNU 13.3.0
#0 1023.2 -- Detecting C compiler ABI info
#0 1023.3 -- Detecting C compiler ABI info - done
#0 1023.4 -- Check for working C compiler: /usr/bin/cc - skipped
#0 1023.4 -- Detecting C compile features
#0 1023.4 -- Detecting C compile features - done
#0 1023.4 -- Detecting CUDA compiler ABI info
#0 1027.2 -- Detecting CUDA compiler ABI info - done
#0 1027.4 -- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
#0 1027.4 -- Detecting CUDA compile features
#0 1027.4 -- Detecting CUDA compile features - done
#0 1027.4 -- Found CUDAToolkit: /usr/local/cuda/targets/x86_64-linux/include (found version "12.8.61")
#0 1027.4 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
#0 1027.6 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
#0 1027.6 -- Found Threads: TRUE
#0 1027.7 -- CUDA library status:
#0 1027.7 -- version: 12.8.61
#0 1027.7 -- libraries: /usr/local/cuda/lib64
#0 1027.7 -- include path: /usr/local/cuda/targets/x86_64-linux/include
#0 1027.7 -- pybind11 v3.0.0 dev1
#0 1027.8 -- Found PythonInterp: /usr/bin/python3 (found suitable version "3.12.3", minimum required is "3.8")
#0 1027.8 -- Found PythonLibs: /usr/lib/x86_64-linux-gnu/libpython3.12.so
#0 1027.8 -- Performing Test HAS_FLTO
#0 1028.1 -- Performing Test HAS_FLTO - Success
#0 1028.1 -- ========================= Importing and creating target nvinfer ==========================
#0 1028.1 -- Looking for library nvinfer
#0 1028.1 -- Library that was found /usr/local/tensorrt/targets/x86_64-linux-gnu/lib/libnvinfer.so
#0 1028.1 -- ==========================================================================================
#0 1028.1 -- CUDAToolkit_VERSION 12.8 is greater or equal than 11.0, enable -DENABLE_BF16 flag
#0 1028.1 -- CUDAToolkit_VERSION 12.8 is greater or equal than 11.8, enable -DENABLE_FP8 flag
#0 1028.1 -- CUDAToolkit_VERSION 12.8 is greater or equal than 12.8, enable -DENABLE_FP4 flag
#0 1028.4 -- Found MPI_C: /opt/hpcx/ompi/lib/libmpi.so (found version "3.1")
#0 1028.9 -- Found MPI_CXX: /opt/hpcx/ompi/lib/libmpi.so (found version "3.1")
#0 1028.9 -- Found MPI: TRUE (found version "3.1")
#0 1028.9 CMAKE_CUDA_FLAGS: --expt-extended-lambda --expt-relaxed-constexpr --fatbin-options -compress-all
#0 1028.9 -- COMMON_HEADER_DIRS: /src/tensorrt_llm/cpp
#0 1029.7 -- Found Python3: /usr/bin/python3.12 (found version "3.12.3") found components: Interpreter Development Development.Module Development.Embed
#0 1031.9 -- USE_CXX11_ABI is set by python Torch to 1
#0 1031.9 -- TORCH_CUDA_ARCH_LIST: 8.0;8.6;8.9;9.0;10.0;12.0
#0 1031.9 CMake Warning at CMakeLists.txt:530 (message):
#0 1031.9 Ignoring environment variable TORCH_CUDA_ARCH_LIST=7.5 8.0 8.6 9.0 10.0
#0 1031.9 12.0+PTX
#0 1031.9
#0 1031.9
#0 1032.0 -- Found Python executable at /usr/bin/python3.12
#0 1032.0 -- Found Python libraries at /usr/lib/x86_64-linux-gnu
#0 1036.2 -- Found CUDA: /usr/local/cuda (found version "12.8")
#0 1036.2 -- Found CUDAToolkit: /usr/local/cuda/include (found version "12.8.61")
#0 1036.3 -- Caffe2: CUDA detected: 12.8
#0 1036.3 -- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
#0 1036.3 -- Caffe2: CUDA toolkit directory: /usr/local/cuda
#0 1036.5 -- Caffe2: Header version is: 12.8
#0 1037.0 -- Found Python: /usr/bin/python3.12 (found version "3.12.3") found components: Interpreter
#0 1037.0 CMake Warning at /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:140 (message):
#0 1037.0 Failed to compute shorthash for libnvrtc.so
#0 1037.0 Call Stack (most recent call first):
#0 1037.0 /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
#0 1037.0 /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
#0 1037.0 CMakeLists.txt:563 (find_package)
#0 1037.0
#0 1037.0
#0 1037.0 CMake Warning (dev) at /usr/local/lib/python3.12/dist-packages/cmake/data/share/cmake-3.31/Modules/FindPackageHandleStandardArgs.cmake:441 (message):
#0 1037.0 The package name passed to find_package_handle_standard_args (nvtx3) does
#0 1037.0 not match the name of the calling package (Caffe2). This can lead to
#0 1037.0 problems in calling code that expects find_package result variables
#0 1037.0 (e.g., _FOUND) to follow a certain pattern.
#0 1037.0 Call Stack (most recent call first):
#0 1037.0 /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:178 (find_package_handle_standard_args)
#0 1037.0 /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
#0 1037.0 /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
#0 1037.0 CMakeLists.txt:563 (find_package)
#0 1037.0 This warning is for project developers. Use -Wno-dev to suppress it.
#0 1037.0
#0 1037.0 -- Could NOT find nvtx3 (missing: nvtx3_dir)
#0 1037.0 CMake Warning at /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Caffe2/public/cuda.cmake:184 (message):
#0 1037.0 Cannot find NVTX3, find old NVTX instead
#0 1037.0 Call Stack (most recent call first):
#0 1037.0 /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
#0 1037.0 /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
#0 1037.0 CMakeLists.txt:563 (find_package)
#0 1037.0
#0 1037.0
#0 1037.0 -- USE_CUDNN is set to 0. Compiling without cuDNN support
#0 1037.0 -- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
#0 1037.0 -- USE_CUDSS is set to 0. Compiling without cuDSS support
#0 1037.0 -- USE_CUFILE is set to 0. Compiling without cuFile support
#0 1037.0 -- Added CUDA NVCC flags for: -gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_100,code=sm_100;-gencode;arch=compute_120,code=sm_120
#0 1037.0 CMake Warning at /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
#0 1037.0 static library kineto_LIBRARY-NOTFOUND not found.
#0 1037.0 Call Stack (most recent call first):
#0 1037.0 /usr/local/lib/python3.12/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:121 (append_torchlib_if_found)
#0 1037.0 CMakeLists.txt:563 (find_package)
#0 1037.0
#0 1037.0
#0 1037.0 -- Found Torch: /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch.so
#0 1037.0 -- TORCH_CXX_FLAGS: -D_GLIBCXX_USE_CXX11_ABI=1
#0 1037.3 CMake Error at CMakeLists.txt:9 (include):
#0 1037.3 include could not find requested file:
#0 1037.3
#0 1037.3 rapids-cmake
#0 1037.3
#0 1037.3
#0 1037.3 CMake Error at CMakeLists.txt:10 (include):
#0 1037.3 include could not find requested file:
#0 1037.3
#0 1037.3 rapids-cpm
#0 1037.3
#0 1037.3
#0 1037.3 CMake Error at CMakeLists.txt:11 (include):
#0 1037.3 include could not find requested file:
#0 1037.3
#0 1037.3 rapids-export
#0 1037.3
#0 1037.3
#0 1037.3 CMake Error at CMakeLists.txt:12 (include):
#0 1037.3 include could not find requested file:
#0 1037.3
#0 1037.3 rapids-find
#0 1037.3
#0 1037.3
#0 1037.9 CMake Error at CMakeLists.txt:54 (rapids_cmake_build_type):
#0 1037.9 Unknown CMake command "rapids_cmake_build_type".
#0 1037.9
#0 1037.9
#0 1037.9 -- The C compiler identification is GNU 13.3.0
#0 1037.9 -- The CXX compiler identification is GNU 13.3.0
#0 1037.9 -- Detecting C compiler ABI info
#0 1037.9 -- Detecting C compiler ABI info - done
#0 1037.9 -- Check for working C compiler: /usr/bin/cc - skipped
#0 1037.9 -- Detecting C compile features
#0 1037.9 -- Detecting C compile features - done
#0 1037.9 -- Detecting CXX compiler ABI info
#0 1037.9 -- Detecting CXX compiler ABI info - done
#0 1037.9 -- Check for working CXX compiler: /usr/bin/c++ - skipped
#0 1037.9 -- Detecting CXX compile features
#0 1037.9 -- Detecting CXX compile features - done
#0 1037.9 -- Configuring incomplete, errors occurred!
#0 1037.9
#0 1037.9 CMake Error at CMakeLists.txt:635 (message):
#0 1037.9 ucxx build failed
#0 1037.9
#0 1037.9
#0 1037.9 -- Configuring incomplete, errors occurred!
#0 1037.9 Traceback (most recent call last):
#0 1037.9 File "/src/tensorrt_llm/scripts/build_wheel.py", line 447, in
#0 1037.9 main(**vars(args))
#0 1037.9 File "/src/tensorrt_llm/scripts/build_wheel.py", line 208, in main
#0 1037.9 build_run(
#0 1037.9 File "/usr/lib/python3.12/subprocess.py", line 571, in run
#0 1037.9 raise CalledProcessError(retcode, process.args,
#0 1037.9 subprocess.CalledProcessError: Command 'cmake -DCMAKE_BUILD_TYPE="Release" -DBUILD_PYT="ON" -DBUILD_PYBIND="ON" -DNVTX_DISABLE="ON" -DBUILD_MICRO_BENCHMARKS=OFF -DTRT_LIB_DIR=/usr/local/tensorrt/targets/x86_64-linux-gnu/lib -DTRT_INCLUDE_DIR=/usr/local/tensorrt/include -S "/src/tensorrt_llm/cpp"' returned non-zero exit status 1.

Dockerfile.multi:77

76 | ARG BUILD_WHEEL_ARGS="--clean --trt_root /usr/local/tensorrt --python_bindings --benchmarks"
77 | >>> RUN --mount=type=cache,target=/root/.cache/pip --mount=type=cache,target=/root/.cache/ccache
78 | >>> python3 scripts/build_wheel.py ${BUILD_WHEEL_ARGS}
79 |

ERROR: failed to solve: process "/bin/bash -c python3 scripts/build_wheel.py ${BUILD_WHEEL_ARGS}" did not complete successfully: exit code: 1

Additional notes

No other notes.

tjliupeng added the bug (Something isn't working) label on Feb 26, 2025
@chuangz0 (Collaborator)

Did you run git submodule update --init --recursive before running docker ...?
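For reference, a minimal sketch of the suggested order of operations, run from the repository root (only the git submodule command comes from the comment above; the surrounding lines are assumed from the original reproduction steps):

  cd TensorRT-LLM
  # initialize third-party submodules before the containerized build
  git submodule update --init --recursive
  # rerun the release build
  make -C docker release_build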

chuangz0 added the triaged (Issue has been triaged by maintainers) label on Feb 28, 2025