Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove OSX support, add cudadevrt #5

Merged
merged 2 commits into from
Apr 29, 2020

Conversation

gmarkall
Copy link

The primary aim of this PR is to add the cudadevrt static library, which is needed for dynamic parallelism, grid sync, etc. For example, it is required for this Numba PR: numba/numba#4551

It seems that the OSX packages aren't built anymore:

$ conda search cudatoolkit[subdir=osx-64]
Loading channels: done
# Name                       Version           Build  Channel     
cudatoolkit                      9.0      h41a26b3_0  pkgs/main  

and the recipe doesn't even point to a correct URL for the toolkit, so rather than fixing that up, I've removed the OSX builder instead.

OSX CUDA Toolkit packages have not been published since 9.1, and the
present build.py implementation does not correctly build the package.
cudadevrt is a static library, so we add an additional
platform-specific key and related logic to handle copying static
libraries.
@gmarkall
Copy link
Author

I forgot to mention, example packages built with the recipe in this PR for Linux and Windows are at: https://anaconda.org/gmarkall/cudatoolkit/files

@jakirkham
Copy link

@jjhelmus, would it be possible to get a review here? 🙂

@leofang
Copy link

leofang commented Mar 24, 2020

Holy xxxx! This is not in cudatoolkit yet?! CuPy also needs it for the very same reasons...

@leofang
Copy link

leofang commented Mar 24, 2020

It seems even with this addition, CuPy would still be unable to locate it...This is our searching strategy: https://github.com/cupy/cupy/blob/775cb7d9ccfbee8aebb46ff7c310dddc3a2b72af/cupy/cuda/compiler.py#L78-L100
Basically, we look for

  1. CUDA_PATH (@jakirkham, is this set by cudatoolkit??)
  2. the parent directory of nvcc (on CF this does not come with cudatoolkit)
  3. /usr/local/cuda (not applicable to CF)

Am I right that none of these is met by conda-forge's packaging?

@jakirkham
Copy link

Interesting did not realize this was needed by CuPy too. Should we open an issue on the cupy-feedstock to discuss further? 😉

@gmarkall
Copy link
Author

With the cudatoolkit package installed (and also an installation of the toolkit at /usr/local/cuda, I get:

In [1]: import cupy                                                                                                                                          

In [2]: from cupy.cuda import get_cuda_path                                                                                                                  

In [3]: get_cuda_path()                                                                                                                                      
Out[3]: '/usr/local/cuda'

In [4]: from numba.cuda.cudadrv.libs import test                                                                                                             

In [5]: test()                                                                                                                                               
Finding cublas from Conda environment
	located at /raid/gmarkall/miniconda3/envs/numbaenv/lib/libcublas.so.10.0.130
	trying to open library...	ok
Finding cusparse from Conda environment
	located at /raid/gmarkall/miniconda3/envs/numbaenv/lib/libcusparse.so.10.0.130
	trying to open library...	ok
Finding cufft from Conda environment
	located at /raid/gmarkall/miniconda3/envs/numbaenv/lib/libcufft.so.10.0.145
	trying to open library...	ok
Finding curand from Conda environment
	located at /raid/gmarkall/miniconda3/envs/numbaenv/lib/libcurand.so.10.0.130
	trying to open library...	ok
Finding nvvm from Conda environment
	located at /raid/gmarkall/miniconda3/envs/numbaenv/lib/libnvvm.so.3.3.0
	trying to open library...	ok
Finding libdevice from Conda environment
	searching for compute_20...	ok
	searching for compute_30...	ok
	searching for compute_35...	ok
	searching for compute_50...	ok
Out[5]: True

It looks like CuPy doesn't find the cudatoolkit binaries in general? (c.f. Numba, which prioritises the conda package - e.g. https://github.com/numba/numba/blob/master/numba/cuda/cuda_paths.py#L85 in order of search)

@leofang
Copy link

leofang commented Mar 24, 2020

Right, CuPy's searching strategy (aka get_cuda_path()) would not be aware of cudatoolkit, but since the CF CuPy is correctly linked against it, nothing would go wrong in production...

@gmarkall Let me know if I am wrong: Numba can find cudatoolkit in runtime because it needs to dynamically load the CUDA libraries? In CuPy the linking is done at compile time (and for CF, package installation time), so they don't need this.

Sounds like the simplest solution is to set CUDA_PATH in cupy-feedstock? No code change is needed. @jakirkham Is this possible?

@jakirkham
Copy link

@gmarkall, how did you install cupy? Here is what I see with the conda-forge package:

In [1]: from cupy.cuda import get_cuda_path                                     

In [2]: get_cuda_path()                                                         
Out[2]: '/datasets/jkirkham/miniconda/envs/rapids13dev'
``

@leofang
Copy link

leofang commented Mar 24, 2020

@jakirkham I think it's because you have nvcc somewhere in your conda env. Do you happen to build nvcc-feedstock locally? This is what I get from a fresh CF cupy (conda create -n CF_cupy_test -c conda-forge python=3.7 cupy cudatoolkit=10.0):

$ conda activate CF_cupy_test
$ python
Python 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cupy.cuda import get_cuda_path 
>>> get_cuda_path()
'/usr/local/cuda'

@jakirkham
Copy link

Ah ok. That might be. Yeah I installed nvcc_linux-64 into this environment.

@jakirkham
Copy link

Thanks for identifying this issue. 🙂

It seems like an easy fix upstream. Have raised issue ( cupy/cupy#3222 ).

As to setting CUDA_PATH, I'm guessing you mean setting this at runtime? We could add an activation script to the recipe. This would also work. Though generally is not preferred when there is another option like fixing upstream's path detection logic.

@gmarkall
Copy link
Author

@gmarkall, how did you install cupy? Here is what I see with the conda-forge package:

In [1]: from cupy.cuda import get_cuda_path                                     

In [2]: get_cuda_path()                                                         
Out[2]: '/datasets/jkirkham/miniconda/envs/rapids13dev'
``

I did:

$ conda install cupy
Collecting package metadata (current_repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 4.8.2
  latest version: 4.8.3

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /raid/gmarkall/miniconda3/envs/numbaenv

  added / updated specs:
    - cupy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    cudnn-7.6.5                |       cuda10.0_0       165.0 MB
    cupy-6.0.0                 |   py37hc0ce245_0        10.2 MB
    fastrlock-0.4              |   py37he6710b0_0          29 KB
    nccl-1.3.5                 |       cuda10.0_0         1.3 MB
    ------------------------------------------------------------
                                           Total:       176.6 MB

The following NEW packages will be INSTALLED:

  cudnn              pkgs/main/linux-64::cudnn-7.6.5-cuda10.0_0
  cupy               pkgs/main/linux-64::cupy-6.0.0-py37hc0ce245_0
  fastrlock          pkgs/main/linux-64::fastrlock-0.4-py37he6710b0_0
  nccl               pkgs/main/linux-64::nccl-1.3.5-cuda10.0_0


Proceed ([y]/n)? y


Downloading and Extracting Packages
nccl-1.3.5           | 1.3 MB    | ################################################################################################################## | 100% 
cudnn-7.6.5          | 165.0 MB  | ################################################################################################################## | 100% 
fastrlock-0.4        | 29 KB     | ################################################################################################################## | 100% 
cupy-6.0.0           | 10.2 MB   | ################################################################################################################## | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

The cudatoolkit package is a 10.0 one I self-built (basically this PR backported to 10.0 because I'm on a machine with an older driver that can't support 10.1 or 10.2):

$ conda list
# packages in environment at /raid/gmarkall/miniconda3/envs/numbaenv:
#
# Name                    Version                   Build  Channel
...
cudatoolkit               10.0.130             h6bb024c_0    <unknown>
...

@leofang
Copy link

leofang commented Mar 24, 2020

We could add an activation script to the recipe. This would also work.

Yes this is exactly what I meant!

Though generally is not preferred when there is another option like fixing upstream's path detection logic.

I feel this is our own issue (providing a customized CUDA Toolkit that is split apart in a highly non-trivial way), and if possible we should just fix it with our own (conda) tools. It's a bit unfair to bother upstream folks for this (even if the final PR comes from one of us) as this CUDA usage was not what was originally intended (by NVIDIA or any sensible people) at all.

@jakirkham
Copy link

Thanks Graham! I'm also able to reproduce without nvcc_linux-64 installed.

@jakirkham
Copy link

Leo, let's find a new forum for this discussion. I don't want us to derail Graham's PR 😉

@leofang
Copy link

leofang commented Mar 24, 2020

Sounds good, redirecting to conda-forge/cupy-feedstock#46.

@leofang
Copy link

leofang commented Mar 24, 2020

@gmarkall Do you know if cudadevrt is picky on the CUDA driver version (which is beyond control of conda)?

@gmarkall
Copy link
Author

@gmarkall Do you know if cudadevrt is picky on the CUDA driver version (which is beyond control of conda)?

I'm afraid I don't know, but I would guess it's probably no more picky than the other components (e.g. cudart).

@leofang
Copy link

leofang commented Apr 8, 2020

Just curious, what's blocking this PR? LGTM.

@jakirkham
Copy link

cc @jjhelmus

@leofang
Copy link

leofang commented Apr 28, 2020

Ping @jjhelmus again. Without including cudadevrt it could make downstream packages fail if dynamic parallelism or cooperative group is required, see, e.g.:
https://dev.azure.com/nsls2forge/nsls2forge/_build/results?buildId=2455&view=logs&j=d0d954b5-f111-5dc4-4d76-03b6c9d0cf7e&t=6d4b912b-175d-51da-0fd9-4d30fe1eb4e7&l=2832

@jjhelmus
Copy link

This is great, sorry for taking so long to review.
I'll bump the build number and work on getting new packages out. I will post an update when they are available.

@jjhelmus jjhelmus merged commit d14a6dd into AnacondaRecipes:master Apr 29, 2020
@gmarkall
Copy link
Author

Many thanks for the review and merge!

@jakirkham
Copy link

Awesome! Thanks Jonathan! 😄

@jjhelmus
Copy link

cudatoolkit 10.2.89 build 1 packages which include cudadevrt are now available for linux-64 and win-64 in defaults.

The specific build strings files are:
linux-64: cudatoolkit-10.2.89-hfd86e86_1
win-64: cudatoolkit-10.2.89-h74a9793_1

@leofang
Copy link

leofang commented Apr 30, 2020

Thanks Jonathan! I confirm this works.

@leofang
Copy link

leofang commented Jun 9, 2020

Hi @jjhelmus Is it possible to add cudadevrt to all cudatoolkit, not just to 10.2?

@jakirkham
Copy link

I raised issue ( #8 ) about how we might manage/maintain multiple cudatoolkit versions. Had a couple ideas on how we might do this. Thoughts welcome 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants