Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUDARelativeDifferencePrior #1408

Merged
merged 51 commits into from
Jul 8, 2024
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
Show all changes
51 commits
Select commit Hold shift + click to select a range
5745b8c
first attempt at integration with stir
Imraj-Singh Apr 25, 2024
2c4fa32
attempting to add to registry
Imraj-Singh Apr 25, 2024
52c4943
registry add and cmake cuda package/include dirs
Imraj-Singh Apr 25, 2024
f41728e
changes to cudarelativedifference prior const etc
Imraj-Singh Apr 25, 2024
d3f3732
updated Cmakelists
Imraj-Singh Apr 25, 2024
063633e
attempting to get the registries to work...
Imraj-Singh Apr 25, 2024
483cc13
update cuda error
Imraj-Singh Apr 25, 2024
7e26bd4
Merge branch 'UCL:master' into cuda_stir
Imraj-Singh May 16, 2024
0249e93
Register errors
Imraj-Singh May 17, 2024
843cd13
Merge branch 'UCL:master' into cuda_stir
Imraj-Singh Jun 4, 2024
5eec80b
error handling in cmake
Imraj-Singh Jun 4, 2024
2e90b17
Merge branch 'UCL:master' into cuda_stir
Imraj-Singh Jun 30, 2024
13def4f
Not working, but added the functionality and the registery works than…
Imraj-Singh Jul 4, 2024
5ddce33
Added small test, constructors not inheriting correctly causing an error
Imraj-Singh Jul 4, 2024
1a7d4d4
Merge branch 'UCL:master' into cuda_stir
Imraj-Singh Jul 4, 2024
1c61543
using parent constructors in RegisteredParsingObject
KrisThielemans Jul 4, 2024
5b3f768
fix CUDARDP constructors and set_up etc
KrisThielemans Jul 4, 2024
8cc358f
fix set_up() signature for both RDP versions
KrisThielemans Jul 4, 2024
08d0f81
run clang-format on .cu files
KrisThielemans Jul 4, 2024
227c48e
[recon_test_pack] add a test for parallelproj and CUDARDP
KrisThielemans Jul 4, 2024
95e8428
minor clean-up of CUDA RDP
KrisThielemans Jul 4, 2024
aed4a1d
[CMake] use check_language(CUDA)
KrisThielemans Jul 4, 2024
f163578
run dos2unix
KrisThielemans Jul 4, 2024
f79f95b
[CMake] only compile cuda_rdptest if CUDA is enabled
KrisThielemans Jul 4, 2024
35937dd
disable using constructors for SWIG
KrisThielemans Jul 5, 2024
9e8daae
[recon_test_pack] remove set -e to avoid early exit
KrisThielemans Jul 6, 2024
2e45fdc
[SWIG] reduce some SWIG warnings
KrisThielemans Jul 6, 2024
304f121
fix CUDA error handling
KrisThielemans Jul 6, 2024
8825485
create helper functions for Arrays <-> CUDA device
KrisThielemans Jul 6, 2024
50ae299
more *Prior::set_* functions require set_up()
KrisThielemans Jul 6, 2024
51ef4fc
CudaRDP: move weights and kappa to set_up
KrisThielemans Jul 6, 2024
06ec3de
CudaRDP: use float in kernel for value
KrisThielemans Jul 6, 2024
3e04184
don't run CUDA tests on GHA
KrisThielemans Jul 6, 2024
a5ccc4c
CudaRDP: minor clean-up
KrisThielemans Jul 6, 2024
d2f6d6d
correct check for kappa
KrisThielemans Jul 6, 2024
c31953e
CudaRDP: more checks and document 3x3x3 limitation
KrisThielemans Jul 7, 2024
a0114ab
CudaRDP: put cppdim3 type in the class definition
KrisThielemans Jul 7, 2024
ef66303
add CudaRDP to test_priors (but fails)
KrisThielemans Jul 7, 2024
c634a0d
Cuda RDP fixes
KrisThielemans Jul 7, 2024
165da28
decrease verbosity of SeperableGaussianFilter
KrisThielemans Jul 7, 2024
7ac03fd
correct array_to_device for non-contiguous case
KrisThielemans Jul 7, 2024
fa4d7fe
correct test on kappa characteristics in all priors
KrisThielemans Jul 7, 2024
ccc962a
add and correct tests for priors with kappa (still fails on RDP)
KrisThielemans Jul 7, 2024
dcd3b86
reduce eps for numerical Hessian test
KrisThielemans Jul 7, 2024
3e43ca7
make running of CUDA tests optional
KrisThielemans Jul 7, 2024
66c082b
skip CUDA test if nvidia-smi not found
KrisThielemans Jul 8, 2024
7a2cf76
CudaRDP: honour elemT template argument
KrisThielemans Jul 8, 2024
44dad5a
update release notes on the CUDA RDP [ci skip]
KrisThielemans Jul 8, 2024
d4f2c9b
added ctest for CUDA vs non-CUDA RDP
KrisThielemans Jul 8, 2024
05cdfaf
added RDP timings to stir_timings
KrisThielemans Jul 8, 2024
1f0d692
disable CUDA RDP tests if no CUDA found
KrisThielemans Jul 8, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -101,6 +101,7 @@ option(DISABLE_NLOHMANN_JSON "disable use of nlohmann JSON libraries" OFF)
option(STIR_ENABLE_EXPERIMENTAL "disable use of STIR experimental code" OFF) # disable by default
option(DISABLE_NiftyPET_PROJECTOR "disable use of NiftyPET projector" OFF)
option(DISABLE_Parallelproj_PROJECTOR "disable use of Parallelproj projector" OFF)
option(DISABLE_STIR_CUDA "disable use of STIR cuda" OFF)
OPTION(DOWNLOAD_ZENODO_TEST_DATA "download zenodo data for tests" OFF)
option(DISABLE_UPENN "disable use of UPENN filetypes" OFF)

Expand Down Expand Up @@ -186,6 +187,24 @@ else()
message(STATUS "NiftyPET projector support disabled.")
endif()

if(NOT DISABLE_STIR_CUDA)
## check
find_package(CUDAToolkit REQUIRED)
find_package(CUDA REQUIRED)
if("${CMAKE_VERSION}" VERSION_LESS "3.23")
message(FATAL_ERROR "CMake 3.23 or newer is required to use CMAKE_CUDA_ARCHITECTURES set to 'all'")
else()
set(CMAKE_CUDA_ARCHITECTURES "all")
endif()
if (CUDAToolkit_FOUND)
if(STIR_WITH_CUDA)
enable_language(CUDA)
message(STATUS "STIR CUDA support enabled. Using CUDA version ${CUDAToolkit_VERSION}.")
set(STIR_WITH_CUDA ON)
endif()
endif()
endif()

# Parallelproj
if(NOT DISABLE_Parallelproj_PROJECTOR)
find_package(parallelproj 1.3.4 CONFIG)
Expand Down
3 changes: 3 additions & 0 deletions src/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -244,6 +244,9 @@ add_library(stir_registries OBJECT ${STIR_REGISTRIES})
# TODO, really should use stir_libs.cmake
target_include_directories(stir_registries PRIVATE ${STIR_INCLUDE_DIR})
target_include_directories(stir_registries PRIVATE ${Boost_INCLUDE_DIR})
if (STIR_WITH_CUDA)
target_include_directories(stir_registries PRIVATE ${CUDA_INCLUDE_DIRS})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really hope we don't need CUDA_INCLUDE_DIRS for the registries. STIR .h files ideally have no CUDA includes at all. It's going to create trouble.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we'd want the target to be the registeries as we need the cuda_runtime.h when building cuda files that will be in the registry... Also I couldn't quite figure out where else to put the target_include_directories, where would you suggest?

Copy link
Collaborator

@KrisThielemans KrisThielemans May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see,cuda_runtime.h should NOT be included in src/include/stir/recon_buildblock/CUDA/CudaRelativeDifferencePrior.h, but in the .cxx/.cu. That means that the registries wouldn't need to know about the CUDA include dir, therefore you don't need the target_include_directories` at all. (The .cu files need it, but they get it by depending on CUDA::cudart target).

endif()

# go and look for CMakeLists.txt files in all those directories
foreach(STIR_DIR ${STIR_DIRS} ${STIR_TEST_DIRS})
Expand Down
6 changes: 6 additions & 0 deletions src/cmake/STIRConfig.cmake.in
Original file line number Diff line number Diff line change
Expand Up @@ -143,6 +143,12 @@ if(@STIR_WITH_NiftyPET_PROJECTOR@)
set(STIR_WITH_NiftyPET_PROJECTOR TRUE)
endif()

if(@STIR_WITH_CUDA@)
find_package(CUDAToolkit REQUIRED)
enable_language(CUDA)
set(STIR_WITH_CUDA TRUE)
endif()

if(@STIR_WITH_Parallelproj_PROJECTOR@)
find_package(parallelproj REQUIRED CONFIG)
set(STIR_WITH_Parallelproj_PROJECTOR TRUE)
Expand Down
2 changes: 2 additions & 0 deletions src/cmake/STIRConfig.h.in
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,8 @@ namespace stir {

#cmakedefine STIR_WITH_NiftyPET_PROJECTOR

#cmakedefine STIR_WITH_CUDA

#cmakedefine STIR_WITH_Parallelproj_PROJECTOR
#cmakedefine parallelproj_built_with_CUDA

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
#ifndef __stir_recon_buildblock_CudaRelativeDifferencePrior_h__
#define __stir_recon_buildblock_CudaRelativeDifferencePrior_h__

#include "stir/recon_buildblock/RelativeDifferencePrior.h"
#include "stir/DiscretisedDensity.h"
#include <cuda_runtime.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to cxx

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is done but dim3 is cuda specific, I changed it for a native struct and removed the cuda_runtime include, aren't the other two needed?

#include "stir/Succeeded.h"

START_NAMESPACE_STIR

template <typename elemT>
class CudaRelativeDifferencePrior : public RelativeDifferencePrior<elemT> {
public:
using RelativeDifferencePrior<elemT>::RelativeDifferencePrior;
// Name which will be used when parsing a GeneralisedPrior object
static const char* const registered_name;

// Constructors
CudaRelativeDifferencePrior();
CudaRelativeDifferencePrior(const bool only_2D, float penalization_factor, float gamma, float epsilon);

// Overridden methods
double compute_value(const DiscretisedDensity<3, elemT>& current_image_estimate) override;
void compute_gradient(DiscretisedDensity<3, elemT>& prior_gradient, const DiscretisedDensity<3, elemT>& current_image_estimate) override;

// New methods
Succeeded set_up_cuda(shared_ptr<DiscretisedDensity<3, elemT>> const& target_sptr);
protected:
int z_dim, y_dim, x_dim;
dim3 grid_dim, block_dim;
bool cuda_set_up = false;
};

END_NAMESPACE_STIR

#endif // __stir_recon_buildblock_CudaRelativeDifferencePrior_h__
10 changes: 9 additions & 1 deletion src/recon_buildblock/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -126,10 +126,14 @@ if (STIR_WITH_Parallelproj_PROJECTOR)
)
endif()

if (STIR_WITH_CUDA)
list(APPEND ${dir_LIB_SOURCES}
CUDA/CudaRelativeDifferencePrior.cu
)
endif()

include(stir_lib_target)


if (STIR_MPI)
target_link_libraries(recon_buildblock PUBLIC ${MPI_CXX_LIBRARIES})
endif()
Expand All @@ -155,3 +159,7 @@ if (STIR_WITH_Parallelproj_PROJECTOR)
target_link_libraries(recon_buildblock PUBLIC parallelproj::parallelproj_cuda)
endif()
endif()

if (STIR_WITH_CUDA)
target_link_libraries(recon_buildblock PUBLIC CUDA::cudart)
endif()
Loading