Releases: JuliaGPU/AMDGPU.jl
v0.7.1
v0.7.0
AMDGPU v0.7.0
Merged pull requests:
- Enable 5.4 JLLs on LLVM <16 (#503) (@jpsamaroo)
- Use refs instead of pointers to get a slightly friendlier abi (#504) (@gbaraldi)
- Bump actions/checkout from 3 to 4 (#506) (@dependabot[bot])
- Add ROCm mixed mode (#508) (@pxl-th)
- Do runtime ROCm discovery (#509) (@pxl-th)
- Switch tests to ReTestItems.jl (#511) (@pxl-th)
- Use non-blocking synchronization by default (#512) (@pxl-th)
- Bump GPUCompiler to 0.25 (#513) (@pxl-th)
- Add a method for getrf! (#514) (@amontoison)
- Use branches instead of 'ifelse' (#519) (@pxl-th)
- Interface getrf_batched and getri_batched (#520) (@amontoison)
- Bring back CI (#523) (@pxl-th)
- Add workgroup synchronization primitives (#524) (@pxl-th)
- Use HIP for retrieving GCN arch (#525) (@pxl-th)
- Mention Julia 1.10+ requirement for Navi 3 (#526) (@pxl-th)
Closed issues:
- Runtime Locking (#64)
- 2x slower AMDGPU.jl kernel compared to HIP (#331)
- sincos() x3.5 slower than separate sin()/cos() calls (#341)
- HSA memory fault using AMDGPU.rand() on device ≠ 1 (#386)
- WARNING: could not import AMDGPU.device_libs_path into Compiler (#434)
- sincos intrinsic is broken with GPUCompiler 0.24 (#502)
- Navi 3 causes malloc(): unsorted double linked list corrupted (#518)
v0.6.1
AMDGPU v0.6.1
Merged pull requests:
- Fix rocrand rng offset (#493) (@tgymnich)
- Bump GPUCompiler to 0.24 (#495) (@pxl-th)
- [rocSOLVER] Add a method for geqrf! (#497) (@amontoison)
- [rocSOLVER] Interface omgqr! (#498) (@amontoison)
- Fix REPL display (#501) (@pxl-th)
Closed issues:
v0.6.0
AMDGPU v0.6.0
Closed issues:
- Functions to map to/from HIP agent IDs (#5)
- Use refcounting for memory management (#207)
- Make unsafe_copy3d! TLS compatible (#421)
Merged pull requests:
- Allow specifying buffer type in ctor (#486) (@pxl-th)
- Remove default device stuff (#487) (@pxl-th)
- [rocSPARSE] Support matrix-vector products with COO format (#488) (@amontoison)
- Add more multi-gpu tests (#489) (@pxl-th)
- SpMV supports CSC matrices (#490) (@amontoison)
- Correctly switch to TLS context (#491) (@pxl-th)
- Cleanup logging (#492) (@pxl-th)
v0.5.7
v0.5.6
AMDGPU v0.5.6
Closed issues:
- Implement exponential back-off for signal wait (#84)
- Implement occupancy estimator (#112)
- AMDGPU test errors on gfx908 (Ubuntu 20.04, ROCm 4.2, Julia 1.6.1) (#138)
- randn(Float32, 111) and rand(Float32, 111) fail (#161)
- Feature request: allow hsa_amd_memory_copy_async to pick a queue (#204)
- HSA memory test hang the GPU in CI (#226)
- AMDGPU.agents() doesn't see GPU (#236)
Merged pull requests:
- enable dependabot for GitHub actions (#474) (@ranocha)
- Bump actions/cache from 1 to 3 (#475) (@dependabot[bot])
- Bump codecov/codecov-action from 1 to 3 (#476) (@dependabot[bot])
- Bump actions/checkout from 2 to 3 (#477) (@dependabot[bot])
- Fix typo in tests & bump GPUCompiler (#479) (@pxl-th)
- Add sorting kernels (#480) (@pxl-th)
- Switch to GPUArrays buffer management (#481) (@pxl-th)
- Julia 1.10 enablement (#482) (@pxl-th)
- Add rotate, reflect, axby functions (#484) (@pxl-th)
v0.5.5
Merged pull requests:
- AMDGPU.@elapsed by @carstenbauer in #471
- Expose hipDeviceCanAccessPeer by @carstenbauer in #472
- Bump GPUCompiler by @pxl-th in #473
v0.5.4
v0.5.3
AMDGPU v0.5.3
Closed issues:
- AMDGPU.jl master is broken on Julia 1.7 (#372)
- Failure calling upon calling Enzyme autodiff_deferred (#444)
- Segmentation fault on hipStreamDestroy (#449)
- Setting HIP_VISIBLE_DEVICES to an invalid ID fails in an unhelpful way (#450)
- hipErrorSharedObjectInitFailed (#451)
- Unexpected error: ccall requires compiler when using QR (#461)
Merged pull requests:
- Add AMDGPU.@sync macro (#454) (@luraess)
- Add rocSOLVER routines (#456) (@pxl-th)
- Add missing HIP error code (#457) (@pxl-th)
- Add env variable if Navi 2 detected (#458) (@pxl-th)
- Update docs (#459) (@pxl-th)
- Update doc (#460) (@luraess)
- blas: Improve error on missing rocBLAS (#462) (@jpsamaroo)
- rocSPARSE support (#463) (@pxl-th)
- Check libraries are functional once during init (#464) (@pxl-th)