Add cuDNN 9.0 #62498

Merged
merged 2 commits into from
Mar 11, 2024

Conversation

jeng1220
Collaborator

@jeng1220 jeng1220 commented Mar 7, 2024

PR types

New features

PR changes

OPs

Description

This PR adds support for cuDNN 9.0, which significantly improves performance. For example, the FP16 and BF16 fused flash attention engines are considerably faster on NVIDIA GPUs:

  • Speed-up of up to 50% over cuDNN 8.9.7 on Hopper GPUs.
  • Speed-up of up to 100% over cuDNN 8.9.7 on Ampere GPUs.

See release-notes#cudnn-9-0-0 for details.

Two major issues arise with cuDNN 9.0:

  1. The definition of CUDNN_VERSION has changed from CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL to CUDNN_MAJOR * 10000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL.
  2. Multiple RNN APIs are deprecated. See cudnn/api#api-changes-900 for details.

This PR resolves both issues.
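
To make the CUDNN_VERSION change concrete, here is a minimal sketch of the two encodings and of what goes wrong when a cuDNN 9 value is decoded with the old divisor. The helper names (EncodeLegacy, EncodeV9, LegacyMajor, MajorOf) are hypothetical and are not part of cuDNN or Paddle. MajorOf also assumes that any encoded value of 90000 or more uses the new scheme; that is an illustration, not the logic used in this PR.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only; not cuDNN or Paddle APIs.

// Scheme before cuDNN 9: MAJOR * 1000 + MINOR * 100 + PATCHLEVEL.
constexpr int EncodeLegacy(int maj, int min, int patch) {
  return maj * 1000 + min * 100 + patch;
}

// Scheme from cuDNN 9.0: MAJOR * 10000 + MINOR * 100 + PATCHLEVEL.
constexpr int EncodeV9(int maj, int min, int patch) {
  return maj * 10000 + min * 100 + patch;
}

// Decodes the major version assuming the legacy scheme.
// Applied to a cuDNN 9 value, this yields a wrong result.
constexpr int LegacyMajor(int encoded) { return encoded / 1000; }

// Decodes the major version under either scheme.
// Assumption: values >= 90000 were produced by the new scheme.
constexpr int MajorOf(int encoded) {
  return encoded >= 90000 ? encoded / 10000 : encoded / 1000;
}

static_assert(EncodeLegacy(8, 9, 7) == 8907, "cuDNN 8.9.7, legacy scheme");
static_assert(EncodeV9(9, 0, 0) == 90000, "cuDNN 9.0.0, new scheme");
static_assert(LegacyMajor(EncodeV9(9, 0, 0)) == 90, "old divisor misreads 9.0.0");
static_assert(MajorOf(EncodeLegacy(8, 9, 7)) == 8, "legacy value decodes to 8");
static_assert(MajorOf(EncodeV9(9, 0, 0)) == 9, "new value decodes to 9");
```

Code that only compares CUDNN_VERSION against thresholds written in the old scheme keeps its ordering, since 90000 is still greater than any 8.x value; code that extracts the major or minor version by division is what needs attention.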


paddle-bot bot commented Mar 7, 2024

Your PR has been submitted. Thank you for contributing to this open-source project!
Please wait for the automated CI test results first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Mar 7, 2024
@jeng1220 jeng1220 added the NVIDIA label Mar 7, 2024
@jeng1220
Collaborator Author

jeng1220 commented Mar 7, 2024

cc @onecatcn for vis

risemeup1
risemeup1 previously approved these changes Mar 7, 2024
Contributor

@risemeup1 risemeup1 left a comment

LGTM

@risemeup1
Contributor

cc @onecatcn for vis

When can this PR be merged?

@onecatcn
Contributor

onecatcn commented Mar 7, 2024

2024-03-07 08:46:39 0. You must have one RD (XiaoguangHu01,chenwhql,zhiqiu,Xreki,luotao1,qili93,Aurelius84) approval for the usage of const_cast.
2024-03-07 08:46:39 1. You must have one RD (jiahy0825, zyfncg, chenwhql, YuanRisheng or heavyrain-lzy) approval for including "gflags/gflags.h" or "glog/logging.h" headerfile in paddle/phi headerfiles( paddle/phi/kernels/gpu/cudnn_lstm_cache.h paddle/phi/kernels/gpu/rnn_functor.h). Recommend including third party headers in phi source files(.cc) instead of phi headerfiles(.h). Because if phi headerfiles include third party headers like "gflags.h" or "logging.h", error might occur when outside developers use phi headerfiles directly.
2024-03-07 08:46:39 There are 2 approved errors.

@onecatcn onecatcn requested review from jiahy0825 and zhiqiu March 7, 2024 12:10
@jiahy0825
Contributor

jiahy0825 commented Mar 7, 2024

Can you remove the unneeded "gflags/gflags.h" or "glog/logging.h" includes from the header files, or move the code that needs them into a .cc file?

You can see the reason below:

Recommend including third party headers in phi source files(.cc) instead of phi headerfiles(.h). Because if phi headerfiles include third party headers like "gflags.h" or "logging.h", error might occur when outside developers use phi headerfiles directly.

Contributor

@jiahy0825 jiahy0825 left a comment

LGTM

Contributor

@chenwhql chenwhql left a comment

LGTM for const_cast

Contributor

@XiaoguangHu01 XiaoguangHu01 left a comment

LGTM

Contributor

@zyfncg zyfncg left a comment

LGTM

@zyfncg zyfncg merged commit c8e8be2 into PaddlePaddle:develop Mar 11, 2024
30 checks passed
hitywt pushed a commit to hitywt/Paddle that referenced this pull request Mar 11, 2024
* fix cuDNN 9 problem

* remove glog
ForFishes pushed a commit to ForFishes/Paddle that referenced this pull request Sep 3, 2024
* fix cuDNN 9 problem

* remove glog
GuoxiaWang pushed a commit that referenced this pull request Sep 3, 2024
* fix cuDNN 9 problem

* remove glog

Co-authored-by: Jeng Bai-Cheng <jeng1220@users.noreply.github.com>