Possible bug with O1 and FusedLayerNorm #760
Comments
Same issue here! Thanks
I also ran into this error. It seems like in O1, FusedLayerNorm can only accept FP32 inputs.
Same issue!
mark
Same issue
This happens because O1 requires functions to be explicitly whitelisted so that their inputs are cast to half: https://github.com/NVIDIA/apex/blob/master/apex/amp/README.md. The solution is to add @amp.half_function on top of the fused_layer_norm function.
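A rough sketch of that suggestion, using amp's registration API so apex's source does not have to be edited; the exact module path and function name below are assumptions and may differ between apex versions:

```python
# Whitelist apex's fused layer norm so O1 casts its inputs (and FP32 weights)
# to half. This must run before amp.initialize(). The function name
# "fused_layer_norm_affine" is an assumption; check your apex version for the
# actual symbol.
from apex import amp
import apex.normalization.fused_layer_norm as fused_layer_norm

amp.register_half_function(fused_layer_norm, "fused_layer_norm_affine")
```

Equivalently, per the suggestion above, the @amp.half_function decorator from the amp README can be placed directly on the function inside apex.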
CUDA version: 10.1.243
Torch version: 1.3.1
FusedLayerNorm (patched with amp O1) does not seem to accept inputs of type Half.
In my test snippet, everything works fine with nn.LayerNorm, but if I replace nn.LayerNorm with FusedLayerNorm, the forward pass raises an error about the Half input type.
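A hypothetical minimal reproduction along these lines (module sizes, the optimizer, and the overall setup are assumptions, not the reporter's original snippet):

```python
import torch
import torch.nn as nn
from apex import amp
from apex.normalization import FusedLayerNorm

# Swapping FusedLayerNorm(32) for nn.LayerNorm(32) reportedly makes the error go away.
model = nn.Sequential(nn.Linear(32, 32), FusedLayerNorm(32)).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Under O1, amp patches whitelisted torch functions (e.g. F.linear) to cast
# their inputs to half, but FusedLayerNorm's custom CUDA kernel is not on that
# whitelist, so it receives a Half input while its own parameters stay FP32.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

x = torch.randn(8, 32, device="cuda")
out = model(x)  # reportedly fails here with a Half-input dtype error
loss = out.float().sum()
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
```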
Context: I have been using O2 for a long time, but there is a problem with loading/saving checkpoints, which is why I want to switch to O1. I could make my code work with O1 by adding several type conversions, but the speed isn't ideal at the moment. Is that expected (especially for sequence-to-sequence models)? One way such a conversion could look is sketched below.
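A hypothetical version of the "type conversions" workaround (not the reporter's actual code): cast the input back to FP32 before each fused layer norm call and restore the original dtype afterwards; the extra casts per call may explain the slowdown.

```python
from apex.normalization import FusedLayerNorm

class FP32FusedLayerNorm(FusedLayerNorm):
    """FusedLayerNorm that always runs its kernel in FP32 under amp O1."""

    def forward(self, x):
        # Cast half inputs from upstream whitelisted ops up to FP32 so the
        # fused kernel sees matching dtypes, then cast the result back.
        return super().forward(x.float()).to(x.dtype)
```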
Thank you for your help!
Best