Hi! We've received your issue; please be patient while waiting for a response. We will arrange for technicians to answer your questions as soon as possible. Please make sure you have posted enough information to demonstrate your request. You may also check the API docs, FAQ, GitHub issues, and the AI community for an answer. Have a nice day!
Since you haven't replied for more than a year, we have closed this issue/PR.
If the problem is not solved or you have a follow-up question, please reopen it at any time and we will continue to follow up.
Feature Description

Advanced developers need to obtain the attention weights of intermediate Transformer layers (i.e., tensors of shape [batch_size, num_heads, query_length, key_length]) for model design and analysis experiments.

Specific scenario
Current implementation in Paddle

Currently, MultiHeadAttention's forward() calls methods from paddle.nn.functional: https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/nn/layer/transformer.py#L420
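For context, the tensor this request asks to expose is the softmax-normalized score matrix computed inside scaled dot-product attention. A minimal NumPy sketch (sizes are hypothetical, chosen only for illustration) shows where it arises and what shape it has:

```python
import numpy as np

# Hypothetical sizes, for illustration only.
batch_size, num_heads, query_length, key_length, head_dim = 2, 8, 5, 6, 64

rng = np.random.default_rng(0)
q = rng.random((batch_size, num_heads, query_length, head_dim))
k = rng.random((batch_size, num_heads, key_length, head_dim))

# Scaled dot-product scores, then softmax over the key axis.
scores = q @ k.swapaxes(-1, -2) / np.sqrt(head_dim)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

# This intermediate tensor is what the feature request wants to capture.
assert weights.shape == (batch_size, num_heads, query_length, key_length)
```

Inside Paddle's MultiHeadAttention, this softmax step happens via a functional call, so the resulting `weights` tensor never passes through a sub-layer that a hook could observe.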
Because methods from paddle.nn.functional are used (it is unclear whether this is done to reduce memory consumption), the intermediate attention weights cannot be captured with hooks either. MultiHeadAttention does provide a need_weights parameter that makes it return the weights, but for models that have already been built (especially Ernie models), changing this parameter requires rewiring the inputs and outputs of every layer, which is cumbersome. This need has also been raised in PaddleNLP's issues, but it seems more closely related to paddle.nn.layer.transformer.

Alternatives
A simple approach is to use Layer modules instead of the paddle.nn.functional methods. That way, the intermediate results can easily be obtained later via hooks, and it is fully compatible with existing code and models.