Could the Transformer implementation expose attention_weights as an output? #44056

holyseven · 2022-07-04T08:19:36Z

Feature Description

Requirement

Advanced developers need access to the attention weights of a Transformer's intermediate layers (shape = [batch_size, num_heads, query_length, key_length]) for model design and analysis experiments.

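Use cases

The SPO task in EHealth needs the outputs of Electra's intermediate layers to compute the loss.
Working with Transformer-based models requires intermediate-layer attention and outputs for analysis and distillation.
When interpreting Transformer models, the attention weights and their corresponding gradients are both very important intermediate results.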

Current implementation in Paddle

Currently, the forward() function of MultiHeadAttention calls methods from paddle.nn.functional.

https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/nn/layer/transformer.py#L420

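Because paddle.nn.functional methods are used (I am not sure whether this reduces memory consumption), the intermediate results cannot be obtained with hooks either.

MultiHeadAttention does have a need_weights parameter that can be changed to output the weights. But for any model that has already been built (especially the Ernie models), changing this parameter means readjusting the inputs and outputs of every layer, which is rather tedious.

This request has also been raised in a PaddleNLP issue, but it seems more closely related to paddle.nn.layer.transformer.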

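Alternatives

A simple approach is to use Layer modules instead of the paddle.nn.functional methods. That way the weights can be retrieved quickly afterwards through hooks, and the change stays fully compatible with existing code and models.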


paddle-bot-old · 2022-07-04T08:19:38Z

Hi! We have received your issue and will arrange for technical staff to respond as soon as possible, so please be patient. Please check again that you have provided a clear problem description, reproduction code, environment and version, and error messages. You can also consult the official API documentation, the FAQ, past GitHub issues, and the AI community for answers. Have a nice day!

guoshengCS · 2022-07-06T06:07:05Z

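Hello, we have already started working on this at the PaddleNLP level, where it is a priority issue for us: PaddlePaddle/PaddleNLP#2665. Because it introduces special data structures such as ModelOutput, which differ somewhat from the conventions of other layers in the framework, we are addressing it in PaddleNLP first.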

paddle-bot · 2023-07-11T06:32:31Z

Since you haven't replied for more than a year, we have closed this issue/pr.
If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up.

holyseven added status/new-issue 新建 type/feature-request 新需求申请 labels Jul 4, 2022

paddle-bot-old bot assigned From00 Jul 4, 2022

From00 assigned guoshengCS Jul 4, 2022

paddle-bot-old bot added status/following-up 跟进中 and removed status/new-issue 新建 labels Jul 5, 2022

paddle-bot bot closed this as completed Jul 11, 2023
