
About the activation function #83

Open
HammerWang98 opened this issue May 19, 2022 · 2 comments
Comments

@HammerWang98

Hello, Tim. Have you experimented with replacing the sigmoid with a softmax in the logits layer? When I ran your code, I got a lower MRR score with the sigmoid than the result reported in your paper; when I changed it to a softmax, I got a higher MRR score than yours. I want to cite your paper in our experiments, so could you tell me how to address this discrepancy so that we can use your result as our baseline? Thank you, and I look forward to your reply.

@TimDettmers
Owner

Hi! Thanks for raising this issue. Mathematically, the logistic sigmoid should be the right choice, but I have heard before that a softmax actually performs better in practice, and some authors do use it. The focal loss, or the enhanced version suggested by @saeedizade, might be better still. For the experiments in your paper, I would suggest using a framework with more baselines across different models. I recommend PyKEEN, which is actively developed and includes a ConvE baseline. It should give you very robust baselines that are easy to compare against.
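To make the distinction being discussed concrete, here is a minimal, dependency-free sketch (not code from this repository; the function names and toy logits are illustrative assumptions) of the two training objectives: binary cross-entropy over a sigmoid, which scores every candidate entity independently, versus cross-entropy over a softmax, which normalizes the scores across all candidates.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce_loss(logits, target_idx):
    # Binary cross-entropy with a sigmoid output layer:
    # each candidate entity is treated as an independent binary label.
    loss = 0.0
    for i, z in enumerate(logits):
        p = sigmoid(z)
        y = 1.0 if i == target_idx else 0.0
        loss += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return loss / len(logits)

def softmax_ce_loss(logits, target_idx):
    # Cross-entropy with a softmax output layer:
    # candidates compete, since probabilities must sum to one.
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[target_idx]
```

Note that at evaluation time both activations produce the same entity ranking (softmax and sigmoid are monotone in the logits), so any MRR difference comes from how the two losses shape training, not from the scoring itself.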
