
About the activation function #83

Open
HammerWang98 opened this issue May 19, 2022 · 2 comments
Comments

@HammerWang98

Hello, Tim. Have you experimented with replacing the sigmoid with a softmax in the logits layer? When I ran your code, I got a lower MRR score with the sigmoid than the result reported in your paper; when I changed it to a softmax, I got a higher MRR score than yours. I want to cite your paper in our experiments, so could you tell me how to address this discrepancy so that we can use your result as our baseline? Thank you, and I look forward to your reply.

@TimDettmers
Owner

Hi! Thanks for raising this issue. Mathematically, the logistic sigmoid should be the right choice, but I have heard before that a softmax actually performs better in practice, and some authors do use it. The focal loss, or the enhanced version suggested by @saeedizade, might be better still. For the experiments in your paper, I would suggest using a framework with more baselines across different models. I recommend PyKEEN, which is actively developed and includes a ConvE baseline. It should give you very robust baselines that are easy to compare against.
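To make the distinction being discussed concrete, here is a minimal, dependency-free sketch (not code from this repository; the function names and toy logits are illustrative assumptions) of the two training objectives: binary cross-entropy over a sigmoid, which scores every candidate entity independently, versus cross-entropy over a softmax, which normalizes the scores across all candidates.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce_loss(logits, target_idx):
    # Binary cross-entropy with a sigmoid output layer:
    # each candidate entity is treated as an independent binary label.
    loss = 0.0
    for i, z in enumerate(logits):
        p = sigmoid(z)
        y = 1.0 if i == target_idx else 0.0
        loss += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return loss / len(logits)

def softmax_ce_loss(logits, target_idx):
    # Cross-entropy with a softmax output layer:
    # candidates compete, since probabilities must sum to one.
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[target_idx]
```

Note that at evaluation time both activations produce the same entity ranking (softmax and sigmoid are monotone in the logits), so any MRR difference comes from how the two losses shape training, not from the scoring itself.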
