
Error in MixtureGaussianHead? #1

Open
pdradx opened this issue Jun 20, 2024 · 0 comments
pdradx commented Jun 20, 2024

Hi!
Thank you for your great paper.
While trying to implement the GMAC algorithm, I found a possible error with the standard deviation in MixtureGaussianHead.
The line in question is:

```python
sigs = torch.sqrt(self.min_var * F.softplus(self.linear2(x)) + self.min_var)
```

The question is: why do we multiply the softplus by the minimum possible variance? If we choose the minimum variance too low, or equal to zero, this effectively disables the output of this network, because we can't expect a linear layer to freely produce outputs with a magnitude above 10 or 100. With the current defaults it would have to output magnitudes of 10,000 or more just to reach a variance of 1, and it can't overcome a min_var of zero at all.
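Here is a quick sanity check of the scaling (the numbers are mine; `min_var = 1e-4` is an assumed small default, not necessarily the repo's actual value):

```python
import torch
import torch.nn.functional as F

min_var = 1e-4  # assumed small default, not taken from the repo
pre = torch.tensor([0.0, 10.0, 100.0, 10_000.0])  # plausible linear-layer outputs

# Current formulation: variance is min_var * softplus(pre) + min_var
sigs = torch.sqrt(min_var * F.softplus(pre) + min_var)
print(sigs)  # tensor([0.0130, 0.0332, 0.1005, 1.0000])
# Even a pre-activation of 10_000 only reaches sigma = 1,
# and with min_var = 0 the expression is identically zero.
```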

So I think two possible correct solutions are:

```python
sigs = torch.sqrt(F.softplus(self.linear2(x)) + self.min_var)
```

or

```python
sigs = torch.sqrt(self.max_var * torch.sigmoid(self.linear2(x)) + self.min_var)
```

The first variant does not limit the maximum sigs; the second clips both the minimum and the maximum sigs.
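
For reference, a minimal head using the clipped variant could look like this (just a sketch of what I have in mind; the class name, layer names, and default values are my assumptions, not the actual repo code):

```python
import torch
import torch.nn as nn

class ClippedGaussianHead(nn.Module):
    """Hypothetical sketch: predicts means and clipped std-devs for one
    mixture component. All names and defaults are assumptions."""
    def __init__(self, in_dim, out_dim, min_var=1e-4, max_var=4.0):
        super().__init__()
        self.linear1 = nn.Linear(in_dim, out_dim)  # means
        self.linear2 = nn.Linear(in_dim, out_dim)  # pre-variance
        self.min_var = min_var
        self.max_var = max_var

    def forward(self, x):
        mus = self.linear1(x)
        # Variance is bounded in (min_var, min_var + max_var), no matter
        # how large linear2's output grows.
        sigs = torch.sqrt(self.max_var * torch.sigmoid(self.linear2(x)) + self.min_var)
        return mus, sigs
```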
As a side note, the softplus function scales poorly for very large amplitudes, so I recommend parameterizing the sigmas with an exponential instead. Something like:
```python
sigs = torch.sqrt(torch.exp(self.linear2(x)) + self.min_var)
```

Alternatively, we can shift the initial amplitudes toward zero, either by changing the biases in linear2 or with a shift inside the expression:

```python
sigs = torch.sqrt(torch.exp(self.linear2(x) - 2) + self.min_var)
```
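
To illustrate the scaling difference (again, my own back-of-the-envelope check, not from the paper): with the exponential, the pre-activation only needs to grow logarithmically with the target variance, while with plain softplus it grows linearly, and with the original min_var-scaled softplus it grows like variance / min_var:

```python
import torch

min_var = 1e-4  # assumed default, as above
target_sigma = 10.0
target_var = target_sigma ** 2  # 100.0

# Approximate pre-activation needed to reach target_var under each
# parameterization (softplus(z) ~ z for large z, so invert it as identity):
pre_exp = torch.log(torch.tensor(target_var)).item()  # ~4.6
pre_plain_softplus = target_var                        # ~100
pre_scaled_softplus = target_var / min_var             # ~1e6
print(pre_exp, pre_plain_softplus, pre_scaled_softplus)
```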
