InvalidArgumentError: Input matrix is not invertible. #39
The most likely cause is grossly misspecified hyperparameters. What data are you using? I generally rescale the data to unit standard deviation to avoid having to set the hyperparameters by hand. Another possibility is a NaN in the data. Also, are you using float64 or float32?
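A minimal sketch of that preprocessing (the numpy arrays here are made-up stand-ins for your data):

```python
import numpy as np

rng = np.random.RandomState(0)
X = 50.0 * rng.randn(100, 5) + 3.0  # illustrative raw inputs on an awkward scale
Y = 10.0 * rng.randn(100, 1)

# Standardize each column to zero mean and unit standard deviation, so that
# default kernel hyperparameters (lengthscales, variances) are on a sensible scale.
X = (X - X.mean(axis=0)) / X.std(axis=0)
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)

# Rule out NaNs in the data before suspecting the model.
assert not np.isnan(X).any()
assert not np.isnan(Y).any()
```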
I have rescaled the data to unit SD and the data type is float32. I have also verified that there are no NaNs in the data.
Could you try with tf.float64 (in gpflowrc)? float32 is sometimes a cause of instability.
(and I'm assuming jitter is 1e-6)
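For reference, a gpflowrc along these lines should select float64 and set the jitter. The section and key names below are taken from GPflow 1.x's default settings, so check them against your installed version:

```
[dtypes]
float_type = float64
int_type = int32

[numerics]
jitter_level = 1e-6
```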
Thank you so much. It works!
Also, could you explain why the variational parameters q_mu and q_sqrt might turn to NaN as the number of layers increases?
When using the natural-gradient optimizer, the actual gradient step takes place in the natural parameters, and the step itself is unconstrained even though not all values of the natural parameters are valid (because of positive definiteness). Sometimes a gradient step is too large and moves to invalid values, resulting in a NaN update to q_sqrt. It is actually possible to take natural-gradient steps in other parameterizations, but in practice that doesn't seem to work so well. See this paper for details.
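As a toy 1-D illustration of that failure mode (all numbers made up): for q(u) = N(m, S), one natural parameter is theta2 = -1/(2S), which is only valid while it stays negative.

```python
import numpy as np

S = 1.0
theta2 = -0.5 / S           # -0.5: valid (corresponds to S = 1 > 0)

# An overly large natural-gradient step can cross into theta2 >= 0,
# i.e. out of the set of valid (positive-definite) covariances.
step = 1.0
theta2_new = theta2 + step  # 0.5: no longer a valid natural parameter

S_new = -0.5 / theta2_new   # -1.0: a negative "variance"
print(np.sqrt(S_new))       # nan -- the q_sqrt NaNs described above
```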
Thank you, that makes sense. I am also facing a scenario in which the variational parameters get updated during learning, but the kernel parameters do not. I wrote a new kernel, initialising the kernel parameters with gpflow.params.Parameter(). Any pointers on how to get the kernel parameters to update during optimisation?
If you're optimizing hyperparameters then you need an additional optimizer; I tend to alternate between natural-gradient steps and Adam steps. See, for example: https://github.com/hughsalimbeni/DGPs_with_IWVI/blob/3f6fab39586f9e45dbc26c6dec91394f9b052e9e/experiments/build_models.py#L293
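In GPflow 1.x, the alternating scheme looks roughly like the sketch below. It is a sketch only: a single-layer SVGP stands in for the DGP (a DGP has one (q_mu, q_sqrt) pair per layer), and the toy data, step sizes, and iteration count are placeholders.

```python
import numpy as np
import gpflow

# Toy data and model, standing in for the real DGP.
X = np.random.randn(100, 1)
Y = np.sin(X) + 0.1 * np.random.randn(100, 1)
model = gpflow.models.SVGP(X, Y, gpflow.kernels.RBF(1),
                           gpflow.likelihoods.Gaussian(), Z=X[:20].copy())

# Keep the variational parameters out of Adam's variable list;
# the natural-gradient optimizer handles them instead.
model.q_mu.set_trainable(False)
model.q_sqrt.set_trainable(False)

natgrad_op = gpflow.training.NatGradOptimizer(gamma=0.1).make_optimize_tensor(
    model, var_list=[(model.q_mu, model.q_sqrt)])
adam_op = gpflow.training.AdamOptimizer(0.01).make_optimize_tensor(model)

session = model.enquire_session()
for _ in range(1000):
    session.run(natgrad_op)  # natural-gradient step on q_mu, q_sqrt
    session.run(adam_op)     # Adam step on kernel/likelihood hyperparameters
```

Freezing q_mu and q_sqrt before building the Adam op matters: Adam's variable list is fixed when the op is created, and this keeps the two optimizers from both updating the variational parameters. If the kernel parameters still do not move, check that each gpflow.params.Parameter is assigned as an attribute of the kernel in its __init__ and has not been set non-trainable; only registered, trainable Parameters appear in the optimizer's variable list.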
Thank you. What is the intuition behind using two optimizers? Would Adam alone not suffice for learning both?
Yes, and that is indeed what I used to do. This paper https://arxiv.org/abs/1905.03350 looks at this issue in more detail.
Hi all,
I'm facing the following issue while executing deep Gaussian process SVI for a two-layer model. I have tried adding jitter, centering the input data, various hyperparameter specifications, and upgrading the gpflow version, but couldn't resolve the error.
Any pointers, please! Thank you!
InvalidArgumentError (see above for traceback): Input matrix is not invertible.
[[Node: gradients/DGP-2c82c62a-25/conditional/base_conditional/Cholesky_grad/MatrixTriangularSolve = MatrixTriangularSolve[T=DT_FLOAT, adjoint=false, lower=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](DGP-2c82c62a-25/conditional/base_conditional/Cholesky, gradients/DGP-2c82c62a-25/conditional/base_conditional/Cholesky_grad/eye/MatrixDiag)]]
The full error trace is as follows:
File "/home/jaya/jayashree/cdgp_experiments/wconv_rbf.py", line 112, in
m_dgp2 = make_dgp(2)
File "/home/jaya/jayashree/cdgp_experiments/wconv_rbf.py", line 103, in make_dgp
num_outputs=num_classes)
File "/home/jaya/.local/lib/python3.5/site-packages/gpflow/core/compilable.py", line 90, in init
self.build()
File "/home/jaya/.local/lib/python3.5/site-packages/gpflow/core/node.py", line 156, in build
self._build()
File "/home/jaya/.local/lib/python3.5/site-packages/gpflow/models/model.py", line 81, in _build
likelihood = self._build_likelihood()
File "/home/jaya/.local/lib/python3.5/site-packages/gpflow/decors.py", line 67, in tensor_mode_wrapper
result = method(obj, *args, **kwargs)
File "/home/jaya/jayashree/cdgp_experiments/dgp.py", line 106, in _build_likelihood
L = tf.reduce_sum(self.E_log_p_Y(self.X, self.Y))
File "/home/jaya/jayashree/cdgp_experiments/dgp.py", line 95, in E_log_p_Y
Fmean, Fvar = self._build_predict(X, full_cov=False, S=self.num_samples)
File "/home/jaya/.local/lib/python3.5/site-packages/gpflow/decors.py", line 67, in tensor_mode_wrapper
result = method(obj, *args, **kwargs)
File "/home/jaya/jayashree/cdgp_experiments/dgp.py", line 87, in _build_predict
Fs, Fmeans, Fvars = self.propagate(X, full_cov=full_cov, S=S)
File "/home/jaya/.local/lib/python3.5/site-packages/gpflow/decors.py", line 67, in tensor_mode_wrapper
result = method(obj, *args, **kwargs)
File "/home/jaya/jayashree/cdgp_experiments/dgp.py", line 76, in propagate
F, Fmean, Fvar = layer.sample_from_conditional(F, z=z, full_cov=full_cov)
File "/home/jaya/jayashree/cdgp_experiments/layers.py", line 111, in sample_from_conditional
mean, var = self.conditional(X, full_cov=full_cov)
File "/home/jaya/jayashree/cdgp_experiments/layers.py", line 96, in conditional
mean, var = single_sample_conditional(X_flat)
File "/home/jaya/jayashree/cdgp_experiments/layers.py", line 84, in single_sample_conditional
full_cov=full_cov, white=True)
InvalidArgumentError (see above for traceback): Input matrix is not invertible.
[[Node: gradients/DGP-2c82c62a-25/conditional/base_conditional/Cholesky_grad/MatrixTriangularSolve = MatrixTriangularSolve[T=DT_FLOAT, adjoint=false, lower=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](DGP-2c82c62a-25/conditional/base_conditional/Cholesky, gradients/DGP-2c82c62a-25/conditional/base_conditional/Cholesky_grad/eye/MatrixDiag)]]