Vault crashes after unseal due to uninitialised tokenStore when using a custom plug-in #3241
Comments
I'm unsure if this is going in the right direction of fixing it or just moving the problem further down: The relevant error logs are
|
This might be as simple as moving the credential mount loading to happen before normal mount loading. Haven't thought through all of the possible consequences yet, though. |
@calvn throwing this your way |
@jefferai - that's sort of what I did, but this leads to another irrecoverable situation due to the plugin not being able to initialise its TLS rpc connection I believe. In any case I think the plugin loading mechanism should probably be a bit more robust so that a failure to mount or execute the plugin doesn't break Vault. One thing I didn't mention is that this seemed to break the whole three node setup, as the mount configuration is shared among the nodes. |
@kirilmonzo Obviously we do not want the plugin loading mechanism to break Vault. Bugs exist though :-) |
@jefferai Absolutely! Just wanted to make sure that I haven't left that bit out. Also really appreciate you jumping on this so quickly - please feel free to give me a shout if you need any more details. :-) |
I was able to reproduce the panic and got a fix underway (by basically loading secret and credential plugin backends after |
Hi @calvn - Thanks! I think the |
I am able to reproduce the |
I am testing with TLS enabled, but in my case I am getting |
If you're testing with TLS enabled but your cert is from an untrusted CA you probably need to add -tls-skip-verify to the plugin args. If you're not using TLS you may need to add a custom -address? Possibly the address isn't being encoded in the token if there is no redirect_addr set on Vault. |
@kirilmonzo I think I've got a fix, but I need to clean up the branch and include tests before opening the PR. If you want to test it out on your end in the meantime, you can give the f-setup-plugins branch a try. |
@calvn Thanks! I'll give it a go tomorrow and report back. |
@calvn the version in the branch seems to fix the initial problem, but now there's another issue due to an error mounting the plugin. Believe this might be due to there being two mount entries for the same plugin/mountpoint. As this prevents unsealing Vault I'm not sure if it's possible to modify the entries in this version. Maybe some error handling for duplicate entries might help? Logs below
Additionally there might be an issue in one of the vendor'd dependencies. Seems to be fine after commenting out the test.
|
Not returning error here https://github.com/hashicorp/vault/blob/f-setup-plugins/vault/mount.go#L777 seems to fix this, but unsure if it just masks a different problem, e.g. does that make the table consistent or just masks the inconsistency? |
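For illustration only, here is one way duplicate entries for the same mount point could be tolerated instead of aborting. This is a minimal sketch: `mountEntry` and `dedupeByPath` are hypothetical names made up for the example and are not Vault's mount table code. It does not answer whether skipping the error makes the persisted table consistent; it only shows the "keep the first entry, report the rest" idea.

```go
package main

import "fmt"

// mountEntry is a hypothetical, simplified stand-in for a mount table entry.
type mountEntry struct {
	Path string
	Type string
}

// dedupeByPath keeps the first entry seen for each path and returns the
// remaining entries separately so they can be logged or cleaned up.
func dedupeByPath(entries []mountEntry) (kept, duplicates []mountEntry) {
	seen := make(map[string]bool, len(entries))
	for _, e := range entries {
		if seen[e.Path] {
			duplicates = append(duplicates, e)
			continue
		}
		seen[e.Path] = true
		kept = append(kept, e)
	}
	return kept, duplicates
}

func main() {
	// Two entries share the same path, as suspected in the comment above.
	table := []mountEntry{
		{Path: "secret/", Type: "generic"},
		{Path: "myplugin/", Type: "plugin"},
		{Path: "myplugin/", Type: "plugin"},
	}
	kept, duplicates := dedupeByPath(table)
	fmt.Println("kept:", kept)
	fmt.Println("duplicates:", duplicates)
}
```

Returning the duplicates rather than silently discarding them keeps the inconsistency visible to whoever has to repair the table.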
How did you get it to a state where it had two mounts at the same endpoint when sealed? Regarding the build error on |
@calvn believe it may have happened whilst trying a couple of fixes locally, where I disabled plugin loading entirely, but that may have not properly updated the Output from executing
Logs:
Cool, I'll up my local version to 1.9 :-) |
I am not quite sure why the entry still exists after unmount, maybe @jefferai could chime in on this. |
Without knowing what the various fixes were that were tried locally I really can't comment other than to say that normally this is not an issue that should be encountered. |
@jefferai I believe this is due to excluding the plugins in the setupMounts call. This has probably led to a retry as the table was not modified. Is there an API I can use to edit the table directly? If not, that's fine. If there's time I think it's still worth making this a bit more resilient, i.e. an attempt to mount on top of an existing mount should not lead to an infinite loop in the unseal routine. @calvn the fix in the branch seems to resolve the initially reported issue. About to reinitialise the Vault nodes and see if I hit any other problems. Thanks for your help! Separate to this issue, I think I may have spotted another small issue and was wondering if you can reproduce it - if a non-existing plugin is mounted (e.g. a typo in the plugin name) it can't be unmounted. Let me know if it's more appropriate to report this in a separate issue ticket. Command output:
Logs:
|
Hi -- can you open a separate ticket for that issue? Thanks! |
@kirilmonzo We realized that the current changes on branch is only a partial fix. Currently, it only works for secret plugin backends, since the dummy backend returned has |
@calvn Thanks for investigating this further and working out a proper full fix! |
Hi @calvn, just tried deploying the latest version (4d53b7d) from #3255 and having some trouble mounting the plugin. Logs below:
This is with a fresh installation. The previous setup was using d542582 with the above mount executing successfully. If it helps, I think this was due to a change somewhere between 4d53b7d and bd75790. |
Did you rebuild your plugin binary? And are you using |
@kirilmonzo also please pull from Master in case you didn't get a plugin version bump PR in your copy that came in pretty late. And as @calvn said you need to rebuild your plugin fresh. |
@kirilmonzo Also please build from master in case you're building against the final version of the PR, because the plugin bump is only in master and it's necessary. |
Great! Re-closing :-) |
Environment:
Cluster of 3 nodes running on Ubuntu 14.04 with Cassandra storage backend.
Vault Version:
Vault v0.8.1 ('a7105536d613c0dce40d5439ae88fe0c5271298e')
(Built from master, but the same issue occurred with the "stock" Vault v0.8.1 ('8d76a41854608c547a233f2e6292ae5355154695'))
Operating System/Architecture:
Ubuntu 14.04 -
Linux host 3.13.0-74-generic #118-Ubuntu SMP Thu Dec 17 22:52:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Vault Config File:
Startup Log Output:
Expected Behavior:
Vault should successfully shut down, unseal and load the plug-in. The plug-in worked after it was loaded. Additionally, if there are any problematic plugins, they should be skipped during the initial mount load run, retried at the end and then eventually dropped.
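The skip, retry and drop behaviour requested above could look roughly like the following. This is a minimal sketch under my own assumptions: `mountEntry`, `loadFunc`, `loadMounts` and the two sentinel errors are hypothetical names invented for the example, and a single retry pass is an arbitrary choice, not something Vault does.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors used only to simulate failing plugins.
var (
	errNotReady = errors.New("dependency not ready yet")
	errMissing  = errors.New("plugin binary missing")
)

// mountEntry is a hypothetical, simplified stand-in for a mount table entry.
type mountEntry struct {
	Path string
	Type string
}

// loadFunc is a hypothetical callback that mounts one entry.
type loadFunc func(mountEntry) error

// loadMounts tries every entry once, retries the failures once at the end,
// and returns both the entries that loaded and the ones that were dropped.
func loadMounts(entries []mountEntry, load loadFunc) (loaded, dropped []mountEntry) {
	var deferred []mountEntry
	for _, e := range entries {
		if err := load(e); err != nil {
			deferred = append(deferred, e) // skip for now, retry later
			continue
		}
		loaded = append(loaded, e)
	}
	for _, e := range deferred {
		if err := load(e); err != nil {
			dropped = append(dropped, e) // give up on this entry, keep going
			continue
		}
		loaded = append(loaded, e)
	}
	return loaded, dropped
}

func main() {
	attempts := map[string]int{}
	load := func(e mountEntry) error {
		attempts[e.Path]++
		switch {
		case e.Path == "flaky/" && attempts[e.Path] == 1:
			return errNotReady // fails on the first pass only
		case e.Path == "broken/":
			return errMissing // fails on every pass
		}
		return nil
	}
	entries := []mountEntry{
		{Path: "secret/", Type: "generic"},
		{Path: "flaky/", Type: "plugin"},
		{Path: "broken/", Type: "plugin"},
	}
	loaded, dropped := loadMounts(entries, load)
	fmt.Println("loaded:", loaded)
	fmt.Println("dropped:", dropped)
}
```

The point of the design is that a failing plugin ends up in `dropped` instead of aborting the whole unseal, so the rest of the mounts still come up.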
Actual Behavior:
The Vault service crashes due to the tokenStore being nil at https://github.com/hashicorp/vault/blob/master/vault/wrapping.go#L111
From what I can tell this is because the setupMounts call that loads the plugin is before the setupCredentials call, which initialises the tokenStore.
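To make the ordering problem concrete, here is a minimal Go sketch. The `core` and `tokenStore` types and both method bodies are simplified stand-ins made up for illustration; they only borrow the names setupMounts, setupCredentials and tokenStore from the description above and are not Vault's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// tokenStore and core are hypothetical, heavily simplified types.
type tokenStore struct{ name string }

type core struct {
	tokenStore *tokenStore
}

// setupCredentials is where the token store gets created in this sketch.
func (c *core) setupCredentials() {
	c.tokenStore = &tokenStore{name: "token"}
}

// setupMounts stands in for loading a plugin backend that needs the token
// store. It checks for nil and returns an error instead of dereferencing
// the pointer, which is what would otherwise cause a panic.
func (c *core) setupMounts() error {
	if c.tokenStore == nil {
		return errors.New("token store not initialised")
	}
	return nil
}

func main() {
	// Order from the report: mounts first, credentials second.
	reported := &core{}
	fmt.Println("mounts before credentials:", reported.setupMounts())

	// Reordered: credentials first, so the token store exists.
	reordered := &core{}
	reordered.setupCredentials()
	fmt.Println("credentials before mounts:", reordered.setupMounts())
}
```

Either reordering the two calls or guarding the nil case avoids the crash in this toy version; whether reordering is safe in Vault itself is exactly the open question in this issue.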
Steps to Reproduce:
This issue is caused by a custom plugin. I can't share the whole plug-in, but it uses the same structure as the mock plugin inside Vault. As the error occurs during mount, rather than when executing the APIs, I believe this should be okay. Please see source below:
main.go
plugin/backend.go
DO NOT attempt this on your production cluster, as it will brick it!
Build the plugin: go build -o myplugin ./main.go
Copy the binary to /opt/vault/plugins
Add plugin_directory to your vault.conf/hcl, e.g. plugin_directory = "/opt/vault/plugins"
Register the plugin: vault write sys/plugins/catalog/myplugin sha_256=$(shasum -a 256 myplugin|cut -d ' ' -f 1) command="myplugin" (executed in /opt/vault/plugins)
Mount the plugin: vault mount -path=myplugir -plugin-name=myplugin plugin
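The `shasum -a 256 myplugin|cut -d ' ' -f 1` part of the registration step just produces the hex-encoded SHA-256 digest of the plugin binary. If `shasum` isn't available, the same value can be computed with the Go standard library. This is a small sketch; `fileSHA256` is a helper name I made up, and the fallback of hashing the program's own binary when no argument is given is only there so the example runs anywhere.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// fileSHA256 returns the lowercase hex-encoded SHA-256 digest of the file
// at path, streaming the contents rather than reading them all into memory.
func fileSHA256(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	var path string
	if len(os.Args) > 1 {
		path = os.Args[1] // e.g. the myplugin binary
	} else {
		// No argument given: hash this program's own binary.
		exe, err := os.Executable()
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		path = exe
	}

	sum, err := fileSHA256(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(sum)
}
```

Run it against the plugin binary (for example with `go run . myplugin` in the plugin directory) and pass the printed value as `sha_256` in the `vault write` command.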
Important Factoids:
The plugin has BackendType = logical.TypeLogical. This crash leads to an irrecoverable failure as it's impossible to unseal Vault and remove the plugin.
This issue does not occur when Vault is run in dev mode, as it's always unsealed.