Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement timeouts in checks and check hooks #893

Merged
merged 28 commits into from
Jan 18, 2018
Merged

Conversation

nikkictl
Copy link

What is this change?

Implements a timeout in check and check hook execution, adds the timeout field to CheckConfig.

Why is this change necessary?

Closes #781
Closes #879

Does your change need a Changelog entry?

Yas.

Do you need clarification on anything?

Nah.

Were there any complications while making this change?

There were several. Most of them are documented in #781.

Nikki Attea added 20 commits January 9, 2018 12:26
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
…u-go into feature/check-timeout

Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
…u-go into feature/check-timeout

Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Copy link
Contributor

@palourde palourde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work!

Two small things, otherwise it seems we might have to prevent a race condition (see travis unit tests)

// Kill process and all of its children when the timeout has expired.
// context.WithTimeout will not kill child/grandchild processes
// (see issues tagged in https://github.com/sensu/sensu-go/issues/781),
// rather we will use a timer and utility_os package to perform full cleanup.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove the reference to utility_os 😄

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh, yeah!

SetProcessGroup(cmd)
time.AfterFunc(time.Duration(execution.Timeout)*time.Second, func() {
if err := KillProcess(cmd); err != nil {
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we log something here, in case we are not able to kill the process?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that something should happen here. We would return an error if the process is already done or if a failure writing to the process file occurs. Perhaps we should return immediately from the ExecuteCommand function, since this is an unexpected case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ambivalent about whether or not we should return immediately, since the command executed successfully and it's more about cleaning up our mess 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought about it some more, and on the rare instance that the process cleanly, or unexpectedly exits after the timeout but before we kill the process, we should still continue as normal with logging the error. 👍

Nikki Attea added 5 commits January 16, 2018 17:45
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Nikki Attea added 2 commits January 17, 2018 10:14
Signed-off-by: Nikki Attea <nikki@sensu.io>
Signed-off-by: Nikki Attea <nikki@sensu.io>
Copy link
Contributor

@palourde palourde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

giphy

@palourde
Copy link
Contributor

Also, if possible, try to squash the commits when merging into master?

Copy link
Contributor

@mercul3s mercul3s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍕 🌮 🍨 🍔 🌯 this looks good! a squash for some of the commits would be 💯

@nikkictl nikkictl merged commit d093909 into master Jan 18, 2018
@nikkictl nikkictl deleted the feature/check-timeout branch January 18, 2018 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants