handler-slack.rb: implement a retry-timeout strategy for contacting s… #83

Merged
merged 3 commits into from
May 17, 2019
77 changes: 56 additions & 21 deletions bin/handler-slack.rb
@@ -30,6 +30,21 @@ def slack_webhook_url
     get_setting('webhook_url')
   end

+  def slack_webhook_retries
+    # Number of attempts (first try included) to deliver the payload to the Slack webhook
+    get_setting('webhook_retries') || 5
+  end
+
+  def slack_webhook_timeout
+    # Time (in seconds) allowed for the webhook request to complete before it is treated as failed
+    get_setting('webhook_timeout') || 10
+  end
+
+  def slack_webhook_retry_sleep
+    # Time (in seconds) to wait between webhook retries
+    get_setting('webhook_retry_sleep') || 5
+  end
+
   def slack_icon_emoji
     get_setting('icon_emoji')
   end
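
The three new settings all use the `get_setting(...) || default` pattern. As a minimal sketch of how that fallback behaves, the snippet below uses a plain hash and a stand-in `get_setting`; both are assumptions for illustration only and are not the plugin's real settings lookup.

```ruby
# Hypothetical stand-in for the handler's settings lookup (illustration only).
def get_setting(settings, name)
  settings[name]
end

# Same `value || default` fallback as the three new methods in the diff.
def webhook_retries(settings)
  get_setting(settings, 'webhook_retries') || 5
end

def webhook_timeout(settings)
  get_setting(settings, 'webhook_timeout') || 10
end

def webhook_retry_sleep(settings)
  get_setting(settings, 'webhook_retry_sleep') || 5
end

empty  = {}
custom = { 'webhook_retries' => 3, 'webhook_timeout' => 20, 'webhook_retry_sleep' => 0 }

puts webhook_retries(empty)       # key missing, so the default applies
puts webhook_timeout(custom)      # explicit value wins
puts webhook_retry_sleep(custom)  # 0 is truthy in Ruby, so an explicit 0 is kept
```

Only `nil` and `false` trigger the default, so a configured `0` for the sleep interval disables the pause rather than falling back to 5.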
@@ -149,27 +164,47 @@ def post_data(body)
     end
     http.use_ssl = true

-    req = Net::HTTP::Post.new("#{uri.path}?#{uri.query}", 'Content-Type' => 'application/json')
-
-    if payload_template.nil?
-      text = slack_surround ? slack_surround + body + slack_surround : body
-      req.body = payload(text).to_json
-    else
-      req.body = body
-    end
-
-    response = http.request(req)
-    verify_response(response)
-  end
-
-  def verify_response(response)
-    case response
-    when Net::HTTPSuccess
-      true
-    else
-      raise response.error!
-    end
-  end
+    # Retry-and-timeout strategy for transient Slack API failures such as network errors. Should solve #15
+    tries = slack_webhook_retries # set outside the begin block so `retry` does not reset the counter
+    begin # retries loop
+      Timeout.timeout(slack_webhook_timeout) do
+        begin # main loop for trying to deliver the message to slack webhook
+          req = Net::HTTP::Post.new("#{uri.path}?#{uri.query}", 'Content-Type' => 'application/json')
+
+          if payload_template.nil?
+            text = slack_surround ? slack_surround + body + slack_surround : body
+            req.body = payload(text).to_json
+          else
+            req.body = body
+          end
+
+          http.request(req)
+
+          # replace verify_response with a rescue within the loop
+        rescue Net::HTTPServerException => error
+          if (tries -= 1) > 0
+            sleep slack_webhook_retry_sleep
+            puts "retrying incident #{incident_key}... #{tries} left"
+            retry
+          else
+            # print the error and exit non-zero for sensu-server to catch and log
+            puts "slack api failed (retries) #{incident_key}: #{error.response.code} #{error.response.message}: channel '#{slack_channel}', message: #{body}"
+            exit 1
+          end
+        end # of main loop for trying to deliver the message to slack webhook
+      end # of timeout:do loop
+      # if the timeout is exceeded, consider this try failed
+    rescue Timeout::Error
+      if (tries -= 1) > 0
+        puts "timeout hit, retrying... #{tries} left"
+        retry
+      else
+        # print the error and exit non-zero for sensu-server to catch and log
+        puts "slack webhook failed (timeout) #{incident_key}: channel '#{slack_channel}', message: #{body}"
+        exit 1
+      end
+    end # of retries loop
+  end # of post_data

   def payload(notice)
     client_fields = []
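
The retry-and-timeout idea in the `post_data` hunk can also be sketched in isolation. The helper below is hypothetical (`deliver_with_retries` and its keyword arguments are not part of handler-slack.rb) and makes two choices that differ from the diff: the timeout applies to each attempt separately rather than to the whole retry loop, and the caller's block is responsible for raising on failure. One relevant detail: `Net::HTTP#request` returns the response object for non-2xx statuses instead of raising, so a real delivery block would probably need to call `response.value` (which raises unless the response is 2xx) for the `rescue` to see HTTP errors.

```ruby
require 'timeout'

# Hypothetical helper, not part of handler-slack.rb.
# Calls the block up to `attempts` times; each call gets its own `timeout`-second budget.
# Returns the block's value, or re-raises the last error once attempts are exhausted.
def deliver_with_retries(attempts:, timeout:, pause:, retry_on: [StandardError])
  tries_left = attempts # assigned once, before `begin`, so `retry` cannot reset it
  begin
    Timeout.timeout(timeout) { yield }
  rescue Timeout::Error, *retry_on => e
    tries_left -= 1
    raise e if tries_left <= 0 # out of attempts: let the caller log and exit
    sleep pause
    retry
  end
end

# Usage with a simulated delivery that fails twice and then succeeds.
calls = 0
result = deliver_with_retries(attempts: 3, timeout: 1, pause: 0) do
  calls += 1
  raise IOError, 'simulated network failure' if calls < 3
  :delivered
end
puts "#{result} after #{calls} calls"
```

Keeping the counter outside the `begin` block matters because `retry` re-runs the block from its first statement; a counter initialised inside it would be reset on every retry.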