Exception handling with Sidekiq
Make Sidekiq play nicely with errors and error-tracking tools.
Some intermittent errors are acceptable - especially with Sidekiq having our back with its robust retry mechanism.
Any unhandled exception causes job to fail -> moved to retry queue -> exponential backoff kicks in.
With the default config, Sidekiq will perform 25 retries over the duration of 21 days.
This makes it a general practice to not
handle rare errors like external APIs failing - we just allow the job to keep retrying
until the external service comes back online. This is assuming that the job is idempotent (running it multiple times is not an issue).
However allowing these jobs to fail, registers errors in monitoring tools like Sentry leading to error-rate spike alerts, incorrect metrics, etc. So we want to use Sidekiq's retries but exclude such errors from Sentry.
We can simply skip the error in Sidekiq configuration:
Raven.configure do |config|
config.excluded_exceptions += ["NotificationWorker::Error"]
end
Make sure to create a custom error. This is global config and you don't want Net::HTTPError
being excluded
from your entire application.
However there is a small issue. Sometimes the external service could be gone for a long time. After all retries have exhausted, Sidekiq will move the job to DeadSet. Even though our original error was "acceptable", we don't want jobs to never be executed and removed from scheduled queues without even knowing. It is always a good idea to be notified of such extreme case. This can be achieved with Sidekiq's death handler:
Sidekiq.configure_server do |config|
# We exclude some worker errors from Sentry
# But if retries exhaust and jobs moved to DeadSet, we should know
config.death_handlers << ->(_job, ex) do
message = "#{job['class']} #{job["jid"]} just died with error #{ex.message}."
Sentry.capture_message(message)
end
end