Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure withSentry reports API (serverless) errors from Vercel #4044

Conversation

jmurty
Copy link

@jmurty jmurty commented Oct 11, 2021

Potential fix for #3917 to ensure Sentry error reporting clean-up is
done -- transaction finished and error flushed -- when run from a
Vercel deployment, by explicitly calling the clean-up code in two
places: directly from the withSentry handler wrapper function, as
well as the existing monkey-patched res.end().

In Vercel the res.end() function is not (or is not always) called,
which means the prior approach that relied on monkey-patching of
that function for clean-up did not work in Vercel.

Note 1: this is a naive fix: I'm not sure why res.end() isn't
called as expected or if there might be a better target for monkey-
patching.

Note 2: a new __flushed variable is used to avoid running the
clean-up code twice, should both the explicit and the monkey-patched
path be run. See the TODO asking whether this flag should be set at
the beginning, instead of the end, of the clean-up function
finishTransactionAndFlush()

Before submitting a pull request, please take a look at our
Contributing guidelines and verify:

  • If you've added code that should be tested, please add tests.
  • Ensure your code lints and the test suite passes (yarn lint) & (yarn test).

Potential fix for getsentry#3917 to ensure Sentry error reporting clean-up is
done -- transaction finished and error flushed -- when run from a
Vercel deployment, by explicitly calling the clean-up code in two
places: directly from the `withSentry` handler wrapper function, as
well as the existing monkey-patched `res.end()`.

In Vercel the `res.end()` function is not (or is not always) called,
which means the prior approach that relied on monkey-patching of
that function for clean-up did not work in Vercel.

Note 1: this is a naive fix: I'm not sure why res.end() isn't
called as expected or if there might be a better target for monkey-
patching.

Note 2: a new `__flushed` variable is used to avoid running the
clean-up code twice, should both the explicit and the monkey-patched
path be run. See the TODO asking whether this flag should be set at
the beginning, instead of the end, of the clean-up function
`finishTransactionAndFlush()`
@jmurty jmurty requested a review from kamilogorek as a code owner October 11, 2021 02:56
@jmurty
Copy link
Author

jmurty commented Oct 11, 2021

This is failing tests sorry and I'm out of my depth and out of time to figure it out right now. I'll try to get back to it.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 1, 2021

This pull request has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

@jmurty
Copy link
Author

jmurty commented Nov 1, 2021

I revisited this PR to try and get it to pass tests while still collecting API errors when hosted as Vercel serverless functions. I didn't get very far. The main blocker was collecting transaction/span data from the error response, which isn't possible in Vercel because all processing is halted once a response is sent.

My feeling for now is that it's possible to collect error data from Vercel deployments with a much simpler error capture mechanism, but it is very difficult, if not infeasible, to reliably collect transaction data along with the error. Which is a shame, because transaction/span data is often most useful when an error occurs.

I'm hopeful Sentry's Next.js support will improve by using the Middleware feature that is new in Next.js 12, which seems much better suited to the purpose than the work-arounds that have been necessary so far.

@jmurty
Copy link
Author

jmurty commented Nov 2, 2021

Note that @lobsterkatie is handling this properly over in #4027 so this PR can/should probably be closed

@lobsterkatie
Copy link
Member

lobsterkatie commented Nov 9, 2021

Note that @lobsterkatie is handling this properly over in #4027 so this PR can/should probably be closed

FTR, I switched to something much closer to this approach, as it ended up working better than mine. 🙂

I am indeed going to close this in favor of mine, since mine also includes tests and a little more explanation in comments, but big thanks for helping me think about this in different (more successful!) way!

lobsterkatie added a commit that referenced this pull request Nov 15, 2021
In our nextjs API route wrapper `withSentry`, we capture any errors thrown by the original handler, and then once we've captured them, we rethrow them, so that our capturing of them doesn't interfere with whatever other error handling might go on. Until recently, that was fine, as nextjs didn't actually propagate the error any farther, and so it didn't interfere with our processing pipeline and didn't prevent `res.end()` (on which we rely for finishing the transaction and flushing events to Sentry) from running.

However, Vercel released a change[1] which caused said errors to begin propagating if the API route is running on Vercel. (Technically, it's if the server is running in minimal mode, but all API handlers on vercel do.) A side effect of this change is that when there's an error, `res.end()` is no longer called. As a result, the SDK's work is cut short, and neither errors in API route handlers nor transactions tracing such routes make it to Sentry.

This fixes that, by moving the work of finishing the transaction and flushing events into its own function and calling it not only in `res.end()` but also before we rethrow the error.

(Note: In the cases where there is an error and the server is not running in minimal mode, this means that function will be hit twice, but that's okay, since the second time around it will just no-op, since `transaction.finish()` bails immediately if the transaction is already finished, and `flush()` returns immediately if there's nothing to flush.)

H/t to @jmurty for his work in #4044, which helped me fix some problems in my first approach to solving this problem.

Fixes #3917.

[1] vercel/next.js#26875
lobsterkatie added a commit that referenced this pull request Nov 15, 2021
In our nextjs API route wrapper `withSentry`, we capture any errors thrown by the original handler, and then once we've captured them, we rethrow them, so that our capturing of them doesn't interfere with whatever other error handling might go on. Until recently, that was fine, as nextjs didn't actually propagate the error any farther, and so it didn't interfere with our processing pipeline and didn't prevent `res.end()` (on which we rely for finishing the transaction and flushing events to Sentry) from running.

However, Vercel released a change[1] which caused said errors to begin propagating if the API route is running on Vercel. (Technically, it's if the server is running in minimal mode, but all API handlers on vercel do.) A side effect of this change is that when there's an error, `res.end()` is no longer called. As a result, the SDK's work is cut short, and neither errors in API route handlers nor transactions tracing such routes make it to Sentry.

This fixes that, by moving the work of finishing the transaction and flushing events into its own function and calling it not only in `res.end()` but also before we rethrow the error.

(Note: In the cases where there is an error and the server is not running in minimal mode, this means that function will be hit twice, but that's okay, since the second time around it will just no-op, since `transaction.finish()` bails immediately if the transaction is already finished, and `flush()` returns immediately if there's nothing to flush.)

H/t to @jmurty for his work in #4044, which helped me fix some problems in my first approach to solving this problem.

Fixes #3917.

[1] vercel/next.js#26875
onurtemizkan pushed a commit that referenced this pull request Dec 19, 2021
In our nextjs API route wrapper `withSentry`, we capture any errors thrown by the original handler, and then once we've captured them, we rethrow them, so that our capturing of them doesn't interfere with whatever other error handling might go on. Until recently, that was fine, as nextjs didn't actually propagate the error any farther, and so it didn't interfere with our processing pipeline and didn't prevent `res.end()` (on which we rely for finishing the transaction and flushing events to Sentry) from running.

However, Vercel released a change[1] which caused said errors to begin propagating if the API route is running on Vercel. (Technically, it's if the server is running in minimal mode, but all API handlers on vercel do.) A side effect of this change is that when there's an error, `res.end()` is no longer called. As a result, the SDK's work is cut short, and neither errors in API route handlers nor transactions tracing such routes make it to Sentry.

This fixes that, by moving the work of finishing the transaction and flushing events into its own function and calling it not only in `res.end()` but also before we rethrow the error.

(Note: In the cases where there is an error and the server is not running in minimal mode, this means that function will be hit twice, but that's okay, since the second time around it will just no-op, since `transaction.finish()` bails immediately if the transaction is already finished, and `flush()` returns immediately if there's nothing to flush.)

H/t to @jmurty for his work in #4044, which helped me fix some problems in my first approach to solving this problem.

Fixes #3917.

[1] vercel/next.js#26875
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants