Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(destinations): Stop writing resources when channel is closed #460

Merged
merged 4 commits into from
Dec 27, 2022

Conversation

erezrokah
Copy link
Member

@erezrokah erezrokah commented Dec 5, 2022

Summary

Related to cloudquery/cloudquery#5152.

This fixes an issue when if the channel gets closed (for example there's an error during write), we report the EOF error instead of the original error.

Scenario from the issue:

  1. Users runs sync --no-migrate on a non initialized database
  2. Write fails on the first table write, returning an error from this function:
    func (s *DestinationServer) Write2(msg pb.Destination_Write2Server) error {
    , causing the channel to close
  3. The next iteration of for resource := range resources starts, receiving an EOF error since the channel is closed, and EOF error is reported.

This is only a patch. I think we should separate application level errors (e.g. write) from protocol/communication level errors. I'll open a separate issue for that

Also, another related fix coming


Use the following steps to ensure your PR is ready to be reviewed

  • Read the contribution guidelines 🧑‍🎓
  • Run go fmt to format your code 🖊
  • Lint your changes via golangci-lint run 🚨 (install golangci-lint here)
  • Update or add tests 🧪
  • Ensure the status checks below are successful ✅

@github-actions github-actions bot added fix and removed fix labels Dec 5, 2022
@github-actions
Copy link

github-actions bot commented Dec 5, 2022

⏱️ Benchmark results

  • DefaultConcurrency-2 resources/s: 11,844
  • Glob-2 ns/op: 191.6
  • TablesWithChildrenDefaultConcurrency-2 resources/s: 30,145
  • BufferedScanner-2 ns/op: 11.78
  • LogReader-2 ns/op: 38.83

Copy link
Member

@yevgenypats yevgenypats left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had one question given I didn't understand the fix.

Also, I don't know if possible but given that those affect all plugin can we add tests to that (e2e in memory tests available here https://github.com/cloudquery/plugin-sdk/tree/main/serve) ?

@@ -264,6 +264,10 @@ func (c *DestinationClient) Write(ctx context.Context, source string, syncTime t
Source: source,
Timestamp: timestamppb.New(syncTime),
}); err != nil {
if err == io.EOF {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im not sure i understand what this solved. give in othercase this just returns an error and doesn't write to the channel while with break it actually sends CloseAndRecv.

Copy link
Member Author

@erezrokah erezrokah Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write works in stream mode (it's not a synchronous RPC), so if we want the correct response/error from the write request we need to call RecvMsg which is only done in CloseAndRecv (and only once actually).

So SendMsg will never return application level error as it doesn't wait for the message to be received by the server.

Technically I think we should call RecvMsg for each SendMsg per the docs:
> SendMsg does not wait until the message is received by the server. An untimely stream closure may result in lost messages. To ensure delivery, users should ensure the RPC completed successfully using RecvMsg.

We don't need ⬆️ as we don't have Bidi-streaming (I think), but still the response is received from CloseAndRecv, see https://grpc.io/docs/languages/go/generated-code/#client-streaming-methods-1

@erezrokah
Copy link
Member Author

can we add tests to that (e2e in memory tests available here https://github.com/cloudquery/plugin-sdk/tree/main/serve) ?

Sure I'll see how we can add a test for this

@erezrokah
Copy link
Member Author

@yevgenypats added a test in 03ab5d9.
Note sure how to use the memory plugin for this, as in order to reproduce the issue we need to set up both a client and a server (I guess I can create a test method that starts the memory plugin as a local binary).

Without the fix the error will just be "EOF".

Once #461 is merged the error should change from context canceled to table not exists

@erezrokah
Copy link
Member Author

@kodiakhq kodiakhq bot merged commit 5590845 into cloudquery:main Dec 27, 2022
@erezrokah erezrokah deleted the fix/stop_send_eof branch December 27, 2022 14:08
kodiakhq bot pushed a commit that referenced this pull request Dec 27, 2022
🤖 I have created a release *beep* *boop*
---


## [1.14.0](v1.13.1...v1.14.0) (2022-12-27)


### Features

* Add basic periodic metric INFO logger ([#496](#496)) ([8d1d32e](8d1d32e))


### Bug Fixes

* **destinations:** Stop writing resources when channel is closed ([#460](#460)) ([5590845](5590845))
* Don't hide errors in destination server ([#529](#529)) ([d91f94f](d91f94f))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants