Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config remote reload without rebuild the whole pipeline #3560

Conversation

buptubuntu
Copy link

@buptubuntu buptubuntu commented Jul 3, 2021

Description:
add reloadable interface for component to enable reloadService without rebuild the pipeline

Link to tracking Issue: #2374

Testing:

  • When component implements reloadable interface, the component will not be rebuild during reloadService
  • When the pipeline is changed, all components will be rebuild

@buptubuntu buptubuntu requested review from a team and tigrannajaryan July 3, 2021 11:32
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jul 3, 2021

CLA Signed

The committers are authorized under a signed CLA.

@tigrannajaryan
Copy link
Member

Please sign the CLA.

@buptubuntu
Copy link
Author

Please sign the CLA.

Done

@tigrannajaryan
Copy link
Member

@pjanotti will you be able to review this?

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@tigrannajaryan
Copy link
Member

@buptubuntu sorry for the delay, I finally was able to look at the PR.

First of all, I believe this is a very important feature, so the contribution is very welcome.

Unfortunately reviewing such a complicated and huge feature in a single PR is very difficult. There is a lot going on here and I have a hard time understanding all of it at once.

I would suggest to break down this work into smaller steps and move incrementally, submitting smaller PRs one by one. For example here are some possible breakdown (in no particular order):

  • Introduce reloadable components.
  • Implement wrappers for non-reloadable components.
  • Equality of component set and decision when to reload or no.
  • etc, there are likely several other parts that can be independent PRs.

Before submitting the individual PRs it would be good to have a sort of a design document that explains how this all is expected to work. We will need to review the design before we commit to it. Ideally it should be reviewed in Collector SIG meeting.

One quick advice on code style: for non-trivial feature like this I would expect to see a lot more comments explaining what is going on in the code and why.

A couple random design questions that I didn't see by just glancing at the code:

  • How do you decide to reload a non-reloadable component: do you stop, then start or start, then stop? I see some code in this starts then stops (e.g. exporters), but this can be wrong for exporters that use exclusive resources (e.g. pull-oriented exporters like Prometheus)
  • How are race conditions handled during reload (data flowing through pipeline while reload happens). I don't see it handled in this PR (maybe it is there and I am missing it).

@tigrannajaryan tigrannajaryan self-assigned this Jul 30, 2021
@buptubuntu
Copy link
Author

@buptubuntu sorry for the delay, I finally was able to look at the PR.

First of all, I believe this is a very important feature, so the contribution is very welcome.

Unfortunately reviewing such a complicated and huge feature in a single PR is very difficult. There is a lot going on here and I have a hard time understanding all of it at once.

I would suggest to break down this work into smaller steps and move incrementally, submitting smaller PRs one by one. For example here are some possible breakdown (in no particular order):

  • Introduce reloadable components.
  • Implement wrappers for non-reloadable components.
  • Equality of component set and decision when to reload or no.
  • etc, there are likely several other parts that can be independent PRs.

Before submitting the individual PRs it would be good to have a sort of a design document that explains how this all is expected to work. We will need to review the design before we commit to it. Ideally it should be reviewed in Collector SIG meeting.

One quick advice on code style: for non-trivial feature like this I would expect to see a lot more comments explaining what is going on in the code and why.

A couple random design questions that I didn't see by just glancing at the code:

  • How do you decide to reload a non-reloadable component: do you stop, then start or start, then stop? I see some code in this starts then stops (e.g. exporters), but this can be wrong for exporters that use exclusive resources (e.g. pull-oriented exporters like Prometheus)
  • How are race conditions handled during reload (data flowing through pipeline while reload happens). I don't see it handled in this PR (maybe it is there and I am missing it).

Thanks for reply

Agree to break down prs, suggestion is quite reasonable

  • Introduce reloadable components.
  • Implement wrappers for non-reloadable components
  • Equality of component set and decision when to reload or no.

The three above I think is the what we can do now.

And after the two above is finished, we can implement the reload logic is two situation:

  • If the components in the pipeline is not changed and all the component is reloadable, we reload the pipeline by calling the reload method
  • If not the situation above, just rebuild the whole pipeline

For the two questions:

  • decide to reload a non-reloadable component
   I didn't take pull-oriented exporters like Prometheus into consideration,and for receivers i just stop and start considering the receivers may use the same port to serve like otel receiver
  • How are race conditions handled during reload
   for reloadable component, the race conditions is handle by the component in the reload method,for no-reloadable processors, I will first create a new processor and start it, then make the new processor to handle new data, then shutdown the old processor.

Above all, I think the three prs below is what I will work on recently:

  • Introduce reloadable components.
  • Implement wrappers for non-reloadable components
  • Equality of component set and decision when to reload or no.

Meanwhile I will complete the design ducoment

Thanks!

@tigrannajaryan
Copy link
Member

I didn't take pull-oriented exporters like Prometheus into consideration,and for receivers i just stop and start considering the receivers may use the same port to serve like otel receiver

So, perhaps the behavior for exporters should be component-specific or maybe we always first stop then start to be safe. This needs to be called out in the design.

for reloadable component, the race conditions is handle by the component in the reload method,for no-reloadable processors, I will first create a new processor and start it, then make the new processor to handle new data, then shutdown the old processor.

That makes sense. However for wrapped components I think the wrapper needs to take care of this and I don't think I see this in the code at the moment. During reload the replacement of the reference to the wrapped component I think currently races with the ConsumeXXX call through that reference.

Meanwhile I will complete the design ducoment

@buptubuntu great!
I highly advise to first put together a design document, present it in the Collector SIG, gather consensus with Collector maintainers and approvers and only after that spend more time on writing code. This way if any maintainers/approvers suggest any changes to the design you will have less work to redo.

And thank you again for working on this. This is an important and non-trivial capability, so it will take some time to get right.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

github-actions bot commented Sep 3, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Sep 3, 2021
@bogdandrutu bogdandrutu removed the Stale label Sep 3, 2021
@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@tigrannajaryan
Copy link
Member

@buptubuntu I am closing this while awaiting for a design document. Can be reopened when it is ready for review after the design is presented and approved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants