-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Pod tracing #3491
[WIP] Pod tracing #3491
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@greghaynes: 20 warnings.
In response to this:
Release Note
NONE
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
pkg/tracing/config/tracing.go
Outdated
if enable, ok := cfgMap[enableKey]; ok { | ||
if enableBool, err := strconv.ParseBool(enable); err != nil { | ||
return nil, fmt.Errorf("Failed parsing tracing config %q: %v", enableKey, err) | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Golint indent: if block ends with a return statement, so drop this else and outdent its block (move short variable declaration to its own line if necessary). More info.
pkg/tracing/config/tracing.go
Outdated
if debug, ok := cfgMap[debugKey]; ok { | ||
if debugBool, err := strconv.ParseBool(debug); err != nil { | ||
return nil, fmt.Errorf("Failed parsing tracing config %q", debugKey) | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Golint indent: if block ends with a return statement, so drop this else and outdent its block (move short variable declaration to its own line if necessary). More info.
pkg/tracing/tracer.go
Outdated
} | ||
} | ||
|
||
func CreateTracer(cfg *config.Config, reporter zipkinreporter.Reporter, serviceName, hostPort string) (*zipkin.Tracer, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Golint comments: exported function CreateTracer should have comment or be unexported. More info.
pkg/tracing/http.go
Outdated
"go.uber.org/zap" | ||
) | ||
|
||
type TracerRefGetter func(context.Context) (*TracerRef, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Golint comments: exported type TracerRefGetter should have comment or be unexported. More info.
pkg/tracing/cache.go
Outdated
}, nil | ||
} | ||
|
||
type ReporterFactory func(cfg *config.Config) (zipkinreporter.Reporter, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Golint comments: exported type ReporterFactory should have comment or be unexported. More info.
pkg/tracing/cache.go
Outdated
reporterRef *refcountedReporter | ||
} | ||
|
||
func (tr *TracerRef) Ref() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Golint comments: exported method TracerRef.Ref should have comment or be unexported. More info.
pkg/tracing/cache.go
Outdated
tr.reporterRef.refs.RLock() | ||
} | ||
|
||
func (tr *TracerRef) Done() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Golint comments: exported method TracerRef.Done should have comment or be unexported. More info.
d09b9e2
to
7eef4d1
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: greghaynes If they are not already assigned, you can assign the PR to them by writing The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
06f7e61
to
d90bb9a
Compare
This span clearly shows time spent waiting for our endpoint to become ready
During cold start it is very useful to see spans on how long the k8s resources we wait for (pods) take in various states relative to our cold-start request.
The following is the coverage report on pkg/.
|
@greghaynes: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
This seems vastly out of date. Any status @greghaynes? |
I've gotten a lot of mileage out of this PR which is why I've left it (lots of testing features / fixes using this). The reason it has remained WIP though is I am not sure how to do this in a way where it isn't very fragile. Right now we are manually finding a pod and tracing it. If a pod is scaling down while the request comes in to scale up from 0 it is difficult to pick the right pod. Theres also some potential races where requests come in to activator but the pod is already scaled up. My main question is - if I were to write some more code to pick the correct pod and handle this race, is this something we'd want to maintain? I have been sort of on the fence about this - its optional code and I'd worry itll be rarely run and has a somewhat complex state machine... |
Is this approach valid at this point? We're running however many requests through the activator as we want and even potentially always keep it in path. It feels like this is generating a lot of overhead and the traces cannot be mapped one to one to the requests anyway. So I think the races you mentioned are amplified at this point. I'm not sure how we'd go about making this as robust as we'd like it to be 🤔. Is this still producing very vital information for us? |
I'm inclined to close this. It's vastly outdated and if we wanted to adopt it, I think we'd need to do some designwork around this. |
I have done a rebase fairly recently that I have been using for some performance assessment: https://github.com/nimakaviani/serving/tree/pod-tracing-redux But @greghaynes 's points on the complex state machine and whether or not we want to maintain it still hold. Lets decide on its faith in the next WG meeting. I add it to the agenda. |
Going to close this as @nimakaviani seems to have a working version himself. Feel free to reopen once we determined the future of this? /close |
@markusthoemmes: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
When performing cold start the majority of our time is spent waiting for
our pod to be created. Unfortunately, because we do not control anything
beyond talking to kube-api, we have to perform our own resource tracking
to create spans which tie in to our existing cold-start tracing.
This depends on #2726 and #3881
Release Note