Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extend telemetry client to pass the context #165

Merged
merged 5 commits into from
Jan 17, 2017

Conversation

SergeyKanzhelev
Copy link

@SergeyKanzhelev SergeyKanzhelev commented Jan 5, 2017

@KamilSzostak @jasongin @AlexBulankou what do you think of this change? The same way we can extend the incoming request tracking.

The biggest problem with the proposed is that we will hold on to request object for quite some time. Before - we would just read the properties we need from it and then update those properties with response information. After my change - we will need to keep the request object till the call to track

appInsights.client.addTelemetryProcessor((envelope, context) => {
    if (envelope.data.baseType === "Microsoft.ApplicationInsights.RemoteDependencyData") {
        var reqOptions = context.requestOptions;
        var customValue = reqOptions && reqOptions.headers && reqOptions.headers["my-header"];
        if (id !== undefined) {
            envelope.data.baseData.properties["my-header"] = customValue;
        }
    }
    return true;
});

@@ -234,10 +234,11 @@ class Client {
*/
public track(
data: ContractsModule.Contracts.Data<ContractsModule.Contracts.Domain>,
tagOverrides?: { [key: string]: string; }) {
tagOverrides?: { [key: string]: string; },
contextObject?: any) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make contextObject should be IContextProperties strongly types interface that we can add to typed definitions. The way it is, the signature of this object is unclear, so it is not very useful.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the point is to have this object weakly typed so we can pass http.ClientRequest from one auto-collection logic and http.ServerRequest from another. If strongly typed - what properties will we have?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I share Alex's concern. The type of the context needs to be documented in some form to enable users to write code against it with some assurance that the context type won't change in a breaking way. Maybe union with any to keep it flexible, like: ITelemetryContext | any ?

interface ITelemetryContext { ... }
interface ITelemetryRequestContext extends ITelemetryContext {
  request: http.ClientRequest;
  response: ...
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you looked at correlation library that will keep the context across callbacks and promises? Do you think we will need to have correlation context provided by that library passed to processors as well? Maybe ITelemetryContext can have a field for that context object?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked extensively into async context propagation libraries, for potential use for request correlation in this SDK among other things. Unfortunately there are no good solutions available now or in the near future. I think if we were to try to implement it in the SDK it would result in too many issues and a big support burden, to the extent that it wouldn't be worthwhile. Basically there is no way to make it reliable unless/until it is a JavaScript standard. My team is pursuing that angle, but that's a very long-term thing.

In this case, we are able to explicitly pass a context object, as you have done, because there is no user code in the stack between track() and the calls to the processors, so there's no need to rely on a library to implicitly forward the context.

@jasongin
Copy link
Member

jasongin commented Jan 5, 2017

I don't understand the motivation for this change. What scenario is it attempting to address?

A telemetry processor is intended to process a telemetry envelope containing data that was already extracted from the request (or from whatever the caller of track() used as the source of the data). That is to allow extensions to be able to modify or filter the telemetry records before they are uploaded. While it's conceivable that the original request object could be used by a telemetry processor, that seems to be the wrong place for that kind of work, sort of a violation of the intended layers of abstraction.

@SergeyKanzhelev
Copy link
Author

SergeyKanzhelev commented Jan 5, 2017

@jasongin the scenario I'm trying to achieve it to be able to extend the automatic data collection. Let's take an example. You want to mark server requests that have a certain header as "synthetic" traffic so this data will not be incrementing the number of user sessions in our UI.

Today you have no way to do it except writing your own data collector by forking our code.

In .NET we solve it by asking customers to use thing like HttpContext.Current to get access to those headers. Since there is no such thing as current in node.js - this approach provides a nice way to get it.

From layering perspective - in .NET we are using two separate concepts - telemetry initializer and telemetry processor. Processors run after all initializers populated the necessary data. Would it be better to introduce a concept of initializer in node.js SDK as well? Will this address your concern?

@jasongin
Copy link
Member

jasongin commented Jan 5, 2017

Thanks for the explanation. I guess I don't really see a better approach than what you've done here. Introducing telemetry initializers would be a big change that doesn't seem justified.

@SergeyKanzhelev
Copy link
Author

do you see a big problem with keeping the request object and passing it to the on(response) callback?

@jasongin
Copy link
Member

jasongin commented Jan 5, 2017

No, I can't think of any problems with holding on to the request until the response is processed.

@SergeyKanzhelev
Copy link
Author

@AlexBulankou @jasongin is this strong typing for context object better?

@SergeyKanzhelev
Copy link
Author

SergeyKanzhelev commented Jan 6, 2017

@jasongin - for the requests to dependencies correlation and this context object implementation I was thinking of something like this:

http.createServer(function (req, res) {
  var ctx = ai.getCorrelationContext(req);
  ai.client.trackEvent("eventName", ctx);
  something.on(()=>{    ai.client.trackEvent("eventNameFromCallback", ctx); } );
}).listen(...

So strongly typed context may be extended to contain the correlation information like request ID and such...

The question would be what's the best way to extend http auto collection to get this correlation data. Is something like this possible?

http.createServer(function (req, res) {
  var ctx = ai.getCorrelationContext(req);
  http.get(
    { host: "finance.google.com", 
      path: path + stock, 
      headers,
      //I doubt it, but what are my alternatives to set the context on http.ClientRequest object?
      correlationContext: ctx
 }, function (response) { });
}).listen(...

Or any other way to set context to clientRequest object so our auto-collection will read it?

Alternatively - is there a "local storage" that works only during the current "task" execution.

http.createServer(function (req, res) {
  var ctx = ai.getCorrelationContext(req);
  ai.setContextOnCurrentTask(ctx);
  // dependency magically correlated since it was started from this task
  http.get(
    { host: "finance.google.com", 
      path: path + stock, 
      headers,
 }, function (response) { 
        ai.setContextOnCurrentTask(ctx);
        //do something and it will be correlated...
        ai.client.trackEvent("event");
    });
}).listen(...

@jasongin
Copy link
Member

jasongin commented Jan 6, 2017

The strong typing looks fine, except I would drop the word "Object" from the ServerRequestContextObject and ClientRequestContextObject class names. I think the "Object" suffix is unnecessarily verbose.

For the correlation, there is simply no solution that will work for context propagation in all cases. Certainly for simple cases the app developer can pass the context explicitly as in your snippet above. But the problem is an outgoing request may be deep in an async call chain and/or in library code that the application developer cannot modify. Existing solutions such as continuation local storage, zones, and similar attempts at async context propagation all require modification (either via source code changes or monkey-patching) of any libraries that may do any kind of async queueing or callbacks, and there is no way that modification can be done 100% reliably and automatically for all possible libraries that an application might use.

@SergeyKanzhelev
Copy link
Author

@jasongin for the correlation I think my question is whether it makes sense to extend the .track API to allow manual correlation. I can change Context class to something like:

Context {
  id: //operation ID
  parentId: // ID of immediate parent
  operationName: //Name of the containing operation
   ...
  contextObject: ServerRequestContextObject | ClientRequestContextObject | any
}

and make track method extract data from the context.

This will fill the gap we have today of not being able to correlate telemetry even manually. Later if we will use any of technique you mentioned - it will just pre-populate this context or take if from some other place...

What do you think? Do you feel there will be a mix-up of concepts between the context and contextObject? How will I pass it to ClientRequest autocollection logic?

@jasongin
Copy link
Member

jasongin commented Jan 7, 2017

Certainly something like that is possible for manual correlation, assuming the different kinds of "context" are clarified or renamed to make it less confusing. I don't have a specific suggestion right now.

I had been thinking manual correlation would be too limited in practice because many outgoing requests would be made via 3rd-party libraries which would not preserve the context. Do you think it's still a valuable feature?

@SergeyKanzhelev
Copy link
Author

@jasongin I think we will need to have correlation helper functions for simple scenarios. Some of them we have internally when we simply need to correlate error with the request. Thinking about it I'm not sure it should be a part of ContextObject class.

@AlexBulankou @KamilSzostak opinion on this PR? I'd like to have it in 0.18

@SergeyKanzhelev
Copy link
Author

Discussed with @AlexBulankou offline and he gave a feedback that it will be easier to check the context objects collection if it is a dictionary with the names equal to types.

@SergeyKanzhelev SergeyKanzhelev merged commit c324154 into develop Jan 17, 2017
@SergeyKanzhelev SergeyKanzhelev deleted the sergkanz/contextMagic branch January 17, 2017 17:45
@KamilSzostak KamilSzostak added this to the v0.18.0 milestone Jan 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants