Communication with New Relic

All communication with New Relic MUST take place via the public telemetry ingest APIs. These APIs all share a common JSON format to provide a consistent experience across data types. SDK implementations MUST adhere to this common format when sending data to New Relic.

Request format

The SDK MUST use the Telemetry ingest APIs to send data to New Relic. The SDK sends all telemetry of a given type to the appropriate telemetry ingest endpoint.

- SDKs MUST compress the JSON payload with gzip encoding by default.
- Send API keys only as headers, never as query parameters.
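The two request rules above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `build_request` is hypothetical, and the `Api-Key` header name is an assumption that should be checked against the documentation of the ingest endpoint in use.

```python
import gzip
import json


def build_request(payload, api_key):
    """Return (headers, body) for a gzip-compressed JSON POST."""
    # Serialize to JSON first, then gzip the encoded bytes.
    body = gzip.compress(json.dumps(payload).encode("utf-8"))
    headers = {
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",
        # Assumed header name. The key travels in a header, never in the URL.
        "Api-Key": api_key,
    }
    return headers, body
```

Keeping the key out of the URL means it does not end up in access logs or proxies that record query strings.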

Request ID header

When communicating with data ingest services, there are 3 possible outcomes of the HTTP call:

1. OK status (200 <= status_code < 300), indicating data has been received and persisted.
2. non-OK status, indicating data has not been persisted.
3. disconnect (either client or server). In this case, the connection is closed prior to receiving a status indication.

In case (3) above, the client should only retry the request if the request is idempotent since data may or may not have been persisted (and thus the data may get recorded twice, resulting in the data aggregates being inaccurate).

To prevent data loss while allowing clients to retransmit in the case of transient failures, the ingest service must be able to identify duplicate requests; therefore, all SDKs MUST send the following HTTP header with the request:

| Header Name | Header Value | Code Example |
| --- | --- | --- |
| x-request-id | A version 4 UUID string | str(uuid.uuid4()) |

NOTE: The request ID should be generated before the first attempt to send the request, and the same value should be used for any retries that transmit the same payload. If the SDK partitions the payload in response to a 413 status code, a unique request ID should be used for the transmission of each partition.
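A minimal sketch of the request ID rule, assuming a hypothetical `send(payload, headers)` callable that returns an HTTP status code or raises `ConnectionError` on a disconnect. Retry delays and handling of non-OK status codes are deliberately left out to keep the focus on when the ID is generated.

```python
import uuid


def send_with_retries(payload, send, max_attempts=3):
    """Send payload, reusing one x-request-id across every attempt."""
    # Generated once, before the first attempt, and never regenerated.
    headers = {"x-request-id": str(uuid.uuid4())}
    for _ in range(max_attempts):
        try:
            status = send(payload, headers)
        except ConnectionError:
            # Outcome unknown: retry with the same ID so the ingest
            # service can recognize a duplicate.
            continue
        # Non-OK statuses are simply reported as failure in this sketch.
        return 200 <= status < 300
    return False
```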

User Agent

The User-Agent header field is used to perform analytics on requests received by New Relic. In order to enable these analytics, all SDKs MUST include a User-Agent header in requests they make to New Relic. In addition to conforming to the specification defined in RFC 7231, the User-Agent header MUST include an SDK product identifier as its first entry.

User-Agent  = sdk-id *( RWS ( product / comment ) )
sdk-id      = sdk-name "/" sdk-version
sdk-name    = "NewRelic-" language "-TelemetrySDK"
sdk-version = token

The language portion of the sdk-name is the programming language the SDK is written for, and the sdk-version is the version of the SDK. The rest of this syntax (RWS, product, comment, and token) uses the meanings defined in RFC 7231 and RFC 7230.
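As an illustration of the grammar above, the sdk-id could be built and checked like this. The helper name `sdk_id` is hypothetical. The character class follows the `token` rule of RFC 7230; applying it to the language as well is an assumption based on the sdk-name having to be a valid product token.

```python
import re

# RFC 7230 token: one or more "tchar" characters.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def sdk_id(language, version):
    """Build the leading product identifier of the User-Agent header."""
    if not _TOKEN.fullmatch(language) or not _TOKEN.fullmatch(version):
        raise ValueError("language and version must be RFC 7230 tokens")
    return "NewRelic-{}-TelemetrySDK/{}".format(language, version)
```

For example, `sdk_id("Python", "0.1.0")` returns `NewRelic-Python-TelemetrySDK/0.1.0`.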

Extending User Agent with Exporter Product

Which exporter was used to export data is also an important dimension for analytics. Exporters that use the SDK need to be able to append a product identifier of their own to the User-Agent header. Therefore, all SDKs MUST provide a method to extend the User-Agent header field-value. This method SHOULD accept the exporter-determined product identifier as an argument. The exact form and the validity of this product identifier SHOULD be left to the exporter to determine.

An example of this User-Agent mutation functionality might look like the following.

class SDK(object):
    _user_agent = "NewRelic-Python-TelemetrySDK/0.1.0"

    def add_user_agent(self, product, product_version=None):
        """Add product to the User-Agent header field"""
        if product_version:
            product += "/{}".format(product_version)

        self._user_agent += " {}".format(product)
    ...

Then, when this SDK is used to build a NewRelic-Python-OpenCensus/0.2.1 exporter, the User-Agent header sent in a request would look like the following.

User-Agent: NewRelic-Python-TelemetrySDK/0.1.0 NewRelic-Python-OpenCensus/0.2.1

Payload

Payloads of different telemetry types cannot be combined.

All JSON payloads sent to New Relic MUST use the New Relic common format. This is an example of the common format:

[
  {
    "common": {
      <intrinsic attributes>,
      "attributes" : {
          <custom attributes>
        }
    },
    "<spans|logs|metrics|events>" : [
      {
        <intrinsic attributes>,
        "timestamp": 1522434601409,
        "attributes" : {
          <custom attributes>
        }
      },
      {
        <intrinsic attributes>,
        "timestamp": 1522434601409,
        "attributes" : {
          <custom attributes>
        }
      } ]
  }
]

SDK implementations SHOULD use the top-level common block to reduce the size of repeated attributes in payloads when applicable.
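One way to use the common block is to hoist custom attributes that are identical on every data point. The sketch below assumes that the ingest service applies common attributes to every item in the payload; the function name `hoist_common` is hypothetical and intrinsic attributes are left untouched.

```python
def hoist_common(data_type, items):
    """Move custom attributes shared by every item into a common block."""
    if not items:
        return [{data_type: []}]
    # Start from the first item's attributes, keep only pairs found everywhere.
    shared = dict(items[0].get("attributes", {}))
    for item in items[1:]:
        attrs = item.get("attributes", {})
        shared = {k: v for k, v in shared.items()
                  if k in attrs and attrs[k] == v}
    slimmed = []
    for item in items:
        copy = dict(item)  # shallow copy so the caller's data is not mutated
        attrs = {k: v for k, v in item.get("attributes", {}).items()
                 if k not in shared}
        if attrs:
            copy["attributes"] = attrs
        else:
            copy.pop("attributes", None)
        slimmed.append(copy)
    entry = {data_type: slimmed}
    if shared:
        entry["common"] = {"attributes": shared}
    return [entry]
```

The saving grows with the number of data points, since each shared attribute is serialized once instead of once per item.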

Response codes

The telemetry ingest API validates the basic shape of the request without looking at the POST body. Its responses are documented here.

SDK implementations must perform response code error handling in the Telemetry API as documented below. The telemetry API should provide a mechanism for the consumer of this API to be notified of (or to react to) any error conditions that may occur, rather than hiding all errors from the user.

| Response code | Description | Log error | Retry behavior | Drop data | Other |
| --- | --- | --- | --- | --- | --- |
| `200 - 299` | Successful request | | | | |
| `400` | Generally invalid request | once | no | yes | See: dropping data. |
| `401` | Unauthorized | once | no | yes | See: dropping data. |
| `403` | Authentication failure | once | no | yes | See: dropping data. |
| `404` | Incorrect path | once | no | yes | See: dropping data. |
| `405` | Incorrect HTTP method (`POST` required) | once | no | yes | See: dropping data. Should never occur in the Telemetry SDK but should still be handled. |
| `408` | Request timeout | each failure | yes | not yet | |
| `409` | Conflict | once | no | yes | See: dropping data. |
| `410` | Gone | once | no | yes | See: dropping data. |
| `411` | Missing `Content-Length` header | once | no | yes | See: dropping data. Should never occur in the Telemetry SDK but should still be handled. |
| `413` | Payload too large (`1 MB` limit) | each failure | `split` data and retry | no | See: splitting data. |
| `429` | Too many requests | each failure | Retry based on `Retry-After` response header | no | `Retry-After` (`integer`) specifies how long to wait, in `seconds`, before the next retry. |
| `Anything else` | Unknown | each failure | Retry with backoff | not yet | See: graceful degradation. |
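The table above can be condensed into a small dispatch function. This is only a sketch: the function name `classify_response` and the action labels it returns are hypothetical, and logging is omitted.

```python
# Status codes for which the data is dropped without retrying.
DROP_CODES = frozenset([400, 401, 403, 404, 405, 409, 410, 411])


def classify_response(status_code):
    """Map an HTTP status code to the action listed in the table."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code in DROP_CODES:
        return "drop"
    if status_code == 413:
        return "split"          # split the payload and retry each part
    if status_code == 429:
        return "retry_after"    # wait for the Retry-After interval
    # 408 and anything unrecognized: retry with backoff, and drop the
    # data only once the retries are exhausted.
    return "retry_backoff"
```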

Graceful degradation

The SDK may be unable to communicate with New Relic for a variety of reasons including network outages, misconfiguration or service outages. Telemetry SDKs must provide facilities to gracefully handle these failure cases or allow the consumer to handle them as they see fit. The SDKs must also provide functionality to make a request with no response handling or retrying.

The recommended handling of failed requests to the ingest API is to retry the request at increasing intervals and to eventually drop data if the request cannot be completed.

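The amount of time to wait before a retry can be computed using this logic, where number_of_retries is the number of retries already attempted (the first retry is sent immediately):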

MIN(backoff_max, backoff_factor * (2 ^ (number_of_retries - 1)))

For a backoff factor of 1 second, and a backoff max of 16 seconds, the retry delay interval should follow a pattern of [0, 1, 2, 4, 8, 16, 16, ...]. Once the delay reaches the backoff max, each subsequent retry should wait 16 seconds until the request has been retried the configured maximum number of times.
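The delay computation can be sketched as below. The function name `retry_delay` is hypothetical, and treating the first retry as immediate is an interpretation taken from the leading 0 in the documented pattern, since the formula alone does not yield 0.

```python
def retry_delay(retries_so_far, backoff_factor=1, backoff_max=16):
    """Seconds to wait before the next retry.

    retries_so_far is the number of retries already attempted.
    """
    if retries_so_far == 0:
        return 0  # assumption: the first retry is sent immediately
    return min(backoff_max, backoff_factor * 2 ** (retries_so_far - 1))
```

With the default arguments, `[retry_delay(n) for n in range(7)]` evaluates to `[0, 1, 2, 4, 8, 16, 16]`.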

The total retry duration can be computed from the combination of backoff factor, backoff max, and max retries. SDKs may provide a function to configure retry behavior by specifying the total retry duration instead of max retries.
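A sketch of how the two configuration styles could be converted into each other. All function names here are hypothetical, and the immediate first retry is again an assumption taken from the documented delay pattern.

```python
def _delay(n, backoff_factor, backoff_max):
    # n is the number of retries already attempted; the first is immediate.
    return 0 if n == 0 else min(backoff_max, backoff_factor * 2 ** (n - 1))


def total_retry_duration(backoff_factor, backoff_max, max_retries):
    """Sum of all delays for the given retry configuration, in seconds."""
    return sum(_delay(n, backoff_factor, backoff_max)
               for n in range(max_retries))


def retries_for_duration(backoff_factor, backoff_max, total_duration):
    """Largest retry count whose cumulative delay fits in total_duration."""
    if backoff_factor <= 0 or backoff_max <= 0:
        raise ValueError("backoff_factor and backoff_max must be positive")
    retries, elapsed = 0, 0
    while True:
        delay = _delay(retries, backoff_factor, backoff_max)
        if elapsed + delay > total_duration:
            return retries
        elapsed += delay
        retries += 1
```

With a backoff factor of 5 seconds, a backoff max of 80 seconds, and 8 retries, the delays sum to 315 seconds, and a total duration of 315 seconds maps back to 8 retries.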

Backoff example:

Backoff factor = 5 seconds
Backoff max = 80 seconds
Max retries = 8
Backoff sequence = [0, 5, 10, 20, 40, 80, 80, 80]

The telemetry SDK attempts to send a payload at t=13:00:00 and receives a 500 response.
The telemetry SDK attempts to send again at:
- +0 : 13:00:00
- +5 : 13:00:05
- +10 : 13:00:15
- +20 : 13:00:35
- +40 : 13:01:15
- +80 : 13:02:35
- +80 : 13:03:55
- +80 : 13:05:15
- Max retries exceeded. The data in this request should be dropped. See dropping data.

Dropping data

Whenever dropping data, the SDK must emit an error-level log statement indicating the number of data points dropped.

SDKs should not attempt to merge a failed payload with the rest of the data being stored by the SDK.

SDKs may provide functionality for users to provide their own handler for dropped data, so that a user of the SDK may merge unsent data back into their own data collector in the way that makes sense for their use case.
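A minimal sketch of the drop path, assuming a hypothetical `drop_data` helper and an optional user-supplied `on_drop` callback. The logger name is also an assumption.

```python
import logging

logger = logging.getLogger("telemetry_sdk")  # hypothetical logger name


def drop_data(items, on_drop=None):
    """Log how many data points are dropped, then notify the handler."""
    logger.error("Dropping %d data points", len(items))
    if on_drop is not None:
        # The user decides what to do with the unsent data, for example
        # merging it back into their own collector.
        on_drop(items)
```

The SDK itself does not merge the dropped items back into its own buffer; that decision is left to the handler.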

Splitting data

The New Relic ingest API may return an HTTP 413 (payload too large). The SDK must ensure that data that is or would be rejected due to payload size is successfully sent to New Relic.

Some strategies include:

- Preemptively splitting large payloads.
- Splitting and retrying requests in response to an HTTP 413.

If a request results in an HTTP 413, and the payload of that request cannot be split, the SDK should drop the data. See dropping data.
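The split-and-retry strategy could look like the sketch below. It assumes a hypothetical `send(items, request_id)` callable that returns an HTTP status code, and it halves the batch on every 413. Other status codes are out of scope here. Each partition is sent with its own request ID.

```python
import uuid


def send_splitting(items, send):
    """Send items, halving the batch on HTTP 413.

    Returns the items that had to be dropped because they could not be
    split any further.
    """
    request_id = str(uuid.uuid4())  # a fresh ID for every partition
    status = send(items, request_id)
    if status != 413:
        return []  # handling of other status codes is not shown
    if len(items) <= 1:
        return list(items)  # a single item cannot be split: drop it
    middle = len(items) // 2
    return (send_splitting(items[:middle], send)
            + send_splitting(items[middle:], send))
```

The returned list can then go through the SDK's normal dropped-data handling.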
