Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[http] API changes to split path header normalization and forwarding #15044

Merged
merged 21 commits into from
Mar 5, 2021

Conversation

asraa
Copy link
Contributor

@asraa asraa commented Feb 12, 2021

Signed-off-by: Asra Ali asraa@google.com

Commit Message: Configuration for path normalization so that normalization can be configured for internal only use and/or forwarding.
Additional Description: Just API changes. normalize_path and merge_slashes to be deprecated.

There will be normalizations Envoy applies to an incoming path header (forwarding_transformation) that "permanently" transform an incoming path header and affect the forwarded path header. Internal transformations (filter_transformations) apply only within Envoy processing. If an intermediate filter changes the internal path header, we don't restore the header.

image

The plan:

  • Make ConnManagerUtility and Router friend classes with HeaderMap
  • Create private HeaderMap methods: setNormalizedPath that ConnManagerUtility calls prior to filter processing and routing that sets private members original_path_ and normalized_path_. Then restoreOriginalPathIfUnchanged called by Router if we are doing internal-only normalization that restores :path with the original header only filters didn't change the path.
  • No HTTP filters have access to the original path for forwarding to prevent misuse.

Risk Level: Low, just API
Release notes: TODO on implementation
Runtime Guard: TODO on implementation. Runtime guard to overwrite forwarding http_connection_manager.forward_normalized_path (can revert right now)
Starts: #6589

Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
API shepherd assignee is @mattklein123
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #15044 was opened by asraa.

see: more, trace.

Copy link
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

// Normalizations apply internally before any processing of requests by HTTP
// filters, routing, and matching, *and* will affect the upstream *:path* header
// as well.
FORWARD = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should FORWARD be the default behavior?
I'm not sure what the intended default should be, and in this case it might make more sense to add a new value POLICY_UNSPECIFIED = 0 if the policy should be explicit, and increase the values of FORWARD and INTERNAL by 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just saw in the doc update that FORWARD is the intended default behavior.

// Determines if adjacent slashes in the path are merged into one before any
// processing of requests by HTTP filters or routing. Without setting this
// option, incoming requests with path `//dir///file` will not match against
// route with `prefix` match set to `/dir`. Defaults to `false`. Note that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the API_STYLE guide, the wrapped scalar type should be used in cases that the default value isn't compatible with proto3 default (false in this case). Is there an explicit reason for using the wrapper rather than bool?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows us to change defaults in future..

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for tackling this important feature!

// Transformations that apply to path headers. The policy to apply these
// transformations can be tuned by the configuration field. Transformations
// can be applied internally only, so that the normalized header is used in
// matching, routing, and HTTP filters while the original path header is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about observability (logging, tracing)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on the context. Only the normalized path is visible internally so if a filter logs headers, they will see the normalized header. If we finalize spans before forwarding to an upstream, we see the internal header, otherwise we see the forwarded header. This can obviously be changed in the code.

I'm not really sure what the right answer here is. If it should always be the "forwarded header" or always the original header, then we lose out on observability in some context. I think we either have to keep the logging context dependent as is, OR we allow HttpTracerUtility visibility into the original header (and for logs, I don't know -- as soon as we open the original header to logs, we open it up to filters).

e.g. if you log the request headers directly prior to forwarding, you will see the to-be-forwarded header, like here

ENVOY_STREAM_LOG(debug, "router decoding headers:\n{}", *callbacks_, headers);
. before that, you see the internal header.

// Defines a policy to apply normalization. Path rewrites :ref:`regex_rewrite
// <envoy_api_field_route.RouteAction.regex_rewrite>` or :ref:`prefix_rewrite
// <envoy_api_field_route.RouteAction.prefix_rewrite>` will apply to the
// *:path* header destined for the upstream.
Copy link
Member

@htuch htuch Feb 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why only mention these two? I.e. are there other path transforms in router we should consider? What happens when a filter tries to modify the path and it's only interacting with the normalized path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are the only router transformations, but that may change in the future so I will change the language.

I mentioned that in this comment, it's only relevant for INTERNAL:

// Normalizations only apply internally. The original :path header will
// be sent upstream unless the path has been mutated by a filter.

// Determines if adjacent slashes in the path are merged into one before any
// processing of requests by HTTP filters or routing. Without setting this
// option, incoming requests with path `//dir///file` will not match against
// route with `prefix` match set to `/dir`. Defaults to `false`. Note that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows us to change defaults in future..

Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
adisuissa
adisuissa previously approved these changes Feb 17, 2021
Copy link
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for working on this!

Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
@asraa
Copy link
Contributor Author

asraa commented Feb 19, 2021

/cc @chaoqin-li1123

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks just a few API questions.

/wait

Comment on lines 276 to 294
// Should paths be normalized according to RFC 3986 before any processing of
// requests by HTTP filters or routing? Defaults to FORWARD (normalization
// applies internally and affects upstream *:path* header). When not
// specified, this value may be overridden by the runtime variable
// :ref:`http_connection_manager.normalize_path<config_http_conn_man_runtime_normalize_path>`. For
// paths that fail this check, Envoy will respond with 400 to paths that are
// malformed. See `Normalization and Comparison
// <https://tools.ietf.org/html/rfc3986#section-6>`_ for details of
// normalization. Note that Envoy does not perform `case normalization
// <https://tools.ietf.org/html/rfc3986#section-6.2.2.1>`_
NormalizationPolicy normalize_path = 1;

// Determines if adjacent slashes in the path are merged into one before any
// processing of requests by HTTP filters or routing. Defaults to
// DISABLE. Without setting this option, incoming requests with path
// `//dir///file` will not match against route with `prefix` match set to
// `/dir`. Note that slash merging is not part of `HTTP spec
// <https://tools.ietf.org/html/rfc3986>`_ and is provided for convenience.
NormalizationPolicy merge_slashes = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be getting into the implementation weeds, but do you envision any issues if someone selects INTERNAL for one and FORWARD for another? Is it going to lead to complicated state tracking?

Copy link
Contributor Author

@asraa asraa Feb 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's open to do so.

If the HeaderMap privately stores the original path, then setting the FORWARD header could be independent of the internal transformed path. (However, we do have to store the original internal header too, to make sure we don't override the path if a filter changed it)

HeaderMap:

  • Path: visible path. during internal processing, this is INTERNAL(FORWARD(path)).
  • private: original_path: store the original path (used for access logging)
  • private: forward_path: FORWARD(path)
  • private: internal_path: INTERNAL(FORWARD(path)); the original internal path set by the HCM. the router uses this to check if the path was mutated by a filter. if so, we forward the mutated path. otherwise, we forward forward_path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment about internal_transformations:

    // These will be applied after :ref:`full_transformation
    // <envoy_api_field_extensions.filters.network.http_connection_manager.v3.HttpConnectionManager.PathNormalizationOptions.full_transformation`
    // is applied. The *:path* header before this transformation will be restored in the router
    // filter and sent upstream unless it was mutated by a filter```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chaoqin-li1123 suggested that we should clarify this too. full_transformation (previously called FORWARD) should be though of as incoming header normalization that mutates the header before processing and affects the forwarded.

internal ones only exist temporarily in Envoy during processing. we forward the path with the incoming header normalization.

I am happy to make the name change from full_transformation to incoming_transformation if that's clearer.

Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
Copy link
Contributor

@yanavlasov yanavlasov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome changes.
/wait

@asraa
Copy link
Contributor Author

asraa commented Feb 26, 2021

@mattklein123 could you please take another look? Controls have individual messages, hopefully the internal/full transformation interaction is more clear.

Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks at a high level this LGTM with some small remaining comments.

/wait

Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. A few bike shed comments.

Signed-off-by: Asra Ali <asraa@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks LGTM with one small comment.

/wait

Signed-off-by: Asra Ali <asraa@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants