
cluster manager init takes minutes due to slow DNS and STRICT_DNS behaviour #14670

Open · ppadevski opened this issue Jan 12, 2021 · 7 comments
Labels: area/cluster_manager, area/dns, enhancement, help wanted


ppadevski commented Jan 12, 2021

I noticed recently that cluster manager init can take quite some time. For example, here it takes 3+ minutes:

2020-11-10T16:53:13.792Z info envoy[44013] [Originator@6876 sub=upstream] cds: add 109 cluster(s), remove 1 cluster(s)
...
2020-11-10T16:53:13.943Z info envoy[44013] [Originator@6876 sub=upstream] cds: add/update cluster '/'
2020-11-10T16:56:58.923Z info envoy[44013] [Originator@6876 sub=upstream] cm init: all clusters initialized

After debugging the issue, it turned out that a STRICT_DNS cluster is only considered initialised once its very first DNS resolution completes, whether with success or failure.

This is a problem for me because during those 3+ minutes Envoy wasn't working at all - it was unable to get its listeners and routes because it was stuck in CDS. I only have a few STRICT_DNS clusters; most of my other clusters (100+) are STATIC, but they were unusable due to the missing listeners and routes.

Note that this only happens when DNS is (very) slow. I tried reproducing the issue with iptables+DROP rules but was unable to, because in that case c-ares returns failure immediately. The tc setup below is the only way I was able to slow down DNS enough to reproduce the issue (10.10.10.10 is my DNS server):

tc qdisc add dev eth0 root handle 1: prio
tc qdisc add dev eth0 parent 1:3 handle 30: tbf rate 20kbit buffer 1600 limit 3000
tc qdisc add dev eth0 parent 30:1 handle 31: netem delay 20000ms 10ms distribution normal
tc filter add dev eth0 protocol ip parent 1:0 prio 3 u32 match ip dst 10.10.10.10/32 flowid 1:3

After the initial initialisation all is good, as the DNS responses are cached and deduplicated. However, my apps and services tend to restart from time to time, and having a 3+ minute gap whenever DNS is slow is bad - and DNS does sometimes turn out to be slow when running a cloud service.

I ended up implementing the following solution:

--- a/source/common/upstream/strict_dns_cluster.cc
+++ b/source/common/upstream/strict_dns_cluster.cc
@@ -59,6 +59,8 @@ void StrictDnsClusterImpl::startPreInit() {
   // resolved in failure.
   if (resolve_targets_.empty()) {
     onPreInitComplete();
+  } else if (info_->addedViaApi()) {
+    onPreInitComplete(); // see StrictDnsWarmupFilter and PR 2673888
   }
 }
--- a/source/extensions/filters/http/common/BUILD
+++ b/source/extensions/filters/http/common/BUILD
@@ -42,6 +42,17 @@ envoy_cc_library(
 )

 envoy_cc_library(
+    name = "strict_dns_warmup_filter_lib",
+    srcs = ["strict_dns_warmup_filter.cc"],
+    hdrs = ["strict_dns_warmup_filter.h"],
+    visibility = ["//visibility:public"],
+    deps = [
+        "//include/envoy/upstream:cluster_manager_interface",
+        "//source/extensions/filters/http/common:pass_through_filter_lib",
+    ],
+)
+
+envoy_cc_library(
     name = "utility_lib",
     hdrs = ["utility.h"],
     # Used by the router filter.  TODO(#9953) clean up.
--- /dev/null
+++ b/source/extensions/filters/http/common/strict_dns_warmup_filter.cc
@@ -0,0 +1,65 @@
+#include "extensions/filters/http/common/strict_dns_warmup_filter.h"
+
+namespace Envoy {
+namespace Http {
+
+Http::FilterHeadersStatus StrictDnsWarmupFilter::decodeHeaders(
+    Http::RequestHeaderMap& /* headers */, bool /* end_stream */) {
+  auto route = decoder_callbacks_->route();
+
+  if (route == nullptr) {
+    return Http::FilterHeadersStatus::Continue;
+  }
+
+  const Router::RouteEntry* entry = route->routeEntry();
+
+  if (entry == nullptr) {
+    return Http::FilterHeadersStatus::Continue;
+  }
+
+  Upstream::ThreadLocalCluster* cluster = cm_.get(entry->clusterName());
+
+  if (cluster == nullptr) {
+    return Http::FilterHeadersStatus::Continue;
+  }
+
+  if (cluster->info()->type() != envoy::config::cluster::v3::Cluster::STRICT_DNS) {
+    return Http::FilterHeadersStatus::Continue;
+  }
+
+  const auto& host_sets = cluster->prioritySet().hostSetsPerPriority();
+
+  if (std::any_of(host_sets.begin(), host_sets.end(),
+                  [](const Upstream::HostSetPtr& host_set) {
+                    const auto& hosts = host_set->hosts();
+                    return std::any_of(hosts.begin(), hosts.end(),
+                                       [](const Upstream::HostSharedPtr& host) {
+                                         return host != nullptr;
+                                       });
+                  })) {
+    return Http::FilterHeadersStatus::Continue;
+  }
+
+  timer_ = decoder_callbacks_->dispatcher().createTimer([this] {
+    host_set_member_update_cb_handle_->remove();
+    decoder_callbacks_->continueDecoding();
+  });
+
+  host_set_member_update_cb_handle_ = cluster->prioritySet().addMemberUpdateCb(
+      [this](const Upstream::HostVector&, const Upstream::HostVector&) {
+    if (timer_ != nullptr) {
+      timer_.reset();
+      decoder_callbacks_->dispatcher().post([this] {
+        host_set_member_update_cb_handle_->remove();
+        decoder_callbacks_->continueDecoding();
+      });
+    }
+  });
+
+  timer_->enableTimer(std::chrono::milliseconds(30000));
+
+  return Http::FilterHeadersStatus::StopAllIterationAndWatermark;
+}
+
+} // namespace Http
+} // namespace Envoy
--- /dev/null
+++ b/source/extensions/filters/http/common/strict_dns_warmup_filter.h
@@ -0,0 +1,32 @@
+#pragma once
+
+#include "envoy/upstream/cluster_manager.h"
+
+#include "extensions/filters/http/common/pass_through_filter.h"
+
+namespace Envoy {
+namespace Http {
+
+
+/**
+ * @class StrictDnsWarmupFilter
+ *
+ *      StrictDns warmup filter.
+ */
+class StrictDnsWarmupFilter : public PassThroughDecoderFilter {
+public:
+  StrictDnsWarmupFilter(Upstream::ClusterManager& cm) : cm_(cm) {}
+
+  // Http::StreamDecoderFilter
+  Http::FilterHeadersStatus decodeHeaders(Http::RequestHeaderMap& headers,
+                                          bool end_stream) override;
+
+private:
+  Upstream::ClusterManager& cm_;
+  Event::TimerPtr timer_;
+  Common::CallbackHandle* host_set_member_update_cb_handle_{};
+};
+
+
+} // namespace Http
+} // namespace Envoy
--- a/source/extensions/filters/http/router/BUILD
+++ b/source/extensions/filters/http/router/BUILD
@@ -24,6 +24,7 @@ envoy_cc_extension(
         "//source/common/router:shadow_writer_lib",
         "//source/extensions/filters/http:well_known_names",
         "//source/extensions/filters/http/common:factory_base_lib",
+        "//source/extensions/filters/http/common:strict_dns_warmup_filter_lib",
         "//source/extensions/filters/http/on_demand:on_demand_update_lib",
         "@envoy_api//envoy/extensions/filters/http/router/v3:pkg_cc_proto",
     ],
--- a/source/extensions/filters/http/router/config.cc
+++ b/source/extensions/filters/http/router/config.cc
@@ -1,5 +1,7 @@
 #include "extensions/filters/http/router/config.h"

+#include "extensions/filters/http/common/strict_dns_warmup_filter.h"
+
 #include "envoy/extensions/filters/http/router/v3/router.pb.h"
 #include "envoy/extensions/filters/http/router/v3/router.pb.validate.h"

@@ -19,6 +21,8 @@ Http::FilterFactoryCb RouterFilterConfig::createFilterFactoryFromProtoTyped(
       proto_config));

   return [filter_config](Http::FilterChainFactoryCallbacks& callbacks) -> void {
+    callbacks.addStreamDecoderFilter(
+        std::make_shared<Http::StrictDnsWarmupFilter>(filter_config->cm_));
     callbacks.addStreamDecoderFilter(std::make_shared<Router::ProdFilter>(*filter_config));
   };
 }

The solution does the following:

  1. All STRICT_DNS clusters received via ADS/CDS are considered initialised immediately; we do not wait for the first DNS resolution to complete. STRICT_DNS clusters in the bootstrap are not affected, as it is assumed that one may use a STRICT_DNS cluster to reach ADS/CDS itself.
  2. Whenever an L7 router filter is instantiated, the new StrictDnsWarmupFilter is installed in front of it.
  3. The StrictDnsWarmupFilter checks whether the selected cluster is STRICT_DNS and, if it has no hosts yet, waits for at most 30 seconds (currently hardcoded) for any hosts to appear. If no hosts appear then DNS is either completely dead or very slow; in either case we continue to the router filter, which either returns no_healthy_upstream or proxies the request.

At the moment I am not using L4 filters, so StrictDnsWarmupFilter is specific to the L7 router.

I would like to ask whether the above use case could be incorporated into Envoy for broader use (the patch is only for reference): that is, initialise STRICT_DNS clusters immediately so that some routing may happen, and apply a grace period when DNS is slow or down.


htuch commented Jan 13, 2021

To me it would make sense to be able to have a per-cluster configuration to disable warming. This could be coupled with a control to permit connection picking / LB to defer until a cluster is warm. That would then allow normal route timeouts to be applied to control the process. Not sure if that's exactly what's needed, but it might have fewer moving parts than an explicit warming filter.
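
For illustration, such a per-cluster knob might look roughly like the sketch below. The field name wait_for_warm_on_init and the cluster shown are purely illustrative, not a committed API.

# Hypothetical sketch of a per-cluster switch to opt out of warming.
# The field name wait_for_warm_on_init is illustrative, not a committed API.
clusters:
- name: example_strict_dns_cluster
  type: STRICT_DNS
  connect_timeout: 5s
  # If false, the cluster would be considered initialised immediately,
  # without waiting for the first DNS resolution to complete.
  wait_for_warm_on_init: false
  load_assignment:
    cluster_name: example_strict_dns_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: backend.example.internal
              port_value: 80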

@snowp @mattklein123 WDYT?

htuch added the area/cluster_manager, area/dns, and enhancement labels Jan 13, 2021
@mattklein123

To me it would make sense to be able to have a per-cluster configuration to disable warming.

+1. I know that @lambdai has also been looking at the general problem of "too many things need warming" and might have thoughts here.


ppadevski commented Jan 13, 2021

To me it would make sense to be able to have a per-cluster configuration to disable warming.

This would work for me.

This could be coupled with a control to permit connection picking / LB to defer until a cluster is warm. That would then allow normal route timeouts to be applied to control the process. Not sure if that's exactly what's needed, but it might have fewer moving parts than an explicit warming filter.

I actually use a custom LB (FIFO) and think that it would be too late there, as it only has chooseHost, which is called from within the router, cannot reschedule, and must not block. Here's the code for reference; it is always coupled with envoy.retry_host_predicates.previous_hosts (a rough config sketch of that coupling follows the snippet):

HostConstSharedPtr FifoLoadBalancer::chooseHost(LoadBalancerContext* context) {
  for (const auto& hs : priority_set_.hostSetsPerPriority()) {
    for (const auto& host : hs->hosts()) {
      if (host == nullptr || context == nullptr || !context->shouldSelectAnotherHost(*host)) {
        return host;
      }
    }
  }

  return nullptr;
}
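
For reference, the coupling with envoy.retry_host_predicates.previous_hosts mentioned above is configured on the route's retry policy, roughly as in this minimal sketch (the cluster name, retry_on condition, and retry counts are illustrative values):

# Minimal sketch: a route retry policy that skips hosts already attempted.
# retry_on and the counts below are example values.
route:
  cluster: example_strict_dns_cluster
  retry_policy:
    retry_on: "5xx,connect-failure"
    num_retries: 3
    host_selection_retry_max_attempts: 5
    retry_host_predicate:
    - name: envoy.retry_host_predicates.previous_hosts
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.retry.host.previous_hosts.v3.PreviousHostsPredicate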

In any case, I only shared the code as a reference. I know it is far from the best solution, but I wanted to start somewhere.

htuch added the help wanted label Jan 13, 2021

htuch commented Jan 13, 2021

Yeah, we would need to block somewhere in the router. I think this behavior is generic enough that we could have a "block-on-cluster-warming" capability.


lambdai commented Feb 10, 2021

Drive-by comment: a per-cluster warming option might be good.

@ppadevski
A long time ago, I remember that a strict DNS cluster could end up warm by accident, and "no healthy upstream" would be the error code for the HTTP request. Is this fail-fast behaviour not acceptable?

An HTTP filter is an acceptable place, but we'd be better off putting this in the HTTP router, which handles general upstream failures - e.g. add a retry on "no healthy upstream host" if that option isn't there already.

Ideally, I am planning to experiment with an asynchronous load balancer that would return a Future, so that any terminating network filter - HTTP connection manager, TCP proxy, and existing or future filters - can wait on the cluster. That could resolve this slow-DNS use case, and also on-demand EDS.

@haoruolei

Hi all. Is there any update on this feature?


lukidzi commented Nov 28, 2024

I believe I encountered the same issue. What’s strange is that after the message "dns resolution for external-service.gatewayapi-external-services.svc.cluster.local started", there are no further DNS-related logs for the entire 30-second test duration.

I would expect the DNS request to be canceled after 5 seconds (the default timeout per Envoy’s documentation) and retried - unless, of course, this DNS configuration needs to be explicitly set for it to apply.
Envoy version: v1.32.1
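
For what it's worth, explicitly selecting and configuring the c-ares resolver on the cluster would look roughly like the sketch below (the resolver address is an example value, and whether a per-query timeout/retry knob is exposed depends on the Envoy version):

# Sketch: explicitly pinning the DNS resolver configuration on a STRICT_DNS cluster.
# The resolver address below is an example value.
name: external-service-7b2e7f0413991110
type: STRICT_DNS
dns_lookup_family: V4_ONLY
typed_dns_resolver_config:
  name: envoy.network.dns_resolver.cares
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.network.dns_resolver.cares.v3.CaresDnsResolverConfig
    resolvers:
    - socket_address:
        address: 10.96.0.10
        port_value: 53
    dns_resolver_options:
      use_tcp_for_dns_lookups: false
      no_default_search_domain: false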

Logs:

[2024-11-28 14:30:10.178][23][debug][config] [source/extensions/config_subscription/grpc/grpc_mux_impl.cc:363] Received gRPC message for type.googleapis.com/envoy.config.cluster.v3.Cluster at version 421cc6b7-37c9-4573-8e6b-fe5280d1c34a
[2024-11-28 14:30:10.178][23][debug][config] [source/extensions/config_subscription/grpc/grpc_mux_impl.cc:340] Pausing discovery requests for type.googleapis.com/envoy.config.cluster.v3.Cluster (previous count 0)
[2024-11-28 14:30:10.178][23][debug][config] [source/extensions/config_subscription/grpc/grpc_mux_impl.cc:340] Pausing discovery requests for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment (previous count 0)
[2024-11-28 14:30:10.178][23][debug][config] [source/extensions/config_subscription/grpc/grpc_mux_impl.cc:340] Pausing discovery requests for type.googleapis.com/envoy.config.endpoint.v3.LbEndpoint (previous count 0)
[2024-11-28 14:30:10.178][23][debug][config] [source/extensions/config_subscription/grpc/grpc_mux_impl.cc:340] Pausing discovery requests for type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret (previous count 0)
[2024-11-28 14:30:10.178][23][info][upstream] [source/common/upstream/cds_api_helper.cc:32] cds: add 3 cluster(s), remove 3 cluster(s)
[2024-11-28 14:30:10.178][23][debug][misc] [source/common/network/dns_resolver/dns_factory_util.cc:75] create DNS resolver type: envoy.network.dns_resolver.cares
[2024-11-28 14:30:10.178][23][debug][dns] [source/extensions/network/dns_resolver/cares/dns_impl.cc:601] c-ares library initialized.
[2024-11-28 14:30:10.201][23][debug][config] [./source/common/http/filter_chain_helper.h:111]     upstream http filter #0
[2024-11-28 14:30:10.201][23][debug][config] [./source/common/http/filter_chain_helper.h:157]       name: envoy.filters.http.upstream_codec
[2024-11-28 14:30:10.201][23][debug][config] [./source/common/http/filter_chain_helper.h:160]     config: {"@type":"type.googleapis.com/envoy.extensions.filters.http.upstream_codec.v3.UpstreamCodec"}
[2024-11-28 14:30:10.202][23][debug][config] [source/extensions/config_subscription/grpc/grpc_mux_impl.cc:340] Pausing discovery requests for type.googleapis.com/envoy.config.cluster.v3.Cluster (previous count 1)
[2024-11-28 14:30:10.202][23][debug][upstream] [source/common/upstream/cluster_manager_impl.cc:857] add/update cluster external-service-7b2e7f0413991110 starting warming
[2024-11-28 14:30:10.202][23][debug][dns] [source/extensions/network/dns_resolver/cares/dns_impl.cc:391] dns resolution for external-service.gatewayapi-external-services.svc.cluster.local started

Config:

 "dynamic_warming_clusters": [
    {
     "version_info": "421cc6b7-37c9-4573-8e6b-fe5280d1c34a",
     "cluster": {
      "@type": "type.googleapis.com/envoy.config.cluster.v3.Cluster",
      "name": "external-service-7b2e7f0413991110",
      "type": "STRICT_DNS",
      "connect_timeout": "5s",
      "per_connection_buffer_limit_bytes": 32768,
      "circuit_breakers": {
       "thresholds": [
        {
         "max_connections": 1024,
         "max_pending_requests": 1024,
         "max_requests": 1024,
         "max_retries": 3
        }
       ]
      },
      "dns_lookup_family": "V4_ONLY",
      "load_assignment": {
       "cluster_name": "external-service",
       "endpoints": [
        {
         "lb_endpoints": [
          {
           "endpoint": {
            "address": {
             "socket_address": {
              "address": "external-service.gatewayapi-external-services.svc.cluster.local",
              "port_value": 80
             }
            }
           },
           "metadata": {
            "filter_metadata": {
             "envoy.transport_socket_match": {
              "kuma.io/protocol": "http"
             },
             "envoy.lb": {
              "kuma.io/protocol": "http"
             }
            }
           },
           "load_balancing_weight": 1
          }
         ]
        }
       ]
      },
      "typed_extension_protocol_options": {
       "envoy.extensions.upstreams.http.v3.HttpProtocolOptions": {
        "@type": "type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions",
        "common_http_protocol_options": {
         "idle_timeout": "3600s",
         "max_connection_duration": "0s",
         "max_stream_duration": "0s"
        },
        "explicit_http_config": {
         "http_protocol_options": {}
        }
       }
      }
     },
     "last_updated": "2024-11-28T14:30:10.202Z"
    }
   ]
cluster_manager.warming_clusters: 1
control_plane.connected_state: 1
control_plane.pending_requests: 0
control_plane.rate_limit_enforced: 0
dns.cares.get_addr_failure: 0
dns.cares.not_found: 0
dns.cares.pending_resolutions: 1
