xDS: Using RDS to reference RouteConfiguration in server Listener results in panic due to nil FilterChainManager #6683

Closed
halvards opened this issue Oct 4, 2023 · 6 comments
@halvards
Contributor

halvards commented Oct 4, 2023

What version of gRPC are you using?

gRPC-Go v1.58.2

What version of Go are you using (go version)?

go version go1.20.8 darwin/arm64

What operating system (Linux, Windows, …) and version?

macOS

What did you do?

Set up an xDS management server, populated with a server Listener and associated RouteConfiguration.

Experiment 1: The server Listener contains an inline RouteConfiguration.

Experiment 2: The server Listener specifies using RDS (via ADS) to look up the RouteConfiguration, instead of an inline config in the Listener.

What did you expect to see?

Experiment 1: The xDS-enabled gRPC server fetches the server Listener with its inline RouteConfiguration using LDS. These two resources configure the server to handle the traffic, including setting the FilterChainManager of the listenerWrapper.

Experiment 2: The xDS-enabled gRPC server fetches the server Listener using LDS and the RouteConfiguration using RDS. These two resources configure the server to handle the traffic, including setting the FilterChainManager of the listenerWrapper.

What did you see instead?

Experiment 1:

The xDS-enabled gRPC server fetches the server Listener with its inline RouteConfiguration, and the server can serve traffic.

Output of grpcdebug IP:PORT xds status:

Name                                                       Status    Version            Type                                                    LastUpdated
grpc/server?xds.resource.listening_address=0.0.0.0:50051   ACKED     1925506621691688   type.googleapis.com/envoy.config.listener.v3.Listener   29 seconds ago

Output of grpcdebug IP:PORT xds config | yq --input-format=json --prettyPrint:

config:
  - node:
      id: 41dbe59f-5e54-4453-82e9-f5a101b0e211~10.68.6.124
      cluster: greeter-leaf
      metadata:
        GCE_VM_ID: "2946063889887007962"
        GCE_VM_NAME: gke-grpc-td-cluster-1-default-pool-04f485d7-btf2
        GCP_PROJECT_NUMBER: "639413862670"
        GKE_CLUSTER_LOCATION: australia-southeast1-b
        GKE_CLUSTER_NAME: grpc-td-cluster-1
        INSTANCE_IP: 10.68.6.124
        K8S_NAMESPACE: xds
        K8S_POD: greeter-leaf-84887bf447-h6t49
        XDS_STREAM_TYPE: ADS
      locality:
        zone: australia-southeast1-b
      userAgentName: gRPC Go
      userAgentVersion: 1.58.2
      clientFeatures:
        - envoy.lb.does_not_support_overprovisioning
        - xds.config.resource-in-sotw
    genericXdsConfigs:
      - typeUrl: type.googleapis.com/envoy.config.listener.v3.Listener
        name: grpc/server?xds.resource.listening_address=0.0.0.0:50051
        versionInfo: "1925506621691688"
        xdsConfig:
          '@type': type.googleapis.com/envoy.config.listener.v3.Listener
          name: grpc/server?xds.resource.listening_address=0.0.0.0:50051
          address:
            socketAddress:
              address: 0.0.0.0
              portValue: 50051
          filterChains:
            - filters:
                - name: envoy.http_connection_manager
                  typedConfig:
                    '@type': type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                    statPrefix: default_inbound_config
                    routeConfig:
                      name: default_inbound_config
                      virtualHosts:
                        - name: default_inbound_config
                          domains:
                            - '*'
                          routes:
                            - match:
                                prefix: /
                              nonForwardingAction: {}
                              decorator:
                                operation: default_inbound_config/*
                    httpFilters:
                      - name: envoy.filters.http.router
                        typedConfig:
                          '@type': type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                    forwardClientCertDetails: APPEND_FORWARD
                    setCurrentClientCertDetails:
                      subject: true
                      dns: true
                      uri: true
                    upgradeConfigs:
                      - upgradeType: websocket
          trafficDirection: INBOUND
          enableReusePort: true
        lastUpdated: "2023-10-04T06:52:55.289177757Z"
        clientStatus: ACKED

Experiment 2:

After fetching the resources, the xDS-enabled gRPC server panics with a nil pointer dereference in the Lookup() method on line 695 of filter_chain.go, because the method's pointer receiver (*FilterChainManager) is nil:

func (fci *FilterChainManager) Lookup(params FilterChainLookupParams) (*FilterChain, error) {
	dstPrefixes := filterByDestinationPrefixes(fci.dstPrefixes, params.IsUnspecifiedListener, params.DestAddr)
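
Go permits calling a method through a nil pointer receiver; the panic only happens once the method body reads a field through that receiver, which matches the `Lookup(0x0, ...)` frame in the stack trace below. A minimal sketch of that behavior, assuming a hypothetical `chainManager` type as a stand-in for FilterChainManager (the type, field, and method names are illustrative and are not the gRPC-Go definitions):

```go
package main

import "fmt"

// chainManager is a hypothetical stand-in for FilterChainManager.
type chainManager struct {
	dstPrefixes []string
}

// lookup reads a field through the receiver, so it panics when m is nil.
func (m *chainManager) lookup() int {
	return len(m.dstPrefixes)
}

// safeLookup calls lookup and converts a panic into an error, so the
// failure can be observed without crashing the program.
func safeLookup(m *chainManager) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return m.lookup(), nil
}

func main() {
	var m *chainManager // nil, like the receiver in the reported panic
	_, err := safeLookup(m)
	fmt.Println(err)

	// A non-nil receiver works as expected.
	n, err := safeLookup(&chainManager{dstPrefixes: []string{"0.0.0.0/0"}})
	fmt.Println(n, err)
}
```

The method call itself succeeds in being dispatched on the nil receiver; it is the field read inside the method that faults.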

Logs from the gRPC-Go xDS client:

I1004 06:40:33.825265       1 greeter-go/main.go:30] "Creating new logger"
I1004 06:40:33.826331       1 server/server.go:154] "Creating an xDS-managed server"
I1004 06:40:33.826449       1 grpc@v1.58.2/server.go:667] "[core][Server #1] Server created"
I1004 06:40:33.826523       1 bootstrap/bootstrap.go:402] "[xds][xds-bootstrap] Using bootstrap file with name \"/etc/grpc-xds/bootstrap.json\""
I1004 06:40:33.827213       1 bootstrap/bootstrap.go:562] "[xds][xds-bootstrap] Bootstrap config for creating xds-client: {\n  \"XDSServer\": {\n    \"server_uri\": \"control-plane.xds:50051\",\n    \"channel_creds\": [\n      {\n        \"type\": \"insecure\"\n      }\n    ],\n    \"server_features\": [\n      \"xds_v3\"\n    ]\n  },\n  \"CertProviderConfigs\": null,\n  \"ServerListenerResourceNameTemplate\": \"grpc/server?xds.resource.listening_address=%s\",\n  \"ClientDefaultListenerResourceNameTemplate\": \"%s\",\n  \"Authorities\": null,\n  \"NodeProto\": {\n    \"id\": \"85e37245-9fdf-4947-b030-e142a1c42792~10.68.6.122\",\n    \"cluster\": \"greeter-leaf\",\n    \"metadata\": {\n      \"GCE_VM_ID\": \"2946063889887007962\",\n      \"GCE_VM_NAME\": \"gke-grpc-td-cluster-1-default-pool-04f485d7-btf2\",\n      \"GCP_PROJECT_NUMBER\": \"639413862670\",\n      \"GKE_CLUSTER_LOCATION\": \"australia-southeast1-b\",\n      \"GKE_CLUSTER_NAME\": \"grpc-td-cluster-1\",\n      \"INSTANCE_IP\": \"10.68.6.122\",\n      \"K8S_NAMESPACE\": \"xds\",\n      \"K8S_POD\": \"greeter-leaf-64fbd57fdb-gr8mv\",\n      \"XDS_STREAM_TYPE\": \"ADS\"\n    },\n    \"locality\": {\n      \"zone\": \"australia-southeast1-b\"\n    },\n    \"user_agent_name\": \"gRPC Go\",\n    \"UserAgentVersionType\": {\n      \"UserAgentVersion\": \"1.58.2\"\n    },\n    \"client_features\": [\n      \"envoy.lb.does_not_support_overprovisioning\",\n      \"xds.config.resource-in-sotw\"\n    ]\n  }\n}"
I1004 06:40:33.827315       1 xdsclient/client_new.go:80] "[xds][xds-client 0xc00028f9f0] Created client to xDS management server: control-plane.xds:50051-insecure-xds_v3"
I1004 06:40:33.827363       1 xdsclient/singleton.go:97] "[xds]xDS node ID: 85e37245-9fdf-4947-b030-e142a1c42792~10.68.6.122"
I1004 06:40:33.827410       1 xds/server.go:152] "[xds][xds-server 0xc00028f900] Created xds.GRPCServer"
I1004 06:40:33.827440       1 xds/server.go:153] "[xds][xds-server 0xc00028f900] xDS credentials in use: false"
I1004 06:40:33.827472       1 server/server.go:82] "Adding leaf Greeter service, as NEXT_HOP is not provided"
I1004 06:40:33.827551       1 csds/csds.go:76] "[xds][csds-server 0xc0002dfda0] Created CSDS server, with xdsClient 0xc0002cbe30"
I1004 06:40:33.827808       1 server/server.go:118] "Greeter service listening" port=50051 nextHop=""
I1004 06:40:33.827842       1 xds/server.go:212] "[xds][xds-server 0xc00028f900] Serve() passed a net.Listener on 0.0.0.0:50051"
I1004 06:40:33.827983       1 grpc@v1.58.2/clientconn.go:318] "[core][Channel #2] Channel created"
I1004 06:40:33.828036       1 grpc@v1.58.2/clientconn.go:1839] "[core][Channel #2] original dial target is: \"control-plane.xds:50051\""
I1004 06:40:33.828103       1 grpc@v1.58.2/clientconn.go:1846] "[core][Channel #2] parsed dial target is: {URL:{Scheme:control-plane.xds Opaque:50051 User: Host: Path: RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}"
I1004 06:40:33.828134       1 grpc@v1.58.2/clientconn.go:1860] "[core][Channel #2] fallback to scheme \"passthrough\""
I1004 06:40:33.828179       1 grpc@v1.58.2/clientconn.go:1868] "[core][Channel #2] parsed dial target is: {URL:{Scheme:passthrough Opaque: User: Host: Path:/control-plane.xds:50051 RawPath: OmitHost:false ForceQuery:false RawQuery: Fragment: RawFragment:}}"
I1004 06:40:33.828214       1 grpc@v1.58.2/clientconn.go:2001] "[core][Channel #2] Channel authority set to \"control-plane.xds:50051\""
I1004 06:40:33.828523       1 grpc@v1.58.2/balancer_conn_wrappers.go:180] "[core][Channel #2] Channel switches to new LB policy \"pick_first\""
I1004 06:40:33.828666       1 grpc@v1.58.2/pickfirst.go:141] "[core][pick-first-lb 0xc0003a46c0] Received new config {\n  \"shuffleAddressList\": false\n}, resolver state {\n  \"Addresses\": [\n    {\n      \"Addr\": \"control-plane.xds:50051\",\n      \"ServerName\": \"\",\n      \"Attributes\": null,\n      \"BalancerAttributes\": null,\n      \"Metadata\": null\n    }\n  ],\n  \"Endpoints\": [\n    {\n      \"Addresses\": [\n        {\n          \"Addr\": \"control-plane.xds:50051\",\n          \"ServerName\": \"\",\n          \"Attributes\": null,\n          \"BalancerAttributes\": null,\n          \"Metadata\": null\n        }\n      ],\n      \"Attributes\": null\n    }\n  ],\n  \"ServiceConfig\": null,\n  \"Attributes\": null\n}"
I1004 06:40:33.828727       1 grpc@v1.58.2/clientconn.go:956] "[core][Channel #2 SubChannel #3] Subchannel created"
I1004 06:40:33.828800       1 grpc@v1.58.2/clientconn.go:592] "[core][Channel #2] Channel Connectivity change to CONNECTING"
I1004 06:40:33.829052       1 grpc@v1.58.2/clientconn.go:1338] "[core][Channel #2 SubChannel #3] Subchannel Connectivity change to CONNECTING"
I1004 06:40:33.829296       1 grpc@v1.58.2/clientconn.go:1453] "[core][Channel #2 SubChannel #3] Subchannel picks a new address \"control-plane.xds:50051\" to connect"
I1004 06:40:33.830452       1 grpc@v1.58.2/pickfirst.go:184] "[core][pick-first-lb 0xc0003a46c0] Received SubConn state update: 0xc0003a4840, {ConnectivityState:CONNECTING ConnectionError:<nil>}"
I1004 06:40:33.830734       1 transport/transport.go:238] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] Created transport to server \"control-plane.xds:50051\""
I1004 06:40:33.830824       1 xdsclient/authority.go:454] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] New watch for type \"ListenerResource\", resource name \"grpc/server?xds.resource.listening_address=0.0.0.0:50051\""
I1004 06:40:33.830876       1 xdsclient/authority.go:471] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] First watch for type \"ListenerResource\", resource name \"grpc/server?xds.resource.listening_address=0.0.0.0:50051\""
I1004 06:40:33.837750       1 grpc@v1.58.2/clientconn.go:1338] "[core][Channel #2 SubChannel #3] Subchannel Connectivity change to READY"
I1004 06:40:33.838076       1 grpc@v1.58.2/pickfirst.go:184] "[core][pick-first-lb 0xc0003a46c0] Received SubConn state update: 0xc0003a4840, {ConnectivityState:READY ConnectionError:<nil>}"
I1004 06:40:33.838117       1 grpc@v1.58.2/clientconn.go:592] "[core][Channel #2] Channel Connectivity change to READY"
I1004 06:40:33.838343       1 transport/transport.go:347] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] ADS stream created"
I1004 06:40:33.839083       1 transport/transport.go:299] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] ADS request sent: {\n  \"node\": {\n    \"id\": \"85e37245-9fdf-4947-b030-e142a1c42792~10.68.6.122\",\n    \"cluster\": \"greeter-leaf\",\n    \"metadata\": {\n        \"GCE_VM_ID\": \"2946063889887007962\",\n        \"GCE_VM_NAME\": \"gke-grpc-td-cluster-1-default-pool-04f485d7-btf2\",\n        \"GCP_PROJECT_NUMBER\": \"639413862670\",\n        \"GKE_CLUSTER_LOCATION\": \"australia-southeast1-b\",\n        \"GKE_CLUSTER_NAME\": \"grpc-td-cluster-1\",\n        \"INSTANCE_IP\": \"10.68.6.122\",\n        \"K8S_NAMESPACE\": \"xds\",\n        \"K8S_POD\": \"greeter-leaf-64fbd57fdb-gr8mv\",\n        \"XDS_STREAM_TYPE\": \"ADS\"\n      },\n    \"locality\": {\n      \"zone\": \"australia-southeast1-b\"\n    },\n    \"userAgentName\": \"gRPC Go\",\n    \"userAgentVersion\": \"1.58.2\",\n    \"clientFeatures\": [\n      \"envoy.lb.does_not_support_overprovisioning\",\n      \"xds.config.resource-in-sotw\"\n    ]\n  },\n  \"resourceNames\": [\n    \"grpc/server?xds.resource.listening_address=0.0.0.0:50051\"\n  ],\n  \"typeUrl\": \"type.googleapis.com/envoy.config.listener.v3.Listener\"\n}"
I1004 06:40:33.866634       1 transport/transport.go:313] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] ADS response received: {\n  \"versionInfo\": \"1924756587086136\",\n  \"resources\": [\n    {\n      \"@type\": \"type.googleapis.com/envoy.config.listener.v3.Listener\",\n      \"name\": \"grpc/server?xds.resource.listening_address=0.0.0.0:50051\",\n      \"address\": {\n        \"socketAddress\": {\n          \"address\": \"0.0.0.0\",\n          \"portValue\": 50051\n        }\n      },\n      \"filterChains\": [\n        {\n          \"filters\": [\n            {\n              \"name\": \"envoy.http_connection_manager\",\n              \"typedConfig\": {\n                \"@type\": \"type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager\",\n                \"statPrefix\": \"default_inbound_config\",\n                \"rds\": {\n                  \"configSource\": {\n                    \"ads\": {\n\n                    },\n                    \"resourceApiVersion\": \"V3\"\n                  },\n                  \"routeConfigName\": \"default_inbound_config\"\n                },\n                \"httpFilters\": [\n                  {\n                    \"name\": \"envoy.filters.http.router\",\n                    \"typedConfig\": {\n                      \"@type\": \"type.googleapis.com/envoy.extensions.filters.http.router.v3.Router\"\n                    }\n                  }\n                ],\n                \"forwardClientCertDetails\": \"APPEND_FORWARD\",\n                \"setCurrentClientCertDetails\": {\n                  \"subject\": true,\n                  \"dns\": true,\n                  \"uri\": true\n                },\n                \"upgradeConfigs\": [\n                  {\n                    \"upgradeType\": \"websocket\"\n                  }\n                ]\n              }\n            }\n          ]\n        }\n      ],\n      \"trafficDirection\": \"INBOUND\",\n    
  \"enableReusePort\": true\n    }\n  ],\n  \"typeUrl\": \"type.googleapis.com/envoy.config.listener.v3.Listener\",\n  \"nonce\": \"0\"\n}"
I1004 06:40:33.867963       1 xdsclient/authority.go:454] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] New watch for type \"RouteConfigResource\", resource name \"default_inbound_config\""
I1004 06:40:33.868306       1 xdsclient/authority.go:231] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] Resource type \"ListenerResource\" with name \"grpc/server?xds.resource.listening_address=0.0.0.0:50051\" added to cache"
I1004 06:40:33.868516       1 xdsclient/authority.go:471] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] First watch for type \"RouteConfigResource\", resource name \"default_inbound_config\""
I1004 06:40:33.868806       1 transport/transport.go:299] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] ADS request sent: {\n  \"resourceNames\": [\n    \"default_inbound_config\"\n  ],\n  \"typeUrl\": \"type.googleapis.com/envoy.config.route.v3.RouteConfiguration\"\n}"
I1004 06:40:33.869213       1 transport/transport.go:533] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] Sending ACK for resource type: \"type.googleapis.com/envoy.config.listener.v3.Listener\", version: \"1924756587086136\", nonce: \"0\""
I1004 06:40:33.869538       1 transport/transport.go:299] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] ADS request sent: {\n  \"versionInfo\": \"1924756587086136\",\n  \"resourceNames\": [\n    \"grpc/server?xds.resource.listening_address=0.0.0.0:50051\"\n  ],\n  \"typeUrl\": \"type.googleapis.com/envoy.config.listener.v3.Listener\",\n  \"responseNonce\": \"0\"\n}"
I1004 06:40:33.882832       1 transport/transport.go:313] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] ADS response received: {\n  \"versionInfo\": \"1924756587086136\",\n  \"resources\": [\n    {\n      \"@type\": \"type.googleapis.com/envoy.config.route.v3.RouteConfiguration\",\n      \"name\": \"default_inbound_config\",\n      \"virtualHosts\": [\n        {\n          \"name\": \"default_inbound_config\",\n          \"domains\": [\n            \"*\"\n          ],\n          \"routes\": [\n            {\n              \"match\": {\n                \"prefix\": \"/\"\n              },\n              \"nonForwardingAction\": {\n\n              },\n              \"decorator\": {\n                \"operation\": \"default_inbound_config/*\"\n              }\n            }\n          ]\n        }\n      ]\n    }\n  ],\n  \"typeUrl\": \"type.googleapis.com/envoy.config.route.v3.RouteConfiguration\",\n  \"nonce\": \"1\"\n}"
I1004 06:40:33.883340       1 grpc@v1.58.2/server.go:855] "[core][Server #1 ListenSocket #5] ListenSocket created"
E1004 06:40:33.883687       1 xds/server.go:185] "[xds][xds-server 0xc00028f900] Listener \"0.0.0.0:50051\" entering mode: \"SERVING\""
I1004 06:40:33.883255       1 xdsclient/authority.go:231] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] Resource type \"RouteConfigResource\" with name \"default_inbound_config\" added to cache"
I1004 06:40:33.885124       1 transport/transport.go:533] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] Sending ACK for resource type: \"type.googleapis.com/envoy.config.route.v3.RouteConfiguration\", version: \"1924756587086136\", nonce: \"1\""
I1004 06:40:33.885563       1 transport/transport.go:299] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] ADS request sent: {\n  \"versionInfo\": \"1924756587086136\",\n  \"resourceNames\": [\n    \"default_inbound_config\"\n  ],\n  \"typeUrl\": \"type.googleapis.com/envoy.config.route.v3.RouteConfiguration\",\n  \"responseNonce\": \"1\"\n}"
I1004 06:40:43.330272       1 xdsclient/authority.go:508] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] Removing last watch for type \"ListenerResource\", resource name \"grpc/server?xds.resource.listening_address=0.0.0.0:50051\""
I1004 06:40:43.330343       1 xdsclient/authority.go:508] "[xds][xds-client 0xc00028f9f0] [control-plane.xds:50051] Removing last watch for type \"RouteConfigResource\", resource name \"default_inbound_config\""
I1004 06:40:43.330418       1 grpc@v1.58.2/server.go:806] "[core][Server #1 ListenSocket #5] ListenSocket deleted"
I1004 06:40:43.330457       1 server/server.go:105] "Cleaning up admin services as the server is stopping"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xbd7b9c]

goroutine 1 [running]:
google.golang.org/grpc/xds/internal/xdsclient/xdsresource.(*FilterChainManager).Lookup(0x0, {0x1, {0xc00035a408, 0x4, 0x4}, {0xc00035a3e8, 0x4, 0x4}, 0xd3b2})
	google.golang.org/grpc@v1.58.2/xds/internal/xdsclient/xdsresource/filter_chain.go:695 +0x5c
google.golang.org/grpc/xds/internal/server.(*listenerWrapper).Accept(0xc000279340)
	google.golang.org/grpc@v1.58.2/xds/internal/server/listener_wrapper.go:244 +0x32b
google.golang.org/grpc.(*Server).Serve(0xc0002c8960, {0x13e1620?, 0xc000279340})
	google.golang.org/grpc@v1.58.2/server.go:859 +0x475
google.golang.org/grpc/xds.(*GRPCServer).Serve(0xc00028f900, {0x13e2670, 0xc0002bf200})
	google.golang.org/grpc@v1.58.2/xds/server.go:266 +0x3f3
example.com/greeter-go/pkg/server.Run({0x13e3400, 0xc0002990b0}, {0xc383, {0xc0002a0500, 0x35}, {0x0, 0x0}, 0x1, 0x0})
	example.com/greeter-go/pkg/server/server.go:119 +0xb0c
example.com/greeter-go/cmd.Run({0x13e3390?, 0xc000062010?}, 0x60?, {0xc000050190, 0x2, 0x2})
	example.com/greeter-go/cmd/cmd.go:48 +0x35b
main.main()
	example.com/greeter-go/main.go:30 +0x6c

What do you think is happening?

After some interactive debugging:

In the handleLDSUpdate() method in listener_wrapper.go, there is a conditional step that only updates the listenerWrapper's FilterChainManager (l.filterChains) if the InboundListenerConfig's FilterChainManager does not contain the names of any RouteConfigurations to be fetched via RDS:

// If there are no dynamic RDS Configurations still needed to be received
// from the management server, this listener has all the configuration
// needed, and is ready to serve.
if len(ilc.FilterChains.RouteConfigNames) == 0 {
	l.switchMode(ilc.FilterChains, connectivity.ServingModeServing, nil)
	l.goodUpdate.Fire()
}

Experiment 1: When the RouteConfiguration is inlined in the server Listener, this condition is true, and the call to switchMode() to set the serving mode to SERVING results in the listenerWrapper's FilterChainManager being set based on the value of the InboundListenerConfig's FilterChainManager, and all is good.

Experiment 2: When the RouteConfiguration is fetched dynamically using RDS, this condition is false, and the listenerWrapper's FilterChainManager remains nil. Later, in handleRDSUpdate(), the call to switchMode() to set the serving mode to SERVING provides a nil *FilterChainManager argument - l.filterChains - because handleLDSUpdate() didn't set the listenerWrapper's FilterChainManager:

l.switchMode(l.filterChains, connectivity.ServingModeServing, nil)
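
The sequence described above can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the gRPC-Go implementation: the lowercase types, the two-argument `switchMode`, and the `serving` flag are hypothetical stand-ins that only mirror the control flow quoted in this issue.

```go
package main

import "fmt"

// filterChainManager is a hypothetical stand-in for FilterChainManager.
// routeConfigNames holds the RouteConfiguration names to fetch via RDS.
type filterChainManager struct {
	routeConfigNames map[string]bool
}

// listenerWrapper is a hypothetical stand-in for the real listenerWrapper.
type listenerWrapper struct {
	filterChains *filterChainManager // consulted when accepting connections
	serving      bool
}

// switchMode stores the manager it is given and records the serving state.
func (l *listenerWrapper) switchMode(fcs *filterChainManager, serving bool) {
	l.filterChains = fcs
	l.serving = serving
}

// handleLDSUpdate stores the new manager only when no RDS names are pending.
func (l *listenerWrapper) handleLDSUpdate(fcs *filterChainManager) {
	if len(fcs.routeConfigNames) == 0 {
		l.switchMode(fcs, true)
	}
}

// handleRDSUpdate passes l.filterChains, which is still nil if the LDS
// update above skipped switchMode.
func (l *listenerWrapper) handleRDSUpdate() {
	l.switchMode(l.filterChains, true)
}

func main() {
	// Experiment 1: inline RouteConfiguration, no RDS names pending.
	inline := &listenerWrapper{}
	inline.handleLDSUpdate(&filterChainManager{})
	fmt.Println("inline: serving =", inline.serving, "nil manager =", inline.filterChains == nil)

	// Experiment 2: RouteConfiguration referenced via RDS.
	rds := &listenerWrapper{}
	rds.handleLDSUpdate(&filterChainManager{
		routeConfigNames: map[string]bool{"default_inbound_config": true},
	})
	rds.handleRDSUpdate()
	fmt.Println("rds: serving =", rds.serving, "nil manager =", rds.filterChains == nil)
}
```

In this model the RDS path ends up serving with a nil manager, which is the state that a later lookup would dereference.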

What else did you try?

  1. I also tried setting GRPC_XDS_EXPERIMENTAL_RBAC=false. This resulted in the xDS-enabled gRPC server not fetching the RouteConfiguration via RDS at all - which also doesn't seem quite right. However, the gRPC server could still serve requests. I believe the reason for not fetching the RouteConfiguration dynamically via RDS is this code block:

    if !envconfig.XDSRBAC {
    	continue
    }
    switch hcm.RouteSpecifier.(type) {
    case *v3httppb.HttpConnectionManager_Rds:

  2. I created an xDS-enabled gRPC server using gRPC-Java v1.58.0 and connected it to the same xDS management server with the server Listener's RouteConfiguration to be fetched using RDS (via ADS). This implementation correctly fetched both the server Listener and the associated RouteConfiguration via RDS. I verified this using grpcdebug:

    Output of grpcdebug IP:PORT xds status for the gRPC-Java xDS client:

    Name                                                       Status    Version            Type                                                           LastUpdated
    grpc/server?xds.resource.listening_address=0.0.0.0:50051   ACKED     1923876079868142   type.googleapis.com/envoy.config.listener.v3.Listener          3 minutes ago
    default_inbound_config                                     ACKED     1923876079868142   type.googleapis.com/envoy.config.route.v3.RouteConfiguration   3 minutes ago
    

    Output of grpcdebug IP:PORT xds config | yq --input-format=json --prettyPrint for the gRPC-Java xDS client:

    config:
      - node:
          id: 7c175a15-dcc0-4a44-9059-609999c36e81~10.68.3.12
          cluster: greeter-leaf
          metadata:
            GCE_VM_ID: "4586267245454432102"
            GCE_VM_NAME: gke-grpc-td-cluster-1-default-pool-04f485d7-b9ak
            GCP_PROJECT_NUMBER: "639413862670"
            GKE_CLUSTER_LOCATION: australia-southeast1-b
            GKE_CLUSTER_NAME: grpc-td-cluster-1
            INSTANCE_IP: 10.68.3.12
            K8S_NAMESPACE: xds
            K8S_POD: greeter-leaf-5b6bbcd6d5-cl4gk
            XDS_STREAM_TYPE: ADS
          locality:
            zone: australia-southeast1-b
          userAgentName: gRPC Java
          userAgentVersion: 1.58.0
          clientFeatures:
            - envoy.lb.does_not_support_overprovisioning
            - xds.config.resource-in-sotw
        genericXdsConfigs:
          - typeUrl: type.googleapis.com/envoy.config.listener.v3.Listener
            name: grpc/server?xds.resource.listening_address=0.0.0.0:50051
            versionInfo: "1923876079868142"
            xdsConfig:
              '@type': type.googleapis.com/envoy.config.listener.v3.Listener
              name: grpc/server?xds.resource.listening_address=0.0.0.0:50051
              address:
                socketAddress:
                  address: 0.0.0.0
                  portValue: 50051
              filterChains:
                - filters:
                    - name: envoy.http_connection_manager
                      typedConfig:
                        '@type': type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                        statPrefix: default_inbound_config
                        rds:
                          configSource:
                            ads: {}
                            resourceApiVersion: V3
                          routeConfigName: default_inbound_config
                        httpFilters:
                          - name: envoy.filters.http.router
                            typedConfig:
                              '@type': type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                        forwardClientCertDetails: APPEND_FORWARD
                        setCurrentClientCertDetails:
                          subject: true
                          dns: true
                          uri: true
                        upgradeConfigs:
                          - upgradeType: websocket
              trafficDirection: INBOUND
              enableReusePort: true
            lastUpdated: "2023-10-04T06:25:44.751Z"
            clientStatus: ACKED
          - typeUrl: type.googleapis.com/envoy.config.route.v3.RouteConfiguration
            name: default_inbound_config
            versionInfo: "1923876079868142"
            xdsConfig:
              '@type': type.googleapis.com/envoy.config.route.v3.RouteConfiguration
              name: default_inbound_config
              virtualHosts:
                - name: default_inbound_config
                  domains:
                    - '*'
                  routes:
                    - match:
                        prefix: /
                      nonForwardingAction: {}
                      decorator:
                        operation: default_inbound_config/*
            lastUpdated: "2023-10-04T06:25:44.760Z"
            clientStatus: ACKED
@zasweq
Contributor

zasweq commented Oct 24, 2023

Hello, thank you for bringing this issue up. Do you happen to have code that reproduces this locally? I'm assuming not, since I think this uses a real management server, but a local reproduction would be super helpful. This looks like a legitimate bug; I will work on writing a local test case that reproduces it, and on the fix.

@zasweq
Contributor

zasweq commented Oct 24, 2023

It's weird that this hasn't come up yet. It seems like it would be hit by any LDS + RDS on the server :/.

@zasweq
Contributor

zasweq commented Oct 24, 2023

#6726 this PR changes the logic for LDS + RDS, but it looks like it would still be an issue. Can you try it on master?

halvards added a commit to GoogleCloudPlatform/solutions-workshops that referenced this issue Oct 30, 2023
This version updates the control plane implementations to use a single
xDS resource snapshot for all nodes, instead of separate snapshots for
each node.

This commit also includes the following fixes and changes:

- Fix sending `xds:///` requests from the bastion troubleshooting pod.
  This is fixed by the use of a single xDS resource snapshot.
- Handle missing zone and node name in EndpointSlice
  (`control-plane-go`).
- Remove hardcoded server listener address and port. xDS-enabled
  gRPC servers can now use any address and port. New addresses and
  ports are captured on LDS stream creation.
- Use inline RouteConfiguration for server Listener instead of RDS in
  `control-plane-go`. This is a workaround for
  grpc/grpc-go#6683
- Enabled additional experimental gRPC xDS flags via envvars.
- Added make targets for mixing Go and Java control plane and greeter.
- Added steps for running kind on ChromeOS.
- Updated dependencies.
@halvards
Contributor Author

Hi @zasweq, thanks for looking into this!

#6726 this PR changes the logic for LDS + RDS, but it looks like it would still be an issue. Can you try it on master?

I tried using both v1.59.0 and current HEAD of master (commit 8cb9846), with the same result. I see that the envvar flags for experimental features such as GRPC_XDS_EXPERIMENTAL_RBAC have also been removed on master, so the first option from "What else did you try?" above can't be exercised. (I understand the desire to remove the flags.)

Do you happen to have reproducible code

I do :-D https://github.com/GoogleCloudPlatform/solutions-workshops/tree/grpc-xds/v0.0.2/grpc-xds-workshop

If you have recent versions of Skaffold and kustomize and access to a Kubernetes cluster, you can build and deploy the control plane and sample app simply with make run-go.

If you want to deploy with delve, with port forwarding set up for remote interactive debugging of both the control plane and the xDS-enabled gRPC server, you can run make debug-go. The repo contains debug launch configurations for VS Code and GoLand.

If you want to mix and match Go and Java, there are no convenient make targets available, but here's how to run the Go control plane implementation with the Java xDS-enabled gRPC server:

skaffold run --build-concurrency=0 --detect-minikube=false --module=go-java --port-forward=user --skip-tests

Use --module=java-go for the inverse combination.

The Go implementation uses klog for logging, and you can adjust the file-filtered log level verbosity of the xDS-enabled gRPC server here:
https://github.com/GoogleCloudPlatform/solutions-workshops/blob/499cb0f90c47ce0c761ce94c8f41413f34a59f45/grpc-xds-workshop/greeter-go/k8s/base/patch-go-log-flags.yaml#L28

The envvars controlling experimental features can be tweaked here:
https://github.com/GoogleCloudPlatform/solutions-workshops/blob/499cb0f90c47ce0c761ce94c8f41413f34a59f45/grpc-xds-workshop/k8s/greeter/overlays/diy-xds/patch-xds-init-diy.yaml#L29-L47

It's weird that this hasn't come up yet. It seems like it would be hit by any LDS + RDS on the server :/.

Traffic Director inlines the server Listener's RouteConfiguration, so users of that control plane wouldn't encounter this issue. I'm not sure what other common control plane implementations do.

@zasweq
Contributor

zasweq commented Nov 1, 2023

Hello, thank you for the reply. I attempted to fix this issue in #6755. However, in the process of trying to fix this nil panic, we discovered that server-side LDS with RDS not inlined is entirely broken in Go, and we are working on a fix now. Thus, it is safe to say that LDS + non-inlined RDS is currently unsupported by our Go server (we will not backport the fix to any release branch, as the layering changed for the server side and it was entirely broken in the first place). As a temporary workaround, consider inlining the RouteConfiguration in the Listener instead of referencing it via RDS. I will update this issue once the fix is in place.

@arvindbr8
Member

Closing this, and tracking it in #6788

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 14, 2024