xDS: Using RDS to reference RouteConfiguration in server Listener results in panic due to nil FilterChainManager #6683
Comments
Hello, thank you for bringing this issue up. Do you happen to have reproducible code (only locally)? I'm assuming not, since I think this uses a legitimate management server, but if you had a local way to reproduce this, that would be super helpful. This looks like a legitimate bug; I will work on writing a local test case that reproduces this, and on the fix.
It's weird that this hasn't come up yet. It seems like it would be hit by any LDS + RDS on the server :/.
PR #6726 changes the logic for LDS + RDS, but it looks like this would still be an issue. Can you try it on master?
This version updates the control plane implementations to use a single xDS resource snapshot for all nodes, instead of separate snapshots for each node. This commit also includes the following fixes and changes:
- Fix sending `xds:///` requests from the bastion troubleshooting pod. This is fixed by the use of a single xDS resource snapshot.
- Handle missing zone and node name in EndpointSlice (`control-plane-go`).
- Remove hardcoded server listener address and port. xDS-enabled gRPC servers can now use any address and port. New addresses and ports are captured on LDS stream creation.
- Use inline RouteConfiguration for server Listener instead of RDS in `control-plane-go`. This is a workaround for grpc/grpc-go#6683.
- Enabled additional experimental gRPC xDS flags via envvars.
- Added make targets for mixing Go and Java control plane and greeter.
- Added steps for running kind on ChromeOS.
- Updated dependencies.
Hi @zasweq, thanks for looking into this!
I tried using both v1.59.0 and current HEAD of master (commit 8cb9846), with the same result. I see that the envvar flags for experimental features such as
I do :-D https://github.com/GoogleCloudPlatform/solutions-workshops/tree/grpc-xds/v0.0.2/grpc-xds-workshop
If you have recent versions of Skaffold and kustomize and access to a Kubernetes cluster, you can build and deploy the control plane and sample app simply with
And if you want to deploy with delve, and port forwarding set up for remote interactive debugging of both the control plane and the xDS-enabled gRPC server, you can run
If you want to mix and match Go and Java, there are no convenient
skaffold run --build-concurrency=0 --detect-minikube=false --module=go-java --port-forward=user --skip-tests
Use
The Go implementation uses
The envvars controlling experimental features can be tweaked here:
Traffic Director inlines the server Listener's RouteConfiguration, so users of that control plane wouldn't encounter this issue. I'm not sure what other common control plane implementations do.
Hello, thank you for the reply. I attempted to fix this issue in #6755. However, in the process of trying to fix this nil panic, we discovered that server-side LDS with RDS not inlined is entirely broken in Go, and we are working on a fix now. Thus, it is safe to say that LDS + RDS not inlined is currently unsupported by our Go server (we will not backport the fix to any release branch, as the layering changed for the server side and it is also entirely broken in the first place). As a temporary workaround, perhaps switch the RouteConfiguration to be inlined in the Listener instead of fetched via RDS. I will update this once the fix is in place.
Closing this, and tracking it in #6788.
What version of gRPC are you using?
gRPC-Go v1.58.2
What version of Go are you using (`go version`)?
go version go1.20.8 darwin/arm64
What operating system (Linux, Windows, …) and version?
macOS
What did you do?
Set up an xDS management server, populated with a server Listener and associated RouteConfiguration.
Experiment 1: The server Listener contains an inline RouteConfiguration.
Experiment 2: The server Listener specifies using RDS (via ADS) to look up the RouteConfiguration, instead of an inline config in the Listener.
What did you expect to see?
Experiment 1: The xDS-enabled gRPC server fetches the server Listener with its inline RouteConfiguration using LDS. These two resources configure the server to handle the traffic, including setting the `FilterChainManager` of the `listenerWrapper`.
Experiment 2: The xDS-enabled gRPC server fetches the server Listener using LDS and the RouteConfiguration using RDS. These two resources configure the server to handle the traffic, including setting the `FilterChainManager` of the `listenerWrapper`.
What did you see instead?
Experiment 1:
The xDS-enabled gRPC server fetches the server Listener with its inline RouteConfiguration, and the server can serve traffic.
Output of `grpcdebug IP:PORT xds status`:
Output of `grpcdebug IP:PORT xds config | yq --input-format=json --prettyPrint`:
Experiment 2:
After fetching the resources, the xDS-enabled gRPC server panics due to a nil pointer dereference in the method `Lookup()` on line 695 of `filter_chain.go`, as the method's pointer receiver (`*FilterChainManager`) is `nil`:
grpc-go/xds/internal/xdsclient/xdsresource/filter_chain.go
Lines 694 to 695 in c0aa20a
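
For readers less familiar with Go, the failure mode described above can be shown in isolation. This is a minimal sketch, not the gRPC-Go code: `filterChainManager`, `lookup`, and `safeLookup` are hypothetical names invented for illustration. Calling a method on a nil pointer receiver is legal in Go; the panic only happens once the method reads a field through that receiver.

```go
package main

import "fmt"

// filterChainManager is a hypothetical stand-in for the real
// FilterChainManager; only the nil-receiver behavior is modeled.
type filterChainManager struct {
	chains map[string]string
}

// lookup reads a field through the receiver, so it panics with a
// nil pointer dereference when the receiver itself is nil.
func (m *filterChainManager) lookup(addr string) (string, bool) {
	c, ok := m.chains[addr] // dereferences m
	return c, ok
}

// safeLookup calls lookup and converts a panic into an error so the
// failure can be observed without crashing the program.
func safeLookup(m *filterChainManager, addr string) (res string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("lookup panicked: %v", r)
		}
	}()
	res, _ = m.lookup(addr)
	return res, nil
}

func main() {
	var m *filterChainManager // nil, like l.filterChains in Experiment 2
	_, err := safeLookup(m, "10.0.0.1:50051")
	fmt.Println(err)
}
```

The `recover` wrapper is only there to make the panic visible in a small program; it is not a suggested fix for the issue.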
Logs from the gRPC-Go xDS client:
What do you think is happening?
After some interactive debugging:
In the `handleLDSUpdate()` method in `listener_wrapper.go`, there is a conditional step that only updates the `listenerWrapper`'s `FilterChainManager` (`l.filterChains`) if the `InboundListenerConfig`'s `FilterChainManager` does not contain the names of any RouteConfigurations to be fetched via RDS:
grpc-go/xds/internal/server/listener_wrapper.go
Lines 413 to 419 in c0aa20a
Experiment 1: When the RouteConfiguration is inlined in the server Listener, this condition is true, and the call to `switchMode()` to set the serving mode to `SERVING` results in the `listenerWrapper`'s `FilterChainManager` being set based on the value of the `InboundListenerConfig`'s `FilterChainManager`, and all is good.
Experiment 2: When the RouteConfiguration is fetched dynamically using RDS, this condition is false, and the `listenerWrapper`'s `FilterChainManager` remains `nil`. Later, in `handleRDSUpdate()`, the call to `switchMode()` to set the serving mode to `SERVING` provides a nil `*FilterChainManager` argument (`l.filterChains`) because `handleLDSUpdate()` didn't set the `listenerWrapper`'s `FilterChainManager`:
grpc-go/xds/internal/server/listener_wrapper.go
Line 367 in c0aa20a
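
The control flow described in the two experiments can be sketched as a small stand-alone model. This is a simplified illustration under stated assumptions, not the actual `listener_wrapper.go` code: the type and method names mirror the ones mentioned above, but the fields (`routeConfigNames`, `serving`) and method signatures are hypothetical and greatly reduced.

```go
package main

import "fmt"

// filterChainManager is a hypothetical, reduced model. It only tracks
// the RouteConfiguration names that would need to be fetched via RDS.
type filterChainManager struct {
	routeConfigNames []string
}

// listenerWrapper is a hypothetical, reduced model of the wrapper.
type listenerWrapper struct {
	filterChains *filterChainManager
	serving      bool
}

// switchMode stores whatever manager it is given and sets the mode.
func (l *listenerWrapper) switchMode(fcs *filterChainManager, serving bool) {
	l.filterChains = fcs
	l.serving = serving
}

// handleLDSUpdate only installs the manager when no RDS lookups are
// pending, which models the conditional described in the report.
func (l *listenerWrapper) handleLDSUpdate(fcs *filterChainManager) {
	if len(fcs.routeConfigNames) == 0 {
		l.switchMode(fcs, true)
	}
	// Otherwise the wrapper waits for RDS and l.filterChains is untouched.
}

// handleRDSUpdate passes l.filterChains back into switchMode, which is
// still nil if handleLDSUpdate skipped the assignment.
func (l *listenerWrapper) handleRDSUpdate() {
	l.switchMode(l.filterChains, true)
}

func main() {
	// Experiment 1: inline RouteConfiguration, no RDS names.
	inline := &listenerWrapper{}
	inline.handleLDSUpdate(&filterChainManager{})
	fmt.Println("Experiment 1: filterChains nil?", inline.filterChains == nil)

	// Experiment 2: RouteConfiguration referenced by name via RDS.
	viaRDS := &listenerWrapper{}
	viaRDS.handleLDSUpdate(&filterChainManager{routeConfigNames: []string{"route-1"}})
	viaRDS.handleRDSUpdate()
	fmt.Println("Experiment 2: filterChains nil?", viaRDS.filterChains == nil)
}
```

In this model, Experiment 1 ends with a non-nil manager, while Experiment 2 reaches the serving state with a nil one, which matches the state the report describes just before the panic in `Lookup()`.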
What else did you try?
I also tried setting `GRPC_XDS_EXPERIMENTAL_RBAC=false`. This resulted in the xDS-enabled gRPC server not fetching the RouteConfiguration via RDS at all, which also doesn't seem quite right. However, the gRPC server could still serve requests. I believe the reason for not fetching the RouteConfiguration dynamically via RDS is this code block:
grpc-go/xds/internal/xdsclient/xdsresource/filter_chain.go
Lines 632 to 636 in c0aa20a
I created an xDS-enabled gRPC server using gRPC-Java v1.58.0 and connected it to the same xDS management server with the server Listener's RouteConfiguration to be fetched using RDS (via ADS). This implementation correctly fetched both the server Listener and the associated RouteConfiguration via RDS. I verified this using `grpcdebug`:
Output of `grpcdebug IP:PORT xds status` for the gRPC-Java xDS client:
Output of `grpcdebug IP:PORT xds config | yq --input-format=json --prettyPrint` for the gRPC-Java xDS client: