Document network diagnostic tool #5558
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,219 @@ | ||
--- | ||
description: Learn to use the built-in network debugger to debug overlay networking problems | ||
keywords: network, troubleshooting, debug | ||
title: Debug overlay or swarm networking issues | ||
--- | ||
|
||
Docker CE 17.12 and higher introduce a network debugging tool designed to help | ||
debug issues with overlay networks and swarm services running on Linux hosts. | ||
When enabled, a network diagnostic server listens on the specified port and | ||
provides diagnostic information. The network debugging tool should only be | ||
started to debug specific issues, and should not be left running all the time. | ||
|
||
Information about networks is stored in a database, which can be examined using | ||
the API. | ||
|
||
The Docker API exposes endpoints to query and control the network debugging | ||
tool. CLI integration is provided as a preview, but the implementation is not | ||
yet considered stable and commands and options may change without notice. | ||
Review comment: We should add a big fat warning that this tool should be used with care, as (IIUC) incorrect use of the tool can destroy/damage the network (it's not a read-only tool), and also expose information about your cluster's configuration that should be kept private (so don't expose this API outside of the host). |
||
|
||
## Enable the diagnostic tool | ||
|
||
The tool currently only works on Docker hosts running on Linux. Repeat these | ||
steps for each node participating in the swarm. | ||
|
||
1. Set the `network-diagnostic-port` to a port which is free on the Docker | ||
host, in the `/etc/docker/daemon.json` configuration file. | ||
|
||
```json | ||
"network-diagnostic-port": <port> | ||
``` | ||
|
||
2. Get the process ID (PID) of the `dockerd` process. It is the second field in | ||
the output, and is typically a number from 2 to 6 digits long. | ||
|
||
```bash | ||
$ ps aux | grep dockerd | grep -v grep | ||
Review comment: If running with systemd (perhaps it's ok to assume that's the case), it's possible to use |
||
``` | ||
|
||
3. Reload the Docker configuration without restarting Docker, by sending the | ||
`HUP` signal to the PID you found in the previous step. | ||
|
||
```bash | ||
kill -HUP <pid-of-dockerd> | ||
``` | ||
|
||
A message like the following will appear in the Docker host logs: | ||
|
||
```none | ||
Starting the diagnose server listening on <port> for commands | ||
``` | ||
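The PID lookup and reload steps above can be scripted. The following is a minimal sketch, not part of the documented tooling: `dockerd_pids` and `reload_dockerd` are hypothetical helper names, and the sketch assumes `awk` is available and that the first matching `dockerd` process is the one to signal.

```bash
# Hypothetical helper: read `ps aux` output on stdin and print the PID
# (second field) of every line that mentions dockerd, skipping the
# grep/awk processes that are doing the searching.
dockerd_pids() {
  awk '/dockerd/ && !/awk/ && !/grep/ { print $2 }'
}

# Hypothetical helper: send HUP to the first dockerd PID found so that
# the daemon reloads /etc/docker/daemon.json without restarting.
reload_dockerd() {
  local pid
  pid="$(ps aux | dockerd_pids | head -n 1)"
  if [ -z "$pid" ]; then
    echo "dockerd does not appear to be running" >&2
    return 1
  fi
  kill -HUP "$pid"
}
```

Calling `reload_dockerd` as a user allowed to signal the daemon performs steps 2 and 3 in one go. Check the Docker host logs afterwards for the message shown above.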
|
||
## Disable the diagnostic tool | ||
|
||
Repeat these steps for each node participating in the swarm. | ||
|
||
1. Remove the `network-diagnostic-port` key from the `/etc/docker/daemon.json` | ||
configuration file. | ||
|
||
2. Get the process ID (PID) of the `dockerd` process. It is the second field in | ||
the output, and is typically a number from 2 to 6 digits long. | ||
|
||
```bash | ||
$ ps aux | grep dockerd | grep -v grep | ||
``` | ||
|
||
3. Reload the Docker configuration without restarting Docker, by sending the | ||
`HUP` signal to the PID you found in the previous step. | ||
|
||
```bash | ||
kill -HUP <pid-of-dockerd> | ||
``` | ||
|
||
A message like the following will appear in the Docker host logs: | ||
|
||
```none | ||
Disabling the diagnose server | ||
``` | ||
|
||
## Access the diagnostic tool's API | ||
|
||
The network diagnostic tool exposes its own RESTful API. To access the API, | ||
send an HTTP request to the port where the tool is listening. The following | ||
commands assume the tool is listening on port 2000. | ||
|
||
Examples are not given for every endpoint. | ||
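Because several endpoints take more than one query parameter, it can help to build the URL in one place and pass it to `curl` as a single quoted argument. The sketch below is illustrative only: `diag_url` is a hypothetical helper name, and it assumes the tool is reached over plain HTTP on `localhost`.

```bash
# Hypothetical helper: build a diagnostic API URL from a port, an
# endpoint name, and zero or more key=value query parameters.
diag_url() {
  local port="$1" endpoint="$2"
  shift 2
  local query="" param
  for param in "$@"; do
    # Join parameters with "&", with no leading separator.
    query="${query:+$query&}$param"
  done
  printf 'http://localhost:%s/%s%s\n' "$port" "$endpoint" "${query:+?$query}"
}
```

For example, `curl "$(diag_url 2000 networkpeers "nid=<network id>")"` sends the request as one argument, so the shell does not interpret `?` or `&` in the URL.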
|
||
### Get help | ||
|
||
```bash | ||
$ curl localhost:2000/help | ||
|
||
OK | ||
/updateentry | ||
/getentry | ||
/gettable | ||
/leavenetwork | ||
/createentry | ||
/help | ||
/clusterpeers | ||
/ready | ||
/joinnetwork | ||
/deleteentry | ||
/networkpeers | ||
/ | ||
/join | ||
``` | ||
Review comment: Shouldn't the |
||
|
||
### Join or leave the network database cluster | ||
|
||
```bash | ||
$ curl localhost:2000/join?members=ip1,ip2,... | ||
Review comment: It might be good to clarify
Review comment: Yes, I think that's useful (at least for one of the examples) |
||
``` | ||
|
||
```bash | ||
$ curl localhost:2000/leave?members=ip1,ip2,... | ||
``` | ||
|
||
### Join or leave a network | ||
|
||
```bash | ||
$ curl localhost:2000/joinnetwork?nid=<network id> | ||
``` | ||
|
||
```bash | ||
$ curl localhost:2000/leavenetwork?nid=<network id> | ||
``` | ||
|
||
### List cluster peers | ||
|
||
```bash | ||
$ curl localhost:2000/clusterpeers | ||
``` | ||
|
||
### List nodes connected to a given network | ||
|
||
```bash | ||
$ curl localhost:2000/networkpeers?nid=<network id> | ||
``` | ||
|
||
### Dump database tables | ||
|
||
The tables are called `endpoint_table` and `overlay_peer_table`. These names may | ||
change. | ||
|
||
```bash | ||
$ curl "localhost:2000/gettable?nid=<network id>&tname=<table name>" | ||
``` | ||
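To dump both tables for one network in a single pass, a loop along these lines may be convenient. This is a sketch under stated assumptions: `dump_tables` is a hypothetical helper name, the two table names are the ones listed above (which may change), and the `CURL` variable is an illustrative override that is not part of the tool.

```bash
# Hypothetical helper: request /gettable for each known table name.
# Set CURL=echo to print the URLs instead of sending the requests.
dump_tables() {
  local port="$1" nid="$2" table
  for table in endpoint_table overlay_peer_table; do
    # The URL is quoted so the shell does not treat "&" as a control operator.
    "${CURL:-curl}" "http://localhost:${port}/gettable?nid=${nid}&tname=${table}"
  done
}
```

Running `CURL=echo dump_tables 2000 <network id>` previews the two requests without contacting the diagnostic server.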
|
||
### Interact with a specific database table | ||
|
||
The tables are called `endpoint_table` and `overlay_peer_table`. These names may | ||
change. | ||
|
||
```bash | ||
$ curl "localhost:2000/<method>?nid=<network id>&tname=<table name>&key=<key>[&value=<value>]" | ||
``` | ||
|
||
## Access the diagnostic tool's CLI | ||
|
||
The CLI is provided as a preview and is not yet stable. Commands or options may | ||
change at any time. | ||
|
||
The CLI executable is called `diagnosticClient` and is made available using a | ||
standalone container. | ||
|
||
The following flags are supported: | ||
|
||
| Flag | Description | | ||
|---------------|-------------------------------------------------| | ||
| -c <string> | Command to run. One of `sd` or `overlay`. | | ||
| -ip <string> | The IP address to query. Defaults to 127.0.0.1. | | ||
| -net <string> | The target network ID. | | ||
| -port <int> | The target port. | | ||
| -v | Enable verbose output. | | ||
|
||
### Access the CLI | ||
|
||
The CLI is provided as a container that needs to run using privileged mode. | ||
|
||
1. To run the container, use a command like the following: | ||
|
||
```bash | ||
$ docker container run --name net-diagnostic -d --privileged --network host fcrisciani/network-diagnostic | ||
Review comment: How about we push the containerized tool to
Review comment: Agreed; but that can be handled separate from this PR (and this reference updated in a follow-up) |
||
``` | ||
|
||
2. Connect to the container using `docker container attach <container-ID>`, | ||
and start the server using the following command: | ||
|
||
```bash | ||
$ kill -HUP 1 | ||
``` | ||
|
||
3. If you have not already done so, join the Docker host to the swarm, then | ||
run the diagnostic CLI within the container. | ||
|
||
```bash | ||
$ ./diagnosticClient <flags>... | ||
``` | ||
|
||
4. When finished debugging, stop the container. | ||
|
||
### Examples | ||
|
||
The following commands dump the service discovery table and verify node | ||
ownership. | ||
|
||
**Standalone network:** | ||
|
||
```bash | ||
$ diagnosticClient -c sd -v -net n8a8ie6tb3wr2e260vxj8ncy4 | ||
``` | ||
|
||
**Overlay network:** | ||
|
||
```bash | ||
$ diagnosticClient -port 2001 -c overlay -v -net n8a8ie6tb3wr2e260vxj8ncy4 | ||
``` | ||
|
||
|
Review comment: `stored in a database`: how about a little more details around the networkdb: