NetworkDB qlen optimization #2216
Conversation
Allow writing and deleting X number of entries. Allow querying the queue length. Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Added some optimizations to reduce the messages in the queue:

1) On network join, the node executes a TCP sync with all the nodes that it is aware are part of the specific network. Before this change, the node was redistributing all the entries during that time, which meant that if the network had 10K entries, the queue of the joining node would jump to 10K. The fix adds a flag on the network that avoids inserting any entry into the queue until the sync happens. Note that right now the flag is set in a best-effort way; there is no real check that at least one of the nodes succeeded.

2) Limit the number of messages to redistribute coming from a TCP sync. Introduced a threshold that limits the number of messages that are propagated; this will disable the re-propagation in case of heavy load.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
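As a minimal sketch of the second point, assuming a hypothetical threshold constant and a `queueEntry` callback (neither name is taken from the actual libnetwork code), the cap could look roughly like this:

```go
package sketch

// maxQueuedFromBulkSync is a hypothetical threshold, not the actual
// constant used by NetworkDB.
const maxQueuedFromBulkSync = 500

// requeueFromBulkSync re-broadcasts at most maxQueuedFromBulkSync of the
// entries received via the TCP sync; when the sync delivers more than the
// threshold (heavy load), the remaining entries are not re-propagated and
// are left to the bulk/periodic sync machinery.
func requeueFromBulkSync(entries [][]byte, queueEntry func([]byte)) {
	for i, e := range entries {
		if i >= maxQueuedFromBulkSync {
			break
		}
		queueEntry(e)
	}
}
```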
ping @ctelfer @dani-docker PTAL
case "lt": | ||
if node.result > size { | ||
log.Fatalf("Expected queue size from %s to be %d < %d", node.id, node.result, size) | ||
} |
I understand what this is doing, but not why. Why is an assertion necessary for what looks like a query? This command computes an average queue size. Why should it fail if the size of one of the nodes' queues is larger or smaller than some threshold, and moreover, why is it necessary that such a test be present?
Took the comments offline. The short answer is that this is to understand if any of the nodes has a bigger queue on the testing infra.
Ah, I understand now that this is a test, not an information gathering function despite its debug output at the end. Please disregard this comment and the one below.
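For context, here is a rough, purely illustrative version of the check being discussed: each node reports its queue length, the test prints the average, and the run fails if any single node violates the requested comparison. The struct and helper names are hypothetical, not the actual test code.

```go
package dbtest

import "log"

// queueQueryResult mirrors the shape of the per-node result used in the
// snippet above; the struct and field names here are illustrative.
type queueQueryResult struct {
	id     string
	result int // queue length reported by the node
}

// checkQueueSizes sums the reported queue lengths to print an average, and
// aborts the run if any single node violates the requested comparison.
func checkQueueSizes(nodes []queueQueryResult, comparison string, size int) {
	total := 0
	for _, node := range nodes {
		total += node.result
		switch comparison {
		case "lt": // every node's queue must stay below the threshold
			if node.result > size {
				log.Fatalf("Expected queue size from %s to be %d < %d", node.id, node.result, size)
			}
		case "gt": // every node's queue must exceed the threshold
			if node.result < size {
				log.Fatalf("Expected queue size from %s to be %d > %d", node.id, node.result, size)
			}
		}
	}
	if len(nodes) > 0 {
		log.Printf("Average queue size: %d", total/len(nodes))
	}
}
```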
// Enable watch of tables from clients
for i := 0; i < parallelWriters; i++ {
	go clientWatchTable(ips[i], servicePort, networkName, tableName, doneCh)
}
Something seems wrong here. What happens if the # of parallel writers is less than the number of IPs specified in the command? Seems like some nodes would go "unwatched". More importantly, what happens if the number of parallel watchers is greater than the number of IPs? Seems like it will panic!
Ok. So @fcrisciani explained that it was intended to allow writing with fewer than the full # of nodes in order to ensure that the writes get propagated. Having more parallel writers than nodes would, of course, cause a panic. However, it was designed for use in more constrained (testing) environments.
key := "key-" + strconv.Itoa(i) + "-" | ||
logrus.Infof("Spawn worker: %d on IP:%s", i, ips[i]) | ||
go deleteKeysNumber(ips[i], servicePort, networkName, tableName, key, numberOfKeys, doneCh) | ||
} |
Ditto here. I'm probably not understanding how this is supposed to be invoked, but from the code it seems like parallelWriters should just be derived from len(ips).
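A small sketch of the kind of guard being suggested here (purely illustrative, not the actual test code): refuse a worker count larger than the available IPs, which would otherwise panic on `ips[i]`, and default to one worker per node.

```go
package dbtest

import "fmt"

// validateWorkerCount is an illustrative guard for the spawning loops above:
// it rejects a worker count that exceeds the available IPs (which would
// otherwise panic with an index out of range) and defaults to one worker
// per node when no count is given.
func validateWorkerCount(parallelWriters int, ips []string) (int, error) {
	if parallelWriters > len(ips) {
		return 0, fmt.Errorf("parallelWriters (%d) exceeds available IPs (%d)", parallelWriters, len(ips))
	}
	if parallelWriters <= 0 {
		parallelWriters = len(ips)
	}
	return parallelWriters, nil
}
```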
@@ -168,7 +167,6 @@ func (nDB *NetworkDB) sendTableEvent(event TableEvent_Type, nid string, tname st
 	id:    nid,
 	tname: tname,
 	key:   key,
-	node:  nDB.config.NodeID,
Just to confirm: the node attribute is just redundant, I don't see us reading it anywhere. Is this the reason for removing it? And slightly reducing the message size?
Yes, correct: today it has no use, so I just removed it. This header is also not sent out, so this saves memory locally, not in the messages sent.
👍
LGTM
// note this is a best effort, we are not checking the result of the bulk sync
nDB.Lock()
n.inSync = true
nDB.Unlock()
So what will happen if we get a false positive here, in other words one node in `bulkSync` failed and the others succeeded, causing the `err` to be nil? Do we just count on the periodic sync? Would this affect convergence time?
This flag is mainly there to reduce the number of messages sent around during the network join. The current logic tries to improve convergence by also re-advertising the messages that come from the TCP sync. Before this change, if the table had 10K entries the node started with 10K messages in the queue, because it was trying to propagate all the entries it came to know about during the join-network TCP sync. This change instead makes sure that during the first TCP sync no entries are resent around. This way a node joins with close to 0 entries in the queue and its full state synced.

Now, to answer your question: if the bulkSync succeeds with at least 1 node, that will be enough to have the internal state "enough in sync" in terms of entries, because in theory all the nodes on the same network hold the same content. The difference will only be in the entries still "in flight", but their number should be negligible compared with the number of entries coming from the bulk sync.

This flag does not really affect the convergence logic, because this node anyway starts with an empty table and an empty queue when it joins, so it was already not helping propagate the messages around.
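To make the mechanism concrete, here is a rough sketch of how such a gate could look. The type, field, and function names are illustrative, not the exact NetworkDB implementation: table events for a network are only queued for gossip once the network is marked in sync, and the marking itself is best effort after the join-time bulk sync, as in the snippet shown earlier in this thread.

```go
package sketch

import "sync"

// netState is a stand-in for NetworkDB's per-network state; only the field
// relevant to the gating discussed above is shown.
type netState struct {
	inSync bool // set (best effort) once the join-time TCP sync has run
}

type netDB struct {
	sync.Mutex
	networks map[string]*netState
}

// queueTableEvent enqueues a gossip message only once the network has been
// marked in sync; entries learned before that point are expected to reach
// peers through the bulk sync itself or the periodic sync.
func (db *netDB) queueTableEvent(nid string, msg []byte, broadcast func([]byte)) {
	db.Lock()
	n, ok := db.networks[nid]
	inSync := ok && n.inSync
	db.Unlock()
	if !inSync {
		return
	}
	broadcast(msg)
}

// markInSync is called after the join-time bulk sync without checking its
// result (best effort).
func (db *netDB) markInSync(nid string) {
	db.Lock()
	if n, ok := db.networks[nid]; ok {
		n.inSync = true
	}
	db.Unlock()
}
```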
let me know if I addressed the question properly
yes, that explains it!
LGTM 👍
Added some optimizations to reduce the messages in the queue. The other commit is mainly testing infrastructure to expose the queue length and to allow writing a fixed number of entries.