NetworkDB qlen optimization #2216

Merged · 2 commits · Jul 5, 2018

Conversation

fcrisciani

Added some optimizations to reduce the messages in the queue:

  1. On network join, the node performs a TCP sync with all the nodes that it knows to be part of the specific network. Previously, during this phase the node was also redistributing all the entries, so if the network had 10K entries the queue of the joining node would jump to 10K. The fix adds a flag on the network that avoids inserting any entry into the queue until the sync has happened. Note that right now the flag is set in a best-effort way; there is no real check that at least one of the nodes succeeded.
  2. Limit the number of messages to redistribute coming from a TCP sync. Introduced a threshold that limits the number of messages that are propagated, so that in case of heavy load this re-propagation is disabled (a rough sketch of both ideas follows below).

The other commit is mainly testing infrastructure to expose the queue length and to allow writing a fixed number of entries.
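For illustration, here is a minimal Go sketch of the two ideas above. The names (tableEvent, network.inSync, maxRebroadcastOnSync, rebroadcastSynced) and the surrounding types are assumptions for the sketch, not the actual NetworkDB code:

package sketch

// tableEvent stands in for a NetworkDB table event (illustrative only).
type tableEvent struct {
    table, key string
    value      []byte
}

// network mirrors the idea of per-network state carrying the new flag.
type network struct {
    id     string
    inSync bool // set (best effort) once the initial TCP bulk sync has run
}

// maxRebroadcastOnSync is an assumed cap: if a TCP sync delivers more events
// than this, they are not re-queued for gossip (the heavy-load case).
const maxRebroadcastOnSync = 500

// rebroadcastSynced sketches how events learned via TCP sync could be filtered
// before being put back on the gossip queue.
func (n *network) rebroadcastSynced(events []tableEvent, enqueue func(tableEvent)) {
    // Optimization 1: a node that has not completed its first sync does not
    // re-advertise entries it is only now learning, so its queue stays near 0.
    if !n.inSync {
        return
    }
    // Optimization 2: under heavy load, skip re-propagation entirely.
    if len(events) > maxRebroadcastOnSync {
        return
    }
    for _, ev := range events {
        enqueue(ev)
    }
}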

Flavio Crisciani added 2 commits July 2, 2018 16:47
Allow to write and delete X number of entries
Allow to query the queue length

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
Added some optimizations to reduce the messages in the queue

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
@fcrisciani fcrisciani requested review from ctelfer and dani-docker July 3, 2018 00:00
@thaJeztah
Member

ping @ctelfer @dani-docker PTAL

case "lt":
if node.result > size {
log.Fatalf("Expected queue size from %s to be %d < %d", node.id, node.result, size)
}
Contributor

I understand what this is doing, but not why. Why is an assertion necessary for what looks like a query? This command computes an average queue size. Why should it fail if the queue size for one of the nodes is larger or smaller than some threshold, and, moreover, why is it necessary that such a test be present?

Author

Took the comments offline. The short answer is that this is to check whether any of the nodes has a bigger queue on the testing infra.

Contributor

Ah, I understand now that this is a test, not an information gathering function despite its debug output at the end. Please disregard this comment and the one below.

// Enable watch of tables from clients
for i := 0; i < parallelWriters; i++ {
    go clientWatchTable(ips[i], servicePort, networkName, tableName, doneCh)
}
Contributor

Something seems wrong here. What happens if the number of parallel writers is less than the number of IPs specified in the command? It seems like some nodes would go "unwatched". More importantly, what happens if the number of parallel watchers is greater than the number of IPs? It seems like it will panic!

Contributor

OK. So @fcrisciani explained that it was intended to be able to write with fewer than the full number of nodes in order to ensure that the writes get propagated. Having more parallel writers than nodes would, of course, cause a panic; however, it was designed for use in more constrained (testing) environments.

key := "key-" + strconv.Itoa(i) + "-"
logrus.Infof("Spawn worker: %d on IP:%s", i, ips[i])
go deleteKeysNumber(ips[i], servicePort, networkName, tableName, key, numberOfKeys, doneCh)
}
Contributor

Ditto here. I'm probably not understanding how this is supposed to be invoked, but from the code it seems like parallelWriters should just be derived from len(ips).
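A minimal sketch of the kind of guard being suggested here (illustrative only; it reuses the variables from the excerpts above and is not part of the PR):

// Hypothetical sanity check for the test driver: fail fast instead of
// panicking on ips[i] when more writers are requested than IPs are available.
if parallelWriters > len(ips) {
    log.Fatalf("parallelWriters (%d) exceeds available IPs (%d)", parallelWriters, len(ips))
}
// Alternatively, derive the worker count directly from the IP list:
// parallelWriters = len(ips)
for i := 0; i < parallelWriters; i++ {
    go clientWatchTable(ips[i], servicePort, networkName, tableName, doneCh)
}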

@@ -168,7 +167,6 @@ func (nDB *NetworkDB) sendTableEvent(event TableEvent_Type, nid string, tname st
id: nid,
tname: tname,
key: key,
node: nDB.config.NodeID,
Contributor

Just to confirm: the node attribute is just redundant, I don't see us reading it anywhere. Is this the reason for removing it, and for slightly reducing the message size?

Author

Yes, correct: today it has no use, so I just removed it. This header is also not sent out on the wire, so this saves memory locally, not in the messages sent.

Contributor

👍

Contributor

@ctelfer ctelfer left a comment

LGTM

// note this is a best effort, we are not checking the result of the bulk sync
nDB.Lock()
n.inSync = true
nDB.Unlock()
Contributor

So what will happen if we get a false positive here; in other words, one node in the bulkSync failed and the others succeeded, causing the err to be nil? Do we just count on the periodic sync? Would this affect convergence time?

Author

This flag is mainly to reduce the number of messages sent around during the network join. The current logic tries to improve convergence by re-advertising also the messages that come in through the TCP sync. Before this change, if the table had 10K entries the node started with 10K messages in the queue, because it was trying to propagate all the entries it came to know about during the join-time TCP sync. This change instead makes sure that during the first TCP sync no entries are resent around. This way a node will join with close to 0 entries in the queue and its full state synced.

Now, to answer your question: if the bulkSync succeeds with at least one node, that will be enough to have the internal state "enough in sync" in terms of entries, because in theory all the nodes on the same network hold the same content. The difference will only be in the entries still "in flight", but their number should be negligible compared with the number of entries coming from the bulk sync.
This flag does not really affect the convergence logic because this node, when it joins, starts with an empty table and an empty queue anyway, so it was already not helping in propagating the messages around.
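A rough sketch of the join-time ordering being described (joinAndSync and bulkSyncWithPeers are assumed names for illustration; the inSync assignment mirrors the excerpt above, this is not the actual libnetwork flow):

// Hypothetical join flow: the node starts with an empty table and queue,
// bulk-syncs over TCP with the peers it knows about, and only then marks
// the network in-sync so that future table events get re-queued for gossip.
func (nDB *NetworkDB) joinAndSync(n *network, peers []string) {
    err := nDB.bulkSyncWithPeers(n, peers) // assumed helper; TCP sync of all entries
    // Best effort: the flag is set even if some peers failed; a single
    // successful sync is enough to carry the full table content.
    _ = err
    nDB.Lock()
    n.inSync = true
    nDB.Unlock()
}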

Author

let me know if I addressed the question properly

Contributor

yes, that explains it!

Contributor

@dani-docker dani-docker left a comment

LGTM 👍
