-
Notifications
You must be signed in to change notification settings - Fork 233
disable GooglePubSubSender if pubsub cannot be reached #1030
Conversation
@davidxia @rohansingh @lndbrg @zalenski this aims to avoid issues like googleapis/google-cloud-java#1432 |
b6efcf2
to
82bd16c
Compare
Adds a `isHealthy()` method to the EventSender interface that is used to filter out any unhealthy senders at startup. This logic is done in a new class named EventSenderFactory, which extracts the identical logic for constructing Lists of EventSenders from the MasterService and the AgentService. To make this common class possible, also extracted a class for the common configuration fields between AgentConfig and MasterConfig. This new CommonConfiguration class is incomplete - more fields can be moved to this class to avoid duplication, but I left that for future commits. I created EventSenderFactory in hopes of writing a test for the logic that unhealthy senders are removed from the List, but the nature of this class makes this test to impractical - since the List-of-EventSender-building involves constructing new instances of KafkaProvider/GooglePubSubProvider/etc and calling methods on instances that those instances return, which is not really feasible to be mocked out in a test.
82bd16c
to
f14c677
Compare
Current coverage is 51.25% (diff: 40.81%)@@ master #1030 diff @@
==========================================
Files 274 276 +2
Lines 13132 13135 +3
Methods 0 0
Messages 0 0
Branches 1700 1700
==========================================
+ Hits 6725 6733 +8
+ Misses 5903 5899 -4
+ Partials 504 503 -1
|
@mattnworb i believe it would be beneficial to have a cache of senders that expires after n minutes and retries connection, in case the targeted event bus is down intermittently. |
I think there are two approaches we could take to improve this so that an intermittent network problem at startup does not cause the pubsub sender to be disabled for as long as the agent is up:
I am leaning towards 1 as 2 would still cause publishing attempts if the agent had total unconnectivity to the pubsub service, and so the ThresholdBundler will still fill up with requests, but just at a slower rate. |
1 sounds like a good idea. |
public void send(final KafkaRecord kafkaRecord) { | ||
@Override | ||
public boolean isHealthy() { | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to PR, but should we always return true here? Is there value in checking we can talk to kafka by using `KafkaProducer.waitOnMetadata() for example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we don't have any issues with the kafka sender and bad behavior when messages can't be published, I opted to leave that out here to avoid changing anything that is working ok today.
Adds a
isHealthy()
method to the EventSender interface that is used tofilter out any unhealthy senders at startup.
This logic is done in a new class named EventSenderFactory, which
extracts the identical logic for constructing Lists of EventSenders from
the MasterService and the AgentService. To make this common class
possible, also extracted a class for the common configuration fields
between AgentConfig and MasterConfig. This new CommonConfiguration class
is incomplete - more fields can be moved to this class to avoid
duplication, but I left that for future commits.
I created EventSenderFactory in hopes of writing a test for the logic
that unhealthy senders are removed from the List, but the nature of this
class makes this test to impractical - since the
List-of-EventSender-building involves constructing new instances of
KafkaProvider/GooglePubSubProvider/etc and calling methods on instances
that those instances return, which is not really feasible to be mocked
out in a test.