CASSSIDECAR-203: Created Endpoint that Triggers an Immediate Schema Report #198
base: trunk
@@ -8,7 +9,7 @@
 * Sidecar schema initialization can be executed on multiple thread (CASSSIDECAR-200)
 * Make sidecar operations resilient to down Cassandra nodes (CASSSIDECAR-201)
 * Fix Cassandra instance not found error (CASSSIDECAR-192)
-* Implemented Schema Reporter for Integration with DataHub (CASSSIDECAR-191)
+* Implement Schema Reporter for Integration with DataHub (CASSSIDECAR-191)
Do not change the existing entries in the CHANGES.txt
Sure.
Is there a specific reason you'd like me to keep this typo I myself accidentally introduced in there forever?
It is mainly to avoid amending the change log. It is more important to have the git history of each line linked to the correct commit.
Btw, it is not really a typo. "Implemented something" describes the change just as clearly.
If you look at the log, there are places where "Adds" and "Add", as well as "Adding", are used. There is no strict rule on the verb's form.
@@ -63,6 +63,9 @@ public class BasicPermissions
 public static final Permission READ_OPERATIONAL_JOB = new DomainAwarePermission("OPERATIONAL_JOB:READ", OPERATION_SCOPE);
 public static final Permission DECOMMISSION_NODE = new DomainAwarePermission("NODE:DECOMMISSION", OPERATION_SCOPE);
+
+// Permissions related to Schema Reporting
+public static final Permission REPORT_SCHEMA = new DomainAwarePermission("SCHEMA:REPORT", CLUSTER_SCOPE);
I do not think an extra permission is required.
The permission needed in order to publish to DataHub is SCHEMA:READ, which already exists. cc: @sarankk
The permission is not for someone to read the schema, but for someone to trigger the schema report on demand. So I think the permission is different. Would love to hear input from @sarankk
The sentiment here is to exercise restraint in adding new verbs. Ideally, there should be a fixed set of verbs to avoid operational pain.
The reason that READ should work here is that the reporter is reading the Cassandra schema. When it publishes (i.e. sends requests to DataHub), the authorization should be enforced by the server (DataHub), not the client (Sidecar).
I feel having two different permissions is better in general, with the read permission automatically granting the report permission as well. But in this particular case, since we already have a periodic task that reports the schema irrespective of this endpoint, can we use the enable flag for schema reporting to control whether reporting is allowed, without adding a separate permission for it?
I was planning to get to this tomorrow, but see there's confusion about this one.
Changing permissions to the existing SCHEMA:READ will be logically equivalent to the following statement:
"Everyone who is allowed to see cluster schema is also allowed to perform DoS attacks on Sidecar."
Is that actually true?
I think we need a separate permission to allow a user to trigger schema reporting. We should not conflate the SCHEMA:READ permission with the ability to report schemas.
Let's also think about the future. We do not want to go down the route of introducing a new verb for every new action, and this change raises that concern.
I am fine with letting PUBLISH pass. But we should really think about it and be mindful about introducing new verbs.
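The disagreement in this thread comes down to which permission string the new route demands. A minimal sketch, assuming permissions are plain RESOURCE:ACTION strings held in a set (the class `PermissionSketch` and the method `isAllowed` are hypothetical and are not Sidecar's actual authorization API), shows what a separate permission buys: a caller holding only SCHEMA:READ cannot trigger a report.

```java
import java.util.Set;

public class PermissionSketch
{
    // Permission names in the RESOURCE:ACTION style quoted in this thread
    public static final String SCHEMA_READ = "SCHEMA:READ";
    public static final String SCHEMA_REPORT = "SCHEMA:REPORT";

    /** Returns true when the caller's grants include the permission the route requires. */
    public static boolean isAllowed(Set<String> grants, String required)
    {
        return grants.contains(required);
    }
}
```

If the route required SCHEMA:READ instead, every reader would pass this check, which is the outcome the DoS objection above is about.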
 *
 * @param metadata the metadata fetcher
 * @param executor executor pools for blocking executions
 * @param reporter executor pools for blocking executions
copy-paste error
executorPools.service()
             .runBlocking(() -> metadataFetcher.runOnFirstAvailableInstance(instance ->
                                schemaReporter.process(instance.delegate().metadata())))
             .onSuccess(context::json)
The previous block does not return any value. What does the response json look like?
 * Iterate through the local instances and run the {@link Consumer} on the first available one,
 * so no {@link CassandraUnavailableException} or {@link OperationUnavailableException} is thrown for the operations
 *
 * @param consumer a {@link Consumer} that processes {@link InstanceMetadata} and returns no result
 * @throws CassandraUnavailableException if all local instances were exhausted
 */
public void runOnFirstAvailableInstance(Consumer<InstanceMetadata> consumer) throws CassandraUnavailableException
{
    callOnFirstAvailableInstance(metadata ->
    {
        consumer.accept(metadata);
        return null;
    });
}
Can you remove this method? It is unnecessary. It does not retrieve anything from the instance metadata fetcher, which is supposed to "fetch something". I have suggested a different implementation in the new handler that does not use this method.
}

@Test
@SuppressWarnings("deprecation")
remove
}

@Test
@SuppressWarnings("deprecation")
remove
CountDownLatch latch = new CountDownLatch(1);
server.close()
      .onSuccess(future -> latch.countDown());
latch.await(TIMEOUT.toMillis(), TimeUnit.MILLISECONDS);
close the client too.
String expected = IOUtils.readFully("/datahub/empty_cluster.json");
emitter = new JsonEmitter();

client.get(server.actualPort(), LOCALHOST, ENDPOINT)
Add an assertion that emitter.content is empty before making the HTTP request.
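The point of the precondition is that any content observed after the request can then only have come from the request itself. A minimal sketch of that test shape follows, assuming a stand-in emitter that simply accumulates text; `RecordingEmitter`, `triggerReport`, and `verifyTriggerEmits` are hypothetical names, and the direct method call replaces the real HTTP request.

```java
public class EmitterPrecondition
{
    /** Hypothetical stand-in for the test's emitter; it only accumulates emitted text. */
    public static final class RecordingEmitter
    {
        private final StringBuilder content = new StringBuilder();

        public void emit(String json)
        {
            content.append(json);
        }

        public String content()
        {
            return content.toString();
        }
    }

    /** Stand-in for the HTTP call: triggering a report emits one JSON document. */
    public static void triggerReport(RecordingEmitter emitter)
    {
        emitter.emit("{\"cluster\":\"empty\"}");
    }

    /** Checks the emitter is empty first, so later content is attributable to the trigger. */
    public static void verifyTriggerEmits(RecordingEmitter emitter)
    {
        if (!emitter.content().isEmpty())
        {
            throw new AssertionError("emitter should be empty before the request");
        }
        triggerReport(emitter);
        if (emitter.content().isEmpty())
        {
            throw new AssertionError("the request should have emitted content");
        }
    }
}
```

Without the first check, a test could pass on content left behind by the periodic reporting task or by an earlier test.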
@@ -21,7 +21,7 @@
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

import com.google.common.collect.ImmutableList;
Revert the changes in this file once you remove runOnFirstAvailableInstance.
Let's keep the patch succinct, i.e. make no unrelated changes and add code only when it is definitely required.
// Schema Reporting
protectedRouteBuilderFactory.get()
                            .router(router)
                            .method(HttpMethod.GET)
The verb should not be GET. This should either be PUT or POST:
- PUT is more suitable for idempotent operations
- POST is more suitable for operations that are not idempotent
Also, this endpoint might be suitable for the operations framework. Something to consider
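The reasoning behind this comment is that GET is defined as a safe method, so it should have no side effects, whereas each trigger produces a new report. A minimal sketch of that distinction, assuming an in-memory stand-in for the report sink (`TriggerSemantics`, `ReportLog`, and `handle` are hypothetical names and do not reflect Sidecar's routing code):

```java
import java.util.ArrayList;
import java.util.List;

public class TriggerSemantics
{
    /** Hypothetical stand-in for the report sink; records one entry per emitted report. */
    public static final class ReportLog
    {
        private final List<String> reports = new ArrayList<>();

        public void emit(String report)
        {
            reports.add(report);
        }

        public int size()
        {
            return reports.size();
        }
    }

    /** Returns the HTTP status a trigger endpoint could answer with for the given method. */
    public static int handle(String method, ReportLog log)
    {
        if (!"POST".equals(method))
        {
            // 405 Method Not Allowed: a GET must not cause a report to be emitted
            return 405;
        }
        log.emit("schema-report");
        return 200;
    }
}
```

In this sketch two POST calls leave two reports behind, so repeating the request changes the outcome; under that assumption the operation is not idempotent and POST fits better than PUT.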
I left a few comments. I think we also need to add client-side support for calling this new endpoint.
https://issues.apache.org/jira/browse/CASSSIDECAR-203