CASSSIDECAR-203: Created Endpoint that Triggers an Immediate Schema Report #198

5 · 2025-02-19T02:11:27Z

https://issues.apache.org/jira/browse/CASSSIDECAR-203

yifan-c · 2025-02-20T02:26:34Z

CHANGES.txt

@@ -8,7 +9,7 @@
 * Sidecar schema initialization can be executed on multiple thread (CASSSIDECAR-200)
 * Make sidecar operations resilient to down Cassandra nodes (CASSSIDECAR-201)
 * Fix Cassandra instance not found error (CASSSIDECAR-192)
- * Implemented Schema Reporter for Integration with DataHub (CASSSIDECAR-191)
+ * Implement Schema Reporter for Integration with DataHub (CASSSIDECAR-191)


Do not change the existing entries in the CHANGES.txt

Sure.

Is there a specific reason you'd like me to keep this typo I myself accidentally introduced in there forever?

It is mainly to avoid amending the change log. It is more important to have the git history of each line linked to the correct commit.
Btw, it is not really a typo. "Implemented something" describes the change just as clearly.
If you look at the log, there are places where "Adds", "Add", and "Adding" are all used. There is no strict rule on the verb's form.

yifan-c · 2025-02-20T02:30:00Z

server/src/main/java/org/apache/cassandra/sidecar/acl/authorization/BasicPermissions.java

@@ -63,6 +63,9 @@ public class BasicPermissions
    public static final Permission READ_OPERATIONAL_JOB = new DomainAwarePermission("OPERATIONAL_JOB:READ", OPERATION_SCOPE);
    public static final Permission DECOMMISSION_NODE = new DomainAwarePermission("NODE:DECOMMISSION", OPERATION_SCOPE);

+    // Permissions related to Schema Reporting
+    public static final Permission REPORT_SCHEMA = new DomainAwarePermission("SCHEMA:REPORT", CLUSTER_SCOPE);


I do not think an extra permission is required.
The permission needed in order to publish to DataHub is SCHEMA:READ, which already exists. cc: @sarankk

The permission is not for someone to read the schema, but for someone to trigger the schema report on demand. So I think the permission is different. Would love to hear input from @sarankk

The sentiment here is to show restraint in adding new verbs. Ideally, there should be a fixed set of verbs to avoid operational pain.
The reason that READ should work here is that the reporter is reading the Cassandra schema. When it publishes (i.e. sends requests to DataHub), the authorization should be enforced by the server (DataHub), not the client (Sidecar).

I feel having 2 different permissions is better in general; with 1, granting read permission automatically grants report permission as well. But in this particular case, we already have a periodic task that reports the schema irrespective of this endpoint. Can we use the enable flag for schema reporting to control whether reporting is allowed, without adding a separate permission for it?

I was planning to get to this tomorrow, but see there's confusion about this one.

Changing permissions to the existing SCHEMA:READ will be logically equivalent to the following statement:

"Everyone who is allowed to see cluster schema is also allowed to perform DoS attacks on Sidecar."

Is that actually true?

I think we need a separate permission to allow a user to trigger a schema report. We should not conflate the SCHEMA:READ permission with the ability to report schemas.

Let's also think about the future. We do not want to go down the route of introducing new verbs for new actions, and this is triggering my concern about it.

I am fine with letting PUBLISH pass. But we should really think about it and be mindful about introducing new verbs.

yifan-c · 2025-02-20T02:31:07Z

server/src/main/java/org/apache/cassandra/sidecar/routes/ReportSchemaHandler.java

+     *
+     * @param metadata the metadata fetcher
+     * @param executor executor pools for blocking executions
+     * @param reporter executor pools for blocking executions


copy-paste error

yifan-c · 2025-02-20T02:33:41Z

server/src/main/java/org/apache/cassandra/sidecar/routes/ReportSchemaHandler.java

+        executorPools.service()
+                     .runBlocking(() -> metadataFetcher.runOnFirstAvailableInstance(instance ->
+                            schemaReporter.process(instance.delegate().metadata())))
+                     .onSuccess(context::json)


The previous block does not return any value. What does the response json look like?

yifan-c · 2025-02-20T02:36:59Z

server/src/main/java/org/apache/cassandra/sidecar/utils/InstanceMetadataFetcher.java

+     * Iterate through the local instances and run the {@link Consumer} on the first available one,
+     * so no {@link CassandraUnavailableException} or {@link OperationUnavailableException} is thrown for the operations
+     *
+     * @param consumer a {@link Consumer} that processes {@link InstanceMetadata} and returns no result
+     * @throws CassandraUnavailableException if all local instances were exhausted
+     */
+    public void runOnFirstAvailableInstance(Consumer<InstanceMetadata> consumer) throws CassandraUnavailableException
+    {
+        callOnFirstAvailableInstance(metadata ->
+        {
+            consumer.accept(metadata);
+            return null;
+        });
+    }


Can you remove this method? It is unnecessary. It does not retrieve anything from the instance metadata fetcher, which is supposed to "fetch something". I have suggested a different implementation in the new handler without using this method.
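For illustration only, a minimal sketch of the idea: the caller can pass a lambda that performs the side effect and returns a value straight to a Function-based helper, so no Consumer wrapper is needed on the fetcher. Everything here is a hypothetical stand-in (`Instance`, the `available` flag, `reportFromFirstAvailable`); the real `callOnFirstAvailableInstance` lives in `InstanceMetadataFetcher` and its exact signature and availability check are not shown in this thread.

```java
import java.util.List;
import java.util.function.Function;

public class FirstAvailableSketch
{
    // Hypothetical stand-in for InstanceMetadata; carries only what the sketch needs.
    public static final class Instance
    {
        final String name;
        final boolean available;

        public Instance(String name, boolean available)
        {
            this.name = name;
            this.available = available;
        }
    }

    // Simplified stand-in for a Function-based helper: apply the function to the
    // first available instance, or fail when every instance is unavailable.
    public static <T> T callOnFirstAvailableInstance(List<Instance> instances, Function<Instance, T> function)
    {
        for (Instance instance : instances)
        {
            if (instance.available)
            {
                return function.apply(instance);
            }
        }
        throw new IllegalStateException("All local instances were exhausted");
    }

    // Caller side: do the side effect inside the lambda and return something useful,
    // instead of adding a Consumer-based variant that returns null.
    public static String reportFromFirstAvailable(List<Instance> instances, StringBuilder sink)
    {
        return callOnFirstAvailableInstance(instances, instance -> {
            sink.append("reported:").append(instance.name);
            return instance.name;
        });
    }
}
```

Returning a value from the lambda also gives the handler something concrete to put in the response.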

yifan-c · 2025-02-20T02:42:14Z

server/src/test/java/org/apache/cassandra/sidecar/routes/ReportSchemaHandlerTest.java

+    }
+
+    @Test
+    @SuppressWarnings("deprecation")


yifan-c · 2025-02-20T02:42:53Z

server/src/test/java/org/apache/cassandra/sidecar/routes/ReportSchemaHandlerTest.java

+    }
+
+    @Test
+    @SuppressWarnings("deprecation")


yifan-c · 2025-02-20T02:44:53Z

server/src/test/java/org/apache/cassandra/sidecar/routes/ReportSchemaHandlerTest.java

+        CountDownLatch latch = new CountDownLatch(1);
+        server.close()
+              .onSuccess(future -> latch.countDown());
+        latch.await(TIMEOUT.toMillis(), TimeUnit.MILLISECONDS);


close the client too.
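A rough sketch of what closing both could look like, written against plain `java.util.concurrent` types rather than the Vert.x API used in the test. `AsyncCloseable` and `closeAll` are hypothetical names introduced for the example; the point is only that the latch counts one per resource and is released on completion of each close.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class TearDownSketch
{
    // Hypothetical stand-in for anything with an asynchronous close(), e.g. a server or a client.
    public interface AsyncCloseable
    {
        CompletableFuture<Void> close();
    }

    // Closes every resource and waits until all closes have completed or the timeout elapses.
    // Returns true only if all closes completed in time.
    public static boolean closeAll(long timeoutMillis, AsyncCloseable... resources)
    {
        CountDownLatch latch = new CountDownLatch(resources.length);
        for (AsyncCloseable resource : resources)
        {
            // Count down on success and on failure, so a failed close
            // does not force the teardown to wait for the full timeout.
            resource.close().whenComplete((ignored, error) -> latch.countDown());
        }
        try
        {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In the test this would correspond to a latch of 2, counted down once for the server and once for the client.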

yifan-c · 2025-02-20T02:48:46Z

server/src/test/java/org/apache/cassandra/sidecar/routes/ReportSchemaHandlerTest.java

+        String expected = IOUtils.readFully("/datahub/empty_cluster.json");
+        emitter = new JsonEmitter();
+
+        client.get(server.actualPort(), LOCALHOST, ENDPOINT)


add an assertion that emitter.content is empty, before making the http request.
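The reason for the extra assertion is that it shows the emitted content was produced by this request and not left over from earlier state. A small self-contained sketch of that pattern follows; `RecordingEmitter` and `triggerAndCapture` are hypothetical stand-ins, not the `JsonEmitter` used in the test.

```java
public class EmitterPreconditionSketch
{
    // Hypothetical stand-in for a test emitter that records whatever is emitted.
    public static final class RecordingEmitter
    {
        private final StringBuilder content = new StringBuilder();

        public void emit(String json)
        {
            content.append(json);
        }

        public String content()
        {
            return content.toString();
        }
    }

    // Verifies the emitter is empty, runs the trigger (the HTTP request in the real test),
    // and returns what was emitted as a result.
    public static String triggerAndCapture(RecordingEmitter emitter, Runnable trigger)
    {
        if (!emitter.content().isEmpty())
        {
            throw new IllegalStateException("emitter must be empty before the request");
        }
        trigger.run();
        return emitter.content();
    }
}
```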

yifan-c · 2025-02-20T02:50:11Z

server/src/test/java/org/apache/cassandra/sidecar/utils/InstanceMetadataFetcherTest.java

@@ -21,7 +21,7 @@
 import java.nio.file.Path;
 import java.util.Arrays;
 import java.util.List;
-
+import com.google.common.collect.ImmutableList;


Revert the changes in this file once you remove runOnFirstAvailableInstance

yifan-c · 2025-02-20T02:52:47Z

Let's keep the patch succinct, i.e. no unrelated changes, and add code only when it is definitely required.

frankgh · 2025-02-20T16:57:17Z

server/src/main/java/org/apache/cassandra/sidecar/server/MainModule.java

+        // Schema Reporting
+        protectedRouteBuilderFactory.get()
+                                    .router(router)
+                                    .method(HttpMethod.GET)


The verb should not be GET. This should either be PUT or POST:

PUT is more suitable for idempotent operations

POST is more suitable for operations that are not idempotent

Also, this endpoint might be suitable for the operations framework. Something to consider

frankgh

I left a few comments. I think we also need to add support on the client side to be able to run this new endpoint

5 force-pushed the trunk branch 3 times, most recently from be55528 to bfa7acc Compare February 19, 2025 03:25

yifan-c changed the title ~~Created Endpoint that Triggers an Immediate Schema Report~~ CASSSIDECAR-203: Created Endpoint that Triggers an Immediate Schema Report Feb 20, 2025

yifan-c requested changes Feb 20, 2025

frankgh reviewed Feb 20, 2025

Created Endpoint that Triggers an Immediate Schema Report

2d39bb8

5 force-pushed the trunk branch from bfa7acc to 2d39bb8 Compare February 24, 2025 21:39
