This repository was archived by the owner on Feb 8, 2019. It is now read-only.

Cassandra integration #66

Open
wants to merge 21 commits into
base: master
from


zapletal-martin

Cassandra database integration

  • CassandraSource
  • CassandraSink
  • CassandraStore

Reuses some Spark-Cassandra connector files and follows the same approach. The intent is to allow the connector to be reused once a version for other processing systems is available. The Source looks up the token ranges of the target table, splits them into independent sets of partitions, and assigns those sets to the available source tasks, so reads run in parallel across tasks. All data fetches except the first are asynchronous. The Sink can be trivially parallelised by the user by assigning different writes to different tasks.
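As a rough illustration of the range-to-task assignment described above, the sketch below deals a list of token ranges out to a fixed number of tasks and builds the CQL a task could run for one range. It is not the code in this PR: `TokenRangeAssignment`, `TokenRange`, `assign`, and `rangeQuery` are hypothetical names, round-robin is an assumed assignment policy, and ranges that wrap around the end of the ring are ignored for brevity.

```java
import java.util.ArrayList;
import java.util.List;

final class TokenRangeAssignment {

    /** One slice (start, end] of the token ring. Hypothetical type, not the PR's class. */
    static final class TokenRange {
        final long start;
        final long end;

        TokenRange(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Deals the ranges out round-robin so each task receives a disjoint subset. */
    static List<List<TokenRange>> assign(List<TokenRange> ranges, int taskCount) {
        if (taskCount <= 0) {
            throw new IllegalArgumentException("taskCount must be positive");
        }
        List<List<TokenRange>> perTask = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            perTask.add(new ArrayList<>());
        }
        for (int i = 0; i < ranges.size(); i++) {
            perTask.get(i % taskCount).add(ranges.get(i));
        }
        return perTask;
    }

    /** CQL a task could run for one of its ranges; table and key names are placeholders. */
    static String rangeQuery(String table, String partitionKey, TokenRange range) {
        return "SELECT * FROM " + table
            + " WHERE token(" + partitionKey + ") > " + range.start
            + " AND token(" + partitionKey + ") <= " + range.end;
    }

    public static void main(String[] args) {
        // Five made-up ranges shared between two tasks.
        List<TokenRange> ranges = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            ranges.add(new TokenRange(i * 10L, i * 10L + 10L));
        }
        List<List<TokenRange>> perTask = assign(ranges, 2);
        for (int task = 0; task < perTask.size(); task++) {
            for (TokenRange range : perTask.get(task)) {
                System.out.println("task " + task + ": " + rangeQuery("ks.events", "id", range));
            }
        }
    }
}
```

Because the subsets are disjoint, tasks never read the same partition twice, and each task can issue its range queries independently of the others.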

The Source scans a snapshot of the table as it currently stands and does not yet honour later updates, so it is not a continuous stream. The Source is also not time replayable. There are options for handling both limitations, but they need to be properly thought through. Test coverage is poor at the moment, but this first attempt allows iteration, continuous improvement of the code, and the addition of features.

// This is for DFSJarStore
"${PROG_HOME}/lib/yarn/*"
// "${PROG_HOME}/lib/yarn/*"
Author


Had to comment this out to avoid runtime issues. When I try to submit an example job to Gearpump I get "java.lang.NoSuchMethodError: com.google.common.util.concurrent.Futures.withFallback(". I believe this happens because Gearpump pulls com.google.guava:guava version 11.0.2 from the Hadoop dependencies, but the Cassandra Java driver I am using needs version 16.0.1. I need to figure out a solution to this.
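One way to confirm which jar actually supplies a class at runtime, and whether it carries the method the driver expects, is a small reflection probe. The sketch below is only a diagnostic aid under that assumption: `ClasspathProbe`, `locationOf`, and `hasPublicMethod` are hypothetical names, and the probe reports what the current classloader sees, which may differ from what a submitted job sees on the cluster.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

final class ClasspathProbe {

    /** Returns the jar or directory a class was loaded from, or a marker if the JVM does not expose it. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "<bootstrap or unknown>";
        }
        return source.getLocation().toString();
    }

    /** True if the class exposes a public method with the given name (any signature). */
    static boolean hasPublicMethod(Class<?> type, String methodName) {
        for (Method method : type.getMethods()) {
            if (method.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String className = "com.google.common.util.concurrent.Futures";
        try {
            Class<?> futures = Class.forName(className);
            System.out.println("loaded from: " + locationOf(futures));
            System.out.println("has withFallback: " + hasPublicMethod(futures, "withFallback"));
        } catch (ClassNotFoundException e) {
            // Guava is not visible to this classloader at all.
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

Running the probe with the same classpath as the failing job should show whether the older Guava jar from the Hadoop dependencies is the one being picked up.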

Author


I would appreciate help here, as I may not fully understand what my changes may cause elsewhere.

Contributor


I'll fix this once 0.8.1 is out. Sorry, we may need to hold this for a while.

@manuzhang
Contributor

@zapletal-martin Thanks for your contribution. I'll pull your branch and try playing with it.


2 participants