big-data
Here are 268 public repositories matching this topic...
PredictionIO, a machine learning server for developers and ML engineers.
Updated Jan 9, 2021 - Scala
CMAK is a tool for managing Apache Kafka clusters
Updated Aug 2, 2023 - Scala
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Updated Feb 20, 2025 - Scala
Simple and Distributed Machine Learning
Updated Feb 6, 2025 - Scala
High performance data store solution
Updated Feb 22, 2025 - Scala
GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs
Updated Feb 21, 2025 - Scala
Sparkling Water provides H2O functionality inside a Spark cluster
Updated Nov 19, 2024 - Scala
An open protocol for secure data sharing
Updated Feb 20, 2025 - Scala
Low-code tool for automating actions on real time data | Stream processing for the users.
Updated Feb 21, 2025 - Scala
A simplified, lightweight ETL Framework based on Apache Spark
Updated Jan 24, 2024 - Scala
Geo Spatial Data Analytics on Spark
Updated Aug 26, 2021 - Scala
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
Updated Jan 24, 2025 - Scala
A simple Spark-powered ETL framework that just works 🍺
Updated Feb 3, 2025 - Scala