This project demonstrates several Apache Beam techniques for streaming analytics.
- Create a GCP project
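  If you prefer the CLI to the console, a minimal sketch (the project ID below is a placeholder you choose; it must be globally unique, and billing must be enabled on the project for the later steps to work):

  ```sh
  # Create the project; replace the ID with one of your own.
  gcloud projects create my-beam-streaming-demo

  # Make it the default project for subsequent gcloud commands.
  gcloud config set project my-beam-streaming-demo
  ```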
- Create a file in the `terraform` directory named `terraform.tfvars` with the following content:

  ```
  project_id = "<GCP Project Id>"
  ```

  There are additional Terraform variables that can be overridden; see `variables.tf` for details.
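  To see which variables are available to override, one quick way is to inspect the declarations from the shell:

  ```sh
  # Print each declared input variable along with the lines that follow it
  # (description, default, etc.).
  grep -A 3 'variable' terraform/variables.tf
  ```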
- Run the following commands:
  ```sh
  export PROJECT_ID=<project-id>
  export GCP_REGION=us-central1
  export BIGQUERY_REGION=us-central1
  ```
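  As a quick sanity check (assuming the gcloud CLI is installed and authenticated), confirm that the scripts will act on the intended project:

  ```sh
  # Point gcloud at the project the scripts will use.
  gcloud config set project "$PROJECT_ID"

  # Print the active project to confirm.
  gcloud config get-value project
  ```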
- Create BigQuery tables, Pub/Sub topics and subscriptions, and GCS buckets by running this script:
  ```sh
  source ./setup-env.sh
  ```
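  To verify what was provisioned (the exact dataset, topic, and bucket names come from the Terraform configuration, so check its output for specifics):

  ```sh
  # List the BigQuery datasets, Pub/Sub topics, and GCS buckets in the project.
  bq ls --project_id="$PROJECT_ID"
  gcloud pubsub topics list --project="$PROJECT_ID"
  gsutil ls -p "$PROJECT_ID"
  ```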
- Start the event generation process:

  ```sh
  ./start-event-generation.sh
  ```
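  To spot-check that events are flowing, you can pull a few messages (the subscription name below is a placeholder; use one created during setup):

  ```sh
  # Pull and acknowledge a few sample messages.
  # <input-subscription> is a placeholder -- substitute a real subscription name.
  gcloud pubsub subscriptions pull <input-subscription> --limit=3 --auto-ack
  ```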
- Start the event processing pipeline:
  ```sh
  (cd pipeline; ./run-streaming-pipeline.sh)
  ```
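  Assuming the pipeline runs on Dataflow, you can confirm the job came up:

  ```sh
  # List active Dataflow jobs in the region set earlier.
  gcloud dataflow jobs list --region="$GCP_REGION" --status=active
  ```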
- Optionally, start the pipeline that ingests the findings published as Pub/Sub messages into BigQuery:

  ```sh
  ./start-findings-to-bigquery-pipeline.sh
  ```
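  Once findings start landing, a query along these lines shows recent rows (the dataset and table names are placeholders; use the ones created by Terraform):

  ```sh
  # `<dataset>.<findings_table>` is a placeholder -- substitute the real names.
  bq query --use_legacy_sql=false \
    'SELECT * FROM `<dataset>.<findings_table>` LIMIT 10'
  ```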
- Shut down the pipelines via the GCP console (TODO: add scripts)
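  Until those scripts exist, draining from the CLI is one option; a drain lets in-flight elements finish processing before the job stops:

  ```sh
  # Find the IDs of the active jobs...
  gcloud dataflow jobs list --region="$GCP_REGION" --status=active

  # ...then drain each one. <job-id> is a placeholder from the listing above.
  gcloud dataflow jobs drain <job-id> --region="$GCP_REGION"
  ```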
- Run this command:
  ```sh
  cd terraform; terraform destroy
  ```
Alternatively, delete the project you created.
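  Deleting the project also works from the CLI; note that this removes everything in the project, so make sure it is the demo project:

  ```sh
  # Deletes the entire project and all resources in it.
  gcloud projects delete "$PROJECT_ID"
  ```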
The techniques and code contained here are not supported by Google and are provided as-is (under the Apache license). This repo provides some options you can investigate, evaluate, and employ if you choose to.