Week 6 Project:

Tweet stream data pipeline for a Slackbot

This project was completed in week 6 of the Data Science Bootcamp at Spiced Academy in Berlin.

This is a simple implementation of a dockerized data pipeline that posts randomly selected tweets about politics, together with their sentiment scores, to a Slack channel.

The Docker Compose pipeline consists of five containers. With the following folder structure (FolderTree image), the data pipeline:

  • collects tweets with the Twitter API and tweepy

  • stores the tweets in MongoDB

  • applies an ETL job that

    • extracts the tweets from MongoDB
    • gets the sentiment of each tweet text with VADERSentiment
    • loads the tweets and their sentiment scores into a Postgres database

  • creates a Slackbot that posts a randomly selected, anonymized tweet from the Postgres database into a Slack channel.
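
The transform and posting steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names (`anonymize`, `transform`, `pick_message`), the `text` field name, and the rule of replacing @mentions with a placeholder are all assumptions. The sentiment scorer is passed in as a function so that the sketch runs without any database or API access.

```python
import random
import re
from typing import Callable, Dict, Iterable, List

# Hypothetical helpers; the actual ETL job and slackbot.py may differ.
MENTION = re.compile(r"@\w+")


def anonymize(text: str) -> str:
    """Replace @mentions with a placeholder (assumed anonymization rule)."""
    return MENTION.sub("@user", text)


def transform(tweets: Iterable[Dict], score: Callable[[str], float]) -> List[Dict]:
    """Attach a sentiment score to each tweet extracted from MongoDB."""
    rows = []
    for tweet in tweets:
        text = tweet["text"]  # assumed field name in the stored documents
        rows.append({"text": text, "sentiment": score(text)})
    return rows


def pick_message(rows: List[Dict], rng: random.Random) -> str:
    """Format one randomly chosen row as the text a Slackbot could post."""
    row = rng.choice(rows)
    return f"{anonymize(row['text'])}\nSentiment score: {row['sentiment']:+.2f}"
```

With the vaderSentiment package installed, `score` could be built from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`, which returns a value between -1 and 1. In the real pipeline the rows would be written to and read back from Postgres between `transform` and `pick_message`.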

The pipeline folder, including docker-compose.yml, is here.

Acknowledgements

The tweet_collector.py is taken from Paul Wlodkowski's twitter-mongoDB repository.

Various code snippets in tweet_collector.py and slackbot.py are adapted from Krystana Föh's code.