Spotify API ETL of Track Information to PostgreSQL

This project connects to the Spotify API to collect track, album, and artist information about The Beatles. An ETL pipeline then loads that information into PostgreSQL, where the data is normalized using a star schema.

Although the project was originally built around The Beatles, it can be configured for any artist.
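As a rough illustration of the star schema idea, the sketch below splits flat track records into two dimension tables and one fact table. The record fields, the placeholder values, and the helper names (`build_dimension`, `to_star_schema`) are assumptions made for this example; the repository's actual tables are defined in queries.py and pictured in schema_design.jpg, and may be organized differently.

```python
# Hypothetical flat records with placeholder values (not real Spotify data).
flat_tracks = [
    {"track": "Track A", "album": "Album X", "artist": "Example Artist", "duration_ms": 1000},
    {"track": "Track B", "album": "Album X", "artist": "Example Artist", "duration_ms": 2000},
    {"track": "Track C", "album": "Album Y", "artist": "Example Artist", "duration_ms": 3000},
]


def build_dimension(rows, column):
    """Map each distinct value in `column` to a surrogate integer key."""
    keys = {}
    for row in rows:
        # Assign the next key only the first time a value is seen.
        keys.setdefault(row[column], len(keys) + 1)
    return keys


def to_star_schema(rows):
    """Split flat rows into artist and album dimensions plus a track fact table."""
    artist_dim = build_dimension(rows, "artist")
    album_dim = build_dimension(rows, "album")
    fact_rows = [
        {
            "track": row["track"],
            "artist_id": artist_dim[row["artist"]],
            "album_id": album_dim[row["album"]],
            "duration_ms": row["duration_ms"],
        }
        for row in rows
    ]
    return artist_dim, album_dim, fact_rows


artist_dim, album_dim, fact_rows = to_star_schema(flat_tracks)
```

Each fact row stores only keys into the dimension tables, so an album or artist name is held once rather than repeated on every track.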

Methods Used

  • ETL
  • Data Modeling
  • Normalization
  • API Connection

Technologies Used

  • Python
  • PostgreSQL
  • pgAdmin

Packages Used

  • Psycopg2
  • Spotipy
  • Pandas

How To Run

Adjust Configurations

The settings in the config.ini file need to be adjusted to match your local PostgreSQL setup (username, password, host, and database). The config.ini file is also where the artist name can be changed, if desired.
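For orientation, here is a minimal sketch of how settings like these can be read with Python's standard `configparser` module. The section names (`postgresql`, `spotify`), the key names, and the sample values are assumptions for illustration; check the repository's config.ini for the names it actually uses.

```python
import configparser

# Hypothetical config.ini contents; the real file's sections and keys may differ.
SAMPLE_CONFIG = """
[postgresql]
username = postgres
password = change_me
host = localhost
database = spotify

[spotify]
artist = The Beatles
"""


def load_settings(text):
    """Parse config text and return (database settings dict, artist name)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    db_settings = dict(parser["postgresql"])
    artist = parser["spotify"]["artist"]
    return db_settings, artist


db_settings, artist = load_settings(SAMPLE_CONFIG)
```

To read the file from disk instead of a string, `parser.read("config.ini")` can be used in place of `read_string`.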

Obtain Spotify API Tokens

To use the Spotipy package, API tokens must be obtained directly from Spotify. See Spotify's developer documentation for more information on this process.

Set Environment Variables

For this project to run, the Spotify API access key must be set as an environment variable called "spotify_id", and the secret key must be set as an environment variable called "spotify_secret". Set these environment variables on the operating system where the project will run.
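The sketch below shows one way the two variables could be read and passed to Spotipy's client-credentials flow. The function names `read_spotify_credentials` and `make_client` are hypothetical, and the repository's setup.py may establish the connection differently.

```python
import os


def read_spotify_credentials(env=os.environ):
    """Return (client_id, client_secret) from the environment, or raise if missing."""
    missing = [name for name in ("spotify_id", "spotify_secret") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["spotify_id"], env["spotify_secret"]


def make_client(env=os.environ):
    """Build a Spotipy client using the client-credentials flow."""
    # Imported here so the credential check above works without Spotipy installed.
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    client_id, client_secret = read_spotify_credentials(env)
    auth = SpotifyClientCredentials(client_id=client_id, client_secret=client_secret)
    return spotipy.Spotify(auth_manager=auth)
```

Checking for the variables before building the client gives a clear error message up front if either one is unset, rather than an authentication failure later.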

Install Requirements and Run

On the command line of your operating system, navigate to the repository directory (ideally using a Python virtual environment).

Run the following code on the command line to install requirements:

pip install -r requirements.txt 

Run the following code on the command line to run this project:

python run.py

Other Repository Contents

  • Modules
    • main.py - Organizes execution of all modules
    • queries.py - Queries to drop and create tables and insert data into them
    • setup.py - Creates the connection to Spotify
    • spotify.py - Pulls album, artist, and track information to create tables
    • sql.py - Connects to PostgreSQL and executes queries to create the star schema structure
  • config.ini - Configurations for the PostgreSQL connection and Spotify artist
  • requirements.txt - Python package requirements
  • schema_design.jpg - Image of the star schema

Sources