In this two-part workshop series, you will learn to collect posts from Reddit - using the very sparkly example of the Eurovision Song Contest - and apply a range of text analytic techniques to them. Some experience with R coding is preferred; no experience with social media data collection or text analysis is required.
Development and testing were done with Python version 3.12 and R version 4.4.
TO WRITE
These instructions assume a little familiarity with installing and running Python packages from the command line, and that Python and R are already installed. All shell commands provided should be run from the root directory of this repository.
We recommend creating and activating a new Python virtual environment.
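One way to do this is with Python's built-in venv module. The following is a minimal sketch for POSIX shells (Linux, macOS), not a required procedure: the directory name .venv is an assumption chosen for illustration, and on Windows the activation script lives under a different path.

```shell
# Create a virtual environment in a directory named .venv
# (the name is an assumption; any directory name works).
python3 -m venv .venv

# Activate the environment for the current shell session (POSIX shells).
. .venv/bin/activate

# Confirm that the active interpreter is the one inside the environment.
python --version
```

With the environment active, the pip and jupyter commands in the following steps install into and run from .venv rather than the system Python. Run deactivate to leave the environment.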
Install Python requirements:
pip install -r python_requirements.txt
Open an R shell with:
R
Install R notebook kernel:
install.packages('IRkernel')
IRkernel::installspec()
To quit the R shell, use q() or the keyboard shortcut Ctrl+D (the same on macOS).
You can now run an individual notebook with:
jupyter notebook 01_data_collection/explore.ipynb
Alternatively, start a JupyterLab server for the whole project with:
jupyter lab
For additional setup steps for contributing code to this repository, see CONTRIBUTING.md.
Code and documentation are by QUT Digital Observatory and the Language Technology and Data Analysis Laboratory (LADAL), and are licensed under CC BY 4.0.
Support provided by the Language Data Commons of Australia (LDACA) (and therefore ARDC) and the QUT Library.