Climate Insights showcases the capabilities of distributed processing systems (Apache Spark) for real-time analysis of meteorological data, with the aim of understanding and predicting atmospheric phenomena. The project uses PySpark (the Python API for Apache Spark), Spark Streaming, and cloud computing (Google Cloud) to process large volumes of meteorological data in real time. The analysis runs in a multi-cluster environment and delivers its insights through a user-friendly web interface powered by Streamlit. This abstract highlights the project's importance and its potential to improve weather forecasting techniques by relying on distributed processing systems.
- Real-time data acquisition: We continuously scrape data from different sources and update the model with it.
- Data processing: We deploy Apache Spark on Google Cloud for data processing and additionally perform time series analysis.
- Weather forecasting: We use Prophet, Meta's open-source forecasting model, for the forecasting phase.
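
As a rough illustration of the forecasting step, the sketch below shows one way hourly readings could be shaped for Prophet and then forecast. The function names `to_prophet_rows` and `forecast_temperature`, the hourly cadence, and the 24-hour horizon are assumptions made for illustration and are not taken from the project's code. Only the `ds`/`y` column convention and the `Prophet().fit(...)` / `predict(...)` calls come from Prophet itself.

```python
from datetime import datetime, timedelta


def to_prophet_rows(readings):
    """Convert (timestamp, value) pairs into the `ds`/`y` records Prophet expects.

    Readings with a missing value are dropped and the rows are sorted by time.
    """
    rows = [
        {"ds": ts, "y": float(value)}
        for ts, value in readings
        if value is not None
    ]
    return sorted(rows, key=lambda row: row["ds"])


def forecast_temperature(rows, hours_ahead=24):
    """Fit Prophet on the prepared rows and predict `hours_ahead` further hours.

    Hypothetical helper: assumes hourly data and that `pandas` and `prophet`
    are installed.
    """
    # Imported lazily so the data preparation above works without these packages.
    import pandas as pd
    from prophet import Prophet

    history = pd.DataFrame(rows)
    model = Prophet()
    model.fit(history)

    # Build the future timestamps by hand, one per hour after the last reading.
    last_seen = rows[-1]["ds"]
    future = pd.DataFrame(
        {"ds": [last_seen + timedelta(hours=i) for i in range(1, hours_ahead + 1)]}
    )

    forecast = model.predict(future)
    # `yhat` is the point forecast; the other two columns bound its uncertainty.
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]
```

In a pipeline like the one described above, the rows would presumably come from the Spark processing step, for example by collecting an aggregated DataFrame, rather than from an in-memory list.
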
Your feedback and contributions are welcome and will help extend the capabilities of this project. The code is available upon request; please send requests by email to bacem.etteib.001@student.uni.lu