- Install the requirements first by entering this in the terminal:
  `pip install -r requirements.txt`
- Run the app with this command:
  `python app.py`
Video Demo: https://www.youtube.com/watch?v=Z0ti3OmsPOE
The PH News PDF Generator simulates the delivery of the latest news articles from ABS CBN straight to the reader in the form of a PDF (a news PDF, if you will). When you open the app, it asks you to select a news report category. Once you enter one, voila! A PDF containing the latest news is instantly generated for your reading consumption.
Behind the scenes, this Python-based app has two (2) main components to deliver the news:
Enabling Library: Beautiful Soup (bs4)
This library enables the app to perform its web scraping functions:
- Extract all the links to news articles related to the selected news category;
- Extract the contents of a news article and store them in a string for loading into the PDF;
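The two scraping steps above can be sketched roughly as follows. This is only an illustration, not the app's actual code: the HTML snippets, the `article-content` class name, the base URL, and the helper names `extract_links` and `extract_content` are all assumptions, and the real ABS CBN markup will differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup standing in for a category page and an article page.
CATEGORY_HTML = """
<div class="articles">
  <article><a href="/news/first-story">First story</a></article>
  <article><a href="/news/second-story">Second story</a></article>
  <article><a href="/news/first-story">First story (repeat)</a></article>
</div>
"""

ARTICLE_HTML = """
<div class="article-content">
  <p>First paragraph.</p>
  <p>   </p>
  <p>Second paragraph.</p>
</div>
"""


def extract_links(html, base_url):
    """Collect unique absolute article links, keeping page order."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        url = urljoin(base_url, anchor["href"])
        if url not in links:
            links.append(url)
    return links


def extract_content(html):
    """Join the non-empty paragraphs of an article into one string."""
    soup = BeautifulSoup(html, "html.parser")
    body = soup.find("div", class_="article-content")  # assumed container
    if body is None:
        return ""
    paragraphs = [p.get_text(strip=True) for p in body.find_all("p")]
    return "\n".join(text for text in paragraphs if text)


if __name__ == "__main__":
    print(extract_links(CATEGORY_HTML, "https://example.com"))
    print(extract_content(ARTICLE_HTML))
```

Working from inline HTML strings keeps the sketch runnable without network access. In the app, the HTML would come from the fetched category and article pages.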
Supporting Libraries Used:
- `request` - Access to the website for web scraping is made possible through the use of a modified `header`;
- `re` (regex) - While content is extracted largely by bs4, the `re.match` function allows cleaning of the content before its finalization;
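A minimal sketch of both supporting pieces, under stated assumptions: it reads `request` as the standard library's `urllib.request` (if the app uses the third-party `requests` package, the equivalent is passing `headers=` to `requests.get`), and the header value, the cleaning rule, and the helper names `build_request` and `clean_line` are hypothetical.

```python
import re
from urllib.request import Request

# Hypothetical header value; any browser-like User-Agent string would do.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; news-pdf-sketch)"}


def build_request(url):
    """Prepare (but do not send) a request carrying the modified header."""
    return Request(url, headers=BROWSER_HEADERS)


def clean_line(line):
    """Return a tidied line, or None when it does not look like article text."""
    line = re.sub(r"\s+", " ", line).strip()  # collapse runs of whitespace
    # re.match only matches at the start of the string, so this keeps
    # lines that begin with a letter, digit, quote or opening parenthesis.
    if re.match(r"[A-Za-z0-9\"'(]", line):
        return line
    return None


if __name__ == "__main__":
    req = build_request("https://example.com/news")
    print(req.get_header("User-agent"))
    print(clean_line("  Hello   world \n"))
```

The prepared request can be passed to `urllib.request.urlopen` to fetch the page. Note that `urllib` normalizes header names, which is why the lookup key is spelled `User-agent`.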
Enabling Library: ReportLab
This library takes the title and finalized content and puts them into the PDF in an orderly fashion. Other functions used to achieve the desired PDF product include:
- Use of `Frame` and `Paragraph` to load and format the news contents for proper wrapping and presentation. This also handles generation of a multi-page PDF for voluminous contents.
- Use of `ImageReader` to load and properly place the logo image in the PDF.
- Use of `drawString` to place the title, date and disclaimer footers in the appropriate locations.
- Use of `TTFont` to load TrueType fonts for styling contents.
- `save` generates the PDF!
Supporting Libraries Used:
- `datetime` - converts the date from the link into a string format that is loaded and printed with the title;
- `titlecase` - formats the title for styling purposes
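As an illustration of the date conversion, here is a minimal sketch that assumes the article link embeds the date as `/MM/DD/YY/`. The URL pattern, the output format, and the `date_from_link` helper are assumptions, not confirmed details of the app.

```python
import re
from datetime import datetime


def date_from_link(url):
    """Return the date embedded in the link as a readable string, or None."""
    match = re.search(r"/(\d{2})/(\d{2})/(\d{2})/", url)  # assumed /MM/DD/YY/
    if match is None:
        return None
    parsed = datetime.strptime("/".join(match.groups()), "%m/%d/%y")
    return parsed.strftime("%B %d, %Y")


if __name__ == "__main__":
    print(date_from_link("https://example.com/news/01/15/23/sample-story"))
```

The resulting string can then be drawn next to the title. The title itself would be passed through the `titlecase` package, which is not shown here.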
Disclaimer: All news articles generated by this app are owned and published by ABS CBN Corporation, a media and entertainment organization in the Philippines (PH). All rights and credits go directly to ABS CBN. No copyright infringement intended.