Serious studies in the past five years have demonstrated strong correlations involving the spread of false information.
A fake news story is one in which the information is entirely made up, with no verified facts, sources, or quotes. There are countless sources of fake news nowadays.
Fake news occurs when someone (or something, such as a bot) impersonates a person or a credible source in order to propagate misleading information. Most of the time, the people spreading false information have a political, economic, or other purpose in mind.
The problem is serious and difficult to fix. It is not always easy to tell whether information is accurate, so we need better tools to help us analyse the patterns of fake news in order to improve our social media and communication and to avert global turmoil.
The objective is to build a model that accurately classifies a piece of news as REAL or FAKE.
In this Python project, several modelling techniques with differing accuracy are used to classify news as fake or real from the collected data. The result of the best technique is shown as a confusion matrix and saved to the submission.csv file.
Algorithms/approaches used in this project:
Logistic Regression
Random Forest Classifier
Decision Tree Classifier
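As a rough illustration of how the three algorithms above could be compared, here is a minimal sketch using scikit-learn. The TF-IDF features, the `compare_models` helper, and the hyperparameters are assumptions made for this example; the notebook may prepare the text and tune the models differently.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier


def compare_models(train_texts, train_labels, test_texts, test_labels):
    """Fit each classifier on TF-IDF features and return its test accuracy.

    Hypothetical helper for illustration; not taken from the notebook.
    """
    candidates = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest Classifier": RandomForestClassifier(random_state=0),
        "Decision Tree Classifier": DecisionTreeClassifier(random_state=0),
    }
    scores = {}
    for name, clf in candidates.items():
        # A fresh vectorizer per model keeps the pipelines independent.
        model = make_pipeline(TfidfVectorizer(stop_words="english"), clf)
        model.fit(train_texts, train_labels)
        scores[name] = accuracy_score(test_labels, model.predict(test_texts))
    return scores
```

Calling `compare_models` with lists of article texts and their 0/1 labels returns a dictionary mapping each model name to its accuracy, which makes it easy to pick the best technique.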
This ‘Fake News Detection’ project explains the Python code used to load, clean, and analyse the data in a Kaggle notebook or Jupyter Notebook. The detection result (fake or not) is then displayed in the form of a confusion matrix.
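For readers unfamiliar with confusion matrices, the following small sketch shows how one is built from true and predicted labels. The sample labels are made up for illustration, and the encoding (0 = fake, 1 = real) and the `confusion_matrix_2x2` helper are assumptions, not code from the notebook.

```python
from collections import Counter


def confusion_matrix_2x2(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]]: rows are actual labels, columns are predictions."""
    counts = Counter(zip(y_true, y_pred))
    return [
        [counts[(0, 0)], counts[(0, 1)]],
        [counts[(1, 0)], counts[(1, 1)]],
    ]


# Made-up labels; assumed encoding: 0 = fake, 1 = real.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 1, 0, 1, 0]

matrix = confusion_matrix_2x2(y_true, y_pred)
# Accuracy is the share of predictions on the diagonal.
accuracy = (matrix[0][0] + matrix[1][1]) / len(y_true)

print(matrix)    # [[3, 1], [1, 3]]
print(accuracy)  # 0.75
```

The row/column layout matches the one used by scikit-learn's `confusion_matrix`, so the diagonal holds the correctly classified items.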
The data comes from Kaggle in two files:
• Fake.csv
• True.csv
You can download it here:
https://www.kaggle.com/clmentbisaillon/fake-and-real-news-dataset
The dataset consists of two files, one for real news and one for fake news (both in English), with a total of 23481 "fake" articles and 21417 "real" articles.
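One possible way to combine the two files into a single labelled table is sketched below with pandas. The `load_news` helper, the `label` column name, and the 0 = fake / 1 = real encoding are assumptions for this example; the notebook may do this differently.

```python
import pandas as pd


def load_news(fake_source, true_source):
    """Read both files, label them, and return one shuffled DataFrame.

    Hypothetical helper; accepts file paths or file-like objects.
    """
    fake = pd.read_csv(fake_source)
    real = pd.read_csv(true_source)
    # Assumed encoding: 0 = fake, 1 = real.
    fake["label"] = 0
    real["label"] = 1
    combined = pd.concat([fake, real], ignore_index=True)
    # Shuffle so the two classes are interleaved before any train/test split.
    return combined.sample(frac=1, random_state=0).reset_index(drop=True)
```

With the downloaded files in the working directory, `news = load_news("Fake.csv", "True.csv")` would give one DataFrame whose `label` column can serve as the classification target.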
The output of the best prediction result can be found in the submission.csv file, where the labels are encoded as:
0: False
1: True
https://github.com/davreign-dav/Fake-News-Detection/blob/main/submission.csv
All of the analysis and its code can be found in the notebook: