This repository contains a Jupyter Notebook that demonstrates the implementation of Support Vector Machines (SVM) for spam detection. The notebook includes data preprocessing, feature extraction, training the SVM model, and evaluating its performance.
- Support Vector Machines.ipynb: The Jupyter Notebook containing the implementation of SVM for spam detection.
- Python 3.x
- Jupyter Notebook
- NumPy
- Matplotlib
- SciPy
- scikit-learn
- NLTK
- stemming
- Clone the repository:
git clone https://github.com/cizodevahm/Support-Vector-Machines-for-Spam-Detection
- Navigate to the repository directory:
cd Support-Vector-Machines-for-Spam-Detection
- Open the Jupyter Notebook:
jupyter notebook "Support Vector Machines.ipynb"
- Data Preprocessing: The notebook includes functions that preprocess emails by converting them to lowercase, removing HTML tags, and replacing numbers, URLs, email addresses, and dollar signs with placeholders.
- Feature Extraction: The notebook extracts features from emails by tokenizing, stemming, and mapping words to a vocabulary list.
- Training the SVM Model: The SVM model is trained using a linear kernel on a dataset of spam and non-spam emails.
- Evaluating Performance: The notebook evaluates the model's performance by calculating training and test set accuracy and identifying the top predictors for spam.
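
The preprocessing and feature-mapping steps listed above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the notebook's actual code: the function names, the placeholder strings (`httpaddr`, `emailaddr`, `number`, `dollar`), the regular expressions, and the tiny `VOCAB` list are all assumptions made for the example, and stemming is left out to keep it dependency-free.

```python
import re

# Hypothetical toy vocabulary for illustration; the notebook's real
# vocabulary list is much larger and is not reproduced here.
VOCAB = ["click", "dollar", "emailaddr", "httpaddr", "number", "offer", "visit"]


def preprocess_email(text):
    """Normalize raw email text with placeholder substitutions (assumed names)."""
    text = text.lower()
    text = re.sub(r"<[^<>]+>", " ", text)              # remove HTML tags
    text = re.sub(r"https?://\S*", " httpaddr ", text)  # URLs -> placeholder
    text = re.sub(r"\S+@\S+", " emailaddr ", text)      # email addresses -> placeholder
    text = re.sub(r"[0-9]+", " number ", text)          # numbers -> placeholder
    text = re.sub(r"[$]+", " dollar ", text)            # dollar signs -> placeholder
    return text


def tokenize(text):
    """Split normalized text into alphabetic tokens (no stemming in this sketch)."""
    return re.findall(r"[a-z]+", text)


def to_feature_vector(tokens, vocab):
    """Return a binary vector: 1 where the vocabulary word occurs in the email."""
    index = {word: i for i, word in enumerate(vocab)}
    features = [0] * len(vocab)
    for token in tokens:
        position = index.get(token)
        if position is not None:
            features[position] = 1
    return features


sample = "<p>Visit http://example.com NOW, offer ends soon: only $100! Mail me@example.com</p>"
tokens = tokenize(preprocess_email(sample))
features = to_feature_vector(tokens, VOCAB)
print(tokens)
print(features)
```

A binary vector of this kind, one per email, is the sort of input a linear-kernel SVM (for example scikit-learn's `SVC(kernel="linear")`) can be trained on. In that setting, the vocabulary words with the largest positive weights would correspond to the top spam predictors.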
This project is licensed under the MIT License.