
aditya-choudhary599/Marathi-Plagiarism-Detection


Marathi-Plagiarism-Detection

This repository contains the implementation and dataset for the paper "Enhancing Plagiarism Detection in Marathi with a Weighted Ensemble of TF-IDF and BERT Embeddings for Low-Resource Language Processing". It focuses on improving Marathi plagiarism detection using a weighted ensemble of TF-IDF and BERT embeddings.
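As a rough illustration of what a weighted ensemble of TF-IDF and embedding similarity can look like, the sketch below combines two cosine similarities with a single weight. It is not the implementation from this repository or the paper: the function names, the whitespace tokenizer, the smoothed IDF formula, the `tfidf_weight` value, and the small hand-written vectors standing in for BERT sentence embeddings are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) from whitespace-tokenized documents."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Smoothed IDF keeps a nonzero weight for tokens found in every document.
        vectors.append({
            tok: (cnt / len(toks)) * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1)
            for tok, cnt in counts.items()
        })
    return vectors


def cosine_sparse(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(val * b.get(tok, 0.0) for tok, val in a.items())
    norm_a = math.sqrt(sum(val * val for val in a.values()))
    norm_b = math.sqrt(sum(val * val for val in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_dense(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ensemble_score(tfidf_sim, embedding_sim, tfidf_weight=0.5):
    """Weighted average of the two similarities; tfidf_weight is a hypothetical knob."""
    return tfidf_weight * tfidf_sim + (1.0 - tfidf_weight) * embedding_sim


if __name__ == "__main__":
    source = "मी शाळेत जातो"
    suspect = "मी घरी जातो"
    vec_source, vec_suspect = tfidf_vectors([source, suspect])
    lexical = cosine_sparse(vec_source, vec_suspect)

    # Placeholder vectors; a real pipeline would obtain these from a BERT encoder.
    emb_source = [0.2, 0.7, 0.1]
    emb_suspect = [0.3, 0.6, 0.2]
    semantic = cosine_dense(emb_source, emb_suspect)

    print(ensemble_score(lexical, semantic, tfidf_weight=0.4))
```

In a sketch like this, a higher `tfidf_weight` favors exact lexical overlap, while a lower one favors semantic similarity from the embeddings; a pair would typically be flagged when the combined score exceeds a threshold chosen on labeled data.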

Please Cite Our Work:

If you use the dataset or refer to our methodology, please cite our work using the following BibTeX entry:

@misc{mutsaddi2025enhancingplagiarismdetectionmarathi,
      title={Enhancing Plagiarism Detection in Marathi with a Weighted Ensemble of TF-IDF and BERT Embeddings for Low-Resource Language Processing}, 
      author={Atharva Mutsaddi and Aditya Choudhary},
      year={2025},
      eprint={2501.05260},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.05260}, 
}
