GitHub - joannamagdalena/nlp-for-papers: NLP for Scientific Papers

NLP for Scientific Papers is a project for analyzing scientific articles with natural language processing techniques. The goal is to extract key insights from research papers, explore word frequency patterns, and apply LDA (Latent Dirichlet Allocation) topic modeling to identify underlying topics, with visualizations of the results. The project currently supports both unigram and bigram analysis.
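To illustrate what unigram and bigram frequency analysis can look like, here is a minimal sketch using tidytext and dplyr (both listed among the packages below). It is not the repository's actual code: the `papers` table, its column names, and the toy sentences are assumptions made only for this example.

```r
library(dplyr)
library(tidytext)
library(tibble)

# Hypothetical stand-in for text extracted from PDF files
papers <- tibble(
  doc = c("paper_1", "paper_2"),
  text = c(
    "Topic models find latent topics in text. Topic models are widely used.",
    "Neural networks learn representations. Topic models complement neural networks."
  )
)

# Unigrams: one lowercased word per row, stop words dropped, then counted
unigram_counts <- papers %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  count(word, sort = TRUE)

# Bigrams: overlapping pairs of consecutive words, then counted
bigram_counts <- papers %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
  filter(!is.na(bigram)) %>%
  count(bigram, sort = TRUE)

print(head(unigram_counts))
print(head(bigram_counts))
```

Note that bigram tokenization here runs across sentence boundaries within a document; whether that is desirable depends on the analysis.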

Languages and packages used:

R (pdftools, tm, textstem, dplyr, tidytext, ggplot2, topicmodels, tidyverse, textmineR, Matrix, slam, wordcloud, RColorBrewer, wordcloud2, widyr, ggraph, igraph, tibble)

Repository structure:

loading_files.R - data loading from PDF files
main.R - NLP model implementation
preprocessing.R - data preprocessing
LDA.R - LDA model, with visualization
tf-idf.R - visualization of high tf-idf words
README.md
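
The tf-idf and LDA steps named above could be sketched roughly as follows. This is a hedged illustration, not the contents of tf-idf.R or LDA.R: the `word_counts` table and its values are invented for the example, and the choice of `k = 2` topics with the Gibbs sampler is an assumption rather than the project's configuration.

```r
library(dplyr)
library(tidytext)
library(tibble)
library(topicmodels)

# Hypothetical per-document word counts; in practice these would come
# from the loaded and preprocessed PDF text
word_counts <- tribble(
  ~doc,      ~word,      ~n,
  "paper_1", "model",    4L,
  "paper_1", "topic",    3L,
  "paper_1", "corpus",   2L,
  "paper_2", "model",    2L,
  "paper_2", "neural",   5L,
  "paper_2", "gradient", 3L
)

# tf-idf: words that occur in every document get an idf (and tf-idf) of 0
word_tf_idf <- word_counts %>%
  bind_tf_idf(word, doc, n) %>%
  arrange(desc(tf_idf))

# LDA: cast the counts to a document-term matrix, fit a 2-topic model,
# and extract the per-topic word probabilities (beta)
dtm <- word_counts %>% cast_dtm(doc, word, n)
lda_model <- LDA(dtm, k = 2, method = "Gibbs", control = list(seed = 1234))
topic_terms <- tidy(lda_model, matrix = "beta")

print(word_tf_idf)
print(topic_terms)
```

The resulting `word_tf_idf` and `topic_terms` tables are in a tidy format that can be passed directly to ggplot2 for the kind of visualizations the project describes.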
