This repository provides the code for BioSEPBERT, a neuroscience language representation model designed for brain region text mining tasks such as named entity recognition (NER) and relation extraction (RE).
Install the dependencies from requirements.txt as follows (Python >= 3.8 required):
pip install -r requirements.txt
If you want the CUDA-enabled build of PyTorch, install it with:
pip install torch==1.8.1+cu101 torchvision==0.9.1+cu101 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
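After installing, you can confirm that the CUDA build is active with a standard PyTorch check (not specific to BioSEPBERT):

    # Sanity check: confirm the CUDA build of PyTorch is installed and a GPU is visible.
    import torch

    print(torch.__version__)           # should report 1.8.1+cu101 after the command above
    print(torch.cuda.is_available())   # True means the CUDA build found a usable GPU
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))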
We provide two versions of pre-trained weights:
- BioSEPBERT-NER - fine-tuned on the WhiteText corpus
- BioSEPBERT-RE - fine-tuned on the WhiteText connectivity corpus
You can also use other pre-trained weights (see the loading sketch below):
- BioBERT - pre-trained on biomedical corpora
- PubMedBERT - pre-trained on biomedical corpora
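Assuming the checkpoints follow the standard Hugging Face transformers layout, loading any of them looks like the sketch below; the local path and the Hub model IDs are illustrative assumptions, not identifiers pinned by this repository.

    # Minimal loading sketch (assumes a standard Hugging Face transformers layout;
    # the local path and Hub IDs are illustrative, not pinned by this repository).
    from transformers import AutoModel, AutoTokenizer

    # BioSEPBERT weights unpacked locally (see the model/ directory below)
    tokenizer = AutoTokenizer.from_pretrained("../model/")
    model = AutoModel.from_pretrained("../model/")

    # Other biomedical checkpoints can be pulled from the Hugging Face Hub, e.g.:
    # AutoModel.from_pretrained("dmis-lab/biobert-base-cased-v1.1")
    # AutoModel.from_pretrained("microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract")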
We provide a pre-processed version of the benchmark dataset for each task:
- Named Entity Recognition (36.3 MB): a dataset for brain region named entity recognition
- Relation Extraction (46.6 MB): a dataset for brain region connectivity relation extraction

All datasets are available in the dataset folder.
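The exact file layout is defined by the files under dataset/; as a starting point, here is a minimal reader sketch assuming the common CoNLL-style token-per-line NER format (token and tag per line, blank line between sentences). The file name train.txt is hypothetical; check the actual split names in the folder.

    # Minimal reader sketch, ASSUMING a CoNLL-style token-per-line NER format.
    # The file name "train.txt" is hypothetical; check dataset/NER/ for the real files.
    def read_conll(path):
        sentences, tokens, tags = [], [], []
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:              # blank line marks a sentence boundary
                    if tokens:
                        sentences.append((tokens, tags))
                        tokens, tags = [], []
                    continue
                parts = line.split()
                tokens.append(parts[0])   # first column: token
                tags.append(parts[-1])    # last column: entity tag
        if tokens:                        # flush a final sentence with no trailing blank line
            sentences.append((tokens, tags))
        return sentences

    sents = read_conll("../dataset/NER/1/train.txt")
    print(len(sents), "sentences; first:", sents[0])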
After downloading one of the pre-trained weights, unpack it into the model/ directory.
You can change model_name and model_type to PubMedBERT or biobert to use other pre-trained weights; the other parameters can also be adjusted, as described in our paper.
The following command runs the NER fine-tuning code with default arguments.
python run_ner.py \
    --task_name=BioSEPBERT \
    --data_dir=../dataset/NER/1 \
    --model_dir=../model/ \
    --model_name=BioSEPBERT \
    --model_type=BioSEPBERT \
    --output_dir=../ \
    --max_length=512 \
    --train_batch_size=16 \
    --eval_batch_size=16 \
    --learning_rate=5e-5 \
    --epochs=3 \
    --logging_steps=-1 \
    --save_steps=10 \
    --seed=2022 \
    --do_train \
    --do_predict
The following command runs the RE fine-tuning code with default arguments.
python run_re.py \
    --task_name=BioSEPBERT \
    --data_dir=../dataset/RE/1 \
    --model_dir=../model/ \
    --model_name=BioSEPBERT \
    --model_type=BioSEPBERT \
    --output_dir=../ \
    --max_length=512 \
    --train_batch_size=16 \
    --eval_batch_size=16 \
    --learning_rate=5e-5 \
    --epochs=3 \
    --warmup_proportion=0.1 \
    --earlystop_patience=100 \
    --max_grad_norm=0.0 \
    --logging_steps=-1 \
    --save_steps=1 \
    --seed=2021 \
    --do_train \
    --do_predict
The following command runs evaluation only (prediction without training).
python run_ner.py \
    --task_name=BioSEPBERT \
    --data_dir=../dataset/NER/1 \
    --model_dir=../model/ \
    --model_name=BioSEPBERT \
    --model_type=BioSEPBERT \
    --output_dir=../ \
    --max_length=512 \
    --train_batch_size=16 \
    --eval_batch_size=16 \
    --learning_rate=5e-5 \
    --epochs=3 \
    --logging_steps=-1 \
    --save_steps=10 \
    --seed=2022 \
    --do_predict
For questions, additional information, or suggestions, please contact us at aali@hust.edu.cn.