A simple script to convert XML Named Entity Recognition annotations to the CONLL format.

bryanoliveira/xml2conll

xml2conll

A simple script to convert XML Named Entity Recognition annotations to the CONLL format. It was written to convert the XML files of the HAREM project, a well-known Portuguese NER dataset. An example input file is provided in example.xml.
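To make the conversion concrete, here is a minimal sketch of the general idea, not the code in xml2conll.py. It assumes HAREM-style markup in which each entity is wrapped in an `EM` element whose `CATEG` attribute holds the category, and it assumes a BIO tagging scheme with one `token tag` pair per line. The function name `xml_to_conll`, its parameters, and the `MISC` fallback label are hypothetical. Tokens are split on whitespace to keep the example dependency-free, whereas a real tokenizer such as the one in nltk would also separate punctuation.

```python
import xml.etree.ElementTree as ET


def xml_to_conll(xml_text, entity_tag="EM", label_attr="CATEG"):
    """Return CONLL-style lines ("token tag") for an XML string.

    Assumptions: entities are not nested, and BIO tags are wanted.
    """
    root = ET.fromstring(xml_text)
    lines = []

    def emit(text, label=None):
        # Whitespace tokenization is a simplification for this sketch.
        for i, token in enumerate((text or "").split()):
            if label is None:
                tag = "O"
            else:
                tag = ("B-" if i == 0 else "I-") + label
            lines.append(f"{token} {tag}")

    def walk(node):
        if node.tag == entity_tag:
            # Treat the whole element text as a single entity mention.
            label = node.get(label_attr, "MISC")  # hypothetical fallback
            emit("".join(node.itertext()), label)
            return
        emit(node.text)
        for child in node:
            walk(child)
            emit(child.tail)  # text between this child and the next one

    walk(root)
    return lines


sample = (
    '<DOC><P>O poeta <EM CATEG="PESSOA">Fernando Pessoa</EM> '
    'nasceu em <EM CATEG="LOCAL">Lisboa</EM> .</P></DOC>'
)
print("\n".join(xml_to_conll(sample)))
```

In this sketch, text outside any entity element is tagged `O`, the first token of an entity gets a `B-` prefix, and the remaining tokens get `I-`. Sentence boundaries, which CONLL files usually mark with blank lines, are left out for brevity.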

Install

Dependencies

You will need the following dependencies:

  • Python 3

Requirements

The script requires the following Python packages, which can be installed manually:

  • nltk

Alternatively, install them all by running the following command:

pip3 install -r requirements.txt

Usage

Run python3 xml2conll.py --input [XML FILE PATH] to convert the XML file to CONLL format. If needed, you can specify the output file name with --output [CONLL FILE PATH].
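For readers who want to see how a command line like this is typically wired up, the following is a sketch using Python's standard `argparse` module. Only the `--input` and `--output` flags come from the usage line above. The function name `parse_args` and the default output path (the input name with a `.conll` suffix) are assumptions made for illustration, and the actual script may behave differently when `--output` is omitted.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse --input (required) and --output (optional) arguments."""
    parser = argparse.ArgumentParser(
        description="Convert XML NER annotations to the CONLL format."
    )
    parser.add_argument("--input", required=True, help="path to the XML file")
    parser.add_argument("--output", default=None, help="path of the CONLL file")
    args = parser.parse_args(argv)
    if args.output is None:
        # Assumed default: reuse the input name with a .conll suffix.
        args.output = str(Path(args.input).with_suffix(".conll"))
    return args


args = parse_args(["--input", "example.xml"])
print(args.input, args.output)
```

Passing an explicit list to `parse_args` makes the function easy to test, while calling it with no argument reads from the real command line.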
