Gus Hahn-Powell edited this page Oct 13, 2016 · 8 revisions

What do I need to compile and run Reach from source?

  1. Java 8
  2. sbt (any version will do, as the proper version will be retrieved at compile time)
  3. At least 5 GB of RAM (see the .sbtopts file)

What license does Reach use?

We are moving to a dual license (free for research, not-so-free for commercial use). More to come soon...

What formats can Reach read?

You can find a description of our supported input formats here: https://github.com/clulab/reach/wiki/Supported-Input-Formats

How can I download an nxml file from the OpenAccess subset of PubMed?

Here are two solutions:

  1. Use a URL of this format: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=<pmc id sans pmc goes here>&retmode=xml
    • If we wanted to retrieve PMC26816343, this would be the formatted URL:
      • http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=26816343&retmode=xml
  2. Run this Python (2.7 or 3.x) script: https://gist.github.com/myedibleenso/f233359445461a71ad37017393fe921f
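
The first solution can also be scripted. The following is a minimal Python 3 sketch, not the linked gist: it only builds the URL in the format shown above by stripping the "PMC" prefix from the ID. The names `efetch_url` and `fetch_xml` are hypothetical helpers introduced here for illustration, and `fetch_xml` assumes network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address taken from the URL format described above.
EFETCH = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(paper_id):
    """Build the efetch URL for an ID such as 'PMC26816343' or '26816343'."""
    numeric_id = paper_id.strip()
    # Drop the leading "PMC", since the URL expects the ID without it.
    if numeric_id.upper().startswith("PMC"):
        numeric_id = numeric_id[3:]
    if not numeric_id.isdigit():
        raise ValueError("expected a numeric ID, got %r" % paper_id)
    params = [("db", "pubmed"), ("id", numeric_id), ("retmode", "xml")]
    return EFETCH + "?" + urlencode(params)


def fetch_xml(paper_id, timeout=30):
    """Download the XML for paper_id and return it as bytes (hypothetical helper)."""
    with urlopen(efetch_url(paper_id), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Only prints the URL; call fetch_xml() to actually download the file.
    print(efetch_url("PMC26816343"))
```

Calling `fetch_xml("PMC26816343")` and writing the returned bytes to a file would save the response locally.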

How do I cite Reach?

If you use Reach, please cite this paper:

@inproceedings{Valenzuela+:2015aa,
  author    = {Valenzuela-Esc\'{a}rcega, Marco A. and Gustave Hahn-Powell and Thomas Hicks and Mihai Surdeanu},
  title     = {A Domain-independent Rule-based Framework for Event Extraction},
  organization = {ACL-IJCNLP 2015},
  booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing: Software Demonstrations (ACL-IJCNLP)},
  url = {http://www.aclweb.org/anthology/P/P15/P15-4022.pdf},
  year      = {2015},
  pages = {127--132},
  Note = {Paper available at \url{http://www.aclweb.org/anthology/P/P15/P15-4022.pdf}},
}

If you use the sieve-based assembly system for deduplication or event ordering (by causality), please cite this paper:

@inproceedings{GHP+:2016aa,
  author       = {Gus Hahn-Powell and Dane Bell and Marco A. Valenzuela-Esc\'{a}rcega and Mihai Surdeanu},
  title        = {This before That: Causal Precedence in the Biomedical Domain},
  booktitle    = {Proceedings of the 2016 Workshop on Biomedical Natural Language Processing},
  organization = {Association for Computational Linguistics},
  year         = {2016},
  Note         = {Paper available at \url{https://arxiv.org/abs/1606.08089}}
}

If your work makes use of the coreference resolution system (used by default in Reach), please cite this paper:

@InProceedings{Bell:16,
    title     = {{Sieve-based coreference resolution in the biomedical domain}},
    author    = {Bell, Dane and Gus Hahn-Powell and Marco A. Valenzuela-Esc\'{a}rcega and Mihai Surdeanu},
    booktitle = {Proceedings of the 10th edition of the Language Resources and Evaluation Conference (LREC)},
    year      = {2016},
}

I heard Reach is a rule-based system. What language are the rules written in?

While Reach uses machine learning for aspects of named entity recognition (NER), context, and causal assembly, event detection is done using rules written in Odin.