
medaCy

🏥 Medical Text Mining and Information Extraction with spaCy 🏥

MedaCy is a text processing and learning framework built on spaCy to support the lightning-fast prototyping, training, and application of highly predictive medical NLP models. It is designed to streamline researcher workflow by providing utilities for model training, prediction, and organization while ensuring the replicability of systems.

🌟 Features

  • Highly predictive, shared-task-dominating trained models for medical named entity recognition, available out of the box.
  • Customizable pipelines with detailed development instructions and documentation.
  • Support for designing replicable NLP systems, so that results can be reproduced and models distributed while still preserving privacy.
  • Active community development spearheaded and maintained by NLP@VCU.
  • Detailed API documentation.

💭 Where to ask questions

MedaCy is actively maintained by a team of researchers at Virginia Commonwealth University. The fastest way to get an answer to a question is to raise an issue. Make sure to consult the API documentation first. See how to formulate a good issue or feature request in the Contribution Guide.

💻 Installation Instructions

MedaCy can be installed for general use or for pipeline development / research purposes.

| Application | Run |
| --- | --- |
| Prediction and Model Training (stable) | pip install git+https://github.com/NLPatVCU/medaCy.git |
| Prediction and Model Training (latest) | pip install git+https://github.com/NLPatVCU/medaCy.git@development |
| Pipeline Development and Contribution | See Contribution Instructions |
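
After running one of the install commands, it can be useful to confirm that the package is importable before loading a model. The following is a minimal sketch that uses only the Python standard library; the helper name `is_installed` is hypothetical and is not part of medaCy.

```python
import importlib.util


def is_installed(package: str = "medacy") -> bool:
    """Return True if a top-level package with this name can be found for import.

    Hypothetical helper for illustration; it does not import the package,
    it only asks the import system whether the package can be located.
    """
    return importlib.util.find_spec(package) is not None


print("medacy importable:", is_installed())
```

If this prints `False`, the install most likely went into a different Python environment than the one running the check.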

📚 Power of medaCy

After installing medaCy and medaCy's clinical model, simply run:

from medacy.model.model import Model

model = Model.load_external('medacy_model_clinical_notes')
annotation = model.predict("The patient was prescribed 1 capsule of Advil for 5 days.")
print(annotation)

and receive instant predictions:

[
    ('Drug', 40, 45, 'Advil'),
    ('Dosage', 27, 28, '1'), 
    ('Form', 29, 36, 'capsule'),
    ('Duration', 46, 56, 'for 5 days')
]
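
The printed predictions can be read as `(label, start, end, text)` tuples in which `start` and `end` are character offsets into the input sentence, with `end` exclusive. That reading is inferred from the output shown here, not from the medaCy API, and the object returned by `model.predict` may not be a plain list. Under that assumption, the sketch below copies the tuples into a list by hand, checks each offset pair against the sentence, and sorts the entities into reading order. The helper names `mismatched_spans` and `in_reading_order` are hypothetical.

```python
from typing import List, Tuple

# Assumed shape of one prediction: (label, start, end, text), end exclusive.
Entity = Tuple[str, int, int, str]

text = "The patient was prescribed 1 capsule of Advil for 5 days."

# Copied by hand from the printed output; not produced by calling medaCy.
entities: List[Entity] = [
    ("Drug", 40, 45, "Advil"),
    ("Dosage", 27, 28, "1"),
    ("Form", 29, 36, "capsule"),
    ("Duration", 46, 56, "for 5 days"),
]


def mismatched_spans(text: str, entities: List[Entity]) -> List[Entity]:
    """Return the entities whose offsets do not slice out their own text."""
    return [e for e in entities if text[e[1]:e[2]] != e[3]]


def in_reading_order(entities: List[Entity]) -> List[Entity]:
    """Sort entities by their start offset."""
    return sorted(entities, key=lambda e: e[1])


mismatches = mismatched_spans(text, entities)
ordered = in_reading_order(entities)

for label, start, end, mention in ordered:
    print(f"{start:>3}-{end:<3} {label:<9} {mention}")
```

An empty `mismatches` list means every span lines up with the sentence, which is a cheap sanity check before passing the offsets to downstream tooling.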

MedaCy can also be used through its command line interface, documented here.

To explore medaCy's other models or train your own, visit the examples section.

Reference

@ARTICLE {
    author  = "Andriy Mulyar, Natassja Lewinski and Bridget McInnes",
    title   = "TAC SRIE 2018: Extracting Systematic Review Information with MedaCy",
    journal = "National Institute of Standards and Technology (NIST) 2018 Systematic Review Information Extraction (SRIE) > Text Analysis Conference",
    year    = "2018",
    month   = "nov"
}

License

This package is licensed under the GNU General Public License.

Authors

Current contributors: Steele Farnsworth, Anna Conte, Gabby Gurdin, Aidan Kierans, Aidan Myers, and Bridget T. McInnes

Former contributors: Andriy Mulyar, Jorge Vargas, Corey Sutphin, and Bobby Best

Acknowledgments