Getting started
EDS-NLP provides a set of spaCy components that are used to extract information from clinical notes written in French.
If this is your first time with spaCy, we recommend familiarising yourself with some of its key concepts by looking at the "spaCy 101" page.
Quick start
Installation
You can install EDS-NLP via pip. We recommend pinning the library version in your projects, or using a strict package manager like Poetry:
pip install edsnlp==0.9.0
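To confirm that an environment actually matches the pin, a small check along the following lines can help. This is only a sketch using the standard library's `importlib.metadata`; the helper name `installed_version` and the `EXPECTED` constant are illustrative and not part of EDS-NLP.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


EXPECTED = "0.9.0"  # the version pinned in the install command above
found = installed_version("edsnlp")
if found is None:
    print("edsnlp is not installed")
elif found != EXPECTED:
    print(f"edsnlp {found} is installed, expected {EXPECTED}")
else:
    print(f"edsnlp {found} matches the pin")
```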
A first pipeline
Once the library is installed, let's begin with a very simple example that extracts mentions of COVID-19 in a text and detects whether they are negated.
import spacy
# The "eds" language is provided by EDS-NLP
nlp = spacy.blank("eds")
terms = dict(
    covid=["covid", "coronavirus"],
)
# Sentencizer component, needed for negation detection
nlp.add_pipe("eds.sentences")
# Matcher component
nlp.add_pipe("eds.matcher", config=dict(terms=terms))
# Negation detection
nlp.add_pipe("eds.negation")
# Process your text in one call!
doc = nlp("Le patient est atteint de covid")
doc.ents
# Out: (covid,)
doc.ents[0]._.negation
# Out: False
This example is complete and should run as-is. Check out the spaCy 101 page if you're not familiar with spaCy.
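To build intuition for why the sentencizer is needed before negation detection, here is a deliberately naive, library-free sketch of the same three steps: split into sentences, match terms, then look for a negation cue in the same sentence. It is not how EDS-NLP implements these components; the cue list, the splitting rule, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical, simplified stand-ins for the pipeline's resources.
TERMS = {"covid": ["covid", "coronavirus"]}
NEGATION_CUES = ["pas", "aucun", "sans"]  # tiny illustrative list, not EDS-NLP's


def split_sentences(text):
    """Naive sentence splitter: cut after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract(text):
    """Return (label, matched term, negated) triples, one per match."""
    results = []
    for sentence in split_sentences(text):
        words = re.findall(r"\w+", sentence.lower())
        for label, variants in TERMS.items():
            for variant in variants:
                if variant in words:
                    # Only cues that precede the term in the same sentence count
                    position = words.index(variant)
                    negated = any(cue in words[:position] for cue in NEGATION_CUES)
                    results.append((label, variant, negated))
    return results


print(extract("Le patient est atteint de covid."))
print(extract("Le patient n'est pas atteint de covid. Il tousse."))
print(extract("Pas de fièvre. Le patient est atteint de covid."))
```

In the last call, the cue "Pas" sits in a different sentence from the match, so the mention is not marked as negated. Without sentence boundaries, the cue would leak across sentences and flip the result.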
Available pipeline components
See the Core components overview for more information.
Component | Description |
---|---|
eds.normalizer | Non-destructive input text normalisation |
eds.sentences | Better sentence boundary detection |
eds.matcher | A simple yet powerful entity extractor |
eds.terminology | A simple yet powerful terminology matcher |
eds.contextual_matcher | A conditional entity extractor |
eds.endlines | An unsupervised model to classify each end line |
See the Qualifiers overview for more information.
Component | Description |
---|---|
eds.negation | Rule-based negation detection |
eds.family | Rule-based family context detection |
eds.hypothesis | Rule-based speculation detection |
eds.reported_speech | Rule-based reported speech detection |
eds.history | Rule-based medical history detection |
See the Miscellaneous components overview for more information.
Component | Description |
---|---|
eds.dates | Date extraction and normalisation |
eds.consultation_dates | Identify consultation dates |
eds.measurements | Measurement extraction and normalisation |
eds.sections | Section detection |
eds.reason | Rule-based hospitalisation reason detection |
eds.tables | Table detection |
See the NER overview for more information.
Component | Description |
---|---|
eds.covid | A COVID mention detector |
eds.charlson | A Charlson score extractor |
eds.sofa | A SOFA score extractor |
eds.elston_ellis | An Elston & Ellis code extractor |
eds.emergency_priority | A priority score extractor |
eds.emergency_ccmu | A CCMU score extractor |
eds.emergency_gemsa | A GEMSA score extractor |
eds.tnm | A TNM score extractor |
eds.adicap | An ADICAP code extractor |
eds.drugs | A drug mention extractor |
eds.cim10 | A CIM10 terminology matcher |
eds.umls | A UMLS terminology matcher |
eds.ckd | CKD extractor |
eds.copd | COPD extractor |
eds.cerebrovascular_accident | Cerebrovascular accident extractor |
eds.congestive_heart_failure | Congestive heart failure extractor |
eds.connective_tissue_disease | Connective tissue disease extractor |
eds.dementia | Dementia extractor |
eds.diabetes | Diabetes extractor |
eds.hemiplegia | Hemiplegia extractor |
eds.leukemia | Leukemia extractor |
eds.liver_disease | Liver disease extractor |
eds.lymphoma | Lymphoma extractor |
eds.myocardial_infarction | Myocardial infarction extractor |
eds.peptic_ulcer_disease | Peptic ulcer disease extractor |
eds.peripheral_vascular_disease | Peripheral vascular disease extractor |
eds.solid_tumor | Solid tumor extractor |
eds.alcohol | Alcohol consumption extractor |
eds.tobacco | Tobacco consumption extractor |
Component | Description |
---|---|
eds.nested-ner | A trainable component for nested (and classic) NER |
eds.span-qualifier | A trainable component for multi-class multi-label span qualification |
Disclaimer
The performance of an extraction pipeline may depend on the population and documents considered.
Contributing to EDS-NLP
We welcome contributions! Fork the project and open a pull request. See the dedicated page for details.
Citation
If you use EDS-NLP, please cite us as below.
@misc{edsnlp,
author = {Dura, Basile and Wajsburt, Perceval and Petit-Jean, Thomas and Cohen, Ariel and Jean, Charline and Bey, Romain},
doi = {10.5281/zenodo.6424993},
title = {EDS-NLP: efficient information extraction from French clinical notes},
url = {http://aphp.github.io/edsnlp}
}