Getting started

EDS-NLP provides a set of spaCy components that are used to extract information from clinical notes written in French.

If it's your first time with spaCy, we recommend you familiarise yourself with some of their key concepts by looking at the "spaCy 101" page.

Quick start

Installation

You can install EDS-NLP via pip:

pip install edsnlp

We recommend pinning the library version in your projects, or using a strict package manager like Poetry.

pip install edsnlp==0.9.0

A first pipeline

Once you've installed the library, let's begin with a very simple example that extracts mentions of COVID-19 in a text and detects whether they are negated.

import spacy

nlp = spacy.blank("eds")  # Blank pipeline using the "eds" language

terms = dict(
    covid=["covid", "coronavirus"],  # label -> terms to match
)

# Sentencizer component, needed for negation detection
nlp.add_pipe("eds.sentences")
# Matcher component
nlp.add_pipe("eds.matcher", config=dict(terms=terms))
# Negation detection
nlp.add_pipe("eds.negation")

# Process your text in one call!
doc = nlp("Le patient est atteint de covid")

doc.ents  # Entities found by the matcher
# Out: (covid,)

doc.ents[0]._.negation  # Negation flag set by eds.negation
# Out: False

This example is complete and should run as-is. Check out the spaCy 101 page if you're not familiar with spaCy.
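
To compare an affirmed and a negated mention, the same pipeline can be wrapped in two small helpers. This is only a sketch: `build_covid_pipeline` and `list_mentions` are illustrative names, not part of EDS-NLP, and the code assumes that `edsnlp` is installed so that the "eds" language and the `eds.*` components are available to spaCy.

```python
from typing import Iterable, List, Tuple


def build_covid_pipeline():
    """Assemble the same three components as the first pipeline above."""
    # Imported here so that list_mentions stays usable without spaCy installed.
    import spacy

    nlp = spacy.blank("eds")
    nlp.add_pipe("eds.sentences")
    nlp.add_pipe(
        "eds.matcher",
        config=dict(terms=dict(covid=["covid", "coronavirus"])),
    )
    nlp.add_pipe("eds.negation")
    return nlp


def list_mentions(nlp, texts: Iterable[str]) -> List[Tuple[str, str, bool]]:
    """Return one (mention, label, negated) tuple per entity found in texts."""
    rows = []
    # nlp.pipe processes the texts as a stream instead of one call per text.
    for doc in nlp.pipe(texts):
        for ent in doc.ents:
            rows.append((ent.text, ent.label_, ent._.negation))
    return rows
```

Calling `list_mentions(build_covid_pipeline(), ["Le patient est atteint de covid", "Le patient n'est pas atteint de covid"])` should return one tuple per sentence, with the last element holding the negation flag for that mention.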

Available pipeline components

See the Core components overview for more information.

Component Description
eds.normalizer Non-destructive input text normalisation
eds.sentences Better sentence boundary detection
eds.matcher A simple yet powerful entity extractor
eds.terminology A simple yet powerful terminology matcher
eds.contextual_matcher A conditional entity extractor
eds.endlines An unsupervised model to classify each end line

See the Qualifiers overview for more information.

Pipeline Description
eds.negation Rule-based negation detection
eds.family Rule-based family context detection
eds.hypothesis Rule-based speculation detection
eds.reported_speech Rule-based reported speech detection
eds.history Rule-based medical history detection

See the Miscellaneous components overview for more information.

Component Description
eds.dates Date extraction and normalisation
eds.consultation_dates Identify consultation dates
eds.measurements Measure extraction and normalisation
eds.sections Section detection
eds.reason Rule-based hospitalisation reason detection
eds.tables Tables detection

See the NER overview for more information.

Component Description
eds.covid A COVID mentions detector
eds.charlson A Charlson score extractor
eds.sofa A SOFA score extractor
eds.elston_ellis An Elston & Ellis code extractor
eds.emergency_priority A priority score extractor
eds.emergency_ccmu A CCMU score extractor
eds.emergency_gemsa A GEMSA score extractor
eds.tnm A TNM score extractor
eds.adicap An ADICAP code extractor
eds.drugs A drug mentions extractor
eds.cim10 A CIM10 terminology matcher
eds.umls A UMLS terminology matcher
eds.ckd CKD extractor
eds.copd COPD extractor
eds.cerebrovascular_accident Cerebrovascular accident extractor
eds.congestive_heart_failure Congestive heart failure extractor
eds.connective_tissue_disease Connective tissue disease extractor
eds.dementia Dementia extractor
eds.diabetes Diabetes extractor
eds.hemiplegia Hemiplegia extractor
eds.leukemia Leukemia extractor
eds.liver_disease Liver disease extractor
eds.lymphoma Lymphoma extractor
eds.myocardial_infarction Myocardial infarction extractor
eds.peptic_ulcer_disease Peptic ulcer disease extractor
eds.peripheral_vascular_disease Peripheral vascular disease extractor
eds.solid_tumor Solid tumor extractor
eds.alcohol Alcohol consumption extractor
eds.tobacco Tobacco consumption extractor

Pipeline Description
eds.nested-ner A trainable component for nested (and classic) NER
eds.span-qualifier A trainable component for multi-class multi-label span qualification

Disclaimer

The performance of an extraction pipeline may depend on the population and the documents under consideration.

Contributing to EDS-NLP

We welcome contributions! Fork the project and propose a pull request. Take a look at the dedicated page for details.

Citation

If you use EDS-NLP, please cite us as below.

@misc{edsnlp,
  author = {Dura, Basile and Wajsburt, Perceval and Petit-Jean, Thomas and Cohen, Ariel and Jean, Charline and Bey, Romain},
  doi    = {10.5281/zenodo.6424993},
  title  = {EDS-NLP: efficient information extraction from French clinical notes},
  url    = {http://aphp.github.io/edsnlp}
}