Skip to content

CIM10

The eds.cim10 pipeline component matches the CIM10 (French-language ICD) terminology.

Very low recall

This component uses exact match on CIM10 labels, which leads to extremely low performances. It is intended as a simplistic baseline.

Usage

import spacy

nlp = spacy.blank("fr")
nlp.add_pipe("eds.cim10", config=dict(term_matcher="simstring"))

text = "Le patient est suivi pour fièvres typhoïde et paratyphoïde."

doc = nlp(text)

doc.ents
# Out: (fièvres typhoïde et paratyphoïde,)

ent = doc.ents[0]

ent.label_
# Out: cim10

ent.kb_id_
# Out: A01

Configuration

The pipeline can be configured using the following parameters :

Parameter Description Default
term_matcher Which algorithm should we use : exact or simstring "LOWER"
term_matcher_config Config of the algorithm (SimstringMatcher's for simstring) "LOWER"
attr spaCy attribute to match on (eg NORM, TEXT, LOWER) "LOWER"
ignore_excluded Whether to ignore excluded tokens for matching False

Authors and citation

The eds.cim10 pipeline was developed by AP-HP's Data Science team.