CIM10
The `eds.cim10` pipeline component matches the CIM10 (French-language ICD) terminology.
Very low recall
When using the `exact` matching mode, this component has very poor recall.
The `simstring` mode retrieves approximate matches instead, at the cost of significantly higher computation time.
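To make the recall gap concrete, here is a toy sketch of approximate matching based on character n-gram overlap, the general idea behind SimString-style matchers. It is not the component's actual implementation: the helper names, the Jaccard measure, the trigram size, and the 0.7 threshold are all assumptions chosen for illustration.

```python
from typing import List, Set


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a lowercased, space-padded string."""
    pad = " " * (n - 1)
    padded = pad + text.lower() + pad
    return {padded[i : i + n] for i in range(len(padded) - n + 1)}


def jaccard(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity between the character n-gram sets of two strings."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0


def approximate_matches(query: str, terms: List[str], threshold: float = 0.7) -> List[str]:
    """Return the terms whose similarity to the query reaches the threshold."""
    return [term for term in terms if jaccard(query, term) >= threshold]


# Hypothetical two-entry term list; the real terminology is far larger.
terms = ["fièvres typhoïde", "choléra"]
query = "fièvre typhoïde"  # singular form, absent from the term list

# Exact comparison misses the variant; n-gram similarity recovers it.
exact_hits = [term for term in terms if term == query]
fuzzy_hits = approximate_matches(query, terms)
```

Every candidate term has to be scored against the query rather than looked up directly, which hints at why approximate matching costs more than exact matching.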
Usage
import spacy
nlp = spacy.blank("fr")
nlp.add_pipe("eds.cim10", config=dict(term_matcher="simstring"))
text = "Le patient est suivi pour fièvres typhoïde et paratyphoïde."
doc = nlp(text)
doc.ents
# Out: (fièvres typhoïde et paratyphoïde,)
ent = doc.ents[0]
ent.label_
# Out: cim10
ent.kb_id_
# Out: A01
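When a document contains several matches, it can be convenient to collect the matched text, label, and CIM10 code of each entity in one pass. The sketch below assumes only the `text`, `label_`, and `kb_id_` attributes used in the example above; `entities_to_rows` is a hypothetical helper, not part of the library, and the stand-in entity exists only so the snippet runs without loading a pipeline.

```python
from types import SimpleNamespace
from typing import Iterable, List, Tuple


def entities_to_rows(ents: Iterable) -> List[Tuple[str, str, str]]:
    """Collect (matched text, label, CIM10 code) for each entity."""
    return [(ent.text, ent.label_, ent.kb_id_) for ent in ents]


# Stand-in for a spaCy entity (an assumption for illustration);
# in practice, pass `doc.ents` from the example above.
fake_ent = SimpleNamespace(
    text="fièvres typhoïde et paratyphoïde",
    label_="cim10",
    kb_id_="A01",
)
rows = entities_to_rows([fake_ent])
```

With a real pipeline, the call would be `entities_to_rows(doc.ents)`.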
Configuration
The pipeline can be configured using the following parameters:
PARAMETER | DESCRIPTION |
---|---|
`attr` | Attribute to match on. |
`ignore_excluded` | Whether to skip excluded tokens during matching. |
`ignore_space_tokens` | Whether to skip space tokens during matching. |
`term_matcher` | The term matcher to use, either `exact` or `simstring`. |
`term_matcher_config` | The configuration for the term matcher. |
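As a rough illustration of how these parameters fit together, the sketch below assembles a configuration dictionary of the kind passed to `nlp.add_pipe` in the usage example. `build_cim10_config` is a hypothetical helper, not part of the library, and the validation it performs is an assumption based only on the parameter names and matcher modes listed on this page.

```python
from typing import Any, Dict

# Matcher modes and parameter names as listed on this page.
ALLOWED_TERM_MATCHERS = {"exact", "simstring"}
OPTIONAL_PARAMETERS = {
    "attr",
    "ignore_excluded",
    "ignore_space_tokens",
    "term_matcher_config",
}


def build_cim10_config(term_matcher: str = "exact", **overrides: Any) -> Dict[str, Any]:
    """Build a config dict for `eds.cim10`, rejecting unknown names early."""
    if term_matcher not in ALLOWED_TERM_MATCHERS:
        raise ValueError(f"Unknown term matcher: {term_matcher!r}")
    unknown = set(overrides) - OPTIONAL_PARAMETERS
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    config: Dict[str, Any] = {"term_matcher": term_matcher}
    config.update(overrides)
    return config


# Example: approximate matching that also skips space tokens.
config = build_cim10_config("simstring", ignore_space_tokens=True)
```

The result can then be passed along as `nlp.add_pipe("eds.cim10", config=config)`. Checking names up front is a design choice of this sketch: it surfaces a typo in a parameter name before the pipeline is built.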
Authors and citation
The `eds.cim10` pipeline was developed by AP-HP's Data Science team.