Named Entity Recognition Components

We provide several Named Entity Recognition (NER) components. Named Entity Recognition is the task of identifying short, relevant spans of text (named entities) and classifying them into predefined categories. In clinical documents, these entities can be scores, disorders, behaviors, codes, dates, measurements, etc.

Span setters: where are extracted entities stored?

A component assigns entities to a document by adding them to the doc.ents or doc.spans[group] attributes. doc.ents only supports non-overlapping entities: if two entities overlap, only the longest one is kept. doc.spans[group], on the other hand, can contain overlapping entities. To control where entities are added, use the span_setter argument available in each of these components.

Valid values for the span_setter argument of a component are:

  • a (doc, matches) -> None callable
  • a span group name
  • a list of span group names
  • a dict mapping each group name to True or to a list of labels

The group name "ents" is a special case: it adds the matches to doc.ents instead of a span group.

Examples

  • span_setter=["ents", "ckd"] will add the matches to both doc.ents and doc.spans["ckd"]. It is equivalent to {"ents": True, "ckd": True}.
  • span_setter={"ents": ["foo", "bar"]} will add only the matches labeled "foo" or "bar" to doc.ents.
  • span_setter="ents" will add all matches only to doc.ents.
  • span_setter="ckd" will add all matches only to doc.spans["ckd"].
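
The routing rules above can be pictured with a small pure-Python sketch. It is not EDS-NLP's implementation: the helper names (normalize_span_setter, keep_longest, route_matches) are hypothetical, matches are simplified to (label, start, end) tuples instead of real spans, the callable form of span_setter is left out, and the tie-breaking rule for equally long overlapping matches is an assumption.

```python
from typing import Dict, List, Tuple, Union

# Simplified stand-in for a span: (label, start, end). Hypothetical, for illustration only.
Match = Tuple[str, int, int]
SpanSetter = Union[str, List[str], Dict[str, Union[bool, List[str]]]]


def normalize_span_setter(span_setter: SpanSetter) -> Dict[str, Union[bool, List[str]]]:
    """Convert the string, list, and dict forms to {group: True | [labels]}."""
    if isinstance(span_setter, str):
        return {span_setter: True}
    if isinstance(span_setter, (list, tuple)):
        return {group: True for group in span_setter}
    if isinstance(span_setter, dict):
        return dict(span_setter)
    raise TypeError(f"Unsupported span_setter: {span_setter!r}")


def keep_longest(matches: List[Match]) -> List[Match]:
    """Drop overlapping matches, preferring the longest (assumed tie-break: earliest start)."""
    kept: List[Match] = []
    for match in sorted(matches, key=lambda m: (-(m[2] - m[1]), m[1])):
        _, start, end = match
        # Keep the match only if it is disjoint from every match already kept.
        if all(end <= k_start or start >= k_end for _, k_start, k_end in kept):
            kept.append(match)
    return sorted(kept, key=lambda m: m[1])


def route_matches(span_setter: SpanSetter, matches: List[Match]) -> Dict[str, List[Match]]:
    """Return {group: matches}; the "ents" group stands in for doc.ents (no overlaps)."""
    routed: Dict[str, List[Match]] = {}
    for group, labels in normalize_span_setter(span_setter).items():
        selected = [m for m in matches if labels is True or (labels and m[0] in labels)]
        routed[group] = keep_longest(selected) if group == "ents" else selected
    return routed


# "foo" and "bar" overlap; "baz" stands alone.
matches = [("foo", 0, 3), ("bar", 2, 4), ("baz", 10, 12)]
print(route_matches(["ents", "ckd"], matches))
print(route_matches({"ents": ["foo", "bar"]}, matches))
```

In this sketch, the "ckd" group receives all three matches, overlaps included, while the "ents" group keeps only the longest of the two overlapping matches, mirroring the difference between doc.spans[group] and doc.ents described earlier.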

Available components

Component Description
eds.covid A COVID mentions detector
eds.charlson A Charlson score extractor
eds.sofa A SOFA score extractor
eds.elston_ellis An Elston & Ellis code extractor
eds.emergency_priority A priority score extractor
eds.emergency_ccmu A CCMU score extractor
eds.emergency_gemsa A GEMSA score extractor
eds.tnm A TNM score extractor
eds.adicap An ADICAP code extractor
eds.drugs A drug mentions extractor
eds.cim10 A CIM10 terminology matcher
eds.umls A UMLS terminology matcher
eds.ckd CKD extractor
eds.copd COPD extractor
eds.cerebrovascular_accident Cerebrovascular accident extractor
eds.congestive_heart_failure Congestive heart failure extractor
eds.connective_tissue_disease Connective tissue disease extractor
eds.dementia Dementia extractor
eds.diabetes Diabetes extractor
eds.hemiplegia Hemiplegia extractor
eds.leukemia Leukemia extractor
eds.liver_disease Liver disease extractor
eds.lymphoma Lymphoma extractor
eds.myocardial_infarction Myocardial infarction extractor
eds.peptic_ulcer_disease Peptic ulcer disease extractor
eds.peripheral_vascular_disease Peripheral vascular disease extractor
eds.solid_tumor Solid tumor extractor
eds.alcohol Alcohol consumption extractor
eds.tobacco Tobacco consumption extractor