Drugs

The eds.drugs pipeline component detects mentions of French drugs (brand names and active ingredients) and adds them to doc.ents. Each drug is mapped to an ATC code through the Romedi terminology (Cossin et al., 2019). The ATC (Anatomical Therapeutic Chemical) classification organizes drugs into hierarchical groups, so a code prefix such as A10B identifies a whole group of drugs.
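To make the structure of an ATC code concrete, here is a minimal sketch that splits a code into its successive levels. It assumes the standard ATC layout (one letter, two digits, one letter, one letter, two digits); the helper name `atc_levels` is hypothetical and is not part of edsnlp.

```python
from typing import List

# Cumulative prefix lengths of the five ATC levels in a 7-character code
# (assumption: standard ATC layout, e.g. A / A10 / A10B / A10BA / A10BA02).
ATC_LEVEL_LENGTHS = (1, 3, 4, 5, 7)


def atc_levels(code: str) -> List[str]:
    """Return the prefix of `code` at each ATC level it is long enough to reach."""
    code = code.strip().upper()
    return [code[:n] for n in ATC_LEVEL_LENGTHS if len(code) >= n]


print(atc_levels("A10BA02"))
# ['A', 'A10', 'A10B', 'A10BA', 'A10BA02']
```

Because each level is a prefix of the next, selecting a group of drugs reduces to a `str.startswith` test on the code.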

Examples

In this example, we look for oral antidiabetic medications, i.e. drugs whose ATC code starts with A10B.

import edsnlp

nlp = edsnlp.blank("eds")
nlp.add_pipe("eds.normalizer")
nlp.add_pipe("eds.drugs", config=dict(term_matcher="exact"))

text = "Traitement habituel: Kardégic, cardensiel (bisoprolol), glucophage, lasilix"

doc = nlp(text)

drugs_detected = [(x.text, x.kb_id_) for x in doc.ents]

drugs_detected[0]
# Out: ('Kardégic', 'B01AC06')

len(drugs_detected)
# Out: 5

# Keep only the drugs whose ATC code falls under A10B
oral_antidiabetics_detected = [
    (text, code) for text, code in drugs_detected if code.startswith("A10B")
]
oral_antidiabetics_detected
# Out: [('glucophage', 'A10BA02')]

Glucophage is a brand name for metformin, the first-line medication for the treatment of type 2 diabetes.
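The same prefix logic can group every detected drug by ATC level instead of filtering for a single class. The sketch below is a hedged illustration: `group_by_atc_prefix` is a hypothetical helper, and the input list is hand-written from the two outputs shown above rather than produced by running the pipeline.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hand-written pairs copied from the outputs shown above; in practice this
# list would come from [(ent.text, ent.kb_id_) for ent in doc.ents].
drugs_detected = [("Kardégic", "B01AC06"), ("glucophage", "A10BA02")]


def group_by_atc_prefix(
    drugs: List[Tuple[str, str]], length: int = 3
) -> Dict[str, List[str]]:
    """Group drug mentions by the first `length` characters of their ATC code."""
    groups = defaultdict(list)
    for text, code in drugs:
        groups[code[:length]].append(text)
    return dict(groups)


print(group_by_atc_prefix(drugs_detected))
# {'B01': ['Kardégic'], 'A10': ['glucophage']}
```

A prefix length of 3 corresponds to the second ATC level; passing `length=4` would group at the level used by the A10B filter.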

Parameters

PARAMETER DESCRIPTION
nlp

The pipeline object

TYPE: PipelineProtocol

name

The name of the component

TYPE: str DEFAULT: 'eds.drugs'

attr

The default attribute to use for matching.

TYPE: str DEFAULT: 'NORM'

ignore_excluded

Whether to skip excluded tokens (requires an upstream pipeline to mark excluded tokens).

TYPE: bool DEFAULT: False

ignore_space_tokens

Whether to skip space tokens during matching.

TYPE: bool DEFAULT: False

term_matcher

The matcher to use for matching phrases. One of exact or simstring.

TYPE: Literal['exact', 'simstring'] DEFAULT: 'exact'

term_matcher_config

Parameters passed to the term matcher

TYPE: Dict[str, Any] DEFAULT: {}

label

Label name to use for the Span object and the extension

TYPE: str DEFAULT: 'drug'

span_setter

How to set matches on the doc. With the default value, matches are added to both doc.ents and doc.spans["drug"].

TYPE: SpanSetterArg DEFAULT: {'ents': True, 'drug': True}

RETURNS DESCRIPTION
TerminologyMatcher
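The term_matcher parameter chooses between exact matching and approximate matching in the style of simstring. As a rough intuition for what approximate matching buys, the sketch below scores two strings with a Dice coefficient over character trigrams. It is an illustration under stated assumptions, not the implementation used by edsnlp or by the simstring library: the functions `char_ngrams` and `dice` are hypothetical, and the padding and n-gram size are arbitrary choices.

```python
from typing import Set


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of `text`, padded with one space on each side."""
    padded = f" {text.lower()} "
    return {padded[i : i + n] for i in range(len(padded) - n + 1)}


def dice(a: str, b: str, n: int = 3) -> float:
    """Dice coefficient between the character n-gram sets of `a` and `b`."""
    x, y = char_ngrams(a, n), char_ngrams(b, n)
    if not x or not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


print(dice("glucophage", "glucophage"))            # 1.0
print(round(dice("glucophage", "glucofage"), 2))   # 0.63
print(dice("glucophage", "lasilix"))               # 0.0
```

An exact matcher would miss the misspelling "glucofage", whereas a similarity-based matcher can accept it if the score clears a chosen threshold, at the cost of possible false positives.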

Authors and citation

The eds.drugs pipeline was developed by the IAM team and CHU de Bordeaux's Data Science team.


  1. Cossin S., Lebrun L., Lobre G., Loustau R., Jouhet V., Griffier R., Mougin F., Diallo G. and Thiessard F., 2019. Romedi: An Open Data Source About French Drugs on the Semantic Web. Studies in Health Technology and Informatics, 264, pp. 79-82. doi:10.3233/SHTI190187