Drugs

The eds.drugs pipeline component detects mentions of French drugs (brand names and active ingredients) and adds them to doc.ents. Each drug is mapped to an ATC code through the Romedi terminology [1]. The ATC (Anatomical Therapeutic Chemical) classification organizes drugs into hierarchical groups.
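
Because ATC codes are hierarchical, a prefix of a code identifies a broader group of drugs. The sketch below illustrates this with a hypothetical helper, `atc_levels` (not part of EDS-NLP), that splits a full seven-character code into its five standard levels, assuming the usual ATC layout of 1, 3, 4, 5 and 7 characters.

```python
from typing import List


def atc_levels(code: str) -> List[str]:
    """Return the prefixes of an ATC code, one per classification level.

    Hypothetical helper for illustration only; it is not provided by EDS-NLP.
    """
    # Character counts of the five ATC levels: anatomical group,
    # therapeutic subgroup, pharmacological subgroup, chemical subgroup,
    # chemical substance.
    cuts = (1, 3, 4, 5, 7)
    return [code[:n] for n in cuts if len(code) >= n]


levels = atc_levels("A10BA02")
print(levels)
# Out: ['A', 'A10', 'A10B', 'A10BA', 'A10BA02']
```

Here A10 covers drugs used in diabetes, A10B the blood glucose lowering drugs other than insulins, and A10BA02 is metformin.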

Usage

In this example, we are looking for an oral antidiabetic medication (ATC code: A10B).

from edsnlp.pipelines.core.terminology import TerminologyTermMatcher
import spacy

nlp = spacy.blank("fr")
nlp.add_pipe("eds.normalizer")
nlp.add_pipe("eds.drugs", config=dict(term_matcher=TerminologyTermMatcher.exact))

text = "Traitement habituel: Kardégic, cardensiel (bisoprolol), glucophage, lasilix"

doc = nlp(text)

drugs_detected = [(x.text, x.kb_id_) for x in doc.ents]

drugs_detected
# Out: [('Kardégic', 'B01AC06'), ('cardensiel', 'C07AB07'), ('bisoprolol', 'C07AB07'), ('glucophage', 'A10BA02'), ('lasilix', 'C03CA01')]

oral_antidiabetics_detected = list(
    filter(lambda x: (x[1].startswith("A10B")), drugs_detected)
)
oral_antidiabetics_detected
# Out: [('glucophage', 'A10BA02')]

Glucophage is the brand name of a medication that contains metformin, the first-line medication for the treatment of type 2 diabetes.
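
The same prefix logic extends beyond a single class. As a minimal sketch, the detected drugs can be grouped by ATC prefix in plain Python. The list below is copied from the output of the example, and `group_by_atc_prefix` is a hypothetical helper, not an EDS-NLP function.

```python
from collections import defaultdict

# (text, ATC code) pairs copied from the example output above.
drugs_detected = [
    ("Kardégic", "B01AC06"),
    ("cardensiel", "C07AB07"),
    ("bisoprolol", "C07AB07"),
    ("glucophage", "A10BA02"),
    ("lasilix", "C03CA01"),
]


def group_by_atc_prefix(drugs, length=3):
    """Group drug mentions by the first `length` characters of their ATC code."""
    groups = defaultdict(list)
    for text, code in drugs:
        groups[code[:length]].append(text)
    return dict(groups)


by_subgroup = group_by_atc_prefix(drugs_detected)
print(by_subgroup)
# Out: {'B01': ['Kardégic'], 'C07': ['cardensiel', 'bisoprolol'], 'A10': ['glucophage'], 'C03': ['lasilix']}
```

A prefix length of 3 groups by therapeutic subgroup; passing `length=4` would reproduce the A10B-level filter used in the example.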

Configuration

The pipeline can be configured using the following parameters:

| Parameter | Description | Type | Default |
|---|---|---|---|
| attr | Attribute to match on, e.g. TEXT or NORM. | str | 'NORM' |
| ignore_excluded | Whether to skip excluded tokens during matching. | bool | False |
| ignore_space_tokens | Whether to skip space tokens during matching. | bool | False |
| term_matcher | The term matcher to use, either TerminologyTermMatcher.exact or TerminologyTermMatcher.simstring. | TerminologyTermMatcher | TerminologyTermMatcher.exact |
| term_matcher_config | The configuration for the term matcher. | Dict[str, Any] | {} |
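
As an illustration of how these parameters fit together, the sketch below passes every documented parameter explicitly and only switches the term matcher. The helper `build_drugs_pipeline` is hypothetical; it reuses the import path and calls from the usage example, keeps the documented defaults for everything else, and defers its imports so that defining it does not require the library to be installed.

```python
def build_drugs_pipeline(fuzzy: bool = False):
    """Build a French pipeline with eds.drugs (hypothetical helper).

    fuzzy=False keeps the default exact matcher; fuzzy=True selects
    the simstring matcher instead.
    """
    # Imports are deferred to call time; they follow the usage example.
    import spacy
    from edsnlp.pipelines.core.terminology import TerminologyTermMatcher

    matcher = (
        TerminologyTermMatcher.simstring
        if fuzzy
        else TerminologyTermMatcher.exact
    )

    nlp = spacy.blank("fr")
    nlp.add_pipe("eds.normalizer")
    nlp.add_pipe(
        "eds.drugs",
        config=dict(
            attr="NORM",                # documented default
            ignore_excluded=False,      # documented default
            ignore_space_tokens=False,  # documented default
            term_matcher=matcher,
            term_matcher_config={},     # documented default (no extra options)
        ),
    )
    return nlp
```

Calling `build_drugs_pipeline(fuzzy=True)` would return a pipeline using the simstring matcher; any options placed in term_matcher_config depend on the chosen matcher and are not listed here.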

Authors and citation

The eds.drugs pipeline was developed by the IAM team and CHU de Bordeaux's Data Science team.


  1. Sébastien Cossin, Luc Lebrun, Grégory Lobre, Romain Loustau, Vianney Jouhet, Romain Griffier, Fleur Mougin, Gayo Diallo, and Frantz Thiessard. Romedi: An Open Data Source About French Drugs on the Semantic Web. Studies in Health Technology and Informatics, 264:79–82, August 2019. URL: https://hal.archives-ouvertes.fr/hal-02987843, doi:10.3233/SHTI190187