Alcohol consumption

The eds.alcohol pipeline component extracts mentions of alcohol consumption. It won't match occasional consumption, nor acute intoxication.

Details of the patterns used
# fmt: off
default_patterns = dict(
    source="alcohol",
    regex=[
        r"\balco[ol]",
        r"\bethyl",
        r"(?<!(25.?)|(sevrage)).?\boh\b",
        r"exogenose",
        r"delirium.tremens",
    ],
    exclude=[
        dict(
            regex=[
                "occasion",
                "episod",
                "festi",
                "rare",
                "libre",  # OH-libres
                "aigu",
            ],
            window=(-3, 5),
        ),
        dict(
            regex=["pansement", "compress"],
            window=-3,
        ),
    ],
    regex_attr="NORM",
    assign=[
        dict(
            name="stopped",
            regex=r"(\bex\b|sevr|arret|stop|ancien)",
            window=(-3, 15),
            reduce_mode="keep_first",
        ),
        dict(
            name="zero_after",
            regex=r"(?=^[a-z]*\s*:?[\s-]*(0|non|aucun|jamais))",
            window=3,
            reduce_mode="keep_first",
        ),
    ],
)
# fmt: on
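
To get a feel for how the trigger and exclusion patterns interact, here is a simplified sketch that uses only Python's standard `re` module. It is not the EDS-NLP implementation. It assumes whitespace tokenization and text that is already normalized (lowercase, no accents), reads `window=-3` as "the three preceding tokens", and keeps only the single-token triggers: the `\boh\b` pattern is left out because its variable-width lookbehind is rejected by `re`, and `delirium.tremens` is left out because it spans two tokens. The name `find_mentions` is hypothetical.

```python
import re

# Single-token trigger patterns copied from default_patterns.
TRIGGERS = [r"\balco[ol]", r"\bethyl", r"exogenose"]

# Each entry: (exclusion regexes, (tokens before, tokens after)).
# Assumption: window=-3 means "look at the 3 tokens before the mention".
EXCLUSIONS = [
    (["occasion", "episod", "festi", "rare", "libre", "aigu"], (3, 5)),
    (["pansement", "compress"], (3, 0)),
]


def find_mentions(text):
    """Return the tokens that trigger a match and are not excluded by context."""
    tokens = text.split()
    kept = []
    for i, token in enumerate(tokens):
        if not any(re.search(p, token) for p in TRIGGERS):
            continue
        excluded = False
        for patterns, (before, after) in EXCLUSIONS:
            # Context = neighbouring tokens only, without the mention itself.
            context = tokens[max(0, i - before):i] + tokens[i + 1:i + 1 + after]
            joined = " ".join(context)
            if any(re.search(p, joined) for p in patterns):
                excluded = True
                break
        if not excluded:
            kept.append(token)
    return kept


print(find_mentions("patient alcoolique"))                    # ['alcoolique']
print(find_mentions("prise d'alcool occasionnelle"))          # []
print(find_mentions("application d'un pansement alcoolise"))  # []
```

The real component works on spaCy tokens and the `NORM` attribute, so its window boundaries and matched spans may differ from this approximation.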

Extensions

On each matched span, the following attributes are available:

  • span._.detailed_status: either None, or "ABSTINENCE" if the patient has stopped their consumption
  • span._.negation: set to True when a mention such as "alcool: 0" is found
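
The relation between the `assign` patterns and these two attributes can be illustrated with a small standalone sketch. It applies the `stopped` and `zero_after` regexes from the default patterns to plain, already normalized strings. The helper `describe` and its two arguments are hypothetical, and returning `False` for `negation` when nothing matches is an assumption of this sketch. It also ignores negated stop cues such as "non sevré", for which the component itself reports `None`.

```python
import re

# Assign patterns copied from default_patterns.
STOPPED = re.compile(r"(\bex\b|sevr|arret|stop|ancien)")
ZERO_AFTER = re.compile(r"(?=^[a-z]*\s*:?[\s-]*(0|non|aucun|jamais))")


def describe(context_around, text_after):
    """Hypothetical helper: derive both attributes from normalized strings.

    context_around: text surrounding the mention (stands in for the (-3, 15) window)
    text_after: text right after the mention (stands in for the 3-token window)
    """
    assigned = {}
    stopped = STOPPED.search(context_around)
    if stopped:
        assigned["stopped"] = stopped.group(1)
    zero = ZERO_AFTER.search(text_after)
    if zero:
        assigned["zero_after"] = zero.group(1)
    return {
        "detailed_status": "ABSTINENCE" if "stopped" in assigned else None,
        "negation": "zero_after" in assigned,
        "assigned": assigned,
    }


print(describe("alcoolisme sevre", "sevre"))
# {'detailed_status': 'ABSTINENCE', 'negation': False, 'assigned': {'stopped': 'sevr'}}
print(describe("alcool: 0", ": 0"))
# {'detailed_status': None, 'negation': True, 'assigned': {'zero_after': '0'}}
```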

Use qualifiers!

Although the alcohol pipe sometimes sets a value for the negation attribute, generic qualifiers should still be used after the pipe.

Examples

import edsnlp, edsnlp.pipes as eds

nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(
    eds.normalizer(
        accents=True,
        lowercase=True,
        quotes=True,
        spaces=True,
        pollution=dict(
            information=True,
            bars=True,
            biology=True,
            doctors=True,
            web=True,
            coding=True,
            footer=True,
        ),
    ),
)
nlp.add_pipe(eds.alcohol())

Below are a few examples:

text = "Patient alcoolique."
doc = nlp(text)
spans = doc.spans["alcohol"]

spans
# Out: [alcoolique]

text = "OH chronique."
doc = nlp(text)
spans = doc.spans["alcohol"]

spans
# Out: [OH]

text = "Prise d'alcool occasionnelle"
doc = nlp(text)
spans = doc.spans["alcohol"]

spans
# Out: []

text = "Application d'un pansement alcoolisé"
doc = nlp(text)
spans = doc.spans["alcohol"]

spans
# Out: []

text = "Alcoolisme sevré"
doc = nlp(text)
spans = doc.spans["alcohol"]

spans
# Out: [Alcoolisme sevré]

span = spans[0]

span._.detailed_status
# Out: ABSTINENCE

span._.assigned
# Out: {'stopped': sevré}

text = "Alcoolisme non sevré"
doc = nlp(text)
spans = doc.spans["alcohol"]

spans
# Out: [Alcoolisme non sevré]

span = spans[0]

span._.detailed_status
# Out: None # "sevré" is negated, so no "ABSTINENCE" status

text = "Alcool: 0"
doc = nlp(text)
spans = doc.spans["alcohol"]

spans
# Out: [Alcool: 0]

span = spans[0]

span._.negation
# Out: True

span._.assigned
# Out: {'zero_after': 0}

text = "Le patient est en cours de sevrage éthylotabagique"
doc = nlp(text)
spans = doc.spans["alcohol"]

spans
# Out: [sevrage éthylotabagique]

span = spans[0]

span._.detailed_status
# Out: ABSTINENCE

span._.assigned
# Out: {'stopped': sevrage}

Parameters

  • nlp: The pipeline object. TYPE: Optional[PipelineProtocol]
  • name: The name of the component. TYPE: Optional[str]
  • patterns: The patterns to use for matching. TYPE: Union[Dict[str, Any], List[Dict[str, Any]]]. DEFAULT: {'source': 'alcohol', 'regex': ['\\balco[ol]', ...
  • label: The label to use for the Span object and the extension. TYPE: str. DEFAULT: alcohol
  • span_setter: How to set matches on the doc. TYPE: SpanSetterArg. DEFAULT: {'ents': True, 'alcohol': True}

Authors and citation

The eds.alcohol component was developed by AP-HP's Data Science team with a team of medical experts, following the insights of the algorithm proposed by Petit-Jean et al., 2024.


  1. Petit-Jean T., Gérardin C., Berthelot E., Chatellier G., Frank M., Tannier X., Kempf E. and Bey R., 2024. Collaborative and privacy-enhancing workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions. Journal of the American Medical Informatics Association. 31, pp.1280-1290. 10.1093/jamia/ocae069