Peptic ulcer disease

The eds.peptic_ulcer_disease pipeline component extracts mentions of peptic ulcer disease.

Details of the patterns used

# fmt: off
main_pattern = dict(
    source="main",
    regex=[
        r"ulcere?.{1,10}gastr",
        r"ulcere?.{1,10}duoden",
        r"ulcere?.{1,10}antra",
        r"ulcere?.{1,10}pept",
        r"ulcere?.{1,10}estomac?",
        r"ulcere?.{1,10}curling",
        r"ulcere?.{1,10}bulb",
        r"(œ|oe)sophagites?.{1,5}pepti.{1,10}ulcer",
        r"gastrite.{1,20}ulcer",
        r"antrite.{1,5}ulcer",
    ],
    regex_attr="NORM",
)

acronym = dict(
    source="acronym",
    regex=[
        r"\bUGD\b",
    ],
    regex_attr="TEXT",
)

generic = dict(
    source="generic",
    regex=r"ulcere?",
    regex_attr="NORM",
    assign=dict(
        name="is_peptic",
        regex=r"\b(gastr|digest)",
        window=(-20, 20),
        limit_to_sentence=False,
    ),
)

default_patterns = [
    main_pattern,
    acronym,
    generic,
]

# fmt: on
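To see what the `main` and `acronym` regular expressions capture, they can be tried outside the pipeline with Python's `re` module. This is only a rough sketch: `rough_norm`, `find_main_matches` and `find_acronyms` are hypothetical helpers, and lowercasing plus accent stripping is an assumed approximation of the `NORM` attribute, not edsnlp's actual normalizer.

```python
import re
import unicodedata

# Regexes copied from main_pattern above.
MAIN_REGEXES = [
    r"ulcere?.{1,10}gastr",
    r"ulcere?.{1,10}duoden",
    r"ulcere?.{1,10}antra",
    r"ulcere?.{1,10}pept",
    r"ulcere?.{1,10}estomac?",
    r"ulcere?.{1,10}curling",
    r"ulcere?.{1,10}bulb",
    r"(œ|oe)sophagites?.{1,5}pepti.{1,10}ulcer",
    r"gastrite.{1,20}ulcer",
    r"antrite.{1,5}ulcer",
]


def rough_norm(text):
    """Hypothetical stand-in for NORM: lowercase and strip accents."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_main_matches(text):
    """Return the raw regex matches of the main patterns on the normalized text."""
    norm = rough_norm(text)
    return [m.group(0) for rx in MAIN_REGEXES for m in re.finditer(rx, norm)]


def find_acronyms(text):
    """The acronym pattern runs on the raw text (TEXT), so it is case-sensitive."""
    return re.findall(r"\bUGD\b", text)


print(find_main_matches("Beaucoup d'ulcères gastriques"))  # ['ulceres gastr']
print(find_acronyms("Présence d'UGD"))                     # ['UGD']
```

The raw match stops at `gastr`, whereas the spans returned by the component in the examples below cover whole tokens (`ulcères gastriques`).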

Extensions

On each span that matches, the following attributes are available:

span._.detailed_status: set to None

Examples

import edsnlp, edsnlp.pipes as eds

nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(
    eds.normalizer(
        accents=True,
        lowercase=True,
        quotes=True,
        spaces=True,
        pollution=dict(
            information=True,
            bars=True,
            biology=True,
            doctors=True,
            web=True,
            coding=True,
            footer=True,
        ),
    ),
)
nlp.add_pipe(eds.peptic_ulcer_disease())

Below are a few examples:

text = "Beaucoup d'ulcères gastriques"
doc = nlp(text)
spans = doc.spans["peptic_ulcer_disease"]

spans
# Out: [ulcères gastriques]

text = "Présence d'UGD"
doc = nlp(text)
spans = doc.spans["peptic_ulcer_disease"]

spans
# Out: [UGD]

text = "Le patient a des ulcères"
doc = nlp(text)
spans = doc.spans["peptic_ulcer_disease"]

spans
# Out: []

text = "Au niveau gastrique: blabla blabla blabla blabla blabla quelques ulcères"
doc = nlp(text)
spans = doc.spans["peptic_ulcer_disease"]

spans
# Out: [gastrique: blabla blabla blabla blabla blabla quelques ulcères]

span = spans[0]

span._.assigned
# Out: {'is_peptic': [gastrique]}
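The last two examples illustrate the `generic` pattern: a bare `ulcere?` match seems to be kept only when a `gastr...` or `digest...` keyword is found in the surrounding window, in which case the span is extended to include it. The sketch below mimics that logic with the standard library only. It assumes the `(-20, 20)` window is counted in tokens and uses a naive tokenizer; `rough_norm` and `generic_matches` are hypothetical helpers, not part of edsnlp.

```python
import re
import unicodedata


def rough_norm(text):
    """Hypothetical stand-in for NORM: lowercase and strip accents."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def generic_matches(text, window=(-20, 20)):
    """Pair each 'ulcer' token with a peptic keyword found in its token window."""
    # Naive tokenization: runs of word characters, or single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", rough_norm(text))
    results = []
    for i, token in enumerate(tokens):
        if not re.match(r"ulcere?", token):
            continue
        # Assumed semantics: 20 tokens before and 20 tokens after the anchor.
        start = max(0, i + window[0])
        end = min(len(tokens), i + window[1] + 1)
        context = " ".join(tokens[start:end])
        # Same prefix regex as the assign block, with \w* added to grab the whole word.
        hit = re.search(r"\b(gastr|digest)\w*", context)
        if hit:
            results.append((token, hit.group(0)))
    return results


print(generic_matches("Au niveau gastrique: blabla blabla blabla blabla blabla quelques ulcères"))
# [('ulceres', 'gastrique')]
print(generic_matches("Le patient a des ulcères"))
# []
```

Because `limit_to_sentence=False` in the pattern, the sketch does not stop the window at sentence boundaries either.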

Parameters

PARAMETER	DESCRIPTION
`nlp`	The pipeline object. TYPE: `Optional[PipelineProtocol]`
`name`	The name of the component. TYPE: `Optional[str]`
`patterns`	The patterns to use for matching. TYPE: `FullConfig`. DEFAULT: `[{'source': 'main', 'regex': ['ulcere?.{1,10}ga...`
`label`	The label to use for the `Span` object and the extension. TYPE: `str`. DEFAULT: `peptic_ulcer_disease`
`span_setter`	How to set matches on the doc. TYPE: `SpanSetterArg`. DEFAULT: `{'ents': True, 'peptic_ulcer_disease': True}`

Authors and citation

The eds.peptic_ulcer_disease component was developed by AP-HP's Data Science team with a team of medical experts, following the insights of the algorithm proposed by Petit-Jean et al., 2024.

Petit-Jean T., Gérardin C., Berthelot E., Chatellier G., Frank M., Tannier X., Kempf E. and Bey R., 2024. Collaborative and privacy-enhancing workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions. Journal of the American Medical Informatics Association. 31, pp.1280-1290. 10.1093/jamia/ocae069