CKD

The eds.ckd pipeline component extracts mentions of CKD (Chronic Kidney Disease). It notably matches:

  • Mentions of various diseases (see below)
  • Kidney transplantation
  • Chronic dialysis
  • Renal failure from stage 3 to 5. The stage is extracted by trying 3 methods:
    • Extracting the mentioned stage directly ("IRC stade IV")
    • Extracting the severity directly ("IRC terminale")
    • Extracting the mentioned GFR (DFG in French) ("IRC avec DFG estimé à 30 mL/min/1,73m2")
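To make the third method concrete, here is a minimal sketch of how a numeric GFR could be turned into a stage. It assumes the standard KDIGO thresholds (stage 3 below 60, stage 4 below 30, stage 5 below 15 mL/min/1.73m2). The helper names `parse_gfr` and `gfr_to_stage` are hypothetical and are not part of the component, whose own conversion may differ.

```python
def parse_gfr(raw: str) -> float:
    """Parse a GFR string that may use a French decimal comma ("29,5")."""
    return float(raw.replace(",", "."))


def gfr_to_stage(gfr: float) -> int:
    """Map a GFR (mL/min/1.73m2) to a CKD stage, assuming KDIGO thresholds."""
    if gfr >= 90:
        return 1
    if gfr >= 60:
        return 2
    if gfr >= 30:
        return 3
    if gfr >= 15:
        return 4
    return 5


def is_stage_3_to_5(gfr: float) -> bool:
    """True when the GFR falls in the range this page describes as matched."""
    return gfr_to_stage(gfr) >= 3


print(gfr_to_stage(parse_gfr("30")))    # 3
print(gfr_to_stage(parse_gfr("29,5")))  # 4
print(is_stage_3_to_5(110.0))           # False
```

Under these assumptions, a GFR of 30 falls in stage 3 and a GFR of 110 falls outside the 3 to 5 range.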
Details of the patterns used
# fmt: off
main_pattern = dict(
    source="main",
    regex=[
        r"glomerulonephrite",
        r"(?<!pyelo)nephrite.{1,10}chronique",
        r"glomerulopathie",
        r"\bGNIgA",
        r"syndrome.{1,5}nephrotique",
        r"nephroangiosclerose",
        r"mal.de.bright",
        r"(maladie|syndrome).{1,7}berger",
        r"(maladie|syndrome).{1,7}bright",
        r"rachitisme.{1,5}renal",
        r"sydrome.{1,5}alport",
        r"good.?pasture",
        r"siadh",
        r"tubulopathie",
    ],
    exclude=dict(
        regex=[
            "aigu",
        ],
        window=4,
    ),
    regex_attr="NORM",
)

transplantation = dict(
    source="transplantation",
    regex=[
        r"transplant.{1,15}(rein|renal)",
        r"greff.{1,10}(rein|renal)",
    ],
    regex_attr="NORM",
)

acute_on_chronic = dict(
    source="acute_on_chronic",
    regex=[
        r"insuffisan.{1,10}(rein|renal).{1,5}aig.{1,10}chron",
    ],
    regex_attr="NORM",
)

dialysis = dict(
    source="dialysis",
    regex=[
        r"\beer\b",
        r"epuration extra.*renale",
        r"dialys",
    ],
    regex_attr="NORM",
    assign=[
        dict(
            name="chronic",
            regex=r"("
            + r"|".join(
                [
                    "long",
                    "chronique",
                    "peritoneal",
                    "depuis",
                    "intermitten",
                    "quotidien",
                    "hebdo",
                    "seances",
                    "reprise",
                    "poursuite",
                    "programme",
                    r"\blun",
                    r"\bmar",
                    r"\bmer",
                    r"\bjeu",
                    r"\bven",
                    r"\bsam",
                    r"\bdim",
                ]
            )
            + r")",
            window=(-5, 5),
        ),
    ],
)

general = dict(
    source="general",
    regex=[
        r"(insuffisan|fonction|malad).{1,10}\b(rein|rena)",
        r"\bmrc[^a-z]",
        r"\birc[^a-z]",
        r"nephropathie",
    ],
    regex_attr="NORM",
    assign=[
        dict(
            name="stage",
            regex=r"\b(iii|iv|v|3|4|5)\b",
            window=7,
            reduce_mode="keep_first",
        ),
        dict(
            name="status",
            regex=r"\b(moder|sever|terminal|pre.?greffe|post.?greffe|dialys|pre.?terminal)",  # noqa
            window=7,
            reduce_mode="keep_first",
        ),
        dict(
            name="dfg",
            regex=r"(?:dfg|debit.{1,10}filtration.{1,5}glomerulaire).*?(\d+[\.,]?\d+)",
            window=20,
            reduce_mode="keep_first",
        ),
    ],
)

acronym = dict(
    source="acronym",
    regex=[
        r"\bDPCA\b",
        r"\bGNMP\b",
        r"\bGEM\b",
        r"\bNCM\b",
    ],
    regex_attr="TEXT",
)

default_patterns = [
    main_pattern,
    transplantation,
    dialysis,
    general,
    acronym,
    acute_on_chronic,
]
# fmt: on
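Most patterns above set regex_attr="NORM", so they are written without accents or capitals. The sketch below shows, using only the standard library, why the "dfg" expression still matches accented clinical text. The `normalize` helper is a rough, hypothetical stand-in for the eds.normalizer output (which does more than this), and `extract_dfg` is an illustrative name, not a library function.

```python
import re
import unicodedata
from typing import Optional


def normalize(text: str) -> str:
    """Rough stand-in for the NORM attribute: lowercase and strip accents."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# Same expression as the "dfg" assign pattern listed above.
DFG_REGEX = re.compile(
    r"(?:dfg|debit.{1,10}filtration.{1,5}glomerulaire).*?(\d+[\.,]?\d+)"
)


def extract_dfg(text: str) -> Optional[str]:
    """Return the first number captured after a DFG mention, or None."""
    match = DFG_REGEX.search(normalize(text))
    return match.group(1) if match else None


print(extract_dfg("IRC avec DFG estimé à 30 mL/min/1,73m2"))   # 30
print(extract_dfg("Débit de filtration glomérulaire à 29,5"))  # 29,5
print(extract_dfg("DFG à 8"))                                  # None
```

The last call returns None because the capture group `\d+[\.,]?\d+` needs at least two digits, so a single-digit value is not captured by this expression.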

Extensions

On each span that matches (named span below), the following attributes are available:

  • span._.detailed_status: set to None
  • span._.assigned: dictionary with the following keys, if relevant:
    • stage: mentioned renal failure stage
    • status: mentioned renal failure severity (e.g. modérée, sévère, terminale, etc.)
    • dfg: mentioned DFG

Examples

import edsnlp, edsnlp.pipes as eds

nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(
    eds.normalizer(
        accents=True,
        lowercase=True,
        quotes=True,
        spaces=True,
        pollution=dict(
            information=True,
            bars=True,
            biology=True,
            doctors=True,
            web=True,
            coding=True,
            footer=True,
        ),
    ),
)
nlp.add_pipe(eds.ckd())

Below are a few examples:

text = "Patient atteint d'une glomérulopathie."
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: [glomérulopathie]

text = "Patient atteint d'une tubulopathie aigüe."
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: []

text = "Patient transplanté rénal"
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: [transplanté rénal]

text = "Présence d'une insuffisance rénale aigüe sur chronique"
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: [insuffisance rénale aigüe sur chronique]

text = "Le patient a été dialysé"
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: []

text = "Le patient est dialysé chaque lundi"
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: [dialysé chaque lundi]

span = spans[0]

span._.assigned
# Out: {'chronic': [lundi]}

text = "Présence d'une IRC"
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: []

text = "Présence d'une IRC sévère"
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: [IRC sévère]

span = spans[0]

span._.assigned
# Out: {'status': sévère}

text = "Présence d'une IRC au stade IV"
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: [IRC au stade IV]

span = spans[0]

span._.assigned
# Out: {'stage': IV}

text = "Présence d'une IRC avec DFG à 30"
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: [IRC avec DFG à 30]

span = spans[0]

span._.assigned
# Out: {'dfg': 30}

text = "Présence d'une maladie rénale avec DFG à 110"
doc = nlp(text)
spans = doc.spans["ckd"]

spans
# Out: []
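The two dialysis examples above differ only in the presence of a chronicity cue ("chaque lundi"). The sketch below reproduces that behaviour with the standard library alone. It is a simplified approximation, not the component's implementation: it assumes the (-5, 5) window is counted in whitespace-separated words, it expects text that is already lowercased and unaccented, it ignores the multi-word "epuration extra.*renale" trigger, and `chronic_cue` is a hypothetical helper name.

```python
import re
from typing import Optional

# Cue list copied from the "chronic" assign entry of the dialysis pattern.
CHRONIC_CUES = [
    "long", "chronique", "peritoneal", "depuis", "intermitten", "quotidien",
    "hebdo", "seances", "reprise", "poursuite", "programme",
    r"\blun", r"\bmar", r"\bmer", r"\bjeu", r"\bven", r"\bsam", r"\bdim",
]
CHRONIC_REGEX = re.compile("(" + "|".join(CHRONIC_CUES) + ")")

# Single-word dialysis triggers only (a simplification of the real pattern).
TRIGGER_REGEX = re.compile(r"dialys|\beer\b")


def chronic_cue(normalized_text: str, window: int = 5) -> Optional[str]:
    """Return the first chronicity cue found near a dialysis mention, else None."""
    tokens = normalized_text.split()
    for i, token in enumerate(tokens):
        if TRIGGER_REGEX.search(token):
            # Keep `window` words on each side of the trigger word.
            context = " ".join(tokens[max(0, i - window): i + window + 1])
            cue = CHRONIC_REGEX.search(context)
            if cue:
                return cue.group(1)
    return None


print(chronic_cue("le patient est dialyse chaque lundi"))  # lun
print(chronic_cue("le patient a ete dialyse"))             # None
```

Unlike the component, which reports the whole token (lundi), this sketch returns only the matched fragment. Short prefixes such as `\bmar` also match unrelated words starting with the same letters, which is worth keeping in mind when adapting the cue list.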

Parameters

  • nlp: The pipeline. TYPE: Optional[PipelineProtocol]
  • name: The name of the component. TYPE: Optional[str]. DEFAULT: 'ckd'
  • patterns: The patterns to use for matching. TYPE: Union[Dict[str, Any], List[Dict[str, Any]]]. DEFAULT: [{'source': 'main', 'regex': ['glomerulonephrit...
  • label: The label to use for the Span object and the extension. TYPE: str. DEFAULT: ckd
  • span_setter: How to set matches on the doc. TYPE: SpanSetterArg. DEFAULT: {'ents': True, 'ckd': True}
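Since patterns accepts a dictionary or a list of dictionaries, a custom pattern can be checked on sample strings before it is handed to the component. The sketch below reuses the keys and expressions of the default transplantation pattern. The `check_pattern` helper and the sample strings are hypothetical, and the samples are written already lowercased and unaccented to mimic regex_attr="NORM".

```python
import re
from typing import Any, Dict, List

# Custom pattern restricted to kidney transplantation, with the same keys
# as the default `transplantation` entry shown on this page.
transplant_only: Dict[str, Any] = dict(
    source="transplantation",
    regex=[
        r"transplant.{1,15}(rein|renal)",
        r"greff.{1,10}(rein|renal)",
    ],
    regex_attr="NORM",
)


def check_pattern(pattern: Dict[str, Any], samples: List[str]) -> Dict[str, bool]:
    """Report, for each sample, whether any regex of the pattern matches it."""
    compiled = [re.compile(expr) for expr in pattern["regex"]]
    return {s: any(c.search(s) for c in compiled) for s in samples}


result = check_pattern(
    transplant_only,
    ["patient transplante renal", "greffe de rein", "greffe hepatique"],
)
print(result)

# With edsnlp installed, and based on the `patterns` parameter documented
# above, the pattern could then be passed along these lines (not run here):
# nlp.add_pipe(eds.ckd(patterns=[transplant_only]))
```

Only the first two samples match, since "greffe hepatique" contains neither "rein" nor "renal".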

Authors and citation

The eds.ckd component was developed by AP-HP's Data Science team with a team of medical experts, following the insights of the algorithm proposed by Petit-Jean et al., 2024.


  1. Petit-Jean T., Gérardin C., Berthelot E., Chatellier G., Frank M., Tannier X., Kempf E. and Bey R., 2024. Collaborative and privacy-enhancing workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions. Journal of the American Medical Informatics Association. 31, pp.1280-1290. 10.1093/jamia/ocae069