CKD
The eds.ckd pipeline component extracts mentions of CKD (Chronic Kidney Disease). It notably matches:
- Mentions of various diseases (see below)
- Kidney transplantation
- Chronic dialysis
- Renal failure from stage 3 to 5. The stage is extracted by trying three methods:
  - Extracting the mentioned stage directly ("IRC stade IV")
  - Extracting the severity directly ("IRC terminale")
  - Extracting the mentioned GFR (DFG in French) ("IRC avec DFG estimé à 30 mL/min/1,73m2")
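The three extraction methods can be illustrated with the standard library alone. The sketch below is not the component's implementation: it reuses the stage, status and dfg regular expressions listed in the patterns on this page, stands in for the NORM attribute with a hypothetical normalize helper, and ignores the token windows that the component applies around each match.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase and strip accents (rough stand-in for the NORM attribute)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


# Regular expressions copied from the "general" pattern's assign entries.
CUES = {
    "stage": re.compile(r"\b(iii|iv|v|3|4|5)\b"),
    "status": re.compile(
        r"\b(moder|sever|terminal|pre.?greffe|post.?greffe|dialys|pre.?terminal)"
    ),
    "dfg": re.compile(
        r"(?:dfg|debit.{1,10}filtration.{1,5}glomerulaire).*?(\d+[\.,]?\d+)"
    ),
}


def extract_severity_cues(text: str) -> dict:
    """Return the first stage, status and GFR value found in the text, if any."""
    norm = normalize(text)
    found = {}
    for name, pattern in CUES.items():
        match = pattern.search(norm)
        if match:
            found[name] = match.group(1)
    if "dfg" in found:
        # French decimal commas are converted before parsing the number.
        found["dfg"] = float(found["dfg"].replace(",", "."))
    return found


print(extract_severity_cues("IRC stade IV"))
print(extract_severity_cues("IRC terminale"))
print(extract_severity_cues("IRC avec DFG estimé à 30 mL/min/1,73m2"))
```

Note that the dfg expression requires at least two digits, so a single-digit GFR value would not be captured by this sketch.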
Details of the patterns used
# fmt: off
main_pattern = dict(
    source="main",
    regex=[
        r"glomerulonephrite",
        r"(?<!pyelo)nephrite.{1,10}chronique",
        r"glomerulopathie",
        r"\bGNIgA",
        r"syndrome.{1,5}nephrotique",
        r"nephroangiosclerose",
        r"mal.de.bright",
        r"(maladie|syndrome).{1,7}berger",
        r"(maladie|syndrome).{1,7}bright",
        r"rachitisme.{1,5}renal",
        r"sydrome.{1,5}alport",
        r"good.?pasture",
        r"siadh",
        r"tubulopathie",
    ],
    exclude=dict(
        regex=[
            "aigu",
        ],
        window=4,
    ),
    regex_attr="NORM",
)
transplantation = dict(
    source="transplantation",
    regex=[
        r"transplant.{1,15}(rein|renal)",
        r"greff.{1,10}(rein|renal)",
    ],
    regex_attr="NORM",
)
acute_on_chronic = dict(
    source="acute_on_chronic",
    regex=[
        r"insuffisan.{1,10}(rein|renal).{1,5}aig.{1,10}chron",
    ],
    regex_attr="NORM",
)
dialysis = dict(
    source="dialysis",
    regex=[
        r"\beer\b",
        r"epuration extra.*renale",
        r"dialys",
    ],
    regex_attr="NORM",
    assign=[
        dict(
            name="chronic",
            regex=r"("
            + r"|".join(
                [
                    "long",
                    "chronique",
                    "peritoneal",
                    "depuis",
                    "intermitten",
                    "quotidien",
                    "hebdo",
                    "seances",
                    "reprise",
                    "poursuite",
                    "programme",
                    r"\blun",
                    r"\bmar",
                    r"\bmer",
                    r"\bjeu",
                    r"\bven",
                    r"\bsam",
                    r"\bdim",
                ]
            )
            + r")",
            window=(-5, 5),
        ),
    ],
)
general = dict(
    source="general",
    regex=[
        r"(insuffisan|fonction|malad).{1,10}\b(rein|rena)",
        r"\bmrc[^a-z]",
        r"\birc[^a-z]",
        r"nephropathie",
    ],
    regex_attr="NORM",
    assign=[
        dict(
            name="stage",
            regex=r"\b(iii|iv|v|3|4|5)\b",
            window=7,
            reduce_mode="keep_first",
        ),
        dict(
            name="status",
            regex=r"\b(moder|sever|terminal|pre.?greffe|post.?greffe|dialys|pre.?terminal)",  # noqa
            window=7,
            reduce_mode="keep_first",
        ),
        dict(
            name="dfg",
            regex=r"(?:dfg|debit.{1,10}filtration.{1,5}glomerulaire).*?(\d+[\.,]?\d+)",
            window=20,
            reduce_mode="keep_first",
        ),
    ],
)
acronym = dict(
    source="acronym",
    regex=[
        r"\bDPCA\b",
        r"\bGNMP\b",
        r"\bGEM\b",
        r"\bNCM\b",
    ],
    regex_attr="TEXT",
)
default_patterns = [
    main_pattern,
    transplantation,
    dialysis,
    general,
    acronym,
    acute_on_chronic,
]
# fmt: on
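Two mechanisms in these patterns are easy to miss: the negative lookbehind that keeps "pyelonephrite" out of the nephritis pattern, and the exclude entry that drops a match when "aigu" appears nearby. The following sketch reproduces both with the re module, together with the "chronic" cue list of the dialysis pattern. It is an illustration only: keep_match is a hypothetical helper, texts are assumed to be already normalized, and the exclusion window is taken to be the four whitespace-separated words after the match, which is an assumption about how the window is interpreted.

```python
import re

NEPHRITIS = re.compile(r"(?<!pyelo)nephrite.{1,10}chronique")
TUBULOPATHY = re.compile(r"tubulopathie")
ACUTE = re.compile(r"aigu")

# Same alternation as the "chronic" assign entry of the dialysis pattern.
CHRONIC_CUES = re.compile(
    r"("
    + r"|".join(
        [
            "long", "chronique", "peritoneal", "depuis", "intermitten",
            "quotidien", "hebdo", "seances", "reprise", "poursuite",
            "programme", r"\blun", r"\bmar", r"\bmer", r"\bjeu",
            r"\bven", r"\bsam", r"\bdim",
        ]
    )
    + r")"
)


def keep_match(norm_text, pattern, window=4):
    """Return the matched text, or None if an exclusion term follows it.

    Assumption: the window covers the `window` words after the match.
    """
    match = pattern.search(norm_text)
    if match is None:
        return None
    following = norm_text[match.end():].split()[:window]
    if ACUTE.search(" ".join(following)):
        return None
    return match.group(0)


print(keep_match("nephrite chronique", NEPHRITIS))
print(keep_match("pyelonephrite chronique", NEPHRITIS))
print(keep_match("patient atteint d'une tubulopathie aigue.", TUBULOPATHY))
print(CHRONIC_CUES.search("le patient est dialyse chaque lundi"))
```

The last call finds the "lun" cue, while a sentence such as "le patient a ete dialyse" contains none of the cues.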
Extensions
On each matching span (called span below), the following attributes are available:
- span._.detailed_status: set to None
- span._.assigned: dictionary with the following keys, if relevant:
  - stage: mentioned renal failure stage
  - status: mentioned renal failure severity (e.g. modérée, sévère, terminale, etc.)
  - dfg: mentioned DFG
Examples
import edsnlp, edsnlp.pipes as eds
nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(
    eds.normalizer(
        accents=True,
        lowercase=True,
        quotes=True,
        spaces=True,
        pollution=dict(
            information=True,
            bars=True,
            biology=True,
            doctors=True,
            web=True,
            coding=True,
            footer=True,
        ),
    ),
)
nlp.add_pipe(eds.ckd())
Below are a few examples:
text = "Patient atteint d'une glomérulopathie."
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: [glomérulopathie]
text = "Patient atteint d'une tubulopathie aigüe."
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: []
text = "Patient transplanté rénal"
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: [transplanté rénal]
text = "Présence d'une insuffisance rénale aigüe sur chronique"
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: [insuffisance rénale aigüe sur chronique]
text = "Le patient a été dialysé"
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: []
text = "Le patient est dialysé chaque lundi"
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: [dialysé chaque lundi]
span = spans[0]
span._.assigned
# Out: {'chronic': [lundi]}
text = "Présence d'une IRC"
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: []
text = "Présence d'une IRC sévère"
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: [IRC sévère]
span = spans[0]
span._.assigned
# Out: {'status': sévère}
text = "Présence d'une IRC au stade IV"
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: [IRC au stade IV]
span = spans[0]
span._.assigned
# Out: {'stage': IV}
text = "Présence d'une IRC avec DFG à 30"
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: [IRC avec DFG à 30]
span = spans[0]
span._.assigned
# Out: {'dfg': 30}
text = "Présence d'une maladie rénale avec DFG à 110"
doc = nlp(text)
spans = doc.spans["ckd"]
spans
# Out: []
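The examples above suggest that a general mention such as "IRC" is kept only when one of its assigned values points to stage 3 or beyond. The sketch below is one possible way to express such a rule; it is not taken from the component's source. The function name is_ckd_mention is hypothetical, any captured severity term is assumed to be sufficient, and the GFR cut-off of 60 mL/min/1.73 m² (the conventional upper bound of stage 3) is an assumption.

```python
from typing import Optional

# Stage labels accepted by the "stage" regular expression of the general pattern.
QUALIFYING_STAGES = {"iii", "iv", "v", "3", "4", "5"}

# Assumption: GFR below 60 mL/min/1.73 m2 corresponds to stage 3 or worse.
GFR_UPPER_BOUND = 60.0


def is_ckd_mention(
    stage: Optional[str] = None,
    status: Optional[str] = None,
    dfg: Optional[str] = None,
) -> bool:
    """Decide whether a general mention qualifies, given its assigned values."""
    if stage is not None and stage.lower() in QUALIFYING_STAGES:
        return True
    if status is not None:
        # Assumption: any severity term captured by the status pattern is enough.
        return True
    if dfg is not None:
        try:
            value = float(dfg.replace(",", "."))
        except ValueError:
            return False
        return value < GFR_UPPER_BOUND
    return False


print(is_ckd_mention())              # no assigned value, as in "Présence d'une IRC"
print(is_ckd_mention(status="sévère"))
print(is_ckd_mention(stage="IV"))
print(is_ckd_mention(dfg="30"))
print(is_ckd_mention(dfg="110"))
```

Under these assumptions the five calls reproduce the behaviour of the corresponding examples: the bare mention and the GFR of 110 are rejected, and the other three are kept.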
Parameters
PARAMETER | DESCRIPTION |
---|---|
nlp | The pipeline |
name | The name of the component |
patterns | The patterns to use for matching |
label | The label to use for the matched spans |
span_setter | How to set matches on the doc |
Authors and citation
The eds.ckd component was developed by AP-HP's Data Science team with a team of medical experts, following the insights of the algorithm proposed by Petit-Jean et al., 2024.
Petit-Jean T., Gérardin C., Berthelot E., Chatellier G., Frank M., Tannier X., Kempf E. and Bey R., 2024. Collaborative and privacy-enhancing workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions. Journal of the American Medical Informatics Association. 31, pp.1280-1290. 10.1093/jamia/ocae069