Diabetes
The eds.diabetes pipeline component extracts mentions of diabetes.
Details of the used patterns
# fmt: off
COMPLICATIONS = [
    r"nephropat",
    r"neuropat",
    r"retinopat",
    r"glomerulopathi",
    r"glomeruloscleros",
    r"angiopathi",
    r"origine",
]

main_pattern = dict(
    source="main",
    regex=[
        r"\bds?n?id\b",
        r"\bdiabet[^o]",
        r"\bdb\b",
        r"\bdt.?(i|ii|1|2)\b",
    ],
    exclude=dict(
        regex=[
            "insipide",
            "nephrogenique",
            "aigu",
            r"\bdr\b",  # Dr. ...
            "endocrino",  # Section title
            "soins aux pieds",  # Section title
            "nutrition",  # Section title
            r"\s?:\n+\W+(?!oui|non|\W)",  # General pattern for section title
        ],
        window=(-5, 5),
    ),
    regex_attr="NORM",
    assign=[
        dict(
            name="complicated_before",
            regex=r"(" + r"|".join(COMPLICATIONS + ["origine"]) + r")",
            window=-3,
        ),
        dict(
            name="complicated_after",
            regex=r"("
            + r"|".join([r"(?<!sans )compli", r"(?<!a)symptomatique"] + COMPLICATIONS)
            + r")",
            window=12,
        ),
        dict(
            name="type",
            regex=r"type.(i|ii|1|2)",
            window=6,
        ),
        dict(
            name="insulin",
            regex=r"insulino.?(dep|req)",
            window=6,
        ),
        dict(
            name="corticoid",
            regex=r"(\bctc\b|cortico(?:.?induit)?)",
            window=6,
        ),
    ],
)

complicated_pattern = dict(
    source="complicated",
    regex=[
        r"(mal|maux).perforants?(.plantaire)?",
        r"pieds? diabeti",
    ],
    exclude=dict(
        regex="soins aux",  # Section title
        window=-2,
    ),
    regex_attr="NORM",
)

default_patterns = [
    main_pattern,
    complicated_pattern,
]
# fmt: on
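To get a feel for how the main regexes and the exclusion terms interact, the following is a minimal sketch using only Python's standard `re` module. It is not the component's implementation: the helper `find_mentions` and its `context_chars` parameter are hypothetical, the exclusion window is approximated in characters rather than tokens, only three of the exclusion terms are used, and the input is assumed to be already normalized (lowercased, accents removed), as implied by `regex_attr="NORM"`.

```python
import re

# Main regexes copied from main_pattern above.
MAIN_REGEX = [
    r"\bds?n?id\b",
    r"\bdiabet[^o]",
    r"\bdb\b",
    r"\bdt.?(i|ii|1|2)\b",
]
# A subset of the exclusion terms, for illustration only.
EXCLUDE_REGEX = ["insipide", "nephrogenique", "aigu"]


def find_mentions(norm_text, context_chars=30):
    """Return raw regex matches that have no excluded term nearby.

    Hypothetical helper: the context is a character window, whereas the
    component's `window=(-5, 5)` setting is expressed in tokens.
    """
    kept = []
    for pattern in MAIN_REGEX:
        for m in re.finditer(pattern, norm_text):
            start = max(0, m.start() - context_chars)
            context = norm_text[start : m.end() + context_chars]
            if any(re.search(ex, context) for ex in EXCLUDE_REGEX):
                continue  # e.g. "diabete insipide" is discarded
            kept.append(m.group(0))
    return kept


print(find_mentions("presence d'un dt2"))
print(find_mentions("presence d'un dnid"))
print(find_mentions("un diabete insipide"))
```

Note that this sketch returns the raw regex match (for instance `diabeti` for `patient diabetique`), while the component returns spans aligned on token boundaries.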
Extensions
On each matched span, the following attributes are available:
- span._.detailed_status: set to either
  - "WITH_COMPLICATION" if the diabetes is complicated (e.g., via organ damage)
  - "WITHOUT_COMPLICATION" otherwise
- span._.assigned: dictionary with the following keys, if relevant:
  - type: type of diabetes (I or II)
  - insulin: if the diabetes is insulin-dependent
  - corticoid: if the diabetes is corticoid-induced
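The `type`, `insulin` and `corticoid` keys correspond to the `assign` regexes of the main pattern. As a rough illustration, the sketch below applies those three regexes to a normalized string with the standard `re` module. The helper `extract_assigned` is hypothetical: it searches the whole string instead of a 6-token window after the match, and it stores the captured text rather than a span object.

```python
import re

# Assign regexes copied from main_pattern (type, insulin, corticoid).
ASSIGN_REGEX = {
    "type": r"type.(i|ii|1|2)",
    "insulin": r"insulino.?(dep|req)",
    "corticoid": r"(\bctc\b|cortico(?:.?induit)?)",
}


def extract_assigned(norm_context):
    """Map each assign key to the text captured by its first group.

    Hypothetical helper: only keys whose regex matches are present,
    mirroring the "if relevant" wording above.
    """
    assigned = {}
    for key, pattern in ASSIGN_REGEX.items():
        m = re.search(pattern, norm_context)
        if m:
            assigned[key] = m.group(1)
    return assigned


print(extract_assigned("diabete de type 2 insulino-dependant"))
print(extract_assigned("diabete cortico-induit"))
```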
Examples
import edsnlp, edsnlp.pipes as eds

nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(
    eds.normalizer(
        accents=True,
        lowercase=True,
        quotes=True,
        spaces=True,
        pollution=dict(
            information=True,
            bars=True,
            biology=True,
            doctors=True,
            web=True,
            coding=True,
            footer=True,
        ),
    ),
)
nlp.add_pipe(eds.diabetes())
Below are a few examples:
text = "Présence d'un DT2"
doc = nlp(text)
spans = doc.spans["diabetes"]
spans
# Out: [DT2]
text = "Présence d'un DNID"
doc = nlp(text)
spans = doc.spans["diabetes"]
spans
# Out: [DNID]
text = "Patient diabétique"
doc = nlp(text)
spans = doc.spans["diabetes"]
spans
# Out: [diabétique]
text = "Un diabète insipide"
doc = nlp(text)
spans = doc.spans["diabetes"]
spans
# Out: []
text = "Atteinte neurologique d'origine diabétique"
doc = nlp(text)
spans = doc.spans["diabetes"]
spans
# Out: [origine diabétique]
span = spans[0]
span._.detailed_status
# Out: WITH_COMPLICATION
span._.assigned
# Out: {'complicated_before': [origine]}
text = "Une rétinopathie diabétique"
doc = nlp(text)
spans = doc.spans["diabetes"]
spans
# Out: [rétinopathie diabétique]
span = spans[0]
span._.detailed_status
# Out: WITH_COMPLICATION
span._.assigned
# Out: {'complicated_before': [rétinopathie]}
text = "Il y a un mal perforant plantaire"
doc = nlp(text)
spans = doc.spans["diabetes"]
spans
# Out: [mal perforant plantaire]
span = spans[0]
span._.detailed_status
# Out: WITH_COMPLICATION
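The examples above suggest a simple reading of how the status relates to the pattern source and the assigned keys: a match from the "complicated" pattern, or a main match with a `complicated_before` or `complicated_after` assignment, is reported as WITH_COMPLICATION. The sketch below encodes that reading in plain Python. The function `infer_detailed_status` is hypothetical and inferred from the examples only; the component's actual logic may include further rules.

```python
def infer_detailed_status(source, assigned):
    """Guess the detailed status from a pattern source and assigned keys.

    Hypothetical rule inferred from the documented examples, not taken
    from the component's code.
    """
    if source == "complicated":
        # e.g. "mal perforant plantaire"
        return "WITH_COMPLICATION"
    if any(key.startswith("complicated") for key in assigned):
        # e.g. {'complicated_before': [...]} for "rétinopathie diabétique"
        return "WITH_COMPLICATION"
    return "WITHOUT_COMPLICATION"


print(infer_detailed_status("main", {"complicated_before": ["origine"]}))
print(infer_detailed_status("complicated", {}))
print(infer_detailed_status("main", {"type": "2"}))
```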
Parameters
PARAMETER | DESCRIPTION |
---|---|
nlp | The pipeline |
name | The name of the component |
patterns | The patterns to use for matching |
label | The label to use for the |
span_setter | The span setter to use |
Authors and citation
The eds.diabetes component was developed by AP-HP's Data Science team with a team of medical experts, following the insights of the algorithm proposed by Petit-Jean et al., 2024.
Petit-Jean T., Gérardin C., Berthelot E., Chatellier G., Frank M., Tannier X., Kempf E. and Bey R., 2024. Collaborative and privacy-enhancing workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions. Journal of the American Medical Informatics Association. 31, pp.1280-1290. 10.1093/jamia/ocae069