Tutorials

We provide step-by-step guides to get you started. We cover the following use-cases:

Matching a terminology: you're looking for a concept within a corpus of texts.
Qualifying entities: you want to make sure that the concepts you've extracted are not invalidated by linguistic modulation.
Detecting dates, which could serve as the basis for an event ordering algorithm.
Processing multiple texts: to improve the inference speed of your pipeline!
Detecting Hospitalisation Reason: you want to look for spans that mention the reason for hospitalisation, or tag entities as the reason.
Detecting false endlines: classify each endline and add the attribute excluded to these tokens.

Rationale

In a typical medical NLP pipeline, a group of clinicians would define a list of synonyms for a given concept of interest (say, for example, diabetes), and look for that terminology in a corpus of documents.

Now, consider the following example:

French:

Le patient n'est pas diabétique.
Le patient est peut-être diabétique.
Le père du patient est diabétique.

English:

The patient is not diabetic.
The patient could be diabetic.
The patient's father is diabetic.

There is an obvious problem: a plain terminology match flags all three sentences, yet none of them should lead us to include this particular patient in the cohort.

Warning

We show an English example just to explain the issue. EDS-NLP remains a French-language medical NLP library.

To curb this issue, EDS-NLP proposes rule-based pipelines that qualify entities to help the user make an informed decision about which patients should be included in a real-world data cohort.

To sum up, a typical medical NLP project consists in:

Editing a terminology
"Matching" this terminology on a corpus, i.e. extracting phrases that belong to that terminology
"Qualifying" entities to avoid false positives

Once the pipeline is ready, we need to deploy it efficiently.
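
As a rough illustration of the three steps listed above, here is a minimal pure-Python sketch. It does not use EDS-NLP and is not how the library implements matching or qualification: the TERMINOLOGY dictionary, the CUES lists and the match, qualify and keep helpers are all hypothetical names introduced for this example, and the rule is deliberately naive (any cue word occurring before the entity qualifies it). It uses the English sentences from the rationale for readability.

```python
import re
from dataclasses import dataclass

# Step 1: edit a terminology (hypothetical: one concept and its synonyms).
TERMINOLOGY = {"diabetes": ["diabetic", "diabetes"]}

# Hypothetical cue words; real qualifier rules are far richer.
CUES = {
    "negated": ["not", "no"],
    "hypothesis": ["could", "might", "possibly"],
    "family": ["father", "mother", "brother", "sister"],
}


@dataclass
class Entity:
    label: str
    start: int
    end: int
    negated: bool = False
    hypothesis: bool = False
    family: bool = False


def match(text):
    """Step 2: find every synonym of every concept in the text."""
    entities = []
    for label, synonyms in TERMINOLOGY.items():
        pattern = r"\b(?:" + "|".join(re.escape(s) for s in synonyms) + r")\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            entities.append(Entity(label, m.start(), m.end()))
    return entities


def qualify(text, entity):
    """Step 3: flag the entity if a cue word occurs anywhere before it."""
    context = text[: entity.start].lower()
    for attribute, cues in CUES.items():
        found = any(re.search(rf"\b{re.escape(c)}\b", context) for c in cues)
        setattr(entity, attribute, found)
    return entity


def keep(entity):
    """An entity supports inclusion only if no qualifier fired."""
    return not (entity.negated or entity.hypothesis or entity.family)


sentences = [
    "The patient is not diabetic.",
    "The patient could be diabetic.",
    "The patient's father is diabetic.",
    "The patient is diabetic.",
]
for sentence in sentences:
    for entity in match(sentence):
        print(sentence, "->", keep(qualify(sentence, entity)))
```

With these toy rules, the three sentences from the rationale are each matched but rejected by a qualifier, and only the plain statement "The patient is diabetic." is kept. A production pipeline would restrict the cue search to a window or syntactic scope around the entity rather than the whole preceding text.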