Negation

The eds.negation component uses a simple rule-based algorithm to detect negated spans. It was designed at AP-HP's EDS, following the insights of the NegEx algorithm by Chapman et al., 2001.

The component looks for five kinds of expressions in the text:

preceding negations, i.e., cues that precede a negated expression
following negations, i.e., cues that follow a negated expression
pseudo negations, i.e., expressions that contain a negation cue but are not negations (e.g., "pas de doute"/"no doubt")
negation verbs, i.e., verbs that indicate a negation
terminations, i.e., words that delimit propositions. The negation spans from the preceding cue to the termination.
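
The interplay of these five cue types can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the component's implementation: the cue lists, the `negated_flags` function and the whitespace tokenisation are all hypothetical, and negation verbs are simply treated like preceding cues.

```python
from typing import List

# Hypothetical, deliberately tiny cue lists (the real component ships its own).
PRECEDING = {"aucun", "aucune", "pas", "sans"}
FOLLOWING = {"exclu", "exclue"}
VERBS = {"élimine", "exclut"}            # assumption: handled like preceding cues
TERMINATIONS = {"mais", ",", "."}
PSEUDO = [("pas", "de", "doute")]        # contains "pas" but is not a negation


def negated_flags(tokens: List[str]) -> List[bool]:
    """Return one boolean per token: True if the token falls in a negated span."""
    words = [t.lower() for t in tokens]
    n = len(words)

    # Mask tokens that belong to a pseudo negation so they are not read as cues.
    masked = [False] * n
    for phrase in PSEUDO:
        k = len(phrase)
        for i in range(n - k + 1):
            if tuple(words[i:i + k]) == phrase:
                for j in range(i, i + k):
                    masked[j] = True

    flags = [False] * n
    start = 0        # index of the first token of the current proposition
    active = False   # True once a preceding cue was seen in this proposition
    for i, word in enumerate(words):
        if word in TERMINATIONS:
            # A termination closes the proposition and resets the state.
            start, active = i + 1, False
        elif masked[i]:
            continue
        elif word in PRECEDING or word in VERBS:
            active = True
        elif word in FOLLOWING:
            # A following cue negates what came earlier in the same proposition.
            for j in range(start, i):
                flags[j] = True
        else:
            flags[i] = active
    return flags


tokens = "Le scanner ne détecte aucune fracture .".split()
print([t for t, neg in zip(tokens, negated_flags(tokens)) if neg])
# ['fracture']
```

In this sketch a preceding cue negates everything up to the next termination, which mirrors the rule stated in the last item of the list above.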

Examples

The following snippet matches a simple terminology, and checks the polarity of the extracted entities. It is complete and can be run as is.

import edsnlp, edsnlp.pipes as eds

nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
# Dummy matcher
nlp.add_pipe(eds.matcher(terms=dict(patient="patient", fracture="fracture")))
nlp.add_pipe(eds.negation())

text = (
    "Le patient est admis le 23 août 2021 pour une douleur au bras. "
    "Le scanner ne détecte aucune fracture."
)

doc = nlp(text)

doc.ents
# Out: (patient, fracture)

doc.ents[0]._.negation
# Out: False

doc.ents[1]._.negation
# Out: True

The result of the component is kept in the negation custom extension.

Extensions

The eds.negation component declares two extensions, on both Span and Token objects:

The negation attribute is a boolean, set to True if the component predicts that the span/token is negated.
The negation_ property is a human-readable string, computed from the negation attribute. It implements a simple getter function that outputs AFF or NEG, depending on the value of negation.
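
The relationship between the two extensions can be illustrated with a small stand-in class. `AnnotatedSpan` below is hypothetical and only mimics the behaviour described above; it is not a spaCy or EDS-NLP object.

```python
from dataclasses import dataclass


@dataclass
class AnnotatedSpan:
    """Hypothetical stand-in for a span carrying the two extensions."""

    text: str
    negation: bool = False  # boolean prediction, as set by the component

    @property
    def negation_(self) -> str:
        # Read-only label derived from the boolean, as described above.
        return "NEG" if self.negation else "AFF"


span = AnnotatedSpan("fracture", negation=True)
print(span.negation_)
# NEG
```

Because the string is computed from the boolean each time it is read, the two values cannot drift apart.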

Performance

The component's performance is measured on three datasets:

The ESSAI (Dalloux et al., 2017) and CAS (Grabar et al., 2018) datasets were developed at the CNRS. The two are concatenated.
The NegParHyp corpus was specifically developed at AP-HP to test the component on actual clinical notes, using pseudonymised notes from the AP-HP.

Dataset	Negation F1
CAS/ESSAI	71%
NegParHyp	88%
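
For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below assumes one gold label and one prediction per entity, with "negated" as the positive class; the exact evaluation protocol behind the figures above is not stated here, and the labels in the example are made up.

```python
from typing import Sequence


def f1_score(gold: Sequence[bool], pred: Sequence[bool]) -> float:
    """F1 for the positive (negated) class over aligned gold/predicted labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up labels: True means the entity is negated.
gold = [True, True, False, False, True]
pred = [True, False, False, True, True]
score = f1_score(gold, pred)  # 2 true positives, 1 false positive, 1 false negative
```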

NegParHyp corpus

The NegParHyp corpus was built by matching a subset of the MeSH terminology with around 300 documents from AP-HP's clinical data warehouse. Matched entities were then labelled for negation, speculation and family context.

Parameters

PARAMETER	DESCRIPTION
`nlp`	The pipeline object. TYPE: `PipelineProtocol`
`name`	The component name. TYPE: `Optional[str]` DEFAULT: `'negation'`
`attr`	spaCy's attribute to use. TYPE: `str` DEFAULT: `NORM`
`pseudo`	List of pseudo negation cues. TYPE: `Optional[List[str]]` DEFAULT: `None`
`preceding`	List of preceding negation cues. TYPE: `Optional[List[str]]` DEFAULT: `None`
`preceding_regex`	List of preceding negation cues, but as regexes. TYPE: `Optional[List[str]]` DEFAULT: `None`
`following`	List of following negation cues. TYPE: `Optional[List[str]]` DEFAULT: `None`
`verbs`	List of negation verbs. TYPE: `Optional[List[str]]` DEFAULT: `None`
`termination`	List of termination terms. TYPE: `Optional[List[str]]` DEFAULT: `None`
`span_getter`	Which entities should be classified. By default, `doc.ents`. TYPE: `SpanGetterArg` DEFAULT: `None`
`on_ents_only`	Deprecated, use `span_getter` instead. Whether to look for matches around detected entities only. Useful for faster inference in downstream tasks. If True, looks only at the entities in `doc.ents`. If an iterable of strings is passed, additionally looks in `doc.spans[key]` for each key in the iterable. TYPE: `Union[bool, str, List[str], Set[str]]` DEFAULT: `None`
`within_ents`	Whether to consider cues within entities. TYPE: `bool` DEFAULT: `False`
`explain`	Whether to keep track of cues for each entity. TYPE: `bool` DEFAULT: `False`

Authors and citation

The eds.negation component was developed by AP-HP's Data Science team.

Chapman W.W., Bridewell W., Hanbury P., Cooper G.F. and Buchanan B.G., 2001. A Simple Algorithm for Identifying Negated Findings and Diseases in Discharge Summaries. Journal of Biomedical Informatics. 34, pp. 301-310. doi:10.1006/jbin.2001.1029
Dalloux C., Claveau V. and Grabar N., 2017. Détection de la négation : corpus français et apprentissage supervisé. https://hal.archives-ouvertes.fr/hal-01659637
Grabar N., Claveau V. and Dalloux C., 2018. CAS: French Corpus with Clinical Cases. https://hal.archives-ouvertes.fr/hal-01937096