Trainable Span Qualifier

The eds.span_qualifier component is a trainable qualifier predictor. In EDS-NLP, we call span attributes "qualifiers". In this context, the span qualification task consists of assigning values (booleans, strings or any complex object) to attributes/extensions of spans such as:

span._.negation
span._.date.mode
span._.cui

In the rest of this page, we will refer to a pair of (qualifier, value) as a "binding". For instance, the binding ("_.negation", True) means that the qualifier negation of the span is (or should be, when predicted) set to True.
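
As a rough illustration of what a binding means, the sketch below applies (qualifier, value) pairs to a stand-in span object. The set_binding helper, the SimpleNamespace span and the "PAST" value are hypothetical and only serve the example; they are not part of EDS-NLP, where spans are spaCy Span objects with registered extensions.

```python
from types import SimpleNamespace


def set_binding(span, qualifier, value):
    """Assign `value` to the attribute path named by `qualifier` (hypothetical helper)."""
    *parents, leaf = qualifier.split(".")
    target = span
    # Walk down the dotted path, e.g. "_" then "date" for "_.date.mode"
    for name in parents:
        target = getattr(target, name)
    setattr(target, leaf, value)


# Stand-in for a span: `_` mimics the extension namespace
span = SimpleNamespace(
    _=SimpleNamespace(negation=None, date=SimpleNamespace(mode=None))
)

set_binding(span, "_.negation", True)     # the binding ("_.negation", True)
set_binding(span, "_.date.mode", "PAST")  # "PAST" is an arbitrary example value
print(span._.negation, span._.date.mode)
```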

Architecture

The model performs span classification by:

1. Calling a word pooling embedding such as eds.span_pooler to compute a single embedding for each span
2. Computing logits for each possible binding using a linear layer
3. Splitting these bindings into groups of exclusive values such as
   - event=start and event=stop
   - negated=False and negated=True

   Note that the above groups are not exclusive, but the values within each group are.
4. Applying the best scoring binding in each group to each span
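
The grouping and selection steps above can be sketched in plain Python. This is a minimal illustration and not the component's actual implementation: the pick_bindings function, the group layout and the logit values are all assumptions made for the example, and the real model works on batched tensors.

```python
# Hypothetical bindings, grouped so that values within a group are exclusive
groups = [
    [("_.event", "start"), ("_.event", "stop")],
    [("_.negated", False), ("_.negated", True)],
]


def pick_bindings(logits, groups):
    """Return the best-scoring binding of each group for one span.

    `logits` holds one score per binding, in the order the groups list them.
    """
    chosen = []
    offset = 0
    for group in groups:
        scores = logits[offset : offset + len(group)]
        # Index of the highest score within this group only
        best = max(range(len(group)), key=scores.__getitem__)
        chosen.append(group[best])
        offset += len(group)
    return chosen


# Made-up logits for one span: [start, stop, not negated, negated]
span_logits = [0.2, 1.5, -0.3, 2.1]
print(pick_bindings(span_logits, groups))
# -> [('_.event', 'stop'), ('_.negated', True)]
```

Because the comparison happens inside each group, a span receives exactly one value per group, and the choices made in different groups do not affect each other.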

Examples

To create a span qualifier component, you can use the following code:

import edsnlp, edsnlp.pipes as eds

nlp = edsnlp.blank("eds")
nlp.add_pipe(
    eds.span_qualifier(
        # To embed the spans, we will use a span pooler
        embedding=eds.span_pooler(
            pooling_mode="mean",  # mean pooling
            span_getter=["ents", "sc"],
            # that will use a transformer to embed the doc words
            embedding=eds.transformer(
                model="prajjwal1/bert-tiny",
                window=128,
                stride=96,
            ),
        ),
        # For every span embedded by the span pooler
        # (doc.ents and doc.spans["sc"]), we will predict both
        # span._.negation and span._.event_type
        qualifiers=["_.negation", "_.event_type"],
    ),
    name="qualifier",
)

To infer the possible values of each qualifier from annotated data, you can use the pipeline post_init method (gold_data below stands for your own annotated documents):

nlp.post_init(gold_data)

To train the model, refer to the Training tutorial.

You can inspect the bindings that will be used for training and prediction:

print(nlp.pipes.qualifier.bindings)
# list of (qualifier name, span labels or True if all, values)
# Out: [
#   ('_.negation', True, [True, False]),
#   ('_.event_type', True, ['start', 'stop'])
# ]

You can also change these values and update the bindings by calling the update_bindings method. Don't forget to retrain the model if new values are added!

Parameters

PARAMETER	DESCRIPTION
`nlp`	The pipeline object TYPE: `PipelineProtocol`
`name`	Name of the component TYPE: `str`
`embedding`	The span embedding component, such as eds.span_pooler TYPE: `SpanEmbeddingComponent`
`qualifiers`	The qualifiers to predict or train on. If a dict is given, keys are the qualifiers and values are the labels for which the qualifier is allowed, or True if the qualifier is allowed for all labels. TYPE: `QualifiersArg`
`keep_none`	If False (default), skip spans for which a qualifier returns None. If True, the None values will be learned and predicted, just as any other value. TYPE: `bool` DEFAULT: `False`
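
To make the dict form of `qualifiers` more concrete, here is a small sketch of how such a mapping can be read. The is_allowed helper and the "event" and "drug" span labels are hypothetical and only illustrate the rule described above; the component's own filtering logic may differ.

```python
# Dict form: qualifier -> allowed span labels, or True for all labels
qualifiers = {
    "_.negation": True,          # predicted for every span label
    "_.event_type": ["event"],   # "event" is a made-up span label
}


def is_allowed(qualifiers, qualifier, label):
    """Hypothetical helper: is `qualifier` predicted for spans labelled `label`?"""
    allowed = qualifiers.get(qualifier, [])
    # True means "all labels"; otherwise the label must be listed explicitly
    return allowed is True or label in allowed


print(is_allowed(qualifiers, "_.negation", "drug"))
print(is_allowed(qualifiers, "_.event_type", "drug"))
```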