edsnlp.utils.span_getters
SpanSetterArg
Valid values for the `span_setter` argument of a component are:
- a (doc, matches) -> None callable
- a span group name
- a list of span group names
- a dict of group name to True or list of labels
The group name "ents" is a special case: matches are added to `doc.ents`.
Examples
- `span_setter=["ents", "ckd"]` will add the matches to both `doc.ents` and `doc.spans["ckd"]`. It is equivalent to `{"ents": True, "ckd": True}`.
- `span_setter={"ents": ["foo", "bar"]}` will add the matches with label "foo" and "bar" to `doc.ents`.
- `span_setter="ents"` will add all matches only to `doc.ents`.
- `span_setter="ckd"` will add all matches only to `doc.spans["ckd"]`.
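The equivalence between the string, list and dict forms can be sketched as a small normalization step. The helper below, `normalize_span_arg`, is a hypothetical name written for illustration and is not taken from edsnlp; it only assumes the rules listed above (a name or a list of names selects every label of each group).

```python
def normalize_span_arg(value):
    """Hypothetical helper: expand a span_setter/span_getter value to its dict form."""
    if callable(value):
        # Callables are used as-is, there is nothing to normalize.
        return value
    if isinstance(value, str):
        # A single group name selects every label of that group.
        return {value: True}
    if isinstance(value, dict):
        # Already in dict form: group name -> True or list of labels.
        return dict(value)
    # Otherwise assume an iterable of group names.
    return {name: True for name in value}


print(normalize_span_arg(["ents", "ckd"]))
print(normalize_span_arg("ckd"))
```

Under these assumptions, `["ents", "ckd"]` and `{"ents": True, "ckd": True}` normalize to the same mapping.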
SpanGetterArg
Valid values for the `span_getter` argument of a component are:
- a (doc) -> spans callable
- a span group name
- a list of span group names
- a dict of group name to True or list of labels
The group name "ents" is a special case: matches are read from `doc.ents`.
Examples
- `span_getter=["ents", "ckd"]` will get the matches from both `doc.ents` and `doc.spans["ckd"]`. It is equivalent to `{"ents": True, "ckd": True}`.
- `span_getter={"ents": ["foo", "bar"]}` will get the matches with label "foo" and "bar" from `doc.ents`.
- `span_getter="ents"` will get all matches from `doc.ents`.
- `span_getter="ckd"` will get all matches from `doc.spans["ckd"]`.
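To make the label filtering concrete, here is a minimal sketch that applies a dict-form getter to a toy document. It does not use spaCy or edsnlp: the document is a plain dict, spans are `(start, end, label)` tuples, and `get_spans` is a hypothetical function, so this only illustrates the selection rules described above.

```python
def get_spans(doc, span_getter):
    """Hypothetical sketch: select spans from a toy doc with a dict-form getter.

    doc: {"ents": [...], "spans": {group_name: [...]}}
    each span is a (start, end, label) tuple.
    """
    selected = []
    for name, labels in span_getter.items():
        # "ents" is the special case and reads from doc["ents"].
        candidates = doc["ents"] if name == "ents" else doc["spans"].get(name, [])
        for span in candidates:
            # True keeps every label, a list keeps only the listed labels.
            if labels is True or span[2] in labels:
                selected.append(span)
    return selected


toy_doc = {
    "ents": [(0, 2, "foo"), (4, 5, "baz")],
    "spans": {"ckd": [(7, 9, "ckd")]},
}
print(get_spans(toy_doc, {"ents": ["foo", "bar"]}))
print(get_spans(toy_doc, {"ents": True, "ckd": True}))
```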
make_span_context_getter
Create a span context getter.
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| span_getter | Span getter, i.e. for which spans to get the context. |
| context_words | Minimum number of words to include on each side of the span. |
| context_sents | Minimum number of sentences to include on each side of the span. By default, 0 if the document has no sentence annotations, 1 otherwise. |
| overlap_policy | How to handle overlapping spans. |
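As a rough illustration of what `context_words` means, the sketch below widens a span by a fixed number of tokens on each side and clamps the result to the document bounds. This is one plausible reading, not the library's implementation: `context_window` is a hypothetical function, spans are plain half-open token offsets, and sentence-based context (`context_sents`) and `overlap_policy` are deliberately left out.

```python
def context_window(start, end, n_tokens, context_words):
    """Hypothetical sketch: widen a [start, end) token span by context_words.

    The window is clamped so that it never leaves the document.
    """
    window_start = max(0, start - context_words)
    window_end = min(n_tokens, end + context_words)
    return window_start, window_end


# A span covering tokens 5 and 6 in a 10-token document, with 3 words of context.
print(context_window(5, 7, n_tokens=10, context_words=3))
```

Since the parameter is documented as a minimum, an actual context may be wider than this window, for example when whole sentences are included.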
merge_spans
Merge overlapping spans into a single span.
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| spans | List of spans to merge. |
| doc | Document to merge the spans on. |
| RETURNS | DESCRIPTION |
|---|---|
List[Span] | Merged spans. |
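The general idea of merging overlapping spans can be sketched with plain half-open `(start, end)` offsets instead of spaCy `Span` objects. `merge_intervals` is a hypothetical stand-in, not the `merge_spans` function itself. It assumes that spans that merely touch (one ends where the next starts) are not merged; the documentation above does not say how the library treats that case.

```python
def merge_intervals(intervals):
    """Hypothetical sketch: merge overlapping half-open (start, end) intervals."""
    merged = []
    # Sorting by start lets each interval be compared to the last merged one only.
    for start, end in sorted(intervals):
        if merged and start < merged[-1][1]:
            # Overlap: extend the previous interval if this one reaches further.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


print(merge_intervals([(2, 5), (0, 3), (7, 9)]))
```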