edsnlp.pipelines.core.matcher
matcher
GenericMatcher
Bases: BaseComponent
Provides a generic matcher component.
PARAMETER | DESCRIPTION |
---|---|
nlp | The spaCy object. |
terms | A dictionary of terms. |
regex | A dictionary of regular expressions. |
attr | The default attribute to use for matching. Can be overridden using the … |
filter_matches | Whether to filter out matches. |
on_ents_only | Whether to look for matches around pre-extracted entities only. |
ignore_excluded | Whether to skip excluded tokens (requires an upstream pipeline to mark excluded tokens). |
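To make the `terms`, `regex` and `attr` parameters more concrete, here is a minimal, standard-library-only sketch of dictionary-driven matching. It is an illustration, not edsnlp's implementation: the function `find_matches`, the sample labels, and the assumption that each dictionary maps a label to a list of patterns are all hypothetical, and the `lowercase` flag merely stands in for matching on a normalised attribute instead of the raw text.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical inputs: each dictionary maps a label to a list of patterns.
terms: Dict[str, List[str]] = {"covid": ["covid", "coronavirus"]}
regex: Dict[str, List[str]] = {"dose": [r"\d+\s*mg"]}


def find_matches(text: str, lowercase: bool = False) -> List[Tuple[str, int, int]]:
    """Return (label, start, end) character offsets for every term or regex hit."""
    haystack = text.lower() if lowercase else text
    found = []
    for label, expressions in terms.items():
        for term in expressions:
            needle = term.lower() if lowercase else term
            # re.escape makes the term match literally, as a phrase would.
            for m in re.finditer(re.escape(needle), haystack):
                found.append((label, m.start(), m.end()))
    for label, patterns in regex.items():
        for pattern in patterns:
            # Regular expressions are applied to the raw text in this sketch.
            for m in re.finditer(pattern, text):
                found.append((label, m.start(), m.end()))
    return sorted(found, key=lambda item: item[1])


matches = find_matches("Patient with Coronavirus, given 500 mg.", lowercase=True)
print(matches)
```

With `lowercase=False`, the capitalised "Coronavirus" is no longer matched by the term list, which is the kind of difference the choice of matching attribute is meant to control.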
Source code in edsnlp/pipelines/core/matcher/matcher.py
nlp = nlp
instance-attribute
attr = attr
instance-attribute
phrase_matcher = EDSPhraseMatcher(self.nlp.vocab, attr=attr, ignore_excluded=ignore_excluded)
instance-attribute
regex_matcher = RegexMatcher(attr=attr, ignore_excluded=ignore_excluded)
instance-attribute
__init__(nlp, terms, regex, attr, ignore_excluded)
Source code in edsnlp/pipelines/core/matcher/matcher.py
process(doc)
Find matching spans in doc.
PARAMETER | DESCRIPTION |
---|---|
doc | spaCy Doc object. |

RETURNS | DESCRIPTION |
---|---|
spans | List of Spans returned by the matchers. |
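Because spans come from more than one matcher, candidates can overlap. The sketch below shows one common way to resolve overlaps, keeping the longest span first, which is similar in spirit to spaCy's `filter_spans` utility. Whether `GenericMatcher` applies this exact rule is an assumption; `filter_overlaps` and the `(label, start, end)` tuples are hypothetical stand-ins for real `Span` objects.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (label, start, end), end exclusive


def filter_overlaps(spans: List[Span]) -> List[Span]:
    """Keep the longest spans first and drop any span overlapping a kept one."""
    # Longest first; ties are broken by the earliest start offset.
    ranked = sorted(spans, key=lambda s: (-(s[2] - s[1]), s[1]))
    kept: List[Span] = []
    for label, start, end in ranked:
        # A candidate survives only if it is disjoint from every kept span.
        if all(end <= k_start or start >= k_end for _, k_start, k_end in kept):
            kept.append((label, start, end))
    # Return the survivors in document order.
    return sorted(kept, key=lambda s: s[1])


candidates = [("disease", 0, 2), ("disease", 1, 3), ("drug", 5, 6), ("disease", 0, 3)]
filtered = filter_overlaps(candidates)
print(filtered)
```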
Source code in edsnlp/pipelines/core/matcher/matcher.py
__call__(doc)
Adds spans to document.
PARAMETER | DESCRIPTION |
---|---|
doc | spaCy Doc object. |

RETURNS | DESCRIPTION |
---|---|
doc | spaCy Doc object, annotated for extracted terms. |
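The division of labour between `process` (find spans) and `__call__` (annotate and return the document) can be illustrated with a toy component. This is only a sketch of the pattern: `ToyDoc` and `ToyMatcher` are hypothetical stand-ins that use plain character offsets, not spaCy's `Doc` or the actual `GenericMatcher`.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Span = Tuple[str, int, int]  # (label, start, end), end exclusive


@dataclass
class ToyDoc:
    """Hypothetical stand-in for a spaCy Doc: raw text plus an entity list."""

    text: str
    ents: List[Span] = field(default_factory=list)


class ToyMatcher:
    """Toy component: `process` finds spans, `__call__` attaches them."""

    def __init__(self, term: str, label: str) -> None:
        self.term = term
        self.label = label

    def process(self, doc: ToyDoc) -> List[Span]:
        # Pure lookup: returns spans without modifying the document.
        spans = []
        start = doc.text.find(self.term)
        while start != -1:
            spans.append((self.label, start, start + len(self.term)))
            start = doc.text.find(self.term, start + 1)
        return spans

    def __call__(self, doc: ToyDoc) -> ToyDoc:
        # Annotate the document and return it so components can be chained.
        doc.ents = self.process(doc)
        return doc


doc = ToyMatcher("fever", "symptom")(ToyDoc("fever, then more fever"))
print(doc.ents)
```

Returning the document from `__call__` is what lets a component sit in a pipeline, where each step receives the output of the previous one.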
Source code in edsnlp/pipelines/core/matcher/matcher.py
factory
DEFAULT_CONFIG = dict(terms=None, regex=None, attr='TEXT', ignore_excluded=False)
module-attribute
create_component(nlp, name, terms, attr, regex, ignore_excluded)
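A factory's default configuration is typically merged with whatever the user supplies before the component is built. The following standard-library sketch mimics that merge for the `DEFAULT_CONFIG` shown above. The helper `resolve_config`, the rejection of unknown keys, and the replacement of `None` by an empty dictionary are assumptions made for illustration, not a description of what `create_component` does.

```python
from typing import Any, Dict, Optional

DEFAULT_CONFIG = dict(terms=None, regex=None, attr="TEXT", ignore_excluded=False)


def resolve_config(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge user overrides into the defaults and normalise empty dictionaries."""
    config = {**DEFAULT_CONFIG, **(overrides or {})}
    unknown = set(config) - set(DEFAULT_CONFIG)
    if unknown:
        raise KeyError(f"Unknown configuration keys: {sorted(unknown)}")
    # Assumption: a missing dictionary means "no patterns of that kind".
    config["terms"] = config["terms"] or {}
    config["regex"] = config["regex"] or {}
    return config


config = resolve_config({"terms": {"covid": ["covid"]}, "attr": "LOWER"})
print(config)
```

Keys that are not overridden, such as `ignore_excluded` here, keep the values from `DEFAULT_CONFIG`.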
Source code in edsnlp/pipelines/core/matcher/factory.py