edsnlp.pipelines.core.matcher.factory
create_component(nlp, name='eds.matcher', terms=None, attr=None, regex='TEXT', ignore_excluded=False, ignore_space_tokens=False, term_matcher=GenericTermMatcher.exact, term_matcher_config={})
Provides a generic matcher component.
PARAMETER | DESCRIPTION |
---|---|
nlp | The spaCy pipeline object. |
name | The name of the component. |
terms | A dictionary of terms. |
regex | A dictionary of regular expressions. |
attr | The default attribute to use for matching. Can be overridden using the |
ignore_excluded | Whether to skip excluded tokens (requires an upstream pipeline to mark excluded tokens). |
ignore_space_tokens | Whether to skip space tokens during matching. You won't be able to match on newlines if this is enabled and the "spaces"/"newline" option of |
term_matcher | The matcher to use for matching phrases. One of `exact` or `simstring`. |
term_matcher_config | Parameters of the matcher class. |
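As a rough illustration of how these parameters fit together, the sketch below assembles a configuration dictionary and passes it to `nlp.add_pipe`. The helper names `build_matcher_config` and `build_pipeline`, the labels, the patterns and the choice of `attr="LOWER"` are all assumptions made for this example; only the parameter names come from the signature above.

```python
from typing import Any, Dict, List


def build_matcher_config(
    terms: Dict[str, List[str]],
    regex: Dict[str, List[str]],
    attr: str = "LOWER",
) -> Dict[str, Any]:
    """Assemble the config dict for the "eds.matcher" factory (hypothetical helper)."""
    return {
        "terms": terms,  # label -> list of phrases
        "regex": regex,  # label -> list of regular expressions
        "attr": attr,  # token attribute used for matching
        "ignore_excluded": False,
        "ignore_space_tokens": False,
    }


# Illustrative labels and patterns, not taken from the library.
config = build_matcher_config(
    terms={"covid": ["coronavirus", "covid19"]},
    regex={"diabetes": [r"diab[eè]te"]},
)


def build_pipeline(matcher_config: Dict[str, Any]):
    """Create a blank French pipeline with the matcher added.

    Requires spaCy and EDS-NLP to be installed; spaCy is imported lazily
    so that the configuration above can be built without it.
    """
    import spacy

    nlp = spacy.blank("fr")
    nlp.add_pipe("eds.matcher", config=matcher_config)
    return nlp
```

With both libraries installed, `build_pipeline(config)` should return a pipeline that can be called on a text; the matched spans would then typically be read from `doc.ents`.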
Source code in edsnlp/pipelines/core/matcher/factory.py