edsnlp.pipelines.ner.cim10.factory

create_component(nlp, name='eds.cim10', attr='NORM', ignore_excluded=False, ignore_space_tokens=False, term_matcher=TerminologyTermMatcher.exact, term_matcher_config={})

Create a factory that returns a new component to recognize and normalize CIM10 codes and concepts in documents.

PARAMETER DESCRIPTION
nlp

spaCy Language object.

TYPE: Language

name

The name of the pipe.

TYPE: str DEFAULT: 'eds.cim10'

attr

Attribute to match on, e.g. TEXT, NORM, etc.

TYPE: Union[str, Dict[str, str]] DEFAULT: 'NORM'

ignore_excluded

Whether to skip excluded tokens during matching.

TYPE: bool DEFAULT: False

ignore_space_tokens

Whether to skip space tokens during matching.

TYPE: bool DEFAULT: False

term_matcher

The term matcher to use: either TerminologyTermMatcher.exact or TerminologyTermMatcher.simstring.

TYPE: TerminologyTermMatcher DEFAULT: TerminologyTermMatcher.exact

term_matcher_config

The configuration for the term matcher.

TYPE: Dict[str, Any] DEFAULT: {}

RETURNS DESCRIPTION
TerminologyMatcher
Source code in edsnlp/pipelines/ner/cim10/factory.py
@Language.factory(
    "eds.cim10", default_config=DEFAULT_CONFIG, assigns=["doc.ents", "doc.spans"]
)
def create_component(
    nlp: Language,
    name: str = "eds.cim10",
    attr: Union[str, Dict[str, str]] = "NORM",
    ignore_excluded: bool = False,
    ignore_space_tokens: bool = False,
    term_matcher: TerminologyTermMatcher = TerminologyTermMatcher.exact,
    term_matcher_config: Dict[str, Any] = {},
):
    """
    Create a factory that returns a new component to recognize and normalize CIM10
    codes and concepts in documents.

    Parameters
    ----------
    nlp: Language
        spaCy `Language` object.
    name: str
        The name of the pipe
    attr: Union[str, Dict[str, str]]
        Attribute to match on, e.g. `TEXT`, `NORM`, etc.
    ignore_excluded: bool
        Whether to skip excluded tokens during matching.
    ignore_space_tokens: bool
        Whether to skip space tokens during matching.
    term_matcher: TerminologyTermMatcher
        The term matcher to use, either `TerminologyTermMatcher.exact` or
        `TerminologyTermMatcher.simstring`
    term_matcher_config: Dict[str, Any]
        The configuration for the term matcher

    Returns
    -------
    TerminologyMatcher
    """

    return TerminologyMatcher(
        nlp,
        label="cim10",
        regex=None,
        terms=patterns.get_patterns(),
        attr=attr,
        ignore_excluded=ignore_excluded,
        ignore_space_tokens=ignore_space_tokens,
        term_matcher=term_matcher,
        term_matcher_config=term_matcher_config,
    )
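
To make the `attr` and `ignore_space_tokens` parameters more concrete, here is a minimal pure-Python sketch of exact terminology matching on a normalized form of each token. It is a toy illustration only, not the `TerminologyMatcher` implementation: `normalize`, `match_terms`, the two-entry terminology, and the sample tokens are all hypothetical names and data introduced for this example, and lowercasing is only a rough stand-in for a real `NORM` attribute.

```python
from typing import Dict, List, Tuple


def normalize(token: str) -> str:
    """Rough stand-in for a NORM attribute (assumption: lowercase only)."""
    return token.lower()


def match_terms(
    tokens: List[str],
    terminology: Dict[str, List[str]],
    ignore_space_tokens: bool = False,
) -> List[Tuple[int, int, str]]:
    """Return sorted (start, end, code) matches; `end` is exclusive.

    Indices refer to positions in the original token list, even when
    whitespace tokens are skipped during matching.
    """
    # Keep (original index, normalized token) pairs, optionally
    # dropping whitespace-only tokens before matching.
    kept = [
        (i, normalize(t))
        for i, t in enumerate(tokens)
        if not (ignore_space_tokens and t.strip() == "")
    ]
    norms = [n for _, n in kept]

    matches = []
    for code, synonyms in terminology.items():
        for synonym in synonyms:
            target = [normalize(w) for w in synonym.split()]
            size = len(target)
            # Slide a window over the kept tokens and compare exactly.
            for start in range(len(norms) - size + 1):
                if norms[start:start + size] == target:
                    first = kept[start][0]
                    last = kept[start + size - 1][0]
                    matches.append((first, last + 1, code))
    return sorted(matches)


# Toy terminology: concept code -> list of synonyms (illustrative entries).
terminology = {
    "I10": ["hypertension essentielle"],
    "J45": ["asthme"],
}

# A pre-tokenized text containing a whitespace token.
tokens = ["Hypertension", " ", "essentielle", "et", "asthme"]

strict = match_terms(tokens, terminology)
lenient = match_terms(tokens, terminology, ignore_space_tokens=True)
```

In this sketch the whitespace token breaks the two-word synonym when `ignore_space_tokens` is left at `False`, so only the single-word term is found; with `ignore_space_tokens=True` both terms match, and the first span still covers the skipped whitespace token in the original indexing.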