edsnlp.pipelines.ner.drugs.factory

create_component(nlp, name='eds.drugs', attr='NORM', ignore_excluded=False, ignore_space_tokens=False, term_matcher=TerminologyTermMatcher.exact, term_matcher_config={})

Creates a component that recognizes drug mentions in documents and normalizes them. The terminology is based on Romedi (see documentation), and each drug is normalized to its ATC code.

PARAMETER DESCRIPTION
nlp

spaCy Language object.

TYPE: Language

name

The name of the pipe.

TYPE: str DEFAULT: 'eds.drugs'

attr

Attribute to match on, e.g. TEXT, NORM, etc.

TYPE: str DEFAULT: 'NORM'

ignore_excluded

Whether to skip excluded tokens during matching.

TYPE: bool DEFAULT: False

ignore_space_tokens

Whether to skip space tokens during matching.

TYPE: bool DEFAULT: False

term_matcher

The term matcher to use, either TerminologyTermMatcher.exact or TerminologyTermMatcher.simstring.

TYPE: TerminologyTermMatcher DEFAULT: TerminologyTermMatcher.exact
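
To make the difference between the two matcher options more concrete, here is a minimal standard-library sketch. It assumes that simstring-style matching means similarity over character n-grams; it is not EDS-NLP's implementation. The `TERMS` dictionary, the helper names, the Dice measure and the 0.6 threshold are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy terminology: ATC code -> known surface forms.
# This is NOT the Romedi terminology shipped with the component.
TERMS: Dict[str, List[str]] = {
    "A10BA02": ["metformine", "glucophage"],
    "C03CA01": ["furosemide", "lasilix"],
}


def trigrams(word: str) -> Set[str]:
    """Return the set of character trigrams of a padded, lower-cased word."""
    padded = f"  {word.lower()} "
    return {padded[i : i + 3] for i in range(len(padded) - 2)}


def dice(a: Set[str], b: Set[str]) -> float:
    """Dice coefficient between two trigram sets."""
    return 2 * len(a & b) / (len(a) + len(b))


def match_exact(token: str) -> List[str]:
    """Exact lookup: the lower-cased token must equal a known term."""
    return [code for code, terms in TERMS.items() if token.lower() in terms]


def match_fuzzy(token: str, threshold: float = 0.6) -> List[Tuple[str, str]]:
    """Approximate lookup: keep terms whose trigram similarity reaches the threshold."""
    token_grams = trigrams(token)
    hits = []
    for code, terms in TERMS.items():
        for term in terms:
            if dice(token_grams, trigrams(term)) >= threshold:
                hits.append((code, term))
    return hits


if __name__ == "__main__":
    print(match_exact("Glucophage"))  # exact spelling, different case
    print(match_exact("glucofage"))   # misspelling: no exact hit
    print(match_fuzzy("glucofage"))   # misspelling: recovered by similarity
```

In this sketch the exact lookup misses the misspelled "glucofage", while the similarity-based lookup still maps it to the code of "glucophage". The trade-off is that a lower threshold accepts more spelling variants but also more false positives.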

term_matcher_config

The configuration for the term matcher.

TYPE: Dict[str, Any] DEFAULT: {}

RETURNS DESCRIPTION
TerminologyMatcher
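
As a usage sketch, the factory is normally reached through `nlp.add_pipe` rather than called directly. The example below assumes that spaCy and EDS-NLP are installed, that a blank French pipeline is a suitable base, and that `eds.normalizer` is added first so that the `NORM` attribute is populated. The `detect_drugs` helper is a hypothetical name introduced here, and the sample sentence is illustrative only.

```python
from typing import List, Tuple


def detect_drugs(text: str) -> List[Tuple[str, str]]:
    """Return (matched text, kb_id_) pairs for the entities found in `text`."""
    # Imported lazily so this sketch can be loaded without spaCy/EDS-NLP installed.
    import spacy

    # Assumption: a blank French pipeline is an adequate base for the component.
    nlp = spacy.blank("fr")
    # Assumption: the normalizer is needed for matching on attr="NORM".
    nlp.add_pipe("eds.normalizer")
    # Keyword arguments of create_component are passed through `config`.
    nlp.add_pipe("eds.drugs", config=dict(term_matcher="exact"))

    doc = nlp(text)
    # The factory declares that it assigns doc.ents; the normalized code is
    # assumed to be exposed as the entity's kb_id_.
    return [(ent.text, ent.kb_id_) for ent in doc.ents]


if __name__ == "__main__":
    for mention, code in detect_drugs("Traitement habituel: Kardégic, glucophage"):
        print(mention, code)
```

Building the pipeline inside the function keeps the sketch short; in real use the `nlp` object would be created once and reused, since loading the terminology is likely the expensive step.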
Source code in edsnlp/pipelines/ner/drugs/factory.py
@Language.factory(
    "eds.drugs",
    default_config=DEFAULT_CONFIG,
    assigns=["doc.ents", "doc.spans"],
)
def create_component(
    nlp: Language,
    name: str = "eds.drugs",
    attr: str = "NORM",
    ignore_excluded: bool = False,
    ignore_space_tokens: bool = False,
    term_matcher: TerminologyTermMatcher = TerminologyTermMatcher.exact,
    term_matcher_config: Dict[str, Any] = {},
):
    """
    Create a new component to recognize and normalize drugs in documents.
    The terminology is based on Romedi (see documentation) and the
    drugs are normalized to the ATC codes.

    Parameters
    ----------
    nlp: Language
        spaCy `Language` object.
    name: str
        The name of the pipe.
    attr: str
        Attribute to match on, e.g. `TEXT`, `NORM`, etc.
    ignore_excluded: bool
        Whether to skip excluded tokens during matching.
    ignore_space_tokens: bool
        Whether to skip space tokens during matching.
    term_matcher: TerminologyTermMatcher
        The term matcher to use, either `TerminologyTermMatcher.exact` or
        `TerminologyTermMatcher.simstring`.
    term_matcher_config: Dict[str, Any]
        The configuration for the term matcher.

    Returns
    -------
    TerminologyMatcher
    """
    return TerminologyMatcher(
        nlp,
        label="drug",
        terms=patterns.get_patterns(),
        regex=dict(),
        attr=attr,
        ignore_excluded=ignore_excluded,
        ignore_space_tokens=ignore_space_tokens,
        term_matcher=term_matcher,
        term_matcher_config=term_matcher_config,
    )