
edsnlp.pipelines.ner.scores.sofa.factory

create_component(nlp, name, regex=patterns.regex, value_extract=patterns.value_extract, score_normalization=patterns.score_normalization_str, attr='NORM', window=10, ignore_excluded=False, ignore_space_tokens=False, flags=0)

Matcher component to extract the SOFA score

PARAMETER DESCRIPTION
nlp

The spaCy object.

TYPE: Language

name

The name of the extracted score

TYPE: str

regex

A list of regexes to identify the SOFA score

TYPE: List[str] DEFAULT: patterns.regex

attr

Whether to match on the text ('TEXT') or on the normalized text ('NORM', the default, or 'CUSTOM_NORM')

TYPE: str DEFAULT: 'NORM'

value_extract

Regex to extract the score value

TYPE: List[Dict[str, str]] DEFAULT: patterns.value_extract

score_normalization

Function that takes the "raw" value extracted by the value_extract regex and returns:

- None if no score could be extracted
- the desired score value otherwise

TYPE: Union[str, Callable[[Union[str, None]], Any]] DEFAULT: patterns.score_normalization_str

window

Number of tokens to include after the score's mention when searching for the score's value

TYPE: int DEFAULT: 10

ignore_excluded

Whether to ignore excluded spans

TYPE: bool DEFAULT: False

ignore_space_tokens

Whether to ignore space tokens

TYPE: bool DEFAULT: False

flags

Flags to pass to the regex

TYPE: Union[re.RegexFlag, int] DEFAULT: 0
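
To make the interplay between regex, window and flags concrete, here is a minimal standalone sketch in plain Python. It is not the edsnlp implementation: the function name extract_raw_value, the whitespace tokenization, and both default patterns are assumptions chosen for illustration, whereas the real component operates on spaCy tokens and on the attribute selected by attr.

```python
import re
from typing import Optional


def extract_raw_value(
    text: str,
    trigger: str = r"\bsofa\b",   # hypothetical pattern locating the score mention
    value: str = r"\b\d+\b",      # hypothetical pattern locating the score value
    window: int = 10,
    flags: int = re.IGNORECASE,
) -> Optional[str]:
    """Return the raw value found within `window` tokens after the first mention."""
    tokens = text.split()  # assumption: naive whitespace tokenization
    trigger_re = re.compile(trigger, flags)
    value_re = re.compile(value, flags)

    for i, token in enumerate(tokens):
        if trigger_re.search(token):
            # Only the `window` tokens following the mention are searched.
            following = " ".join(tokens[i + 1 : i + 1 + window])
            match = value_re.search(following)
            return match.group(0) if match else None
    return None  # no mention of the score at all


text = "SOFA score at admission : 14"
print(extract_raw_value(text))             # '14'
print(extract_raw_value(text, window=3))   # None: the value lies outside the window
print(extract_raw_value(text, flags=0))    # None: case-sensitive trigger misses 'SOFA'
```

In this sketch a smaller window makes extraction stricter, and the flags argument is forwarded unchanged to both patterns, which mirrors the roles the parameter descriptions above assign to window and flags.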

Source code in edsnlp/pipelines/ner/scores/sofa/factory.py
@deprecated_factory(
    "SOFA",
    "eds.SOFA",
    default_config=DEFAULT_CONFIG,
    assigns=["doc.ents", "doc.spans"],
)
@Language.factory(
    "eds.SOFA",
    default_config=DEFAULT_CONFIG,
    assigns=["doc.ents", "doc.spans"],
)
def create_component(
    nlp: Language,
    name: str,
    regex: List[str] = patterns.regex,
    value_extract: List[Dict[str, str]] = patterns.value_extract,
    score_normalization: Union[
        str, Callable[[Union[str, None]], Any]
    ] = patterns.score_normalization_str,
    attr: str = "NORM",
    window: int = 10,
    ignore_excluded: bool = False,
    ignore_space_tokens: bool = False,
    flags: Union[re.RegexFlag, int] = 0,
):
    """
    Matcher component to extract the SOFA score

    Parameters
    ----------
    nlp : Language
        The spaCy object.
    name : str
        The name of the extracted score
    regex : List[str]
        A list of regexes to identify the SOFA score
    attr : str
        Whether to match on the text ('TEXT') or on the normalized text
        ('NORM', the default, or 'CUSTOM_NORM')
    value_extract : List[Dict[str, str]]
        Regex to extract the score value
    score_normalization : Union[str, Callable[[Union[str,None]], Any]]
        Function that takes the "raw" value extracted from the `value_extract` regex,
        and should return
        - None if no score could be extracted
        - The desired score value otherwise
    window : int
        Number of tokens to include after the score's mention to find the
        score's value
    ignore_excluded : bool
        Whether to ignore excluded spans
    ignore_space_tokens : bool
        Whether to ignore space tokens
    flags : Union[re.RegexFlag, int]
        Flags to pass to the regex
    """
    return Sofa(
        nlp,
        score_name=name,
        regex=regex,
        value_extract=value_extract,
        score_normalization=score_normalization,
        attr=attr,
        window=window,
        ignore_excluded=ignore_excluded,
        ignore_space_tokens=ignore_space_tokens,
        flags=flags,
    )
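
The contract described for score_normalization (take the raw extracted string, return None when no score can be derived, return the score otherwise) can be illustrated with a small callable. This is a hedged sketch, not the library's default normalizer: the name normalize_score is hypothetical, and the 0 to 24 bounds reflect the usual clinical range of the SOFA scale, which is an assumption here and not necessarily the range edsnlp applies.

```python
from typing import Optional


def normalize_score(raw: Optional[str], low: int = 0, high: int = 24) -> Optional[int]:
    """Turn a raw extracted string into a validated integer score, or None."""
    if raw is None:
        return None  # nothing was extracted
    try:
        value = int(raw)
    except ValueError:
        return None  # the extracted text is not an integer
    # Assumed bounds: reject values outside [low, high].
    return value if low <= value <= high else None


print(normalize_score("12"))   # 12
print(normalize_score("99"))   # None: outside the assumed range
print(normalize_score(None))   # None: no raw value
```

A callable with this shape matches the Callable[[Union[str, None]], Any] part of the annotation in the signature above; the default value, patterns.score_normalization_str, is a string instead.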