edsnlp.pipelines.ner.scores
base_score
Score
Bases: AdvancedRegex
Matcher component to extract a numeric score
PARAMETER | DESCRIPTION |
---|---|
nlp | The spaCy object. |
score_name | The name of the extracted score. |
regex | A list of regexes to identify the score. |
attr | Whether to match on the text ('TEXT') or on the normalized text ('NORM'). |
after_extract | Regex with a capturing group to get the score value. |
score_normalization | Function (resolved with registry.get('misc', score_normalization)) that takes the "raw" extracted value and returns the normalized score. |
window | Number of tokens to include after the score's mention to find the score's value. |
Source code in edsnlp/pipelines/ner/scores/base_score.py
score_name = score_name
instance-attribute
score_normalization = registry.get('misc', score_normalization)
instance-attribute
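The attribute above shows that score_normalization is resolved by name from a 'misc' registry. The sketch below imitates that lookup with a plain dictionary. `MISC`, `register` and `get` are hypothetical stand-ins, not the registry the library uses, and the body of the normalizer is an assumption based on the docstrings on this page ("returns the integer value").

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for a "misc" registry: maps names to functions.
MISC: Dict[str, Callable[[Optional[str]], Optional[int]]] = {}


def register(name: str):
    """Decorator that stores a function under `name`."""

    def decorator(func):
        MISC[name] = func
        return func

    return decorator


def get(name: str):
    """Look a function up by name, failing loudly on unknown names."""
    try:
        return MISC[name]
    except KeyError:
        raise KeyError(f"no normalization function registered as {name!r}") from None


@register("score_normalization.charlson")
def charlson_normalization(extracted_score: Optional[str]) -> Optional[int]:
    # Assumed behaviour: integer value when a digit string was captured, else None.
    if extracted_score is not None and extracted_score.isdecimal():
        return int(extracted_score)
    return None


# The component only needs the name; the function is fetched at construction time.
normalize = get("score_normalization.charlson")
```

Passing a name rather than a function keeps the component configuration serializable, which is presumably why the factories on this page expose a score_normalization_str.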
__init__(nlp, score_name, regex, attr, after_extract, score_normalization, window, verbose, ignore_excluded)
Source code in edsnlp/pipelines/ner/scores/base_score.py
set_extensions()
Source code in edsnlp/pipelines/ner/scores/base_score.py
__call__(doc)
Adds spans to document.
PARAMETER | DESCRIPTION |
---|---|
doc | spaCy Doc object. |
RETURNS | DESCRIPTION |
---|---|
doc | spaCy Doc object, annotated for extracted terms. |
Source code in edsnlp/pipelines/ner/scores/base_score.py
score_filtering(ents)
Extracts, if available, the value of the score.
Normalizes the score via the provided self.score_normalization
method.
PARAMETER | DESCRIPTION |
---|---|
ents | List of spaCy's spans extracted by the score matcher. |
RETURNS | DESCRIPTION |
---|---|
ents | List of spaCy's spans, with the extracted score value added when one is found. |
Source code in edsnlp/pipelines/ner/scores/base_score.py
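As a rough illustration of how regex, window, after_extract and score_normalization fit together, here is a minimal sketch in plain Python. It is not the library's code: it splits on whitespace instead of using spaCy tokens, and `extract_score` and `to_int` are hypothetical names introduced for this example.

```python
import re
from typing import Callable, Optional


def to_int(raw: Optional[str]) -> Optional[int]:
    """Hypothetical normalizer: integer value if one was captured, else None."""
    return int(raw) if raw is not None else None


def extract_score(
    text: str,
    mention_regex: str,
    after_extract: str,
    normalize: Callable[[Optional[str]], Optional[int]] = to_int,
    window: int = 7,
) -> Optional[int]:
    """Find a score mention, then look for its value in the next `window` tokens."""
    mention = re.search(mention_regex, text)
    if mention is None:
        return None
    # Keep the mention plus at most `window` whitespace-separated tokens after it.
    following = text[mention.end():].split()[:window]
    snippet = " ".join([mention.group(0), *following])
    # The first capturing group of `after_extract` holds the raw value.
    found = re.search(after_extract, snippet)
    return normalize(found.group(1) if found else None)


# Pattern taken from the charlson `patterns` module on this page.
CHARLSON_AFTER_EXTRACT = r"charlson.*?[\n\W]*?(\d+)"
score = extract_score(
    "charlson a l'admission : 3 points", r"charlson", CHARLSON_AFTER_EXTRACT
)
```

With a smaller window the value can fall outside the snippet, in which case the sketch returns None instead of a score.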
factory
DEFAULT_CONFIG = dict(attr='NORM', window=7, verbose=0, ignore_excluded=False)
module-attribute
create_component(nlp, name, score_name, regex, after_extract, score_normalization, attr, window, verbose, ignore_excluded)
Source code in edsnlp/pipelines/ner/scores/factory.py
sofa
patterns
regex = ['\\bsofa\\b']
module-attribute
method_regex = 'sofa.*?((?P<max>max\\w*)|(?P<vqheures>24h\\w*)|(?P<admission>admission\\w*))(?P<after_value>(.|\\n)*)'
module-attribute
value_regex = '.*?.[\\n\\W]*?(\\d+)[^h\\d]'
module-attribute
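To make the two SOFA patterns easier to read, the sketch below applies them with the standard `re` module. The patterns are copied from the attributes above (written as raw strings); `parse_sofa` and the way the two regexes are chained are assumptions made for illustration, not the component's implementation.

```python
import re
from typing import Optional, Tuple

METHOD_REGEX = (
    r"sofa.*?((?P<max>max\w*)|(?P<vqheures>24h\w*)|(?P<admission>admission\w*))"
    r"(?P<after_value>(.|\n)*)"
)
VALUE_REGEX = r".*?.[\n\W]*?(\d+)[^h\d]"


def parse_sofa(snippet: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (method group name, raw value) for a lowercased snippet."""
    method, rest = None, snippet
    matched = re.search(METHOD_REGEX, snippet)
    if matched is not None:
        groups = matched.groupdict()
        # Everything after the method keyword is where the value is searched.
        rest = groups["after_value"]
        # Exactly one of the three named alternatives matched.
        method = next(
            name for name in ("max", "vqheures", "admission") if groups[name] is not None
        )
    value = re.match(VALUE_REGEX, rest)
    return method, (value.group(1) if value else None)
```

Note that value_regex ends with `[^h\d]`, so the digits must be followed by a character that is neither a digit nor "h". This keeps "24h" from being read as a score of 24, and it also means a value at the very end of the string is not captured.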
score_normalization_str = 'score_normalization.sofa'
module-attribute
score_normalization(extracted_score)
Sofa score normalization. If available, returns the integer value of the SOFA score.
Source code in edsnlp/pipelines/ner/scores/sofa/patterns.py
sofa
Sofa
Bases: Score
Matcher component to extract the SOFA score
PARAMETER | DESCRIPTION |
---|---|
nlp | The spaCy object. |
score_name | The name of the extracted score. |
regex | A list of regexes to identify the SOFA score. |
attr | Whether to match on the text ('TEXT') or on the normalized text ('CUSTOM_NORM'). |
method_regex | Regex with capturing groups to get the score extraction method (e.g. "à l'admission", "à 24H", "Maximum"). |
value_regex | Regex to extract the score value. |
score_normalization | Function that takes the "raw" extracted value and returns the normalized score. |
window | Number of tokens to include after the score's mention to find the score's value. |
Source code in edsnlp/pipelines/ner/scores/sofa/sofa.py
method_regex = method_regex
instance-attribute
value_regex = value_regex
instance-attribute
__init__(nlp, score_name, regex, attr, method_regex, value_regex, score_normalization, window, verbose, ignore_excluded)
Source code in edsnlp/pipelines/ner/scores/sofa/sofa.py
set_extensions()
Source code in edsnlp/pipelines/ner/scores/sofa/sofa.py
score_filtering(ents)
Extracts, if available, the value of the score.
Normalizes the score via the provided self.score_normalization
method.
PARAMETER | DESCRIPTION |
---|---|
ents | List of spaCy's spans extracted by the score matcher. |
RETURNS | DESCRIPTION |
---|---|
ents | List of spaCy's spans, with the extracted score value added when one is found. |
Source code in edsnlp/pipelines/ner/scores/sofa/sofa.py
factory
DEFAULT_CONFIG = dict(regex=patterns.regex, method_regex=patterns.method_regex, value_regex=patterns.value_regex, score_normalization=patterns.score_normalization_str, attr='NORM', window=20, verbose=0, ignore_excluded=False)
module-attribute
create_component(nlp, name, regex, method_regex, value_regex, score_normalization, attr, window, verbose, ignore_excluded)
Source code in edsnlp/pipelines/ner/scores/sofa/factory.py
charlson
patterns
regex = ['charlson']
module-attribute
after_extract = 'charlson.*?[\\n\\W]*?(\\d+)'
module-attribute
score_normalization_str = 'score_normalization.charlson'
module-attribute
score_normalization(extracted_score)
Charlson score normalization. If available, returns the integer value of the Charlson score.
Source code in edsnlp/pipelines/ner/scores/charlson/patterns.py
factory
DEFAULT_CONFIG = dict(regex=patterns.regex, after_extract=patterns.after_extract, score_normalization=patterns.score_normalization_str, attr='NORM', window=7, verbose=0, ignore_excluded=False)
module-attribute
create_component(nlp, name, regex, after_extract, score_normalization, attr, window, verbose, ignore_excluded)
Source code in edsnlp/pipelines/ner/scores/charlson/factory.py
emergency
gemsa
patterns
regex = ['\\bgemsa\\b']
module-attribute
after_extract = 'gemsa.*?[\\n\\W]*?(\\d+)'
module-attribute
score_normalization_str = 'score_normalization.gemsa'
module-attribute
score_normalization(extracted_score)
GEMSA score normalization. If available, returns the integer value of the GEMSA score.
Source code in edsnlp/pipelines/ner/scores/emergency/gemsa/patterns.py
factory
DEFAULT_CONFIG = dict(regex=patterns.regex, after_extract=patterns.after_extract, score_normalization=patterns.score_normalization_str, attr='NORM', window=20, verbose=0, ignore_excluded=False)
module-attribute
create_component(nlp, name, regex, after_extract, score_normalization, attr, window, verbose, ignore_excluded)
Source code in edsnlp/pipelines/ner/scores/emergency/gemsa/factory.py
priority
patterns
regex = ['\\bpriorite\\b']
module-attribute
after_extract = 'priorite.*?[\\n\\W]*?(\\d+)'
module-attribute
score_normalization_str = 'score_normalization.priority'
module-attribute
score_normalization(extracted_score)
Priority score normalization. If available, returns the integer value of the priority score.
Source code in edsnlp/pipelines/ner/scores/emergency/priority/patterns.py
factory
DEFAULT_CONFIG = dict(regex=patterns.regex, after_extract=patterns.after_extract, score_normalization=patterns.score_normalization_str, attr='NORM', window=7, verbose=0, ignore_excluded=False)
module-attribute
create_component(nlp, name, regex, after_extract, score_normalization, attr, window, verbose, ignore_excluded)
Source code in edsnlp/pipelines/ner/scores/emergency/priority/factory.py
ccmu
patterns
regex = ['\\bccmu\\b']
module-attribute
after_extract = 'ccmu.*?[\\n\\W]*?(\\d+)'
module-attribute
score_normalization_str = 'score_normalization.ccmu'
module-attribute
score_normalization(extracted_score)
CCMU score normalization. If available, returns the integer value of the CCMU score.
Source code in edsnlp/pipelines/ner/scores/emergency/ccmu/patterns.py
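The three emergency scores (gemsa, priority, ccmu) share the same pattern shape: a word-bounded mention regex and an after_extract regex whose first group captures the digits. The sketch below runs all three with the standard `re` module. `EMERGENCY_PATTERNS` and `find_emergency_scores` are hypothetical names, the input is assumed to be already lowercased and accent-stripped (as the default attr='NORM' suggests), and no window is applied.

```python
import re
from typing import Dict

# Patterns copied from the three `patterns` modules above (written as raw strings).
EMERGENCY_PATTERNS = {
    "gemsa": (r"\bgemsa\b", r"gemsa.*?[\n\W]*?(\d+)"),
    "priority": (r"\bpriorite\b", r"priorite.*?[\n\W]*?(\d+)"),
    "ccmu": (r"\bccmu\b", r"ccmu.*?[\n\W]*?(\d+)"),
}


def find_emergency_scores(text: str) -> Dict[str, int]:
    """Return {score_name: value} for every score mentioned with a value in `text`."""
    found = {}
    for name, (mention, after_extract) in EMERGENCY_PATTERNS.items():
        # The mention regex requires word boundaries on both sides.
        if re.search(mention, text) is None:
            continue
        value = re.search(after_extract, text)
        if value is not None:
            found[name] = int(value.group(1))
    return found
```

Because of the `\b` anchors, a mention glued to its value (for example "priorite3") is not detected by this sketch.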
factory
DEFAULT_CONFIG = dict(regex=patterns.regex, after_extract=patterns.after_extract, score_normalization=patterns.score_normalization_str, attr='NORM', window=20, verbose=0, ignore_excluded=False)
module-attribute
create_component(nlp, name, regex, after_extract, score_normalization, attr, window, verbose, ignore_excluded)
Source code in edsnlp/pipelines/ner/scores/emergency/ccmu/factory.py
tnm
patterns
modifier_pattern = '(?P<modifier>[cpyraum])'
module-attribute
tumour_pattern = 't\\s?(?P<tumour>([0-4o]|is|x))x?'
module-attribute
node_pattern = 'n\\s?(?P<node>[0-3o]|x)x?'
module-attribute
metastasis_pattern = 'm\\s?(?P<metastasis>[01o]|x)x?'
module-attribute
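The four fragments above each expose one named group. The sketch below chains them into a single expression to show what they capture. The way they are combined here (`SKETCH`) is an assumption for illustration only and is not the library's tnm_pattern; `parse_tnm` is a hypothetical helper.

```python
import re
from typing import Dict, Optional

# Fragments copied from the module attributes above (written as raw strings).
MODIFIER = r"(?P<modifier>[cpyraum])"
TUMOUR = r"t\s?(?P<tumour>([0-4o]|is|x))x?"
NODE = r"n\s?(?P<node>[0-3o]|x)x?"
METASTASIS = r"m\s?(?P<metastasis>[01o]|x)x?"

# Assumption: optional modifier, mandatory tumour, optional node and metastasis.
SKETCH = re.compile(rf"{MODIFIER}?{TUMOUR}\s*(?:{NODE})?\s*(?:{METASTASIS})?")


def parse_tnm(text: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the named groups of the first TNM-like mention, or None."""
    # Lowercase first, mirroring the factory default attr='LOWER'.
    match = SKETCH.search(text.lower())
    if match is None:
        return None
    return {key: match.group(key) for key in ("modifier", "tumour", "node", "metastasis")}
```

The character classes accept the letter "o" next to the digits, which lets the patterns tolerate a common typing slip for zero.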
version_pattern = '\\(?(?P<version>uicc|accj|tnm)\\s+([ée]ditions|[ée]d\\.?)?\\s*(?P<version_year>\\d{4}|\\d{2})\\)?'
module-attribute
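A short sketch of what version_pattern captures, using the standard `re` module. The pattern is copied from the attribute above; `parse_version` is a hypothetical helper, and lowercasing the input is an assumption based on the factory default attr='LOWER'.

```python
import re
from typing import Optional, Tuple

VERSION = re.compile(
    r"\(?(?P<version>uicc|accj|tnm)\s+([ée]ditions|[ée]d\.?)?\s*"
    r"(?P<version_year>\d{4}|\d{2})\)?"
)


def parse_version(text: str) -> Optional[Tuple[str, str]]:
    """Return (version, version_year) as raw strings, or None if absent."""
    match = VERSION.search(text.lower())
    if match is None:
        return None
    return match.group("version"), match.group("version_year")
```

The year is returned as the raw two- or four-digit string; turning it into an integer is left to a later validation step.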
spacer = '(.|\\n){1,5}'
module-attribute
tnm_pattern = '(?<={version_pattern}{spacer})?'
module-attribute
models
TnmEnum
Bases: Enum
Source code in edsnlp/pipelines/ner/scores/tnm/models.py
__str__()
Source code in edsnlp/pipelines/ner/scores/tnm/models.py
Unknown
Bases: TnmEnum
Source code in edsnlp/pipelines/ner/scores/tnm/models.py
unknown = 'x'
class-attribute
Modifier
Bases: TnmEnum
Source code in edsnlp/pipelines/ner/scores/tnm/models.py
clinical = 'c'
class-attribute
histopathology = 'p'
class-attribute
neoadjuvant_therapy = 'y'
class-attribute
recurrent = 'r'
class-attribute
autopsy = 'a'
class-attribute
ultrasonography = 'u'
class-attribute
multifocal = 'm'
class-attribute
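The Modifier members above map one-letter codes to readable names. The sketch below rebuilds the enum with the standard library to show the lookup by code. The body of `__str__` is an assumption (this page only shows that TnmEnum defines one), and `parse_modifier` is a hypothetical helper.

```python
from enum import Enum
from typing import Optional


class TnmEnum(Enum):
    def __str__(self) -> str:
        # Assumption: render the raw code ("c") rather than "Modifier.clinical".
        return self.value


class Modifier(TnmEnum):
    clinical = "c"
    histopathology = "p"
    neoadjuvant_therapy = "y"
    recurrent = "r"
    autopsy = "a"
    ultrasonography = "u"
    multifocal = "m"


def parse_modifier(code: Optional[str]) -> Optional[Modifier]:
    """Map a one-letter code to its Modifier member, or None if unknown."""
    try:
        # Enum lookup by value: Modifier("p") is Modifier.histopathology.
        return Modifier(code)
    except ValueError:
        return None
```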
Tumour
Bases: TnmEnum
Source code in edsnlp/pipelines/ner/scores/tnm/models.py
unknown = 'x'
class-attribute
in_situ = 'is'
class-attribute
TNM
Bases: BaseModel
Source code in edsnlp/pipelines/ner/scores/tnm/models.py
modifier: Optional[Union[int, Modifier]] = None
class-attribute
tumour: Optional[Union[int, Tumour]] = None
class-attribute
node: Optional[Union[int, Unknown]] = None
class-attribute
metastasis: Optional[Union[int, Unknown]] = None
class-attribute
version: Optional[str] = None
class-attribute
version_year: Optional[int] = None
class-attribute
coerce_o(v)
Source code in edsnlp/pipelines/ner/scores/tnm/models.py
validate_year(v)
Source code in edsnlp/pipelines/ner/scores/tnm/models.py
norm()
Source code in edsnlp/pipelines/ner/scores/tnm/models.py
factory
DEFAULT_CONFIG = dict(pattern=None, attr='LOWER')
module-attribute
create_component(nlp, name, pattern, attr)
Source code in edsnlp/pipelines/ner/scores/tnm/factory.py
tnm
eds.tnm pipeline.
PERIOD_PROXIMITY_THRESHOLD = 3
module-attribute
TNM
Bases: BaseComponent
Tags and normalizes TNM mentions.
PARAMETER | DESCRIPTION |
---|---|
nlp | Language pipeline object. |
pattern | List of regular expressions for TNM mentions. |
attr | spaCy attribute to use. |
Source code in edsnlp/pipelines/ner/scores/tnm/tnm.py
nlp = nlp
instance-attribute
regex_matcher = RegexMatcher(attr=attr, alignment_mode='strict')
instance-attribute
__init__(nlp, pattern, attr)
Source code in edsnlp/pipelines/ner/scores/tnm/tnm.py
set_extensions()
Set extensions for the TNM pipeline.
Source code in edsnlp/pipelines/ner/scores/tnm/tnm.py
process(doc)
Find TNM mentions in doc.
PARAMETER | DESCRIPTION |
---|---|
doc | spaCy Doc object. |
RETURNS | DESCRIPTION |
---|---|
spans | List of TNM spans. |
Source code in edsnlp/pipelines/ner/scores/tnm/tnm.py
parse(spans)
Parse TNM mentions using the groupdict returned by the matcher.
PARAMETER | DESCRIPTION |
---|---|
spans | List of tuples containing the spans and groupdict returned by the matcher. |
RETURNS | DESCRIPTION |
---|---|
List[Span] | List of processed spans, with the TNM mention parsed. |
Source code in edsnlp/pipelines/ner/scores/tnm/tnm.py
__call__(doc)
Tags TNM mentions.
PARAMETER | DESCRIPTION |
---|---|
doc | spaCy Doc object. |
RETURNS | DESCRIPTION |
---|---|
doc | spaCy Doc object, annotated for TNM. |
Source code in edsnlp/pipelines/ner/scores/tnm/tnm.py