edsnlp.models.stack_crf_ner
CRFMode
Bases: str, Enum
Source code in edsnlp/models/stack_crf_ner.py
independent = 'independent' (class attribute)
joint = 'joint' (class attribute)
marginal = 'marginal' (class attribute)
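The three modes can be reproduced as a minimal, self-contained sketch, assuming only what the page states: CRFMode is a str-valued Enum with these three members.

```python
from enum import Enum


class CRFMode(str, Enum):
    # A str-valued Enum: members compare equal to their string
    # values, which makes them convenient in configuration files.
    independent = "independent"
    joint = "joint"
    marginal = "marginal"


# Members compare equal to plain strings thanks to the `str` base.
assert CRFMode.joint == "joint"
# Round-tripping from the raw value recovers the member.
assert CRFMode("marginal") is CRFMode.marginal
```

Because of the `str` base, a mode read from a config file as a plain string can be compared to a member without an explicit conversion.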
StackedCRFNERModule
Bases: PytorchWrapperModule
Source code in edsnlp/models/stack_crf_ner.py
crf = MultiLabelBIOULDecoder(1, learnable_transitions=False) (instance attribute)
classifier = None (instance attribute)
__init__(input_size=None, n_labels=None, mode=CRFMode.joint)
Nested NER CRF module
PARAMETER | DESCRIPTION
---|---
input_size | Size of the input embeddings
n_labels | Number of labels predicted by the module
mode | Loss mode of the CRF
Source code in edsnlp/models/stack_crf_ner.py
initialize()
Once the number of labels n_labels is known, this method initializes the torch linear layer.
Source code in edsnlp/models/stack_crf_ner.py
forward(embeds, mask, spans=None, additional_outputs=None, is_train=False, is_predict=False)
Apply the nested NER module to the document embeddings to compute the loss and/or predict the spans; the two operations are not mutually exclusive.
If spans are predicted, they are assigned to the additional_outputs dictionary.
PARAMETER | DESCRIPTION
---|---
embeds | Token embeddings to predict the tags from
mask | Mask of the sequences
spans | 2d tensor of n_spans * (doc_idx, label_idx, begin, end)
additional_outputs | Additional outputs that should not / cannot be back-propagated through (Thinc treats PyTorch models solely as differentiable functions, but the CRF we employ performs the best-tag decoding with PyTorch). This dict will contain the predicted 2d tensor of spans.
is_train | Whether we are training the model (defaults to False, per the signature above)
is_predict | Whether we are predicting with the model (defaults to False)
RETURNS | DESCRIPTION
---|---
Optional[torch.FloatTensor] | Optional 0d loss (shape = [1]) to train the model
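To illustrate the spans layout described above, here is a hypothetical example of the 2d span tensor, written with plain Python lists (the values are made up for illustration) so it stays self-contained:

```python
# Each row encodes one predicted span as
# (doc_idx, label_idx, begin, end) -- values made up for illustration.
spans = [
    [0, 1, 2, 5],  # doc 0, label 1, tokens 2 to 5
    [0, 0, 7, 9],  # doc 0, label 0, tokens 7 to 9
    [1, 2, 0, 3],  # doc 1, label 2, tokens 0 to 3
]

# The tensor is n_spans x 4: one row per span, four indices per row.
assert all(len(row) == 4 for row in spans)
```

In the module itself this would be a single integer tensor, with rows for every document in the batch stacked together and distinguished by doc_idx.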
Source code in edsnlp/models/stack_crf_ner.py
repeat(t, n, dim, interleave=True)
Source code in edsnlp/models/stack_crf_ner.py
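repeat has no docstring here; judging from its signature alone, a plausible pure-Python analogue (an assumption about its behavior, not the actual tensor implementation) would distinguish interleaved repetition from tiling:

```python
def repeat_list(xs, n, interleave=True):
    # Hypothetical list-level analogue of the tensor helper:
    # interleave=True repeats each element n times in place
    # ([a, b] -> [a, a, b, b]), while interleave=False tiles
    # the whole sequence ([a, b] -> [a, b, a, b]).
    if interleave:
        return [x for x in xs for _ in range(n)]
    return [x for _ in range(n) for x in xs]


assert repeat_list(["a", "b"], 2) == ["a", "a", "b", "b"]
assert repeat_list(["a", "b"], 2, interleave=False) == ["a", "b", "a", "b"]
```

The same distinction exists in PyTorch between Tensor.repeat_interleave and Tensor.repeat, which is presumably what the dim argument selects over.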
flatten_dim(t, dim, ndim=1)
Source code in edsnlp/models/stack_crf_ner.py
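flatten_dim is likewise undocumented; judging from its name and the ndim parameter, a shape-level sketch (hypothetical, not the actual implementation) could look like:

```python
def flatten_dim_shape(shape, dim, ndim=1):
    # Hypothetical shape-level analogue: merge dimension `dim`
    # with the `ndim` dimensions that follow it.
    merged = 1
    for size in shape[dim : dim + ndim + 1]:
        merged *= size
    return shape[:dim] + (merged,) + shape[dim + ndim + 1 :]


# Merging dim 0 with the one dimension after it: (2, 3, 4) -> (6, 4).
assert flatten_dim_shape((2, 3, 4), dim=0) == (6, 4)
# Merging dim 1 with the one dimension after it: (2, 3, 4) -> (2, 12).
assert flatten_dim_shape((2, 3, 4), dim=1) == (2, 12)
```

This mirrors the effect of torch.flatten(t, start_dim=dim, end_dim=dim + ndim) on a tensor's shape.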
create_model(tok2vec, mode, n_labels=None)
Source code in edsnlp/models/stack_crf_ner.py