edsnlp.pipelines.core.normalizer.lowercase

factory

remove_lowercase(doc)

Restores the original case in the NORM attribute by copying each token's verbatim text into it. Should always be applied first.

PARAMETER DESCRIPTION
doc

The spaCy Doc object.

TYPE: Doc

RETURNS DESCRIPTION
Doc

The document, with each token's NORM set to its original-case text.

Source code in edsnlp/pipelines/core/normalizer/lowercase/factory.py
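# Imports needed to run this excerpt (not shown in the original listing).
from spacy.language import Language
from spacy.tokens import Doc


@Language.component("remove-lowercase")
@Language.component("eds.remove-lowercase")
def remove_lowercase(doc: Doc) -> Doc:
    """
    Restore the original case in the `NORM` attribute by copying each
    token's verbatim text into it. Should always be applied first.

    Parameters
    ----------
    doc : Doc
        The spaCy `Doc` object.

    Returns
    -------
    Doc
        The document, with each token's `NORM` set to its original-case text.
    """

    # Overwrite each token's norm with its verbatim text.
    for token in doc:
        token.norm_ = token.text

    return doc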
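
The effect of the component can be illustrated with a minimal sketch. It assumes spaCy 3 is installed and deliberately registers a stand-in component under a hypothetical name, `demo_restore_case`, instead of importing edsnlp, so the import path and registered names of the real component are not assumed here. The sample sentence is made up for illustration.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Doc


# Hypothetical stand-in that mirrors the body of remove_lowercase above.
@Language.component("demo_restore_case")
def demo_restore_case(doc: Doc) -> Doc:
    # Copy each token's verbatim text into its norm.
    for token in doc:
        token.norm_ = token.text
    return doc


# A blank pipeline contains only a tokenizer.
nlp = spacy.blank("en")
text = "Patient admitted in Paris"

# Norms as produced by the tokenizer alone (typically lowercased).
before = [token.norm_ for token in nlp(text)]

# Add the component at the start of the pipeline, as the docstring advises.
nlp.add_pipe("demo_restore_case", first=True)
doc = nlp(text)
after = [token.norm_ for token in doc]

print(before)
print(after)
```

After the component runs, each token's `norm_` equals its `text`, so any later component that reads `NORM` sees the original casing unless another step rewrites it.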