edsnlp.utils.collections
batch_compress_dict
Compress a sequence of dictionaries by deduplicating values that occur multiple times. The keys that point to the same value are merged into a single string, using the "|" character as a separator. This is useful to preserve referential identities when decompressing the dictionary after it has been serialized and deserialized.
Parameters
PARAMETER | DESCRIPTION |
---|---|
seq | Sequence of dictionaries to compress TYPE: |
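The key-merging idea can be sketched in plain Python. This is a minimal illustration and not the library's implementation: `compress_flat` is a hypothetical helper, it handles a single flat dictionary rather than a sequence, and it assumes that "values that occur multiple times" means values that are the same object.

```python
def compress_flat(d):
    """Hypothetical helper: merge keys whose values are the same object."""
    groups = {}  # id(value) -> (list of keys, value)
    for key, value in d.items():
        keys, _ = groups.setdefault(id(value), ([], value))
        keys.append(key)
    # Join the keys of each group with the "|" separator.
    return {"|".join(keys): value for keys, value in groups.values()}


shared = [1, 2]
compressed = compress_flat({"a": shared, "b": shared, "c": [3]})
print(compressed)  # {'a|b': [1, 2], 'c': [3]}
```

Because "a" and "b" are stored under one merged key, the shared list is serialized only once.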
multi_tee
Makes copies of an iterable so that every iteration over it starts from the first element. If the iterable is a sequence (list, tuple), it is returned as is, since every iter() call on a sequence restarts from the beginning.
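The behavior described above can be approximated with itertools.tee. The names `Restartable` and `restartable` below are hypothetical and the sketch is only an assumption about how such a wrapper might work, not the actual multi_tee code.

```python
import itertools
from collections.abc import Sequence


class Restartable:
    """Hypothetical stand-in for the behavior described above."""

    def __init__(self, iterable):
        # Keep one tee'd iterator that is never advanced directly.
        (self._source,) = itertools.tee(iterable, 1)

    def __iter__(self):
        # Split off a fresh copy; both halves start at the first element.
        self._source, fresh = itertools.tee(self._source)
        return fresh


def restartable(iterable):
    # Sequences already restart on every iter() call, so pass them through.
    if isinstance(iterable, Sequence):
        return iterable
    return Restartable(iterable)


squares = restartable(i * i for i in range(3))
print(list(squares))  # [0, 1, 4]
print(list(squares))  # [0, 1, 4]
```

Note that tee buffers every element it has produced, so this approach trades memory for the ability to iterate again.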
FrozenDict
Bases: dict
Copied from spacy.util.SimpleFrozenDict to ensure compatibility.
Initialize the frozen dict. Can be initialized with pre-defined values.
error (str): The error message shown when the user tries to assign to the dict.
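A frozen dictionary of this kind can be sketched as a dict subclass that rejects writes. `SketchFrozenDict`, its default message, and the choice of NotImplementedError are assumptions made for illustration; the real class may raise a different exception or block more methods.

```python
class SketchFrozenDict(dict):
    """Hypothetical read-only dict, for illustration only."""

    def __init__(self, *args, error="Can't write to frozen dict.", **kwargs):
        # Pre-defined values are accepted at construction time.
        super().__init__(*args, **kwargs)
        self.error = error

    def __setitem__(self, key, value):
        raise NotImplementedError(self.error)

    def pop(self, key, default=None):
        raise NotImplementedError(self.error)

    def update(self, other):
        raise NotImplementedError(self.error)


config = SketchFrozenDict({"lang": "fr"})
print(config["lang"])  # fr
try:
    config["lang"] = "en"
except NotImplementedError as exc:
    print(exc)  # Can't write to frozen dict.
```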
FrozenList
Bases: list
Copied from spacy.util.SimpleFrozenList to ensure compatibility.
Initialize the frozen list.
error (str): The error message shown when the user tries to mutate the list.
ld_to_dl
Convert a list of dictionaries to a dictionary of lists
Parameters
PARAMETER | DESCRIPTION |
---|---|
ld | The list of dictionaries TYPE: |
RETURNS | DESCRIPTION |
---|---|
Dict[str, List[T]] | The dictionary of lists |
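As a rough illustration of the conversion, the sketch below uses a hypothetical `ld_to_dl_sketch` helper. It assumes every dictionary has the same keys as the first one; how the real function treats missing keys is not stated here.

```python
def ld_to_dl_sketch(ld):
    """Hypothetical helper: list of dicts -> dict of lists."""
    if not ld:
        return {}
    # Use the keys of the first dictionary as the column names.
    return {key: [d[key] for d in ld] for key in ld[0]}


rows = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]
columns = ld_to_dl_sketch(rows)
print(columns)  # {'a': [1, 2], 'b': ['x', 'y']}
```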
dl_to_ld
Convert a dictionary of lists to a list of dictionaries
Parameters
PARAMETER | DESCRIPTION |
---|---|
dl | The dictionary of lists TYPE: |
RETURNS | DESCRIPTION |
---|---|
List[Dict[str, Any]] | The list of dictionaries |
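The reverse conversion can be sketched in the same spirit. `dl_to_ld_sketch` is a hypothetical helper and assumes all lists have the same length; with zip, extra elements in longer lists would be silently ignored.

```python
def dl_to_ld_sketch(dl):
    """Hypothetical helper: dict of lists -> list of dicts."""
    keys = list(dl)
    # zip(*values) walks the lists in parallel, one row at a time.
    return [dict(zip(keys, values)) for values in zip(*dl.values())]


table = {"a": [1, 2], "b": ["x", "y"]}
records = dl_to_ld_sketch(table)
print(records)  # [{'a': 1, 'b': 'x'}, {'a': 2, 'b': 'y'}]
```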
decompress_dict
Decompress a dictionary of lists into a sequence of dictionaries. This function assumes that the dictionary structure was obtained using the batch_compress_dict class. Keys that were merged into a single string using the "|" character as a separator will be split into a nested dictionary structure.
Parameters
PARAMETER | DESCRIPTION |
---|---|
seq | The dictionary to decompress or a sequence of dictionaries to decompress TYPE: |
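The sketch below illustrates only the key-splitting step, on a single flat dictionary, and shows why merged keys help after a serialization round trip. `expand_merged_keys` is a hypothetical helper; it does not rebuild nested structures or a sequence of dictionaries as the real function is described to do.

```python
import json


def expand_merged_keys(d):
    """Hypothetical helper: split "a|b" keys so each key shares one value."""
    out = {}
    for merged_key, value in d.items():
        for key in merged_key.split("|"):
            # Every split key points to the same object.
            out[key] = value
    return out


compressed = {"a|b": [1, 2], "c": [3]}
# Simulate serialization and deserialization.
restored = expand_merged_keys(json.loads(json.dumps(compressed)))
print(restored)  # {'a': [1, 2], 'b': [1, 2], 'c': [3]}
print(restored["a"] is restored["b"])  # True
```

Had "a" and "b" been serialized as two separate entries, the round trip would have produced two distinct lists.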
batchify
Yields batches that contain at most batch_size elements. If an item contains more than batch_size elements, it will be yielded as a single batch.
Parameters
PARAMETER | DESCRIPTION |
---|---|
iterable | TYPE: |
batch_size | TYPE: |
drop_last | TYPE: |
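Since the parameter descriptions are empty, the following sketch is only one plausible reading. `batchify_sketch` is a hypothetical helper that counts items, and it assumes drop_last discards a trailing batch that is shorter than batch_size. It does not model the case of a single item that itself contains more than batch_size elements.

```python
def batchify_sketch(iterable, batch_size, drop_last=False):
    """Hypothetical helper: group items into lists of at most batch_size."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Emit the trailing, incomplete batch unless asked to drop it.
    if batch and not drop_last:
        yield batch


print(list(batchify_sketch(range(5), 2)))  # [[0, 1], [2, 3], [4]]
print(list(batchify_sketch(range(5), 2, drop_last=True)))  # [[0, 1], [2, 3]]
```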