
edsnlp.data.pandas

from_pandas

The PandasReader (or edsnlp.data.from_pandas) reads a pandas table and yields documents. At the moment, only entities and attributes are loaded; relations and events are not supported.

Example

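import edsnlp

nlp = edsnlp.blank("eds")
nlp.add_pipe(...)
# df is an existing pandas DataFrame whose rows the "omop" converter understands
doc_iterator = edsnlp.data.from_pandas(df, nlp=nlp, converter="omop")
# Apply the pipeline to the documents yielded by the reader
annotated_docs = nlp.pipe(doc_iterator)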

Generator vs list

edsnlp.data.from_pandas returns a Stream. To iterate over the documents multiple times efficiently or to access them by index, you must convert it to a list:

docs = list(edsnlp.data.from_pandas(df, converter="omop"))
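The reason for converting to a list is that a lazily evaluated source redoes its work on every pass. The sketch below illustrates this with a hypothetical `LazySource` class written in plain Python. It is only a stand-in for a stream, not edsnlp's `Stream`, and the upper-casing step is a placeholder for whatever conversion the reader performs.

```python
class LazySource:
    """Hypothetical stand-in for a lazily evaluated stream (not edsnlp's Stream)."""

    def __init__(self, rows):
        self.rows = rows
        self.conversions = 0  # total number of rows converted so far

    def __iter__(self):
        for row in self.rows:
            self.conversions += 1  # the conversion is redone on every pass
            yield row.upper()  # placeholder for the row-to-document conversion


source = LazySource(["first note", "second note"])

# Two passes over the lazy source convert every row twice.
for _ in range(2):
    for doc in source:
        pass
lazy_conversions = source.conversions

# Materializing into a list pays the conversion cost a single time...
docs = list(source)
list_conversions = source.conversions - lazy_conversions

# ...and the list can then be re-iterated and indexed without further work.
second_doc = docs[1]
```

With a real pipeline the repeated work would be the row conversion (and any processing attached to the stream), which is why materializing once is preferable when the documents are needed more than once.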

Parameters

PARAMETER DESCRIPTION
data

The pandas object (e.g. a DataFrame) to read from.

shuffle

Whether to shuffle the data. If "dataset", the whole dataset is shuffled before iteration starts (at the start of every epoch if looping).

TYPE: Literal['dataset', False] DEFAULT: False

seed

The seed to use for shuffling.

TYPE: Optional[int] DEFAULT: None

loop

Whether to loop over the data indefinitely.

TYPE: bool DEFAULT: False

converter

Converters to use to convert the rows of the DataFrame (represented as dicts) to Doc objects. These are documented on the Converters page.

TYPE: Optional[AsList[Union[str, Callable]]] DEFAULT: None

kwargs

Additional keyword arguments to pass to the converter. These are documented on the Converters page.

DEFAULT: {}

RETURNS DESCRIPTION
Stream
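The interaction between the shuffle, seed and loop parameters described above can be sketched in plain Python. The `iterate` function below is a hypothetical illustration of the documented behavior (whole-dataset shuffling at the start of each epoch, optional endless looping); it is not edsnlp's implementation, and the row values are made up.

```python
import random
from itertools import islice


def iterate(rows, shuffle=False, seed=None, loop=False):
    """Illustrative generator mimicking the documented semantics (assumption)."""
    rng = random.Random(seed)  # a fixed seed makes the shuffling reproducible
    while True:
        epoch = list(rows)
        if shuffle == "dataset":
            rng.shuffle(epoch)  # reshuffle the whole dataset for each epoch
        yield from epoch
        if not loop:
            break  # a single pass when loop is False


rows = ["a", "b", "c", "d"]

# Default settings: one pass, original order.
one_pass = list(iterate(rows))

# loop=True never terminates on its own, so the consumer must bound it.
two_epochs = list(islice(iterate(rows, shuffle="dataset", seed=0, loop=True), 8))
same_seed = list(islice(iterate(rows, shuffle="dataset", seed=0, loop=True), 8))
```

Each block of four items in `two_epochs` is a permutation of the full dataset, and rerunning with the same seed reproduces the same sequence.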

to_pandas

edsnlp.data.to_pandas writes a list of documents (or a Stream) to a pandas table.

Example

import edsnlp

nlp = edsnlp.blank("eds")
nlp.add_pipe(...)

doc = nlp("My document with entities")

edsnlp.data.to_pandas([doc], converter="omop")

Parameters

PARAMETER DESCRIPTION
data

The data to write (either a list of documents or a Stream).

TYPE: Union[Any, Stream]

dtypes

Dictionary of column names to dtypes. This is passed to pd.DataFrame.astype.

TYPE: Optional[dict] DEFAULT: None

execute

Whether to execute the writing operation immediately or to return a stream.

TYPE: bool DEFAULT: True

converter

Converter to use to convert the documents to dictionary objects before storing them in the dataframe. These are documented on the Converters page.

TYPE: Optional[Union[str, Callable]] DEFAULT: None

kwargs

Additional keyword arguments to pass to the converter. These are documented on the Converters page.

DEFAULT: {}
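Since the dtypes parameter is described as being passed to pd.DataFrame.astype, the following sketch shows what such a mapping does on a plain DataFrame. The column names and values are hypothetical and do not reflect the output of any particular converter.

```python
import pandas as pd

# Hypothetical table: both columns start out as strings.
df = pd.DataFrame({"note_id": ["1", "2"], "score": ["0.5", "0.75"]})

# Mapping of column names to dtypes, in the form accepted by DataFrame.astype.
dtypes = {"note_id": "int64", "score": "float64"}

# astype returns a new DataFrame with the requested column types.
typed = df.astype(dtypes)
```

Columns that are absent from the mapping keep their original dtype.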