Custom Teva

The OMOP-Teva module can also be applied to any dataframe. To do so, use reduce_table and visualize_table from eds_scikit.plot.table_viz.

Make sure that the columns you declare as categorical have fewer than 50 distinct values.

Use the function eds_scikit.plot.table_viz.map_column to reduce the number of distinct values in a column.

Creating a synthetic dataset

import numpy as np
import pandas as pd

data = pd.DataFrame(
    {
        # One string identifier per row
        "id": np.arange(1, 1001).astype(str),
        "category_1": np.random.choice(["A", "B", "C"], size=1000, p=[0.4, 0.3, 0.3]),
        "category_2": np.array([str(i) for i in range(500)] * 2),
        "location": np.random.choice(
            ["location 1", "location 2"], size=1000, p=[0.6, 0.4]
        ),
        "date": pd.to_datetime(
            np.random.choice(
                pd.date_range(start="2021-01-01", end="2022-01-01"), size=1000
            )
        ),
    }
)
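Before reducing the table, it can help to check which categorical columns exceed the 50-value recommendation. The following is a minimal sketch using only pandas, not an eds_scikit feature; the dataframe `df` is a small stand-in with the same column layout as the synthetic dataset, and the name `MAX_CATEGORIES` is an assumption introduced for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in dataframe with the same categorical columns as the synthetic dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "category_1": rng.choice(["A", "B", "C"], size=1000),
        "category_2": np.array([str(i) for i in range(500)] * 2),
        "location": rng.choice(["location 1", "location 2"], size=1000),
    }
)

MAX_CATEGORIES = 50  # hypothetical name; threshold taken from the recommendation above

# Count distinct values per column and flag those at or above the threshold.
cardinality = df.nunique()
too_many = cardinality[cardinality >= MAX_CATEGORIES].index.tolist()

print(cardinality.to_dict())
print("columns to remap:", too_many)
```

Here only category_2 is flagged, which is why it is the column that receives a mapper in the call to reduce_table.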

from eds_scikit.plot import reduce_table, visualize_table

data_reduced = reduce_table(
    data,
    category_columns=["location", "category_1", "category_2"],
    date_column="date",
    start_date="2021-01-01",
    end_date="2021-12-01",
    # Group the values of category_2 under two labels using regular expressions
    mapper={"category_2": {"even": r"[02468]$", "odd": r"[13579]$"}},
)

chart = visualize_table(
    data_reduced, title="synthetic dataframe table", description=True
)
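To make the effect of the mapper argument more concrete, here is a rough sketch in plain pandas of a regex-based relabelling that collapses the 500 values of category_2 into two labels. The helper `map_by_regex` is hypothetical and is not part of eds_scikit; how the library itself resolves overlapping patterns or unmatched values is not covered here, so the first-match rule and the `default` label are assumptions.

```python
import pandas as pd


def map_by_regex(values: pd.Series, mapping: dict, default: str = "other") -> pd.Series:
    """Hypothetical helper: give each value the first label whose regex matches it."""
    result = pd.Series(default, index=values.index, dtype="object")
    assigned = pd.Series(False, index=values.index)
    for label, pattern in mapping.items():
        # Only label values that match and have not been labelled yet.
        hit = values.str.contains(pattern, regex=True) & ~assigned
        result[hit] = label
        assigned |= hit
    return result


# Same values as the category_2 column of the synthetic dataset.
category_2 = pd.Series([str(i) for i in range(500)] * 2)
mapped = map_by_regex(category_2, {"even": r"[02468]$", "odd": r"[13579]$"})

print(category_2.nunique(), "distinct values before mapping")
print(mapped.nunique(), "distinct values after mapping")
```

After such a mapping the column holds only the labels "even" and "odd", which keeps it well under the 50-value recommendation.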