Model

Choosing or customizing a Model is the third step in the EDS-TeVa usage workflow.

Definition

A Model is a Python class designed to characterize the temporal variability of data availability. It estimates the coefficients Θ and the metrics from a Probe.

[Figure: Model class diagram]

Input

The Model class expects a Probe object in order to estimate the Model coefficients Θ and, if desired, some metrics.

Attributes

  • estimates is a pandas.DataFrame computed by the fit() method. It contains the estimated coefficients Θ and metrics for each combination of the columns listed in Probe._index (e.g. care site, stay type, etc.).
  • _coefs is the list of the Model coefficients Θ that are estimated by the fit() method.

Methods

Prediction

The predict() method must be called on a fitted Model.

Estimates schema

Data stored in the estimates attribute follows a specific schema:

Indexes

The estimates are computed for each column given by the Probe._index. For example, if you fit your Model on the VisitProbe, the estimates will be computed for each:

  • care_site_level: care site hierarchic level (uf, pole, hospital).
  • care_site_id: care site unique identifier.
  • stay_type: type of stay (hospitalisés, urgence, hospitalisation incomplète, consultation externe).
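To make the "one estimate per index combination" idea concrete, here is a minimal sketch in plain Python. The rows, the care site identifiers and the `max_c` statistic are all made up for illustration; a real Model would compute its coefficients and metrics instead of this placeholder.

```python
from collections import defaultdict

# Toy completeness observations; all identifiers and values are made up.
# Columns: care_site_level, care_site_id, stay_type, completeness c(t).
rows = [
    ("Hôpital", "A", "Urg", 0.2),
    ("Hôpital", "A", "Urg", 0.6),
    ("Hôpital", "A", "Hospit", 0.4),
    ("Pôle/DMU", "B", "Urg", 0.9),
]

# Group the observations by index combination.
groups = defaultdict(list)
for care_site_level, care_site_id, stay_type, c in rows:
    groups[(care_site_level, care_site_id, stay_type)].append(c)

# One estimates row per combination; the maximum is only a placeholder
# for the coefficients and metrics a real Model would compute.
estimates = {key: {"max_c": max(values)} for key, values in groups.items()}

for key, row in estimates.items():
    print(key, row)
```

Each key of `estimates` plays the role of one row of the index in the estimates table.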

Model coefficients

The coefficients depend on the Model used. For instance, the step function Model has 2 coefficients:

  • t0, the characteristic time, estimates the time after which the data is available.
  • c0, the characteristic completeness, estimates the stabilized routine completeness after t0.

Metrics

The metrics depend on what you specify in the fit() method. For instance, you can specify an error metric:

$$error = \frac{\sum_{t_0 \leq t \leq t_{max}} \epsilon(t)^2}{t_{max} - t_0}$$
  • error estimates the stability of the data after t0.

Example

For instance, the StepFunction.estimates fitted on a VisitProbe may look like this:

| care_site_level | care_site_id | stay_type | t_0 | c_0 | error |
|---|---|---|---|---|---|
| Unité Fonctionnelle (UF) | 8312056386 | 'Urg' | 2019-05-01 | 0.397 | 0.040 |
| Pôle/DMU | 8653815660 | 'All' | 2011-04-01 | 0.583 | 0.028 |
| Unité Fonctionnelle (UF) | 8312027648 | 'Hospit' | 2021-03-01 | 0.677 | 0.022 |
| Unité Fonctionnelle (UF) | 8312056379 | 'All' | 2018-08-01 | 0.764 | 0.014 |
| Hôpital | 8312022130 | 'Hospit' | 2022-02-01 | 0.652 | 0.027 |

Saving and loading a fitted Model

To make a Model that has been fitted with the fit() method easy to reload later, you can pickle it using the save() method. The Model can then be loaded quickly from local disk using the load() method.

from edsteva.models import StepFunction

model = StepFunction()

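model.fit(probe)  # fit the Model on a previously computed Probe
model.save()  # pickle the fitted Model to local disk

model_2 = StepFunction()
model_2.load()  # load the pickled Model from local disk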
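
The save and load round trip relies on pickling. As a rough illustration of the idea, the sketch below reproduces it with the standard `pickle` module on a stand-in class. `TinyModel`, its explicit `path` argument and the file name are hypothetical and do not describe how the library stores its Models.

```python
import pickle
import tempfile
from pathlib import Path


class TinyModel:
    """Hypothetical stand-in for a fitted Model (not an edsteva class)."""

    def __init__(self):
        self.estimates = None

    def fit(self, values):
        # Toy "fit": store the mean of the observed values.
        self.estimates = {"c_0": sum(values) / len(values)}

    def save(self, path):
        # Serialize the instance attributes to disk.
        with open(path, "wb") as f:
            pickle.dump(self.__dict__, f)

    def load(self, path):
        # Restore the attributes written by save() into this instance.
        with open(path, "rb") as f:
            self.__dict__.update(pickle.load(f))


path = Path(tempfile.mkdtemp()) / "tiny_model.pkl"

model = TinyModel()
model.fit([0.2, 0.4, 0.6])
model.save(path)

model_2 = TinyModel()
model_2.load(path)
print(model_2.estimates == model.estimates)
```

Loading into a fresh instance mirrors the `model_2.load()` call above: the second object ends up with the same fitted state without calling fit() again.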

Defining a custom Model

If none of the available Models meets your requirements, you may want to create your own. To define a custom Model class CustomModel that inherits from the abstract class BaseModel, you have to implement the fit_process() and predict_process() methods. These methods are called by the fit() and predict() methods, respectively, which are inherited from the BaseModel class. You also have to define the _coefs attribute, which is the list of the Model coefficients.

from edsteva.models import BaseModel
from edsteva.probes import BaseProbe


# Definition of a new Model class
class CustomModel(BaseModel):
    _coefs = ["my_model_coefficient_1", "my_model_coefficient_2"]

    def fit_process(self, probe: BaseProbe):
        # fit process
        return custom_predictor

    def predict_process(self, probe: BaseProbe):
        # predict process
        return custom_predictor

The fit_process() and predict_process() methods take a Probe as the first argument. All other parameters must be keyword arguments. For a detailed example of the implementation of a Model, please have a look at the implemented StepFunction Model.
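
To show how public fit() and predict() methods can delegate to the two hooks, here is a self-contained toy version of the pattern. `ToyProbe`, `ToyBaseModel`, `MeanModel` and their attributes are hypothetical names chosen for this sketch. The real BaseModel does more and its signatures may differ.

```python
from abc import ABC, abstractmethod


class ToyProbe:
    """Hypothetical stand-in for a Probe: a list of completeness values."""

    def __init__(self, values):
        self.values = values


class ToyBaseModel(ABC):
    """Hypothetical base class: public methods call the abstract hooks."""

    _coefs = []

    def fit(self, probe, **kwargs):
        self.estimates = self.fit_process(probe, **kwargs)
        # Check that every declared coefficient has been estimated.
        missing = [c for c in self._coefs if c not in self.estimates]
        if missing:
            raise ValueError(f"Missing coefficients: {missing}")

    def predict(self, probe, **kwargs):
        if not hasattr(self, "estimates"):
            raise RuntimeError("Call fit() before predict().")
        return self.predict_process(probe, **kwargs)

    @abstractmethod
    def fit_process(self, probe, **kwargs): ...

    @abstractmethod
    def predict_process(self, probe, **kwargs): ...


class MeanModel(ToyBaseModel):
    _coefs = ["mean_completeness"]

    def fit_process(self, probe):
        return {"mean_completeness": sum(probe.values) / len(probe.values)}

    def predict_process(self, probe):
        # Predict the fitted constant for every observation.
        return [self.estimates["mean_completeness"]] * len(probe.values)


model = MeanModel()
model.fit(ToyProbe([0.0, 0.5, 1.0]))
print(model.estimates)
print(model.predict(ToyProbe([0.1, 0.2])))
```

The subclass only describes how to estimate and how to predict. The shared checks (declared coefficients present, Model fitted before prediction) live once in the base class.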

Contributions

If you manage to create your own Model, do not hesitate to share it with the community by following the contribution guidelines. Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.

Available Models

We detail hereafter the Models that have already been implemented in the library: StepFunction and RectangleFunction.

The StepFunction fits a step function $f_{t_0, c_0}(t)$ with coefficients $\Theta = (t_0, c_0)$ on a completeness predictor $c(t)$:

$$f_{t_0, c_0}(t) = c_0 \ \mathbb{1}_{t \geq t_0}(t)$$
$$c(t) = f_{t_0, c_0}(t) + \epsilon(t)$$
  • the characteristic time t0 estimates the time after which the data is available.
  • the characteristic value c0 estimates the stabilized routine completeness.

The default metric computed is the mean squared error after t0:

$$error = \frac{\sum_{t_0 \leq t \leq t_{max}} \epsilon(t)^2}{t_{max} - t_0}$$
  • error estimates the stability of the data after t0.
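
As a small worked example of the step function and of the error metric defined above, the sketch below evaluates both on a toy series. Time is represented as an integer month index and the helper names are made up for the illustration. The normalization by t_max - t_0 follows the formula as written, so t_0 must be strictly smaller than t_max.

```python
def step_function(t, t_0, c_0):
    """f_{t_0,c_0}(t): 0 before t_0, c_0 from t_0 onwards."""
    return c_0 if t >= t_0 else 0.0


def error_after_t0(times, completeness, t_0, c_0):
    """Sum of squared residuals for t >= t_0, divided by t_max - t_0."""
    t_max = max(times)
    squared = [
        (c - step_function(t, t_0, c_0)) ** 2
        for t, c in zip(times, completeness)
        if t >= t_0
    ]
    return sum(squared) / (t_max - t_0)


# Toy monthly series: time is a month index, completeness lies in [0, 1].
times = [0, 1, 2, 3, 4, 5]
completeness = [0.0, 0.0, 0.5, 0.7, 0.5, 0.7]

error = error_after_t0(times, completeness, t_0=2, c_0=0.6)
print(round(error, 4))
```

Here the four residuals after t_0 = 2 all have magnitude 0.1, so the error is 4 × 0.01 / 3, roughly 0.0133. A smaller value means the completeness is more stable around c_0 once the data is available.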

Custom metric

You can define your own metric if this one doesn't meet your requirements.

The available algorithms used to fit the step function are listed below:

Custom algorithm

You can define your own algorithm if these don't meet your requirements.

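The loss minimization algorithm computes the estimated coefficients $\hat{t}_0$ and $\hat{c}_0$ by minimizing the loss function $L(t_0, c_0)$:

$$L(t_0, c_0) = \frac{\sum_{t = t_{min}}^{t_{max}} l(c(t), f_{t_0, c_0}(t))}{t_{max} - t_{min}}$$
$$(\hat{t}_0, \hat{c}_0) = \underset{t_0, c_0}{\operatorname{argmin}} \ L(t_0, c_0)$$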

Default loss function l

The loss function is $l_2$ by default: $l(c(t), f_{t_0, c_0}(t)) = |c(t) - f_{t_0, c_0}(t)|^2$

Optimal estimates

For computational complexity reasons, this algorithm has been implemented to compute the optimal estimates only with the $l_2$ loss function. For more information, you can have a look at the source code.
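
To illustrate why the l2 loss makes the problem tractable: for a fixed t_0, the value of c_0 that minimizes the squared loss is the mean of c(t) over t ≥ t_0 (a standard least-squares result), so the search reduces to a scan over candidate values of t_0. The brute-force sketch below uses that fact. It is an illustration with made-up helper names and integer time indices, not the library's implementation.

```python
def fit_step_l2(times, completeness):
    """Return (t_0, c_0, loss) minimizing the mean l2 loss of a step function."""
    t_min, t_max = min(times), max(times)
    best = None
    for t_0 in times:  # every observed time is a candidate change point
        before = [c for t, c in zip(times, completeness) if t < t_0]
        after = [c for t, c in zip(times, completeness) if t >= t_0]
        c_0 = sum(after) / len(after)  # optimal level for this t_0 under l2
        loss = (
            sum(c**2 for c in before) + sum((c - c_0) ** 2 for c in after)
        ) / (t_max - t_min)
        if best is None or loss < best[2]:
            best = (t_0, c_0, loss)
    return best


# Toy monthly series: the data becomes available around month 3.
times = [0, 1, 2, 3, 4, 5]
completeness = [0.0, 0.1, 0.0, 0.8, 0.9, 0.7]

t_0, c_0, loss = fit_step_l2(times, completeness)
print(t_0, round(c_0, 3), round(loss, 4))
```

On this series the scan selects t_0 = 3 with c_0 close to 0.8. Each candidate recomputes its sums from scratch, which is quadratic in the number of time points. That is acceptable for a sketch but not tuned for large Probes.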

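In the quantile algorithm, $\hat{c}_0$ is directly estimated as the $x^{th}$ quantile of the completeness predictor $c(t)$, where $x$ is a number between 0 and 1. Then, $\hat{t}_0$ is the first time $c(t)$ reaches $\hat{c}_0$.

$$\hat{c}_0 = x^{th} \text{ quantile of } c(t)$$
$$\hat{t}_0 = \underset{t}{\operatorname{argmin}} \ (c(t) \geq \hat{c}_0)$$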

Default quantile x

The default quantile is x=0.8.
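
A minimal sketch of the quantile approach is given below, in plain Python. The helper names are made up, time is an integer month index, and the linear interpolation rule used for the quantile is an assumption of this sketch. The library may compute its quantile differently.

```python
def quantile(values, x):
    """x-th quantile (0 <= x <= 1) using linear interpolation (assumed rule)."""
    ordered = sorted(values)
    position = x * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def fit_step_quantile(times, completeness, x=0.8):
    """c_0 is the x-th quantile of c(t); t_0 is the first t with c(t) >= c_0."""
    c_0 = quantile(completeness, x)
    t_0 = min(t for t, c in zip(times, completeness) if c >= c_0)
    return t_0, c_0


# Toy monthly series: the data becomes available around month 3.
times = [0, 1, 2, 3, 4, 5]
completeness = [0.0, 0.1, 0.0, 0.8, 0.9, 0.7]

t_0, c_0 = fit_step_quantile(times, completeness)
print(t_0, c_0)
```

Compared with loss minimization, this estimator needs a single pass over the data, at the price of depending on the choice of x.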

from edsteva.models.step_function import StepFunction

step_function_model = StepFunction()
step_function_model.fit(probe)
step_function_model.estimates.head()

| care_site_level | care_site_id | stay_type | t_0 | c_0 | error |
|---|---|---|---|---|---|
| Unité Fonctionnelle (UF) | 8312056386 | 'Urg' | 2019-05-01 | 0.397 | 0.040 |
| Unité Fonctionnelle (UF) | 8312056386 | 'All' | 2011-04-01 | 0.583 | 0.028 |
| Pôle/DMU | 8312027648 | 'Hospit' | 2021-03-01 | 0.677 | 0.022 |
| Pôle/DMU | 8312027648 | 'All' | 2018-08-01 | 0.764 | 0.014 |
| Hôpital | 8312022130 | 'Hospit' | 2022-02-01 | 0.652 | 0.027 |

The RectangleFunction fits a rectangle function $f_{t_0, c_0, t_1}(t)$ with coefficients $\Theta = (t_0, c_0, t_1)$ on a completeness predictor $c(t)$:

$$f_{t_0, c_0, t_1}(t) = c_0 \ \mathbb{1}_{t_0 \leq t \leq t_1}(t)$$
$$c(t) = f_{t_0, c_0, t_1}(t) + \epsilon(t)$$
  • the characteristic time t0 estimates the time after which the data is available.
  • the characteristic time t1 estimates the time after which the data is not available anymore.
  • the characteristic value c0 estimates the completeness between t0 and t1.

The default metric computed is the mean squared error between t0 and t1:

$$error = \frac{\sum_{t_0 \leq t \leq t_1} \epsilon(t)^2}{t_1 - t_0}$$
  • error estimates the stability of the data between t0 and t1.

Custom metric

You can define your own metric if this one doesn't meet your requirements.

The available algorithms used to fit the rectangle function are listed below:

Custom algorithm

You can define your own algorithm if these don't meet your requirements.

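The loss minimization algorithm computes the estimated coefficients $\hat{t}_0$, $\hat{c}_0$ and $\hat{t}_1$ by minimizing the loss function $L(t_0, c_0, t_1)$:

$$L(t_0, c_0, t_1) = \frac{\sum_{t = t_{min}}^{t_{max}} l(c(t), f_{t_0, c_0, t_1}(t))}{t_{max} - t_{min}}$$
$$(\hat{t}_0, \hat{c}_0, \hat{t}_1) = \underset{t_0, c_0, t_1}{\operatorname{argmin}} \ L(t_0, c_0, t_1)$$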

Default loss function l

The loss function is $l_2$ by default: $l(c(t), f_{t_0, c_0, t_1}(t)) = |c(t) - f_{t_0, c_0, t_1}(t)|^2$

Optimal estimates

For computational complexity reasons, this algorithm has been implemented with a dependency relation between c0 and t0 derived from the optimal estimates using the $l_2$ loss function. For more information, you can have a look at the source code.
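
The same least-squares argument applies to the rectangle function: once the window bounds are fixed, the l2-optimal c_0 is the mean of c(t) inside the window, so only the bounds need to be searched. The exhaustive sketch below illustrates this with made-up helper names and integer time indices. It tries every pair of bounds and is therefore much slower than an optimized implementation would need to be.

```python
from itertools import combinations_with_replacement


def fit_rectangle_l2(times, completeness):
    """Return (t_0, c_0, t_1, loss) minimizing the mean l2 loss."""
    t_min, t_max = min(times), max(times)
    best = None
    # Sorted input guarantees that every candidate pair satisfies t_0 <= t_1.
    for t_0, t_1 in combinations_with_replacement(sorted(times), 2):
        inside = [c for t, c in zip(times, completeness) if t_0 <= t <= t_1]
        outside = [c for t, c in zip(times, completeness) if not t_0 <= t <= t_1]
        c_0 = sum(inside) / len(inside)  # optimal level for this window under l2
        loss = (
            sum(c**2 for c in outside) + sum((c - c_0) ** 2 for c in inside)
        ) / (t_max - t_min)
        if best is None or loss < best[3]:
            best = (t_0, c_0, t_1, loss)
    return best


# Toy monthly series: data is available from month 2 to month 4 only.
times = [0, 1, 2, 3, 4, 5, 6]
completeness = [0.0, 0.1, 0.8, 0.9, 0.7, 0.1, 0.0]

t_0, c_0, t_1, loss = fit_rectangle_l2(times, completeness)
print(t_0, round(c_0, 3), t_1, round(loss, 4))
```

On this series the search recovers the window from month 2 to month 4 with c_0 close to 0.8.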

from edsteva.models.rectangle_function import RectangleFunction

rectangle_function_model = RectangleFunction()
rectangle_function_model.fit(probe)
rectangle_function_model.estimates.head()

| care_site_level | care_site_id | stay_type | t_0 | c_0 | t_1 | error |
|---|---|---|---|---|---|---|
| Unité Fonctionnelle (UF) | 8312056386 | 'Urg' | 2019-05-01 | 0.397 | 2020-05-01 | 0.040 |
| Unité Fonctionnelle (UF) | 8312056386 | 'All' | 2011-04-01 | 0.583 | 2013-04-01 | 0.028 |
| Pôle/DMU | 8312027648 | 'Hospit' | 2021-03-01 | 0.677 | 2022-03-01 | 0.022 |
| Pôle/DMU | 8312027648 | 'All' | 2018-08-01 | 0.764 | 2019-08-01 | 0.014 |
| Hôpital | 8312022130 | 'Hospit' | 2022-02-01 | 0.652 | 2022-08-01 | 0.027 |