Consultation dates

When a patient comes for several consultations, they are often represented as a single visit_occurrence in the CDW. If a clear history of the patient's course is needed, proxies are required to recover this information. One available proxy is to look for consultation reports and use the dates associated with them.

To this end, two methods are available. They can be combined or used separately:

  • Use the note_datetime field associated with each consultation report
  • Extract the consultation report date by using NLP

An important remark

Be careful when using the note_datetime field: it can represent the date a document was last modified (i.e. its value changes if the clinician later adds information to the document).

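from eds_scikit.event import get_consultation_dates
from eds_scikit.io import HiveData

# DBNAME is the name of the database to load
data = HiveData(DBNAME)

# note_nlp must be generated beforehand with a consultation_date column
consultation_dates = get_consultation_dates(
    data.visit_occurrence,
    note=data.note,
    note_nlp=note_nlp,
    algo=["nlp"],
)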

The snippet above requires a note_nlp DataFrame with a consultation_date column, which has to be generated beforehand (see below for more information).
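
To make the expected shape of note_nlp concrete, here is a minimal sketch that builds one with pandas only. It is not the EDS-NLP pipeline: a naive regular expression on dd/mm/yyyy dates stands in for it, and it keeps every date it finds rather than only consultation dates. The toy note table, its note_text column and the extract_dates helper are assumptions made for the illustration.

```python
import re

import pandas as pd

# Toy consultation reports (assumed data, with an assumed note_text column).
note = pd.DataFrame(
    {
        "note_id": [1, 2],
        "note_text": [
            "Consultation on 03/02/2021. Patient seen again on 10/02/2021.",
            "Report without any explicit date.",
        ],
    }
)

# Naive stand-in for the NLP step: match dd/mm/yyyy dates.
DATE_PATTERN = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")


def extract_dates(note_id, text):
    """Return one record per date found in the text (hypothetical helper)."""
    return [
        {
            "note_id": note_id,
            "consultation_date": pd.Timestamp(
                year=int(m.group(3)), month=int(m.group(2)), day=int(m.group(1))
            ),
            # Character offset of the end of the match, needed for "first"
            "end": m.end(),
        }
        for m in DATE_PATTERN.finditer(text)
    ]


records = [
    record
    for row in note.itertuples()
    for record in extract_dates(row.note_id, row.note_text)
]
note_nlp = pd.DataFrame(records, columns=["note_id", "consultation_date", "end"])
```

The resulting DataFrame has one row per extracted date, with the note_id, consultation_date and end columns that the "nlp" algorithm expects.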

Consultation pipe

A consultation date pipeline exists and is particularly suited for this task. Moreover, methods are available to run an EDS-NLP pipeline on a Pandas, Spark or even Koalas DataFrame!

We can check the various exposed parameters if needed:

Extract consultation dates. See the implementation details of the algorithm(s) you want to use.

PARAMETER DESCRIPTION
vo

visit_occurrence DataFrame

TYPE: DataFrame

note

note DataFrame

TYPE: DataFrame

note_nlp

note_nlp DataFrame, used only with the "nlp" algo

TYPE: Optional[DataFrame] DEFAULT: None

algo

Algorithm(s) to use to determine consultation dates. Multiple algorithms can be provided as a list. Accepted values are "nlp" and "structured" (each is described under the available algorithms).

TYPE: Union[str, List[str]] DEFAULT: ['nlp']

max_timedelta

If two extracted consultations are spaced by less than max_timedelta, we consider that they correspond to the same event and only keep the first one.

TYPE: timedelta DEFAULT: timedelta(days=7)

structured_config

A dictionary of parameters to use with the structured algorithm

TYPE: Dict[str, Any] DEFAULT: dict()

nlp_config

A dictionary of parameters to use with the nlp algorithm

TYPE: Dict[str, Any] DEFAULT: dict()

RETURNS DESCRIPTION
DataFrame

Event type DataFrame with the following columns:

  • person_id
  • visit_occurrence_id
  • CONSULTATION_DATE: corresponds to the note_datetime value of a consultation report coming from the considered visit.
  • CONSULTATION_NOTE_ID: the note_id of the corresponding report.
  • CONSULTATION_DATE_EXTRACTION: the method of extraction
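
The effect of max_timedelta can be illustrated with a small pandas sketch. This is not the library's implementation: it assumes that each date is compared with the last date kept for the same visit, and the drop_close_dates helper and the toy data are made up for the example.

```python
from datetime import timedelta

import pandas as pd


def drop_close_dates(dates, max_timedelta=timedelta(days=7)):
    """Keep a date only if it is at least max_timedelta after the last kept one."""
    kept = []
    for date in sorted(dates):
        if not kept or date - kept[-1] >= max_timedelta:
            kept.append(date)
    return kept


# Toy extracted consultations: visit 1 has two dates only two days apart.
consultations = pd.DataFrame(
    {
        "visit_occurrence_id": [1, 1, 1, 2],
        "CONSULTATION_DATE": pd.to_datetime(
            ["2021-01-04", "2021-01-06", "2021-02-01", "2021-03-15"]
        ),
    }
)

rows = []
for visit_id, group in consultations.groupby("visit_occurrence_id"):
    for date in drop_close_dates(group["CONSULTATION_DATE"]):
        rows.append({"visit_occurrence_id": visit_id, "CONSULTATION_DATE": date})

deduplicated = pd.DataFrame(rows)
```

With the default of 7 days, the consultation of 2021-01-06 is merged into the one of 2021-01-04, while the other two dates are kept as separate events.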

Available algorithms (values for "algo")

The "nlp" algorithm uses consultation dates extracted a priori from consultation reports to infer true consultation dates.

PARAMETER DESCRIPTION
note_nlp

A DataFrame with (at least) the following columns:

  • note_id
  • consultation_date
  • end, required only if dates_to_keep="first": it should store the character offset of the extracted date.

TYPE: DataFrame

dates_to_keep

How to handle multiple consultation dates found in the document:

  • min: keep the oldest one
  • first: keep the occurrence that appeared first in the text
  • all: keep all dates

TYPE: str, optional DEFAULT: 'min'

RETURNS DESCRIPTION
DataFrame

With 2 added columns corresponding to the following concepts:

  • CONSULTATION_DATE, containing the date
  • CONSULTATION_DATE_EXTRACTION, containing "NLP"
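
The three dates_to_keep strategies can be sketched with pandas as follows. This is an illustration of the behaviour described above under simple assumptions, not the library's code; the select_dates helper and the toy note_nlp are hypothetical.

```python
import pandas as pd

# Toy note_nlp: note 1 contains two dates, the later one appearing first in the text.
note_nlp = pd.DataFrame(
    {
        "note_id": [1, 1, 2],
        "consultation_date": pd.to_datetime(
            ["2021-03-10", "2021-01-05", "2021-06-01"]
        ),
        "end": [40, 120, 35],
    }
)


def select_dates(note_nlp, dates_to_keep="min"):
    """Select one date per note ("min" or "first") or keep them all ("all")."""
    if dates_to_keep == "all":
        return note_nlp.copy()
    if dates_to_keep == "min":
        sort_column = "consultation_date"  # oldest date comes first
    elif dates_to_keep == "first":
        sort_column = "end"  # smallest character offset comes first
    else:
        raise ValueError(f"Unknown value for dates_to_keep: {dates_to_keep!r}")
    return (
        note_nlp.sort_values(sort_column)
        .drop_duplicates("note_id", keep="first")
        .sort_values("note_id")
        .reset_index(drop=True)
    )


oldest = select_dates(note_nlp, "min")
first_in_text = select_dates(note_nlp, "first")
```

For note 1, "min" keeps 2021-01-05 (the oldest date) whereas "first" keeps 2021-03-10 (the date with the smallest end offset), which shows why the end column is only needed for "first".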

The "structured" algorithm uses the note_datetime value to infer true consultation dates.

PARAMETER DESCRIPTION
note

A note DataFrame with at least the following columns:

  • note_id
  • note_datetime
  • note_class_source_value if kept_note_class_source_value is not None
  • visit_occurrence_id if kept_visit_source_value is not None

TYPE: DataFrame

vo

A visit_occurrence DataFrame to provide if kept_visit_source_value is not None, with at least the following columns:

  • visit_occurrence_id
  • visit_source_value if kept_visit_source_value is not None

TYPE: Optional[DataFrame] DEFAULT: None

kept_note_class_source_value

Value(s) allowed for the note_class_source_value column.

TYPE: Optional[Union[str, List[str]]] DEFAULT: 'CR-CONS'

kept_visit_source_value

Value(s) allowed for the visit_source_value column.

TYPE: Optional[Union[str, List[str]]], optional DEFAULT: 'consultation externe'

RETURNS DESCRIPTION
DataFrame

With 2 added columns corresponding to the following concepts:

  • CONSULTATION_DATE, containing the date
  • CONSULTATION_DATE_EXTRACTION, containing "STRUCTURED"
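
The filtering logic described above can be approximated with plain pandas. The sketch below is an assumption-based illustration rather than the library's implementation: structured_dates is a hypothetical helper, the "OTHER" and "hospitalisation" values are made up, and note_datetime is copied as-is into CONSULTATION_DATE.

```python
import pandas as pd

note = pd.DataFrame(
    {
        "note_id": [10, 11, 12],
        "visit_occurrence_id": [1, 1, 2],
        "note_datetime": pd.to_datetime(["2021-01-04", "2021-01-20", "2021-02-11"]),
        "note_class_source_value": ["CR-CONS", "OTHER", "CR-CONS"],
    }
)
vo = pd.DataFrame(
    {
        "visit_occurrence_id": [1, 2],
        "visit_source_value": ["consultation externe", "hospitalisation"],
    }
)


def as_list(value):
    """Accept either a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def structured_dates(
    note,
    vo=None,
    kept_note_class_source_value="CR-CONS",
    kept_visit_source_value="consultation externe",
):
    kept = note
    if kept_note_class_source_value is not None:
        allowed_classes = as_list(kept_note_class_source_value)
        kept = kept[kept["note_class_source_value"].isin(allowed_classes)]
    if kept_visit_source_value is not None:
        # vo is only needed when filtering on the visit type
        allowed_visits = vo.loc[
            vo["visit_source_value"].isin(as_list(kept_visit_source_value)),
            "visit_occurrence_id",
        ]
        kept = kept[kept["visit_occurrence_id"].isin(allowed_visits)]
    return kept.assign(
        CONSULTATION_DATE=kept["note_datetime"],
        CONSULTATION_DATE_EXTRACTION="STRUCTURED",
    )


result = structured_dates(note, vo)
```

Here only note 10 is kept: note 11 is not a consultation report, and note 12 belongs to a visit whose visit_source_value is not allowed. Passing kept_visit_source_value=None skips the visit filter, in which case vo is not needed.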