Loggers
When training a model, it is important to keep track of the training process, model performance at different stages, and statistics about the training data over time. This is where loggers come in. Loggers are used to store such information to be able to analyze and visualize it later.
The EDS-NLP training API (edsnlp.train) relies on accelerate's integration of popular loggers, as well as a few custom loggers. You can configure loggers in edsnlp.train via the logger parameter of the train function by specifying:
- a string, a class instance, or a partially initialized class instance of a logger,
- or a list of strings / logger instances.
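To make the accepted forms concrete, here is a minimal sketch of how a `logger` value could be normalized into a list of logger objects. The toy classes, the `TOY_REGISTRY` dictionary and the `resolve_loggers` helper are all hypothetical names introduced for illustration; they are not part of EDS-NLP, which does its own resolution inside `edsnlp.train`.

```python
# Hypothetical stand-ins for real logger classes (illustration only).
class ToyCSVLogger:
    pass


class ToyTensorBoardLogger:
    pass


# Hypothetical registry mapping string shorthands to logger classes.
TOY_REGISTRY = {"csv": ToyCSVLogger, "tensorboard": ToyTensorBoardLogger}


def resolve_loggers(logger):
    """Accept a single spec or a list of specs and return logger instances."""
    specs = logger if isinstance(logger, (list, tuple)) else [logger]
    # Strings are looked up in the registry; instances are passed through.
    return [TOY_REGISTRY[s]() if isinstance(s, str) else s for s in specs]


single = resolve_loggers("csv")
mixed = resolve_loggers(["csv", ToyTensorBoardLogger()])
```

In this sketch a single string and a mixed list of strings and instances both end up as a plain list of logger objects.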
Draft objects
edsnlp.train can provide a default project_name and logging_dir for loggers that require these parameters. For these loggers, if you don't want to set the project name yourself, you can either:
- call `CSVLogger.draft(...)` with the normal init parameters minus the `project_name` or `logging_dir` parameters. This returns a `Draft[CSVLogger]` object, which will be instantiated later, once the required parameters are available,
- or use `"@loggers": csv !draft` in the config file, which is the config-file equivalent of the `.draft()` method above,
- or use the string shorthands `logger: ["csv", "tensorboard", ...]`, which will use the default project name and logging dir.
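The idea behind a draft object can be sketched in plain Python: keep the arguments that are already known, and build the real object only once the missing ones are supplied. The `Draft` and `ToyCSVLogger` classes and the `instantiate` method below are hypothetical and written for this illustration; they are not the EDS-NLP implementation.

```python
from typing import Any, Callable, Dict


class Draft:
    """Hypothetical deferred constructor call (not the EDS-NLP class)."""

    def __init__(self, factory: Callable[..., Any], **kwargs: Any) -> None:
        self.factory = factory
        self.kwargs: Dict[str, Any] = kwargs

    def instantiate(self, **defaults: Any) -> Any:
        # Explicitly given arguments win; defaults only fill the gaps.
        return self.factory(**{**defaults, **self.kwargs})


class ToyCSVLogger:
    """Toy logger that requires project_name and logging_dir."""

    def __init__(self, project_name: str, logging_dir: str,
                 file_name: str = "metrics.csv") -> None:
        self.project_name = project_name
        self.logging_dir = logging_dir
        self.file_name = file_name

    @classmethod
    def draft(cls, **kwargs: Any) -> Draft:
        # Returns a placeholder instead of failing on missing parameters.
        return Draft(cls, **kwargs)


# The user only sets what they care about...
draft = ToyCSVLogger.draft(file_name="run.csv")
# ...and the training loop later provides the missing defaults.
logger = draft.instantiate(project_name="my-project", logging_dir="artifact")
```

The design choice illustrated here is that user-provided arguments take precedence, while the training function only supplies values for parameters the user left out.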
The supported loggers are listed below.
RichLogger
A logger that displays logs in a Rich-based table using rich-logger. This logger is also available via the loggers registry as rich.
No Disk Logging
This logger doesn't save logs to disk. It's meant for displaying logs in a pretty table during training. If you need to save logs to disk, consider combining this logger with any other logger.
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| fields | Field descriptors, given as a dictionary. Each key is a regex used to match the fields to log, and each value describes the matched fields: goal ("lower_is_better" or "higher_is_better"), format and display name. This defaults to a set of metrics and stats that are commonly logged during EDS-NLP training. |
| key | Key to group the logs. |
| hijack_tqdm | Whether to replace the tqdm progress bar with a rich progress bar, which integrates better with the rich table. |
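As an illustration of the `fields` parameter, the sketch below builds a dictionary of regex keys and field descriptors and shows how a logged key could be matched and formatted. The descriptor key names (`"goal"`, `"format"`, `"name"`), the metric names and the use of `re.fullmatch` are assumptions made for this example; check the rich-logger documentation for the exact scheme.

```python
import re

# Hypothetical field descriptors: key names and metric names are assumed.
fields = {
    r"loss": {"goal": "lower_is_better", "format": "{:.2e}", "name": "Loss"},
    r".*/f1": {"goal": "higher_is_better", "format": "{:.2%}", "name": "F1"},
}


def describe(key, value):
    """Return (display name, formatted value) for the first matching regex."""
    for pattern, spec in fields.items():
        if re.fullmatch(pattern, key):
            return spec["name"], spec["format"].format(value)
    return None  # in this sketch, unmatched fields are simply not displayed


f1_cell = describe("ner/f1", 0.875)
loss_cell = describe("loss", 0.00123)
hidden = describe("grad_norm", 3.0)
```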
CSVLogger
A simple CSV-based logger that writes logs to a CSV file. By default, with edsnlp.train, the CSV file is written to ${CWD}/artifact/metrics.csv.
Consistent Keys
This logger expects that the values dictionary passed to log has consistent keys across all calls. If a new key is encountered in a subsequent call, it will be ignored and a warning will be issued.
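The consistent-keys behavior described above can be sketched with the standard `csv` and `warnings` modules. The `ConsistentKeysCSVLogger` class is a hypothetical toy written for this example, not the EDS-NLP `CSVLogger`: the columns are fixed by the first call, and keys that appear later are dropped with a warning.

```python
import csv
import os
import tempfile
import warnings


class ConsistentKeysCSVLogger:
    """Toy CSV logger whose columns are frozen by the first log call."""

    def __init__(self, logging_dir, file_name="metrics.csv"):
        os.makedirs(logging_dir, exist_ok=True)
        self.path = os.path.join(logging_dir, file_name)
        self.columns = None

    def log(self, values):
        if self.columns is None:
            # First call: write the header and freeze the column set.
            self.columns = list(values)
            with open(self.path, "w", newline="") as f:
                csv.writer(f).writerow(self.columns)
        extra = [k for k in values if k not in self.columns]
        if extra:
            warnings.warn(f"Ignoring new keys: {extra}")
        # Append one row, keeping only the known columns.
        with open(self.path, "a", newline="") as f:
            csv.writer(f).writerow([values.get(k, "") for k in self.columns])


logger = ConsistentKeysCSVLogger(tempfile.mkdtemp())
logger.log({"step": 1, "loss": 0.9})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    logger.log({"step": 2, "loss": 0.7, "lr": 1e-4})  # "lr" is new

with open(logger.path, newline="") as f:
    rows = list(csv.reader(f))
```

If a metric only becomes available later in training, one way to stay within this constraint is to log it from the first step with a placeholder value.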
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| logging_dir | Directory in which to store the CSV. |
| file_name | Name of the CSV file. Defaults to "metrics.csv". |
JSONLogger
A simple JSON-based logger that writes logs to a JSON file as a list of dictionaries. By default, with edsnlp.train, the JSON file is written to ${CWD}/artifact/metrics.json.
This method is not recommended for large and frequent logging, as it re-writes the entire JSON file on every call. Prefer CSVLogger for frequent and heavy logging.
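To see why rewriting the file on every call becomes costly, consider the following sketch. The `ToyJSONLogger` class is hypothetical and only mimics the behavior described above: every call serializes the full list of records again, so the amount of data written per call grows with the number of steps already logged.

```python
import json
import os
import tempfile


class ToyJSONLogger:
    """Toy logger that keeps all records and rewrites the file each time."""

    def __init__(self, logging_dir, file_name="metrics.json"):
        os.makedirs(logging_dir, exist_ok=True)
        self.path = os.path.join(logging_dir, file_name)
        self.records = []

    def log(self, values):
        self.records.append(dict(values))
        # The whole list is re-serialized, not just the new record.
        with open(self.path, "w") as f:
            json.dump(self.records, f)


logger = ToyJSONLogger(tempfile.mkdtemp())
sizes = []
for step in range(3):
    logger.log({"step": step, "loss": 1.0 / (step + 1)})
    sizes.append(os.path.getsize(logger.path))  # bytes written by this call

with open(logger.path) as f:
    saved = json.load(f)
```

A CSV file, by contrast, can be extended by appending a single row, which is why an append-based format suits frequent logging better.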
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| logging_dir | Directory in which to store the JSON file. |
| file_name | Name of the JSON file. Defaults to "metrics.json". |
TensorBoardLogger
Logger for TensorBoard. This logger is also available via the loggers registry as tensorboard.
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| project_name | Name of the project. |
| logging_dir | Directory in which to store the TensorBoard logs. Logs of different runs will be stored in |
| kwargs | Additional keyword arguments to pass to |
AimLogger
Logger for Aim.
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| project_name | Name of the project. |
| logging_dir | Directory in which to store the Aim logs. The environment variable |
| kwargs | Additional keyword arguments to pass to the Aim init function. |
WandBLogger
Logger for Weights & Biases. This logger is also available via the loggers registry as wandb.
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| project_name | Name of the project. This will become the |
| kwargs | Additional keyword arguments to pass to the WandB init function. |
MLflowLogger
Logger for MLflow. This logger is also available via the loggers registry as mlflow.
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| project_name | Name of the project. This will become the MLflow experiment name. |
| logging_dir | Directory in which to store the MLflow logs. |
| run_id | If specified, get the run with the specified UUID and log parameters and metrics under that run. The run's end time is unset and its status is set to running, but the run's other attributes (source_version, source_type, etc.) are not changed. Environment variable MLFLOW_RUN_ID has priority over this argument. |
| tags | An optional |
| nested_run | Controls whether run is nested in parent run. True creates a nested run. Environment variable MLFLOW_NESTED_RUN has priority over this argument. |
| run_name | Name of new run (stored as a mlflow.runName tag). Used only when |
| description | An optional string that populates the description box of the run. If a run is being resumed, the description is set on the resumed run. If a new run is being created, the description is set on the new run. |
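The precedence rule stated for `run_id` (the `MLFLOW_RUN_ID` environment variable wins over the argument) can be sketched as follows. The `resolve_run_id` helper is a hypothetical name used for illustration; it is not a function of EDS-NLP or MLflow, and the actual lookup may differ in detail.

```python
import os
from typing import Optional


def resolve_run_id(run_id: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: the environment variable overrides the argument."""
    return os.environ.get("MLFLOW_RUN_ID", run_id)


# Without the environment variable, the argument is used.
os.environ.pop("MLFLOW_RUN_ID", None)
from_argument = resolve_run_id("abc123")

# With the environment variable set, it takes priority.
os.environ["MLFLOW_RUN_ID"] = "from-env"
from_environment = resolve_run_id("abc123")

# Clean up so the example leaves the environment unchanged.
os.environ.pop("MLFLOW_RUN_ID", None)
```

This kind of precedence is convenient on job schedulers, where the run to resume is injected through the environment without touching the training config.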
CometMLLogger
Logger for CometML. This logger is also available via the loggers registry as cometml.
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| project_name | Name of the project. |
| kwargs | Additional keyword arguments to pass to the CometML Experiment object. |