edsnlp.training.optimizer
LinearSchedule
Bases: Schedule
Linear schedule for a parameter group. The schedule linearly increases the value from start_value to max_value during the first warmup_rate fraction of total_steps, then linearly decreases it to 0.
Parameters
PARAMETER | DESCRIPTION |
---|---|
total_steps | The total number of steps, usually used to calculate ratios. TYPE: |
max_value | The maximum value to reach. TYPE: |
start_value | The initial value. TYPE: |
path | The path to the attribute to set. TYPE: |
warmup_rate | The fraction of total_steps spent in the warmup phase. TYPE: |
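The warmup-then-decay shape described above can be sketched as a standalone function. This is a minimal illustration, not edsnlp's implementation: the function name `linear_schedule_value` and its `step` argument are hypothetical, and the sketch assumes the decay reaches 0 exactly at `total_steps`.

```python
def linear_schedule_value(step, *, total_steps, max_value, start_value, warmup_rate):
    """Illustrative value of a linear warmup/decay schedule at a given step.

    Hypothetical helper: it only mirrors the behaviour described in the text.
    """
    # Number of steps spent warming up (may be fractional).
    warmup_steps = total_steps * warmup_rate

    if step < warmup_steps:
        # Warmup: interpolate linearly from start_value to max_value.
        progress = step / warmup_steps
        return start_value + (max_value - start_value) * progress

    decay_steps = total_steps - warmup_steps
    if decay_steps <= 0:
        # Assumption: with no decay phase, stay at max_value.
        return max_value

    # Decay: interpolate linearly from max_value down to 0.
    progress = min((step - warmup_steps) / decay_steps, 1.0)
    return max_value * (1.0 - progress)


# Same settings as a 1000-step run with a 20% warmup.
config = dict(total_steps=1000, max_value=5e-4, start_value=0.0, warmup_rate=0.2)
for step in (0, 100, 200, 600, 1000):
    print(step, linear_schedule_value(step, **config))
```

With these settings the value peaks at step 200 (20% of 1000 steps) and is back to half of `max_value` at step 600, the midpoint of the decay phase.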
ScheduledOptimizer
Bases: Optimizer
Wrapper optimizer that supports schedules for the parameters and easy parameter selection, using the keys of the groups dictionary as regex patterns to match the parameter names.
Schedules are defined directly in the groups, in place of the scheduled value.
Examples
from edsnlp.training.optimizer import ScheduledOptimizer

# `model` is the module to optimize, built beforehand.
optim = ScheduledOptimizer(
    optim="adamw",
    module=model,
    groups={
        # Exclude all parameters matching 'bias' from optimization.
        "bias": False,
        # Parameters starting with 'transformer' receive this learning rate
        # schedule. If a parameter matches both 'transformer' and 'ner',
        # the 'transformer' settings take precedence due to the order.
        "^transformer": {
            "lr": {
                "@schedules": "linear",
                "start_value": 0.0,
                "max_value": 5e-4,
                "warmup_rate": 0.2,
            },
        },
        # Parameters starting with 'ner' receive this learning rate schedule,
        # unless a 'lr' value has already been set by an earlier selector.
        "^ner": {
            "lr": {
                "@schedules": "linear",
                "start_value": 0.0,
                "max_value": 1e-4,
                "warmup_rate": 0.2,
            },
        },
        # Apply a weight_decay of 0.01 to all parameters not excluded.
        # This setting doesn't conflict with others and applies to all.
        "": {
            "weight_decay": 0.01,
        },
    },
    total_steps=1000,
)
Parameters
PARAMETER | DESCRIPTION |
---|---|
optim | The optimizer to use. If a string (like "adamw") or a type to instantiate, the TYPE: |
module | The module to optimize. Usually the TYPE: |
total_steps | The total number of steps, used for schedules. TYPE: |
groups | The groups to optimize. The key is a regex selector to match parameters in The matching is performed by running TYPE: |
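To make the selector semantics concrete, here is a rough sketch of how regex keys could be resolved into per-parameter settings. It is not edsnlp's code: `resolve_groups` is a hypothetical helper, and it assumes `re.search` matching, that `False` excludes a parameter outright, and that the first selector to set an option wins, as the comments in the example suggest.

```python
import re


def resolve_groups(param_names, groups):
    """Map each parameter name to its merged settings, or None if excluded.

    Hypothetical helper illustrating ordered regex selection.
    """
    resolved = {}
    for name in param_names:
        settings = {}
        # Dicts preserve insertion order, so earlier selectors come first.
        for selector, group in groups.items():
            if re.search(selector, name) is None:
                continue
            if group is False:
                # Assumption: an exclusion stops any further matching.
                settings = None
                break
            for key, value in group.items():
                # Keep the value set by an earlier selector, if any.
                settings.setdefault(key, value)
        resolved[name] = settings
    return resolved


groups = {
    "bias": False,
    "^transformer": {"lr": 5e-4},
    "^ner": {"lr": 1e-4},
    "": {"weight_decay": 0.01},
}
names = [
    "transformer.layer.weight",
    "transformer.layer.bias",
    "ner.linear.weight",
    "embedding.weight",
]
for name, settings in resolve_groups(names, groups).items():
    print(name, settings)
```

In this sketch the empty selector `""` matches every name, so `weight_decay` reaches all parameters that were not excluded, while `lr` is only set for names matched by an earlier, more specific selector.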