trainer_callback
Callbacks to use with the Trainer class and customize the training loop.
-
class TrainerState(epoch: Optional[float] = None, global_step: int = 0, max_steps: int = 0, num_train_epochs: int = 0, total_flos: float = 0, log_history: Optional[List[Dict[str, float]]] = None, best_metric: Optional[float] = None, best_model_checkpoint: Optional[str] = None, is_local_process_zero: bool = True, is_world_process_zero: bool = True, trial_name: Optional[str] = None, trial_params: Optional[Dict[str, Union[str, float, int, bool]]] = None)
Bases: object
A class containing the Trainer inner state that is saved along with the model and optimizer when checkpointing, and passed to the TrainerCallback.
Tip: Throughout this class, one step means one update step. When using gradient accumulation, one update step may require several forward and backward passes: with gradient_accumulation_steps=n, one update step requires going through n batches.
- Parameters
epoch (float, optional) – Only set during training. Represents the epoch the training is at; the decimal part is the fraction of the current epoch completed.
global_step (int, optional, defaults to 0) – During training, the number of update steps completed.
max_steps (int, optional, defaults to 0) – The number of update steps to do during the current training.
total_flos (float, optional, defaults to 0) – The total number of floating-point operations done by the model since the beginning of training (stored as a float to avoid overflow).
log_history (List[Dict[str, float]], optional) – The list of logs done since the beginning of training.
best_metric (float, optional) – When tracking the best model, the value of the best metric encountered so far.
best_model_checkpoint (str, optional) – When tracking the best model, the name of the checkpoint for the best model encountered so far.
is_local_process_zero (bool, optional, defaults to True) – Whether or not this process is the local main process (e.g., the main process on one machine when training in a distributed fashion on several machines).
is_world_process_zero (bool, optional, defaults to True) – Whether or not this process is the global main process (when training in a distributed fashion on several machines, this is True for only one process).
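The relationship between batches, update steps, and the fractional epoch described above can be illustrated with a small sketch in plain Python. It does not use PaddleNLP; the helper names `update_steps` and `fractional_epoch` are hypothetical, and ignoring a trailing partial group of batches is an assumption of this sketch rather than documented Trainer behavior.

```python
def update_steps(num_batches: int, gradient_accumulation_steps: int) -> int:
    """Full update steps obtained from `num_batches` forward/backward passes.

    Assumption: a trailing partial group of batches does not trigger an update.
    """
    return num_batches // gradient_accumulation_steps


def fractional_epoch(global_step: int, steps_per_epoch: int) -> float:
    """Epoch value whose decimal part is the fraction of the current epoch done."""
    return global_step / steps_per_epoch


# 100 batches per epoch with gradient_accumulation_steps=4
steps_per_epoch = update_steps(100, 4)

# After 30 update steps, one full epoch plus part of the next has been completed.
epoch = fractional_epoch(30, steps_per_epoch)
print(steps_per_epoch, epoch)
```

Here `global_step` counts update steps, not batches, which is why 100 batches yield only 25 steps per epoch.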
-
class TrainerControl(should_training_stop: bool = False, should_epoch_stop: bool = False, should_save: bool = False, should_evaluate: bool = False, should_log: bool = False)
Bases: object
A class that handles the Trainer control flow. This class is used by the TrainerCallback to activate some switches in the training loop.
- Parameters
should_training_stop (bool, optional, defaults to False) – Whether or not the training should be interrupted. If True, this variable will not be set back to False. The training will just stop.
should_epoch_stop (bool, optional, defaults to False) – Whether or not the current epoch should be interrupted. If True, this variable will be set back to False at the beginning of the next epoch.
should_save (bool, optional, defaults to False) – Whether or not the model should be saved at this step. If True, this variable will be set back to False at the beginning of the next step.
should_evaluate (bool, optional, defaults to False) – Whether or not the model should be evaluated at this step. If True, this variable will be set back to False at the beginning of the next step.
should_log (bool, optional, defaults to False) – Whether or not the logs should be reported at this step. If True, this variable will be set back to False at the beginning of the next step.
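The reset rules above can be summarized in a short sketch. `ControlSketch` and its `new_step` / `new_epoch` methods are hypothetical stand-ins written for illustration; they are not the PaddleNLP class, and only the five documented flags are taken from the reference above.

```python
from dataclasses import dataclass


@dataclass
class ControlSketch:
    """Hypothetical stand-in mirroring the five documented switches."""

    should_training_stop: bool = False
    should_epoch_stop: bool = False
    should_save: bool = False
    should_evaluate: bool = False
    should_log: bool = False

    def new_epoch(self) -> None:
        # Reset applied at the beginning of an epoch.
        self.should_epoch_stop = False

    def new_step(self) -> None:
        # Resets applied at the beginning of a step.
        self.should_save = False
        self.should_evaluate = False
        self.should_log = False


control = ControlSketch()
control.should_save = True
control.should_epoch_stop = True
control.should_training_stop = True

control.new_step()
control.new_epoch()
print(control)
```

After both resets only `should_training_stop` remains set, matching the statement that it is never set back to False.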
-
class TrainerCallback
Bases: object
A class for objects that inspect the state of the training loop at certain events and make decisions. At each of those events the following arguments are available:
- Parameters
args (TrainingArguments) – The training arguments used to instantiate the Trainer.
state (TrainerState) – The current state of the Trainer.
control (TrainerControl) – The object that is returned to the Trainer and can be used to make some decisions.
model (PreTrainedModel or paddle.nn.Layer) – The model being trained.
tokenizer (PreTrainedTokenizer) – The tokenizer used for encoding the data.
optimizer (paddle.optimizer.Optimizer) – The optimizer used for the training steps.
lr_scheduler (paddle.optimizer.lr.LRScheduler) – The scheduler used for setting the learning rate.
train_dataloader (paddle.io.DataLoader, optional) – The current dataloader used for training.
eval_dataloader (paddle.io.DataLoader, optional) – The current dataloader used for evaluation.
metrics (Dict[str, float]) – The metrics computed by the last evaluation phase. Those are only accessible in the event on_evaluate.
logs (Dict[str, float]) – The values to log. Those are only accessible in the event on_log.
The control object is the only one that can be changed by the callback, in which case the event that changes it should return the modified version.
The arguments args, state and control are positional for all events; all the others are grouped in kwargs. You can unpack the ones you need in the signature of the event that uses them. As an example, see the code of the simple PrinterCallback.
Example:
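```python
from paddlenlp.trainer.trainer_callback import TrainerCallback
from paddlenlp.utils.log import logger  # assumed import path for the logger

class PrinterCallback(TrainerCallback):
    def on_log(self, args, state, control, logs=None, **kwargs):
        _ = logs.pop("total_flos", None)  # drop the FLOs counter before printing
        if state.is_local_process_zero:
            logger.info(logs)
```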
-
on_init_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called at the end of the initialization of the Trainer.
-
on_train_begin(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called at the beginning of training.
-
on_train_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called at the end of training.
-
on_epoch_begin(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called at the beginning of an epoch.
-
on_epoch_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called at the end of an epoch.
-
on_step_begin(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called at the beginning of a training step. If using gradient accumulation, one training step might take several inputs.
-
on_substep_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called at the end of a substep during gradient accumulation.
-
on_step_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called at the end of a training step. If using gradient accumulation, one training step might take several inputs.
-
on_evaluate(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called after an evaluation phase.
-
on_save(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called after a checkpoint save.
-
on_log(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called after logging the last logs.
-
on_prediction_step(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called after a prediction step.
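As an illustration of an event that changes control and returns it, here is a minimal sketch that stops training after a fixed number of update steps. To stay runnable without PaddleNLP installed it uses hypothetical stand-ins (`StateSketch`, `ControlSketch`) and a hand-written loop; in real use the callback would subclass TrainerCallback and the Trainer would drive the events. `StopAfterNSteps` and `stop_step` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class StateSketch:
    """Hypothetical stand-in for TrainerState (only the field used here)."""

    global_step: int = 0


@dataclass
class ControlSketch:
    """Hypothetical stand-in for TrainerControl (only the flags used here)."""

    should_save: bool = False
    should_training_stop: bool = False


class StopAfterNSteps:
    """Would subclass TrainerCallback in real use; kept standalone for this sketch."""

    def __init__(self, stop_step: int):
        self.stop_step = stop_step

    def on_step_end(self, args, state, control, **kwargs):
        # args, state and control are positional; everything else arrives in kwargs.
        if state.global_step >= self.stop_step:
            control.should_save = True
            control.should_training_stop = True
        # Return the modified control so the caller sees the change.
        return control


callback = StopAfterNSteps(stop_step=3)
control = ControlSketch()
for step in range(1, 6):
    control = callback.on_step_end(None, StateSketch(global_step=step), control)
    if control.should_training_stop:
        break
print(step, control)
```

The loop only imitates how a trainer might consult `should_training_stop` after each step; the actual loop is owned by the Trainer.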
-
class CallbackHandler(callbacks, model, tokenizer, optimizer, lr_scheduler)
Bases: paddlenlp.trainer.trainer_callback.TrainerCallback
Internal class that just calls the list of callbacks in order.
-
on_init_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl)
Event called at the end of the initialization of the Trainer.
-
on_train_begin(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl)
Event called at the beginning of training.
-
on_train_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl)
Event called at the end of training.
-
on_epoch_begin(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl)
Event called at the beginning of an epoch.
-
on_epoch_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl)
Event called at the end of an epoch.
-
on_step_begin(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl)
Event called at the beginning of a training step. If using gradient accumulation, one training step might take several inputs.
-
on_substep_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl)
Event called at the end of a substep during gradient accumulation.
-
on_step_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl)
Event called at the end of a training step. If using gradient accumulation, one training step might take several inputs.
-
on_evaluate(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, metrics)
Event called after an evaluation phase.
-
on_save(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl)
Event called after a checkpoint save.
-
on_log(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, logs, **kwargs)
Event called after logging the last logs.
-
on_prediction_step(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl)
Event called after a prediction step.
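The idea of "calling the list of callbacks in order" can be sketched as follows. This is not the PaddleNLP implementation: `HandlerSketch`, `call_event`, and `Recorder` are hypothetical names, and treating a `None` return value as "control unchanged" is an assumption made for this sketch.

```python
class HandlerSketch:
    """Hypothetical dispatcher that forwards one event to every callback in order."""

    def __init__(self, callbacks):
        self.callbacks = list(callbacks)

    def call_event(self, event, args, state, control, **kwargs):
        for callback in self.callbacks:
            result = getattr(callback, event)(args, state, control, **kwargs)
            # Assumption: a callback that returns None leaves control unchanged.
            if result is not None:
                control = result
        return control


class Recorder:
    """Toy callback that records its name each time on_step_end fires."""

    def __init__(self, name, calls):
        self.name = name
        self.calls = calls

    def on_step_end(self, args, state, control, **kwargs):
        self.calls.append(self.name)


calls = []
handler = HandlerSketch([Recorder("first", calls), Recorder("second", calls)])
control = handler.call_event("on_step_end", None, None, {"should_log": False})
print(calls, control)
```

Because the callbacks run sequentially, a later callback sees any change an earlier one made to control.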
-
-
class DefaultFlowCallback
Bases: paddlenlp.trainer.trainer_callback.TrainerCallback
A TrainerCallback that handles the default flow of the training loop for logs, evaluation and checkpoints.
-
on_step_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called at the end of a training step. If using gradient accumulation, one training step might take several inputs.
-
on_epoch_end(args: paddlenlp.trainer.training_args.TrainingArguments, state: paddlenlp.trainer.trainer_callback.TrainerState, control: paddlenlp.trainer.trainer_callback.TrainerControl, **kwargs)
Event called at the end of an epoch.
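One plausible shape for such a default flow is a step-interval check that flips the TrainerControl switches. The sketch below is an assumption-laden illustration, not the library's code: the interval names `logging_steps`, `eval_steps`, and `save_steps`, the stop condition on `max_steps`, and the function `default_flow_on_step_end` are all assumptions introduced here, and `SimpleNamespace` objects stand in for the real arguments, state, and control.

```python
from types import SimpleNamespace


def default_flow_on_step_end(args, state, control):
    """Hypothetical step-end policy: set switches on fixed step intervals."""
    if args.logging_steps > 0 and state.global_step % args.logging_steps == 0:
        control.should_log = True
    if args.eval_steps > 0 and state.global_step % args.eval_steps == 0:
        control.should_evaluate = True
    if args.save_steps > 0 and state.global_step % args.save_steps == 0:
        control.should_save = True
    # Assumed stop condition: all planned update steps are done.
    if state.global_step >= state.max_steps:
        control.should_training_stop = True
    return control


args = SimpleNamespace(logging_steps=10, eval_steps=50, save_steps=100)
state = SimpleNamespace(global_step=50, max_steps=1000)
control = SimpleNamespace(
    should_log=False,
    should_evaluate=False,
    should_save=False,
    should_training_stop=False,
)
control = default_flow_on_step_end(args, state, control)
print(control)
```

At step 50 this policy requests logging and evaluation but not a checkpoint, since 50 is not a multiple of the assumed `save_steps` value.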
-
-
class ProgressCallback
Bases: paddlenlp.trainer.trainer_callback.TrainerCallback
A TrainerCallback that displays the progress of training or evaluation.
-
on_step_end(args, state, control, **kwargs)
Event called at the end of a training step. If using gradient accumulation, one training step might take several inputs.
-
on_prediction_step(args, state, control, eval_dataloader=None, **kwargs)
Event called after a prediction step.
-
-
class PrinterCallback
Bases: paddlenlp.trainer.trainer_callback.TrainerCallback
A bare TrainerCallback that just prints the logs.
-
class EarlyStoppingCallback(early_stopping_patience: int = 1, early_stopping_threshold: Optional[float] = 0.0)
Bases: paddlenlp.trainer.trainer_callback.TrainerCallback
A TrainerCallback that handles early stopping.
- Parameters
early_stopping_patience (int) – Use with metric_for_best_model to stop training when the specified metric worsens for early_stopping_patience evaluation calls.
early_stopping_threshold (float, optional) – Use with the TrainingArguments metric_for_best_model and early_stopping_patience to denote how much the specified metric must improve to satisfy early stopping conditions.
This callback depends on the TrainingArguments argument load_best_model_at_end to set best_metric in TrainerState.