Engines

Multi-GPU data parallel

monai.engines.multi_gpu_supervised_trainer.create_multigpu_supervised_evaluator(net, metrics=None, devices=None, non_blocking=False, prepare_batch=<function _prepare_batch>, output_transform=<function _default_eval_transform>, distributed=False)[source]

Derived from create_supervised_evaluator in Ignite.

Factory function for creating an evaluator for supervised models.

Parameters
  • net (Module) – the model to evaluate.

  • metrics (Optional[Dict[str, Metric]]) – a map of metric names to Metrics.

  • devices (Optional[Sequence[device]]) – device(s) type specification (default: None). Applies to both model and batches. None uses all available devices; an empty list uses the CPU only.

  • non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.

  • prepare_batch (Callable) – function that receives batch, device, non_blocking and outputs tuple of tensors (batch_x, batch_y).

  • output_transform (Callable) – function that receives ‘x’, ‘y’, ‘y_pred’ and returns the value to be assigned to the engine’s state.output after each iteration. The default returns (y_pred, y), which fits the output expected by metrics. If you change it, set a matching output_transform on the metrics.

  • distributed (bool) – whether to convert the model to DistributedDataParallel. If there are multiple devices, the first device is used as the output device.

Note

engine.state.output for this engine is defined by the output_transform parameter and is a tuple of (batch_pred, batch_y) by default.
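A minimal sketch of how a custom output_transform and a metric-side output_transform can be paired. Plain Python values stand in for tensors, and the function names default_like_transform, dict_eval_transform and unpack_for_metric are hypothetical; only the (x, y, y_pred) argument order and the default (y_pred, y) result come from the description above.

```python
def default_like_transform(x, y, y_pred):
    """Mirror the documented default: return (y_pred, y)."""
    return y_pred, y


def dict_eval_transform(x, y, y_pred):
    """Hypothetical custom transform that also keeps the input."""
    return {"image": x, "pred": y_pred, "label": y}


def unpack_for_metric(output):
    """Hypothetical metric-side transform: recover the (y_pred, y) pair."""
    return output["pred"], output["label"]


# Stand-in values; in a real run these would be tensors from the engine.
x, y, y_pred = "image", 1, 0.9
state_output = dict_eval_transform(x, y, y_pred)

# The metric sees the same pair it would have seen with the default transform.
assert unpack_for_metric(state_output) == default_like_transform(x, y, y_pred)
```

Under these assumptions, dict_eval_transform would be passed as output_transform to the factory, and unpack_for_metric as the output_transform of each metric.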

Returns

an evaluator engine with supervised inference function.

Return type

Engine
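The prepare_batch callable described for this factory can be sketched as follows. This is an illustration rather than the library default: the "image" and "label" keys are assumptions about the batch layout, and the only tensor method relied on is to(device=..., non_blocking=...).

```python
def dict_prepare_batch(batch, device=None, non_blocking=False):
    """Hypothetical prepare_batch: pick two entries and move them to `device`.

    Returns the (batch_x, batch_y) tuple that the evaluator expects.
    """
    batch_x = batch["image"].to(device=device, non_blocking=non_blocking)
    batch_y = batch["label"].to(device=device, non_blocking=non_blocking)
    return batch_x, batch_y
```

A function like this would be passed as prepare_batch=dict_prepare_batch when the data loader yields dictionaries instead of (x, y) pairs.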

monai.engines.multi_gpu_supervised_trainer.create_multigpu_supervised_trainer(net, optimizer, loss_fn, devices=None, non_blocking=False, prepare_batch=<function _prepare_batch>, output_transform=<function _default_transform>, distributed=False)[source]

Derived from create_supervised_trainer in Ignite.

Factory function for creating a trainer for supervised models.

Parameters
  • net (Module) – the network to train.

  • optimizer (Optimizer) – the optimizer to use.

  • loss_fn (Callable) – the loss function to use.

  • devices (Optional[Sequence[device]]) – device(s) type specification (default: None). Applies to both model and batches. None uses all available devices; an empty list uses the CPU only.

  • non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.

  • prepare_batch (Callable) – function that receives batch, device, non_blocking and outputs tuple of tensors (batch_x, batch_y).

  • output_transform (Callable) – function that receives ‘x’, ‘y’, ‘y_pred’, ‘loss’ and returns the value to be assigned to the engine’s state.output after each iteration. The default returns loss.item().

  • distributed (bool) – whether to convert the model to DistributedDataParallel. If there are multiple devices, the first device is used as the output device.

Returns

a trainer engine with supervised update function.

Return type

Engine

Note

engine.state.output for this engine is defined by the output_transform parameter and is the loss of the processed batch by default.
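As a rough illustration of the trainer-side output_transform, the sketch below contrasts the documented default (loss.item()) with a hypothetical transform that keeps more of the iteration's values. FakeLoss is a stand-in for a scalar loss tensor, and both function names are made up for this example.

```python
class FakeLoss:
    """Stand-in for a scalar loss tensor; only .item() is needed here."""

    def __init__(self, value):
        self.value = value

    def item(self):
        return self.value


def default_like_transform(x, y, y_pred, loss):
    """Mirror the documented default: keep only the scalar loss."""
    return loss.item()


def loss_and_pred_transform(x, y, y_pred, loss):
    """Hypothetical transform that keeps prediction and label as well."""
    return {"loss": loss.item(), "pred": y_pred, "label": y}


# Stand-in values; in a real run these would come from the training step.
output = loss_and_pred_transform("image", 1, 0.9, FakeLoss(0.25))
print(output["loss"])
```

With a dictionary output like this, any handler or metric reading engine.state.output has to pick out the entry it needs.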

Workflows

Workflow

class monai.engines.workflow.Workflow(device, max_epochs, data_loader, epoch_length=None, non_blocking=False, prepare_batch=<function default_prepare_batch>, iteration_update=None, post_transform=None, key_metric=None, additional_metrics=None, handlers=None, amp=False)[source]

Workflow defines the core work process and inherits from the Ignite engine. All trainers, validators and evaluators share this workflow as their base class, because they can all be treated as the same kind of Ignite engine loop. It initializes all the sharable data in the Ignite engine.state and attaches additional processing logic to the Ignite engine through the Event-Handler mechanism.

Users should consider inheriting from a trainer or evaluator to develop new trainers or evaluators.

Parameters
  • device (device) – an object representing the device on which to run.

  • max_epochs (int) – the total number of epochs for the engine to run; validators and evaluators run only 1 epoch.

  • data_loader (DataLoader) – the Ignite engine uses this data loader to run; it must be a torch.utils.data.DataLoader.

  • epoch_length (Optional[int]) – number of iterations for one epoch, default to len(data_loader).

  • non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.

  • prepare_batch (Callable) – function to parse image and label for every iteration.

  • iteration_update (Optional[Callable]) – the callable invoked for every iteration; it is expected to accept engine and batchdata as input parameters. If not provided, self._iteration() is used instead.

  • post_transform (Optional[Callable]) – additional transformation applied to the model output data, typically several Tensor-based transforms composed with Compose.

  • key_metric (Optional[Dict[str, Metric]]) – metric computed when every iteration completes, with the average value saved to engine.state.metrics when the epoch completes. key_metric is the main metric used to compare and save checkpoints to files.

  • additional_metrics (Optional[Dict[str, Metric]]) – more Ignite metrics that are also attached to the Ignite Engine.

  • handlers (Optional[Sequence]) – every handler is a set of Ignite Event-Handlers and must have an attach function, e.g. CheckpointHandler, StatsHandler, SegmentationSaver.

  • amp (bool) – whether to enable auto-mixed-precision training or inference, default is False.

Raises
  • TypeError – When device is not a torch.device.

  • TypeError – When data_loader is not a torch.utils.data.DataLoader.

  • TypeError – When key_metric is not an Optional[dict].

  • TypeError – When additional_metrics is not an Optional[dict].

run()[source]

Execute training, validation or evaluation based on Ignite Engine.

Return type

None
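The requirement that every handler has an attach function can be illustrated with a small handler class. This is a sketch under stated assumptions: EpochMetricsLogger is a hypothetical name, the engine is assumed to expose Ignite's add_event_handler(event_name, handler) together with engine.state.epoch and engine.state.metrics, and event would typically be ignite.engine.Events.EPOCH_COMPLETED.

```python
class EpochMetricsLogger:
    """Hypothetical handler that reports metrics whenever `event` fires."""

    def __init__(self, event, log=print):
        self.event = event  # e.g. ignite.engine.Events.EPOCH_COMPLETED
        self.log = log      # any callable taking a single string

    def attach(self, engine):
        # Register this object as the callback for the chosen event.
        engine.add_event_handler(self.event, self)

    def __call__(self, engine):
        # Invoked by the engine with itself as the only argument.
        self.log(f"epoch {engine.state.epoch}: {engine.state.metrics}")
```

An instance would then be listed in handlers (or train_handlers / val_handlers in the subclasses), on the assumption that the workflow calls attach on each entry.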

Trainer

class monai.engines.Trainer(device, max_epochs, data_loader, epoch_length=None, non_blocking=False, prepare_batch=<function default_prepare_batch>, iteration_update=None, post_transform=None, key_metric=None, additional_metrics=None, handlers=None, amp=False)[source]

Base class for all kinds of trainers, inherits from Workflow.

run()[source]

Execute training based on Ignite Engine. If this function is called multiple times, training continues from the previous state.

Return type

None

SupervisedTrainer

class monai.engines.SupervisedTrainer(device, max_epochs, train_data_loader, network, optimizer, loss_function, epoch_length=None, non_blocking=False, prepare_batch=<function default_prepare_batch>, iteration_update=None, inferer=None, post_transform=None, key_train_metric=None, additional_metrics=None, train_handlers=None, amp=False)[source]

Standard supervised training method with image and label, inherits from Trainer and Workflow.

Parameters
  • device (device) – an object representing the device on which to run.

  • max_epochs (int) – the total number of epochs for the trainer to run.

  • train_data_loader (DataLoader) – the Ignite engine uses this data loader to run; it must be a torch.utils.data.DataLoader.

  • network (Module) – the network to train.

  • optimizer (Optimizer) – the optimizer associated to the network.

  • loss_function (Callable) – the loss function associated to the optimizer.

  • epoch_length (Optional[int]) – number of iterations for one epoch, default to len(train_data_loader).

  • non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.

  • prepare_batch (Callable) – function to parse image and label for current iteration.

  • iteration_update (Optional[Callable]) – the callable invoked for every iteration; it is expected to accept engine and batchdata as input parameters. If not provided, self._iteration() is used instead.

  • inferer (Optional[Inferer]) – inference method that executes the model forward pass on input data, e.g. SlidingWindow.

  • post_transform (Optional[Transform]) – additional transformation applied to the model output data, typically several Tensor-based transforms composed with Compose.

  • key_train_metric (Optional[Dict[str, Metric]]) – metric computed when every iteration completes, with the average value saved to engine.state.metrics when the epoch completes. key_train_metric is the main metric used to compare and save checkpoints to files.

  • additional_metrics (Optional[Dict[str, Metric]]) – more Ignite metrics that are also attached to the Ignite Engine.

  • train_handlers (Optional[Sequence]) – every handler is a set of Ignite Event-Handlers and must have an attach function, e.g. CheckpointHandler, StatsHandler, SegmentationSaver.

  • amp (bool) – whether to enable auto-mixed-precision training, default is False.

GanTrainer

class monai.engines.GanTrainer(device, max_epochs, train_data_loader, g_network, g_optimizer, g_loss_function, d_network, d_optimizer, d_loss_function, epoch_length=None, g_inferer=None, d_inferer=None, d_train_steps=1, latent_shape=64, non_blocking=False, d_prepare_batch=<function default_prepare_batch>, g_prepare_batch=<function default_make_latent>, g_update_latents=True, iteration_update=None, post_transform=None, key_train_metric=None, additional_metrics=None, train_handlers=None)[source]

Generative adversarial network training based on Goodfellow et al. 2014 (https://arxiv.org/abs/1406.2661); inherits from Trainer and Workflow.

Training loop, for each batch of data of size m:
  1. Generate m fakes from random latent codes.

  2. Update discriminator with these fakes and current batch reals, repeated d_train_steps times.

  3. If g_update_latents, generate m fakes from new random latent codes.

  4. Update generator with these fakes using discriminator feedback.
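The four steps above can be sketched in plain Python. This is an illustration of the control flow only, not the trainer's implementation: gan_iteration and its callable arguments (make_latents, generate, update_d, update_g) are hypothetical stand-ins for the latent sampler, the generator forward pass and the two optimizer updates.

```python
import random


def gan_iteration(reals, make_latents, generate, update_d, update_g,
                  d_train_steps=1, g_update_latents=True):
    """Run one GAN training iteration following the four documented steps."""
    m = len(reals)

    # 1. Generate m fakes from random latent codes.
    latents = make_latents(m)
    fakes = generate(latents)

    # 2. Update the discriminator with these fakes and the current reals,
    #    repeated d_train_steps times.
    d_loss = None
    for _ in range(d_train_steps):
        d_loss = update_d(reals, fakes)

    # 3. Optionally generate m fakes from new random latent codes.
    if g_update_latents:
        latents = make_latents(m)
        fakes = generate(latents)

    # 4. Update the generator with these fakes using discriminator feedback.
    g_loss = update_g(fakes)
    return {"d_loss": d_loss, "g_loss": g_loss}


# Toy usage with scalar "samples"; the update functions only compute a number.
rng = random.Random(0)
result = gan_iteration(
    reals=[1.0, 2.0, 3.0],
    make_latents=lambda m: [rng.gauss(0.0, 1.0) for _ in range(m)],
    generate=lambda z: [2.0 * v for v in z],
    update_d=lambda reals, fakes: abs(sum(reals) - sum(fakes)),
    update_g=lambda fakes: -sum(fakes),
)
```

Setting g_update_latents to False reuses the latent codes from step 1 for the generator update, which is the only branch in the loop.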

Parameters
  • device (device) – an object representing the device on which to run.

  • max_epochs (int) – the total number of epochs for the engine to run.

  • train_data_loader (DataLoader) – the core Ignite engine uses this DataLoader to supply batchdata for the training loop.

  • g_network (Module) – generator (G) network architecture.

  • g_optimizer (Optimizer) – G optimizer function.

  • g_loss_function (Callable) – G loss function for optimizer.

  • d_network (Module) – discriminator (D) network architecture.

  • d_optimizer (Optimizer) – D optimizer function.

  • d_loss_function (Callable) – D loss function for optimizer.

  • epoch_length (Optional[int]) – number of iterations for one epoch, default to len(train_data_loader).

  • g_inferer (Optional[Inferer]) – inference method to execute G model forward. Defaults to SimpleInferer().

  • d_inferer (Optional[Inferer]) – inference method to execute D model forward. Defaults to SimpleInferer().

  • d_train_steps (int) – number of times to update D with each minibatch of real data. Defaults to 1.

  • latent_shape (int) – size of G input latent code. Defaults to 64.

  • non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.

  • d_prepare_batch (Callable) – callback function to prepare batchdata for the D inferer. Defaults to returning the GanKeys.REALS entry of the batchdata dict.

  • g_prepare_batch (Callable) – callback function to create a batch of latent input for the G inferer. Defaults to returning random latents.

  • g_update_latents (bool) – Calculate G loss with new latent codes. Defaults to True.

  • iteration_update (Optional[Callable]) – the callable invoked for every iteration; it is expected to accept engine and batchdata as input parameters. If not provided, self._iteration() is used instead.

  • post_transform (Optional[Transform]) – additional transformation applied to the model output data, typically several Tensor-based transforms composed with Compose.

  • key_train_metric (Optional[Dict[str, Metric]]) – metric computed when every iteration completes, with the average value saved to engine.state.metrics when the epoch completes. key_train_metric is the main metric used to compare and save checkpoints to files.

  • additional_metrics (Optional[Dict[str, Metric]]) – more Ignite metrics that are also attached to the Ignite Engine.

  • train_handlers (Optional[Sequence]) – every handler is a set of Ignite Event-Handlers and must have an attach function, e.g. CheckpointHandler, StatsHandler, SegmentationSaver.

Evaluator

class monai.engines.Evaluator(device, val_data_loader, epoch_length=None, non_blocking=False, prepare_batch=<function default_prepare_batch>, iteration_update=None, post_transform=None, key_val_metric=None, additional_metrics=None, val_handlers=None, amp=False)[source]

Base class for all kinds of evaluators, inherits from Workflow.

Parameters
  • device (device) – an object representing the device on which to run.

  • val_data_loader (DataLoader) – the Ignite engine uses this data loader to run; it must be a torch.utils.data.DataLoader.

  • epoch_length (Optional[int]) – number of iterations for one epoch, default to len(val_data_loader).

  • non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.

  • prepare_batch (Callable) – function to parse image and label for current iteration.

  • iteration_update (Optional[Callable]) – the callable invoked for every iteration; it is expected to accept engine and batchdata as input parameters. If not provided, self._iteration() is used instead.

  • post_transform (Optional[Transform]) – additional transformation applied to the model output data, typically several Tensor-based transforms composed with Compose.

  • key_val_metric (Optional[Dict[str, Metric]]) – metric computed when every iteration completes, with the average value saved to engine.state.metrics when the epoch completes. key_val_metric is the main metric used to compare and save checkpoints to files.

  • additional_metrics (Optional[Dict[str, Metric]]) – more Ignite metrics that are also attached to the Ignite Engine.

  • val_handlers (Optional[Sequence]) – every handler is a set of Ignite Event-Handlers and must have an attach function, e.g. CheckpointHandler, StatsHandler, SegmentationSaver.

  • amp (bool) – whether to enable auto-mixed-precision evaluation, default is False.

run(global_epoch=1)[source]

Execute validation/evaluation based on Ignite Engine.

Parameters

global_epoch (int) – the overall epoch when run during training; the evaluator engine can get it from the trainer.

Return type

None

SupervisedEvaluator

class monai.engines.SupervisedEvaluator(device, val_data_loader, network, epoch_length=None, non_blocking=False, prepare_batch=<function default_prepare_batch>, iteration_update=None, inferer=None, post_transform=None, key_val_metric=None, additional_metrics=None, val_handlers=None, amp=False)[source]

Standard supervised evaluation method with image and (optional) label, inherits from Evaluator and Workflow.

Parameters
  • device (device) – an object representing the device on which to run.

  • val_data_loader (DataLoader) – the Ignite engine uses this data loader to run; it must be a torch.utils.data.DataLoader.

  • network (Module) – the network used to run the model forward pass.

  • epoch_length (Optional[int]) – number of iterations for one epoch, default to len(val_data_loader).

  • non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.

  • prepare_batch (Callable) – function to parse image and label for current iteration.

  • iteration_update (Optional[Callable]) – the callable invoked for every iteration; it is expected to accept engine and batchdata as input parameters. If not provided, self._iteration() is used instead.

  • inferer (Optional[Inferer]) – inference method that executes the model forward pass on input data, e.g. SlidingWindow.

  • post_transform (Optional[Transform]) – additional transformation applied to the model output data, typically several Tensor-based transforms composed with Compose.

  • key_val_metric (Optional[Dict[str, Metric]]) – metric computed when every iteration completes, with the average value saved to engine.state.metrics when the epoch completes. key_val_metric is the main metric used to compare and save checkpoints to files.

  • additional_metrics (Optional[Dict[str, Metric]]) – more Ignite metrics that are also attached to the Ignite Engine.

  • val_handlers (Optional[Sequence]) – every handler is a set of Ignite Event-Handlers and must have an attach function, e.g. CheckpointHandler, StatsHandler, SegmentationSaver.

  • amp (bool) – whether to enable auto-mixed-precision evaluation, default is False.

EnsembleEvaluator

class monai.engines.EnsembleEvaluator(device, val_data_loader, networks, pred_keys, epoch_length=None, non_blocking=False, prepare_batch=<function default_prepare_batch>, iteration_update=None, inferer=None, post_transform=None, key_val_metric=None, additional_metrics=None, val_handlers=None, amp=False)[source]

Ensemble evaluation for multiple models, inherits from Evaluator and Workflow. It accepts a list of models for inference and outputs a list of predictions for further operations.

Parameters
  • device (device) – an object representing the device on which to run.

  • val_data_loader (DataLoader) – the Ignite engine uses this data loader to run; it must be a torch.utils.data.DataLoader.

  • networks (Sequence[Module]) – the networks used to run the model forward pass, in order.

  • pred_keys (Sequence[str]) – the keys under which each prediction is stored. The length must exactly match the number of networks.

  • epoch_length (Optional[int]) – number of iterations for one epoch, default to len(val_data_loader).

  • non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.

  • prepare_batch (Callable) – function to parse image and label for current iteration.

  • iteration_update (Optional[Callable]) – the callable invoked for every iteration; it is expected to accept engine and batchdata as input parameters. If not provided, self._iteration() is used instead.

  • inferer (Optional[Inferer]) – inference method that executes the model forward pass on input data, e.g. SlidingWindow.

  • post_transform (Optional[Transform]) – additional transformation applied to the model output data, typically several Tensor-based transforms composed with Compose.

  • key_val_metric (Optional[Dict[str, Metric]]) – metric computed when every iteration completes, with the average value saved to engine.state.metrics when the epoch completes. key_val_metric is the main metric used to compare and save checkpoints to files.

  • additional_metrics (Optional[Dict[str, Metric]]) – more Ignite metrics that are also attached to the Ignite Engine.

  • val_handlers (Optional[Sequence]) – every handler is a set of Ignite Event-Handlers and must have an attach function, e.g. CheckpointHandler, StatsHandler, SegmentationSaver.

  • amp (bool) – whether to enable auto-mixed-precision evaluation, default is False.
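The relationship between networks and pred_keys can be sketched as follows. Plain callables stand in for the networks, and ensemble_forward and mean_ensemble are hypothetical helpers written for this illustration; averaging is just one example of the "further operations" a post_transform might perform.

```python
def ensemble_forward(inputs, networks, pred_keys):
    """Run each network in order and store its prediction under the matching key."""
    if len(networks) != len(pred_keys):
        raise ValueError("pred_keys must contain exactly one key per network")
    return {key: net(inputs) for key, net in zip(pred_keys, networks)}


def mean_ensemble(predictions, pred_keys):
    """Example follow-up operation: average the stored predictions."""
    return sum(predictions[key] for key in pred_keys) / len(pred_keys)


# Toy "networks" operating on a single number instead of a tensor.
networks = [lambda x: x + 1.0, lambda x: x * 3.0]
pred_keys = ["pred0", "pred1"]

predictions = ensemble_forward(2.0, networks, pred_keys)
averaged = mean_ensemble(predictions, pred_keys)
```

Keeping each prediction under its own key lets later transforms decide how to combine them, for example by averaging or voting.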