Optimizers

Novograd

class monai.optimizers.Novograd(params, lr=0.001, betas=(0.9, 0.98), eps=1e-08, weight_decay=0, grad_averaging=False, amsgrad=False)

Implements the Novograd optimizer, based on Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks. The code is adapted from the implementations in Jasper for PyTorch and OpenSeq2Seq. A brief usage sketch follows the parameter list below.

Parameters
  • params (Iterable) – iterable of parameters to optimize or dicts defining parameter groups.

  • lr (float) – learning rate. Defaults to 1e-3.

  • betas (Tuple[float, float]) – coefficients used for computing running averages of gradient and its square. Defaults to (0.9, 0.98).

  • eps (float) – term added to the denominator to improve numerical stability. Defaults to 1e-8.

  • weight_decay (float) – weight decay (L2 penalty). Defaults to 0.

  • grad_averaging (bool) – whether to apply gradient averaging, i.e. scale the gradient by 1 - beta1 before accumulating it into the first moment. Defaults to False.

  • amsgrad (bool) – whether to use the AMSGrad variant of this algorithm from the paper On the Convergence of Adam and Beyond. Defaults to False.
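
A minimal usage sketch; the torch.nn.Linear model, loss function, and synthetic data below are illustrative assumptions, not part of MONAI:

    import torch
    from monai.optimizers import Novograd

    # Toy model and data, for illustration only.
    model = torch.nn.Linear(10, 2)
    inputs = torch.randn(4, 10)
    targets = torch.randint(0, 2, (4,))
    loss_fn = torch.nn.CrossEntropyLoss()

    # Construct the optimizer; keyword arguments mirror the parameters above.
    optimizer = Novograd(model.parameters(), lr=1e-3, betas=(0.9, 0.98), weight_decay=1e-5)

    # One training iteration: clear gradients, backpropagate, update weights.
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()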

step(closure=None)

Performs a single optimization step.

Parameters

closure (Optional[Callable]) – A closure that reevaluates the model and returns the loss. Defaults to None.
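
As with other PyTorch optimizers, the optional closure re-evaluates the model, recomputes the loss and its gradients, and lets step return the loss. A short sketch, reusing the illustrative model, loss_fn, inputs, and targets from the example above:

    def closure():
        # Re-evaluate the model and return the loss, as step() expects.
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        return loss

    loss = optimizer.step(closure)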