Network architectures
Blocks
ADN
class monai.networks.blocks.ADN(ordering='NDA', in_channels=None, act='RELU', norm=None, norm_dim=None, dropout=None, dropout_dim=None)
Constructs a sequential module of optional activation (A), dropout (D), and normalization (N) layers, in an arbitrary order:
-- (Norm) -- (Dropout) -- (Acti) --
Parameters:
ordering (str) – a string representing the ordering of activation, dropout, and normalization. Defaults to “NDA”.
in_channels (Optional[int]) – C from an expected input of size (N, C, H[, W, D]).
act (Union[Tuple, str, None]) – activation type and arguments. Defaults to “RELU”.
norm (Union[Tuple, str, None]) – feature normalization type and arguments. Defaults to None (no normalization).
norm_dim (Optional[int]) – determines the spatial dimensions of the normalization layer. Defaults to dropout_dim if unspecified.
dropout (Union[Tuple, str, float, None]) – dropout ratio. Defaults to no dropout.
dropout_dim (Optional[int]) – determines the spatial dimensions of dropout. Defaults to norm_dim if unspecified.
    When dropout_dim = 1, randomly zeroes some of the elements for each channel.
    When dropout_dim = 2, randomly zeroes out entire channels (a channel is a 2D feature map).
    When dropout_dim = 3, randomly zeroes out entire channels (a channel is a 3D feature map).
Examples:
# activation, group norm, dropout
>>> norm_params = ("GROUP", {"num_groups": 1, "affine": False})
>>> ADN(norm=norm_params, in_channels=1, dropout_dim=1, dropout=0.8, ordering="AND")
ADN(
  (A): ReLU()
  (N): GroupNorm(1, 1, eps=1e-05, affine=False)
  (D): Dropout(p=0.8, inplace=False)
)

# LeakyReLU, dropout
>>> act_params = ("leakyrelu", {"negative_slope": 0.1, "inplace": True})
>>> ADN(act=act_params, in_channels=1, dropout_dim=1, dropout=0.8, ordering="AD")
ADN(
  (A): LeakyReLU(negative_slope=0.1, inplace=True)
  (D): Dropout(p=0.8, inplace=False)
)
See also
monai.networks.layers.Dropout
monai.networks.layers.Act
monai.networks.layers.Norm
monai.networks.layers.split_args
Initializes internal Module state, shared by both nn.Module and ScriptModule.
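The ordering string can be read as a recipe for the layer sequence. The following is a minimal sketch in plain Python, not MONAI's implementation: the helper name `adn_layer_keys` is hypothetical, and it assumes that a layer whose argument is None is simply left out of the sequence.

```python
def adn_layer_keys(ordering="NDA", act="RELU", norm=None, dropout=None):
    """Return the layer keys, in order, that an ADN-style block would contain.

    Hypothetical helper for illustration: a letter of `ordering` is kept only
    when the matching argument is not None (assumed behaviour).
    """
    present = {"A": act is not None, "N": norm is not None, "D": dropout is not None}
    keys = []
    for letter in ordering.upper():
        if letter not in present:
            raise ValueError(f"unknown layer key {letter!r}; expected A, D, or N")
        if present[letter]:
            keys.append(letter)
    return keys


# Mirrors the two documented examples.
print(adn_layer_keys("AND", norm=("GROUP", {"num_groups": 1}), dropout=0.8))  # ['A', 'N', 'D']
print(adn_layer_keys("AD", act=("leakyrelu", {}), dropout=0.8))               # ['A', 'D']
```

With the default arguments only the activation is configured, so under the same assumption the default "NDA" ordering would yield a single activation layer.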
Convolution
class monai.networks.blocks.Convolution(dimensions, in_channels, out_channels, strides=1, kernel_size=3, adn_ordering='NDA', act='PRELU', norm='INSTANCE', dropout=None, dropout_dim=1, dilation=1, groups=1, bias=True, conv_only=False, is_transposed=False, padding=None, output_padding=None)
Constructs a convolution with normalization, optional dropout, and optional activation layers:
-- (Conv|ConvTrans) -- (Norm -- Dropout -- Acti) --
If conv_only is set to True:
-- (Conv|ConvTrans) --
For example:
from monai.networks.blocks import Convolution

conv = Convolution(
    dimensions=3,
    in_channels=1,
    out_channels=1,
    adn_ordering="ADN",
    act=("prelu", {"init": 0.2}),
    dropout=0.1,
    norm=("layer", {"normalized_shape": (10, 10, 10)}),
)
print(conv)
output:
Convolution(
  (conv): Conv3d(1, 1, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))
  (adn): ADN(
    (A): PReLU(num_parameters=1)
    (D): Dropout(p=0.1, inplace=False)
    (N): LayerNorm((10, 10, 10), eps=1e-05, elementwise_affine=True)
  )
)
Parameters:
dimensions (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
strides (Union[Sequence[int], int]) – convolution stride. Defaults to 1.
kernel_size (Union[Sequence[int], int]) – convolution kernel size. Defaults to 3.
adn_ordering (str) – a string representing the ordering of activation, normalization, and dropout. Defaults to “NDA”.
act (Union[Tuple, str, None]) – activation type and arguments. Defaults to PReLU.
norm (Union[Tuple, str, None]) – feature normalization type and arguments. Defaults to instance norm.
dropout (Union[Tuple, str, float, None]) – dropout ratio. Defaults to no dropout.
dropout_dim (Optional[int]) – determines the dimensions of dropout. Defaults to 1.
    When dropout_dim = 1, randomly zeroes some of the elements for each channel.
    When dropout_dim = 2, randomly zeroes out entire channels (a channel is a 2D feature map).
    When dropout_dim = 3, randomly zeroes out entire channels (a channel is a 3D feature map).
    The value of dropout_dim should be no larger than the value of dimensions.
dilation (Union[Sequence[int], int]) – dilation rate. Defaults to 1.
groups (int) – controls the connections between inputs and outputs. Defaults to 1.
bias (bool) – whether to have a bias term. Defaults to True.
conv_only (bool) – whether to use the convolutional layer only. Defaults to False.
is_transposed (bool) – if True, uses ConvTrans instead of Conv. Defaults to False.
padding (Union[Sequence[int], int, None]) – controls the amount of implicit zero-padding added to both sides of each dimension. Defaults to None.
output_padding (Union[Sequence[int], int, None]) – controls the additional size added to one side of the output shape. Defaults to None.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
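To see how strides, kernel_size, dilation, and padding interact, the spatial output size of a regular (non-transposed) convolution can be worked out with the standard formula used for PyTorch convolution layers. The sketch below is plain Python; `same_padding` and `conv_output_size` are hypothetical helper names, and the padding rule applied when padding is None is an assumption, chosen because it reproduces the padding=(1, 1, 1) shown in the example output for kernel_size=3.

```python
def same_padding(kernel_size, dilation=1):
    """Padding that keeps the size unchanged at stride 1 for odd kernels (assumed rule)."""
    return (kernel_size - 1) // 2 * dilation


def conv_output_size(size, kernel_size=3, stride=1, dilation=1, padding=None):
    """Output length along one spatial axis of a regular convolution."""
    if padding is None:
        padding = same_padding(kernel_size, dilation)
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


print(same_padding(3))                 # 1, matching padding=(1, 1, 1) in the example
print(conv_output_size(10))            # 10: size is preserved at stride 1
print(conv_output_size(10, stride=2))  # 5: stride 2 halves the size
```

The same arithmetic applies independently to each spatial axis, so a (10, 10, 10) volume with strides=2 would come out as (5, 5, 5) under these assumptions.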
CRF
class monai.networks.blocks.CRF(iterations=5, bilateral_weight=1.0, gaussian_weight=1.0, bilateral_spatial_sigma=5.0, bilateral_color_sigma=0.5, gaussian_spatial_sigma=5.0, update_factor=3.0, compatibility_matrix=None)
Conditional Random Field: combines message passing with a class compatibility convolution into an iterative process designed to successively minimise the energy of the class labeling.
In this implementation, the message passing step is a weighted combination of a gaussian filter and a bilateral filter. The bilateral term is included to respect existing structure within the reference tensor.
Parameters:
iterations (int) – the number of iterations.
bilateral_weight (float) – the weighting of the bilateral term in the message passing step.
gaussian_weight (float) – the weighting of the gaussian term in the message passing step.
bilateral_spatial_sigma (float) – standard deviation in spatial coordinates for the bilateral term.
bilateral_color_sigma (float) – standard deviation in color space for the bilateral term.
gaussian_spatial_sigma (float) – standard deviation in spatial coordinates for the gaussian term.
update_factor (float) – determines the magnitude of each update.
compatibility_matrix (Optional[Tensor]) – a matrix describing class compatibility; it should be NxN, where N is the number of classes.
ResidualUnit
class monai.networks.blocks.ResidualUnit(dimensions, in_channels, out_channels, strides=1, kernel_size=3, subunits=2, adn_ordering='NDA', act='PRELU', norm='INSTANCE', dropout=None, dropout_dim=1, dilation=1, bias=True, last_conv_only=False, padding=None)
Residual module with multiple convolutions and a residual connection.
For example:
from monai.networks.blocks import ResidualUnit

convs = ResidualUnit(
    dimensions=3,
    in_channels=1,
    out_channels=1,
    adn_ordering="AN",
    act=("prelu", {"init": 0.2}),
    norm=("layer", {"normalized_shape": (10, 10, 10)}),
)
print(convs)
output:
ResidualUnit(
  (conv): Sequential(
    (unit0): Convolution(
      (conv): Conv3d(1, 1, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))
      (adn): ADN(
        (A): PReLU(num_parameters=1)
        (N): LayerNorm((10, 10, 10), eps=1e-05, elementwise_affine=True)
      )
    )
    (unit1): Convolution(
      (conv): Conv3d(1, 1, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))
      (adn): ADN(
        (A): PReLU(num_parameters=1)
        (N): LayerNorm((10, 10, 10), eps=1e-05, elementwise_affine=True)
      )
    )
  )
  (residual): Identity()
)
Parameters:
dimensions (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
strides (Union[Sequence[int], int]) – convolution stride. Defaults to 1.
kernel_size (Union[Sequence[int], int]) – convolution kernel size. Defaults to 3.
subunits (int) – number of convolutions. Defaults to 2.
adn_ordering (str) – a string representing the ordering of activation, normalization, and dropout. Defaults to “NDA”.
act (Union[Tuple, str, None]) – activation type and arguments. Defaults to PReLU.
norm (Union[Tuple, str, None]) – feature normalization type and arguments. Defaults to instance norm.
dropout (Union[Tuple, str, float, None]) – dropout ratio. Defaults to no dropout.
dropout_dim (Optional[int]) – determines the dimensions of dropout. Defaults to 1.
    When dropout_dim = 1, randomly zeroes some of the elements for each channel.
    When dropout_dim = 2, randomly zeroes out entire channels (a channel is a 2D feature map).
    When dropout_dim = 3, randomly zeroes out entire channels (a channel is a 3D feature map).
    The value of dropout_dim should be no larger than the value of dimensions.
dilation (Union[Sequence[int], int]) – dilation rate. Defaults to 1.
bias (bool) – whether to have a bias term. Defaults to True.
last_conv_only (bool) – for the last subunit, whether to use the convolutional layer only. Defaults to False.
padding (Union[Sequence[int], int, None]) – controls the amount of implicit zero-padding added to both sides of each dimension. Defaults to None.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
Return type: Tensor
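The ResidualUnit example output shows `(residual): Identity()`, which fits a block whose input and output shapes already agree. A minimal sketch of that decision in plain Python follows. The helper `residual_needs_conv` is hypothetical, and the rule it encodes (a convolution on the skip path whenever the channel count or the spatial size changes) is an assumption for illustration rather than a statement of the library's exact logic.

```python
from math import prod


def residual_needs_conv(in_channels, out_channels, strides=1):
    """True when the skip path cannot be a plain identity (assumed rule).

    `strides` may be an int or a sequence with one stride per spatial axis.
    """
    stride_seq = (strides,) if isinstance(strides, int) else tuple(strides)
    changes_size = prod(stride_seq) != 1
    changes_channels = in_channels != out_channels
    return changes_size or changes_channels


print(residual_needs_conv(1, 1))             # False: identity, as in the example
print(residual_needs_conv(1, 8))             # True: channel counts differ
print(residual_needs_conv(4, 4, (2, 2, 2)))  # True: the output is downsampled
```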
Swish
class monai.networks.blocks.Swish(alpha=1.0)
Applies the element-wise function:
\[\text{Swish}(x) = x * \text{Sigmoid}(\alpha * x) ~~~~\text{for constant value}~ \alpha.\]
Citation: Searching for Activation Functions, Ramachandran et al., 2017, https://arxiv.org/abs/1710.05941.
Shape:
Input: \((N, *)\), where * means any number of additional dimensions
Output: \((N, *)\), same shape as the input
Examples:
>>> import torch
>>> from monai.networks.layers import Act
>>> m = Act['swish']()
>>> input = torch.randn(2)
>>> output = m(input)
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward(input)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
Return type: Tensor
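As a quick numerical check of the Swish formula, here is a scalar version in plain Python. It is only a sketch of the element-wise function, not the tensor implementation, and it is not hardened against overflow for very large negative inputs.

```python
import math


def swish(x, alpha=1.0):
    """Swish(x) = x * sigmoid(alpha * x) for a single float."""
    return x / (1.0 + math.exp(-alpha * x))


for value in (-2.0, 0.0, 1.0):
    print(value, swish(value))
```

At x = 0 the result is exactly 0, and for large positive x the function approaches the identity because the sigmoid factor tends to 1.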
MemoryEfficientSwish
class monai.networks.blocks.MemoryEfficientSwish
Applies the element-wise function:
\[\text{Swish}(x) = x * \text{Sigmoid}(\alpha * x) ~~~~\text{for constant value}~ \alpha=1.\]
Memory-efficient implementation for training, following the recommendation from: https://github.com/lukemelas/EfficientNet-PyTorch/issues/18#issuecomment-511677853
Results in roughly 30% memory savings during training compared to Swish().
Citation: Searching for Activation Functions, Ramachandran et al., 2017, https://arxiv.org/abs/1710.05941.
Shape:
Input: \((N, *)\), where * means any number of additional dimensions
Output: \((N, *)\), same shape as the input
Examples:
>>> import torch
>>> from monai.networks.layers import Act
>>> m = Act['memswish']()
>>> input = torch.randn(2)
>>> output = m(input)
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward(input)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
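A common way to obtain this kind of memory saving is to keep only the input for the backward pass and recompute the sigmoid there, using the closed-form derivative d/dx [x · σ(x)] = σ(x) · (1 + x · (1 − σ(x))). The sketch below, in plain Python with illustrative function names, checks that derivative against a central finite difference. It is offered as an assumption about the general technique, not as the code of this class.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def swish(x):
    """Forward pass with alpha = 1."""
    return x * sigmoid(x)


def swish_grad(x):
    """Closed-form derivative, recomputed from the input alone."""
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))


def finite_difference(f, x, eps=1e-6):
    """Central difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


for value in (-1.5, 0.0, 2.0):
    assert abs(swish_grad(value) - finite_difference(swish, value)) < 1e-6
print(swish_grad(0.0))  # 0.5
```

Because the gradient depends only on x, nothing besides the input has to be stored between the forward and backward passes.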
Mish
class monai.networks.blocks.Mish
Applies the element-wise function:
\[\text{Mish}(x) = x * \tanh(\text{softplus}(x)).\]
Citation: Mish: A Self Regularized Non-Monotonic Activation Function, Diganta Misra, 2019, https://arxiv.org/abs/1908.08681.
Shape:
Input: \((N, *)\), where * means any number of additional dimensions
Output: \((N, *)\), same shape as the input
Examples:
>>> import torch
>>> from monai.networks.layers import Act
>>> m = Act['mish']()
>>> input = torch.randn(2)
>>> output = m(input)
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward(input)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
Return type: Tensor
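The Mish formula can be checked the same way with a scalar sketch in plain Python, where softplus(x) = ln(1 + exp(x)). This illustrates the element-wise function only and is not hardened against overflow for very large inputs.

```python
import math


def softplus(x):
    """softplus(x) = ln(1 + exp(x))."""
    return math.log1p(math.exp(x))


def mish(x):
    """Mish(x) = x * tanh(softplus(x)) for a single float."""
    return x * math.tanh(softplus(x))


for value in (-2.0, 0.0, 1.0):
    print(value, mish(value))
```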
GCN Module
class monai.networks.blocks.GCN(inplanes, planes, ks=7)
The Global Convolutional Network module using large 1D Kx1 and 1xK kernels to represent 2D kernels.
Parameters:
inplanes (int) – number of input channels.
planes (int) – number of output channels.
ks (int) – kernel size for one dimension. Defaults to 7.
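The appeal of replacing a 2D kernel with a Kx1 and 1xK pair is the weight count: a dense KxK kernel grows quadratically with K, while the pair grows linearly. The sketch below compares the two for a single input/output channel pair, ignoring biases and channel counts. The function names are illustrative, and the comparison does not describe how many such pairs the module actually uses.

```python
def dense_kernel_weights(ks):
    """Weights in one ks x ks kernel (per input/output channel pair)."""
    return ks * ks


def separable_pair_weights(ks):
    """Weights in a ks x 1 kernel followed by a 1 x ks kernel."""
    return ks + ks


for ks in (3, 7, 15):
    print(ks, dense_kernel_weights(ks), separable_pair_weights(ks))
```

For the default ks=7 this is 49 weights for the dense kernel against 14 for the pair.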
Refinement Module
FCN Module
class monai.networks.blocks.FCN(out_channels=1, upsample_mode='bilinear', pretrained=True, progress=True)
2D FCN network with 3 input channels. The small decoder is built with the GCN and Refine modules. The code is adapted from lsqshr’s official 2D code.
Parameters:
out_channels (int) – number of output channels. Defaults to 1.
upsample_mode (str) – ["transpose", "bilinear"] The mode of upsampling manipulations. Using the second mode ("bilinear") cannot guarantee the model’s reproducibility. Defaults to bilinear.
    transpose, uses transposed convolution layers.
    bilinear, uses bilinear interpolation.
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
progress (bool) – If True, displays a progress bar of the download to stderr.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
Multi-Channel FCN Module
class monai.networks.blocks.MCFCN(in_channels=3, out_channels=1, upsample_mode='bilinear', pretrained=True, progress=True)
The multi-channel version of the 2D FCN module. Adds a projection layer to take an arbitrary number of input channels.
Parameters:
in_channels (int) – number of input channels. Defaults to 3.
out_channels (int) – number of output channels. Defaults to 1.
upsample_mode (str) – ["transpose", "bilinear"] The mode of upsampling manipulations. Using the second mode ("bilinear") cannot guarantee the model’s reproducibility. Defaults to bilinear.
    transpose, uses transposed convolution layers.
    bilinear, uses bilinear interpolation.
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
progress (bool) – If True, displays a progress bar of the download to stderr.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
Dynamic-Unet Block
class monai.networks.blocks.UnetResBlock(spatial_dims, in_channels, out_channels, kernel_size, stride, norm_name)
A skip-connection based module that can be used for DynUNet, based on: Automated Design of Deep Learning Methods for Biomedical Image Segmentation. nnU-Net: Self-adapting Framework for U-Net-Based Medical Image Segmentation.
Parameters:
spatial_dims (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
kernel_size (Union[Sequence[int], int]) – convolution kernel size.
stride (Union[Sequence[int], int]) – convolution stride.
norm_name (Union[Tuple, str]) – feature normalization type and arguments.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward(inp)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class monai.networks.blocks.UnetBasicBlock(spatial_dims, in_channels, out_channels, kernel_size, stride, norm_name)
A CNN module that can be used for DynUNet, based on: Automated Design of Deep Learning Methods for Biomedical Image Segmentation. nnU-Net: Self-adapting Framework for U-Net-Based Medical Image Segmentation.
Parameters:
spatial_dims (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
kernel_size (Union[Sequence[int], int]) – convolution kernel size.
stride (Union[Sequence[int], int]) – convolution stride.
norm_name (Union[Tuple, str]) – feature normalization type and arguments.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward(inp)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class monai.networks.blocks.UnetUpBlock(spatial_dims, in_channels, out_channels, kernel_size, stride, upsample_kernel_size, norm_name)
An upsampling module that can be used for DynUNet, based on: Automated Design of Deep Learning Methods for Biomedical Image Segmentation. nnU-Net: Self-adapting Framework for U-Net-Based Medical Image Segmentation.
Parameters:
spatial_dims (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
kernel_size (Union[Sequence[int], int]) – convolution kernel size.
stride (Union[Sequence[int], int]) – convolution stride.
upsample_kernel_size (Union[Sequence[int], int]) – convolution kernel size for transposed convolution layers.
norm_name (Union[Tuple, str]) – feature normalization type and arguments.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward(inp, skip)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
SegResnet Block
class monai.networks.blocks.ResBlock(spatial_dims, in_channels, norm, kernel_size=3)
ResBlock employs a skip connection and two convolution blocks, and is used in SegResNet, based on 3D MRI brain tumor segmentation using autoencoder regularization.
Parameters:
spatial_dims (int) – number of spatial dimensions, could be 1, 2 or 3.
in_channels (int) – number of input channels.
norm (Union[Tuple, str]) – feature normalization type and arguments.
kernel_size (int) – convolution kernel size; the value should be an odd number. Defaults to 3.
forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
SABlock Block
class monai.networks.blocks.SABlock(hidden_size, num_heads, dropout_rate=0.0)
A self-attention block, based on: “Dosovitskiy et al., An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale <https://arxiv.org/abs/2010.11929>”
Parameters:
hidden_size (int) – dimension of hidden layer.
num_heads (int) – number of attention heads.
dropout_rate (float) – fraction of the input units to drop.
forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
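In multi-head attention the hidden vector is usually split evenly across the heads, so hidden_size is expected to be divisible by num_heads. The sketch below assumes that convention, which the parameter descriptions for this block do not state explicitly. `attention_head_dim` is a hypothetical helper name.

```python
def attention_head_dim(hidden_size, num_heads):
    """Per-head feature size when hidden_size is split evenly across heads."""
    if num_heads <= 0:
        raise ValueError("num_heads must be positive")
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size should be divisible by num_heads")
    return hidden_size // num_heads


head_dim = attention_head_dim(768, 12)
print(head_dim)          # 64
print(head_dim ** -0.5)  # 0.125, the usual scale applied to query-key dot products
```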
Squeeze-and-Excitation
class monai.networks.blocks.ChannelSELayer(spatial_dims, in_channels, r=2, acti_type_1=('relu', {'inplace': True}), acti_type_2='sigmoid', add_residual=False)
Re-implementation of the Squeeze-and-Excitation block based on: “Hu et al., Squeeze-and-Excitation Networks, https://arxiv.org/abs/1709.01507”.
Parameters:
spatial_dims (int) – number of spatial dimensions, could be 1, 2, or 3.
in_channels (int) – number of input channels.
r (int) – the reduction ratio r in the paper. Defaults to 2.
acti_type_1 (Union[Tuple[str, Dict], str]) – activation type of the hidden squeeze layer. Defaults to ("relu", {"inplace": True}).
acti_type_2 (Union[Tuple[str, Dict], str]) – activation type of the output squeeze layer. Defaults to “sigmoid”.
Raises:
ValueError – When r is nonpositive or larger than in_channels.
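The documented ValueError condition for r can be sketched as a small validator. This is plain Python with a hypothetical helper name. The returned hidden width of in_channels // r follows the reduction ratio as defined in the cited paper and is an assumption about this layer's internals.

```python
def se_hidden_channels(in_channels, r=2):
    """Channel count of the squeezed (hidden) layer, taken as in_channels // r.

    Raises ValueError under the documented condition: r nonpositive or
    larger than in_channels.
    """
    if r <= 0 or r > in_channels:
        raise ValueError(
            f"r must be positive and no larger than in_channels, got r={r}, in_channels={in_channels}"
        )
    return in_channels // r


print(se_hidden_channels(64))        # 32
print(se_hidden_channels(64, r=16))  # 4
```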
Transformer Block
class monai.networks.blocks.TransformerBlock(hidden_size, mlp_dim, num_heads, dropout_rate=0.0)
A transformer block, based on: “Dosovitskiy et al., An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale <https://arxiv.org/abs/2010.11929>”
Parameters:
hidden_size (int) – dimension of hidden layer.
mlp_dim (int) – dimension of feedforward layer.
num_heads (int) – number of attention heads.
dropout_rate (float) – fraction of the input units to drop.
forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
UNETR Block
class monai.networks.blocks.UnetrBasicBlock(spatial_dims, in_channels, out_channels, kernel_size, stride, norm_name, res_block=False)
A CNN module that can be used for UNETR, based on: “Hatamizadeh et al., UNETR: Transformers for 3D Medical Image Segmentation <https://arxiv.org/abs/2103.10504>”
Parameters:
spatial_dims (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
kernel_size (Union[Sequence[int], int]) – convolution kernel size.
stride (Union[Sequence[int], int]) – convolution stride.
norm_name (Union[Tuple, str]) – feature normalization type and arguments.
res_block (bool) – whether a residual block is used.
forward(inp)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class monai.networks.blocks.UnetrUpBlock(spatial_dims, in_channels, out_channels, kernel_size, stride, upsample_kernel_size, norm_name, res_block=False)
An upsampling module that can be used for UNETR, based on: “Hatamizadeh et al., UNETR: Transformers for 3D Medical Image Segmentation <https://arxiv.org/abs/2103.10504>”
Parameters:
spatial_dims (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
kernel_size (Union[Sequence[int], int]) – convolution kernel size.
stride (Union[Sequence[int], int]) – convolution stride.
upsample_kernel_size (Union[Sequence[int], int]) – convolution kernel size for transposed convolution layers.
norm_name (Union[Tuple, str]) – feature normalization type and arguments.
res_block (bool) – whether a residual block is used.
forward(inp, skip)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class monai.networks.blocks.UnetrPrUpBlock(spatial_dims, in_channels, out_channels, num_layer, kernel_size, stride, upsample_kernel_size, norm_name, conv_block=False, res_block=False)
A projection upsampling module that can be used for UNETR, based on: “Hatamizadeh et al., UNETR: Transformers for 3D Medical Image Segmentation <https://arxiv.org/abs/2103.10504>”
Parameters:
spatial_dims (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
num_layer (int) – number of upsampling blocks.
kernel_size (Union[Sequence[int], int]) – convolution kernel size.
stride (Union[Sequence[int], int]) – convolution stride.
upsample_kernel_size (Union[Sequence[int], int]) – convolution kernel size for transposed convolution layers.
norm_name (Union[Tuple, str]) – feature normalization type and arguments.
conv_block (bool) – whether a convolutional block is used.
res_block (bool) – whether a residual block is used.
forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
Residual Squeeze-and-Excitation
class monai.networks.blocks.ResidualSELayer(spatial_dims, in_channels, r=2, acti_type_1='leakyrelu', acti_type_2='relu')
A “squeeze-and-excitation”-like layer with a residual connection:
--+-- SE --o--
  |        |
  +--------+
Parameters:
spatial_dims (int) – number of spatial dimensions, could be 1, 2, or 3.
in_channels (int) – number of input channels.
r (int) – the reduction ratio r in the paper. Defaults to 2.
acti_type_1 (Union[Tuple[str, Dict], str]) – defaults to “leakyrelu”.
acti_type_2 (Union[Tuple[str, Dict], str]) – defaults to “relu”.
Squeeze-and-Excitation Block
class monai.networks.blocks.SEBlock(spatial_dims, in_channels, n_chns_1, n_chns_2, n_chns_3, conv_param_1=None, conv_param_2=None, conv_param_3=None, project=None, r=2, acti_type_1=('relu', {'inplace': True}), acti_type_2='sigmoid', acti_type_final=('relu', {'inplace': True}))
Residual module enhanced with Squeeze-and-Excitation:
----+- conv1 --  conv2 -- conv3 -- SE -o---
    |                                  |
    +---(channel project if needed)----+
Re-implementation of the SE-Resnet block based on: “Hu et al., Squeeze-and-Excitation Networks, https://arxiv.org/abs/1709.01507”.
Parameters:
spatial_dims (int) – number of spatial dimensions, could be 1, 2, or 3.
in_channels (int) – number of input channels.
n_chns_1 (int) – number of output channels in the 1st convolution.
n_chns_2 (int) – number of output channels in the 2nd convolution.
n_chns_3 (int) – number of output channels in the 3rd convolution.
conv_param_1 (Optional[Dict]) – additional parameters to the 1st convolution. Defaults to {"kernel_size": 1, "norm": Norm.BATCH, "act": ("relu", {"inplace": True})}
conv_param_2 (Optional[Dict]) – additional parameters to the 2nd convolution. Defaults to {"kernel_size": 3, "norm": Norm.BATCH, "act": ("relu", {"inplace": True})}
conv_param_3 (Optional[Dict]) – additional parameters to the 3rd convolution. Defaults to {"kernel_size": 1, "norm": Norm.BATCH, "act": None}
project (Optional[Convolution]) – when the residual channels and the output channels do not match, a projection (Conv) layer/block is used to adjust the number of channels. In SENet, it consists of a Conv layer and a Norm layer. Defaults to None (the channels already match) or a Conv layer with kernel size 1.
r (int) – the reduction ratio r in the paper. Defaults to 2.
acti_type_1 (Union[Tuple[str, Dict], str]) – activation type of the hidden squeeze layer. Defaults to “relu”.
acti_type_2 (Union[Tuple[str, Dict], str]) – activation type of the output squeeze layer. Defaults to “sigmoid”.
acti_type_final (Union[Tuple[str, Dict], str, None]) – activation type of the end of the block. Defaults to “relu”.
Squeeze-and-Excitation Bottleneck
class monai.networks.blocks.SEBottleneck(spatial_dims, inplanes, planes, groups, reduction, stride=1, downsample=None)
Bottleneck for SENet154.
Parameters (this list repeats the one documented for SEBlock; the signature above takes inplanes, planes, groups, reduction, stride, and downsample, which are not described here):
spatial_dims (int) – number of spatial dimensions, could be 1, 2, or 3.
in_channels – number of input channels.
n_chns_1 – number of output channels in the 1st convolution.
n_chns_2 – number of output channels in the 2nd convolution.
n_chns_3 – number of output channels in the 3rd convolution.
conv_param_1 – additional parameters to the 1st convolution. Defaults to {"kernel_size": 1, "norm": Norm.BATCH, "act": ("relu", {"inplace": True})}
conv_param_2 – additional parameters to the 2nd convolution. Defaults to {"kernel_size": 3, "norm": Norm.BATCH, "act": ("relu", {"inplace": True})}
conv_param_3 – additional parameters to the 3rd convolution. Defaults to {"kernel_size": 1, "norm": Norm.BATCH, "act": None}
project – when the residual channels and the output channels do not match, a projection (Conv) layer/block is used to adjust the number of channels. In SENet, it consists of a Conv layer and a Norm layer. Defaults to None (the channels already match) or a Conv layer with kernel size 1.
r – the reduction ratio r in the paper. Defaults to 2.
acti_type_1 – activation type of the hidden squeeze layer. Defaults to “relu”.
acti_type_2 – activation type of the output squeeze layer. Defaults to “sigmoid”.
acti_type_final – activation type of the end of the block. Defaults to “relu”.
Squeeze-and-Excitation Resnet Bottleneck
class monai.networks.blocks.SEResNetBottleneck(spatial_dims, inplanes, planes, groups, reduction, stride=1, downsample=None)
ResNet bottleneck with a Squeeze-and-Excitation module. It follows the Caffe implementation and uses strides=stride in conv1 and not in conv2 (the latter is used in the torchvision implementation of ResNet).
Parameters (this list repeats the one documented for SEBlock; the signature above takes inplanes, planes, groups, reduction, stride, and downsample, which are not described here):
spatial_dims (int) – number of spatial dimensions, could be 1, 2, or 3.
in_channels – number of input channels.
n_chns_1 – number of output channels in the 1st convolution.
n_chns_2 – number of output channels in the 2nd convolution.
n_chns_3 – number of output channels in the 3rd convolution.
conv_param_1 – additional parameters to the 1st convolution. Defaults to {"kernel_size": 1, "norm": Norm.BATCH, "act": ("relu", {"inplace": True})}
conv_param_2 – additional parameters to the 2nd convolution. Defaults to {"kernel_size": 3, "norm": Norm.BATCH, "act": ("relu", {"inplace": True})}
conv_param_3 – additional parameters to the 3rd convolution. Defaults to {"kernel_size": 1, "norm": Norm.BATCH, "act": None}
project – when the residual channels and the output channels do not match, a projection (Conv) layer/block is used to adjust the number of channels. In SENet, it consists of a Conv layer and a Norm layer. Defaults to None (the channels already match) or a Conv layer with kernel size 1.
r – the reduction ratio r in the paper. Defaults to 2.
acti_type_1 – activation type of the hidden squeeze layer. Defaults to “relu”.
acti_type_2 – activation type of the output squeeze layer. Defaults to “sigmoid”.
acti_type_final – activation type of the end of the block. Defaults to “relu”.
Squeeze-and-Excitation ResNeXt Bottleneck¶
-
class
monai.networks.blocks.
SEResNeXtBottleneck
(spatial_dims, inplanes, planes, groups, reduction, stride=1, downsample=None, base_width=4)[source]¶ ResNeXt bottleneck type C with a Squeeze-and-Excitation module.
- Parameters
spatial_dims (
int
) – number of spatial dimensions, could be 1, 2, or 3.in_channels – number of input channels.
n_chns_1 – number of output channels in the 1st convolution.
n_chns_2 – number of output channels in the 2nd convolution.
n_chns_3 – number of output channels in the 3rd convolution.
conv_param_1 – additional parameters to the 1st convolution. Defaults to
{"kernel_size": 1, "norm": Norm.BATCH, "act": ("relu", {"inplace": True})}
conv_param_2 – additional parameters to the 2nd convolution. Defaults to
{"kernel_size": 3, "norm": Norm.BATCH, "act": ("relu", {"inplace": True})}
conv_param_3 – additional parameters to the 3rd convolution. Defaults to
{"kernel_size": 1, "norm": Norm.BATCH, "act": None}
project – when the number of residual channels does not match the number of output channels, a projection (Conv) layer/block is used to adjust the number of channels. In SENet, it consists of a Conv layer followed by a Norm layer. Defaults to None (the channels already match) or a Conv layer with kernel size 1.
r – the reduction ratio r in the paper. Defaults to 2.
acti_type_1 – activation type of the hidden squeeze layer. Defaults to “relu”.
acti_type_2 – activation type of the output squeeze layer. Defaults to “sigmoid”.
acti_type_final – activation type of the end of the block. Defaults to “relu”.
Simple ASPP¶
-
class
monai.networks.blocks.
SimpleASPP
(spatial_dims, in_channels, conv_out_channels, kernel_sizes=(1, 3, 3, 3), dilations=(1, 2, 4, 6), norm_type='BATCH', acti_type='LEAKYRELU')[source]¶ A simplified version of the atrous spatial pyramid pooling (ASPP) module.
Chen et al., Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. https://arxiv.org/abs/1802.02611
Wang et al., A Noise-robust Framework for Automatic Segmentation of COVID-19 Pneumonia Lesions from CT Images. https://ieeexplore.ieee.org/document/9109297
- Parameters
spatial_dims (int) – number of spatial dimensions, could be 1, 2, or 3.
in_channels (int) – number of input channels.
conv_out_channels (int) – number of output channels of each atrous conv. The final number of output channels is conv_out_channels * len(kernel_sizes).
kernel_sizes (Sequence[int]) – a sequence of four convolutional kernel sizes. Defaults to (1, 3, 3, 3) for four (dilated) convolutions.
dilations (Sequence[int]) – a sequence of four convolutional dilation parameters. Defaults to (1, 2, 4, 6) for four (dilated) convolutions.
norm_type (Union[Tuple, str, None]) – final kernel-size-one convolution normalization type. Defaults to batch norm.
acti_type (Union[Tuple, str, None]) – final kernel-size-one convolution activation type. Defaults to leaky ReLU.
- Raises
ValueError – When kernel_sizes length differs from dilations.
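The channel arithmetic and the length check documented for SimpleASPP can be sketched without any framework. The helper names `aspp_output_channels` and `effective_kernel_size` are hypothetical and not part of MONAI; they only mirror the documented rule that the output has conv_out_channels * len(kernel_sizes) channels, plus the standard formula for the extent of a dilated kernel.

```python
from typing import Sequence


def aspp_output_channels(
    conv_out_channels: int,
    kernel_sizes: Sequence[int] = (1, 3, 3, 3),
    dilations: Sequence[int] = (1, 2, 4, 6),
) -> int:
    """Hypothetical helper: channels after concatenating all atrous branches."""
    if len(kernel_sizes) != len(dilations):
        # mirrors the documented ValueError condition
        raise ValueError(
            f"kernel_sizes and dilations must have equal length, "
            f"got {len(kernel_sizes)} and {len(dilations)}."
        )
    return conv_out_channels * len(kernel_sizes)


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Extent covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


channels = aspp_output_channels(8)       # four branches of 8 channels each
extent = effective_kernel_size(3, 6)     # 3-tap kernel with dilation 6
```

With the default settings, each extra branch adds conv_out_channels channels, so the widest branch (kernel 3, dilation 6) sees 13 positions along each axis while the output width stays fixed by the number of branches.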
MaxAvgPooling¶
-
class
monai.networks.blocks.
MaxAvgPool
(spatial_dims, kernel_size, stride=None, padding=0, ceil_mode=False)[source]¶ Downsample with both maxpooling and avgpooling, double the channel size by concatenating the downsampled feature maps.
- Parameters
spatial_dims (int) – number of spatial dimensions of the input image.
kernel_size (Union[Sequence[int], int]) – the kernel size of both pooling operations.
stride (Union[Sequence[int], int, None]) – the stride of the window. Default value is kernel_size.
padding (Union[Sequence[int], int]) – implicit zero padding to be added to both pooling operations.
ceil_mode (bool) – when True, will use ceil instead of floor to compute the output shape.
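A framework-free 1D sketch of the idea behind MaxAvgPool: pool every channel with both max and average, then concatenate the results so the channel count doubles. The function name `max_avg_pool_1d` is hypothetical, the stride is assumed equal to kernel_size with no padding, and placing the max-pooled channels first is an assumption rather than something the text above states.

```python
from typing import List


def max_avg_pool_1d(channels: List[List[float]], kernel_size: int) -> List[List[float]]:
    """Pool each channel with max and average, then concatenate along the channel axis."""
    max_out: List[List[float]] = []
    avg_out: List[List[float]] = []
    for signal in channels:
        # non-overlapping windows (stride == kernel_size, floor mode, no padding)
        windows = [
            signal[i:i + kernel_size]
            for i in range(0, len(signal) - kernel_size + 1, kernel_size)
        ]
        max_out.append([max(w) for w in windows])
        avg_out.append([sum(w) / kernel_size for w in windows])
    # assumed order: max-pooled channels first, then average-pooled channels
    return max_out + avg_out


pooled = max_avg_pool_1d([[1.0, 3.0, 2.0, 6.0]], kernel_size=2)
# one input channel becomes two output channels: [[3.0, 6.0], [2.0, 4.0]]
```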
Upsampling¶
-
class
monai.networks.blocks.
UpSample
(dimensions, in_channels=None, out_channels=None, scale_factor=2, size=None, mode=<UpsampleMode.DECONV: 'deconv'>, pre_conv='default', interp_mode=<InterpolateMode.LINEAR: 'linear'>, align_corners=True, bias=True, apply_pad_pool=True)[source]¶ Upsamples data by scale_factor. Supported modes are:
“deconv”: uses a transposed convolution.
“nontrainable”: uses torch.nn.Upsample.
“pixelshuffle”: uses monai.networks.blocks.SubpixelUpsample.
This module can optionally take a pre-convolution (often used to map the number of features from in_channels to out_channels).
- Parameters
dimensions (int) – number of spatial dimensions of the input image.
in_channels (Optional[int]) – number of channels of the input image.
out_channels (Optional[int]) – number of channels of the output image. Defaults to in_channels.
scale_factor (Union[Sequence[float], float]) – multiplier for spatial size. Has to match input size if it is a tuple. Defaults to 2.
size (Union[Tuple[int], int, None]) – spatial size of the output image. Only used when mode is UpsampleMode.NONTRAINABLE. In torch.nn.functional.interpolate, only one of size or scale_factor should be defined, thus if size is defined, scale_factor will not be used. Defaults to None.
mode (Union[UpsampleMode, str]) – {"deconv", "nontrainable", "pixelshuffle"}. Defaults to "deconv".
pre_conv (Union[Module, str, None]) – a conv block applied before upsampling. Defaults to "default". When pre_conv is "default", one reserved conv layer will be utilized. Only used in the “nontrainable” or “pixelshuffle” mode.
interp_mode (Union[InterpolateMode, str]) – {"nearest", "linear", "bilinear", "bicubic", "trilinear"} The interpolation mode. Only used when mode is UpsampleMode.NONTRAINABLE. If it ends with "linear", the number of spatial dims is used to determine the correct interpolation. This corresponds to linear, bilinear, trilinear for 1D, 2D, and 3D respectively. Defaults to "linear". See also: https://pytorch.org/docs/stable/nn.html#upsample
align_corners (Optional[bool]) – set the align_corners parameter of torch.nn.Upsample. Defaults to True. Only used in the nontrainable mode.
bias (bool) – whether to have a bias term in the default preconv and deconv layers. Defaults to True.
apply_pad_pool (bool) – if True the upsampled tensor is padded then average pooling is applied with a kernel the size of scale_factor with a stride of 1. See also: monai.networks.blocks.SubpixelUpsample. Only used in the pixelshuffle mode.
-
monai.networks.blocks.
Upsample
¶ alias of
monai.networks.blocks.upsample.UpSample
-
class
monai.networks.blocks.
SubpixelUpsample
(dimensions, in_channels, out_channels=None, scale_factor=2, conv_block='default', apply_pad_pool=True, bias=True)[source]¶ Upsample using a subpixel CNN. This module supports 1D, 2D and 3D input images. The module consists of two parts. First, a convolutional layer is employed to increase the number of channels to in_channels * (scale_factor ** dimensions). Second, a pixel shuffle operation aggregates the feature maps from the low-resolution space to build the super-resolution space. The first part of the module is not fixed: a sequence of layers can be used to replace the default single layer.
See: Shi et al., 2016, “Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network.”
See: Aitken et al., 2017, “Checkerboard artifact free sub-pixel convolution”.
The idea comes from: https://arxiv.org/abs/1609.05158
The pixel shuffle mechanism refers to: https://pytorch.org/docs/stable/generated/torch.nn.PixelShuffle.html#torch.nn.PixelShuffle. and: https://github.com/pytorch/pytorch/pull/6340.
- Parameters
dimensions (int) – number of spatial dimensions of the input image.
in_channels (Optional[int]) – number of channels of the input image.
out_channels (Optional[int]) – optional number of channels of the output image.
scale_factor (int) – multiplier for spatial size. Defaults to 2.
conv_block (Union[Module, str, None]) – a conv block to extract feature maps before upsampling. Defaults to "default".
When conv_block is "default", one reserved conv layer will be utilized.
When conv_block is an nn.Module, please ensure the output number of channels is divisible by (scale_factor ** dimensions).
apply_pad_pool (bool) – if True the upsampled tensor is padded then average pooling is applied with a kernel the size of scale_factor with a stride of 1. This implements the nearest neighbour resize convolution component of subpixel convolutions described in Aitken et al.
bias (bool) – whether to have a bias term in the default conv_block. Defaults to True.
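The pixel shuffle step described for SubpixelUpsample can be illustrated in one spatial dimension with plain lists. This is only a sketch: `pixel_shuffle_1d` is a hypothetical name, the batch dimension is dropped, and the channel layout (output channel c gathers input channels c * r ... c * r + r - 1) follows the torch.nn.PixelShuffle convention, which is assumed here to carry over to the 1D case.

```python
from typing import List


def pixel_shuffle_1d(x: List[List[float]], scale_factor: int) -> List[List[float]]:
    """Rearrange a (C * r, L) array into (C, L * r)."""
    r = scale_factor
    if len(x) % r != 0:
        raise ValueError("number of channels must be divisible by scale_factor ** dimensions")
    out_channels, length = len(x) // r, len(x[0])
    out = [[0.0] * (length * r) for _ in range(out_channels)]
    for c in range(out_channels):
        for pos in range(length):
            for i in range(r):
                # interleave the r channels of group c along the spatial axis
                out[c][pos * r + i] = x[c * r + i][pos]
    return out


low_res = [[1.0, 2.0], [10.0, 20.0]]   # 2 channels, length 2
high_res = pixel_shuffle_1d(low_res, scale_factor=2)
# a single channel of length 4: [[1.0, 10.0, 2.0, 20.0]]
```

The divisibility check is the same constraint the conv_block entry above places on a user-supplied module: the shuffle can only trade channels for resolution in whole groups of scale_factor ** dimensions.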
-
monai.networks.blocks.
Subpixelupsample
¶ alias of
monai.networks.blocks.upsample.SubpixelUpsample
-
monai.networks.blocks.
SubpixelUpSample
¶ alias of
monai.networks.blocks.upsample.SubpixelUpsample
Registration Residual Conv Block¶
-
class
monai.networks.blocks.
RegistrationResidualConvBlock
(spatial_dims, in_channels, out_channels, num_layers=2, kernel_size=3)[source]¶ A block with skip links and a layer, norm, activation sequence. Only the number of channels changes; the spatial size is kept the same.
- Parameters
spatial_dims (int) – number of spatial dimensions
in_channels (int) – number of input channels
out_channels (int) – number of output channels
num_layers (int) – number of layers inside the block
kernel_size (int) – kernel_size
Registration Down Sample Block¶
-
class
monai.networks.blocks.
RegistrationDownSampleBlock
(spatial_dims, channels, pooling)[source]¶ A down-sample module used in RegUNet to halve the spatial size. The number of channels is kept the same.
- Adapted from:
DeepReg (https://github.com/DeepRegNet/DeepReg)
- Parameters
spatial_dims (int) – number of spatial dimensions.
channels (int) – channels
pooling (bool) – use MaxPool if True, strided conv if False
-
forward
(x)[source]¶ Halves the spatial dimensions and keeps the same number of channels. Output in shape (batch, channels, insize_1 / 2, insize_2 / 2, [insize_3 / 2]).
- Parameters
x (Tensor) – Tensor in shape (batch, channels, insize_1, insize_2, [insize_3])
- Raises
ValueError – when input spatial dimensions are not even.
- Return type
Tensor
Registration Extraction Block¶
-
class
monai.networks.blocks.
RegistrationExtractionBlock
(spatial_dims, extract_levels, num_channels, out_channels, kernel_initializer='kaiming_uniform', activation=None)[source]¶ The Extraction Block used in RegUNet. Extracts features from each of the extract_levels and takes the average.
- Parameters
spatial_dims (int) – number of spatial dimensions
extract_levels (Tuple[int]) – spatial levels to extract features from, 0 refers to the input scale
num_channels (Union[Tuple[int], List[int]]) – number of channels at each scale level, a List or Tuple whose length equals the depth of the RegNet
out_channels (int) – number of output channels
kernel_initializer (Optional[str]) – kernel initializer
activation (Optional[str]) – kernel activation function
-
forward
(x, image_size)[source]¶
- Parameters
x (List[Tensor]) – Decoded feature at different spatial levels, sorted from deep to shallow
image_size (List[int]) – output image size
- Return type
Tensor
- Returns
Tensor of shape (batch, out_channels, size1, size2, size3), where (size1, size2, size3) = image_size
LocalNet DownSample Block¶
-
class
monai.networks.blocks.
LocalNetDownSampleBlock
(spatial_dims, in_channels, out_channels, kernel_size)[source]¶ A down-sample module that can be used for LocalNet, based on: Weakly-supervised convolutional neural networks for multimodal image registration. Label-driven weakly-supervised learning for multimodal deformable image registration.
- Adapted from:
DeepReg (https://github.com/DeepRegNet/DeepReg)
- Parameters
spatial_dims (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
kernel_size (Union[Sequence[int], int]) – convolution kernel size.
- Raises
NotImplementedError – when kernel_size is even
-
forward
(x)[source]¶ Halves the spatial dimensions. A tuple of (x, mid) is returned:
x is the downsample result, in shape (batch, out_channels, insize_1 / 2, insize_2 / 2, [insize_3 / 2]),
mid is the mid-level feature, in shape (batch, out_channels, insize_1, insize_2, [insize_3])
- Parameters
x – Tensor in shape (batch, in_channels, insize_1, insize_2, [insize_3])
- Raises
ValueError – when input spatial dimensions are not even.
- Return type
Tuple[Tensor, Tensor]
LocalNet UpSample Block¶
-
class
monai.networks.blocks.
LocalNetUpSampleBlock
(spatial_dims, in_channels, out_channels)[source]¶ An up-sample module that can be used for LocalNet, based on: Weakly-supervised convolutional neural networks for multimodal image registration. Label-driven weakly-supervised learning for multimodal deformable image registration.
- Adapted from:
DeepReg (https://github.com/DeepRegNet/DeepReg)
- Parameters
spatial_dims (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
- Raises
ValueError – when in_channels != 2 * out_channels
-
forward
(x, mid)[source]¶ Halves the number of channels and doubles the spatial dimensions.
- Parameters
x – feature to be up-sampled, in shape (batch, in_channels, insize_1, insize_2, [insize_3])
mid – mid-level feature saved during down-sampling, in shape (batch, out_channels, midsize_1, midsize_2, [midsize_3])
- Raises
ValueError – when midsize != insize * 2
- Return type
Tensor
LocalNet Feature Extractor Block¶
-
class
monai.networks.blocks.
LocalNetFeatureExtractorBlock
(spatial_dims, in_channels, out_channels, act='RELU', initializer='kaiming_uniform')[source]¶ A feature-extraction module that can be used for LocalNet, based on: Weakly-supervised convolutional neural networks for multimodal image registration. Label-driven weakly-supervised learning for multimodal deformable image registration.
- Adapted from:
DeepReg (https://github.com/DeepRegNet/DeepReg)
- Parameters
spatial_dims – number of spatial dimensions.
in_channels – number of input channels.
out_channels – number of output channels.
act – activation type and arguments. Defaults to ReLU.
kernel_initializer – kernel initializer. Defaults to None.
MLP Block¶
-
class
monai.networks.blocks.
MLPBlock
(hidden_size, mlp_dim, dropout_rate=0.0)[source]¶ A multi-layer perceptron block, based on: “Dosovitskiy et al., An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale <https://arxiv.org/abs/2010.11929>”
- Parameters
hidden_size (int) – dimension of hidden layer.
mlp_dim (int) – dimension of feedforward layer.
dropout_rate (float) – fraction of the input units to drop.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
Patch Embedding Block¶
-
class
monai.networks.blocks.
PatchEmbeddingBlock
(in_channels, img_size, patch_size, hidden_size, num_heads, pos_embed, dropout_rate=0.0)[source]¶ A patch embedding block, based on: “Dosovitskiy et al., An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale <https://arxiv.org/abs/2010.11929>”
- Parameters
in_channels (int) – dimension of input channels.
img_size (Tuple[int, int, int]) – dimension of input image.
patch_size (Tuple[int, int, int]) – dimension of patch size.
hidden_size (int) – dimension of hidden layer.
num_heads (int) – number of attention heads.
pos_embed (str) – position embedding layer type.
dropout_rate (float) – fraction of the input units to drop.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
Warp¶
-
class
monai.networks.blocks.
Warp
(mode='bilinear', padding_mode='border')[source]¶ Warp an image with given dense displacement field (DDF).
For pytorch native APIs, the possible values are:
mode: "nearest", "bilinear", "bicubic".
padding_mode: "zeros", "border", "reflection"
See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample
For MONAI C++/CUDA extensions, the possible values are:
mode: "nearest", "bilinear", "bicubic", 0, 1, …
padding_mode: "zeros", "border", "reflection", 0, 1, …
See also: monai.networks.layers.grid_pull
DVF2DDF¶
-
class
monai.networks.blocks.
DVF2DDF
(num_steps=7, mode='bilinear', padding_mode='zeros')[source]¶ Layer calculates a dense displacement field (DDF) from a dense velocity field (DVF) with scaling and squaring.
- Adapted from:
DeepReg (https://github.com/DeepRegNet/DeepReg)
Initializes internal Module state, shared by both nn.Module and ScriptModule.
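The scaling-and-squaring integration mentioned for DVF2DDF can be sketched in one dimension with plain Python. The velocity field is first scaled by 1 / 2 ** num_steps, then composed with itself num_steps times. The names `warp_1d` and `dvf_to_ddf_1d` are hypothetical, and the sketch clamps sample positions at the borders for simplicity, whereas the class signature above defaults to padding_mode='zeros'.

```python
import math
from typing import List


def warp_1d(field: List[float], ddf: List[float]) -> List[float]:
    """Sample `field` at positions i + ddf[i] with linear interpolation (border clamping)."""
    n = len(field)
    out = []
    for i in range(n):
        pos = min(max(i + ddf[i], 0.0), n - 1.0)   # clamp to the valid index range
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append((1.0 - w) * field[lo] + w * field[hi])
    return out


def dvf_to_ddf_1d(dvf: List[float], num_steps: int = 7) -> List[float]:
    """Integrate a stationary velocity field by scaling and squaring."""
    # scaling: start from a displacement small enough to approximate the flow
    ddf = [v / (2 ** num_steps) for v in dvf]
    # squaring: compose the current displacement with itself, doubling the time span
    for _ in range(num_steps):
        warped = warp_1d(ddf, ddf)
        ddf = [d + w for d, w in zip(ddf, warped)]
    return ddf


ddf = dvf_to_ddf_1d([0.5] * 8)   # a constant velocity integrates to the same constant shift
```

Each squaring step doubles the integration time, so num_steps=7 corresponds to 128 small steps at the cost of only seven warps.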
Layers¶
Factories¶
Defines factories for creating layers in generic, extensible, and dimensionally independent ways. A separate factory object is created for each type of layer, and factory functions keyed to names are added to these objects. Whenever a layer is requested the factory name and any necessary arguments are passed to the factory object. The return value is typically a type but can be any callable producing a layer object.
The factory objects contain functions keyed to names converted to upper case. These names can be referred to as members of the factory so that they can function as constant identifiers, e.g. instance normalization is named Norm.INSTANCE.
For example, to get a transpose convolution layer the name is needed and then a dimension argument is provided which is passed to the factory function:
dimension = 3
name = Conv.CONVTRANS
conv = Conv[name, dimension]
This allows the dimension value to be set in the constructor, for example so that the dimensionality of a network is parameterizable. Not all factories require arguments after the name, the caller must be aware which are required.
Defining new factories involves creating the object then associating it with factory functions:
fact = LayerFactory()
@fact.factory_function('test')
def make_something(x, y):
    # do something with x and y to choose which layer type to return
    return SomeLayerType
...
# request object from factory TEST with 1 and 2 as values for x and y
layer = fact[fact.TEST, 1, 2]
Typically the caller of a factory knows what arguments to pass (i.e. the dimensionality of the requested type), but the call can also be parameterized with the factory name and the arguments to pass to the created type at instantiation time:
def use_factory(fact_args):
    # split the (name, kwargs) pair into the factory name and constructor arguments
    fact_name, type_args = split_args(fact_args)
    layer_type = fact[fact_name, 1, 2]
    return layer_type(**type_args)
...
kw_args = {'arg0':0, 'arg1':True}
layer = use_factory( (fact.TEST, kw_args) )
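To make the mechanics above concrete, here is a minimal, self-contained sketch of the pattern: names are stored in upper case, exposed as attributes, and looked up with `fact[name, *args]`. `MiniFactory` and `make_container` are hypothetical stand-ins written for illustration; this is not the LayerFactory implementation, and it returns built-in container types instead of layer types so that it runs without PyTorch.

```python
from typing import Any, Callable, Dict


class MiniFactory:
    """Hypothetical, simplified stand-in for the factory pattern described above."""

    def __init__(self) -> None:
        self.factories: Dict[str, Callable] = {}

    def factory_function(self, name: str) -> Callable:
        """Decorator registering a factory function under the upper-cased name."""
        def decorator(func: Callable) -> Callable:
            self.factories[name.upper()] = func
            return func
        return decorator

    def __getattr__(self, key: str) -> str:
        # only reached when normal attribute lookup fails, e.g. for fact.TEST
        if key in self.__dict__.get("factories", {}):
            return key
        raise AttributeError(key)

    def __getitem__(self, args: Any) -> Any:
        # accept either fact["NAME"] or fact["NAME", arg1, arg2, ...]
        if isinstance(args, str):
            name, rest = args, []
        else:
            name, *rest = args
        return self.factories[name.upper()](*rest)


fact = MiniFactory()


@fact.factory_function("test")
def make_container(ordered: bool, mutable: bool) -> type:
    # choose which type to return based on the factory arguments
    if mutable:
        return list if ordered else set
    return tuple if ordered else frozenset


container_type = fact[fact.TEST, True, False]   # the factory returns a type, here tuple
instance = container_type([1, 2, 3])            # instantiate it like any other constructor
```

Returning a type rather than an instance is what lets the caller defer constructor arguments, which is the role the dimension argument plays for Conv in the example above.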
-
class
monai.networks.layers.
LayerFactory
[source]¶ Factory object for creating layers, this uses given factory functions to actually produce the types or constructing callables. These functions are referred to by name and can be added at any time.
-
add_factory_callable
(name, func)[source]¶ Add the factory function to this object under the given name.
- Return type
None
-
factory_function
(name)[source]¶ Decorator for adding a factory function with the given name.
- Return type
Callable
-
get_constructor
(factory_name, *args)[source]¶ Get the constructor for the given factory name and arguments.
- Raises
TypeError – When factory_name is not a str.
- Return type
Any
-
property
names
¶ Produces all factory names.
- Return type
Tuple[str, …]
-
split_args¶
-
monai.networks.layers.
split_args
(args)[source]¶ Split arguments in a way to be suitable for using with the factory types. If args is a string it’s interpreted as the type name.
- Parameters
args (str or a tuple of object name and kwarg dict) – input arguments to be parsed.
- Raises
TypeError – When args type is not in Union[str, Tuple[Union[str, Callable], dict]].
Examples:
>>> act_type, args = split_args("PRELU")
>>> monai.networks.layers.Act[act_type]
<class 'torch.nn.modules.activation.PReLU'>
>>> act_type, args = split_args(("PRELU", {"num_parameters": 1, "init": 0.25}))
>>> monai.networks.layers.Act[act_type](**args)
PReLU(num_parameters=1)
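The documented behaviour of split_args (a bare string is the type name, a pair is the name plus keyword arguments, anything else is a TypeError) can be sketched in a few lines. `split_args_sketch` is a hypothetical re-implementation written from the description above, not the library source, and its error message is made up for illustration.

```python
from typing import Any, Callable, Dict, Tuple, Union


def split_args_sketch(
    args: Union[str, Tuple[Union[str, Callable], Dict[str, Any]]]
) -> Tuple[Union[str, Callable], Dict[str, Any]]:
    """Return (name_or_callable, kwargs) from a string or a (name, kwargs) pair."""
    if isinstance(args, str):
        # a bare string is interpreted as the type name with no extra arguments
        return args, {}
    name_obj, name_args = args
    if not (isinstance(name_obj, str) or callable(name_obj)) or not isinstance(name_args, dict):
        raise TypeError("args must be a str or a (str | callable, dict) pair.")
    return name_obj, name_args


name, kwargs = split_args_sketch("PRELU")
name2, kwargs2 = split_args_sketch(("PRELU", {"num_parameters": 1, "init": 0.25}))
```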
Dropout¶
The supported members are: DROPOUT, ALPHADROPOUT.
Please see monai.networks.layers.split_args for additional args parsing.
Act¶
The supported members are: ELU, RELU, LEAKYRELU, PRELU, RELU6, SELU, CELU, GELU, SIGMOID, TANH, SOFTMAX, LOGSOFTMAX, SWISH, MEMSWISH, MISH.
Please see monai.networks.layers.split_args for additional args parsing.
Norm¶
The supported members are: INSTANCE, BATCH, GROUP, LAYER, LOCALRESPONSE, SYNCBATCH.
Please see monai.networks.layers.split_args for additional args parsing.
Conv¶
The supported members are: CONV, CONVTRANS.
Please see monai.networks.layers.split_args for additional args parsing.
Pool¶
The supported members are: MAX, ADAPTIVEMAX, AVG, ADAPTIVEAVG.
Please see monai.networks.layers.split_args for additional args parsing.
ChannelPad¶
-
class
monai.networks.layers.
ChannelPad
(spatial_dims, in_channels, out_channels, mode=<ChannelMatching.PAD: 'pad'>)[source]¶ Expand the input tensor’s channel dimension from length in_channels to out_channels, by padding or a projection.
- Parameters
spatial_dims (int) – number of spatial dimensions of the input image.
in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
mode (Union[ChannelMatching, str]) – {"pad", "project"} Specifies handling residual branch and conv branch channel mismatches. Defaults to "pad".
"pad": with zero padding.
"project": with a trainable conv with kernel size one.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- Return type
Tensor
SkipConnection¶
-
class
monai.networks.layers.
SkipConnection
(submodule, dim=1, mode='cat')[source]¶ Combine the forward pass input with the result from the given submodule:
--+--submodule--o--
  |_____________|
The available modes are "cat", "add", "mul".
- Parameters
submodule – the module that defines the trainable branch.
dim (int) – the dimension over which the tensors are concatenated. Used when mode is "cat".
mode (Union[str, SkipMode]) – "cat", "add", "mul". Defaults to "cat".
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- Return type
Tensor
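The three SkipConnection modes can be illustrated with nested lists standing in for a (channels, length) tensor. `skip_connection` is a hypothetical function, not the module itself; concatenation is done along the channel axis with the input placed first, which is an assumption about ordering, and the exception type for an unknown mode is likewise chosen for this sketch.

```python
from typing import Callable, List

Tensor2D = List[List[float]]  # (channels, length) stand-in for a tensor


def skip_connection(x: Tensor2D, submodule: Callable[[Tensor2D], Tensor2D], mode: str = "cat") -> Tensor2D:
    """Combine the input with the submodule output using the requested mode."""
    y = submodule(x)
    if mode == "cat":
        # concatenate along the channel axis: input channels first, then the branch output
        return x + y
    if mode == "add":
        return [[a + b for a, b in zip(cx, cy)] for cx, cy in zip(x, y)]
    if mode == "mul":
        return [[a * b for a, b in zip(cx, cy)] for cx, cy in zip(x, y)]
    raise ValueError(f"Unsupported mode {mode!r}.")


def double(t: Tensor2D) -> Tensor2D:
    """Toy branch that scales every value by two."""
    return [[2.0 * v for v in channel] for channel in t]


x = [[1.0, 2.0]]
cat_out = skip_connection(x, double, "cat")   # two channels
add_out = skip_connection(x, double, "add")   # same shape as x
mul_out = skip_connection(x, double, "mul")   # same shape as x
```

Only "cat" changes the channel count; "add" and "mul" require the branch output to have the same shape as the input.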
Flatten¶
-
class
monai.networks.layers.
Flatten
[source]¶ Flattens the given input in the forward pass to be [B,-1] in shape.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- Return type
Tensor
-
GaussianFilter¶
-
class
monai.networks.layers.
GaussianFilter
(spatial_dims, sigma, truncated=4.0, approx='erf', requires_grad=False)[source]¶
- Parameters
spatial_dims (int) – number of spatial dimensions of the input image. The input must have shape (Batch, channels, H[, W, …]).
sigma (Union[Sequence[float], float, Sequence[Tensor], Tensor]) – standard deviation. Could be a single value, or spatial_dims number of values.
truncated (float) – how many standard deviations the kernel spreads.
approx (str) – discrete Gaussian kernel type, available options are “erf”, “sampled”, and “scalespace”.
erf approximation interpolates the error function;
sampled uses a sampled Gaussian kernel;
scalespace corresponds to https://en.wikipedia.org/wiki/Scale_space_implementation#The_discrete_Gaussian_kernel based on the modified Bessel functions.
requires_grad (bool) – whether to store the gradients for sigma. If True, sigma will be the initial value of the parameters of this module (for example the parameters() iterator could be used to get the parameters); otherwise this module will fix the kernels using sigma as the std.
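As an illustration of the “sampled” option, the sketch below builds a normalized 1D kernel by sampling the Gaussian density at integer offsets. `sampled_gaussian_kernel` is a hypothetical helper; in particular, the radius rule ceil(truncated * sigma) is an assumption made for this example, and the library may truncate the kernel differently.

```python
import math
from typing import List


def sampled_gaussian_kernel(sigma: float, truncated: float = 4.0) -> List[float]:
    """Sample exp(-x^2 / (2 sigma^2)) at integer offsets and normalize to unit sum."""
    # assumed truncation rule: cover `truncated` standard deviations on each side
    radius = max(int(math.ceil(truncated * sigma)), 1)
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


kernel = sampled_gaussian_kernel(1.0)   # 9 taps for sigma=1, truncated=4
```

Because a Gaussian is separable, one such 1D kernel per spatial dimension is enough to blur an N-dimensional image, which is why sigma may be given as spatial_dims separate values.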
BilateralFilter¶
-
class
monai.networks.layers.
BilateralFilter
(*args, **kwargs)[source]¶ Blurs the input tensor spatially whilst preserving edges. Can run on 1D, 2D, or 3D, tensors (on top of Batch and Channel dimensions). Two implementations are provided, an exact solution and a much faster approximation which uses a permutohedral lattice.
- See:
https://en.wikipedia.org/wiki/Bilateral_filter https://graphics.stanford.edu/papers/permutohedral/
- Parameters
input – input tensor.
spatial_sigma – the standard deviation of the spatial blur. Higher values can hurt performance when not using the approximate method (see fast_approx).
color_sigma – the standard deviation of the color blur. Lower values preserve edges better whilst higher values tend to a simple gaussian spatial blur.
fast_approx – This flag chooses between two implementations. The approximate method may produce artifacts in some scenarios whereas the exact solution may be intolerably slow for high spatial standard deviations.
- Returns
output tensor.
- Return type
output (torch.Tensor)
-
static
backward
(ctx, grad_output)[source]¶ Defines a formula for differentiating the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned (None will be passed in for non-tensor outputs of the forward function), and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Tensor or is a Tensor not requiring grads, you can just pass None as a gradient for that input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.
-
static
forward
(ctx, input, spatial_sigma=5, color_sigma=0.5, fast_approx=True)[source]¶ Performs the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
The context can be used to store arbitrary data that can be then retrieved during the backward pass.
PHLFilter¶
-
class
monai.networks.layers.
PHLFilter
(*args, **kwargs)[source]¶ Filters input based on arbitrary feature vectors. Uses a permutohedral lattice data structure to efficiently approximate n-dimensional gaussian filtering. Complexity is broadly independent of kernel size. Most applicable to higher filter dimensions and larger kernel sizes.
- Parameters
input – input tensor to be filtered.
features – feature tensor used to filter the input.
sigmas – the standard deviations of each feature in the filter.
- Returns
output tensor.
- Return type
output (torch.Tensor)
GaussianMixtureModel¶
-
class
monai.networks.layers.
GaussianMixtureModel
(channel_count, mixture_count, mixture_size, verbose_build=False)[source]¶ Takes an initial labeling and uses a mixture of Gaussians to approximate each class's distribution in the feature space. Each unlabeled element is then assigned a probability of belonging to each class based on its fit to each class's approximated distribution.
- Parameters
channel_count (int) – The number of features per element.
mixture_count (int) – The number of class distributions.
mixture_size (int) – The number of Gaussian components per class distribution.
verbose_build (bool) – If True, turns on verbose logging of load steps.
SavitzkyGolayFilter¶
-
class
monai.networks.layers.
SavitzkyGolayFilter
(window_length, order, axis=2, mode='zeros')[source]¶ Convolve a Tensor along a particular axis with a Savitzky-Golay kernel.
- Parameters
window_length (int) – Length of the filter window, must be a positive odd integer.
order (int) – Order of the polynomial to fit to each window, must be less than window_length.
axis (optional) – Axis along which to apply the filter kernel. Default 2 (first spatial dimension).
mode (string, optional) – padding mode passed to convolution class: 'zeros', 'reflect', 'replicate' or 'circular'. Default: 'zeros'. See torch.nn.Conv1d() for more information.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶
- Parameters
x (Tensor) – Tensor or array-like to filter. Must be real, in shape [Batch, chns, spatial1, spatial2, ...] and have a device type of 'cpu'.
- Returns
x filtered by Savitzky-Golay kernel with window length self.window_length using polynomials of order self.order, along axis specified in self.axis.
- Return type
torch.Tensor
HilbertTransform¶
-
class
monai.networks.layers.
HilbertTransform
(axis=2, n=None)[source]¶ Determine the analytical signal of a Tensor along a particular axis. Requires PyTorch 1.7.0+ and the PyTorch FFT module (which is not included in NVIDIA PyTorch Release 20.10).
- Parameters
axis (int) – Axis along which to apply Hilbert transform. Default 2 (first spatial dimension).
N – Number of Fourier components (i.e. FFT size). Default: x.shape[axis].
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶
- Parameters
x (Tensor) – Tensor or array-like to transform. Must be real and in shape [Batch, chns, spatial1, spatial2, ...].
- Returns
Analytical signal of x, transformed along axis specified in self.axis using FFT of size self.N. The absolute value of x_ht relates to the envelope of x along axis self.axis.
- Return type
torch.Tensor
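The analytical signal can be computed in the frequency domain: take the Fourier transform, keep the zero frequency, double the positive frequencies, zero the negative ones, and transform back. The sketch below does this for a 1D list with a naive O(n^2) DFT from the standard library so it runs without PyTorch. `analytic_signal` is a hypothetical name, and the FFT size is fixed to len(x) rather than a configurable N.

```python
import cmath
import math
from typing import List


def analytic_signal(x: List[float]) -> List[complex]:
    """Analytic signal of a real 1D sequence via a frequency-domain step filter."""
    n = len(x)
    if n == 0:
        return []
    # forward DFT (naive; a practical implementation would use an FFT)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # step filter: 1 at DC (and Nyquist for even n), 2 at positive, 0 at negative frequencies
    h = [0.0] * n
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        for k in range(1, n // 2):
            h[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            h[k] = 2.0
    filtered = [s * w for s, w in zip(spectrum, h)]
    # inverse DFT
    return [
        sum(filtered[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


signal = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
envelope = [abs(v) for v in analytic_signal(signal)]   # close to 1.0 everywhere for a pure cosine
```

The real part of the result reproduces the input, and the magnitude gives the envelope, which is the quantity the Returns entry above refers to.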
Affine Transform¶
-
class
monai.networks.layers.
AffineTransform
(spatial_size=None, normalized=False, mode=<GridSampleMode.BILINEAR: 'bilinear'>, padding_mode=<GridSamplePadMode.ZEROS: 'zeros'>, align_corners=False, reverse_indexing=True)[source]¶ Apply affine transformations with a batch of affine matrices.
When normalized=False and reverse_indexing=True, it does the commonly used resampling in the ‘pull’ direction following the scipy.ndimage.affine_transform convention. In this case theta is equivalent to the (ndim+1, ndim+1) input matrix of scipy.ndimage.affine_transform, which operates on homogeneous coordinates. See also: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.affine_transform.html
When normalized=True and reverse_indexing=False, it applies theta to the normalized coordinates (coords. in the range of [-1, 1]) directly. This is often used with align_corners=False to achieve resolution-agnostic resampling, thus useful as a part of trainable modules such as the spatial transformer networks. See also: https://pytorch.org/tutorials/intermediate/spatial_transformer_tutorial.html
- Parameters
spatial_size (Union[Sequence[int], int, None]) – output spatial shape, the full output shape will be [N, C, *spatial_size] where N and C are inferred from the src input of self.forward.
normalized (bool) – indicating whether the provided affine matrix theta is defined for the normalized coordinates. If normalized=False, theta will be converted to operate on normalized coordinates as pytorch affine_grid works with the normalized coordinates.
mode (Union[GridSampleMode, str]) – {"bilinear", "nearest"} Interpolation mode to calculate output values. Defaults to "bilinear". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample
padding_mode (Union[GridSamplePadMode, str]) – {"zeros", "border", "reflection"} Padding mode for outside grid values. Defaults to "zeros". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample
align_corners (bool) – see also https://pytorch.org/docs/stable/nn.functional.html#grid-sample.
reverse_indexing (bool) – whether to reverse the spatial indexing of image and coordinates. Set to False if theta follows pytorch’s default “D, H, W” convention. Set to True if theta follows scipy.ndimage default “i, j, k” convention.
-
forward
(src, theta, spatial_size=None)[source]¶ theta must be an affine transformation matrix with shape 3x3 or Nx3x3 or Nx2x3 or 2x3 for spatial 2D transforms, 4x4 or Nx4x4 or Nx3x4 or 3x4 for spatial 3D transforms, where N is the batch size. theta will be converted into a float Tensor for the computation.
- Parameters
src (array_like) – image in spatial 2D or 3D (N, C, spatial_dims), where N is the batch dim, C is the number of channels.
theta (array_like) – Nx3x3, Nx2x3, 3x3, 2x3 for spatial 2D inputs, Nx4x4, Nx3x4, 3x4, 4x4 for spatial 3D inputs. When the batch dimension is omitted, theta will be repeated N times, N is the batch dim of src.
spatial_size (Union[Sequence[int], int, None]) – output spatial shape, the full output shape will be [N, C, *spatial_size] where N and C are inferred from the src.
- Raises
TypeError – When theta is not a torch.Tensor.
ValueError – When theta is not one of [Nxdxd, dxd].
ValueError – When theta is not one of [Nx3x3, Nx4x4].
TypeError – When src is not a torch.Tensor.
ValueError – When src spatially is not one of [2D, 3D].
ValueError – When affine and image batch dimension differ.
- Return type
Tensor
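The normalized coordinates discussed for this layer follow the PyTorch grid_sample convention, in which -1 and 1 mark the extremes of the input along each axis. The following is a minimal pure-Python sketch of that index-to-coordinate mapping. The helper name index_to_normalized is hypothetical and is not part of MONAI or PyTorch; it only illustrates how align_corners changes the meaning of -1 and 1.

```python
def index_to_normalized(index, size, align_corners=False):
    """Map a voxel index in [0, size - 1] to a normalized coordinate.

    Hypothetical helper for illustration only; it mirrors the convention
    documented for torch.nn.functional.grid_sample.
    """
    if align_corners:
        # -1 and 1 refer to the centers of the first and last voxels.
        if size == 1:
            # Degenerate axis: every coordinate maps to index 0, so pick 0.0.
            return 0.0
        return 2.0 * index / (size - 1) - 1.0
    # -1 and 1 refer to the outer edges of the first and last voxels.
    return (2.0 * index + 1.0) / size - 1.0
```

With align_corners=False the voxel centers sit strictly inside (-1, 1), which is why the same theta describes the same relative motion at any resolution.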
grid_pull¶
-
monai.networks.layers.
grid_pull
(input, grid, interpolation='linear', bound='zero', extrapolate=True)[source]¶ Sample an image with respect to a deformation field.
interpolation can be an int, a string or an InterpolationType. Possible values are:
- 0 or 'nearest' or InterpolationType.nearest
- 1 or 'linear' or InterpolationType.linear
- 2 or 'quadratic' or InterpolationType.quadratic
- 3 or 'cubic' or InterpolationType.cubic
- 4 or 'fourth' or InterpolationType.fourth
- 5 or 'fifth' or InterpolationType.fifth
- 6 or 'sixth' or InterpolationType.sixth
- 7 or 'seventh' or InterpolationType.seventh
A list of values can be provided, in the order [W, H, D], to specify dimension-specific interpolation orders.
bound can be an int, a string or a BoundType. Possible values are:
- 0 or 'replicate' or 'nearest' or BoundType.replicate
- 1 or 'dct1' or 'mirror' or BoundType.dct1
- 2 or 'dct2' or 'reflect' or BoundType.dct2
- 3 or 'dst1' or 'antimirror' or BoundType.dst1
- 4 or 'dst2' or 'antireflect' or BoundType.dst2
- 5 or 'dft' or 'wrap' or BoundType.dft
- 7 or 'zero' or BoundType.zero
A list of values can be provided, in the order [W, H, D], to specify dimension-specific boundary conditions. sliding is a specific condition that only applies to flow fields (with as many channels as dimensions). It cannot be dimension-specific. Note that:
dft corresponds to circular padding
dct2 corresponds to Neumann boundary conditions (symmetric)
dst2 corresponds to Dirichlet boundary conditions (antisymmetric)
See also
help(monai._C.BoundType)
help(monai._C.InterpolationType)
- Parameters
input (Tensor) – Input image. (B, C, Wi, Hi, Di).
grid (Tensor) – Deformation field. (B, Wo, Ho, Do, 1|2|3).
interpolation (int or list[int], optional) – Interpolation order. Defaults to ‘linear’.
bound (BoundType, or list[BoundType], optional) – Boundary conditions. Defaults to ‘zero’.
extrapolate (bool) – Extrapolate out-of-bound data. Defaults to True.
- Returns
Deformed image (B, C, Wo, Ho, Do).
- Return type
output (torch.Tensor)
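To make the boundary names above more concrete, the sketch below folds an out-of-range integer index back into the valid range for a few of the listed bounds. It assumes the usual extension conventions (circular for dft, half-sample symmetric for dct2, whole-sample symmetric for dct1); the helper apply_bound is hypothetical and is not the MONAI implementation, which lives in the compiled monai._C extension.

```python
def apply_bound(i, n, bound):
    """Fold an integer index i into [0, n - 1] for an axis of length n.

    Hypothetical helper; only some of the documented bounds are covered.
    """
    if bound == "replicate":
        # Clamp to the nearest valid index.
        return min(max(i, 0), n - 1)
    if bound == "dft":
        # Circular padding: indices wrap around.
        return i % n
    if bound == "dct2":
        # Symmetric about the outer edges: ... 1 0 | 0 1 2 | 2 1 ...
        period = 2 * n
        j = i % period
        return j if j < n else period - 1 - j
    if bound == "dct1":
        # Mirrored about the edge voxel centers: ... 2 1 | 0 1 2 | 1 0 ...
        if n == 1:
            return 0
        period = 2 * n - 2
        j = i % period
        return j if j < n else period - j
    raise ValueError(f"bound {bound!r} is not covered by this sketch")
```

For example, on an axis of length 3 the indices -3 to 5 map to 2, 1, 0, 0, 1, 2, 2, 1, 0 under dct2.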
grid_push¶
-
monai.networks.layers.
grid_push
(input, grid, shape=None, interpolation='linear', bound='zero', extrapolate=True)[source]¶ Splat an image with respect to a deformation field (pull adjoint).
interpolation can be an int, a string or an InterpolationType. Possible values are:
- 0 or 'nearest' or InterpolationType.nearest
- 1 or 'linear' or InterpolationType.linear
- 2 or 'quadratic' or InterpolationType.quadratic
- 3 or 'cubic' or InterpolationType.cubic
- 4 or 'fourth' or InterpolationType.fourth
- 5 or 'fifth' or InterpolationType.fifth
- 6 or 'sixth' or InterpolationType.sixth
- 7 or 'seventh' or InterpolationType.seventh
A list of values can be provided, in the order [W, H, D], to specify dimension-specific interpolation orders.
bound can be an int, a string or a BoundType. Possible values are:
- 0 or 'replicate' or 'nearest' or BoundType.replicate
- 1 or 'dct1' or 'mirror' or BoundType.dct1
- 2 or 'dct2' or 'reflect' or BoundType.dct2
- 3 or 'dst1' or 'antimirror' or BoundType.dst1
- 4 or 'dst2' or 'antireflect' or BoundType.dst2
- 5 or 'dft' or 'wrap' or BoundType.dft
- 7 or 'zero' or BoundType.zero
A list of values can be provided, in the order [W, H, D], to specify dimension-specific boundary conditions. sliding is a specific condition that only applies to flow fields (with as many channels as dimensions). It cannot be dimension-specific. Note that:
dft corresponds to circular padding
dct2 corresponds to Neumann boundary conditions (symmetric)
dst2 corresponds to Dirichlet boundary conditions (antisymmetric)
See also
help(monai._C.BoundType)
help(monai._C.InterpolationType)
- Parameters
input (Tensor) – Input image (B, C, Wi, Hi, Di).
grid (Tensor) – Deformation field (B, Wi, Hi, Di, 1|2|3).
shape – Shape of the source image.
interpolation (int or list[int], optional) – Interpolation order. Defaults to ‘linear’.
bound (BoundType, or list[BoundType], optional) – Boundary conditions. Defaults to ‘zero’.
extrapolate (bool) – Extrapolate out-of-bound data. Defaults to True.
- Returns
Splatted image (B, C, Wo, Ho, Do).
- Return type
output (torch.Tensor)
grid_count¶
-
monai.networks.layers.
grid_count
(grid, shape=None, interpolation='linear', bound='zero', extrapolate=True)[source]¶ Splatting weights with respect to a deformation field (pull adjoint).
This function is equivalent to applying grid_push to an image of ones.
interpolation can be an int, a string or an InterpolationType. Possible values are:
- 0 or 'nearest' or InterpolationType.nearest
- 1 or 'linear' or InterpolationType.linear
- 2 or 'quadratic' or InterpolationType.quadratic
- 3 or 'cubic' or InterpolationType.cubic
- 4 or 'fourth' or InterpolationType.fourth
- 5 or 'fifth' or InterpolationType.fifth
- 6 or 'sixth' or InterpolationType.sixth
- 7 or 'seventh' or InterpolationType.seventh
A list of values can be provided, in the order [W, H, D], to specify dimension-specific interpolation orders.
bound can be an int, a string or a BoundType. Possible values are:
- 0 or 'replicate' or 'nearest' or BoundType.replicate
- 1 or 'dct1' or 'mirror' or BoundType.dct1
- 2 or 'dct2' or 'reflect' or BoundType.dct2
- 3 or 'dst1' or 'antimirror' or BoundType.dst1
- 4 or 'dst2' or 'antireflect' or BoundType.dst2
- 5 or 'dft' or 'wrap' or BoundType.dft
- 7 or 'zero' or BoundType.zero
A list of values can be provided, in the order [W, H, D], to specify dimension-specific boundary conditions. sliding is a specific condition that only applies to flow fields (with as many channels as dimensions). It cannot be dimension-specific. Note that:
dft corresponds to circular padding
dct2 corresponds to Neumann boundary conditions (symmetric)
dst2 corresponds to Dirichlet boundary conditions (antisymmetric)
See also
help(monai._C.BoundType)
help(monai._C.InterpolationType)
- Parameters
grid (Tensor) – Deformation field (B, Wi, Hi, Di, 2|3).
shape – shape of the source image.
interpolation (int or list[int], optional) – Interpolation order. Defaults to ‘linear’.
bound (BoundType, or list[BoundType], optional) – Boundary conditions. Defaults to ‘zero’.
extrapolate (bool, optional) – Extrapolate out-of-bound data. Defaults to True.
- Returns
Splat weights (B, 1, Wo, Ho, Do).
- Return type
output (torch.Tensor)
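The statement that grid_count is equivalent to applying grid_push to an image of ones can be illustrated in one dimension. The sketch below is a hypothetical pure-Python analogue with linear weights that simply drops contributions landing outside the output; the names push_1d and count_1d are illustrative and do not exist in MONAI, and the real functions operate on batched tensors with the bound and extrapolate options documented above.

```python
import math


def push_1d(values, grid, out_size):
    """Splat values[i] at continuous location grid[i] using linear weights.

    Hypothetical 1D sketch; contributions outside [0, out_size - 1] are dropped.
    """
    out = [0.0] * out_size
    for value, x in zip(values, grid):
        x0 = math.floor(x)
        w1 = x - x0  # weight of the right neighbour
        for idx, weight in ((x0, 1.0 - w1), (x0 + 1, w1)):
            if 0 <= idx < out_size:
                out[idx] += weight * value
    return out


def count_1d(grid, out_size):
    """Splatting weights: the push of an image of ones."""
    return push_1d([1.0] * len(grid), grid, out_size)
```

Dividing a pushed image by the corresponding count is a common way to normalize splatted intensities, though whether that is appropriate depends on the application.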
grid_grad¶
-
monai.networks.layers.
grid_grad
(input, grid, interpolation='linear', bound='zero', extrapolate=True)[source]¶ Sample the spatial gradients of an image with respect to a deformation field.
interpolation can be an int, a string or an InterpolationType. Possible values are:
- 0 or 'nearest' or InterpolationType.nearest
- 1 or 'linear' or InterpolationType.linear
- 2 or 'quadratic' or InterpolationType.quadratic
- 3 or 'cubic' or InterpolationType.cubic
- 4 or 'fourth' or InterpolationType.fourth
- 5 or 'fifth' or InterpolationType.fifth
- 6 or 'sixth' or InterpolationType.sixth
- 7 or 'seventh' or InterpolationType.seventh
A list of values can be provided, in the order [W, H, D], to specify dimension-specific interpolation orders.
bound can be an int, a string or a BoundType. Possible values are:
- 0 or 'replicate' or 'nearest' or BoundType.replicate
- 1 or 'dct1' or 'mirror' or BoundType.dct1
- 2 or 'dct2' or 'reflect' or BoundType.dct2
- 3 or 'dst1' or 'antimirror' or BoundType.dst1
- 4 or 'dst2' or 'antireflect' or BoundType.dst2
- 5 or 'dft' or 'wrap' or BoundType.dft
- 7 or 'zero' or BoundType.zero
A list of values can be provided, in the order [W, H, D], to specify dimension-specific boundary conditions. sliding is a specific condition that only applies to flow fields (with as many channels as dimensions). It cannot be dimension-specific. Note that:
dft corresponds to circular padding
dct2 corresponds to Neumann boundary conditions (symmetric)
dst2 corresponds to Dirichlet boundary conditions (antisymmetric)
See also
help(monai._C.BoundType)
help(monai._C.InterpolationType)
- Parameters
input (Tensor) – Input image. (B, C, Wi, Hi, Di).
grid (Tensor) – Deformation field. (B, Wo, Ho, Do, 2|3).
interpolation (int or list[int], optional) – Interpolation order. Defaults to ‘linear’.
bound (BoundType, or list[BoundType], optional) – Boundary conditions. Defaults to ‘zero’.
extrapolate (bool) – Extrapolate out-of-bound data. Defaults to True.
- Returns
Sampled gradients (B, C, Wo, Ho, Do, 1|2|3).
- Return type
output (torch.Tensor)
LLTM¶
-
class
monai.networks.layers.
LLTM
(input_features, state_size)[source]¶ This recurrent unit is similar to an LSTM, but differs in that it lacks a forget gate and uses an Exponential Linear Unit (ELU) as its internal activation function. Because this unit never forgets, it is called LLTM, or Long-Long-Term-Memory unit. It has both C++ and CUDA implementations and switches between them automatically according to the device the module is placed on.
- Parameters
input_features (int) – size of input feature data
state_size (int) – size of the state of recurrent unit
Referring to: https://pytorch.org/tutorials/advanced/cpp_extension.html
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(input, state)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
Utilities¶
-
monai.networks.layers.convutils.
calculate_out_shape
(in_shape, kernel_size, stride, padding)[source]¶ Calculate the output tensor shape when applying a convolution to a tensor of shape in_shape with kernel size kernel_size, stride value stride, and input padding value padding. All arguments can be scalars or sequences of values; the return value is a scalar if all inputs are scalars.
- Return type
Union[Tuple[int, …], int]
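The calculation described above corresponds to the standard convolution output-size formula, out = (in - kernel + 2 * padding) // stride + 1, applied per dimension. The sketch below is a pure-Python illustration under that assumption; conv_out_shape is a hypothetical name, it ignores dilation, and it is not the library function itself.

```python
def conv_out_shape(in_shape, kernel_size, stride, padding):
    """Per-dimension convolution output size (hypothetical helper, no dilation)."""

    def one(i, k, s, p):
        return (i - k + 2 * p) // s + 1

    args = (in_shape, kernel_size, stride, padding)
    if all(isinstance(a, int) for a in args):
        # All scalars in, scalar out.
        return one(*args)
    # Broadcast scalars to the length of the longest sequence argument.
    ndim = max(len(a) for a in args if not isinstance(a, int))
    expanded = [(a,) * ndim if isinstance(a, int) else tuple(a) for a in args]
    return tuple(one(i, k, s, p) for i, k, s, p in zip(*expanded))
```

For instance, a 3x3 kernel with stride 2 and padding 1 maps an axis of length 32 to length 16.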
-
monai.networks.layers.convutils.
gaussian_1d
(sigma, truncated=4.0, approx='erf', normalize=False)[source]¶ One-dimensional Gaussian kernel.
- Parameters
sigma (Tensor) – std of the kernel
truncated (float) – tail length
approx (str) – discrete Gaussian kernel type, available options are “erf”, “sampled”, and “scalespace”.
erf approximation interpolates the error function;
sampled uses a sampled Gaussian kernel;
scalespace corresponds to https://en.wikipedia.org/wiki/Scale_space_implementation#The_discrete_Gaussian_kernel based on the modified Bessel functions.
normalize (bool) – whether to normalize the kernel with kernel.sum().
- Raises
ValueError – When truncated is non-positive.
- Return type
Tensor
- Returns
1D torch tensor
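As an illustration of the "sampled" option, the sketch below evaluates the Gaussian density at integer offsets and optionally normalizes by the sum. It is a pure-Python approximation of the idea, not the library code: the function name sampled_gaussian_1d is hypothetical, and the tail-length rule (truncated * sigma samples on each side, rounded, with at least one sample) is an assumption.

```python
import math


def sampled_gaussian_1d(sigma, truncated=4.0, normalize=False):
    """Sampled Gaussian kernel as a list of floats (hypothetical sketch)."""
    if truncated <= 0.0:
        raise ValueError("truncated must be positive")
    # Assumed tail rule: about truncated * sigma samples per side, at least one.
    tail = int(max(sigma * truncated, 0.5) + 0.5)
    scale = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    kernel = [scale * math.exp(-0.5 * (x / sigma) ** 2) for x in range(-tail, tail + 1)]
    if normalize:
        total = sum(kernel)
        kernel = [k / total for k in kernel]
    return kernel
```

Without normalization the sampled values sum to slightly less than one because the tails are cut off, which is the reason a normalize flag is useful.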
-
monai.networks.layers.convutils.
polyval
(coef, x)[source]¶ Evaluates the polynomial defined by coef at x.
For a 1D sequence of coef (length n), evaluate:
y = coef[n-1] + x * (coef[n-2] + ... + x * (coef[1] + x * coef[0]))
- Parameters
coef – a sequence of floats representing the coefficients of the polynomial
x – float or a sequence of floats representing the variable of the polynomial
- Return type
Tensor
- Returns
1D torch tensor
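The nested expression above is Horner's scheme with coef[0] as the highest-order coefficient. A minimal scalar sketch in pure Python follows; the name horner is hypothetical, and the library function works on tensors rather than Python floats.

```python
def horner(coef, x):
    """Evaluate a polynomial whose highest-order coefficient comes first."""
    result = 0.0
    for c in coef:
        # Multiply the running value by x, then add the next coefficient.
        result = result * x + c
    return result
```

With coef = [1.0, 2.0, 3.0] and x = 2.0 this evaluates 3 + 2 * (2 + 2 * 1), which is 11.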
-
monai.networks.layers.convutils.
same_padding
(kernel_size, dilation=1)[source]¶ Return the padding value needed to ensure a convolution using the given kernel size produces an output of the same shape as the input for a stride of 1, otherwise ensure a shape of the input divided by the stride rounded down.
- Raises
NotImplementedError – When np.any((kernel_size - 1) * dilation % 2 == 1).
- Return type
Union[Tuple[int, …], int]
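The Raises condition above suggests that the padding is half of (kernel_size - 1) * dilation, which is only an integer when that product is even. The following scalar sketch is written under that assumption; same_padding_1d is a hypothetical name and handles a single dimension only.

```python
def same_padding_1d(kernel_size, dilation=1):
    """Padding that preserves the length of one axis at stride 1 (hypothetical)."""
    span = (kernel_size - 1) * dilation
    if span % 2 == 1:
        # An odd span would need asymmetric padding, which is not supported here.
        raise NotImplementedError(
            f"same padding is not available for kernel_size={kernel_size}, dilation={dilation}"
        )
    return span // 2
```

A 3-wide kernel needs padding 1, and the same kernel with dilation 2 needs padding 2; an even kernel size with dilation 1 triggers the error.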
-
monai.networks.layers.utils.
get_act_layer
(name)[source]¶ Create an activation layer instance.
For example, to create activation layers:
from monai.networks.layers import get_act_layer
s_layer = get_act_layer(name="swish")
p_layer = get_act_layer(name=("prelu", {"num_parameters": 1, "init": 0.25}))
- Parameters
name (Union[Tuple, str]) – an activation type string or a tuple of type string and parameters.
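The name argument accepts either a bare type string or a (type, parameters) tuple. The sketch below shows one way such a specification could be normalized into a name and a keyword dictionary. It is a hypothetical stand-alone helper (split_name_and_args is not a MONAI function) and only approximates what a factory might do before constructing the layer.

```python
def split_name_and_args(spec):
    """Normalize a layer spec into (name, kwargs). Hypothetical helper."""
    if isinstance(spec, str):
        # A bare string means "use the default arguments".
        return spec, {}
    name, kwargs = spec
    if not isinstance(kwargs, dict):
        raise TypeError("layer arguments must be given as a dict")
    return name, dict(kwargs)
```

The same two-form convention is used by the dropout, normalization and pooling factories documented in this section.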
-
monai.networks.layers.utils.
get_dropout_layer
(name, dropout_dim=1)[source]¶ Create a dropout layer instance.
For example, to create dropout layers:
from monai.networks.layers import get_dropout_layer
d_layer = get_dropout_layer(name="dropout")
a_layer = get_dropout_layer(name=("alphadropout", {"p": 0.25}))
- Parameters
name (Union[Tuple, str, float, int]) – a dropout ratio or a tuple of dropout type and parameters.
dropout_dim (Optional[int]) – the spatial dimension of the dropout operation.
-
monai.networks.layers.utils.
get_norm_layer
(name, spatial_dims=1, channels=1)[source]¶ Create a normalization layer instance.
For example, to create normalization layers:
from monai.networks.layers import get_norm_layer
g_layer = get_norm_layer(name=("group", {"num_groups": 1}))
n_layer = get_norm_layer(name="instance", spatial_dims=2)
- Parameters
name (Union[Tuple, str]) – a normalization type string or a tuple of type string and parameters.
spatial_dims (Optional[int]) – number of spatial dimensions of the input.
channels (Optional[int]) – number of features/channels when the normalization layer requires this parameter but it is not specified in the norm parameters.
-
monai.networks.layers.utils.
get_pool_layer
(name, spatial_dims=1)[source]¶ Create a pooling layer instance.
For example, to create an adaptiveavg pooling layer:
from monai.networks.layers import get_pool_layer
pool_layer = get_pool_layer(("adaptiveavg", {"output_size": (1, 1, 1)}), spatial_dims=3)
- Parameters
name (Union[Tuple, str]) – a pooling type string or a tuple of type string and parameters.
spatial_dims (Optional[int]) – number of spatial dimensions of the input.
Nets¶
AHNet¶
-
class
monai.networks.nets.
AHNet
(layers=(3, 4, 6, 3), spatial_dims=3, in_channels=1, out_channels=1, psp_block_num=4, upsample_mode='transpose', pretrained=False, progress=True)[source]¶ AHNet based on Anisotropic Hybrid Network. Adapted from lsqshr’s official code. In addition to the 3D inputs supported by the original network, this implementation also supports 2D inputs. According to the tests for deconvolutions, using "transpose" rather than linear interpolations is faster. Therefore, this implementation sets "transpose" as the default upsampling method.
To meet the requirements of the structure, for transpose mode, the input size of the first dim-1 dimensions should be divisible by 2 ** (psp_block_num + 3) and no less than 32. For other modes, the input size of the first dim-1 dimensions should be divisible by 32 and no less than 2 ** (psp_block_num + 3). In addition, at least one dimension should have a size of no less than 64.
- Parameters
layers (tuple) – number of residual blocks for 4 layers of the network (layer1…layer4). Defaults to (3, 4, 6, 3).
spatial_dims (int) – spatial dimension of the input data. Defaults to 3.
in_channels (int) – number of input channels for the network. Defaults to 1.
out_channels (int) – number of output channels for the network. Defaults to 1.
psp_block_num (int) – the number of pyramid volumetric pooling modules used at the end of the network before the final output layer for extracting multiscale features. The number should be an integer that belongs to [0,4]. Defaults to 4.
upsample_mode (str) – ["transpose", "bilinear", "trilinear", "nearest"] The mode of upsampling manipulations. Using the last two modes cannot guarantee the model’s reproducibility. Defaults to transpose.
"transpose", uses transposed convolution layers.
"bilinear", uses bilinear interpolation.
"trilinear", uses trilinear interpolation.
"nearest", uses nearest interpolation.
pretrained (bool) – whether to load pretrained weights from ResNet50 to initialize convolution layers. Defaults to False.
progress (bool) – If True, displays a progress bar of the download of pretrained weights to stderr.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
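The input-size requirements stated for AHNet can be checked before building a batch. The sketch below encodes those rules in pure Python, assuming that "the first dim-1 dimensions" refers to all spatial dimensions except the last one. The function check_ahnet_input_size is hypothetical and is not provided by MONAI.

```python
def check_ahnet_input_size(spatial_shape, psp_block_num=4, upsample_mode="transpose"):
    """Return True if spatial_shape satisfies the documented AHNet size rules.

    Hypothetical helper; spatial_shape excludes the batch and channel dims.
    """
    factor = 2 ** (psp_block_num + 3)
    if upsample_mode == "transpose":
        divisor, minimum = factor, 32
    else:
        divisor, minimum = 32, factor
    # Assumption: the rule applies to every spatial dimension but the last.
    leading = spatial_shape[:-1]
    leading_ok = all(s % divisor == 0 and s >= minimum for s in leading)
    # At least one dimension must have a size of 64 or more.
    return leading_ok and any(s >= 64 for s in spatial_shape)
```

With the default psp_block_num=4 the factor is 128, so a (128, 128, 64) volume passes while a (96, 96, 64) volume does not.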
DenseNet¶
-
class
monai.networks.nets.
DenseNet
(spatial_dims, in_channels, out_channels, init_features=64, growth_rate=32, block_config=(6, 12, 24, 16), bn_size=4, dropout_prob=0.0)[source]¶ Densenet based on: Densely Connected Convolutional Networks. Adapted from PyTorch Hub 2D version: https://pytorch.org/vision/stable/models.html#id16.
- Parameters
spatial_dims (int) – number of spatial dimensions of the input image.
in_channels (int) – number of input channels.
out_channels (int) – number of output classes.
init_features (int) – number of filters in the first convolution layer.
growth_rate (int) – how many filters to add each layer (k in paper).
block_config (Sequence[int]) – how many layers in each pooling block.
bn_size (int) – multiplicative factor for number of bottleneck layers (i.e. bn_size * k features in the bottleneck layer).
dropout_prob (float) – dropout rate after each dense layer.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- Return type
Tensor
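To see how init_features, growth_rate and block_config interact, the sketch below tracks the channel count at the end of each dense block. It assumes, as in the original DenseNet design, that every layer adds growth_rate channels and that each transition between blocks halves the channel count. The helper densenet_feature_counts is hypothetical.

```python
def densenet_feature_counts(init_features=64, growth_rate=32, block_config=(6, 12, 24, 16)):
    """Channel count after each dense block (hypothetical bookkeeping helper)."""
    counts = []
    features = init_features
    for i, num_layers in enumerate(block_config):
        # Each dense layer concatenates growth_rate new feature maps.
        features += num_layers * growth_rate
        counts.append(features)
        if i != len(block_config) - 1:
            # Assumed: the transition layer halves the number of channels.
            features //= 2
    return counts
```

With the default arguments this gives 256, 512, 1024 and 1024 channels, so the classifier would see 1024 features.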
DenseNet121¶
-
class
monai.networks.nets.
DenseNet121
(init_features=64, growth_rate=32, block_config=(6, 12, 24, 16), pretrained=False, progress=True, **kwargs)[source]¶ DenseNet121 with optional pretrained support when spatial_dims is 2.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
DenseNet169¶
-
class
monai.networks.nets.
DenseNet169
(init_features=64, growth_rate=32, block_config=(6, 12, 32, 32), pretrained=False, progress=True, **kwargs)[source]¶ DenseNet169 with optional pretrained support when spatial_dims is 2.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
DenseNet201¶
-
class
monai.networks.nets.
DenseNet201
(init_features=64, growth_rate=32, block_config=(6, 12, 48, 32), pretrained=False, progress=True, **kwargs)[source]¶ DenseNet201 with optional pretrained support when spatial_dims is 2.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
DenseNet264¶
EfficientNet¶
-
class
monai.networks.nets.
EfficientNet
(blocks_args_str, spatial_dims=2, in_channels=3, num_classes=1000, width_coefficient=1.0, depth_coefficient=1.0, dropout_rate=0.2, image_size=224, batch_norm_momentum=0.99, batch_norm_epsilon=0.001, drop_connect_rate=0.2, depth_divisor=8)[source]¶ EfficientNet based on Rethinking Model Scaling for Convolutional Neural Networks. Adapted from EfficientNet-PyTorch.
- Parameters
blocks_args_str (List[str]) – block definitions.
spatial_dims (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
num_classes (int) – number of output classes.
width_coefficient (float) – width multiplier coefficient (w in paper).
depth_coefficient (float) – depth multiplier coefficient (d in paper).
dropout_rate (float) – dropout rate for dropout layers.
image_size (int) – input image resolution.
batch_norm_momentum (float) – momentum for batch norm.
batch_norm_epsilon (float) – epsilon for batch norm.
drop_connect_rate (float) – dropconnect rate for drop connection (individual weights) layers.
depth_divisor (int) – depth divisor for channel rounding.
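The interaction between width_coefficient and depth_divisor can be sketched as follows. This follows the channel-rounding rule commonly used in EfficientNet implementations: scale, round to the nearest multiple of the divisor, and never drop more than 10% below the scaled value. Treat it as an assumption about the behaviour rather than the MONAI source; round_filters here is a stand-alone illustrative function.

```python
def round_filters(filters, width_coefficient=1.0, depth_divisor=8):
    """Scale a channel count and round it to a multiple of depth_divisor."""
    if not width_coefficient:
        return filters
    scaled = filters * width_coefficient
    # Round to the nearest multiple of depth_divisor, but never below one divisor.
    rounded = max(depth_divisor, int(scaled + depth_divisor / 2) // depth_divisor * depth_divisor)
    if rounded < 0.9 * scaled:
        # Avoid shrinking the layer by more than 10%.
        rounded += depth_divisor
    return int(rounded)
```

For example, 24 channels with a width coefficient of 1.5 become 40, and 10 channels with a coefficient of 1.0 are bumped from 8 to 16 by the 10% rule.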
EfficientNetBN¶
-
class
monai.networks.nets.
EfficientNetBN
(model_name, pretrained=True, progress=True, spatial_dims=2, in_channels=3, num_classes=1000)[source]¶ Generic wrapper around EfficientNet, used to initialize EfficientNet-B0 to EfficientNet-B7 models. model_name is a mandatory argument, as there is no EfficientNetBN model itself; the N in [0, 1, 2, 3, 4, 5, 6, 7] must be specified to select a model.
- Parameters
model_name (str) – name of model to initialize, can be from [efficientnet-b0, …, efficientnet-b7].
pretrained (bool) – whether to initialize pretrained ImageNet weights, only available for spatial_dims=2.
progress (bool) – whether to show download progress for pretrained weights download.
spatial_dims (int) – number of spatial dimensions.
in_channels (int) – number of input channels.
num_classes (int) – number of output classes.
Examples:
# for pretrained spatial 2D ImageNet
>>> image_size = get_efficientnet_image_size("efficientnet-b0")
>>> inputs = torch.rand(1, 3, image_size, image_size)
>>> model = EfficientNetBN("efficientnet-b0", pretrained=True)
>>> model.eval()
>>> outputs = model(inputs)
# create spatial 2D
>>> model = EfficientNetBN("efficientnet-b0", spatial_dims=2)
# create spatial 3D
>>> model = EfficientNetBN("efficientnet-b0", spatial_dims=3)
# create EfficientNetB7 for spatial 2D
>>> model = EfficientNetBN("efficientnet-b7", spatial_dims=2)
SegResNet¶
-
class
monai.networks.nets.
SegResNet
(spatial_dims=3, init_filters=8, in_channels=1, out_channels=2, dropout_prob=None, act=('RELU', {'inplace': True}), norm=('GROUP', {'num_groups': 8}), norm_name='', num_groups=8, use_conv_final=True, blocks_down=(1, 2, 2, 4), blocks_up=(1, 1, 1), upsample_mode=<UpsampleMode.NONTRAINABLE: 'nontrainable'>)[source]¶ SegResNet based on 3D MRI brain tumor segmentation using autoencoder regularization. The module does not include the variational autoencoder (VAE). The model supports 2D or 3D inputs.
- Parameters
spatial_dims (int) – spatial dimension of the input data. Defaults to 3.
init_filters (int) – number of output channels for initial convolution layer. Defaults to 8.
in_channels (int) – number of input channels for the network. Defaults to 1.
out_channels (int) – number of output channels for the network. Defaults to 2.
dropout_prob (Optional[float]) – probability of an element to be zeroed. Defaults to None.
act (Union[Tuple, str]) – activation type and arguments. Defaults to RELU.
norm (Union[Tuple, str]) – feature normalization type and arguments. Defaults to GROUP.
norm_name (str) – deprecating option for feature normalization type.
num_groups (int) – deprecating option for group norm parameters.
use_conv_final (bool) – whether to add a final convolution block to the output. Defaults to True.
blocks_down (tuple) – number of down sample blocks in each layer. Defaults to [1,2,2,4].
blocks_up (tuple) – number of up sample blocks in each layer. Defaults to [1,1,1].
upsample_mode (Union[UpsampleMode, str]) – ["deconv", "nontrainable", "pixelshuffle"] The mode of upsampling manipulations. Using the nontrainable modes cannot guarantee the model’s reproducibility. Defaults to nontrainable.
deconv, uses transposed convolution layers.
nontrainable, uses non-trainable linear interpolation.
pixelshuffle, uses monai.networks.blocks.SubpixelUpsample.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
SegResNetVAE¶
-
class
monai.networks.nets.
SegResNetVAE
(input_image_size, vae_estimate_std=False, vae_default_std=0.3, vae_nz=256, spatial_dims=3, init_filters=8, in_channels=1, out_channels=2, dropout_prob=None, act=('RELU', {'inplace': True}), norm=('GROUP', {'num_groups': 8}), use_conv_final=True, blocks_down=(1, 2, 2, 4), blocks_up=(1, 1, 1), upsample_mode=<UpsampleMode.NONTRAINABLE: 'nontrainable'>)[source]¶ SegResNetVAE based on 3D MRI brain tumor segmentation using autoencoder regularization. The module contains the variational autoencoder (VAE). The model supports 2D or 3D inputs.
- Parameters
input_image_size (Sequence[int]) – the size of images to input into the network. It is used to determine the in_features of the fc layer in VAE.
vae_estimate_std (bool) – whether to estimate the standard deviations in VAE. Defaults to False.
vae_default_std (float) – the default std value to use when the std is not estimated. Defaults to 0.3.
vae_nz (int) – number of latent variables in VAE. Defaults to 256, of which 128 represent the mean and 128 represent the std.
spatial_dims (int) – spatial dimension of the input data. Defaults to 3.
init_filters (int) – number of output channels for initial convolution layer. Defaults to 8.
in_channels (int) – number of input channels for the network. Defaults to 1.
out_channels (int) – number of output channels for the network. Defaults to 2.
dropout_prob (Optional[float]) – probability of an element to be zeroed. Defaults to None.
act (Union[str, tuple]) – activation type and arguments. Defaults to RELU.
norm (Union[Tuple, str]) – feature normalization type and arguments. Defaults to GROUP.
use_conv_final (bool) – whether to add a final convolution block to the output. Defaults to True.
blocks_down (tuple) – number of down sample blocks in each layer. Defaults to [1,2,2,4].
blocks_up (tuple) – number of up sample blocks in each layer. Defaults to [1,1,1].
upsample_mode (Union[UpsampleMode, str]) – ["deconv", "nontrainable", "pixelshuffle"] The mode of upsampling manipulations. Using the nontrainable modes cannot guarantee the model’s reproducibility. Defaults to nontrainable.
deconv, uses transposed convolution layers.
nontrainable, uses non-trainable linear interpolation.
pixelshuffle, uses monai.networks.blocks.SubpixelUpsample.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
SENet¶
-
class
monai.networks.nets.
SENet
(spatial_dims, in_channels, block, layers, groups, reduction, dropout_prob=0.2, dropout_dim=1, inplanes=128, downsample_kernel_size=3, input_3x3=True, num_classes=1000)[source]¶ SENet based on Squeeze-and-Excitation Networks. Adapted from Cadene Hub 2D version.
- Parameters
spatial_dims (int) – spatial dimension of the input data.
in_channels (int) – channel number of the input data.
block (Type[Union[SEBottleneck, SEResNetBottleneck, SEResNeXtBottleneck]]) – SEBlock class.
- For SENet154: SEBottleneck
- For SE-ResNet models: SEResNetBottleneck
- For SE-ResNeXt models: SEResNeXtBottleneck
layers (Sequence[int]) – number of residual blocks for 4 layers of the network (layer1…layer4).
groups (int) – number of groups for the 3x3 convolution in each bottleneck block.
- For SENet154: 64
- For SE-ResNet models: 1
- For SE-ResNeXt models: 32
reduction (int) – reduction ratio for Squeeze-and-Excitation modules.
- For all models: 16
dropout_prob (Optional[float]) – drop probability for the Dropout layer. If None, the Dropout layer is not used.
- For SENet154: 0.2
- For SE-ResNet models: None
- For SE-ResNeXt models: None
dropout_dim (int) – determine the dimensions of dropout. Defaults to 1.
When dropout_dim = 1, randomly zeroes some of the elements for each channel.
When dropout_dim = 2, randomly zeroes out entire channels (a channel is a 2D feature map).
When dropout_dim = 3, randomly zeroes out entire channels (a channel is a 3D feature map).
inplanes (int) – number of input channels for layer1.
- For SENet154: 128
- For SE-ResNet models: 64
- For SE-ResNeXt models: 64
downsample_kernel_size (int) – kernel size for downsampling convolutions in layer2, layer3 and layer4.
- For SENet154: 3
- For SE-ResNet models: 1
- For SE-ResNeXt models: 1
input_3x3 (bool) – If True, use three 3x3 convolutions instead of a single 7x7 convolution in layer0.
- For SENet154: True
- For SE-ResNet models: False
- For SE-ResNeXt models: False
num_classes (int) – number of outputs in last_linear layer.
- For all models: 1000
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- Return type
Tensor
SENet154¶
-
class
monai.networks.nets.
SENet154
(layers=(3, 8, 36, 3), groups=64, reduction=16, pretrained=False, progress=True, **kwargs)[source]¶ SENet154 based on Squeeze-and-Excitation Networks with optional pretrained support when spatial_dims is 2.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
SEResNet50¶
-
class
monai.networks.nets.
SEResNet50
(layers=(3, 4, 6, 3), groups=1, reduction=16, dropout_prob=None, inplanes=64, downsample_kernel_size=1, input_3x3=False, pretrained=False, progress=True, **kwargs)[source]¶ SEResNet50 based on Squeeze-and-Excitation Networks with optional pretrained support when spatial_dims is 2.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
SEResNet101¶
-
class
monai.networks.nets.
SEResNet101
(layers=(3, 4, 23, 3), groups=1, reduction=16, inplanes=64, downsample_kernel_size=1, input_3x3=False, pretrained=False, progress=True, **kwargs)[source]¶ SEResNet101 based on Squeeze-and-Excitation Networks with optional pretrained support when spatial_dims is 2.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
SEResNet152¶
-
class
monai.networks.nets.
SEResNet152
(layers=(3, 8, 36, 3), groups=1, reduction=16, inplanes=64, downsample_kernel_size=1, input_3x3=False, pretrained=False, progress=True, **kwargs)[source]¶ SEResNet152 based on Squeeze-and-Excitation Networks with optional pretrained support when spatial_dims is 2.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
SEResNext50¶
-
class
monai.networks.nets.
SEResNext50
(layers=(3, 4, 6, 3), groups=32, reduction=16, dropout_prob=None, inplanes=64, downsample_kernel_size=1, input_3x3=False, pretrained=False, progress=True, **kwargs)[source]¶ SEResNext50 based on Squeeze-and-Excitation Networks with optional pretrained support when spatial_dims is 2.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
SEResNext101¶
-
class
monai.networks.nets.
SEResNext101
(layers=(3, 4, 23, 3), groups=32, reduction=16, dropout_prob=None, inplanes=64, downsample_kernel_size=1, input_3x3=False, pretrained=False, progress=True, **kwargs)[source]¶ SEResNext101 based on Squeeze-and-Excitation Networks with optional pretrained support when spatial_dims is 2.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
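The `layers` tuples in the SE-ResNet signatures above map onto the familiar 50/101/152 depth names. The sketch below is only illustrative arithmetic: `nominal_depth` is a hypothetical helper, not part of MONAI, and it assumes each stage is built from bottleneck blocks with three convolutions, plus one stem convolution (`input_3x3=False`) and one final fully connected layer. The small layers inside the squeeze-and-excitation modules are not counted, following the usual naming convention.

```python
# Illustrative only: nominal_depth is a hypothetical helper, not a MONAI function.
def nominal_depth(layers, convs_per_block=3):
    """Count the stem conv, every conv in the bottleneck blocks, and the final FC layer."""
    stem_conv = 1
    final_fc = 1
    return stem_conv + convs_per_block * sum(layers) + final_fc


configs = {
    "SEResNet50 / SEResNext50": (3, 4, 6, 3),
    "SEResNet101 / SEResNext101": (3, 4, 23, 3),
    "SEResNet152": (3, 8, 36, 3),
}
for name, layers in configs.items():
    print(name, nominal_depth(layers))
```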
HighResNet¶
-
class
monai.networks.nets.
HighResNet
(spatial_dims=3, in_channels=1, out_channels=1, norm_type=('batch', {'affine': True}), acti_type=('relu', {'inplace': True}), dropout_prob=0.0, layer_params=({'name': 'conv_0', 'n_features': 16, 'kernel_size': 3}, {'name': 'res_1', 'n_features': 16, 'kernels': (3, 3), 'repeat': 3}, {'name': 'res_2', 'n_features': 32, 'kernels': (3, 3), 'repeat': 3}, {'name': 'res_3', 'n_features': 64, 'kernels': (3, 3), 'repeat': 3}, {'name': 'conv_1', 'n_features': 80, 'kernel_size': 1}, {'name': 'conv_2', 'kernel_size': 1}), channel_matching=<ChannelMatching.PAD: 'pad'>)[source]¶ Reimplementation of highres3dnet based on Li et al., “On the compactness, efficiency, and representation of 3D convolutional networks: Brain parcellation as a pretext task”, IPMI ‘17
Adapted from: https://github.com/NifTK/NiftyNet/blob/v0.6.0/niftynet/network/highres3dnet.py https://github.com/fepegar/highresnet
- Parameters
spatial_dims (
int
) – number of spatial dimensions of the input image.in_channels (
int
) – number of input channels.out_channels (
int
) – number of output channels.norm_type (
Union
[str
,tuple
]) – feature normalization type and arguments. Defaults to("batch", {"affine": True})
.acti_type (
Union
[str
,tuple
]) – activation type and arguments. Defaults to("relu", {"inplace": True})
.dropout_prob (
Union
[Tuple
,str
,float
,None
]) – probability of the feature map to be zeroed (only applies to the penultimate conv layer).layer_params (
Sequence
[Dict
]) – specifying key parameters of each layer/block.channel_matching (
Union
[ChannelMatching
,str
]) –{
"pad"
,"project"
} Specifies handling residual branch and conv branch channel mismatches. Defaults to"pad"
."pad"
: with zero padding."project"
: with a trainable conv with kernel size one.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.- Return type
Tensor
-
class
monai.networks.nets.
HighResBlock
(spatial_dims, in_channels, out_channels, kernels=(3, 3), dilation=1, norm_type=('batch', {'affine': True}), acti_type=('relu', {'inplace': True}), channel_matching=<ChannelMatching.PAD: 'pad'>)[source]¶ - Parameters
spatial_dims (
int
) – number of spatial dimensions of the input image.in_channels (
int
) – number of input channels.out_channels (
int
) – number of output channels.kernels (
Sequence
[int
]) – each integer k in kernels corresponds to a convolution layer with kernel size k.dilation (
Union
[Sequence
[int
],int
]) – spacing between kernel elements.norm_type (
Union
[Tuple
,str
]) – feature normalization type and arguments. Defaults to("batch", {"affine": True})
.acti_type (
Union
[Tuple
,str
]) – {"relu"
,"prelu"
,"relu6"
} Non-linear activation using ReLU or PReLU. Defaults to"relu"
.channel_matching (
Union
[ChannelMatching
,str
]) –{
"pad"
,"project"
} Specifies handling residual branch and conv branch channel mismatches. Defaults to"pad"
."pad"
: with zero padding."project"
: with a trainable conv with kernel size one.
- Raises
ValueError – When
channel_matching=pad
andin_channels > out_channels
. Incompatible values.
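The `"pad"` restriction above can be illustrated with a small calculation. This is a sketch under an assumption: that the missing channels are added as zeros split as evenly as possible before and after the existing ones. `channel_pad_sizes` is a hypothetical helper written for this example, not a MONAI function.

```python
# Hypothetical helper illustrating why "pad" needs in_channels <= out_channels.
def channel_pad_sizes(in_channels, out_channels):
    """Return (pad_before, pad_after) zero channels needed to reach out_channels."""
    if in_channels > out_channels:
        # Zero padding can only add channels, never remove them.
        raise ValueError("channel_matching='pad' requires in_channels <= out_channels")
    missing = out_channels - in_channels
    pad_before = missing // 2
    pad_after = missing - pad_before
    return pad_before, pad_after


print(channel_pad_sizes(16, 32))
print(channel_pad_sizes(16, 21))
```

When `in_channels > out_channels`, `"project"` is the option that can reduce the channel count, since it uses a trainable convolution instead of padding.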
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.- Return type
Tensor
DynUNet¶
-
class
monai.networks.nets.
DynUNet
(spatial_dims, in_channels, out_channels, kernel_size, strides, upsample_kernel_size, norm_name=('INSTANCE', {'affine': True}), deep_supervision=False, deep_supr_num=1, res_block=False)[source]¶ This reimplementation of a dynamic UNet (DynUNet) is based on: Automated Design of Deep Learning Methods for Biomedical Image Segmentation. nnU-Net: Self-adapting Framework for U-Net-Based Medical Image Segmentation.
This model is more flexible compared with
monai.networks.nets.UNet
in three places:Residual connection is supported in conv blocks.
Anisotropic kernel sizes and strides can be used in each layer.
Deep supervision heads can be added.
The model supports 2D or 3D inputs and consists of four kinds of blocks: one input block, n downsample blocks, one bottleneck and n+1 upsample blocks, where n>0. The first and last kernel and stride values of the input sequences are used for the input block and the bottleneck respectively, and the remaining value(s) are used for the downsample and upsample blocks. Therefore, please ensure that the length of the input sequences (
kernel_size
and strides
) is no less than 3, so that there is at least one downsample block and its matching upsample blocks.
- Parameters
spatial_dims (
int
) – number of spatial dimensions.in_channels (
int
) – number of input channels.out_channels (
int
) – number of output channels.kernel_size (
Sequence
[Union
[Sequence
[int
],int
]]) – convolution kernel size.strides (
Sequence
[Union
[Sequence
[int
],int
]]) – convolution strides for each blocks.upsample_kernel_size (
Sequence
[Union
[Sequence
[int
],int
]]) – convolution kernel size for transposed convolution layers.norm_name (
Union
[Tuple
,str
]) – feature normalization type and arguments. Defaults toINSTANCE
.deep_supervision (
bool
) – whether to add deep supervision head before output. Defaults toFalse
. IfTrue
, in training mode, the forward function will output not only the last feature map, but also the previous feature maps that come from the intermediate upsample layers. To unify the return type (a restriction of TorchScript), all intermediate feature maps are interpolated to the same size as the last feature map and stacked together (with a new dimension right after the batch dimension) into one single tensor. For instance, if there are three feature maps with shapes (1, 2, 32, 24), (1, 2, 16, 12) and (1, 2, 8, 6), the last two will be interpolated to (1, 2, 32, 24), and the stacked tensor will have the shape (1, 3, 2, 32, 24). When calculating the loss, you can use torch.unbind to get all feature maps and compute the loss for each one against the ground truth, then take a weighted average of all losses to obtain the final loss. (To be added: a corresponding tutorial link)deep_supr_num (
int
) – number of feature maps output by the deep supervision heads. The value should be larger than 0 and less than the number of upsample layers. Defaults to 1.res_block (
bool
) – whether to use residual-connection-based convolution blocks in the network. Defaults to False
.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
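The rule for how `kernel_size` and `strides` are distributed over the blocks can be sketched in plain Python. `describe_dynunet_blocks` is a hypothetical helper written for this illustration; it only restates the description above (first entry for the input block, last entry for the bottleneck, the rest for downsample blocks, and n+1 upsample blocks) and does not construct any network.

```python
# Hypothetical helper: restates the block layout described above, builds nothing.
def describe_dynunet_blocks(kernel_size, strides):
    if len(kernel_size) != len(strides):
        raise ValueError("kernel_size and strides must have the same length")
    if len(kernel_size) < 3:
        raise ValueError("need at least 3 entries: input block, one downsample block, bottleneck")
    n = len(kernel_size) - 2  # number of downsample blocks
    return {
        "input_block": (kernel_size[0], strides[0]),
        "downsample_blocks": list(zip(kernel_size[1:-1], strides[1:-1])),
        "bottleneck": (kernel_size[-1], strides[-1]),
        "num_upsample_blocks": n + 1,
    }


layout = describe_dynunet_blocks(kernel_size=[3, 3, 3, 3], strides=[1, 2, 2, 2])
for key, value in layout.items():
    print(key, value)
```

With four entries there are two downsample blocks and three upsample blocks; anisotropic settings work the same way if each entry is itself a sequence such as `(3, 3, 1)`.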
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
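For the deep supervision output described in the parameters, the shape bookkeeping and the final weighted loss can be sketched without any tensor library. Both helpers are hypothetical and written for this example; the weights `(1.0, 0.5, 0.25)` are an assumed choice, not a value prescribed by this documentation.

```python
# Hypothetical helpers: pure-Python bookkeeping, no tensors involved.
def stacked_supervision_shape(feature_shapes):
    """Shape after resizing every map to the first (largest) one and stacking on axis 1."""
    batch, channels, *spatial = feature_shapes[0]
    return (batch, len(feature_shapes), channels, *spatial)


def weighted_average(losses, weights):
    """Combine per-feature-map losses into one scalar."""
    if len(losses) != len(weights):
        raise ValueError("one weight per loss is required")
    return sum(l * w for l, w in zip(losses, weights)) / sum(weights)


shapes = [(1, 2, 32, 24), (1, 2, 16, 12), (1, 2, 8, 6)]
print(stacked_supervision_shape(shapes))
print(weighted_average([1.0, 2.0, 4.0], [1.0, 0.5, 0.25]))
```

In a real training loop the per-map losses would come from applying the loss function to each slice returned by `torch.unbind` along axis 1.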
-
monai.networks.nets.
DynUnet
¶ alias of
monai.networks.nets.dynunet.DynUNet
-
monai.networks.nets.
Dynunet
¶ alias of
monai.networks.nets.dynunet.DynUNet
UNet¶
-
class
monai.networks.nets.
UNet
(dimensions, in_channels, out_channels, channels, strides, kernel_size=3, up_kernel_size=3, num_res_units=0, act='PRELU', norm='INSTANCE', dropout=0.0)[source]¶ Enhanced version of UNet which has residual units implemented with the ResidualUnit class. The residual part uses a convolution to change the input dimensions to match the output dimensions if this is necessary but will use nn.Identity if not. Refer to: https://link.springer.com/chapter/10.1007/978-3-030-12029-0_40.
- Parameters
dimensions (
int
) – number of spatial dimensions.in_channels (
int
) – number of input channels.out_channels (
int
) – number of output channels.channels (
Sequence
[int
]) – sequence of channels. Top block first.strides (
Sequence
[int
]) – convolution stride.kernel_size (
Union
[Sequence
[int
],int
]) – convolution kernel size. Defaults to 3.up_kernel_size (
Union
[Sequence
[int
],int
]) – upsampling convolution kernel size. Defaults to 3.num_res_units (
int
) – number of residual units. Defaults to 0.act – activation type and arguments. Defaults to PReLU.
norm – feature normalization type and arguments. Defaults to instance norm.
dropout – dropout ratio. Defaults to no dropout.
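When choosing `strides`, it helps to check how the spatial size shrinks level by level. The sketch below is a rough aid, assuming padded convolutions so that a stride of `s` maps a length `d` to `ceil(d / s)`; both functions are hypothetical helpers, not MONAI APIs.

```python
import math


# Hypothetical helpers for reasoning about input sizes; they build no network.
def unet_level_sizes(input_size, strides):
    """Spatial size at each encoder level, assuming stride s maps d to ceil(d / s)."""
    sizes = [tuple(input_size)]
    for s in strides:
        sizes.append(tuple(math.ceil(d / s) for d in sizes[-1]))
    return sizes


def is_cleanly_divisible(input_size, strides):
    """True if every edge length is a multiple of the total downsampling factor."""
    factor = math.prod(strides)
    return all(d % factor == 0 for d in input_size)


print(unet_level_sizes((96, 96), (2, 2, 2, 2)))
print(is_cleanly_divisible((96, 96), (2, 2, 2, 2)))
print(is_cleanly_divisible((100, 100), (2, 2, 2, 2)))
```

Inputs that are not cleanly divisible tend to produce encoder and decoder feature maps whose sizes no longer match at the skip connections, so padding or cropping the input beforehand is a common workaround.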
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.- Return type
Tensor
-
monai.networks.nets.
Unet
¶ alias of
monai.networks.nets.unet.UNet
-
monai.networks.nets.
unet
¶ alias of
monai.networks.nets.unet.UNet
UNETR¶
-
class
monai.networks.nets.
UNETR
(in_channels, out_channels, img_size, feature_size, hidden_size, mlp_dim, num_heads, pos_embed, norm_name, conv_block=False, res_block=False, dropout_rate=0.0)[source]¶ UNETR based on: “Hatamizadeh et al., UNETR: Transformers for 3D Medical Image Segmentation <https://arxiv.org/abs/2103.10504>”
- Parameters
in_channels (
int
) – dimension of input channels.out_channels (
int
) – dimension of output channels.img_size (
Tuple
[int
,int
,int
]) – dimension of input image.feature_size (
int
) – dimension of network feature size.hidden_size (
int
) – dimension of hidden layer.mlp_dim (
int
) – dimension of feedforward layer.num_heads (
int
) – number of attention heads.pos_embed (
str
) – position embedding layer type.norm_name (
Union
[Tuple
,str
]) – feature normalization type and arguments.conv_block (
bool
) – bool argument to determine if convolutional block is used.res_block (
bool
) – bool argument to determine if residual block is used.dropout_rate (
float
) – fraction of the input units to drop.
-
forward
(x_in)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
BasicUNet¶
-
class
monai.networks.nets.
BasicUNet
(dimensions=3, in_channels=1, out_channels=2, features=(32, 32, 64, 128, 256, 32), act=('LeakyReLU', {'negative_slope': 0.1, 'inplace': True}), norm=('instance', {'affine': True}), dropout=0.0, upsample='deconv')[source]¶ A UNet implementation with 1D/2D/3D supports.
Based on:
Falk et al. “U-Net – Deep Learning for Cell Counting, Detection, and Morphometry”. Nature Methods 16, 67–70 (2019), DOI: http://dx.doi.org/10.1038/s41592-018-0261-2
- Parameters
dimensions (
int
) – number of spatial dimensions. Defaults to 3 for spatial 3D inputs.in_channels (
int
) – number of input channels. Defaults to 1.out_channels (
int
) – number of output channels. Defaults to 2.features (
Sequence
[int
]) –six integers as numbers of features. Defaults to
(32, 32, 64, 128, 256, 32)
,the first five values correspond to the five-level encoder feature sizes.
the last value corresponds to the feature size after the last upsampling.
act (
Union
[str
,tuple
]) – activation type and arguments. Defaults to LeakyReLU.norm (
Union
[str
,tuple
]) – feature normalization type and arguments. Defaults to instance norm.dropout (
Union
[float
,tuple
]) – dropout ratio. Defaults to no dropout.upsample (
str
) – upsampling mode, available options are"deconv"
,"pixelshuffle"
,"nontrainable"
.
Examples:
# for spatial 2D >>> net = BasicUNet(dimensions=2, features=(64, 128, 256, 512, 1024, 128)) # for spatial 2D, with group norm >>> net = BasicUNet(dimensions=2, features=(64, 128, 256, 512, 1024, 128), norm=("group", {"num_groups": 4})) # for spatial 3D >>> net = BasicUNet(dimensions=3, features=(32, 32, 64, 128, 256, 32))
-
forward
(x)[source]¶ - Parameters
x (
Tensor
) – input should have spatially N dimensions(Batch, in_channels, dim_0[, dim_1, ..., dim_N])
, N is defined by dimensions. It is recommended to havedim_n % 16 == 0
to ensure all maxpooling inputs have even edge lengths.- Returns
A torch Tensor of “raw” predictions in shape
(Batch, out_channels, dim_0[, dim_1, ..., dim_N])
.
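The `dim_n % 16 == 0` recommendation follows from the five-level encoder: there are four max-pooling steps, and each one should receive an even edge length. A minimal sketch, using hypothetical helper names, that checks a size and rounds it up to the next valid one:

```python
# Hypothetical helpers; assumes four 2x max-pooling steps (five encoder levels).
def pooling_inputs(dim, num_poolings=4):
    """Edge lengths seen by each max-pooling layer (floor division at each step)."""
    sizes = []
    for _ in range(num_poolings):
        sizes.append(dim)
        dim //= 2
    return sizes


def all_pool_inputs_even(dim):
    return all(size % 2 == 0 for size in pooling_inputs(dim))


def next_multiple_of_16(dim):
    """Smallest multiple of 16 that is >= dim, e.g. a target size for padding."""
    return ((dim + 15) // 16) * 16


print(pooling_inputs(96), all_pool_inputs_even(96))
print(pooling_inputs(100), all_pool_inputs_even(100))
print(next_multiple_of_16(100))
```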
-
monai.networks.nets.
BasicUnet
¶ alias of
monai.networks.nets.basic_unet.BasicUNet
-
monai.networks.nets.
Basicunet
¶ alias of
monai.networks.nets.basic_unet.BasicUNet
VNet¶
-
class
monai.networks.nets.
VNet
(spatial_dims=3, in_channels=1, out_channels=1, act=('elu', {'inplace': True}), dropout_prob=0.5, dropout_dim=3)[source]¶ V-Net based on Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. Adapted from the official Caffe implementation and another PyTorch implementation. The model supports 2D or 3D inputs.
- Parameters
spatial_dims (
int
) – spatial dimension of the input data. Defaults to 3.in_channels (
int
) – number of input channels for the network. Defaults to 1. The value should meet the condition that16 % in_channels == 0
.out_channels (
int
) – number of output channels for the network. Defaults to 1.act (
Union
[Tuple
[str
,Dict
],str
]) – activation type in the network. Defaults to("elu", {"inplace": True})
.dropout_prob (
float
) – dropout ratio. Defaults to 0.5.dropout_dim (
int
) –determine the dimensions of dropout. Defaults to 3.
dropout_dim = 1
, randomly zeroes some of the elements for each channel.dropout_dim = 2
, Randomly zeroes out entire channels (a channel is a 2D feature map).dropout_dim = 3
, Randomly zeroes out entire channels (a channel is a 3D feature map).
Initializes internal Module state, shared by both nn.Module and ScriptModule.
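The condition `16 % in_channels == 0` restricts `in_channels` to the divisors of 16. The sketch below assumes the reason is that the first stage works with 16 feature channels and the input is repeated along the channel axis to match; that interpretation and the helper name `input_repeat_factor` are assumptions made for this example.

```python
# Hypothetical helper; the "repeat to 16 channels" reading is an assumption.
def input_repeat_factor(in_channels, first_stage_channels=16):
    """How many times the input would be repeated to reach first_stage_channels."""
    if in_channels <= 0 or first_stage_channels % in_channels != 0:
        raise ValueError(f"16 should be divisible by in_channels, got {in_channels}")
    return first_stage_channels // in_channels


valid = [c for c in range(1, 17) if 16 % c == 0]
print(valid)
print(input_repeat_factor(4))
```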
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
RegUNet¶
-
class
monai.networks.nets.
RegUNet
(spatial_dims, in_channels, num_channel_initial, depth, out_kernel_initializer='kaiming_uniform', out_activation=None, out_channels=3, extract_levels=None, pooling=True, concat_skip=False, encode_kernel_sizes=3)[source]¶ Class that implements an adapted UNet. This class also serves as the parent class of LocalNet and GlobalNet.
- Reference:
O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,”, Lecture Notes in Computer Science, 2015, vol. 9351, pp. 234–241. https://arxiv.org/abs/1505.04597
- Adapted from:
DeepReg (https://github.com/DeepRegNet/DeepReg)
- Parameters
spatial_dims (
int
) – number of spatial dimsin_channels (
int
) – number of input channelsnum_channel_initial (
int
) – number of initial channelsdepth (
int
) – input is at level 0, bottom is at level depth.out_kernel_initializer (
Optional
[str
]) – kernel initializer for the last layerout_activation (
Optional
[str
]) – activation at the last layerout_channels (
int
) – number of channels for the outputextract_levels (
Optional
[Tuple
[int
]]) – list, which levels from net to extract. The maximum level must equal todepth
pooling (
bool
) – for down-sampling, use non-parameterized pooling if true, otherwise use conv3dconcat_skip (
bool
) – when up-sampling, concatenate skipped tensor if true, otherwise use additionencode_kernel_sizes (
Union
[int
,List
[int
]]) – kernel size for down-sampling
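The relationship between `num_channel_initial`, `depth` and `extract_levels` can be sketched as follows. The doubling of channels at each level is an assumption based on common U-Net practice, not something stated above, and both helpers are hypothetical; only the rule that the maximum extract level must equal `depth` comes from the parameter description.

```python
# Hypothetical helpers; channel doubling per level is an assumed convention.
def level_channels(num_channel_initial, depth):
    """Channels at levels 0..depth, where level 0 is the input resolution."""
    return [num_channel_initial * 2 ** d for d in range(depth + 1)]


def check_extract_levels(extract_levels, depth):
    """Enforce the documented rule: the deepest extracted level must be the bottom."""
    if max(extract_levels) != depth:
        raise ValueError(f"max(extract_levels) must equal depth={depth}")
    return tuple(sorted(extract_levels))


print(level_channels(16, 3))
print(check_extract_levels((0, 1, 2, 3), depth=3))
```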
GlobalNet¶
-
class
monai.networks.nets.
GlobalNet
(image_size, spatial_dims, in_channels, num_channel_initial, depth, out_kernel_initializer='kaiming_uniform', out_activation=None, pooling=True, concat_skip=False, encode_kernel_sizes=3)[source]¶ Build GlobalNet for image registration.
- Reference:
Hu, Yipeng, et al. “Label-driven weakly-supervised learning for multimodal deformable image registration,” https://arxiv.org/abs/1711.01666
- Parameters
spatial_dims – number of spatial dims
in_channels – number of input channels
num_channel_initial – number of initial channels
depth – input is at level 0, bottom is at level depth.
out_kernel_initializer – kernel initializer for the last layer
out_activation – activation at the last layer
out_channels – number of channels for the output
extract_levels – list, which levels from net to extract. The maximum level must equal to depth
pooling – for down-sampling, use non-parameterized pooling if true, otherwise use conv3d
concat_skip – when up-sampling, concatenate skipped tensor if true, otherwise use addition
encode_kernel_sizes – kernel size for down-sampling
LocalNet¶
-
class
monai.networks.nets.
LocalNet
(spatial_dims, in_channels, num_channel_initial, extract_levels, out_kernel_initializer='kaiming_uniform', out_activation=None, out_channels=3, pooling=True, concat_skip=False)[source]¶ Reimplementation of LocalNet, based on: Weakly-supervised convolutional neural networks for multimodal image registration. Label-driven weakly-supervised learning for multimodal deformable image registration.
- Adapted from:
DeepReg (https://github.com/DeepRegNet/DeepReg)
- Parameters
spatial_dims (
int
) – number of spatial dimsin_channels (
int
) – number of input channelsnum_channel_initial (
int
) – number of initial channelsout_kernel_initializer (
Optional
[str
]) – kernel initializer for the last layerout_activation (
Optional
[str
]) – activation at the last layerout_channels (
int
) – number of channels for the outputextract_levels (
Tuple
[int
]) – list, which levels from net to extract. The maximum level must equal todepth
pooling (
bool
) – for down-sampling, use non-parameterized pooling if true, otherwise use conv3dconcat_skip (
bool
) – when up-sampling, concatenate skipped tensor if true, otherwise use addition
AutoEncoder¶
-
class
monai.networks.nets.
AutoEncoder
(dimensions, in_channels, out_channels, channels, strides, kernel_size=3, up_kernel_size=3, num_res_units=0, inter_channels=None, inter_dilations=None, num_inter_units=2, act='PRELU', norm='INSTANCE', dropout=None)[source]¶ Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.- Return type
Any
-
VarAutoEncoder¶
-
class
monai.networks.nets.
VarAutoEncoder
(dimensions, in_shape, out_channels, latent_size, channels, strides, kernel_size=3, up_kernel_size=3, num_res_units=0, inter_channels=None, inter_dilations=None, num_inter_units=2, act='PRELU', norm='INSTANCE', dropout=None)[source]¶ Variational Autoencoder based on the paper - https://arxiv.org/abs/1312.6114
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.- Return type
Tuple
[Tensor
,Tensor
,Tensor
,Tensor
]
-
ViT¶
-
class
monai.networks.nets.
ViT
(in_channels, img_size, patch_size, hidden_size, mlp_dim, num_layers, num_heads, pos_embed, classification, num_classes=2, dropout_rate=0.0)[source]¶ Vision Transformer (ViT), based on: “Dosovitskiy et al., An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale <https://arxiv.org/abs/2010.11929>”
- Parameters
in_channels (
int
) – dimension of input channels.img_size (
Tuple
[int
,int
,int
]) – dimension of input image.patch_size (
Tuple
[int
,int
,int
]) – dimension of patch size.hidden_size (
int
) – dimension of hidden layer.mlp_dim (
int
) – dimension of feedforward layer.num_layers (
int
) – number of transformer blocks.num_heads (
int
) – number of attention heads.pos_embed (
str
) – position embedding layer type.classification (
bool
) – bool argument to determine if classification is used.num_classes (
int
) – number of classes if classification is used.dropout_rate (
float
) – fraction of the input units to drop.
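Before constructing a ViT it can be useful to work out how many patch tokens a given `img_size` and `patch_size` produce and how `hidden_size` splits across attention heads. The sketch below is plain arithmetic under two assumptions typical for vision transformers: each image dimension is divisible by the patch dimension, and `hidden_size` is divisible by `num_heads`. `vit_token_layout` is a hypothetical helper, not a MONAI function.

```python
import math


# Hypothetical helper; the divisibility checks are assumed, typical ViT constraints.
def vit_token_layout(img_size, patch_size, hidden_size, num_heads):
    if any(i % p != 0 for i, p in zip(img_size, patch_size)):
        raise ValueError("img_size should be divisible by patch_size in every dimension")
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size should be divisible by num_heads")
    grid = tuple(i // p for i, p in zip(img_size, patch_size))
    return {
        "patch_grid": grid,
        "num_patches": math.prod(grid),
        "head_dim": hidden_size // num_heads,
    }


print(vit_token_layout((96, 96, 96), (16, 16, 16), hidden_size=768, num_heads=12))
```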
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
FullyConnectedNet¶
-
class
monai.networks.nets.
FullyConnectedNet
(in_channels, out_channels, hidden_channels, dropout=None, act='PRELU', bias=True, adn_ordering=None)[source]¶ Plain fully connected neural network.
The network uses dropout and, by default, PReLU activation.
Defines a network that accepts input with in_channels channels, produces output with out_channels channels, and has hidden layers with the channel counts given in hidden_channels. If bias is True then linear units have a bias term.
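The way `in_channels`, `hidden_channels` and `out_channels` chain together can be sketched as a list of linear layer shapes. The helpers below are hypothetical and only count the weights of the linear units (plus biases when `bias` is true); parameters of activation layers such as PReLU are deliberately left out.

```python
# Hypothetical helpers: enumerate the linear layers implied by the channel arguments.
def linear_layer_shapes(in_channels, out_channels, hidden_channels):
    widths = [in_channels, *hidden_channels, out_channels]
    return list(zip(widths[:-1], widths[1:]))


def linear_param_count(shapes, bias=True):
    """Weights (fan_in * fan_out) plus one bias per output unit if bias is True."""
    return sum(i * o + (o if bias else 0) for i, o in shapes)


shapes = linear_layer_shapes(10, 3, [32, 16])
print(shapes)
print(linear_param_count(shapes, bias=True))
```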
Generator¶
-
class
monai.networks.nets.
Generator
(latent_shape, start_shape, channels, strides, kernel_size=3, num_res_units=2, act='PRELU', norm='INSTANCE', dropout=None, bias=True)[source]¶ Defines a simple generator network that accepts a latent vector and, through a sequence of convolution layers, constructs an output tensor of greater size and higher dimensionality. The method _get_layer is used to create each of these layers; override this method to define layers beyond the default Convolution or ResidualUnit layers.
For example, a generator accepting a latent vector of shape (42,24) and producing an output volume of shape (1,64,64) can be constructed as:
gen = Generator((42, 24), (64, 8, 8), (32, 16, 1), (2, 2, 2))
Construct the generator network with the number of layers defined by channels and strides. In the forward pass a nn.Linear layer relates the input latent vector to a tensor of dimensions start_shape, this is then fed forward through the sequence of convolutional layers. The number of layers is defined by the length of channels and strides which must match, each layer having the number of output channels given in channels and an upsample factor given in strides (ie. a transpose convolution with that stride size).
- Parameters
latent_shape (
Sequence
[int
]) – tuple of integers stating the dimension of the input latent vector (minus batch dimension)start_shape (
Sequence
[int
]) – tuple of integers stating the dimension of the tensor to pass to convolution subnetworkchannels (
Sequence
[int
]) – tuple of integers stating the output channels of each convolutional layerstrides (
Sequence
[int
]) – tuple of integers stating the stride (upscale factor) of each convolutional layerkernel_size (
Union
[Sequence
[int
],int
]) – integer or tuple of integers stating size of convolutional kernelsnum_res_units (
int
) – integer stating number of convolutions in residual units, 0 means no residual unitsact – name or type defining activation layers
norm – name or type defining normalization layers
dropout (
Optional
[float
]) – optional float value in range [0, 1] stating dropout probability for layers, None for no dropoutbias (
bool
) – boolean stating if convolution layers should have a bias component
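The shape arithmetic behind the `Generator((42, 24), (64, 8, 8), (32, 16, 1), (2, 2, 2))` example can be checked by hand. The sketch below assumes each transposed convolution scales every spatial dimension by exactly its stride and that the linear layer maps the flattened latent vector to the flattened `start_shape`; the helper names are hypothetical.

```python
import math


# Hypothetical helpers; assume each layer upsamples by exactly its stride.
def generator_output_shape(start_shape, channels, strides):
    if len(channels) != len(strides):
        raise ValueError("channels and strides must have the same length")
    _, *spatial = start_shape  # first entry is the starting channel count
    for s in strides:
        spatial = [d * s for d in spatial]
    return (channels[-1], *spatial)


def linear_features(latent_shape, start_shape):
    """(in_features, out_features) of the layer relating the latent vector to start_shape."""
    return math.prod(latent_shape), math.prod(start_shape)


print(generator_output_shape((64, 8, 8), (32, 16, 1), (2, 2, 2)))
print(linear_features((42, 24), (64, 8, 8)))
```

Three layers with stride 2 turn the 8x8 starting grid into 64x64, and the last entry of `channels` gives the single output channel, matching the (1,64,64) volume mentioned above.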
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.- Return type
Tensor
Regressor¶
-
class
monai.networks.nets.
Regressor
(in_shape, out_shape, channels, strides, kernel_size=3, num_res_units=2, act='PRELU', norm='INSTANCE', dropout=None, bias=True)[source]¶ This defines a network for relating large-sized input tensors to small output tensors, ie. regressing large values to a prediction. An output of a single dimension can be used as value regression or multi-label classification prediction, an output of a single value can be used as a discriminator or critic prediction.
Construct the regressor network with the number of layers defined by channels and strides. Inputs are first passed through the convolutional layers in the forward pass; the output from this is then passed through a fully connected layer to relate it to the final output tensor.
- Parameters
in_shape (
Sequence
[int
]) – tuple of integers stating the dimension of the input tensor (minus batch dimension)out_shape (
Sequence
[int
]) – tuple of integers stating the dimension of the final output tensorchannels (
Sequence
[int
]) – tuple of integers stating the output channels of each convolutional layerstrides (
Sequence
[int
]) – tuple of integers stating the stride (downscale factor) of each convolutional layerkernel_size (
Union
[Sequence
[int
],int
]) – integer or tuple of integers stating size of convolutional kernelsnum_res_units (
int
) – integer stating number of convolutions in residual units, 0 means no residual unitsact – name or type defining activation layers
norm – name or type defining normalization layers
dropout (
Optional
[float
]) – optional float value in range [0, 1] stating dropout probability for layers, None for no dropoutbias (
bool
) – boolean stating if convolution layers should have a bias component
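To see how large the input to the final fully connected layer becomes, the standard convolution output-size formula can be applied once per stride. This is a sketch assuming an odd `kernel_size` with padding `(kernel_size - 1) // 2`; the helper names are hypothetical.

```python
import math


# Hypothetical helpers using the standard conv output-size formula.
def conv_out_size(size, kernel_size, stride, padding):
    return (size - kernel_size + 2 * padding) // stride + 1


def regressor_flat_features(in_shape, channels, strides, kernel_size=3):
    """Return (flattened feature count, final spatial size) after all conv layers."""
    padding = (kernel_size - 1) // 2  # assumed padding for an odd kernel size
    _, *spatial = in_shape  # first entry is the input channel count
    for s in strides:
        spatial = [conv_out_size(d, kernel_size, s, padding) for d in spatial]
    return channels[-1] * math.prod(spatial), tuple(spatial)


print(regressor_flat_features((1, 64, 64), (8, 16, 32), (2, 2, 2)))
```

The same arithmetic applies to Classifier, Discriminator and Critic, which are described as building on Regressor with the same `channels` and `strides` arguments.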
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.- Return type
Tensor
Classifier¶
-
class
monai.networks.nets.
Classifier
(in_shape, classes, channels, strides, kernel_size=3, num_res_units=2, act='PRELU', norm='INSTANCE', dropout=None, bias=True, last_act=None)[source]¶ Defines a classification network from Regressor by specifying the output shape as a single dimensional tensor with size equal to the number of classes to predict. The final activation function can also be specified, eg. softmax or sigmoid.
- Parameters
in_shape (
Sequence
[int
]) – tuple of integers stating the dimension of the input tensor (minus batch dimension)classes (
int
) – integer stating the dimension of the final output tensorchannels (
Sequence
[int
]) – tuple of integers stating the output channels of each convolutional layerstrides (
Sequence
[int
]) – tuple of integers stating the stride (downscale factor) of each convolutional layerkernel_size (
Union
[Sequence
[int
],int
]) – integer or tuple of integers stating size of convolutional kernelsnum_res_units (
int
) – integer stating number of convolutions in residual units, 0 means no residual unitsact – name or type defining activation layers
norm – name or type defining normalization layers
dropout (
Optional
[float
]) – optional float value in range [0, 1] stating dropout probability for layers, None for no dropoutbias (
bool
) – boolean stating if convolution layers should have a bias componentlast_act (
Optional
[str
]) – name defining the last activation layer
Discriminator¶
-
class
monai.networks.nets.
Discriminator
(in_shape, channels, strides, kernel_size=3, num_res_units=2, act='PRELU', norm='INSTANCE', dropout=0.25, bias=True, last_act='SIGMOID')[source]¶ Defines a discriminator network from Classifier with a single output value and sigmoid activation by default. This is meant for use with GANs or other applications requiring a generic discriminator network.
- Parameters
in_shape (
Sequence
[int
]) – tuple of integers stating the dimension of the input tensor (minus batch dimension)channels (
Sequence
[int
]) – tuple of integers stating the output channels of each convolutional layerstrides (
Sequence
[int
]) – tuple of integers stating the stride (downscale factor) of each convolutional layerkernel_size (
Union
[Sequence
[int
],int
]) – integer or tuple of integers stating size of convolutional kernelsnum_res_units (
int
) – integer stating number of convolutions in residual units, 0 means no residual unitsact – name or type defining activation layers
norm – name or type defining normalization layers
dropout (
Optional
[float
]) – optional float value in range [0, 1] stating dropout probability for layers, None for no dropoutbias (
bool
) – boolean stating if convolution layers should have a bias componentlast_act – name defining the last activation layer
Critic¶
-
class
monai.networks.nets.
Critic
(in_shape, channels, strides, kernel_size=3, num_res_units=2, act='PRELU', norm='INSTANCE', dropout=0.25, bias=True)[source]¶ Defines a critic network from Classifier with a single output value and no final activation. The final layer is nn.Flatten instead of nn.Linear, the final result is computed as the mean over the first dimension. This is meant to be used with Wasserstein GANs.
- Parameters
in_shape (
Sequence
[int
]) – tuple of integers stating the dimension of the input tensor (minus batch dimension)channels (
Sequence
[int
]) – tuple of integers stating the output channels of each convolutional layerstrides (
Sequence
[int
]) – tuple of integers stating the stride (downscale factor) of each convolutional layerkernel_size (
Union
[Sequence
[int
],int
]) – integer or tuple of integers stating size of convolutional kernelsnum_res_units (
int
) – integer stating number of convolutions in residual units, 0 means no residual unitsact – name or type defining activation layers
norm – name or type defining normalization layers
dropout (
Optional
[float
]) – optional float value in range [0, 1] stating dropout probability for layers, None for no dropoutbias (
bool
) – boolean stating if convolution layers should have a bias component
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.- Return type
Tensor
NetAdapter¶
-
class
monai.networks.nets.
NetAdapter
(model, n_classes=1, dim=2, in_channels=None, use_conv=False, pool=('avg', {'kernel_size': 7, 'stride': 1}), bias=True)[source]¶ Wrapper to replace the last layer of model with a convolutional layer or an FC layer. This module expects the output of model layers[0: -2] to be a feature map with shape [B, C, spatial dims], and then replaces the model’s last two layers with an optional pooling layer and a conv or linear layer.
- Parameters
model (
Module
) – a PyTorch model; both 2D and 3D models are supported. Typically, it can be a pretrained model in Torchvision, such as resnet18
,resnet34
,resnet50
,resnet101
,resnet152
, etc. more details: https://pytorch.org/vision/stable/models.html.n_classes (
int
) – number of classes for the last classification layer. Default to 1.dim (
int
) – number of spatial dimensions, default to 2.in_channels (
Optional
[int
]) – number of the input channels of last layer. if None, get it from in_features of last layer.use_conv (
bool
) – whether use convolutional layer to replace the last layer, default to False.pool (
Optional
[Tuple
[str
,Dict
[str
,Any
]]]) – parameters for the pooling layer, it should be a tuple, the first item is name of the pooling layer, the second item is dictionary of the initialization args. if None, will not replace the layers[-2]. default to (“avg”, {“kernel_size”: 7, “stride”: 1}).bias (
bool
) – the bias value when replacing the last layer. if False, the layer will not learn an additive bias, default to True.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
TorchVisionFCModel¶
-
class
monai.networks.nets.
TorchVisionFCModel
(model_name='resnet18', n_classes=1, dim=2, in_channels=None, use_conv=False, pool=('avg', {'kernel_size': 7, 'stride': 1}), bias=True, pretrained=False)[source]¶ Customize the fully connected layer of a TorchVision model, or replace it with a convolutional layer.
- Parameters
model_name (str) – name of any torchvision model with a fully connected layer at the end: resnet18 (default), resnet34, resnet50, resnet101, resnet152, resnext50_32x4d, resnext101_32x8d, wide_resnet50_2, wide_resnet101_2. Model details: https://pytorch.org/vision/stable/models.html.
n_classes (int) – number of classes for the last classification layer. Defaults to 1.
dim (int) – number of spatial dimensions. Defaults to 2.
in_channels (Optional[int]) – number of input channels of the last layer. If None, it is read from in_features of the last layer.
use_conv (bool) – whether to use a convolutional layer to replace the last layer. Defaults to False.
pool (Optional[Tuple[str, Dict[str, Any]]]) – parameters for the pooling layer, given as a tuple: the first item is the name of the pooling layer, the second item is a dictionary of its initialization args. If None, layers[-2] is not replaced. Defaults to (“avg”, {“kernel_size”: 7, “stride”: 1}).
bias (bool) – the bias value when replacing the last layer. If False, the layer will not learn an additive bias. Defaults to True.
pretrained (bool) – whether to use the ImageNet pretrained weights. Defaults to False.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
TorchVisionFullyConvModel¶
-
class
monai.networks.nets.
TorchVisionFullyConvModel
(model_name='resnet18', n_classes=1, pool_size=(7, 7), pool_stride=1, pretrained=False)[source]¶ Customize TorchVision models by replacing the fully connected layer with a convolutional layer.
- Parameters
model_name (str) – name of any torchvision model with adaptive avg pooling and a fully connected layer at the end: resnet18 (default), resnet34, resnet50, resnet101, resnet152, resnext50_32x4d, resnext101_32x8d, wide_resnet50_2, wide_resnet101_2.
n_classes (int) – number of classes for the last classification layer. Defaults to 1.
pool_size (Union[int, Tuple[int, int]]) – the kernel size for the AvgPool2d that replaces AdaptiveAvgPool2d. Defaults to (7, 7).
pool_stride (Union[int, Tuple[int, int]]) – the stride for the AvgPool2d that replaces AdaptiveAvgPool2d. Defaults to 1.
pretrained (bool) – whether to use the ImageNet pretrained weights. Defaults to False.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
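The effect of swapping AdaptiveAvgPool2d for a fixed AvgPool2d can be illustrated with plain size arithmetic. The following is a minimal sketch, not MONAI code: the helper names are hypothetical, and the total downsampling factor of 32 is an assumption that holds for the standard ResNet family on inputs whose sides are multiples of 32.

```python
def avg_pool_output_size(feature_size: int, kernel_size: int = 7, stride: int = 1) -> int:
    """Spatial size after an unpadded average pool: floor((n - k) / s) + 1."""
    if feature_size < kernel_size:
        raise ValueError("feature map is smaller than the pooling kernel")
    return (feature_size - kernel_size) // stride + 1


def backbone_feature_size(image_size: int, total_stride: int = 32) -> int:
    """Assumption: the backbone downsamples each spatial axis by total_stride."""
    return image_size // total_stride


for image_size in (224, 448):
    feat = backbone_feature_size(image_size)
    # With the default pool_size=7 and pool_stride=1, a larger input yields
    # a larger output map instead of a single pooled vector.
    print(image_size, feat, avg_pool_output_size(feat))
```

Under these assumptions a 224-pixel input gives a 1x1 prediction per class, while a 448-pixel input gives an 8x8 map of predictions, which is the point of keeping the network fully convolutional.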
Utilities¶
Utilities and types for defining networks, these depend on PyTorch.
-
monai.networks.utils.
copy_model_state
(dst, src, dst_prefix='', mapping=None, exclude_vars=None, inplace=True)[source]¶ Compute a module state_dict whose keys are the same as those of dst. The values of dst are overwritten by the ones from src whenever their keys match. The method applies an additional dst_prefix to the dst key when matching them. mapping can be a {“src_key”: “dst_key”} dict, indicating dst[dst_prefix + dst_key] = src[src_key]. This function is mainly used to build a model state dict for loading the src model state into the dst model. src and dst can have different dict keys, but their corresponding values normally have the same shape.
- Parameters
dst (Union[Module, Mapping]) – a pytorch module or state dict to be updated.
src (Union[Module, Mapping]) – a pytorch module or state dict used to get the values used for the update.
dst_prefix – dst key prefix, so that dst[dst_prefix + src_key] will be assigned to the value of src[src_key].
mapping – a {“src_key”: “dst_key”} dict, indicating that dst[dst_prefix + dst_key] is to be assigned the value of src[src_key].
exclude_vars – a regular expression to match the dst variable names, so that their values are not overwritten by src.
inplace – whether to set the dst module with the updated state_dict via load_state_dict. This option is only available when dst is a torch.nn.Module.
Examples
from monai.networks.nets import BasicUNet
from monai.networks.utils import copy_model_state

model_a = BasicUNet(in_channels=1, out_channels=4)
model_b = BasicUNet(in_channels=1, out_channels=2)
model_a_b, changed, unchanged = copy_model_state(
    model_a, model_b, exclude_vars="conv_0.conv_0", inplace=False)
# dst model updated: 76 of 82 variables.
model_a.load_state_dict(model_a_b)
# <All keys matched successfully>
Returns: an OrderedDict of the updated dst state, the changed, and unchanged keys.
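The key-matching rules described for dst_prefix, mapping and exclude_vars can be sketched on plain dictionaries. This is an illustrative approximation, not the library implementation: copy_state_sketch is a hypothetical name, values are plain integers instead of tensors, no shape check is performed, and the use of re.search for exclude_vars is an assumption.

```python
import re
from collections import OrderedDict


def copy_state_sketch(dst, src, dst_prefix="", mapping=None, exclude_vars=None):
    """Return (updated_dst, changed_keys, unchanged_keys) for plain dicts."""
    updated = OrderedDict(dst)
    changed = []
    excluded = re.compile(exclude_vars) if exclude_vars else None

    def assign(dst_key, value):
        # Only keys that already exist in dst and are not excluded are overwritten.
        if dst_key not in updated:
            return
        if excluded is not None and excluded.search(dst_key):
            return
        if dst_key not in changed:
            changed.append(dst_key)
        updated[dst_key] = value

    # Rule 1: dst[dst_prefix + src_key] = src[src_key]
    for src_key, value in src.items():
        assign(dst_prefix + src_key, value)
    # Rule 2: dst[dst_prefix + dst_key] = src[src_key] for explicit mappings
    for src_key, dst_key in (mapping or {}).items():
        if src_key in src:
            assign(dst_prefix + dst_key, src[src_key])

    unchanged = [k for k in updated if k not in changed]
    return updated, changed, unchanged


dst = {"net.conv.weight": 0, "net.conv.bias": 0, "net.head.weight": 0}
src = {"conv.weight": 1, "conv.bias": 2, "fc.weight": 3}
new_state, changed, unchanged = copy_state_sketch(
    dst, src, dst_prefix="net.", mapping={"fc.weight": "head.weight"}, exclude_vars="bias"
)
print(dict(new_state))
print(changed, unchanged)
```

Here conv.weight is matched through the prefix, fc.weight reaches head.weight through the mapping, and the bias entry is left untouched because it matches exclude_vars.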
-
monai.networks.utils.
eval_mode
(*nets)[source]¶ Set network(s) to eval mode and then return to original state at the end.
- Parameters
nets (
Module
) – Input network(s)
Examples
import torch
from monai.networks.utils import eval_mode

t = torch.rand(1, 1, 16, 16)
p = torch.nn.Conv2d(1, 1, 3)
print(p.training)  # True
with eval_mode(p):
    print(p.training)  # False
    print(p(t).sum().backward())  # raises an exception because gradients are not tracked here
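The "set the mode, then return to the original state" behaviour can be sketched with a context manager. This is a sketch under stated assumptions rather than the library code: ToyNet is a hypothetical stand-in that only mimics the training flag of a network, and the sketch models the save-and-restore of that flag only, not any gradient handling.

```python
from contextlib import contextmanager


class ToyNet:
    """Hypothetical stand-in exposing only the train()/eval() flag."""

    def __init__(self):
        self.training = True

    def train(self, mode=True):
        self.training = mode
        return self

    def eval(self):
        return self.train(False)


@contextmanager
def eval_mode_sketch(*nets):
    # Remember which networks were in training mode before entering.
    was_training = [n for n in nets if n.training]
    try:
        yield [n.eval() for n in nets]
    finally:
        # Restore only those networks; ones already in eval mode stay there.
        for n in was_training:
            n.train()


net_a, net_b = ToyNet(), ToyNet().eval()
with eval_mode_sketch(net_a, net_b):
    inside = (net_a.training, net_b.training)
after = (net_a.training, net_b.training)
print(inside, after)
```

The try/finally ensures the original mode is restored even if the body of the with block raises.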
-
monai.networks.utils.
icnr_init
(conv, upsample_factor, init=<function kaiming_normal_>)[source]¶ ICNR initialization for 2D/3D kernels adapted from Aitken et al., 2017, “Checkerboard artifact free sub-pixel convolution”.
-
monai.networks.utils.
normal_init
(m, std=0.02, normal_func=<function normal_>)[source]¶ Initialize the weight and bias tensors of m and its submodules to values from a normal distribution with a stddev of std. Weight tensors of convolution and linear modules are initialized with a mean of 0, batch norm modules with a mean of 1. The callable normal_func, used to assign values, should have the same arguments as its default normal_(). This can be used with nn.Module.apply to visit submodules of a network.
- Return type
None
-
monai.networks.utils.
normalize_transform
(shape, device=None, dtype=None, align_corners=False)[source]¶ Compute an affine matrix according to the input shape. The transform normalizes the homogeneous image coordinates to the range of [-1, 1].
- Parameters
shape (Sequence[int]) – input spatial shape.
device (Optional[device]) – device on which the returned affine will be allocated.
dtype (Optional[dtype]) – data type of the returned affine.
align_corners (bool) – if True, consider -1 and 1 to refer to the centers of the corner pixels rather than the image corners. See also: https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.grid_sample
- Return type
Tensor
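To make the role of align_corners concrete, the following sketch builds such a normalizing matrix with nested Python lists. It is an illustration under assumptions, not the library implementation: the function names are hypothetical, the formulas follow the usual grid_sample coordinate convention, and every axis is assumed to have more than one element.

```python
def normalize_matrix_sketch(shape, align_corners=False):
    """Homogeneous (d+1)x(d+1) matrix mapping pixel coordinates to [-1, 1]."""
    d = len(shape)
    mat = [[0.0] * (d + 1) for _ in range(d + 1)]
    for i, n in enumerate(shape):
        if align_corners:
            # -1 and 1 are the centers of the first and last pixels.
            scale, offset = 2.0 / (n - 1), -1.0
        else:
            # -1 and 1 are the outer edges of the first and last pixels.
            scale, offset = 2.0 / n, 1.0 / n - 1.0
        mat[i][i] = scale
        mat[i][d] = offset
    mat[d][d] = 1.0
    return mat


def apply_affine(mat, point):
    """Apply the homogeneous matrix to one point given in pixel coordinates."""
    homog = list(point) + [1.0]
    return [sum(m * p for m, p in zip(row, homog)) for row in mat[:-1]]


shape = (4, 8)
centers = normalize_matrix_sketch(shape, align_corners=True)
edges = normalize_matrix_sketch(shape, align_corners=False)
print(apply_affine(centers, (0, 0)), apply_affine(centers, (3, 7)))
print(apply_affine(edges, (-0.5, -0.5)), apply_affine(edges, (3.5, 7.5)))
```

With align_corners=True the pixel centers (0, 0) and (3, 7) land on the ends of the normalized range; with align_corners=False the ends correspond to the image corners at (-0.5, -0.5) and (3.5, 7.5).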
-
monai.networks.utils.
one_hot
(labels, num_classes, dtype=torch.float32, dim=1)[source]¶ For every value v in labels, the value in the output will be either 1 or 0. Each vector along the dim-th dimension has the “one-hot” format, i.e., it has a total length of num_classes, with a single one and num_classes - 1 zeros. Note that this will include the background label, thus a binary mask should be treated as having two classes.
- Parameters
labels (Tensor) – input tensor of integers to be converted into the ‘one-hot’ format. Internally, labels will be converted into integers via labels.long().
num_classes (int) – number of output channels; the corresponding length of labels[dim] will be converted to num_classes from 1.
dtype (dtype) – the data type of the output one_hot label.
dim (int) – the dimension to be converted to num_classes channels from 1 channel; should be a non-negative number.
Example:
For a tensor labels of dimensions [B]1[spatial_dims], return a tensor of dimensions [B]N[spatial_dims] when num_classes=N and dim=1.
from monai.networks.utils import one_hot
import torch

a = torch.randint(0, 2, size=(1, 2, 2, 2))
out = one_hot(a, num_classes=2, dim=0)
print(out.shape)  # torch.Size([2, 2, 2, 2])

a = torch.randint(0, 2, size=(2, 1, 2, 2, 2))
out = one_hot(a, num_classes=2, dim=1)
print(out.shape)  # torch.Size([2, 2, 2, 2, 2])
- Return type
Tensor
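The remark that a binary mask counts as two classes is easier to see on concrete values. Below is a minimal pure-Python sketch of the one-hot format for a single row of labels, with the class channel as the leading axis; one_hot_sketch is a hypothetical helper for illustration and does not use the library.

```python
def one_hot_sketch(labels, num_classes):
    """Turn a row of integer labels into num_classes channels of 0/1 values."""
    for v in labels:
        if not 0 <= v < num_classes:
            raise ValueError(f"label {v} is outside [0, {num_classes})")
    # Channel c is 1 wherever the label equals c, and 0 elsewhere.
    return [[1.0 if v == c else 0.0 for v in labels] for c in range(num_classes)]


mask = [0, 1, 1, 0]  # a binary mask: background 0, foreground 1
channels = one_hot_sketch(mask, num_classes=2)
print(channels)  # [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
```

Channel 0 marks the background and channel 1 the foreground, so at every position exactly one channel is 1.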
-
monai.networks.utils.
pixelshuffle
(x, dimensions, scale_factor)[source]¶ Apply pixel shuffle to the tensor x with spatial dimensions dimensions and scaling factor scale_factor.
See: Shi et al., 2016, “Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network.”
See: Aitken et al., 2017, “Checkerboard artifact free sub-pixel convolution”.
- Parameters
x (Tensor) – input tensor.
dimensions (int) – number of spatial dimensions, typically 2 or 3 for 2D or 3D.
scale_factor (int) – factor to rescale the spatial dimensions by, must be >=1.
- Return type
Tensor
- Returns
Reshuffled version of x.
- Raises
ValueError – When input channels of x are not divisible by (scale_factor ** dimensions)
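The divisibility requirement follows from the shape bookkeeping of pixel shuffle: channels are traded for spatial resolution. The sketch below computes only the expected output shape, assuming a batch-first, channel-second layout; pixelshuffle_output_shape is a hypothetical helper and does not rearrange any data.

```python
def pixelshuffle_output_shape(shape, dimensions, scale_factor):
    """Output shape of a pixel shuffle for an input of shape (B, C, *spatial)."""
    batch, channels, spatial = shape[0], shape[1], tuple(shape[2:])
    if len(spatial) != dimensions:
        raise ValueError("number of spatial axes does not match dimensions")
    divisor = scale_factor ** dimensions
    if channels % divisor != 0:
        raise ValueError(
            f"channels ({channels}) must be divisible by scale_factor ** dimensions ({divisor})"
        )
    # Each group of `divisor` channels is folded into the spatial axes.
    return (batch, channels // divisor) + tuple(s * scale_factor for s in spatial)


print(pixelshuffle_output_shape((1, 8, 4, 4), dimensions=2, scale_factor=2))
print(pixelshuffle_output_shape((2, 16, 3, 3, 3), dimensions=3, scale_factor=2))
```

The total number of elements is preserved: 8 channels of 4x4 become 2 channels of 8x8 in the 2D case.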
-
monai.networks.utils.
predict_segmentation
(logits, mutually_exclusive=False, threshold=0.0)[source]¶ Given the logits from a network, compute the segmentation: for a multi-label task, threshold the values (values above threshold, 0 by default, are kept); for a multi-class task, compute the argmax along the channel axis. logits has shape BCHW[D].
- Parameters
logits (Tensor) – raw data of model output.
mutually_exclusive (bool) – if True, logits will be converted into a binary matrix using argmax, which is suitable for multi-class tasks. Defaults to False.
threshold (float) – threshold applied to the prediction values for multi-label tasks.
- Return type
Any
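The difference between the two modes can be shown on the channel logits of a single voxel. This is a simplified sketch, not the library function: predict_sketch is a hypothetical helper working on a plain list, and the strict "greater than" comparison at the threshold is an assumption.

```python
def predict_sketch(channel_logits, mutually_exclusive=False, threshold=0.0):
    """Segment one voxel from its per-channel logits."""
    if mutually_exclusive:
        # Multi-class: exactly one label, the channel with the largest logit.
        return max(range(len(channel_logits)), key=channel_logits.__getitem__)
    # Multi-label: every channel is decided independently by thresholding.
    # Assumption: values exactly equal to the threshold count as 0.
    return [1 if v > threshold else 0 for v in channel_logits]


voxel = [-1.2, 0.4, 2.0]
print(predict_sketch(voxel))                           # [0, 1, 1]
print(predict_sketch(voxel, mutually_exclusive=True))  # 2
```

In the multi-label case two channels are active at once, whereas the multi-class case returns a single class index.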
-
monai.networks.utils.
to_norm_affine
(affine, src_size, dst_size, align_corners=False)[source]¶ Given affine defined for coordinates in the pixel space, compute the corresponding affine for the normalized coordinates.
- Parameters
affine (Tensor) – Nxdxd batched square matrix.
src_size (Sequence[int]) – source image spatial shape.
dst_size (Sequence[int]) – target image spatial shape.
align_corners (bool) – if True, consider -1 and 1 to refer to the centers of the corner pixels rather than the image corners. See also: https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.grid_sample
- Raises
TypeError – When affine is not a torch.Tensor.
ValueError – When affine is not Nxdxd.
ValueError – When src_size or dst_size dimensions differ from affine.
- Return type
Tensor
-
monai.networks.utils.
train_mode
(*nets)[source]¶ Set network(s) to train mode and then return to original state at the end.
- Parameters
nets (
Module
) – Input network(s)
Examples
import torch
from monai.networks.utils import train_mode

t = torch.rand(1, 1, 16, 16)
p = torch.nn.Conv2d(1, 1, 3)
p.eval()
print(p.training)  # False
with train_mode(p):
    print(p.training)  # True
    print(p(t).sum().backward())  # No exception