Data

Generic Interfaces

Dataset

class monai.data.Dataset(data, transform=None)[source]

A generic dataset with a length property and an optional callable data transform when fetching a data sample. For example, typical input data can be a list of dictionaries:

[{                            {                            {
     'img': 'image1.nii.gz',      'img': 'image2.nii.gz',      'img': 'image3.nii.gz',
     'seg': 'label1.nii.gz',      'seg': 'label2.nii.gz',      'seg': 'label3.nii.gz',
     'extra': 123                 'extra': 456                 'extra': 789
 },                           },                           }]
Parameters
  • data (Sequence) – input data to load and transform to generate dataset for model.

  • transform (Optional[Callable]) – a callable data transform on input data.
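
A minimal usage sketch (the file names and keys below are placeholders): a dictionary-based transform chain is passed as transform and runs whenever an item is fetched.

from monai.data import Dataset
from monai.transforms import Compose, LoadNiftid, AddChanneld, ToTensord

data_dicts = [
    {'img': 'image1.nii.gz', 'seg': 'label1.nii.gz', 'extra': 123},
    {'img': 'image2.nii.gz', 'seg': 'label2.nii.gz', 'extra': 456},
]
transform = Compose([
    LoadNiftid(keys=['img', 'seg']),    # load the NIfTI files into arrays
    AddChanneld(keys=['img', 'seg']),   # add a channel dimension
    ToTensord(keys=['img', 'seg']),     # convert the arrays to PyTorch tensors
])
ds = Dataset(data=data_dicts, transform=transform)
print(len(ds))  # 2; the transform is applied when an item is indexed, e.g. ds[0]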

PersistentDataset

class monai.data.PersistentDataset(data, transform, cache_dir=None)[source]

Persistent storage of pre-computed values, to efficiently manage dictionary-format data that is larger than memory; transforms can be applied to specific fields. Results from the non-random transform components are computed when first used, and stored in the cache_dir for rapid retrieval on subsequent uses.

For example, typical input data can be a list of dictionaries:

[{                            {                            {
     'img': 'image1.nii.gz',      'img': 'image2.nii.gz',      'img': 'image3.nii.gz',
     'seg': 'label1.nii.gz',      'seg': 'label2.nii.gz',      'seg': 'label3.nii.gz',
     'extra': 123                 'extra': 456                 'extra': 789
 },                           },                           }]

For a composite transform like

[ LoadNiftid(keys=['image', 'label']),
  Orientationd(keys=['image', 'label'], axcodes='RAS'),
  ScaleIntensityRanged(keys=['image'], a_min=-57, a_max=164, b_min=0.0, b_max=1.0, clip=True),
  RandCropByPosNegLabeld(keys=['image', 'label'], label_key='label', spatial_size=(96, 96, 96),
                         pos=1, neg=1, num_samples=4, image_key='image', image_threshold=0),
  ToTensord(keys=['image', 'label'])]

Upon first use, a filename-based dataset will be processed by the deterministic transform components [LoadNiftid, Orientationd, ScaleIntensityRanged] and the resulting tensors written to the cache_dir, before the remaining random-dependent transforms [RandCropByPosNegLabeld, ToTensord] are applied for use in the analysis.

Subsequent uses of the dataset read the pre-processed results directly from cache_dir and then apply the random-dependent part of the transform processing.

Note

The input data must be a list of file paths; the dataset will hash them as cache keys.

Parameters
  • data (Sequence[str]) – input data file paths to load and transform to generate dataset for model. PersistentDataset expects input data to be a list of file paths and hashes them as cache keys.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • cache_dir (Union[Path, str, None]) – If specified, this is the location for persistent storage of pre-computed transformed data tensors. The cache_dir is computed once, and persists on disk until explicitly removed. Different runs, programs, experiments may share a common cache dir provided that the transforms pre-processing is consistent. If the cache_dir doesn’t exist, will automatically create it.
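
A brief sketch of the caching behaviour, reusing the hypothetical data_dicts from the Dataset example above; the cache directory is a placeholder path.

from monai.data import PersistentDataset
from monai.transforms import (Compose, LoadNiftid, Orientationd, RandCropByPosNegLabeld,
                              ScaleIntensityRanged, ToTensord)

transform = Compose([
    LoadNiftid(keys=['img', 'seg']),
    Orientationd(keys=['img', 'seg'], axcodes='RAS'),
    ScaleIntensityRanged(keys=['img'], a_min=-57, a_max=164, b_min=0.0, b_max=1.0, clip=True),
    RandCropByPosNegLabeld(keys=['img', 'seg'], label_key='seg', spatial_size=(96, 96, 96),
                           pos=1, neg=1, num_samples=4),
    ToTensord(keys=['img', 'seg']),
])
# deterministic results (up to ScaleIntensityRanged) are written to ./persistent_cache on
# first access and reloaded on later runs that use the same deterministic transforms
ds = PersistentDataset(data=data_dicts, transform=transform, cache_dir='./persistent_cache')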

CacheDataset

class monai.data.CacheDataset(data, transform, cache_num=9223372036854775807, cache_rate=1.0, num_workers=0)[source]

Dataset with cache mechanism that can load data and cache deterministic transforms’ result during training.

By caching the results of non-random preprocessing transforms, it accelerates the training data pipeline. If the requested data is not in the cache, all transforms will run normally (see also monai.data.dataset.Dataset).

Users can set the cache rate or number of items to cache. It is recommended to experiment with different cache_num or cache_rate to identify the best training speed.

To improve the caching efficiency, please always put as many non-random transforms as possible before the randomized ones when composing the chain of transforms.

For example, if the transform is a Compose of:

transforms = Compose([
    LoadNiftid(),
    AddChanneld(),
    Spacingd(),
    Orientationd(),
    ScaleIntensityRanged(),
    RandCropByPosNegLabeld(),
    ToTensord()
])

when transforms is used in a multi-epoch training pipeline, before the first training epoch this dataset will cache the results up to ScaleIntensityRanged, as all of the non-random transforms LoadNiftid, AddChanneld, Spacingd, Orientationd and ScaleIntensityRanged can be cached. During training, the dataset will load the cached results and run RandCropByPosNegLabeld and ToTensord, as RandCropByPosNegLabeld is a randomized transform and its outcome is not cached.

Parameters
  • data (Sequence) – input data to load and transform to generate dataset for model.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • cache_num (int) – number of items to be cached. Default is sys.maxsize. The minimum of (cache_num, data_length x cache_rate, data_length) will be used.

  • cache_rate (float) – percentage of cached data in total, default is 1.0 (cache all). The minimum of (cache_num, data_length x cache_rate, data_length) will be used.

  • num_workers (int) – the number of worker threads to use. If 0 a single thread will be used. Default is 0.
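
A short sketch using the hypothetical data_dicts and transform chain from the examples above; the cache is filled once before the first epoch and reused afterwards.

from monai.data import CacheDataset, DataLoader

# cache all deterministic transform results in memory, filling the cache with 4 threads
ds = CacheDataset(data=data_dicts, transform=transform, cache_rate=1.0, num_workers=4)
loader = DataLoader(ds, batch_size=2, shuffle=True)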

SmartCacheDataset

class monai.data.SmartCacheDataset(data, transform, replace_rate, cache_num=9223372036854775807, cache_rate=1.0, num_init_workers=0, num_replace_workers=0)[source]

Re-implementation of the SmartCache mechanism in NVIDIA Clara-train SDK.

At any time, the cache pool only keeps a subset of the whole dataset, and in each epoch only the items in the cache are used for training. This ensures that the data needed for training is readily available, keeping GPU resources busy. Note that cached items may still have to go through a non-deterministic transform sequence before being fed to the GPU. At the same time, another thread is preparing replacement items by applying the transform sequence to items not in the cache. Once one epoch is completed, SmartCache replaces the same number of items with the replacement items.

SmartCache uses a simple running-window algorithm to determine the cache content and the replacement items. Let N be the configured number of objects in the cache and R the number of replacement objects (R = ceil(N * r), where r is the configured replace rate). For more details, please refer to: https://docs.nvidia.com/clara/tlt-mi/clara-train-sdk-v3.0/nvmidl/additional_features/smart_cache.html#smart-cache

For example, if we have 5 images [image1, image2, image3, image4, image5] with cache_num=4 and replace_rate=0.25, the actual training images cached and replaced for every epoch are as below:

epoch 1: [image1, image2, image3, image4]
epoch 2: [image2, image3, image4, image5]
epoch 3: [image3, image4, image5, image1]
epoch 4: [image4, image5, image1, image2]
epoch N: [image[N % 5] ...]

The typical usage of SmartCacheDataset involves 4 steps (see the sketch after the parameter list below):

  1. Initialize SmartCacheDataset object and cache for the first epoch.

  2. Call start() to run replacement thread in background.

  3. Call update_cache() before every epoch to replace training items.

  4. Call shutdown() when training ends.

Parameters
  • data (Sequence) – input data to load and transform to generate dataset for model.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • replace_rate (float) – percentage of the cached items to be replaced in every epoch.

  • cache_num (int) – number of items to be cached. Default is sys.maxsize. The minimum of (cache_num, data_length x cache_rate, data_length) will be used.

  • cache_rate (float) – percentage of cached data in total, default is 1.0 (cache all). The minimum of (cache_num, data_length x cache_rate, data_length) will be used.

  • num_init_workers (int) – the number of worker threads to initialize the cache for the first epoch. If 0, run in the main thread and no separate thread will be opened.

  • num_replace_workers (int) – the number of worker threads to prepare the replacement cache for every epoch. If 0, run in the main thread and no separate thread will be opened.
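
The sketch below puts the four steps together in a training loop; data_dicts, transform and num_epochs are placeholders following the examples above.

from monai.data import DataLoader, SmartCacheDataset

ds = SmartCacheDataset(data=data_dicts, transform=transform, replace_rate=0.25,
                       cache_num=16, num_init_workers=2, num_replace_workers=2)
loader = DataLoader(ds, batch_size=2)

ds.start()                         # step 2: launch the background replacement thread
for epoch in range(num_epochs):
    for batch in loader:
        pass                       # training step goes here
    ds.update_cache()              # step 3: swap in the replacement items before the next epoch
ds.shutdown()                      # step 4: stop the background thread when training ends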

is_started()[source]

Check whether the replacement thread is already started.

manage_replacement()[source]

Background thread for replacement.

shutdown()[source]

Shut down the background thread for replacement.

start()[source]

Start the background thread to replace training items for every epoch.

update_cache()[source]

Update cache items for the current epoch; this function needs to be called before every epoch. If the cache has been shut down before, the _replace_mgr thread needs to be restarted.

ZipDataset

class monai.data.ZipDataset(datasets, transform=None)[source]

Zip several PyTorch datasets and output their data (for the same index) together in a tuple. If the output of a single dataset is already a tuple, it is flattened and extended into the result. For example: if datasetA returns (img, imgmeta) and datasetB returns (seg, segmeta), the final output is (img, imgmeta, seg, segmeta). If the datasets do not have the same length, the minimum length among them is used as the length of ZipDataset.

Examples:

>>> zip_data = ZipDataset([[1, 2, 3], [4, 5]])
>>> print(len(zip_data))
2
>>> for item in zip_data:
>>>    print(item)
[1, 4]
[2, 5]
Parameters
  • datasets (Sequence) – list of datasets to zip together.

  • transform (Optional[Callable]) – a callable data transform that operates on the zipped item from datasets.

ArrayDataset

class monai.data.ArrayDataset(img, img_transform=None, seg=None, seg_transform=None, labels=None, label_transform=None)[source]

Dataset for segmentation and classification tasks based on array format input data and transforms. It ensures the same random seeds in the randomized transforms defined for image, segmentation and label. The transform can be monai.transforms.Compose or any other callable object. For example, if training is based on NIfTI-format images without metadata, all transforms can be composed:

img_transform = Compose(
    [
        LoadNifti(image_only=True),
        AddChannel(),
        RandAdjustContrast()
    ]
)
ArrayDataset(img_file_list, img_transform=img_transform)

If training is based on images and their metadata, the array transforms cannot be composed because several transforms receive multiple parameters or return multiple values. Users then need to define their own callable method to parse the metadata from LoadNifti or to pass the affine matrix to the Spacing transform:

class TestCompose(Compose):
    def __call__(self, input_):
        img, metadata = self.transforms[0](input_)
        img = self.transforms[1](img)
        img, _, _ = self.transforms[2](img, metadata["affine"])
        return self.transforms[3](img), metadata
img_transform = TestCompose(
    [
        LoadNifti(image_only=False),
        AddChannel(),
        Spacing(pixdim=(1.5, 1.5, 3.0)),
        RandAdjustContrast()
    ]
)
ArrayDataset(img_file_list, img_transform=img_transform)

Examples:

>>> ds = ArrayDataset([1, 2, 3, 4], lambda x: x + 0.1)
>>> print(ds[0])
1.1

>>> ds = ArrayDataset(img=[1, 2, 3, 4], seg=[5, 6, 7, 8])
>>> print(ds[0])
[1, 5]

Initializes the dataset with the filename lists. The transform img_transform is applied to the images and seg_transform to the segmentations.

Parameters
  • img (Sequence) – sequence of images.

  • img_transform (Optional[Callable]) – transform to apply to each element in img.

  • seg (Optional[Sequence]) – sequence of segmentations.

  • seg_transform (Optional[Callable]) – transform to apply to each element in seg.

  • labels (Optional[Sequence]) – sequence of labels.

  • label_transform (Optional[Callable]) – transform to apply to each element in labels.

randomize(data=None)[source]

Within this method, self.R should be used, instead of np.random, to introduce random factors.

All self.R calls happen here so that we have a better chance of identifying errors in synchronizing the random state.

This method can generate the random factors based on properties of the input data.

Raises

NotImplementedError – When the subclass does not override this method.

Return type

None

Patch-based dataset

GridPatchDataset

class monai.data.GridPatchDataset(dataset, patch_size, start_pos=(), mode=<NumpyPadMode.WRAP: 'wrap'>, **pad_opts)[source]

Yields patches from arrays read from an input dataset. The patches are chosen in a contiguous grid sampling scheme.

Initializes this dataset in terms of the input dataset and patch size. The patch_size is the size of the patch to sample from the input arrays. It is assumed that the array's first dimension is the channel dimension, which will be yielded in its entirety, so it should not be specified in patch_size. For example, for an input 3D array with 1 channel of size (1, 20, 20, 20), a regular grid sampling of eight (1, 10, 10, 10) patches would be specified by a patch_size of (10, 10, 10).

Parameters
  • dataset (Dataset) – the dataset to read array data from

  • patch_size (Sequence[int]) – size of patches to generate slices for, 0/None selects whole dimension

  • start_pos (Sequence[int]) – starting position in the array, default is 0 for each dimension

  • mode (Union[NumpyPadMode, str]) – {"constant", "edge", "linear_ramp", "maximum", "mean", "median", "minimum", "reflect", "symmetric", "wrap", "empty"} One of the listed string values or a user supplied function. Defaults to "wrap". See also: https://numpy.org/doc/1.18/reference/generated/numpy.pad.html

  • pad_opts (Dict) – padding options, see numpy.pad
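
A sketch under the assumption that each item of the wrapped dataset is a sequence of arrays (for example an image/segmentation pair), following the (1, 20, 20, 20) example above; the data are synthetic placeholders.

import numpy as np
from monai.data import GridPatchDataset

# hypothetical image/segmentation pairs of single-channel 3D volumes
pairs = [(np.random.rand(1, 20, 20, 20).astype(np.float32),
          np.random.randint(0, 2, size=(1, 20, 20, 20)).astype(np.float32))
         for _ in range(2)]
patch_ds = GridPatchDataset(dataset=pairs, patch_size=(10, 10, 10))
for img_patch, seg_patch in patch_ds:
    print(img_patch.shape, seg_patch.shape)  # expected: (1, 10, 10, 10) each, eight patches per pair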

Image reader

ITKReader

class monai.data.ITKReader(**kwargs)[source]

Load medical images based on the ITK library. All the supported image formats can be found at: https://github.com/InsightSoftwareConsortium/ITK/tree/master/Modules/IO The loaded data array will be in C order; for example, a 3D image will be CDWH.

Parameters

kwargs – additional args for itk.imread API. more details about available args: https://github.com/InsightSoftwareConsortium/ITK/blob/master/Wrapping/Generators/Python/itkExtras.py

get_data(img)[source]

Extract the data array and meta data from the loaded image and return them. This function returns two objects: the first is a numpy array of image data, the second is a dict of meta data. It constructs affine, original_affine and spatial_shape and stores them in the meta dict. If loading a list of files, they are stacked together with a new first dimension, and the meta data of the first image is used to represent the stacked result.

Parameters

img – an ITK image object loaded from an image file, or a list of ITK image objects.

read(data, **kwargs)[source]

Read image data from the specified file or files. Note that the returned object is an ITK image object or a list of ITK image objects.

Parameters
  • data – file name or a list of file names to read.

  • kwargs – additional args for the underlying itk.imread API.

verify_suffix(filename)[source]

Verify whether the specified file or files format is supported by ITK reader.

Parameters

filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

Return type

bool
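
A minimal sketch of the reader workflow documented above; 'image.nii.gz' is a placeholder file name. The other readers below (NibabelReader, NumpyReader, PILReader) follow the same read/get_data pattern.

from monai.data import ITKReader

reader = ITKReader()
if reader.verify_suffix('image.nii.gz'):   # check that ITK supports the file format
    img = reader.read('image.nii.gz')      # an ITK image object
    data, meta = reader.get_data(img)      # numpy array plus meta data dict
    print(data.shape, meta['spatial_shape'])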

NibabelReader

class monai.data.NibabelReader(as_closest_canonical=False, **kwargs)[source]

Load NIfTI format images based on Nibabel library.

Parameters
get_data(img)[source]

Extract the data array and meta data from the loaded image and return them. This function returns two objects: the first is a numpy array of image data, the second is a dict of meta data. It constructs affine, original_affine and spatial_shape and stores them in the meta dict. If loading a list of files, they are stacked together with a new first dimension, and the meta data of the first image is used to represent the stacked result.

Parameters

img – a Nibabel image object loaded from an image file, or a list of Nibabel image objects.

read(data, **kwargs)[source]

Read image data from the specified file or files. Note that the returned object is a Nibabel image object or a list of Nibabel image objects.

Parameters
  • data – file name or a list of file names to read.

  • kwargs – additional args for the underlying nibabel.load API.

verify_suffix(filename)[source]

Verify whether the specified file or files format is supported by Nibabel reader.

Parameters

filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

Return type

bool

NumpyReader

class monai.data.NumpyReader(npz_keys=None, **kwargs)[source]

Load NPY or NPZ format data based on the Numpy library; the data can be arrays or pickled objects. A typical usage is loading mask data for a classification task. Part of an npz file can be loaded by specifying npz_keys.

Parameters
  • npz_keys (Union[Collection[Hashable], Hashable, None]) – if loading an npz file, only load the specified keys; if None, load all the items. The loaded items are stacked together to construct a new first dimension.

  • kwargs – additional args for numpy.load API except allow_pickle. more details about available args: https://numpy.org/doc/stable/reference/generated/numpy.load.html

get_data(img)[source]

Extract the data array and meta data from the loaded data and return them. This function returns two objects: the first is a numpy array of image data, the second is a dict of meta data. It constructs spatial_shape=data.shape and stores it in the meta dict if the data is a numpy array. If loading a list of files, they are stacked together with a new first dimension, and the meta data of the first image is used to represent the stacked result.

Parameters

img – a Numpy array loaded from a file or a list of Numpy arrays.

read(data, **kwargs)[source]

Read image data from the specified file or files. Note that the returned object is a Numpy array or a list of Numpy arrays.

Parameters
  • data – file name or a list of file names to read.

  • kwargs – additional args for the underlying numpy.load API.

verify_suffix(filename)[source]

Verify whether the specified file or files format is supported by Numpy reader.

Parameters

filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

Return type

bool

PILReader

class monai.data.PILReader(converter=None, **kwargs)[source]

Load common 2D image formats (PNG, JPG and BMP are supported) from the provided file path or paths.

Parameters
get_data(img)[source]

Extract the data array and meta data from the loaded data and return them. This function returns two objects: the first is a numpy array of image data, the second is a dict of meta data. It constructs spatial_shape and stores it in the meta dict. If loading a list of files, they are stacked together with a new first dimension, and the meta data of the first image is used to represent the stacked result.

Parameters

img – a PIL Image object loaded from a file or a list of PIL Image objects.

read(data, **kwargs)[source]

Read image data from the specified file or files. Note that the returned object is a PIL Image object or a list of PIL Image objects.

Parameters
  • data – file name or a list of file names to read.

  • kwargs – additional args for the underlying PIL Image.open API.

verify_suffix(filename)[source]

Verify whether the specified file or files format is supported by PIL reader.

Parameters

filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

Return type

bool

Nifti format handling

Reading

class monai.data.NiftiDataset(image_files, seg_files=None, labels=None, as_closest_canonical=False, transform=None, seg_transform=None, image_only=True, dtype=<class 'numpy.float32'>)[source]

Loads image/segmentation pairs of Nifti files from the given filename lists. Transformations can be specified for the image and segmentation arrays separately.

Initializes the dataset with the image and segmentation filename lists. The transform transform is applied to the images and seg_transform to the segmentations.

Parameters
  • image_files (Sequence[str]) – list of image filenames

  • seg_files (Optional[Sequence[str]]) – if in segmentation task, list of segmentation filenames

  • labels (Optional[Sequence[float]]) – if in classification task, list of classification labels

  • as_closest_canonical (bool) – if True, load the image as closest to canonical orientation

  • transform (Optional[Callable]) – transform to apply to image arrays

  • seg_transform (Optional[Callable]) – transform to apply to segmentation arrays

  • image_only (bool) – if True, return only the image volume; otherwise return the image volume and the header dict

  • dtype (Optional[dtype]) – if not None convert the loaded image to this data type

Raises

ValueError – When seg_files length differs from image_files.
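
A brief sketch assuming hypothetical file lists; with seg_files provided, indexing is expected to return an (image, segmentation) pair after the respective transforms.

from monai.data import NiftiDataset
from monai.transforms import AddChannel, Compose, ScaleIntensity, ToTensor

images = ['image1.nii.gz', 'image2.nii.gz']   # placeholder paths
segs = ['label1.nii.gz', 'label2.nii.gz']
ds = NiftiDataset(image_files=images, seg_files=segs,
                  transform=Compose([AddChannel(), ScaleIntensity(), ToTensor()]),
                  seg_transform=Compose([AddChannel(), ToTensor()]))
img, seg = ds[0]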

randomize(data=None)[source]

Within this method, self.R should be used, instead of np.random, to introduce random factors.

All self.R calls happen here so that we have a better chance of identifying errors in synchronizing the random state.

This method can generate the random factors based on properties of the input data.

Raises

NotImplementedError – When the subclass does not override this method.

Return type

None

Writing Nifti

class monai.data.NiftiSaver(output_dir='./', output_postfix='seg', output_ext='.nii.gz', resample=True, mode=<GridSampleMode.BILINEAR: 'bilinear'>, padding_mode=<GridSamplePadMode.BORDER: 'border'>, align_corners=False, dtype=<class 'numpy.float64'>)[source]

Save data as NIfTI files; both single data content and a batch of data are supported. Typically the data are segmentation predictions: call save for single data or call save_batch to save a batch of data together. If no meta data is provided, an index starting from 0 is used as the filename prefix.

Parameters
  • output_dir (str) – output image directory.

  • output_postfix (str) – a string appended to all output file names.

  • output_ext (str) – output file extension name.

  • resample (bool) – whether to resample before saving the data array.

  • mode (Union[GridSampleMode, str]) – {"bilinear", "nearest"} This option is used when resample = True. Interpolation mode to calculate output values. Defaults to "bilinear". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • padding_mode (Union[GridSamplePadMode, str]) – {"zeros", "border", "reflection"} This option is used when resample = True. Padding mode for outside grid values. Defaults to "border". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • align_corners (bool) – Geometrically, we consider the pixels of the input as squares rather than points. See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • dtype (Optional[dtype]) – data type for resampling computation. Defaults to np.float64 for best precision. If None, use the data type of input data. To be compatible with other modules, the output data type is always np.float32.
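
A small sketch of saving a single channel-first prediction; the array, meta data and output directory are placeholders, and the output file name is derived from 'filename_or_obj' and output_postfix.

import numpy as np
from monai.data import NiftiSaver

saver = NiftiSaver(output_dir='./out', output_postfix='seg')
pred = np.zeros((1, 64, 64, 64), dtype=np.float32)               # channel-first prediction
meta = {'filename_or_obj': 'image1.nii.gz', 'affine': np.eye(4)}
saver.save(pred, meta_data=meta)                                 # writes a .nii.gz file under ./out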

save(data, meta_data=None)[source]

Save data into a Nifti file. The meta_data could optionally have the following keys:

  • 'filename_or_obj' – for output file name creation, corresponding to filename or object.

  • 'original_affine' – for data orientation handling, defaulting to an identity matrix.

  • 'affine' – for data output affine, defaulting to an identity matrix.

  • 'spatial_shape' – for data output shape.

When meta_data is specified, the saver will try to resample batch data from the space defined by “affine” to the space defined by “original_affine”.

If meta_data is None, use the default index (starting from 0) as the filename.

Parameters
  • data (Union[Tensor, ndarray]) – target data content to be saved as a NIfTI format file. The data shape is assumed to start with a channel dimension followed by spatial dimensions.

  • meta_data (Optional[Dict]) – the meta data information corresponding to the data.

See Also

monai.data.nifti_writer.write_nifti()

Return type

None

save_batch(batch_data, meta_data=None)[source]

Save a batch of data into Nifti format files.

Spatially it supports up to three dimensions, that is, H, HW, HWD for 1D, 2D, 3D respectively (with resampling supports for 2D and 3D only).

When saving multiple time steps or multiple channels batch_data, time and/or modality axes should be appended after the batch dimensions. For example, the shape of a batch of 2D eight-class segmentation probabilities to be saved could be (batch, 8, 64, 64); in this case each item in the batch will be saved as (64, 64, 1, 8) NIfTI file (the third dimension is reserved as a spatial dimension).

Parameters
  • batch_data (Union[Tensor, ndarray]) – target batch data content to save in NIfTI format.

  • meta_data (Optional[Dict]) – every key-value pair in meta_data corresponds to a batch of data.

Return type

None

monai.data.write_nifti(data, file_name, affine=None, target_affine=None, resample=True, output_spatial_shape=None, mode=<GridSampleMode.BILINEAR: 'bilinear'>, padding_mode=<GridSamplePadMode.BORDER: 'border'>, align_corners=False, dtype=<class 'numpy.float64'>)[source]

Write numpy data into NIfTI files to disk. This function converts data into the coordinate system defined by target_affine when target_affine is specified.

If the coordinate transform between affine and target_affine can be achieved by simply transposing and flipping data, no resampling will happen; otherwise this function will resample data using the coordinate transform computed from affine and target_affine. Note that the shape of the resampled data may be subject to some rounding errors. For example, resampling a 20x20-pixel image from pixel size (1.5, 1.5) mm to (3.0, 3.0) mm space will return a 10x10-pixel image. However, resampling a 20x20-pixel image from pixel size (2.0, 2.0) mm to (3.0, 3.0) mm space will output a 14x14-pixel image, where the image shape is rounded up from 13.333x13.333 pixels. In this case output_spatial_shape can be specified so that this function writes image data to a designated shape.

When affine and target_affine are None, the data will be saved with an identity matrix as the image affine.

This function assumes the NIfTI dimension notations. Spatially it supports up to three dimensions, that is, H, HW, HWD for 1D, 2D, 3D respectively. When saving multiple time steps or multiple channels data, time and/or modality axes should be appended after the first three dimensions. For example, shape of 2D eight-class segmentation probabilities to be saved could be (64, 64, 1, 8). Also, data in shape (64, 64, 8), (64, 64, 8, 1) will be considered as a single-channel 3D image.

Parameters
  • data (ndarray) – input data to write to file.

  • file_name (str) – expected file name that saved on disk.

  • affine (Optional[ndarray]) – the current affine of data. Defaults to np.eye(4)

  • target_affine (Optional[ndarray]) – before saving the (data, affine) as a Nifti1Image, transform the data into the coordinates defined by target_affine.

  • resample (bool) – whether to run resampling when the target affine could not be achieved by swapping/flipping data axes.

  • output_spatial_shape (Optional[Sequence[int]]) – spatial shape of the output image. This option is used when resample = True.

  • mode (Union[GridSampleMode, str]) – {"bilinear", "nearest"} This option is used when resample = True. Interpolation mode to calculate output values. Defaults to "bilinear". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • padding_mode (Union[GridSamplePadMode, str]) – {"zeros", "border", "reflection"} This option is used when resample = True. Padding mode for outside grid values. Defaults to "border". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • align_corners (bool) – Geometrically, we consider the pixels of the input as squares rather than points. See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • dtype (Optional[dtype]) – data type for resampling computation. Defaults to np.float64 for best precision. If None, use the data type of input data. To be compatible with other modules, the output data type is always np.float32.

Return type

None

PNG format handling

Writing PNG

class monai.data.PNGSaver(output_dir='./', output_postfix='seg', output_ext='.png', resample=True, mode=<InterpolateMode.NEAREST: 'nearest'>, scale=None)[source]

Save data as PNG files; both single data content and a batch of data are supported. Typically the data are segmentation predictions: call save for single data or call save_batch to save a batch of data together. If no meta data is provided, an index starting from 0 is used as the filename prefix.

Parameters
  • output_dir (str) – output image directory.

  • output_postfix (str) – a string appended to all output file names.

  • output_ext (str) – output file extension name.

  • resample (bool) – whether to resample and resize if providing spatial_shape in the metadata.

  • mode (Union[InterpolateMode, str]) – {"nearest", "linear", "bilinear", "bicubic", "trilinear", "area"} The interpolation mode. Defaults to "nearest". See also: https://pytorch.org/docs/stable/nn.functional.html#interpolate

  • scale (Optional[int]) – {255, 65535} postprocess data by clipping to [0, 1] and scaling [0, 255] (uint8) or [0, 65535] (uint16). Default is None to disable scaling.

save(data, meta_data=None)[source]

Save data into a png file. The meta_data could optionally have the following keys:

  • 'filename_or_obj' – for output file name creation, corresponding to filename or object.

  • 'spatial_shape' – for data output shape.

If meta_data is None, use the default index (starting from 0) as the filename.

Parameters
  • data (Union[Tensor, ndarray]) – target data content to be saved as a PNG format file. The data is assumed to have shape (C, H, W), where C should be 1, 3 or 4.

  • meta_data (Optional[Dict]) – the meta data information corresponding to the data.

Raises

ValueError – When data channels is not one of [1, 3, 4].

See Also

monai.data.png_writer.write_png()

Return type

None

save_batch(batch_data, meta_data=None)[source]

Save a batch of data into png format files.

Parameters
  • batch_data (Union[Tensor, ndarray]) – target batch data content to save in PNG format.

  • meta_data (Optional[Dict]) – every key-value pair in meta_data corresponds to a batch of data.

Return type

None

monai.data.write_png(data, file_name, output_spatial_shape=None, mode=<InterpolateMode.BICUBIC: 'bicubic'>, scale=None)[source]

Write numpy data to PNG files on disk. Spatially it supports 2D data of shape (H, W), (H, W, 3) or (H, W, 4). If scale is None, the input data is expected to be of type np.uint8 or np.uint16. It is based on the Image module of the PIL library: https://pillow.readthedocs.io/en/stable/reference/Image.html

Parameters
  • data (ndarray) – input data to write to file.

  • file_name (str) – expected file name that saved on disk.

  • output_spatial_shape (Optional[Sequence[int]]) – spatial shape of the output image.

  • mode (Union[InterpolateMode, str]) – {"nearest", "linear", "bilinear", "bicubic", "trilinear", "area"} The interpolation mode. Defaults to "bicubic". See also: https://pytorch.org/docs/stable/nn.functional.html#interpolate

  • scale (Optional[int]) – {255, 65535} postprocess data by clipping to [0, 1] and scaling to [0, 255] (uint8) or [0, 65535] (uint16). Default is None to disable scaling.

Raises

ValueError – When scale is not one of [255, 65535].

Return type

None

Synthetic

monai.data.synthetic.create_test_image_2d(width, height, num_objs=12, rad_max=30, noise_max=0.0, num_seg_classes=5, channel_dim=None, random_state=None)[source]

Return a noisy 2D image with num_objs circles and a 2D mask image. The maximum radius of the circles is given as rad_max. The mask will have num_seg_classes number of classes for segmentations labeled sequentially from 1, plus a background class represented as 0. If noise_max is greater than 0 then noise will be added to the image taken from the uniform distribution on range [0,noise_max). If channel_dim is None, will create an image without channel dimension, otherwise create an image with channel dimension as first dim or last dim.

Parameters
  • width (int) – width of the image.

  • height (int) – height of the image.

  • num_objs (int) – number of circles to generate. Defaults to 12.

  • rad_max (int) – maximum circle radius. Defaults to 30.

  • noise_max (float) – if greater than 0 then noise will be added to the image taken from the uniform distribution on range [0,noise_max). Defaults to 0.

  • num_seg_classes (int) – number of classes for segmentations. Defaults to 5.

  • channel_dim (Optional[int]) – if None, create an image without channel dimension, otherwise create an image with channel dimension as first dim or last dim. Defaults to None.

  • random_state (Optional[RandomState]) – the random generator to use. Defaults to np.random.

Return type

Tuple[ndarray, ndarray]
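
A quick sketch generating a synthetic image/segmentation pair, which is convenient for exercising the datasets and writers above; the sizes are arbitrary.

import numpy as np
from monai.data.synthetic import create_test_image_2d

img, seg = create_test_image_2d(128, 128, num_objs=6, rad_max=20, noise_max=0.1,
                                num_seg_classes=3, random_state=np.random.RandomState(0))
print(img.shape, seg.shape)  # (128, 128) (128, 128); pass channel_dim=0 for channel-first outputs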

monai.data.synthetic.create_test_image_3d(height, width, depth, num_objs=12, rad_max=30, noise_max=0.0, num_seg_classes=5, channel_dim=None, random_state=None)[source]

Return a noisy 3D image and segmentation.

Parameters
  • height (int) – height of the image.

  • width (int) – width of the image.

  • depth (int) – depth of the image.

  • num_objs (int) – number of circles to generate. Defaults to 12.

  • rad_max (int) – maximum circle radius. Defaults to 30.

  • noise_max (float) – if greater than 0 then noise will be added to the image taken from the uniform distribution on range [0,noise_max). Defaults to 0.

  • num_seg_classes (int) – number of classes for segmentations. Defaults to 5.

  • channel_dim (Optional[int]) – if None, create an image without channel dimension, otherwise create an image with channel dimension as first dim or last dim. Defaults to None.

  • random_state (Optional[RandomState]) – the random generator to use. Defaults to np.random.

Return type

Tuple[ndarray, ndarray]

Utilities

monai.data.utils.compute_importance_map(patch_size, mode=<BlendMode.CONSTANT: 'constant'>, sigma_scale=0.125, device=None)[source]

Get importance map for different weight modes.

Parameters
  • patch_size (Tuple[int, …]) – Size of the required importance map. This should be either H, W [,D].

  • mode (Union[BlendMode, str]) –

    {"constant", "gaussian"} How to blend output of overlapping windows. Defaults to "constant".

    • "constant”: gives equal weight to all predictions.

    • "gaussian”: gives less weight to predictions on edges of windows.

  • sigma_scale (Union[Sequence[float], float]) – Sigma_scale to calculate sigma for each dimension (sigma = sigma_scale * dim_size). Used for gaussian mode only.

  • device (Optional[device]) – Device to put importance map on.

Raises

ValueError – When mode is not one of [“constant”, “gaussian”].

Return type

Tensor

Returns

Tensor of size patch_size.

monai.data.utils.compute_shape_offset(spatial_shape, in_affine, out_affine)[source]

Given input and output affine, compute appropriate shapes in the output space based on the input array’s shape. This function also returns the offset to put the shape in a good position with respect to the world coordinate system.

Parameters
  • spatial_shape (ndarray) – input array’s shape

  • in_affine (matrix) – 2D affine matrix

  • out_affine (matrix) – 2D affine matrix

Return type

Tuple[ndarray, ndarray]

monai.data.utils.correct_nifti_header_if_necessary(img_nii)[source]

Check the nifti object header's format and update the header if needed. In the updated image, pixdim matches the affine.

Parameters

img_nii – nifti image object

monai.data.utils.create_file_basename(postfix, input_file_name, folder_path, data_root_dir='')[source]

Utility function to create the path to the output file based on the input filename (extension is added by lib level writer before writing the file)

Parameters
  • postfix (str) – output name’s postfix

  • input_file_name (str) – path to the input image file.

  • folder_path (str) – path for the output file

  • data_root_dir (str) – if not empty, it specifies the beginning parts of the input file’s absolute path. This is used to compute input_file_rel_path, the relative path to the file from data_root_dir to preserve folder structure when saving in case there are files in different folders with the same file names.

Return type

str

monai.data.utils.dense_patch_slices(image_size, patch_size, scan_interval)[source]

Enumerate all slices defining 2D/3D patches of size patch_size from an image_size input image.

Parameters
  • image_size (Sequence[int]) – dimensions of image to iterate over

  • patch_size (Sequence[int]) – size of patches to generate slices

  • scan_interval (Sequence[int]) – dense patch sampling interval

Raises

ValueError – When image_size length is not one of [2, 3].

Return type

List[Tuple[slice, …]]

Returns

a list of slice objects defining each patch

monai.data.utils.get_random_patch(dims, patch_size, rand_state=None)[source]

Returns a tuple of slices to define a random patch in an array of shape dims with size patch_size, or as close to it as possible within the given dimensions. It is expected that patch_size is a valid patch for a source of shape dims, as returned by get_valid_patch_size.

Parameters
  • dims (Sequence[int]) – shape of source array

  • patch_size (Sequence[int]) – shape of patch size to generate

  • rand_state (Optional[RandomState]) – a random state object to generate random numbers from

Returns

a tuple of slice objects defining the patch

Return type

(tuple of slice)

monai.data.utils.get_valid_patch_size(image_size, patch_size)[source]

Given an image of dimensions image_size, return a patch size tuple taking the dimension from patch_size if this is not 0/None. Otherwise, or if patch_size is shorter than image_size, the dimension from image_size is taken. This ensures the returned patch size is within the bounds of image_size. If patch_size is a single number this is interpreted as a patch of the same dimensionality of image_size with that size in each dimension.

Return type

Tuple[int, …]

monai.data.utils.is_supported_format(filename, suffixes)[source]

Verify whether the specified file or files format match supported suffixes. If supported suffixes is None, skip the verification and return True.

Parameters
  • filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

  • suffixes (Sequence[str]) – all the supported image suffixes of current reader, must be a list of lower case suffixes.

Return type

bool

monai.data.utils.iter_patch(arr, patch_size=0, start_pos=(), copy_back=True, mode=<NumpyPadMode.WRAP: 'wrap'>, **pad_opts)[source]

Yield successive patches from arr of size patch_size. The iteration can start from position start_pos in arr but drawing from a padded array extended by the patch_size in each dimension (so these coordinates can be negative to start in the padded region). If copy_back is True the values from each patch are written back to arr.

Parameters
  • arr (ndarray) – array to iterate over

  • patch_size (Union[Sequence[int], int]) – size of patches to generate slices for, 0 or None selects whole dimension

  • start_pos (Sequence[int]) – starting position in the array, default is 0 for each dimension

  • copy_back (bool) – if True data from the yielded patches is copied back to arr once the generator completes

  • mode (Union[NumpyPadMode, str]) – {"constant", "edge", "linear_ramp", "maximum", "mean", "median", "minimum", "reflect", "symmetric", "wrap", "empty"} One of the listed string values or a user supplied function. Defaults to "wrap". See also: https://numpy.org/doc/1.18/reference/generated/numpy.pad.html

  • pad_opts (Dict) – padding options, see numpy.pad

Yields

Patches of array data from arr which are views into a padded array which can be modified, if copy_back is True these changes will be reflected in arr once the iteration completes.

Return type

Generator[ndarray, None, None]
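
A tiny sketch iterating (2, 2) patches over a 4x4 array; with copy_back=False the yielded patches can be inspected without modifying the source array.

import numpy as np
from monai.data.utils import iter_patch

arr = np.arange(16, dtype=np.float32).reshape(4, 4)
for patch in iter_patch(arr, patch_size=(2, 2), copy_back=False):
    print(patch.shape)  # four (2, 2) patches covering the array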

monai.data.utils.iter_patch_slices(dims, patch_size, start_pos=())[source]

Yield successive tuples of slices defining patches of size patch_size from an array of dimensions dims. The iteration starts from position start_pos in the array, or starting at the origin if this isn’t provided. Each patch is chosen in a contiguous grid using a first dimension as least significant ordering.

Parameters
  • dims (Sequence[int]) – dimensions of array to iterate over

  • patch_size (Union[Sequence[int], int]) – size of patches to generate slices for, 0 or None selects whole dimension

  • start_pos (Sequence[int]) – starting position in the array, default is 0 for each dimension

Yields

Tuples of slice objects defining each patch

Return type

Generator[Tuple[slice, …], None, None]

monai.data.utils.list_data_collate(batch)[source]

Enhancement of the PyTorch DataLoader default collate. If the dataset already returns a list of batch data generated in the transforms, all the data need to be merged into one list; the behaviour is then the same as the default collate.

Note

This collate function is needed when applying transforms that can generate batch data.
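
For example, when a transform such as RandCropByPosNegLabeld with num_samples > 1 turns each dataset item into a list of samples, this collate function can be passed to a plain PyTorch DataLoader (monai.data.DataLoader adds it by default); ds is a placeholder dataset.

from torch.utils.data import DataLoader
from monai.data.utils import list_data_collate

# flatten the per-item lists before the default collation
loader = DataLoader(ds, batch_size=2, num_workers=2, collate_fn=list_data_collate)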

monai.data.utils.rectify_header_sform_qform(img_nii)[source]

Look at the sform and qform of the nifti object and correct them if there are any incompatibilities with the pixel dimensions.

Adapted from https://github.com/NifTK/NiftyNet/blob/v0.6.0/niftynet/io/misc_io.py

Parameters

img_nii – nifti image object

monai.data.utils.to_affine_nd(r, affine)[source]

Using elements from affine, create a new affine matrix by assigning the rotation/zoom/scaling matrix and the translation vector.

When r is an integer, the output is an (r+1)x(r+1) matrix, where the top-left kxk elements are copied from affine and the last column of the output affine is copied from affine's last column. k is determined by min(r, len(affine) - 1).

When r is an affine matrix, the output has the same shape as r; the top-left kxk elements are copied from affine and the last column of the output affine is copied from affine's last column. k is determined by min(len(r) - 1, len(affine) - 1).

Parameters
  • r (int or matrix) – number of spatial dimensions or an output affine to be filled.

  • affine (matrix) – 2D affine matrix

Raises
  • ValueError – When affine dimensions is not 2.

  • ValueError – When r is nonpositive.

Return type

ndarray

Returns

an (r+1) x (r+1) matrix

monai.data.utils.worker_init_fn(worker_id)[source]

Callback function for PyTorch DataLoader worker_init_fn. It can set different random seed for the transforms in different workers.

Return type

None

monai.data.utils.zoom_affine(affine, scale, diagonal=True)[source]

Make the column norm of affine the same as scale. If diagonal is False, returns an affine that combines an orthogonal rotation with the new scale; this is done by first decomposing affine, then setting the zoom factors to scale and composing a new affine, with the shearing factors removed. If diagonal is True, returns a diagonal matrix whose diagonal elements are set to the scaling factors. This function always returns an affine with zero translations.

Parameters
  • affine (nxn matrix) – a square matrix.

  • scale (Sequence[float]) – new scaling factor along each dimension.

  • diagonal (bool) – whether to return a diagonal scaling matrix. Defaults to True.

Raises
  • ValueError – When affine is not a square matrix.

  • ValueError – When scale contains a nonpositive scalar.

Return type

ndarray

Returns

the updated n x n affine.

Decathlon Datalist

monai.data.load_decathlon_datalist(data_list_file_path, is_segmentation=True, data_list_key='training', base_dir=None)[source]

Load image/label paths of the Decathlon challenge from a JSON file.

The JSON file is similar to the dataset.json files available from http://medicaldecathlon.com/.

Parameters
  • data_list_file_path (str) – the path to the json file of datalist.

  • is_segmentation (bool) – whether the datalist is for segmentation task, default is True.

  • data_list_key (str) – the key to get a list of dictionary to be used, default is “training”.

  • base_dir (Optional[str]) – the base directory of the dataset, if None, use the datalist directory.

Raises
  • ValueError – When data_list_file_path does not point to a file.

  • ValueError – When data_list_key is not specified in the data list file.

Returns a list of data items, each of which is a dict keyed by element names, for example:

[
    {'image': '/workspace/data/chest_19.nii.gz',  'label': 0},
    {'image': '/workspace/data/chest_31.nii.gz',  'label': 1}
]
Return type

List[Dict]
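
A short sketch; 'dataset.json' and the base directory are placeholder paths pointing at a Decathlon-style datalist.

from monai.data import load_decathlon_datalist

train_files = load_decathlon_datalist('dataset.json', is_segmentation=True,
                                      data_list_key='training', base_dir='/workspace/data')
print(train_files[0])  # e.g. {'image': '/workspace/data/...', 'label': '/workspace/data/...'}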

DataLoader

monai.data.DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, batch_sampler=None, num_workers=0, pin_memory=False, drop_last=False, timeout=0.0, multiprocessing_context=None)[source]

Generates images/labels for train/validation/testing from a dataset. It inherits from the PyTorch DataLoader and adds default callbacks for collate_fn and worker_init_fn.

Parameters
  • dataset (Dataset) – dataset from which to load the data.

  • batch_size (int) – how many samples per batch to load (default: 1).

  • shuffle (bool) – set to True to have the data reshuffled at every epoch (default: False).

  • sampler (Optional[Sampler]) – defines the strategy to draw samples from the dataset. If specified, shuffle must be False.

  • batch_sampler (Optional[Sampler]) – like sampler, but returns a batch of indices at a time. Mutually exclusive with batch_size, shuffle, sampler, and drop_last.

  • num_workers (int) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)

  • pin_memory (bool) – If True, the data loader will copy Tensors into CUDA pinned memory before returning them. If your data elements are a custom type, or your collate_fn returns a batch that is a custom type, refer to the PyTorch DataLoader documentation.

  • drop_last (bool) – set to True to drop the last incomplete batch, if the dataset size is not divisible by the batch size. If False and the size of dataset is not divisible by the batch size, then the last batch will be smaller. (default: False)

  • timeout (float) – if positive, the timeout value for collecting a batch from workers. Should always be non-negative. (default: 0)

  • multiprocessing_context (Optional[Callable]) – specify a valid start method for multi-processing.
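
A closing sketch tying the pieces together; ds is any of the datasets above and the loop body is a placeholder for the training or validation step.

from monai.data import DataLoader

loader = DataLoader(ds, batch_size=2, shuffle=True, num_workers=4, pin_memory=True)
for batch in loader:
    images, segs = batch['img'], batch['seg']  # keys follow the dictionary examples above
    # forward/backward pass goes here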