Data

Generic Interfaces

Dataset

class Dataset(data, transform=None)[source]

Generic dataset for dictionary-format data; it can apply transforms to specific fields. For example, typical input data can be a list of dictionaries:

[{                            {                            {
     'img': 'image1.nii.gz',      'img': 'image2.nii.gz',      'img': 'image3.nii.gz',
     'seg': 'label1.nii.gz',      'seg': 'label2.nii.gz',      'seg': 'label3.nii.gz',
     'extra': 123                 'extra': 456                 'extra': 789
 },                           },                           }]
Parameters
  • data (Iterable) – input data to load and transform to generate dataset for model.

  • transform (Callable, optional) – transforms to execute operations on input data.
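
As a minimal sketch (the file names are illustrative and import paths may differ across MONAI versions), a Dataset over the dictionary items above could be built as follows:

from monai.data import Dataset
from monai.transforms import Compose, LoadNiftid, AddChanneld

items = [
    {'img': 'image1.nii.gz', 'seg': 'label1.nii.gz', 'extra': 123},
    {'img': 'image2.nii.gz', 'seg': 'label2.nii.gz', 'extra': 456},
]
transform = Compose([
    LoadNiftid(keys=['img', 'seg']),   # load the files named by these keys
    AddChanneld(keys=['img', 'seg']),  # prepend a channel dimension
])
ds = Dataset(data=items, transform=transform)
first = ds[0]  # dict with transformed 'img'/'seg' arrays; 'extra' is untouched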

CacheDataset

class CacheDataset(data, transform, cache_num=9223372036854775807, cache_rate=1.0)[source]

Dataset with a cache mechanism that loads data and caches the results of deterministic transforms during training.

By caching the results of non-random preprocessing transforms, it accelerates the training data pipeline. If the requested data is not in the cache, all transforms will run normally (see also monai.data.dataset.Dataset).

Users can set the cache rate or the number of items to cache. It is recommended to experiment with different cache_num or cache_rate values to identify the best training speed.

To improve caching efficiency, always place as many non-random transforms as possible before the randomised ones when composing the chain of transforms.

For example, if the transform is a Compose of:

transforms = Compose([
    LoadNiftid(),
    AddChanneld(),
    Spacingd(),
    Orientationd(),
    ScaleIntensityRanged(),
    RandCropByPosNegLabeld(),
    ToTensord()
])

when transforms is used in a multi-epoch training pipeline, before the first training epoch this dataset will cache the results up to ScaleIntensityRanged, as all of the non-random transforms LoadNiftid, AddChanneld, Spacingd, Orientationd and ScaleIntensityRanged can be cached. During training, the dataset will load the cached results and run RandCropByPosNegLabeld and ToTensord, as RandCropByPosNegLabeld is a randomised transform and its outcome is not cached.

Parameters
  • data (Iterable) – input data to load and transform to generate dataset for model.

  • transform (Callable) – transforms to execute operations on input data.

  • cache_num (int) – number of items to be cached. Default is sys.maxsize. The effective number of cached items is the minimum of (cache_num, data_length x cache_rate, data_length).

  • cache_rate (float) – percentage of the data to cache, default is 1.0 (cache all). The effective number of cached items is the minimum of (cache_num, data_length x cache_rate, data_length).
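
As a minimal sketch, the transform chain shown above could be wrapped in a CacheDataset as follows (reusing the items list from the Dataset example; import paths may vary by version):

from monai.data import CacheDataset

cache_ds = CacheDataset(
    data=items,            # the list of dicts from the Dataset example above
    transform=transforms,  # the Compose shown above
    cache_rate=1.0,        # cache the deterministic results for every item
)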

Patch-based dataset

GridPatchDataset

class GridPatchDataset(dataset, patch_size, start_pos=(), pad_mode='wrap', **pad_opts)[source]

Yields patches from arrays read from an input dataset. The patches are chosen in a contiguous grid sampling scheme.

Initializes this dataset in terms of the input dataset and patch size. The patch_size is the size of the patch to sample from the input arrays. It is assumed that the array's first dimension is the channel dimension, which will be yielded in its entirety, so it should not be specified in patch_size. For example, for an input 3D array with 1 channel of size (1, 20, 20, 20), a regular grid sampling of eight patches of shape (1, 10, 10, 10) would be specified by a patch_size of (10, 10, 10).

Parameters
  • dataset (Dataset) – the dataset to read array data from

  • patch_size (tuple of int or None) – size of patches to generate slices for, 0/None selects whole dimension

  • start_pos (tuple of int, optional) – starting position in the array, default is 0 for each dimension

  • pad_mode (str, optional) – padding mode, see numpy.pad

  • pad_opts (dict, optional) – padding options, see numpy.pad
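
A minimal sketch of the (1, 20, 20, 20) example above, assuming GridPatchDataset can iterate over a plain list of channel-first arrays (import path may vary by version):

import numpy as np
from monai.data import GridPatchDataset

images = [np.zeros((1, 20, 20, 20), dtype=np.float32)]  # channel-first arrays
patch_ds = GridPatchDataset(dataset=images, patch_size=(10, 10, 10))
patches = list(patch_ds)  # eight (1, 10, 10, 10) patches per input array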

Sliding window inference

sliding_window_inference(inputs, roi_size, sw_batch_size, predictor)[source]

Use the sliding window method to execute inference.

Parameters
  • inputs (torch Tensor) – input image to be processed (assuming NCHW[D])

  • roi_size (list, tuple) – the window size for sliding window inference.

  • sw_batch_size (int) – the batch size to run window slices.

  • predictor (Callable) – given input tensor patch_data in shape NCHW[D], predictor(patch_data) should return a prediction with the same spatial shape and batch_size, i.e. NMHW[D]; where HW[D] represents the patch spatial size, M is the number of output channels, N is sw_batch_size.

Note

The input must be channel-first and include a batch dimension; both 2D and 3D data are supported. Inference is executed on one image at a time, running a batch of window slices over that single input image.
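
A minimal sketch (the import path has moved between MONAI versions; a trivial convolution stands in for a trained network):

import torch
from monai.inferers import sliding_window_inference  # path may differ by version

net = torch.nn.Conv3d(1, 2, kernel_size=1)  # stand-in predictor
image = torch.rand(1, 1, 64, 64, 64)        # NCHWD, batch dim must be 1
output = sliding_window_inference(
    inputs=image, roi_size=(32, 32, 32), sw_batch_size=4, predictor=net,
)
# output shape: (1, 2, 64, 64, 64) - N, M output channels, spatial dims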

Nifti format handling

Reading

class NiftiDataset(image_files, seg_files=None, labels=None, as_closest_canonical=False, transform=None, seg_transform=None, image_only=True, dtype=None)[source]

Loads image/segmentation pairs of Nifti files from the given filename lists. Transformations can be specified for the image and segmentation arrays separately.

Initializes the dataset with the image and segmentation filename lists. The transform given as transform is applied to the images and seg_transform to the segmentations.

Parameters
  • image_files (list of str) – list of image filenames

  • seg_files (list of str) – for a segmentation task, the list of segmentation filenames

  • labels (list or array) – for a classification task, the list of classification labels

  • as_closest_canonical (bool) – if True, load the image as closest to canonical orientation

  • transform (Callable, optional) – transform to apply to image arrays

  • seg_transform (Callable, optional) – transform to apply to segmentation arrays

  • image_only (bool) – if True, return only the image volume; otherwise return the image volume and the header dict

  • dtype (np.dtype, optional) – if not None convert the loaded image to this data type
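
A minimal sketch, assuming the array-based AddChannel transform and illustrative file names (import paths may vary by version):

from monai.data import NiftiDataset
from monai.transforms import AddChannel  # array (non-dictionary) transform

ds = NiftiDataset(
    image_files=['image1.nii.gz', 'image2.nii.gz'],
    seg_files=['label1.nii.gz', 'label2.nii.gz'],
    transform=AddChannel(),      # applied to the image arrays
    seg_transform=AddChannel(),  # applied to the segmentation arrays
)
img, seg = ds[0]  # one image/segmentation pair after the transforms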

load_nifti(filename_or_obj, as_closest_canonical=False, image_only=True, dtype=None)[source]

Loads a Nifti file from the given path or file-like object.

Parameters
  • filename_or_obj (str or file) – path to file or file-like object

  • as_closest_canonical (bool) – if True, load the image as closest to canonical axis format

  • image_only (bool) – if True, return only the image volume; otherwise return the image volume and the header dict

  • dtype (np.dtype, optional) – if not None convert the loaded image to this data type

Returns

The loaded image volume if image_only is True, or a tuple containing the volume and the Nifti header in dict format otherwise

Note

header[‘original_affine’] stores the original affine loaded from filename_or_obj. header[‘affine’] stores the affine after the optional as_closest_canonical transform.
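
A minimal sketch of loading a volume together with its header dictionary (the file name is illustrative; the import path may vary by version):

from monai.data import load_nifti

img, header = load_nifti('image1.nii.gz', as_closest_canonical=True,
                         image_only=False)
print(header['original_affine'])  # affine as stored in the file
print(header['affine'])           # affine after the canonical reorientation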

Writing

class NiftiSaver(output_dir='./', output_postfix='seg', output_ext='.nii.gz', resample=True, interp_order=0, mode='constant', cval=0, dtype=None)[source]

Save data in NIfTI format; it supports a single item or a batch of data. Typically, the data are segmentation predictions: call save for a single item or save_batch to save a batch together. If no meta data is provided, an index starting from 0 is used as the filename prefix.

Parameters
  • output_dir (str) – output image directory.

  • output_postfix (str) – a string appended to all output file names.

  • output_ext (str) – output file extension name.

  • resample (bool) – whether to resample before saving the data array.

  • interp_order (int) – the order of the spline interpolation, default is 0. The order has to be in the range 0 - 5 (see https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.affine_transform.html). This option is used when resample=True.

  • mode (reflect|constant|nearest|mirror|wrap) – determines how the input array is extended beyond its boundaries. This option is used when resample=True.

  • cval (scalar) – value to fill past edges of the input if mode is “constant”. Default is 0.0. This option is used when resample=True.

  • dtype (np.dtype, optional) – convert the image data to save to this data type. If None, keep the original type of data.

save(data, meta_data=None)[source]

Save data into a Nifti file. The metadata could optionally have the following keys:

  • 'filename_or_obj' – for output file name creation, corresponding to filename or object.

  • 'original_affine' – for data orientation handling, defaulting to an identity matrix.

  • 'affine' – for data output affine, defaulting to an identity matrix.

  • 'spatial_shape' – for data output shape.

If meta_data is None, a running index starting from 0 is used to name the saved data instead.

Parameters
  • data (Tensor or ndarray) – target data content to be saved as a NIfTI format file. The data shape is assumed to start with a channel dimension, followed by spatial dimensions.

  • meta_data (dict) – the meta data information corresponding to the data.

See Also

monai.data.nifti_writer.write_nifti()

save_batch(batch_data, meta_data=None)[source]

Save a batch of data into Nifti format files.

Parameters
  • batch_data (Tensor or ndarray) – target batch data content to save in NIfTI format.

  • meta_data (dict) – every key-value pair in meta_data corresponds to the batch of data.
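
A minimal sketch of saving a batch of channel-first predictions with their source file names as metadata (the exact output naming scheme and import path may differ by version):

import torch
from monai.data import NiftiSaver

saver = NiftiSaver(output_dir='./out', output_postfix='seg')
preds = torch.zeros(2, 1, 64, 64, 64)  # batch of channel-first volumes
meta = {'filename_or_obj': ['image1.nii.gz', 'image2.nii.gz']}
saver.save_batch(preds, meta)  # writes one NIfTI file per batch item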

write_nifti(data, file_name, affine=None, target_affine=None, resample=True, output_shape=None, interp_order=3, mode='constant', cval=0, dtype=None)[source]

Write numpy data to NIfTI files on disk. This function converts data into the coordinate system defined by target_affine when target_affine is specified.

If the coordinate transform between affine and target_affine can be achieved by simply transposing and flipping data, no resampling will happen; otherwise this function resamples data using the coordinate transform computed from affine and target_affine. Note that the shape of the resampled data may be subject to rounding errors. For example, resampling a 20x20-pixel image from pixel size (1.5, 1.5)-mm to (3.0, 3.0)-mm space will return a 10x10-pixel image. However, resampling a 20x20-pixel image from pixel size (2.0, 2.0)-mm to (3.0, 3.0)-mm space will output a 14x14-pixel image, where the image shape is rounded from 13.333x13.333 pixels. In this case output_shape can be specified so that this function writes the image data to a designated shape.

When affine and target_affine are None, the data will be saved with an identity matrix as the image affine.

This function assumes the NIfTI dimension notations. Spatially, it supports up to three dimensions: H, HW, HWD for 1D, 2D, 3D respectively. When saving multiple time steps or multiple channels of data, the time and/or modality axes should be appended after the first three dimensions. For example, the shape of a 2D eight-class segmentation probability map to be saved could be (64, 64, 1, 8).

Parameters
  • data (numpy.ndarray) – input data to write to file.

  • file_name (string) – the expected file name to be saved on disk.

  • affine (numpy.ndarray) – the current affine of data. Defaults to np.eye(4)

  • target_affine (numpy.ndarray, optional) – before saving the (data, affine) as a Nifti1Image, transform the data into the coordinates defined by target_affine.

  • resample (bool) – whether to run resampling when the target affine could not be achieved by swapping/flipping data axes.

  • output_shape (None or tuple of ints) – output image shape. This option is used when resample=True.

  • interp_order (int) – the order of the spline interpolation, default is 3. The order has to be in the range 0 - 5 (see https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.affine_transform.html). This option is used when resample=True.

  • mode (reflect|constant|nearest|mirror|wrap) – determines how the input array is extended beyond its boundaries. This option is used when resample=True.

  • cval (scalar) – value to fill past edges of the input if mode is “constant”. Default is 0.0. This option is used when resample=True.

  • dtype (np.dtype, optional) – convert the image data to this data type before saving.
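
A minimal sketch of the rounding example above, resampling a 20x20-pixel image from (1.5, 1.5)-mm to (3.0, 3.0)-mm space:

import numpy as np
from monai.data.nifti_writer import write_nifti

data = np.random.rand(20, 20).astype(np.float32)
affine = np.diag([1.5, 1.5, 1.0, 1.0])   # current (1.5, 1.5)-mm pixel size
target = np.diag([3.0, 3.0, 1.0, 1.0])   # desired (3.0, 3.0)-mm space
write_nifti(data, 'out.nii.gz', affine=affine, target_affine=target)
# the saved image is resampled to roughly 10x10 pixels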

Synthetic

create_test_image_2d(width, height, num_objs=12, rad_max=30, noise_max=0.0, num_seg_classes=5, channel_dim=None)[source]

Return a noisy 2D image with num_objs circles and a 2D mask image. The maximum radius of the circles is given by rad_max. The mask will have num_seg_classes segmentation classes, labelled sequentially from 1, plus a background class represented as 0. If noise_max is greater than 0, noise drawn from the uniform distribution on the range [0, noise_max) is added to the image. If channel_dim is None, the image is created without a channel dimension; otherwise the channel dimension is created as the first or last dimension.
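
A minimal sketch of generating a synthetic image/segmentation pair (import path may vary by version):

from monai.data import create_test_image_2d

img, seg = create_test_image_2d(128, 128, num_objs=6, rad_max=20,
                                noise_max=0.1, num_seg_classes=3)
# img: (128, 128) float array; seg: (128, 128) labels in {0, 1, 2, 3}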

create_test_image_3d(height, width, depth, num_objs=12, rad_max=30, noise_max=0.0, num_seg_classes=5, channel_dim=None)[source]

Return a noisy 3D image and segmentation.

Utilities

compute_shape_offset(spatial_shape, in_affine, out_affine)[source]

Given input and output affine, compute appropriate shapes in the output space based on the input array’s shape. This function also returns the offset to put the shape in a good position with respect to the world coordinate system.
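
A minimal sketch, assuming the function returns the output shape and the offset; doubling the voxel size roughly halves the shape:

import numpy as np
from monai.data.utils import compute_shape_offset

in_affine = np.eye(4)                       # 1mm isotropic input space
out_affine = np.diag([2.0, 2.0, 2.0, 1.0])  # 2mm isotropic output space
shape, offset = compute_shape_offset((20, 20, 20), in_affine, out_affine)
# shape is roughly (10, 10, 10); offset places it in world coordinates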

correct_nifti_header_if_necessary(img_nii)[source]

Check the NIfTI object header's format and update the header if needed; in the updated image, pixdim matches the affine.

Parameters

img_nii (nifti image object) – the image object whose header is to be checked and corrected

dense_patch_slices(image_size, patch_size, scan_interval)[source]

Enumerate all slices defining 2D/3D patches of size patch_size from an input image of size image_size.

Parameters
  • image_size (tuple of int) – dimensions of image to iterate over

  • patch_size (tuple of int) – size of patches to generate slices for

  • scan_interval (tuple of int) – dense patch sampling interval

Returns

a list of slice objects defining each patch
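
A minimal sketch of a dense, non-overlapping 2D grid:

from monai.data.utils import dense_patch_slices

slices = dense_patch_slices(image_size=(64, 64), patch_size=(32, 32),
                            scan_interval=(32, 32))
# four patches; each entry is a tuple of slice objects such as
# (slice(0, 32), slice(0, 32))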

get_random_patch(dims, patch_size, rand_state=None)[source]

Returns a tuple of slices defining a random patch of size patch_size in an array of shape dims, or as close to that size as possible within the given dimensions. It is expected that patch_size is a valid patch size for a source of shape dims, as returned by get_valid_patch_size.

Parameters
  • dims (tuple of int) – shape of source array

  • patch_size (tuple of int) – shape of patch size to generate

  • rand_state (np.random.RandomState) – a random state object to generate random numbers from

Returns

a tuple of slice objects defining the patch

Return type

(tuple of slice)

get_valid_patch_size(dims, patch_size)[source]

Given an image of dimensions dims, return a patch size tuple taking the dimension from patch_size if it is not 0/None; otherwise, or if patch_size is shorter than dims, the dimension from dims is taken. This ensures the returned patch size is within the bounds of dims. If patch_size is a single number, it is interpreted as a patch of the same dimensionality as dims with that size in each dimension.
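
A minimal sketch combining the two utilities above to take a random patch from an array:

import numpy as np
from monai.data.utils import get_random_patch, get_valid_patch_size

arr = np.random.rand(20, 20, 20)
size = get_valid_patch_size(arr.shape, (10, 10, 10))  # clipped to arr bounds
patch = arr[get_random_patch(arr.shape, size)]        # random (10, 10, 10) view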

iter_patch(arr, patch_size, start_pos=(), copy_back=True, pad_mode='wrap', **pad_opts)[source]

Yield successive patches from arr of size patch_size. The iteration can start from position start_pos in arr, but draws from a padded array extended by patch_size in each dimension (so these coordinates can be negative to start in the padded region). If copy_back is True, the values from each patch are written back to arr.

Parameters
  • arr (np.ndarray) – array to iterate over

  • patch_size (tuple of int or None) – size of patches to generate slices for, 0 or None selects whole dimension

  • start_pos (tuple of int, optional) – starting position in the array, default is 0 for each dimension

  • copy_back (bool) – if True data from the yielded patches is copied back to arr once the generator completes

  • pad_mode (str, optional) – padding mode, see numpy.pad

  • pad_opts (dict, optional) – padding options, see numpy.pad

Yields

Patches of array data from arr, as views into a padded array which can be modified. If copy_back is True, these changes will be reflected in arr once the iteration completes.
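
A minimal sketch of modifying patches in place with copy_back=True:

import numpy as np
from monai.data.utils import iter_patch

arr = np.zeros((20, 20))
for patch in iter_patch(arr, patch_size=(10, 10), copy_back=True):
    patch += 1.0  # views into a padded copy of arr
print(arr.mean())  # 1.0: changes are written back once iteration completes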

iter_patch_slices(dims, patch_size, start_pos=())[source]

Yield successive tuples of slices defining patches of size patch_size from an array of dimensions dims. The iteration starts from position start_pos in the array, or at the origin if this isn't provided. Each patch is chosen on a contiguous grid, with the first dimension used as the least significant in the ordering.

Parameters
  • dims (tuple of int) – dimensions of array to iterate over

  • patch_size (tuple of int or None) – size of patches to generate slices for, 0 or None selects whole dimension

  • start_pos (tuple of int, optional) – starting position in the array, default is 0 for each dimension

Yields

Tuples of slice objects defining each patch
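
A minimal sketch enumerating the slice tuples for a (2, 2) patch grid over a (4, 4) array:

from monai.data.utils import iter_patch_slices

for s in iter_patch_slices((4, 4), (2, 2)):
    print(s)  # e.g. (slice(0, 2), slice(0, 2)), (slice(2, 4), slice(0, 2)), ...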

list_data_collate(batch)[source]

Enhancement of the default collate function of the PyTorch DataLoader. If the dataset already returns a list of batch data generated by the transforms, all data need to be merged into one list first; after that, the behaviour is the same as the default collate.

Note: this collate function is needed when applying transforms that can generate batch data.
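
A minimal sketch of plugging it into a DataLoader (reusing ds from the Dataset example above; this matters when a transform, such as a multi-sample random crop, returns a list per item):

from torch.utils.data import DataLoader
from monai.data import list_data_collate  # import path may vary by version

loader = DataLoader(ds, batch_size=2, collate_fn=list_data_collate)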

rectify_header_sform_qform(img_nii)[source]

Look at the sform and qform of the nifti object and correct them if there are any incompatibilities with the pixel dimensions.

Adapted from https://github.com/NifTK/NiftyNet/blob/v0.6.0/niftynet/io/misc_io.py

to_affine_nd(r, affine)[source]

Using elements from affine, create a new affine matrix by assigning the rotation/zoom/scaling matrix and the translation vector.

When r is an integer, the output is an (r+1)x(r+1) matrix, where the top-left kxk elements are copied from affine and the last column of the output affine is copied from affine's last column. k is determined by min(r, len(affine) - 1).

When r is an affine matrix, the output has the same shape as r; the top-left kxk elements are copied from affine and the last column of the output affine is copied from affine's last column. k is determined by min(len(r) - 1, len(affine) - 1).

Parameters
  • r (int or matrix) – number of spatial dimensions or an output affine to be filled.

  • affine (matrix) – 2D affine matrix

Returns

a (r+1) x (r+1) matrix
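
A minimal sketch of promoting a 2D (3x3) affine to a 3D (4x4) one:

import numpy as np
from monai.data.utils import to_affine_nd

affine_2d = np.array([[2.0, 0.0, 10.0],
                      [0.0, 2.0, 20.0],
                      [0.0, 0.0, 1.0]])
affine_3d = to_affine_nd(3, affine_2d)
# 4x4 output: top-left 2x2 block and the translation column copied over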

zoom_affine(affine, scale, diagonal=True)[source]

Make the column norms of affine match scale. If diagonal is False, returns an affine that combines orthogonal rotation and the new scale; this is done by first decomposing affine, then setting the zoom factors to scale and composing a new affine, with the shearing factors removed. If diagonal is True, returns a diagonal matrix whose diagonal elements are set to the scaling factors. This function always returns an affine with zero translations.

Parameters
  • affine (nxn matrix) – a square matrix.

  • scale (sequence of floats) – new scaling factor along each dimension.

  • diagonal (bool) – whether to return a diagonal scaling matrix. Defaults to True.

Returns

the updated n x n affine.
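
A minimal sketch of rescaling an affine's column norms to new voxel sizes:

import numpy as np
from monai.data.utils import zoom_affine

affine = np.diag([2.0, 2.0, 2.0, 1.0])  # 2mm isotropic, no rotation
new = zoom_affine(affine, scale=(1.0, 1.0, 1.0), diagonal=True)
# new == diag(1, 1, 1, 1): scaling set from scale, translation zeroed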