Data

Generic Interfaces

Dataset

class monai.data.Dataset(data, transform=None)[source]

A generic dataset with a length property and an optional callable data transform applied when fetching a data sample. If passed slicing indices, it will return a PyTorch Subset, for example: data: Subset = dataset[1:4]; for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset

For example, typical input data can be a list of dictionaries:

[{                            {                            {
     'img': 'image1.nii.gz',      'img': 'image2.nii.gz',      'img': 'image3.nii.gz',
     'seg': 'label1.nii.gz',      'seg': 'label2.nii.gz',      'seg': 'label3.nii.gz',
     'extra': 123                 'extra': 456                 'extra': 789
 },                           },                           }]
Parameters
  • data (Sequence) – input data to load and transform to generate dataset for model.

  • transform (Optional[Callable]) – a callable data transform on input data.

__getitem__(index)[source]

Returns a Subset if index is a slice or Sequence, a data item otherwise.

__init__(data, transform=None)[source]
Parameters
  • data (Sequence) – input data to load and transform to generate dataset for model.

  • transform (Optional[Callable]) – a callable data transform on input data.
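
A minimal usage sketch (the file names below are placeholders); the transform is applied lazily, when an item is fetched:

from monai.data import Dataset
from monai.transforms import Compose, LoadImaged

# hypothetical input data; replace with real file paths
data = [
    {'img': 'image1.nii.gz', 'seg': 'label1.nii.gz', 'extra': 123},
    {'img': 'image2.nii.gz', 'seg': 'label2.nii.gz', 'extra': 456},
]
ds = Dataset(data=data, transform=Compose([LoadImaged(keys=['img', 'seg'])]))
print(len(ds))  # 2
item = ds[0]    # LoadImaged runs here, when the item is fetched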

IterableDataset

class monai.data.IterableDataset(data, transform=None)[source]

A generic dataset for an iterable data source with an optional callable data transform applied when fetching a data sample. For example, typical input data can be a web data stream that supports multi-process access.

Note that when used with a DataLoader and num_workers > 0, each worker process will have a different copy of the dataset object; process safety must therefore be guaranteed by the data source or the DataLoader.

Parameters
  • data (Iterable) – input data source to load and transform to generate dataset for model.

  • transform (Optional[Callable]) – a callable data transform on input data.

__init__(data, transform=None)[source]
Parameters
  • data (Iterable) – input data source to load and transform to generate dataset for model.

  • transform (Optional[Callable]) – a callable data transform on input data.
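
A minimal sketch with a generator as the iterable source (note that a plain generator is exhausted after one pass, so it only supports a single epoch):

from monai.data import DataLoader, IterableDataset

# a generator stands in for a streaming data source here
def data_stream():
    for i in range(10):
        yield {'data': i}

ds = IterableDataset(data=data_stream(), transform=lambda x: {'data': x['data'] * 2})
for item in DataLoader(ds, batch_size=2, num_workers=0):
    print(item['data'])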

CSVIterableDataset

class monai.data.CSVIterableDataset(filename, chunksize=1000, col_names=None, col_types=None, col_groups=None, transform=None, **kwargs)[source]

Iterable dataset to load CSV files and generate dictionary data. It can be helpful when loading extremely big CSV files that can't be read into memory directly. To accelerate loading, it supports multi-processing based on PyTorch DataLoader workers; every process executes transforms on part of every loaded chunk. Note: the order of the output data may not match the data source in multi-processing mode.

It can load data from multiple CSV files and join the tables with the additional kwargs arg. It supports loading only specific columns, and it can also group several loaded columns to generate a new column; for example, set col_groups={"meta": ["meta_0", "meta_1", "meta_2"]}, and the output can be:

[
    {"image": "./image0.nii", "meta_0": 11, "meta_1": 12, "meta_2": 13, "meta": [11, 12, 13]},
    {"image": "./image1.nii", "meta_0": 21, "meta_1": 22, "meta_2": 23, "meta": [21, 22, 23]},
]
Parameters
  • filename (Union[str, Sequence[str]]) – the filename of expected CSV file to load. if providing a list of filenames, it will load all the files and join tables.

  • chunksize (int) – rows of a chunk when loading iterable data from CSV files, default to 1000. more details: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html.

  • col_names (Optional[Sequence[str]]) – names of the expected columns to load. if None, load all the columns.

  • col_types (Optional[Dict[str, Optional[Dict[str, Any]]]]) –

    type and default value to convert the loaded columns, if None, use original data. it should be a dictionary, every item maps to an expected column, the key is the column name and the value is None or a dictionary to define the default value and data type. the supported keys in dictionary are: [“type”, “default”]. for example:

    col_types = {
        "subject_id": {"type": str},
        "label": {"type": int, "default": 0},
        "ehr_0": {"type": float, "default": 0.0},
        "ehr_1": {"type": float, "default": 0.0},
        "image": {"type": str, "default": None},
    }
    

  • col_groups (Optional[Dict[str, Sequence[str]]]) – args to group the loaded columns to generate a new column, it should be a dictionary, every item maps to a group, the key will be the new column name, the value is the names of columns to combine. for example: col_groups={“ehr”: [f”ehr_{i}” for i in range(10)], “meta”: [“meta_1”, “meta_2”]}

  • transform (Optional[Callable]) – transform to apply on the loaded items of a dictionary data.

  • kwargs – additional arguments for pandas.merge() API to join tables.

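A minimal sketch, assuming a hypothetical ./data.csv with columns image, meta_0, meta_1, meta_2:

from monai.data import CSVIterableDataset, DataLoader

ds = CSVIterableDataset(
    filename='./data.csv',   # hypothetical file
    chunksize=100,
    col_groups={'meta': ['meta_0', 'meta_1', 'meta_2']},
)
for item in DataLoader(ds, batch_size=2, num_workers=0):
    print(item['image'], item['meta'])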

PersistentDataset

class monai.data.PersistentDataset(data, transform, cache_dir, hash_func=<function pickle_hashing>)[source]

Persistent storage of pre-computed values to efficiently manage larger-than-memory dictionary format data; it can operate transforms for specific fields. Results from the non-random transform components are computed when first used, and stored in the cache_dir for rapid retrieval on subsequent uses. If passed slicing indices, it will return a PyTorch Subset, for example: data: Subset = dataset[1:4]; for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset

The transforms which are supposed to be cached must implement the monai.transforms.Transform interface and should not be Randomizable. This dataset will cache the outcomes before the first Randomizable Transform within a Compose instance.

For example, typical input data can be a list of dictionaries:

[{                            {                            {
    'image': 'image1.nii.gz',    'image': 'image2.nii.gz',    'image': 'image3.nii.gz',
    'label': 'label1.nii.gz',    'label': 'label2.nii.gz',    'label': 'label3.nii.gz',
    'extra': 123                 'extra': 456                 'extra': 789
},                           },                           }]

For a composite transform like

[ LoadImaged(keys=['image', 'label']),
Orientationd(keys=['image', 'label'], axcodes='RAS'),
ScaleIntensityRanged(keys=['image'], a_min=-57, a_max=164, b_min=0.0, b_max=1.0, clip=True),
RandCropByPosNegLabeld(keys=['image', 'label'], label_key='label', spatial_size=(96, 96, 96),
                        pos=1, neg=1, num_samples=4, image_key='image', image_threshold=0),
ToTensord(keys=['image', 'label'])]

Upon first use, a filename-based dataset will be processed by the non-random transforms [LoadImaged, Orientationd, ScaleIntensityRanged], and the resulting tensors written to the cache_dir before applying the remaining random-dependent transforms [RandCropByPosNegLabeld, ToTensord].

Subsequent uses of the dataset directly read the pre-processed results from cache_dir, then apply the random-dependent parts of the transform processing.

During training, call set_data() to update the input data and recompute the cache content.

Note

The input data must be a list of file paths; they will be hashed as cache keys.

When loading persistent cache content, there is no guarantee that the cached data matches the current transform chain, so please make sure to use exactly the same non-random transforms and args as when the cache was created; otherwise, it may cause unexpected errors.

Parameters
  • data (Sequence) – input data file paths to load and transform to generate dataset for model. PersistentDataset expects input data to be a list of serializable items and hashes them as cache keys using hash_func.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • cache_dir (Union[Path, str, None]) – If specified, this is the location for persistent storage of pre-computed transformed data tensors. The cache_dir is computed once, and persists on disk until explicitly removed. Different runs, programs, experiments may share a common cache dir provided that the transforms pre-processing is consistent. If cache_dir doesn’t exist, will automatically create it. If cache_dir is None, there is effectively no caching.

  • hash_func (Callable[…, bytes]) – a callable to compute hash from data items to be cached. defaults to monai.data.utils.pickle_hashing.

__init__(data, transform, cache_dir, hash_func=<function pickle_hashing>)[source]
Parameters
  • data (Sequence) – input data file paths to load and transform to generate dataset for model. PersistentDataset expects input data to be a list of serializable items and hashes them as cache keys using hash_func.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • cache_dir (Union[Path, str, None]) – If specified, this is the location for persistent storage of pre-computed transformed data tensors. The cache_dir is computed once, and persists on disk until explicitly removed. Different runs, programs, experiments may share a common cache dir provided that the transforms pre-processing is consistent. If cache_dir doesn’t exist, will automatically create it. If cache_dir is None, there is effectively no caching.

  • hash_func (Callable[…, bytes]) – a callable to compute hash from data items to be cached. defaults to monai.data.utils.pickle_hashing.

set_data(data)[source]

Set the input data and delete all the out-dated cache content.
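
A minimal construction sketch (file paths are placeholders); the deterministic transforms run once, and their results are written to cache_dir:

from monai.data import PersistentDataset
from monai.transforms import AddChanneld, Compose, LoadImaged, ScaleIntensityd

data = [{'image': 'image1.nii.gz'}, {'image': 'image2.nii.gz'}]  # hypothetical paths
transform = Compose([
    LoadImaged(keys='image'),
    AddChanneld(keys='image'),
    ScaleIntensityd(keys='image'),  # all deterministic, so all cacheable
])
# the first epoch computes and stores the cache; later epochs read from disk
ds = PersistentDataset(data=data, transform=transform, cache_dir='./cache_dir')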

CacheNTransDataset

class monai.data.CacheNTransDataset(data, transform, cache_n_trans, cache_dir, hash_func=<function pickle_hashing>)[source]

Extension of PersistentDataset that can also cache the result of the first N transforms, whether or not they are random.

Parameters
  • data (Sequence) – input data file paths to load and transform to generate dataset for model. PersistentDataset expects input data to be a list of serializable items and hashes them as cache keys using hash_func.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • cache_n_trans (int) – cache the result of first N transforms.

  • cache_dir (Union[Path, str, None]) – If specified, this is the location for persistent storage of pre-computed transformed data tensors. The cache_dir is computed once, and persists on disk until explicitly removed. Different runs, programs, experiments may share a common cache dir provided that the transforms pre-processing is consistent. If cache_dir doesn’t exist, will automatically create it. If cache_dir is None, there is effectively no caching.

  • hash_func (Callable[…, bytes]) – a callable to compute hash from data items to be cached. defaults to monai.data.utils.pickle_hashing.

__init__(data, transform, cache_n_trans, cache_dir, hash_func=<function pickle_hashing>)[source]
Parameters
  • data (Sequence) – input data file paths to load and transform to generate dataset for model. PersistentDataset expects input data to be a list of serializable items and hashes them as cache keys using hash_func.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • cache_n_trans (int) – cache the result of first N transforms.

  • cache_dir (Union[Path, str, None]) – If specified, this is the location for persistent storage of pre-computed transformed data tensors. The cache_dir is computed once, and persists on disk until explicitly removed. Different runs, programs, experiments may share a common cache dir provided that the transforms pre-processing is consistent. If cache_dir doesn’t exist, will automatically create it. If cache_dir is None, there is effectively no caching.

  • hash_func (Callable[…, bytes]) – a callable to compute hash from data items to be cached. defaults to monai.data.utils.pickle_hashing.

LMDBDataset

class monai.data.LMDBDataset(data, transform, cache_dir='cache', hash_func=<function pickle_hashing>, db_name='monai_cache', progress=True, pickle_protocol=4, lmdb_kwargs=None)[source]

Extension of PersistentDataset using LMDB as the backend.

Examples

>>> items = [{"data": i} for i in range(5)]
# [{'data': 0}, {'data': 1}, {'data': 2}, {'data': 3}, {'data': 4}]
>>> lmdb_ds = monai.data.LMDBDataset(items, transform=monai.transforms.SimulateDelayd("data", delay_time=1))
>>> print(list(lmdb_ds))  # using the cached results
Parameters
  • data (Sequence) – input data file paths to load and transform to generate dataset for model. LMDBDataset expects input data to be a list of serializable items and hashes them as cache keys using hash_func.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • cache_dir (Union[Path, str]) – if specified, this is the location for persistent storage of pre-computed transformed data tensors. The cache_dir is computed once, and persists on disk until explicitly removed. Different runs, programs, experiments may share a common cache dir provided that the transforms pre-processing is consistent. If the cache_dir doesn’t exist, will automatically create it. Defaults to “./cache”.

  • hash_func (Callable[…, bytes]) – a callable to compute hash from data items to be cached. defaults to monai.data.utils.pickle_hashing.

  • db_name (str) – lmdb database file name. Defaults to “monai_cache”.

  • progress (bool) – whether to display a progress bar.

  • pickle_protocol – pickle protocol version. Defaults to pickle.HIGHEST_PROTOCOL. https://docs.python.org/3/library/pickle.html#pickle-protocols

  • lmdb_kwargs (Optional[dict]) – additional keyword arguments to the lmdb environment. for more details please visit: https://lmdb.readthedocs.io/en/release/#environment-class

__init__(data, transform, cache_dir='cache', hash_func=<function pickle_hashing>, db_name='monai_cache', progress=True, pickle_protocol=4, lmdb_kwargs=None)[source]
Parameters
  • data (Sequence) – input data file paths to load and transform to generate dataset for model. LMDBDataset expects input data to be a list of serializable items and hashes them as cache keys using hash_func.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • cache_dir (Union[Path, str]) – if specified, this is the location for persistent storage of pre-computed transformed data tensors. The cache_dir is computed once, and persists on disk until explicitly removed. Different runs, programs, experiments may share a common cache dir provided that the transforms pre-processing is consistent. If the cache_dir doesn’t exist, will automatically create it. Defaults to “./cache”.

  • hash_func (Callable[…, bytes]) – a callable to compute hash from data items to be cached. defaults to monai.data.utils.pickle_hashing.

  • db_name (str) – lmdb database file name. Defaults to “monai_cache”.

  • progress (bool) – whether to display a progress bar.

  • pickle_protocol – pickle protocol version. Defaults to pickle.HIGHEST_PROTOCOL. https://docs.python.org/3/library/pickle.html#pickle-protocols

  • lmdb_kwargs (Optional[dict]) – additional keyword arguments to the lmdb environment. for more details please visit: https://lmdb.readthedocs.io/en/release/#environment-class

info()[source]

Returns: dataset info dictionary.

set_data(data)[source]

Set the input data and delete all the out-dated cache content.

CacheDataset

class monai.data.CacheDataset(data, transform, cache_num=9223372036854775807, cache_rate=1.0, num_workers=None, progress=True)[source]

Dataset with cache mechanism that can load data and cache deterministic transforms’ result during training.

By caching the results of non-random preprocessing transforms, it accelerates the training data pipeline. If the requested data is not in the cache, all transforms will run normally (see also monai.data.dataset.Dataset).

Users can set the cache rate or number of items to cache. It is recommended to experiment with different cache_num or cache_rate to identify the best training speed.

The transforms which are supposed to be cached must implement the monai.transforms.Transform interface and should not be Randomizable. This dataset will cache the outcomes before the first Randomizable Transform within a Compose instance. So, to improve the caching efficiency, always put as many non-random transforms as possible before the randomized ones when composing the chain of transforms. If passed slicing indices, it will return a PyTorch Subset, for example: data: Subset = dataset[1:4]; for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset

For example, if the transform is a Compose of:

transforms = Compose([
    LoadImaged(),
    AddChanneld(),
    Spacingd(),
    Orientationd(),
    ScaleIntensityRanged(),
    RandCropByPosNegLabeld(),
    ToTensord()
])

when transforms is used in a multi-epoch training pipeline, before the first training epoch this dataset will cache the results up to ScaleIntensityRanged, as all the non-random transforms LoadImaged, AddChanneld, Spacingd, Orientationd, ScaleIntensityRanged can be cached. During training, the dataset will load the cached results and run RandCropByPosNegLabeld and ToTensord, as RandCropByPosNegLabeld is a randomized transform and its outcome is not cached.
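
A minimal construction sketch reusing the transforms chain above (data is a hypothetical list of dictionaries of file paths):

from monai.data import CacheDataset, DataLoader

# the non-random transforms up to ScaleIntensityRanged are cached before training
ds = CacheDataset(data=data, transform=transforms, cache_rate=1.0, num_workers=4)
loader = DataLoader(ds, batch_size=2, shuffle=True, num_workers=2)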

During training, call set_data() to update the input data and recompute the cache content; note that this requires persistent_workers=False in the PyTorch DataLoader.

Note

CacheDataset executes non-random transforms and prepares the cache content in the main process before the first epoch; all the subprocesses of the DataLoader then read the same cache content from the main process during training. Preparing the cache content may take a long time, depending on the size of the expected cache data. So, to debug or verify the program before real training, users can set cache_rate=0.0 or cache_num=0 to temporarily skip caching.

Parameters
  • data (Sequence) – input data to load and transform to generate dataset for model.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • cache_num (int) – number of items to be cached. Default is sys.maxsize. will take the minimum of (cache_num, data_length x cache_rate, data_length).

  • cache_rate (float) – percentage of cached data in total, default is 1.0 (cache all). will take the minimum of (cache_num, data_length x cache_rate, data_length).

  • num_workers (Optional[int]) – the number of worker processes to use. If num_workers is None then the number returned by os.cpu_count() is used.

  • progress (bool) – whether to display a progress bar.

__init__(data, transform, cache_num=9223372036854775807, cache_rate=1.0, num_workers=None, progress=True)[source]
Parameters
  • data (Sequence) – input data to load and transform to generate dataset for model.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • cache_num (int) – number of items to be cached. Default is sys.maxsize. will take the minimum of (cache_num, data_length x cache_rate, data_length).

  • cache_rate (float) – percentage of cached data in total, default is 1.0 (cache all). will take the minimum of (cache_num, data_length x cache_rate, data_length).

  • num_workers (Optional[int]) – the number of worker processes to use. If num_workers is None then the number returned by os.cpu_count() is used.

  • progress (bool) – whether to display a progress bar.

set_data(data)[source]

Set the input data and run deterministic transforms to generate cache content.

Note: should call this func after an entire epoch and must set persistent_workers=False in PyTorch DataLoader, because it needs to create new worker processes based on new generated cache content.

SmartCacheDataset

class monai.data.SmartCacheDataset(data, transform, replace_rate, cache_num=9223372036854775807, cache_rate=1.0, num_init_workers=None, num_replace_workers=None, progress=True, shuffle=True, seed=0)[source]

Re-implementation of the SmartCache mechanism in the NVIDIA Clara-train SDK. At any time, the cache pool only keeps a subset of the whole dataset. In each epoch, only the items in the cache are used for training. This ensures that data needed for training is readily available, keeping GPU resources busy. Note that cached items may still have to go through a non-deterministic transform sequence before being fed to the GPU. At the same time, another thread is preparing replacement items by applying the transform sequence to items not in the cache. Once an epoch is completed, SmartCache replaces the same number of items with the replacement items.

SmartCache uses a simple running window algorithm to determine the cache content and replacement items. Let N be the configured number of objects in the cache and R be the number of replacement objects (R = ceil(N * r), where r is the configured replace rate). For more details, please refer to: https://docs.nvidia.com/clara/tlt-mi/clara-train-sdk-v3.0/nvmidl/additional_features/smart_cache.html#smart-cache

If passed slicing indices, it will return a PyTorch Subset, for example: data: Subset = dataset[1:4]; for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset

For example, if we have 5 images: [image1, image2, image3, image4, image5], with cache_num=4 and replace_rate=0.25, the actual training images cached and replaced for every epoch are as below:

epoch 1: [image1, image2, image3, image4]
epoch 2: [image2, image3, image4, image5]
epoch 3: [image3, image4, image5, image1]
epoch 4: [image4, image5, image1, image2]
epoch N: [image[N % 5] ...]

The usage of SmartCacheDataset contains 4 steps:

  1. Initialize SmartCacheDataset object and cache for the first epoch.

  2. Call start() to run replacement thread in background.

  3. Call update_cache() before every epoch to replace training items.

  4. Call shutdown() when training ends.

During training, call set_data() to update the input data and recompute the cache content; note that you need to call shutdown() to stop first, then update the data and call start() to restart.
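
A sketch of that loop (data, transform and num_epochs are placeholders):

from monai.data import DataLoader, SmartCacheDataset

ds = SmartCacheDataset(data, transform, replace_rate=0.25, cache_num=4)
loader = DataLoader(ds, batch_size=2, num_workers=0)

ds.start()                # step 2: run the replacement thread in the background
for epoch in range(num_epochs):
    for batch in loader:
        pass              # train on the currently cached items
    ds.update_cache()     # step 3: replace training items before the next epoch
ds.shutdown()             # step 4: stop the background thread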

Note

This replacement will not work in the following cases:

  1. Setting the multiprocessing_context of DataLoader to spawn.

  2. Running on Windows (where the default multiprocessing method is spawn) with num_workers greater than 0.

  3. Setting persistent_workers of DataLoader to True with num_workers greater than 0.

If using MONAI workflows, please add SmartCacheHandler to the handler list of the trainer; otherwise, make sure to call start(), update_cache() and shutdown() during training.

Parameters
  • data (Sequence) – input data to load and transform to generate dataset for model.

  • transform (Union[Sequence[Callable], Callable]) – transforms to execute operations on input data.

  • replace_rate (float) – percentage of the cached items to be replaced in every epoch.

  • cache_num (int) – number of items to be cached. Default is sys.maxsize. will take the minimum of (cache_num, data_length x cache_rate, data_length).

  • cache_rate (float) – percentage of cached data in total, default is 1.0 (cache all). will take the minimum of (cache_num, data_length x cache_rate, data_length).

  • num_init_workers (Optional[int]) – the number of worker threads to initialize the cache for first epoch. If num_init_workers is None then the number returned by os.cpu_count() is used.

  • num_replace_workers (Optional[int]) – the number of worker threads to prepare the replacement cache for every epoch. If num_replace_workers is None then the number returned by os.cpu_count() is used.

  • progress (bool) – whether to display a progress bar when caching for the first epoch.

  • shuffle (bool) – whether to shuffle the whole data list before preparing the cache content for first epoch. it will not modify the original input data sequence in-place.

  • seed (int) – random seed if shuffle is True, default to 0.

is_started()[source]

Check whether the replacement thread is already started.

manage_replacement()[source]

Background thread for replacement.

randomize(data)[source]

Within this method, self.R should be used, instead of np.random, to introduce random factors.

All self.R calls happen here so that we have a better chance of identifying errors when synchronizing the random state.

This method can generate the random factors based on properties of the input data.

Raises

NotImplementedError – When the subclass does not override this method.

Return type

None

set_data(data)[source]

Set the input data and run deterministic transforms to generate cache content.

Note: should call shutdown() before calling this func.

shutdown()[source]

Shut down the background thread for replacement.

start()[source]

Start the background thread to replace training items for every epoch.

update_cache()[source]

Update the cache items for the current epoch; this function needs to be called before every epoch. If the cache has been shut down before, the _replace_mgr thread needs to be restarted.

ZipDataset

class monai.data.ZipDataset(datasets, transform=None)[source]

Zip several PyTorch datasets and output data (with the same index) together in a tuple. If the output of a single dataset is already a tuple, flatten it and extend it to the result. For example: if datasetA returns (img, imgmeta) and datasetB returns (seg, segmeta), the final return is (img, imgmeta, seg, segmeta). If the datasets don't have the same length, the minimum length among them is used as the length of the ZipDataset. If passed slicing indices, it will return a PyTorch Subset, for example: data: Subset = dataset[1:4]; for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset

Examples:

>>> zip_data = ZipDataset([[1, 2, 3], [4, 5]])
>>> print(len(zip_data))
2
>>> for item in zip_data:
>>>    print(item)
[1, 4]
[2, 5]
Parameters
  • datasets (Sequence) – list of datasets to zip together.

  • transform (Optional[Callable]) – a callable data transform operates on the zipped item from datasets.

__init__(datasets, transform=None)[source]
Parameters
  • datasets (Sequence) – list of datasets to zip together.

  • transform (Optional[Callable]) – a callable data transform operates on the zipped item from datasets.

ArrayDataset

class monai.data.ArrayDataset(img, img_transform=None, seg=None, seg_transform=None, labels=None, label_transform=None)[source]

Dataset for segmentation and classification tasks based on array format input data and transforms. It ensures the same random seeds in the randomized transforms defined for image, segmentation and label. The transform can be monai.transforms.Compose or any other callable object. For example: if training based on NIfTI format images without metadata, all transforms can be composed:

img_transform = Compose(
    [
        LoadImage(image_only=True),
        AddChannel(),
        RandAdjustContrast()
    ]
)
ArrayDataset(img_file_list, img_transform=img_transform)

If training based on images and the metadata, the array transforms cannot be composed, because several transforms receive multiple parameters or return multiple values. Users then need to define their own callable method to parse the metadata from LoadImage, or to set the affine matrix for the Spacing transform:

class TestCompose(Compose):
    def __call__(self, input_):
        img, metadata = self.transforms[0](input_)
        img = self.transforms[1](img)
        img, _, _ = self.transforms[2](img, metadata["affine"])
        return self.transforms[3](img), metadata
img_transform = TestCompose(
    [
        LoadImage(image_only=False),
        AddChannel(),
        Spacing(pixdim=(1.5, 1.5, 3.0)),
        RandAdjustContrast()
    ]
)
ArrayDataset(img_file_list, img_transform=img_transform)

Examples:

>>> ds = ArrayDataset([1, 2, 3, 4], lambda x: x + 0.1)
>>> print(ds[0])
1.1

>>> ds = ArrayDataset(img=[1, 2, 3, 4], seg=[5, 6, 7, 8])
>>> print(ds[0])
[1, 5]

Initializes the dataset with the filename lists. The transform img_transform is applied to the images and seg_transform to the segmentations.

Parameters
  • img (Sequence) – sequence of images.

  • img_transform (Optional[Callable]) – transform to apply to each element in img.

  • seg (Optional[Sequence]) – sequence of segmentations.

  • seg_transform (Optional[Callable]) – transform to apply to each element in seg.

  • labels (Optional[Sequence]) – sequence of labels.

  • label_transform (Optional[Callable]) – transform to apply to each element in labels.

__init__(img, img_transform=None, seg=None, seg_transform=None, labels=None, label_transform=None)[source]

Initializes the dataset with the filename lists. The transform img_transform is applied to the images and seg_transform to the segmentations.

Parameters
  • img (Sequence) – sequence of images.

  • img_transform (Optional[Callable]) – transform to apply to each element in img.

  • seg (Optional[Sequence]) – sequence of segmentations.

  • seg_transform (Optional[Callable]) – transform to apply to each element in seg.

  • labels (Optional[Sequence]) – sequence of labels.

  • label_transform (Optional[Callable]) – transform to apply to each element in labels.

randomize(data=None)[source]

Within this method, self.R should be used, instead of np.random, to introduce random factors.

All self.R calls happen here so that we have a better chance of identifying errors when synchronizing the random state.

This method can generate the random factors based on properties of the input data.

Raises

NotImplementedError – When the subclass does not override this method.

Return type

None

ImageDataset

class monai.data.ImageDataset(image_files, seg_files=None, labels=None, transform=None, seg_transform=None, image_only=True, transform_with_metadata=False, dtype=<class 'numpy.float32'>, reader=None, *args, **kwargs)[source]

Loads image/segmentation pairs of files from the given filename lists. Transformations can be specified for the image and segmentation arrays separately. The difference between this dataset and ArrayDataset is that this dataset can apply a transform chain to the images and segs, return both the images and the metadata, and there is no need to specify a transform to load the images from files. For more information, please see the image_dataset demo in the MONAI tutorial repo, https://github.com/Project-MONAI/tutorials/blob/master/modules/image_dataset.ipynb

Initializes the dataset with the image and segmentation filename lists. The transform transform is applied to the images and seg_transform to the segmentations.

Parameters
  • image_files (Sequence[str]) – list of image filenames

  • seg_files (Optional[Sequence[str]]) – if in segmentation task, list of segmentation filenames

  • labels (Optional[Sequence[float]]) – if in classification task, list of classification labels

  • transform (Optional[Callable]) – transform to apply to image arrays

  • seg_transform (Optional[Callable]) – transform to apply to segmentation arrays

  • image_only (bool) – if True, return only the image volume; otherwise, return the image volume and the metadata

  • transform_with_metadata (bool) – if True, the metadata will be passed to the transforms whenever possible.

  • dtype (Union[dtype, type, None]) – if not None convert the loaded image to this data type

  • reader (Union[ImageReader, str, None]) – reader to register to load the image file and meta data; if None, the default readers will be used. If a reader name string is provided, a reader object will be constructed with the *args and **kwargs parameters; supported reader names: "NibabelReader", "PILReader", "ITKReader", "NumpyReader"

  • args – additional parameters for reader if providing a reader name

  • kwargs – additional parameters for reader if providing a reader name

Raises

ValueError – When seg_files length differs from image_files
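
A minimal sketch (the filename lists are placeholders); with seg_files given and image_only=True, each item is an (image, seg) pair:

from monai.data import ImageDataset
from monai.transforms import ScaleIntensity

ds = ImageDataset(
    image_files=['image1.nii.gz', 'image2.nii.gz'],  # hypothetical paths
    seg_files=['label1.nii.gz', 'label2.nii.gz'],
    transform=ScaleIntensity(),  # applied to the image arrays only
    image_only=True,
)
img, seg = ds[0]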

__init__(image_files, seg_files=None, labels=None, transform=None, seg_transform=None, image_only=True, transform_with_metadata=False, dtype=<class 'numpy.float32'>, reader=None, *args, **kwargs)[source]

Initializes the dataset with the image and segmentation filename lists. The transform transform is applied to the images and seg_transform to the segmentations.

Parameters
  • image_files (Sequence[str]) – list of image filenames

  • seg_files (Optional[Sequence[str]]) – if in segmentation task, list of segmentation filenames

  • labels (Optional[Sequence[float]]) – if in classification task, list of classification labels

  • transform (Optional[Callable]) – transform to apply to image arrays

  • seg_transform (Optional[Callable]) – transform to apply to segmentation arrays

  • image_only (bool) – if True, return only the image volume; otherwise, return the image volume and the metadata

  • transform_with_metadata (bool) – if True, the metadata will be passed to the transforms whenever possible.

  • dtype (Union[dtype, type, None]) – if not None convert the loaded image to this data type

  • reader (Union[ImageReader, str, None]) – reader to register to load the image file and meta data; if None, the default readers will be used. If a reader name string is provided, a reader object will be constructed with the *args and **kwargs parameters; supported reader names: "NibabelReader", "PILReader", "ITKReader", "NumpyReader"

  • args – additional parameters for reader if providing a reader name

  • kwargs – additional parameters for reader if providing a reader name

Raises

ValueError – When seg_files length differs from image_files

randomize(data=None)[source]

Within this method, self.R should be used, instead of np.random, to introduce random factors.

All self.R calls happen here so that we have a better chance of identifying errors when synchronizing the random state.

This method can generate the random factors based on properties of the input data.

Raises

NotImplementedError – When the subclass does not override this method.

Return type

None

NPZDictItemDataset

class monai.data.NPZDictItemDataset(npzfile, keys, transform=None, other_keys=())[source]

Represents a dataset from a loaded NPZ file. The members of the file to load are named by the keys of the keys dict and stored in the dataset under the corresponding values. All loaded arrays must have the same 0-dimension (batch) size. Items are always dicts mapping names to an item extracted from the loaded arrays. If passed slicing indices, it will return a PyTorch Subset, for example: data: Subset = dataset[1:4]; for more details, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.Subset

Parameters
  • npzfile (Union[str, IO]) – Path to .npz file or stream containing .npz file data

  • keys (Dict[str, str]) – Maps keys to load from file to name to store in dataset

  • transform (Optional[Callable[…, Dict[str, Any]]]) – Transform to apply to batch dict

  • other_keys (Optional[Sequence[str]]) – secondary data to load from file and store in dict other_keys, not returned by __getitem__


CSVDataset

class monai.data.CSVDataset(filename, row_indices=None, col_names=None, col_types=None, col_groups=None, transform=None, **kwargs)[source]

Dataset to load data from CSV files and generate a list of dictionaries, every dictionary maps to a row of the CSV file, and the keys of dictionary map to the column names of the CSV file.

It can load multiple CSV files and join the tables with the additional kwargs arg. It supports loading only specific rows and columns, and it can also group several loaded columns to generate a new column; for example, set col_groups={"meta": ["meta_0", "meta_1", "meta_2"]}, and the output can be:

[
    {"image": "./image0.nii", "meta_0": 11, "meta_1": 12, "meta_2": 13, "meta": [11, 12, 13]},
    {"image": "./image1.nii", "meta_0": 21, "meta_1": 22, "meta_2": 23, "meta": [21, 22, 23]},
]
Parameters
  • filename (Union[str, Sequence[str]]) – the filename of expected CSV file to load. if providing a list of filenames, it will load all the files and join tables.

  • row_indices (Optional[Sequence[Union[str, int]]]) – indices of the expected rows to load. it should be a list, every item can be a int number or a range [start, end) for the indices. for example: row_indices=[[0, 100], 200, 201, 202, 300]. if None, load all the rows in the file.

  • col_names (Optional[Sequence[str]]) – names of the expected columns to load. if None, load all the columns.

  • col_types (Optional[Dict[str, Optional[Dict[str, Any]]]]) –

    type and default value to convert the loaded columns, if None, use original data. it should be a dictionary, every item maps to an expected column, the key is the column name and the value is None or a dictionary to define the default value and data type. the supported keys in dictionary are: [“type”, “default”]. for example:

    col_types = {
        "subject_id": {"type": str},
        "label": {"type": int, "default": 0},
        "ehr_0": {"type": float, "default": 0.0},
        "ehr_1": {"type": float, "default": 0.0},
        "image": {"type": str, "default": None},
    }
    

  • col_groups (Optional[Dict[str, Sequence[str]]]) – args to group the loaded columns to generate a new column, it should be a dictionary, every item maps to a group, the key will be the new column name, the value is the names of columns to combine. for example: col_groups={“ehr”: [f”ehr_{i}” for i in range(10)], “meta”: [“meta_1”, “meta_2”]}

  • transform (Optional[Callable]) – transform to apply on the loaded items of a dictionary data.

  • kwargs – additional arguments for pandas.merge() API to join tables.

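A minimal sketch, assuming a hypothetical ./data.csv; rows 0-99 plus row 200 are loaded, and the meta columns are grouped:

from monai.data import CSVDataset

ds = CSVDataset(
    filename='./data.csv',         # hypothetical file
    row_indices=[[0, 100], 200],   # the range [0, 100) plus row 200
    col_groups={'meta': ['meta_0', 'meta_1', 'meta_2']},
)
print(ds[0]['meta'])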

Patch-based dataset

GridPatchDataset

class monai.data.GridPatchDataset(dataset, patch_iter, transform=None, with_coordinates=True)[source]

Yields patches from images read from an image dataset. Typically used with PatchIter so that the patches are chosen in a contiguous grid sampling scheme.

import numpy as np

from monai.data import GridPatchDataset, DataLoader, PatchIter
from monai.transforms import RandShiftIntensity

# image-level dataset
images = [np.arange(16, dtype=float).reshape(1, 4, 4),
          np.arange(16, dtype=float).reshape(1, 4, 4)]
# image-level patch generator, "grid sampling"
patch_iter = PatchIter(patch_size=(2, 2), start_pos=(0, 0))
# patch-level intensity shifts
patch_intensity = RandShiftIntensity(offsets=1.0, prob=1.0)

# construct the dataset
ds = GridPatchDataset(dataset=images,
                      patch_iter=patch_iter,
                      transform=patch_intensity)
# use the grid patch dataset
for item in DataLoader(ds, batch_size=2, num_workers=2):
    print("patch size:", item[0].shape)
    print("coordinates:", item[1])

# >>> patch size: torch.Size([2, 1, 2, 2])
#     coordinates: tensor([[[0, 1], [0, 2], [0, 2]],
#                          [[0, 1], [2, 4], [0, 2]]])

Initializes this dataset in terms of the image dataset, patch generator, and an optional transform.

Parameters
  • dataset (Sequence) – the dataset to read image data from.

  • patch_iter (Callable) – converts an input image (item from dataset) into an iterable of image patches. patch_iter(dataset[idx]) must yield a tuple: (patches, coordinates). see also: monai.data.PatchIter.

  • transform (Optional[Callable]) – a callable data transform operates on the patches.

  • with_coordinates (bool) – whether to yield the coordinates of each patch, default to True.

__init__(dataset, patch_iter, transform=None, with_coordinates=True)[source]

Initializes this dataset in terms of the image dataset, patch generator, and an optional transform.

Parameters
  • dataset (Sequence) – the dataset to read image data from.

  • patch_iter (Callable) – converts an input image (item from dataset) into an iterable of image patches. patch_iter(dataset[idx]) must yield a tuple: (patches, coordinates). see also: monai.data.PatchIter.

  • transform (Optional[Callable]) – a callable data transform operates on the patches.

  • with_coordinates (bool) – whether to yield the coordinates of each patch, default to True.

PatchIter

class monai.data.PatchIter(patch_size, start_pos=(), mode=NumpyPadMode.WRAP, **pad_opts)[source]

A class to return a patch generator with predefined properties such as patch_size. Typically used with monai.data.GridPatchDataset.

Parameters
  • patch_size (Sequence[int]) – size of patches to generate slices for, 0/None selects whole dimension

  • start_pos (Sequence[int]) – starting position in the array, default is 0 for each dimension

  • mode (Union[NumpyPadMode, str]) – {"constant", "edge", "linear_ramp", "maximum", "mean", "median", "minimum", "reflect", "symmetric", "wrap", "empty"} One of the listed string values or a user supplied function. Defaults to "wrap". See also: https://numpy.org/doc/1.18/reference/generated/numpy.pad.html

  • pad_opts (Dict) – padding options, see numpy.pad

Note

The patch_size is the size of the patch to sample from the input arrays. It is assumed that the array's first dimension is the channel dimension, which will be yielded in its entirety, so it should not be specified in patch_size. For example, for an input 3D array with 1 channel of size (1, 20, 20, 20), a regular grid sampling of eight patches (1, 10, 10, 10) would be specified by a patch_size of (10, 10, 10).
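
A small standalone sketch, following the (patches, coordinates) contract described for GridPatchDataset above; iterating (2, 2) patches over a (1, 4, 4) channel-first array yields four patches with their coordinates:

import numpy as np

from monai.data import PatchIter

patch_iter = PatchIter(patch_size=(2, 2), start_pos=(0, 0))
img = np.arange(16, dtype=float).reshape(1, 4, 4)  # channel-first array
for patch, coords in patch_iter(img):
    print(patch.shape, coords)  # four (1, 2, 2) patches with coordinates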

__init__(patch_size, start_pos=(), mode=NumpyPadMode.WRAP, **pad_opts)[source]
Parameters
  • patch_size (Sequence[int]) – size of patches to generate slices for, 0/None selects whole dimension

  • start_pos (Sequence[int]) – starting position in the array, default is 0 for each dimension

  • mode (Union[NumpyPadMode, str]) – {"constant", "edge", "linear_ramp", "maximum", "mean", "median", "minimum", "reflect", "symmetric", "wrap", "empty"} One of the listed string values or a user supplied function. Defaults to "wrap". See also: https://numpy.org/doc/1.18/reference/generated/numpy.pad.html

  • pad_opts (Dict) – padding options, see numpy.pad

Note

The patch_size is the size of the patch to sample from the input arrays. It is assumed that the array's first dimension is the channel dimension, which will be yielded in its entirety, so it should not be specified in patch_size. For example, for an input 3D array with 1 channel of size (1, 20, 20, 20), a regular grid sampling of eight patches (1, 10, 10, 10) would be specified by a patch_size of (10, 10, 10).

PatchDataset

class monai.data.PatchDataset(dataset, patch_func, samples_per_image=1, transform=None)[source]

Returns patches from an image dataset. The patches are generated by a user-specified callable patch_func, and are optionally post-processed by transform. For example, to generate random patch samples from an image dataset:

import numpy as np

from monai.data import PatchDataset, DataLoader
from monai.transforms import RandSpatialCropSamples, RandShiftIntensity

# image dataset
images = [np.arange(16, dtype=float).reshape(1, 4, 4),
          np.arange(16, dtype=float).reshape(1, 4, 4)]
# image patch sampler
n_samples = 5
sampler = RandSpatialCropSamples(roi_size=(3, 3), num_samples=n_samples,
                                 random_center=True, random_size=False)
# patch-level intensity shifts
patch_intensity = RandShiftIntensity(offsets=1.0, prob=1.0)
# construct the patch dataset
ds = PatchDataset(dataset=images,
                  patch_func=sampler,
                  samples_per_image=n_samples,
                  transform=patch_intensity)

# use the patch dataset, length: len(images) x samples_per_image
print(len(ds))

>>> 10

for item in DataLoader(ds, batch_size=2, shuffle=True, num_workers=2):
    print(item.shape)

>>> torch.Size([2, 1, 3, 3])
Parameters
  • dataset (Sequence) – an image dataset to extract patches from.

  • patch_func (Callable) – converts an input image (item from dataset) into a sequence of image patches. patch_func(dataset[idx]) must return a sequence of patches (length samples_per_image).

  • samples_per_image (int) – patch_func should return a sequence of samples_per_image elements.

  • transform (Optional[Callable]) – transform applied to each patch.

__init__(dataset, patch_func, samples_per_image=1, transform=None)[source]
Parameters
  • dataset (Sequence) – an image dataset to extract patches from.

  • patch_func (Callable) – converts an input image (item from dataset) into a sequence of image patches. patch_func(dataset[idx]) must return a sequence of patches (length samples_per_image).

  • samples_per_image (int) – patch_func should return a sequence of samples_per_image elements.

  • transform (Optional[Callable]) – transform applied to each patch.

Image reader

ImageReader

class monai.data.ImageReader[source]

An abstract class that defines APIs to load image files.

Typical usage of an implementation of this class is:

image_reader = MyImageReader()
img_obj = image_reader.read(path_to_image)
img_data, meta_data = image_reader.get_data(img_obj)
  • The read call converts image filenames into image objects,

  • The get_data call fetches the image data, as well as meta data.

  • A reader should implement verify_suffix with the logic of checking the input filename by the filename extensions.
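
As an illustrative sketch (not a MONAI class), a custom reader for ".npy" files could implement the three abstract methods as follows:

import numpy as np

from monai.data import ImageReader

class MyImageReader(ImageReader):
    """Hypothetical reader for '.npy' files, for illustration only."""

    def verify_suffix(self, filename):
        names = [filename] if isinstance(filename, str) else filename
        return all(str(name).endswith(".npy") for name in names)

    def read(self, data, **kwargs):
        names = [data] if isinstance(data, str) else data
        imgs = [np.load(name, **kwargs) for name in names]
        return imgs if len(imgs) > 1 else imgs[0]

    def get_data(self, img):
        arrays = img if isinstance(img, (list, tuple)) else [img]
        # stack a list of images at a new first dimension, as the built-in readers do
        data = np.stack(arrays) if len(arrays) > 1 else arrays[0]
        return data, {"spatial_shape": np.asarray(data.shape)}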

abstract get_data(img)[source]

Extract the data array and meta data from the loaded image and return them. This function must return two objects: the first is a numpy array of image data, the second is a dictionary of meta data.

Parameters

img – an image object loaded from an image file or a list of image objects.

Return type

Tuple[ndarray, Dict]

abstract read(data, **kwargs)[source]

Read image data from specified file or files. Note that it returns a data object or a sequence of data objects.

Parameters
  • data (Union[Sequence[str], str]) – file name or a list of file names to read.

  • kwargs – additional args for actual read API of 3rd party libs.

Return type

Union[Sequence[Any], Any]

abstract verify_suffix(filename)[source]

Verify whether the specified filename is supported by the current reader. This method should return True if the reader is able to read the format suggested by the filename.

Parameters

filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

Return type

bool

ITKReader

class monai.data.ITKReader(channel_dim=None, series_name='', **kwargs)[source]

Load medical images based on ITK library. All the supported image formats can be found at: https://github.com/InsightSoftwareConsortium/ITK/tree/master/Modules/IO The loaded data array will be in C order, for example, a 3D image NumPy array index order will be CDWH.

Parameters
  • channel_dim (Optional[int]) –

    the channel dimension of the input image, default is None. This is used to set original_channel_dim in the meta data, EnsureChannelFirstD reads this field. If None, original_channel_dim will be either no_channel or -1.

    • Nifti file is usually “channel last”, so there is no need to specify this argument.

    • PNG file usually has GetNumberOfComponentsPerPixel()==3, so there is no need to specify this argument.

  • series_name (str) – the name of the DICOM series if there are multiple ones. used when loading DICOM series.

  • kwargs – additional args for itk.imread API. more details about available args: https://github.com/InsightSoftwareConsortium/ITK/blob/master/Wrapping/Generators/Python/itk/support/extras.py

get_data(img)[source]

Extract data array and meta data from loaded image and return them. This function returns two objects, first is numpy array of image data, second is dict of meta data. It constructs affine, original_affine, and spatial_shape and stores them in meta dict. When loading a list of files, they are stacked together at a new dimension as the first dimension, and the meta data of the first image is used to represent the output meta data.

Parameters

img – an ITK image object loaded from an image file or a list of ITK image objects.

read(data, **kwargs)[source]

Read image data from specified file or files; it can read a list of no-channel images and stack them together as multi-channel data in get_data(). If a directory path is passed instead of a file path, it will be treated as a DICOM image series and read accordingly. Note that the returned object is an ITK image object or a list of ITK image objects.

Parameters
  • data (Union[Sequence[str], str]) – file name or a list of file names to read.

  • kwargs – additional args for the underlying itk.imread API; these override self.kwargs for existing keys.

verify_suffix(filename)[source]

Verify whether the specified file or files format is supported by ITK reader.

Parameters

filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

Return type

bool

NibabelReader

class monai.data.NibabelReader(as_closest_canonical=False, dtype=<class 'numpy.float32'>, **kwargs)[source]

Load NIfTI format images based on Nibabel library.

Parameters
  • as_closest_canonical (bool) – if True, load the image as closest to the canonical axis format.

  • dtype (Union[dtype, type, None]) – if not None, convert the loaded image data to this data type.

  • kwargs – additional args for the nibabel.load API.

get_data(img)[source]

Extract data array and meta data from loaded image and return them. This function returns two objects, first is numpy array of image data, second is dict of meta data. It constructs affine, original_affine, and spatial_shape and stores them in meta dict. When loading a list of files, they are stacked together at a new dimension as the first dimension, and the meta data of the first image is used to represent the output meta data.

Parameters

img – a Nibabel image object loaded from an image file or a list of Nibabel image objects.

read(data, **kwargs)[source]

Read image data from specified file or files, it can read a list of no-channel images and stack them together as multi-channels data in get_data(). Note that the returned object is Nibabel image object or list of Nibabel image objects.

Parameters
  • data (Union[Sequence[str], str]) – file name or a list of file names to read.

  • kwargs – additional args for the underlying nibabel.load API; these override self.kwargs for existing keys.

verify_suffix(filename)[source]

Verify whether the specified file or files format is supported by Nibabel reader.

Parameters

filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

Return type

bool

NumpyReader

class monai.data.NumpyReader(npz_keys=None, **kwargs)[source]

Load NPY or NPZ format data based on the Numpy library; the data can be arrays or pickled objects. A typical usage is to load the mask data for a classification task. It can load part of an npz file with the specified npz_keys.

Parameters
  • npz_keys (Union[Collection[Hashable], Hashable, None]) – if loading npz file, only load the specified keys, if None, load all the items. stack the loaded items together to construct a new first dimension.

  • kwargs – additional args for numpy.load API except allow_pickle. more details about available args: https://numpy.org/doc/stable/reference/generated/numpy.load.html

get_data(img)[source]

Extract data array and meta data from loaded image and return them. This function returns two objects, first is numpy array of image data, second is dict of meta data. It constructs affine, original_affine, and spatial_shape and stores them in meta dict. When loading a list of files, they are stacked together at a new dimension as the first dimension, and the meta data of the first image is used to represent the output meta data.

Parameters

img – a Numpy array loaded from a file or a list of Numpy arrays.

read(data, **kwargs)[source]

Read image data from specified file or files, it can read a list of no-channel data files and stack them together as multi-channels data in get_data(). Note that the returned object is Numpy array or list of Numpy arrays.

Parameters
  • data (Union[Sequence[str], str]) – file name or a list of file names to read.

  • kwargs – additional args for the underlying numpy.load API; these override self.kwargs for existing keys.

verify_suffix(filename)[source]

Verify whether the specified file or files format is supported by Numpy reader.

Parameters

filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

Return type

bool

PILReader

class monai.data.PILReader(converter=None, **kwargs)[source]

Load common 2D image formats (PNG, JPG, BMP are supported) from a given file or files.

Parameters
  • converter (Optional[Callable]) – an additional function to convert the image data after read(), for example, set converter=lambda image: image.convert("LA") to convert the image format.

  • kwargs – additional args for the Image.open API used in read().

get_data(img)[source]

Extract data array and meta data from loaded image and return them. This function returns two objects, first is numpy array of image data, second is dict of meta data. It computes spatial_shape and stores it in meta dict. When loading a list of files, they are stacked together at a new dimension as the first dimension, and the meta data of the first image is used to represent the output meta data.

Parameters

img – a PIL Image object loaded from a file or a list of PIL Image objects.

read(data, **kwargs)[source]

Read image data from specified file or files, it can read a list of no-channel images and stack them together as multi-channels data in get_data(). Note that the returned object is PIL image or list of PIL image.

Parameters
  • data (Union[Sequence[str], str]) – file name or a list of file names to read.

  • kwargs – additional args for the underlying Image.open API; these override self.kwargs for existing keys.

verify_suffix(filename)[source]

Verify whether the specified file or files format is supported by PIL reader.

Parameters

filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

Return type

bool

WSIReader

class monai.data.WSIReader(reader_lib='OpenSlide')[source]

Read whole slide images and extract patches.

Parameters

reader_lib (str) – backend library to load the images, available options: “OpenSlide” or “cuCIM”.
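
A usage sketch, assuming a hypothetical ./slide.tiff and that get_data() follows the (array, meta data) return contract of ImageReader:

from monai.data import WSIReader

reader = WSIReader(reader_lib="cuCIM")
img_obj = reader.read("./slide.tiff")  # hypothetical whole slide image
# extract a 256x256 region at level 0, anchored at the top left corner
img_data, meta = reader.get_data(img_obj, location=(0, 0), size=(256, 256), level=0)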

convert_to_rgb_array(raw_region, dtype=<class 'numpy.uint8'>)[source]

Convert the raw region data to RGB mode and a numpy array.

get_data(img, location=(0, 0), size=None, level=0, dtype=<class 'numpy.uint8'>, grid_shape=(1, 1), patch_size=None)[source]

Extract regions as numpy array from WSI image and return them.

Parameters
  • img – a WSIReader image object loaded from a file, or a list of CuImage objects.

  • location (Tuple[int, int]) – (x_min, y_min) tuple giving the top left pixel in the level 0 reference frame, or a list of tuples.

  • size (Optional[Tuple[int, int]]) – (height, width) tuple giving the region size, or a list of tuples (defaults to the full image size).

  • level (int) – the level number, or a list of level numbers (default=0).

  • dtype (Union[dtype, type, None]) – the data type of the output image.

  • grid_shape (Tuple[int, int]) – (rows, columns) tuple defining the grid on which to extract patches.

  • patch_size (Union[int, Tuple[int, int], None]) – (height, width) the size of extracted patches at the given level.

read(data, **kwargs)[source]

Read image data from specified file or files. Note that the returned object is CuImage or list of CuImage objects.

Parameters

data (Union[Sequence[str], str, ndarray]) – file name or a list of file names to read.

verify_suffix(filename)[source]

Verify whether the specified file or files format is supported by WSI reader.

Parameters

filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

Return type

bool

Nifti format handling

Writing Nifti

class monai.data.NiftiSaver(output_dir='./', output_postfix='seg', output_ext='.nii.gz', resample=True, mode=GridSampleMode.BILINEAR, padding_mode=GridSamplePadMode.BORDER, align_corners=False, dtype=<class 'numpy.float64'>, output_dtype=<class 'numpy.float32'>, squeeze_end_dims=True, data_root_dir='', separate_folder=True, print_log=True)[source]

Save the data as a NIfTI file; it can support single data content or a batch of data. Typically, the data can be segmentation predictions; call save for single data or call save_batch to save a batch of data together. The name of the saved file will be {input_image_name}_{output_postfix}{output_ext}, where the input image name is extracted from the provided meta data dictionary. If no meta data is provided, an index starting from 0 is used as the filename prefix.

Note: image should include channel dimension: [B],C,H,W,[D].

Parameters
  • output_dir (Union[Path, str]) – output image directory.

  • output_postfix (str) – a string appended to all output file names.

  • output_ext (str) – output file extension name.

  • resample (bool) – whether to resample before saving the data array.

  • mode (Union[GridSampleMode, str]) – {"bilinear", "nearest"} This option is used when resample = True. Interpolation mode to calculate output values. Defaults to "bilinear". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • padding_mode (Union[GridSamplePadMode, str]) – {"zeros", "border", "reflection"} This option is used when resample = True. Padding mode for outside grid values. Defaults to "border". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • align_corners (bool) – Geometrically, we consider the pixels of the input as squares rather than points. See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • dtype (Union[dtype, type, None]) – data type for resampling computation. Defaults to np.float64 for best precision. If None, use the data type of input data.

  • output_dtype (Union[dtype, type, None]) – data type for saving data. Defaults to np.float32.

  • squeeze_end_dims (bool) – if True, any trailing singleton dimensions will be removed (after the channel has been moved to the end). So if input is (C,H,W,D), this will be altered to (H,W,D,C), and then if C==1, it will be saved as (H,W,D). If D also ==1, it will be saved as (H,W). If False, the image will always be saved as (H,W,D,C).

  • data_root_dir (str) – if not empty, it specifies the beginning parts of the input file’s absolute path. it’s used to compute input_file_rel_path, the relative path to the file from data_root_dir, to preserve folder structure when saving in case there are files in different folders with the same file names. for example: input_file_name: /foo/bar/test1/image.nii, postfix: seg, output_ext: .nii.gz, output_dir: /output, data_root_dir: /foo/bar; the output will be: /output/test1/image/image_seg.nii.gz

  • separate_folder (bool) – whether to save every file in a separate folder, for example: if input filename is image.nii, postfix is seg and folder_path is output, if True, save as: output/image/image_seg.nii, if False, save as output/image_seg.nii. default to True.

  • print_log (bool) – whether to print log about the saved NIfTI file path, etc. default to True.

__init__(output_dir='./', output_postfix='seg', output_ext='.nii.gz', resample=True, mode=GridSampleMode.BILINEAR, padding_mode=GridSamplePadMode.BORDER, align_corners=False, dtype=<class 'numpy.float64'>, output_dtype=<class 'numpy.float32'>, squeeze_end_dims=True, data_root_dir='', separate_folder=True, print_log=True)[source]
Parameters
  • output_dir (Union[Path, str]) – output image directory.

  • output_postfix (str) – a string appended to all output file names.

  • output_ext (str) – output file extension name.

  • resample (bool) – whether to resample before saving the data array.

  • mode (Union[GridSampleMode, str]) – {"bilinear", "nearest"} This option is used when resample = True. Interpolation mode to calculate output values. Defaults to "bilinear". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • padding_mode (Union[GridSamplePadMode, str]) – {"zeros", "border", "reflection"} This option is used when resample = True. Padding mode for outside grid values. Defaults to "border". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • align_corners (bool) – Geometrically, we consider the pixels of the input as squares rather than points. See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • dtype (Union[dtype, type, None]) – data type for resampling computation. Defaults to np.float64 for best precision. If None, use the data type of input data.

  • output_dtype (Union[dtype, type, None]) – data type for saving data. Defaults to np.float32.

  • squeeze_end_dims (bool) – if True, any trailing singleton dimensions will be removed (after the channel has been moved to the end). So if input is (C,H,W,D), this will be altered to (H,W,D,C), and then if C==1, it will be saved as (H,W,D). If D also ==1, it will be saved as (H,W). If False, the image will always be saved as (H,W,D,C).

  • data_root_dir (str) – if not empty, it specifies the beginning parts of the input file’s absolute path. it’s used to compute input_file_rel_path, the relative path to the file from data_root_dir, to preserve folder structure when saving in case there are files in different folders with the same file names. for example: input_file_name: /foo/bar/test1/image.nii, postfix: seg, output_ext: .nii.gz, output_dir: /output, data_root_dir: /foo/bar; the output will be: /output/test1/image/image_seg.nii.gz

  • separate_folder (bool) – whether to save every file in a separate folder, for example: if input filename is image.nii, postfix is seg and folder_path is output, if True, save as: output/image/image_seg.nii, if False, save as output/image_seg.nii. default to True.

  • print_log (bool) – whether to print log about the saved NIfTI file path, etc. default to True.

save(data, meta_data=None)[source]

Save data into a Nifti file. The meta_data could optionally have the following keys:

  • 'filename_or_obj' – for output file name creation, corresponding to filename or object.

  • 'original_affine' – for data orientation handling, defaulting to an identity matrix.

  • 'affine' – for data output affine, defaulting to an identity matrix.

  • 'spatial_shape' – for data output shape.

  • 'patch_index' – if the data is a patch of big image, append the patch index to filename.

When meta_data is specified, the saver will try to resample batch data from the space defined by “affine” to the space defined by “original_affine”.

If meta_data is None, use the default index (starting from 0) as the filename.

Parameters
  • data (Union[Tensor, ndarray]) – target data content to be saved as a NIfTI format file. The data shape is assumed to start with a channel dimension, followed by spatial dimensions.

  • meta_data (Optional[Dict]) – the meta data information corresponding to the data.

See Also

monai.data.nifti_writer.write_nifti()

Return type

None
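
For example, a minimal sketch of saving a single channel-first prediction; the file name in the metadata and the shapes are illustrative:

import numpy as np
from monai.data import NiftiSaver

saver = NiftiSaver(output_dir="./out", output_postfix="seg")
pred = np.zeros((1, 64, 64, 32), dtype=np.float32)  # channel-first: C,H,W,D
meta = {"filename_or_obj": "image.nii.gz", "affine": np.eye(4)}
saver.save(pred, meta_data=meta)  # writes ./out/image/image_seg.nii.gz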

save_batch(batch_data, meta_data=None)[source]

Save a batch of data into Nifti format files.

Spatially it supports up to three dimensions, that is, H, HW, HWD for 1D, 2D, 3D respectively (with resampling supports for 2D and 3D only).

When saving multiple time steps or multiple channels batch_data, time and/or modality axes should be appended after the batch dimensions. For example, the shape of a batch of 2D eight-class segmentation probabilities to be saved could be (batch, 8, 64, 64); in this case each item in the batch will be saved as (64, 64, 1, 8) NIfTI file (the third dimension is reserved as a spatial dimension).

Parameters
  • batch_data (Union[Tensor, ndarray]) – target batch data content to be saved in NIfTI format.

  • meta_data (Optional[Dict]) – every key-value pair in meta_data corresponds to a batch of data.

Return type

None

monai.data.write_nifti(data, file_name, affine=None, target_affine=None, resample=True, output_spatial_shape=None, mode=GridSampleMode.BILINEAR, padding_mode=GridSamplePadMode.BORDER, align_corners=False, dtype=<class 'numpy.float64'>, output_dtype=<class 'numpy.float32'>)[source]

Write numpy data into NIfTI files to disk. This function converts data into the coordinate system defined by target_affine when target_affine is specified.

If the coordinate transform between affine and target_affine could be achieved by simply transposing and flipping data, no resampling will happen. Otherwise this function will resample data using the coordinate transform computed from affine and target_affine. Note that the shape of the resampled data may be subject to some rounding errors. For example, resampling a 20x20-pixel image from pixel size (1.5, 1.5)-mm to (3.0, 3.0)-mm space will return a 10x10-pixel image. However, resampling a 20x20-pixel image from pixel size (2.0, 2.0)-mm to (3.0, 3.0)-mm space will output a 14x14-pixel image, where the image shape is rounded from 13.333x13.333 pixels. In this case output_spatial_shape could be specified so that this function writes image data to a designated shape.

The saved affine matrix follows:

  • If affine equals target_affine, save the data with target_affine.

  • If resample=False, transform affine to new_affine based on the orientation of target_affine and save the data with new_affine.

  • If resample=True, save the data with target_affine; if output_spatial_shape is explicitly specified, the shape of the saved data is not computed from target_affine.

  • If target_affine is None, set target_affine=affine and save.

  • If affine and target_affine are None, the data will be saved with an identity matrix as the image affine.

This function assumes the NIfTI dimension notations. Spatially it supports up to three dimensions, that is, H, HW, HWD for 1D, 2D, 3D respectively. When saving multiple time steps or multiple channels data, time and/or modality axes should be appended after the first three dimensions. For example, shape of 2D eight-class segmentation probabilities to be saved could be (64, 64, 1, 8). Also, data in shape (64, 64, 8), (64, 64, 8, 1) will be considered as a single-channel 3D image.

Parameters
  • data (ndarray) – input data to write to file.

  • file_name (str) – expected file name that saved on disk.

  • affine (Optional[ndarray]) – the current affine of data. Defaults to np.eye(4)

  • target_affine (Optional[ndarray]) – before saving the (data, affine) as a Nifti1Image, transform the data into the coordinates defined by target_affine.

  • resample (bool) – whether to run resampling when the target affine could not be achieved by swapping/flipping data axes.

  • output_spatial_shape (Union[Sequence[int], ndarray, None]) – spatial shape of the output image. This option is used when resample = True.

  • mode (Union[GridSampleMode, str]) – {"bilinear", "nearest"} This option is used when resample = True. Interpolation mode to calculate output values. Defaults to "bilinear". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • padding_mode (Union[GridSamplePadMode, str]) – {"zeros", "border", "reflection"} This option is used when resample = True. Padding mode for outside grid values. Defaults to "border". See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • align_corners (bool) – Geometrically, we consider the pixels of the input as squares rather than points. See also: https://pytorch.org/docs/stable/nn.functional.html#grid-sample

  • dtype (Union[dtype, type, None]) – data type for resampling computation. Defaults to np.float64 for best precision. If None, use the data type of input data.

  • output_dtype (Union[dtype, type, None]) – data type for saving data. Defaults to np.float32.

Return type

None
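
For example, a minimal sketch writing a 3D array and resampling it into a 2mm isotropic space; the array and affines are illustrative:

import numpy as np
from monai.data import write_nifti

img = np.random.rand(64, 64, 32).astype(np.float32)  # H,W,D single-channel 3D
write_nifti(
    img,
    "out.nii.gz",
    affine=np.eye(4),                             # current 1mm isotropic spacing
    target_affine=np.diag([2.0, 2.0, 2.0, 1.0]),  # resample into 2mm isotropic space
)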

PNG format handling

Writing PNG

class monai.data.PNGSaver(output_dir='./', output_postfix='seg', output_ext='.png', resample=True, mode=InterpolateMode.NEAREST, scale=None, data_root_dir='', separate_folder=True, print_log=True)[source]

Save the data as a PNG file; it supports a single data item or a batch of data. Typically, the data are segmentation predictions: call save for a single item, or save_batch to save a batch of data together. The name of the saved file will be {input_image_name}_{output_postfix}{output_ext}, where the input image name is extracted from the provided meta data dictionary. If no meta data is provided, a running index starting from 0 is used as the filename prefix.

Parameters
  • output_dir (Union[Path, str]) – output image directory.

  • output_postfix (str) – a string appended to all output file names.

  • output_ext (str) – output file extension name.

  • resample (bool) – whether to resample and resize if providing spatial_shape in the metadata.

  • mode (Union[InterpolateMode, str]) – {"nearest", "linear", "bilinear", "bicubic", "trilinear", "area"} The interpolation mode. Defaults to "nearest". See also: https://pytorch.org/docs/stable/nn.functional.html#interpolate

  • scale (Optional[int]) – {255, 65535} postprocess data by clipping to [0, 1] and scaling [0, 255] (uint8) or [0, 65535] (uint16). Default is None to disable scaling.

  • data_root_dir (str) – if not empty, it specifies the beginning parts of the input file’s absolute path. it’s used to compute input_file_rel_path, the relative path to the file from data_root_dir, to preserve folder structure when saving in case there are files in different folders with the same file names. for example: input_file_name: /foo/bar/test1/image.png, postfix: seg, output_ext: .png, output_dir: /output, data_root_dir: /foo/bar; the output will be: /output/test1/image/image_seg.png

  • separate_folder (bool) – whether to save every file in a separate folder, for example: if input filename is image.png, postfix is seg and folder_path is output, if True, save as: output/image/image_seg.png, if False, save as output/image_seg.png. default to True.

  • print_log (bool) – whether to print log about the saved PNG file path, etc. default to True.

__init__(output_dir='./', output_postfix='seg', output_ext='.png', resample=True, mode=InterpolateMode.NEAREST, scale=None, data_root_dir='', separate_folder=True, print_log=True)[source]
Parameters
  • output_dir (Union[Path, str]) – output image directory.

  • output_postfix (str) – a string appended to all output file names.

  • output_ext (str) – output file extension name.

  • resample (bool) – whether to resample and resize if providing spatial_shape in the metadata.

  • mode (Union[InterpolateMode, str]) – {"nearest", "linear", "bilinear", "bicubic", "trilinear", "area"} The interpolation mode. Defaults to "nearest". See also: https://pytorch.org/docs/stable/nn.functional.html#interpolate

  • scale (Optional[int]) – {255, 65535} postprocess data by clipping to [0, 1] and scaling [0, 255] (uint8) or [0, 65535] (uint16). Default is None to disable scaling.

  • data_root_dir (str) – if not empty, it specifies the beginning parts of the input file’s absolute path. it’s used to compute input_file_rel_path, the relative path to the file from data_root_dir, to preserve folder structure when saving in case there are files in different folders with the same file names. for example: input_file_name: /foo/bar/test1/image.png, postfix: seg, output_ext: .png, output_dir: /output, data_root_dir: /foo/bar; the output will be: /output/test1/image/image_seg.png

  • separate_folder (bool) – whether to save every file in a separate folder, for example: if input filename is image.png, postfix is seg and folder_path is output, if True, save as: output/image/image_seg.png, if False, save as output/image_seg.png. default to True.

  • print_log (bool) – whether to print log about the saved PNG file path, etc. default to True.

save(data, meta_data=None)[source]

Save data into a png file. The meta_data could optionally have the following keys:

  • 'filename_or_obj' – for output file name creation, corresponding to filename or object.

  • 'spatial_shape' – for data output shape.

  • 'patch_index' – if the data is a patch of big image, append the patch index to filename.

If meta_data is None, use the default index (starting from 0) as the filename.

Parameters
  • data (Union[Tensor, ndarray]) – target data content to be saved as a PNG format file. The data is assumed to have shape (C,H,W), where C should be 1, 3 or 4.

  • meta_data (Optional[Dict]) – the meta data information corresponding to the data.

Raises

ValueError – When data channels is not one of [1, 3, 4].

See Also

monai.data.png_writer.write_png()

Return type

None

save_batch(batch_data, meta_data=None)[source]

Save a batch of data into png format files.

Parameters
  • batch_data (Union[Tensor, ndarray]) – target batch data content to be saved in PNG format.

  • meta_data (Optional[Dict]) – every key-value pair in meta_data corresponds to a batch of data.

Return type

None

monai.data.write_png(data, file_name, output_spatial_shape=None, mode=InterpolateMode.BICUBIC, scale=None)[source]

Write numpy data into PNG files on disk. Spatially it supports 2D data: (H,W), (H,W,3) or (H,W,4). If scale is None, the input data is expected to be of type np.uint8 or np.uint16. It’s based on the Image module in the PIL library: https://pillow.readthedocs.io/en/stable/reference/Image.html

Parameters
  • data (ndarray) – input data to write to file.

  • file_name (str) – expected file name that saved on disk.

  • output_spatial_shape (Optional[Sequence[int]]) – spatial shape of the output image.

  • mode (Union[InterpolateMode, str]) – {"nearest", "linear", "bilinear", "bicubic", "trilinear", "area"} The interpolation mode. Defaults to "bicubic". See also: https://pytorch.org/docs/stable/nn.functional.html#interpolate

  • scale (Optional[int]) – {255, 65535} postprocess data by clipping to [0, 1] and scaling to [0, 255] (uint8) or [0, 65535] (uint16). Default is None to disable scaling.

Raises

ValueError – When scale is not one of [255, 65535].

Return type

None
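
For example, a minimal sketch saving a binary mask scaled to uint8:

import numpy as np
from monai.data import write_png

mask = (np.random.rand(128, 128) > 0.5).astype(np.float32)  # values in [0, 1]
write_png(mask, "mask.png", scale=255)  # clip to [0, 1] and scale to [0, 255]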

Synthetic

monai.data.synthetic.create_test_image_2d(width, height, num_objs=12, rad_max=30, rad_min=5, noise_max=0.0, num_seg_classes=5, channel_dim=None, random_state=None)[source]

Return a noisy 2D image with num_objs circles and a 2D mask image. The maximum and minimum radii of the circles are given as rad_max and rad_min. The mask will have num_seg_classes number of classes for segmentations labeled sequentially from 1, plus a background class represented as 0. If noise_max is greater than 0 then noise will be added to the image taken from the uniform distribution on range [0,noise_max). If channel_dim is None, will create an image without channel dimension, otherwise create an image with channel dimension as first dim or last dim.

Parameters
  • width (int) – width of the image. The value should be larger than 2 * rad_max.

  • height (int) – height of the image. The value should be larger than 2 * rad_max.

  • num_objs (int) – number of circles to generate. Defaults to 12.

  • rad_max (int) – maximum circle radius. Defaults to 30.

  • rad_min (int) – minimum circle radius. Defaults to 5.

  • noise_max (float) – if greater than 0 then noise will be added to the image taken from the uniform distribution on range [0,noise_max). Defaults to 0.

  • num_seg_classes (int) – number of classes for segmentations. Defaults to 5.

  • channel_dim (Optional[int]) – if None, create an image without channel dimension, otherwise create an image with channel dimension as first dim or last dim. Defaults to None.

  • random_state (Optional[RandomState]) – the random generator to use. Defaults to np.random.

Return type

Tuple[ndarray, ndarray]
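
For example:

from monai.data.synthetic import create_test_image_2d

# a 128x128 image with 6 circles, mild noise and 3 segmentation classes (plus background 0)
img, seg = create_test_image_2d(128, 128, num_objs=6, rad_max=20, noise_max=0.1, num_seg_classes=3)
print(img.shape, seg.shape)  # (128, 128) (128, 128)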

monai.data.synthetic.create_test_image_3d(height, width, depth, num_objs=12, rad_max=30, rad_min=5, noise_max=0.0, num_seg_classes=5, channel_dim=None, random_state=None)[source]

Return a noisy 3D image and segmentation.

Parameters
  • height (int) – height of the image. The value should be larger than 2 * rad_max.

  • width (int) – width of the image. The value should be larger than 2 * rad_max.

  • depth (int) – depth of the image. The value should be larger than 2 * rad_max.

  • num_objs (int) – number of circles to generate. Defaults to 12.

  • rad_max (int) – maximum circle radius. Defaults to 30.

  • rad_min (int) – minimum circle radius. Defaults to 5.

  • noise_max (float) – if greater than 0 then noise will be added to the image taken from the uniform distribution on range [0,noise_max). Defaults to 0.

  • num_seg_classes (int) – number of classes for segmentations. Defaults to 5.

  • channel_dim (Optional[int]) – if None, create an image without channel dimension, otherwise create an image with channel dimension as first dim or last dim. Defaults to None.

  • random_state (Optional[RandomState]) – the random generator to use. Defaults to np.random.

Return type

Tuple[ndarray, ndarray]

Utilities

monai.data.utils.compute_importance_map(patch_size, mode=BlendMode.CONSTANT, sigma_scale=0.125, device='cpu')[source]

Get importance map for different weight modes.

Parameters
  • patch_size (Tuple[int, …]) – Size of the required importance map. This should be either H, W [,D].

  • mode (Union[BlendMode, str]) –

    {"constant", "gaussian"} How to blend output of overlapping windows. Defaults to "constant".

    • "constant”: gives equal weight to all predictions.

    • "gaussian”: gives less weight to predictions on edges of windows.

  • sigma_scale (Union[Sequence[float], float]) – Sigma_scale to calculate sigma for each dimension (sigma = sigma_scale * dim_size). Used for gaussian mode only.

  • device (Union[device, int, str]) – Device to put importance map on.

Raises

ValueError – When mode is not one of [“constant”, “gaussian”].

Return type

Tensor

Returns

Tensor of size patch_size.
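
For example, a Gaussian map for 64x64 sliding-window patches:

from monai.data.utils import compute_importance_map

w = compute_importance_map((64, 64), mode="gaussian", sigma_scale=0.125, device="cpu")
print(w.shape)  # torch.Size([64, 64]); edge values are weighted less than the centre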

monai.data.utils.compute_shape_offset(spatial_shape, in_affine, out_affine)[source]

Given input and output affine, compute appropriate shapes in the output space based on the input array’s shape. This function also returns the offset to put the shape in a good position with respect to the world coordinate system.

Parameters
  • spatial_shape (Union[ndarray, Sequence[int]]) – input array’s shape

  • in_affine (matrix) – 2D affine matrix

  • out_affine (matrix) – 2D affine matrix

Return type

Tuple[ndarray, ndarray]

monai.data.utils.convert_tables_to_dicts(dfs, row_indices=None, col_names=None, col_types=None, col_groups=None, **kwargs)[source]

Utility to join pandas tables, select rows and columns, and generate groups. Returns a list of dictionaries; each dictionary maps to a row of the joined tables.

Parameters
  • dfs – data table in pandas Dataframe format. if providing a list of tables, will join them.

  • row_indices (Optional[Sequence[Union[str, int]]]) – indices of the expected rows to load. it should be a list, every item can be an int number or a range [start, end) for the indices. for example: row_indices=[[0, 100], 200, 201, 202, 300]. if None, load all the rows in the file.

  • col_names (Optional[Sequence[str]]) – names of the expected columns to load. if None, load all the columns.

  • col_types (Optional[Dict[str, Optional[Dict[str, Any]]]]) –

    type and default value to convert the loaded columns, if None, use original data. it should be a dictionary, every item maps to an expected column, the key is the column name and the value is None or a dictionary to define the default value and data type. the supported keys in dictionary are: [“type”, “default”], and note that the value of default should not be None. for example:

    col_types = {
        "subject_id": {"type": str},
        "label": {"type": int, "default": 0},
        "ehr_0": {"type": float, "default": 0.0},
        "ehr_1": {"type": float, "default": 0.0},
    }
    

  • col_groups (Optional[Dict[str, Sequence[str]]]) – args to group the loaded columns to generate a new column. it should be a dictionary, every item maps to a group: the key will be the new column name, the value is the names of columns to combine. for example: col_groups={"ehr": [f"ehr_{i}" for i in range(10)], "meta": ["meta_1", "meta_2"]}

  • kwargs – additional arguments for pandas.merge() API to join tables.

Return type

List[Dict[str, Any]]
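
For example, a minimal sketch grouping two columns into a new one; the table content is illustrative:

import pandas as pd

from monai.data.utils import convert_tables_to_dicts

df = pd.DataFrame({"subject_id": ["s1", "s2"], "ehr_0": [1.0, 2.0], "ehr_1": [3.0, 4.0]})
items = convert_tables_to_dicts(dfs=df, col_groups={"ehr": ["ehr_0", "ehr_1"]})
# items[0] is e.g. {'subject_id': 's1', 'ehr_0': 1.0, 'ehr_1': 3.0, 'ehr': [1.0, 3.0]}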

monai.data.utils.correct_nifti_header_if_necessary(img_nii)[source]

Check the format of the nifti object’s header, and update the header if needed so that in the updated image pixdim matches the affine.

Parameters

img_nii – nifti image object

monai.data.utils.create_file_basename(postfix, input_file_name, folder_path, data_root_dir='', separate_folder=True, patch_index=None)[source]

Utility function to create the path to the output file based on the input filename (file name extension is not added by this function). When data_root_dir is not specified, the output file name is:

folder_path/input_file_name (no ext.)/input_file_name (no ext.)[_postfix]

otherwise the relative path with respect to data_root_dir will be inserted, for example: input_file_name: /foo/bar/test1/image.png, postfix: seg folder_path: /output, data_root_dir: /foo/bar, output will be: /output/test1/image/image_seg

Parameters
  • postfix (str) – output name’s postfix

  • input_file_name (str) – path to the input image file.

  • folder_path (Union[Path, str]) – path for the output file

  • data_root_dir (str) – if not empty, it specifies the beginning parts of the input file’s absolute path. This is used to compute input_file_rel_path, the relative path to the file from data_root_dir to preserve folder structure when saving in case there are files in different folders with the same file names.

  • separate_folder (bool) – whether to save every file in a separate folder, for example: if input filename is image.nii, postfix is seg and folder_path is output, if True, save as: output/image/image_seg.nii, if False, save as output/image_seg.nii. default to True.

  • patch_index (Optional[int]) – if not None, append the patch index to filename.

Return type

str
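
For example, reproducing the path construction described above:

from monai.data.utils import create_file_basename

path = create_file_basename(
    postfix="seg",
    input_file_name="/foo/bar/test1/image.png",
    folder_path="/output",
    data_root_dir="/foo/bar",
)
print(path)  # /output/test1/image/image_seg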

monai.data.utils.decollate_batch(batch, detach=True)[source]

De-collate a batch of data (for example, as produced by a DataLoader).

Returns a list of structures with the original tensor’s 0-th dimension sliced into elements using torch.unbind.

Images originally stored as (B,C,H,W,[D]) will be returned as (C,H,W,[D]). Other information, such as metadata, may have been stored in a list (or a list inside nested dictionaries). In this case we return the element of the list corresponding to the batch idx.

Return types aren’t guaranteed to be the same as the original, since numpy arrays will have been converted to torch.Tensor, sequences may be converted to lists of tensors, mappings may be converted into dictionaries.

For example:

batch_data = {
    "image": torch.rand((2,1,10,10)),
    "image_meta_dict": {"scl_slope": torch.Tensor([0.0, 0.0])}
}
out = decollate_batch(batch_data)
print(len(out))
>>> 2

print(out[0])
>>> {'image': tensor([[[4.3549e-01...43e-01]]]), 'image_meta_dict': {'scl_slope': 0.0}}

batch_data = [torch.rand((2,1,10,10)), torch.rand((2,3,5,5))]
out = decollate_batch(batch_data)
print(out[0])
>>> [tensor([[[4.3549e-01...43e-01]]], tensor([[[5.3435e-01...45e-01]]])]

batch_data = torch.rand((2,1,10,10))
out = decollate_batch(batch_data)
print(out[0])
>>> tensor([[[4.3549e-01...43e-01]]])
Parameters
  • batch – data to be de-collated.

  • detach (bool) – whether to detach the tensors. Scalars tensors will be detached into number types instead of torch tensors.

monai.data.utils.dense_patch_slices(image_size, patch_size, scan_interval)[source]

Enumerate all slices defining ND patches of size patch_size from an image_size input image.

Parameters
  • image_size (Sequence[int]) – dimensions of image to iterate over

  • patch_size (Sequence[int]) – size of patches to generate slices

  • scan_interval (Sequence[int]) – dense patch sampling interval

Return type

List[Tuple[slice, …]]

Returns

a list of slice objects defining each patch
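
For example:

from monai.data.utils import dense_patch_slices

slices = dense_patch_slices(image_size=(32, 32), patch_size=(16, 16), scan_interval=(16, 16))
print(len(slices))  # 4 non-overlapping 16x16 patches
print(slices[0])    # (slice(0, 16, None), slice(0, 16, None))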

monai.data.utils.get_random_patch(dims, patch_size, rand_state=None)[source]

Returns a tuple of slices to define a random patch in an array of shape dims with size patch_size, or as close to it as possible within the given dimensions. It is expected that patch_size is a valid patch for a source of shape dims as returned by get_valid_patch_size.

Parameters
  • dims (Sequence[int]) – shape of source array

  • patch_size (Sequence[int]) – shape of patch size to generate

  • rand_state (Optional[RandomState]) – a random state object to generate random numbers from

Returns

a tuple of slice objects defining the patch

Return type

(tuple of slice)

monai.data.utils.get_valid_patch_size(image_size, patch_size)[source]

Given an image of dimensions image_size, return a patch size tuple taking the dimension from patch_size if this is not 0/None. Otherwise, or if patch_size is shorter than image_size, the dimension from image_size is taken. This ensures the returned patch size is within the bounds of image_size. If patch_size is a single number this is interpreted as a patch of the same dimensionality of image_size with that size in each dimension.

Return type

Tuple[int, …]
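
For example:

from monai.data.utils import get_valid_patch_size

# 0 selects the whole dimension; oversized requests are clipped to the image bounds
print(get_valid_patch_size((64, 64, 32), (32, 0, 64)))  # (32, 64, 32)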

monai.data.utils.is_supported_format(filename, suffixes)[source]

Verify whether the format of the specified file or files matches the supported suffixes. If suffixes is None, skip the verification and return True.

Parameters
  • filename (Union[Sequence[str], str]) – file name or a list of file names to read. if a list of files, verify all the suffixes.

  • suffixes (Sequence[str]) – all the supported image suffixes of current reader, must be a list of lower case suffixes.

Return type

bool

monai.data.utils.iter_patch(arr, patch_size=0, start_pos=(), copy_back=True, mode=NumpyPadMode.WRAP, **pad_opts)[source]

Yield successive patches from arr of size patch_size. The iteration can start from position start_pos in arr but drawing from a padded array extended by the patch_size in each dimension (so these coordinates can be negative to start in the padded region). If copy_back is True the values from each patch are written back to arr.

Parameters
  • arr (ndarray) – array to iterate over

  • patch_size (Union[Sequence[int], int]) – size of patches to generate slices for, 0 or None selects whole dimension

  • start_pos (Sequence[int]) – starting position in the array, default is 0 for each dimension

  • copy_back (bool) – if True data from the yielded patches is copied back to arr once the generator completes

  • mode (Union[NumpyPadMode, str]) – {"constant", "edge", "linear_ramp", "maximum", "mean", "median", "minimum", "reflect", "symmetric", "wrap", "empty"} One of the listed string values or a user supplied function. Defaults to "wrap". See also: https://numpy.org/doc/1.18/reference/generated/numpy.pad.html

  • pad_opts (Dict) – padding options, see numpy.pad

Yields

Patches of array data from arr, which are views into a padded array and can be modified; if copy_back is True these changes will be reflected in arr once the iteration completes.

Note

coordinate format is:

[1st_dim_start, 1st_dim_end, 2nd_dim_start, 2nd_dim_end, …, Nth_dim_start, Nth_dim_end]
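
For example, a minimal sketch iterating over non-overlapping 2x2 patches of a small array:

import numpy as np

from monai.data.utils import iter_patch

arr = np.arange(16, dtype=np.float32).reshape(4, 4)
for patch in iter_patch(arr, patch_size=(2, 2), copy_back=False):
    print(patch.shape)  # (2, 2), four patches in total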

monai.data.utils.iter_patch_slices(dims, patch_size, start_pos=())[source]

Yield successive tuples of slices defining patches of size patch_size from an array of dimensions dims. The iteration starts from position start_pos in the array, or from the origin if this isn’t provided. Each patch is chosen in a contiguous grid using the first dimension as the least significant ordering.

Parameters
  • dims (Sequence[int]) – dimensions of array to iterate over

  • patch_size (Union[Sequence[int], int]) – size of patches to generate slices for, 0 or None selects whole dimension

  • start_pos (Sequence[int]) – starting position in the array, default is 0 for each dimension

Yields

Tuples of slice objects defining each patch

Return type

Generator[Tuple[slice, …], None, None]

monai.data.utils.json_hashing(item)[source]
Parameters

item – data item to be hashed

Returns: the corresponding hash key

Return type

bytes

monai.data.utils.list_data_collate(batch)[source]

Enhancement for the PyTorch DataLoader default collate. If the dataset already returns a list of batch data generated by the transforms, all data are merged into one list first; the result is then the same as the default collate behavior.

Note

This collate function is needed when applying transforms that can generate batch data.

monai.data.utils.no_collation(x)[source]

No collation operation is performed.

monai.data.utils.pad_list_data_collate(batch, method=Method.SYMMETRIC, mode=NumpyPadMode.CONSTANT, **np_kwargs)[source]

Function version of monai.transforms.croppad.batch.PadListDataCollate.

Same as MONAI’s list_data_collate, except any tensors are centrally padded to match the shape of the biggest tensor in each dimension. This transform is useful if some of the applied transforms generate batch data of different sizes.

This can be used on both list and dictionary data. In the case of the dictionary data, this transform will be added to the list of invertible transforms.

The inverse can be called using the static method: monai.transforms.croppad.batch.PadListDataCollate.inverse.

Parameters
  • batch – batch of data to pad and collate.

  • method (Union[Method, str]) – padding method (see monai.transforms.SpatialPad).

  • mode (Union[NumpyPadMode, str]) – padding mode (see monai.transforms.SpatialPad).

  • np_kwargs – other arguments for the numpy.pad function.

monai.data.utils.partition_dataset(data, ratios=None, num_partitions=None, shuffle=False, seed=0, drop_last=False, even_divisible=False)[source]

Split the dataset into N partitions. It supports shuffling based on a specified random seed. Returns a list of datasets, each containing one partition of the original dataset. The split can be based on specified ratios, or the data can be split evenly into num_partitions. Refer to: https://pytorch.org/docs/stable/distributed.html#module-torch.distributed.launch.

Note

It can also be used to partition a dataset for the ranks in distributed training. For example, partition the dataset before training and use CacheDataset, so that every rank trains with its own data. This can avoid duplicated caching content in each rank, but will not do a global shuffle before every epoch:

data_partition = partition_dataset(
    data=train_files,
    num_partitions=dist.get_world_size(),
    shuffle=True,
    even_divisible=True,
)[dist.get_rank()]

train_ds = SmartCacheDataset(
    data=data_partition,
    transform=train_transforms,
    replace_rate=0.2,
    cache_num=15,
)
Parameters
  • data (Sequence) – input dataset to split, expect a list of data.

  • ratios (Optional[Sequence[float]]) – a list of ratio number to split the dataset, like [8, 1, 1].

  • num_partitions (Optional[int]) – expected number of the partitions to evenly split, only works when ratios not specified.

  • shuffle (bool) – whether to shuffle the original dataset before splitting.

  • seed (int) – random seed to shuffle the dataset, only works when shuffle is True.

  • drop_last (bool) – only works when even_divisible is False and no ratios specified. if True, will drop the tail of the data to make it evenly divisible across partitions. if False, will add extra indices to make the data evenly divisible across partitions.

  • even_divisible (bool) – if True, guarantee every partition has same length.

Examples:

>>> data = [1, 2, 3, 4, 5]
>>> partition_dataset(data, ratios=[0.6, 0.2, 0.2], shuffle=False)
[[1, 2, 3], [4], [5]]
>>> partition_dataset(data, num_partitions=2, shuffle=False)
[[1, 3, 5], [2, 4]]
>>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=True, drop_last=True)
[[1, 3], [2, 4]]
>>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=True, drop_last=False)
[[1, 3, 5], [2, 4, 1]]
>>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=False, drop_last=False)
[[1, 3, 5], [2, 4]]
monai.data.utils.partition_dataset_classes(data, classes, ratios=None, num_partitions=None, shuffle=False, seed=0, drop_last=False, even_divisible=False)[source]

Split the dataset into N partitions based on the given class labels. It ensures the same ratio of classes in every partition. Other behaviors are the same as monai.data.partition_dataset.

Parameters
  • data (Sequence) – input dataset to split, expect a list of data.

  • classes (Sequence[int]) – a list of labels to help split the data, the length must match the length of data.

  • ratios (Optional[Sequence[float]]) – a list of ratio number to split the dataset, like [8, 1, 1].

  • num_partitions (Optional[int]) – expected number of the partitions to evenly split, only works when no ratios.

  • shuffle (bool) – whether to shuffle the original dataset before splitting.

  • seed (int) – random seed to shuffle the dataset, only works when shuffle is True.

  • drop_last (bool) – only works when even_divisible is False and no ratios specified. if True, will drop the tail of the data to make it evenly divisible across partitions. if False, will add extra indices to make the data evenly divisible across partitions.

  • even_divisible (bool) – if True, guarantee every partition has same length.

Examples:

>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
>>> classes = [2, 0, 2, 1, 3, 2, 2, 0, 2, 0, 3, 3, 1, 3]
>>> partition_dataset_classes(data, classes, shuffle=False, ratios=[2, 1])
[[2, 8, 4, 1, 3, 6, 5, 11, 12], [10, 13, 7, 9, 14]]
monai.data.utils.pickle_hashing(item, protocol=4)[source]
Parameters
  • item – data item to be hashed

  • protocol – protocol version used for pickling, defaults to pickle.HIGHEST_PROTOCOL.

Returns: the corresponding hash key

Return type

bytes

monai.data.utils.rectify_header_sform_qform(img_nii)[source]

Look at the sform and qform of the nifti object and correct them if there are any incompatibilities with the pixel dimensions.

Adapted from https://github.com/NifTK/NiftyNet/blob/v0.6.0/niftynet/io/misc_io.py

Parameters

img_nii – nifti image object

monai.data.utils.rep_scalar_to_batch(batch_data)[source]

Utility to replicate the scalar items of a list or dictionary to ensure all the items have a batch dimension. It leverages decollate_batch(detach=False) to filter out the scalar items.

Return type

Union[List, Dict]

monai.data.utils.select_cross_validation_folds(partitions, folds)[source]

Select cross validation data based on data partitions and the specified fold index. If a list of fold indices is provided, the partitions of these folds are concatenated.

Parameters
  • partitions (Sequence[Iterable]) – a sequence of datasets, each item is an iterable.

  • folds (Union[Sequence[int], int]) – the indices of the partitions to be combined.

Return type

List

Returns

A list of combined datasets.

Example:

>>> partitions = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
>>> select_cross_validation_folds(partitions, 2)
[5, 6]
>>> select_cross_validation_folds(partitions, [1, 2])
[3, 4, 5, 6]
>>> select_cross_validation_folds(partitions, [-1, 2])
[9, 10, 5, 6]
monai.data.utils.set_rnd(obj, seed)[source]

Set seed or random state for all randomisable properties of obj.

Parameters

seed (int) – set the random state with an integer seed.

Return type

int

monai.data.utils.sorted_dict(item, key=None, reverse=False)[source]

Return a new sorted dictionary from the item.

monai.data.utils.to_affine_nd(r, affine)[source]

Using elements from affine, create a new affine matrix by assigning the rotation/zoom/scaling matrix and the translation vector.

when r is an integer, output is an (r+1)x(r+1) matrix, where the top left kxk elements are copied from affine, the last column of the output affine is copied from affine’s last column. k is determined by min(r, len(affine) - 1).

when r is an affine matrix, the output has the same shape as r, the top left kxk elements are copied from affine, the last column of the output affine is copied from affine’s last column. k is determined by min(len(r) - 1, len(affine) - 1).

Parameters
  • r (int or matrix) – number of spatial dimensions or an output affine to be filled.

  • affine (matrix) – 2D affine matrix

Raises
  • ValueError – When affine dimensions is not 2.

  • ValueError – When r is nonpositive.

Return type

ndarray

Returns

an (r+1) x (r+1) matrix
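
For example, promoting a 2D affine to 3D:

import numpy as np

from monai.data.utils import to_affine_nd

affine_2d = np.array([[2.0, 0.0, 10.0],
                      [0.0, 2.0, 20.0],
                      [0.0, 0.0, 1.0]])
affine_3d = to_affine_nd(3, affine_2d)
print(affine_3d.shape)  # (4, 4): top-left 2x2 block and translation copied from affine_2d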

monai.data.utils.worker_init_fn(worker_id)[source]

Callback function for PyTorch DataLoader worker_init_fn. It can set a different random seed for the transforms in different workers.

Return type

None

monai.data.utils.zoom_affine(affine, scale, diagonal=True)[source]

To make the column norm of affine the same as scale. If diagonal is False, returns an affine that combines orthogonal rotation and the new scale. This is done by first decomposing affine, then setting the zoom factors to scale, and composing a new affine; the shearing factors are removed. If diagonal is True, returns a diagonal matrix, with the scaling factors set to the diagonal elements. This function always returns an affine with zero translations.

Parameters
  • affine (nxn matrix) – a square matrix.

  • scale (Sequence[float]) – new scaling factor along each dimension. if the components of the scale are non-positive values, will use the corresponding components of the original pixdim, which is computed from the affine.

  • diagonal (bool) – whether to return a diagonal scaling matrix. Defaults to True.

Raises
  • ValueError – When affine is not a square matrix.

  • ValueError – When scale contains a nonpositive scalar.

Returns

the updated n x n affine.
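
For example:

import numpy as np

from monai.data.utils import zoom_affine

affine = np.diag([1.5, 1.5, 1.0, 1.0])
new_affine = zoom_affine(affine, scale=(3.0, 3.0, 1.0), diagonal=True)
print(np.diag(new_affine))  # [3. 3. 1. 1.], with zero translation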

Partition Dataset

monai.data.partition_dataset(data, ratios=None, num_partitions=None, shuffle=False, seed=0, drop_last=False, even_divisible=False)[source]

Split the dataset into N partitions. It supports shuffling based on a specified random seed. Returns a list of datasets, each containing one partition of the original dataset. The split can be based on specified ratios, or the data can be split evenly into num_partitions. Refer to: https://pytorch.org/docs/stable/distributed.html#module-torch.distributed.launch.

Note

It can also be used to partition a dataset for the ranks in distributed training. For example, partition the dataset before training and use CacheDataset, so that every rank trains with its own data. This can avoid duplicated caching content in each rank, but will not do a global shuffle before every epoch:

data_partition = partition_dataset(
    data=train_files,
    num_partitions=dist.get_world_size(),
    shuffle=True,
    even_divisible=True,
)[dist.get_rank()]

train_ds = SmartCacheDataset(
    data=data_partition,
    transform=train_transforms,
    replace_rate=0.2,
    cache_num=15,
)
Parameters
  • data (Sequence) – input dataset to split, expect a list of data.

  • ratios (Optional[Sequence[float]]) – a list of ratio number to split the dataset, like [8, 1, 1].

  • num_partitions (Optional[int]) – expected number of the partitions to evenly split, only works when ratios not specified.

  • shuffle (bool) – whether to shuffle the original dataset before splitting.

  • seed (int) – random seed to shuffle the dataset, only works when shuffle is True.

  • drop_last (bool) – only works when even_divisible is False and no ratios specified. if True, will drop the tail of the data to make it evenly divisible across partitions. if False, will add extra indices to make the data evenly divisible across partitions.

  • even_divisible (bool) – if True, guarantee every partition has same length.

Examples:

>>> data = [1, 2, 3, 4, 5]
>>> partition_dataset(data, ratios=[0.6, 0.2, 0.2], shuffle=False)
[[1, 2, 3], [4], [5]]
>>> partition_dataset(data, num_partitions=2, shuffle=False)
[[1, 3, 5], [2, 4]]
>>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=True, drop_last=True)
[[1, 3], [2, 4]]
>>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=True, drop_last=False)
[[1, 3, 5], [2, 4, 1]]
>>> partition_dataset(data, num_partitions=2, shuffle=False, even_divisible=False, drop_last=False)
[[1, 3, 5], [2, 4]]

Partition Dataset based on classes

monai.data.partition_dataset_classes(data, classes, ratios=None, num_partitions=None, shuffle=False, seed=0, drop_last=False, even_divisible=False)[source]

Split the dataset into N partitions based on the given class labels. It ensures the same ratio of classes in every partition. Other behaviors are the same as monai.data.partition_dataset.

Parameters
  • data (Sequence) – input dataset to split, expect a list of data.

  • classes (Sequence[int]) – a list of labels to help split the data, the length must match the length of data.

  • ratios (Optional[Sequence[float]]) – a list of ratio number to split the dataset, like [8, 1, 1].

  • num_partitions (Optional[int]) – expected number of the partitions to evenly split, only works when no ratios.

  • shuffle (bool) – whether to shuffle the original dataset before splitting.

  • seed (int) – random seed to shuffle the dataset, only works when shuffle is True.

  • drop_last (bool) – only works when even_divisible is False and no ratios specified. if True, will drop the tail of the data to make it evenly divisible across partitions. if False, will add extra indices to make the data evenly divisible across partitions.

  • even_divisible (bool) – if True, guarantee every partition has same length.

Examples:

>>> data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
>>> classes = [2, 0, 2, 1, 3, 2, 2, 0, 2, 0, 3, 3, 1, 3]
>>> partition_dataset_classes(data, classes, shuffle=False, ratios=[2, 1])
[[2, 8, 4, 1, 3, 6, 5, 11, 12], [10, 13, 7, 9, 14]]

DistributedSampler

class monai.data.DistributedSampler(dataset, even_divisible=True, num_replicas=None, rank=None, shuffle=True, **kwargs)[source]

Enhance PyTorch DistributedSampler to support non-evenly divisible sampling.

Parameters
  • dataset (Dataset) – Dataset used for sampling.

  • even_divisible (bool) – if False, different ranks can have different data length. for example, input data: [1, 2, 3, 4, 5], rank 0: [1, 3, 5], rank 1: [2, 4].

  • num_replicas (Optional[int]) – number of processes participating in distributed training. by default, world_size is retrieved from the current distributed group.

  • rank (Optional[int]) – rank of the current process within num_replicas. by default, rank is retrieved from the current distributed group.

  • shuffle (bool) – if True, sampler will shuffle the indices, default to True.

  • kwargs – additional arguments for DistributedSampler super class, can be seed and drop_last.

For more information about DistributedSampler, please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.distributed.DistributedSampler.
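
A minimal sketch, assuming torch.distributed has already been initialized so that the world size and rank can be retrieved:

from monai.data import DataLoader, Dataset, DistributedSampler

ds = Dataset(data=[1, 2, 3, 4, 5])
sampler = DistributedSampler(dataset=ds, even_divisible=True, shuffle=True)
loader = DataLoader(ds, batch_size=2, sampler=sampler)
for epoch in range(2):
    sampler.set_epoch(epoch)  # vary the shuffling across epochs
    for batch in loader:
        ...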

DistributedWeightedRandomSampler

class monai.data.DistributedWeightedRandomSampler(dataset, weights, num_samples_per_rank=None, generator=None, even_divisible=True, num_replicas=None, rank=None, shuffle=True, **kwargs)[source]

Extend the DistributedSampler to support weighted sampling. Refer to torch.utils.data.WeightedRandomSampler, for more details please check: https://pytorch.org/docs/stable/data.html#torch.utils.data.WeightedRandomSampler.

Parameters
  • dataset (Dataset) – Dataset used for sampling.

  • weights (Sequence[float]) – a sequence of weights, not necessarily summing up to one, whose length should exactly match that of the full dataset.

  • num_samples_per_rank (Optional[int]) – number of samples to draw for every rank, sample from the distributed subset of dataset. if None, default to the length of dataset split by DistributedSampler.

  • generator (Optional[Generator]) – PyTorch Generator used in sampling.

  • even_divisible (bool) – if False, different ranks can have different data length. for example, input data: [1, 2, 3, 4, 5], rank 0: [1, 3, 5], rank 1: [2, 4].

  • num_replicas (Optional[int]) – number of processes participating in distributed training. by default, world_size is retrieved from the current distributed group.

  • rank (Optional[int]) – rank of the current process within num_replicas. by default, rank is retrieved from the current distributed group.

  • shuffle (bool) – if True, sampler will shuffle the indices, default to True.

  • kwargs – additional arguments for DistributedSampler super class, can be seed and drop_last.

DatasetSummary

class monai.data.DatasetSummary(dataset, image_key='image', label_key='label', meta_key_postfix='meta_dict', num_workers=0, **kwargs)[source]

This class provides a way to calculate a reasonable output voxel spacing according to the input dataset. The achieved values can be used to resample the input in 3D segmentation tasks (for example, as the pixdim parameter in monai.transforms.Spacingd). In addition, it supports computing the mean, std, min and max intensities of the input, and these statistics are helpful for image normalization (for example, in monai.transforms.ScaleIntensityRanged and monai.transforms.NormalizeIntensityd).

The algorithm for calculation refers to: Automated Design of Deep Learning Methods for Biomedical Image Segmentation.

Parameters
  • dataset (Dataset) – dataset from which to load the data.

  • image_key (Optional[str]) – key name of images (default: image).

  • label_key (Optional[str]) – key name of labels (default: label).

  • meta_key_postfix (str) – use {image_key}_{meta_key_postfix} to fetch the meta data from dict, the meta data is a dictionary object (default: meta_dict).

  • num_workers (int) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process (default: 0).

  • kwargs – other parameters (except batch_size) for DataLoader (this class forces to use batch_size=1).
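
A minimal sketch, assuming train_files is a list of data dictionaries and that the get_target_spacing and calculate_statistics methods are available in the installed MONAI version:

from monai.data import Dataset, DatasetSummary
from monai.transforms import LoadImaged

# train_files is an assumed list of dicts: [{"image": "...", "label": "..."}, ...]
ds = Dataset(data=train_files, transform=LoadImaged(keys=["image", "label"]))
summary = DatasetSummary(ds, image_key="image", label_key="label")
spacing = summary.get_target_spacing()  # suggested voxel spacing, e.g. for Spacingd
summary.calculate_statistics()          # computes mean/std/min/max intensity statistics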

Decathlon Datalist

monai.data.load_decathlon_datalist(data_list_file_path, is_segmentation=True, data_list_key='training', base_dir=None)[source]

Load image/label paths of the decathlon challenge from a JSON file.

The JSON file is similar to the dataset.json files you get from http://medicaldecathlon.com/.

Parameters
  • data_list_file_path (str) – the path to the json file of datalist.

  • is_segmentation (bool) – whether the datalist is for segmentation task, default is True.

  • data_list_key (str) – the key to get a list of dictionary to be used, default is “training”.

  • base_dir (Optional[str]) – the base directory of the dataset, if None, use the datalist directory.

Raises
  • ValueError – When data_list_file_path does not point to a file.

  • ValueError – When data_list_key is not specified in the data list file.

Returns a list of data items, each of which is a dict keyed by element names, for example:

[
    {'image': '/workspace/data/chest_19.nii.gz',  'label': 0},
    {'image': '/workspace/data/chest_31.nii.gz',  'label': 1}
]
Return type

List[Dict]
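
For example (the dataset path is hypothetical):

from monai.data import load_decathlon_datalist

train_items = load_decathlon_datalist(
    "/workspace/data/Task09_Spleen/dataset.json",
    is_segmentation=True,
    data_list_key="training",
)
# each item is a dict, e.g. {'image': '.../imagesTr/xyz.nii.gz', 'label': '.../labelsTr/xyz.nii.gz'}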

DataLoader

class monai.data.DataLoader(dataset, num_workers=0, **kwargs)[source]

Provides an iterable over the given dataset. It inherits the PyTorch DataLoader and adds enhanced collate_fn and worker_fn by default.

Although this class could be configured to be the same as torch.utils.data.DataLoader, its default configuration is recommended, mainly for the following extra features:

  • It handles MONAI randomizable objects with appropriate random state management for deterministic behaviour.

  • It is aware of patch-based transforms (such as monai.transforms.RandSpatialCropSamplesDict) that generate multiple samples per input, and collates them with enhanced behaviour. See: monai.transforms.Compose.

For more details about torch.utils.data.DataLoader, please see: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader.

For example, to construct a randomized dataset and iterate with the data loader:

import torch

from monai.data import DataLoader
from monai.transforms import Randomizable


class RandomDataset(torch.utils.data.Dataset, Randomizable):
    def __getitem__(self, index):
        return self.R.randint(0, 1000, (1,))

    def __len__(self):
        return 16


dataset = RandomDataset()
dataloader = DataLoader(dataset, batch_size=2, num_workers=4)
for epoch in range(2):
    for i, batch in enumerate(dataloader):
        print(epoch, i, batch.data.numpy().flatten().tolist())
Parameters
  • dataset (Dataset) – dataset from which to load the data.

  • num_workers (int) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)

  • kwargs – other parameters for PyTorch DataLoader.

ThreadBuffer

class monai.data.ThreadBuffer(src, buffer_size=1, timeout=0.01)[source]

Iterates over values from self.src in a separate thread, yielding them in the current thread. This allows values to be queued up asynchronously. The internal thread will continue running so long as the source has values or until the stop() method is called.

One issue raised by using a thread in this way is that the source object is being iterated over during the lifetime of the thread, so if the thread hasn’t finished, another attempt to iterate over the source will raise an exception or yield unexpected results. To ensure the thread releases the iteration and proper cleanup is done, the stop() method must be called, which will join with the thread.

Parameters
  • src – Source data iterable

  • buffer_size (int) – Number of items to buffer from the source

  • timeout (float) – Time to wait for an item from the buffer, or to wait while the buffer is full when adding items
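
For example, a minimal sketch wrapping a DataLoader so that batches are prepared asynchronously:

from monai.data import DataLoader, Dataset, ThreadBuffer

ds = Dataset(data=list(range(8)))
loader = DataLoader(ds, batch_size=2)
buffer = ThreadBuffer(loader, buffer_size=1)
for batch in buffer:  # values are generated in a background thread
    print(batch)
# call buffer.stop() if iteration is abandoned early, so the thread is joined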

TestTimeAugmentation

class monai.data.TestTimeAugmentation(transform, batch_size, num_workers, inferrer_fn, device='cpu', image_key='image', orig_key='label', nearest_interp=True, orig_meta_keys=None, meta_key_postfix='meta_dict', return_full_data=False, progress=True)[source]

Class for performing test time augmentations. This will pass the same image through the network multiple times.

The user passes transform(s) to be applied to each realisation, and provided that at least one of those transforms is random, the network’s output will vary. Provided that inverse transformations exist for all supplied spatial transforms, the inverse can be applied to each realisation of the network’s output. Once in the same spatial reference, the results can then be combined and metrics computed.

Test time augmentations are a useful feature for computing network uncertainty, as well as observing the network’s dependency on the applied random transforms.

Reference:

Wang et al., Aleatoric uncertainty estimation with test-time augmentation for medical image segmentation with convolutional neural networks, https://doi.org/10.1016/j.neucom.2019.01.103

Parameters
  • transform (InvertibleTransform) – transform (or composed) to be applied to each realisation. At least one transform must be of type Randomizable. All random transforms must be of type InvertibleTransform.

  • batch_size (int) – number of realisations to infer at once.

  • num_workers (int) – how many subprocesses to use for data.

  • inferrer_fn (Callable) – function to use to perform inference.

  • device (Union[str, device]) – device on which to perform inference.

  • image_key – key used to extract image from input dictionary.

  • orig_key – the key of the original input data in the dict. will get the applied transform information for this input data, then invert them for the expected data with image_key.

  • orig_meta_keys (Optional[str]) – the key of the meta data of original input data, will get the affine, data_shape, etc. the meta data is a dictionary object which contains: filename, original_shape, etc. if None, will try to construct meta_keys by {orig_key}_{meta_key_postfix}.

  • meta_key_postfix – use key_{postfix} to fetch the meta data according to the key data, default is meta_dict, the meta data is a dictionary object. For example, to handle key image, read/write affine matrices from the metadata image_meta_dict dictionary’s affine field. this arg only works when meta_keys=None.

  • return_full_data (bool) – normally, metrics are returned (mode, mean, std, vvc). Setting this flag to True will return the full data. Dimensions will be same size as when passing a single image through inferrer_fn, with a dimension appended equal in size to num_examples (N), i.e., [N,C,H,W,[D]].

  • progress (bool) – whether to display a progress bar.

Example

transform = RandAffined(keys, ...)
post_trans = Compose([Activations(sigmoid=True), AsDiscrete(threshold_values=True)])

tt_aug = TestTimeAugmentation(
    transform, batch_size=5, num_workers=0, inferrer_fn=lambda x: post_trans(model(x)), device=device
)
mode, mean, std, vvc = tt_aug(test_data)