Applications

Datasets

class monai.apps.MedNISTDataset(root_dir, section, transform=<monai.transforms.io.dictionary.LoadPNGd object>, download=False, seed=0, val_frac=0.1, test_frac=0.1, cache_num=9223372036854775807, cache_rate=1.0, num_workers=0)[source]

A dataset that automatically downloads the MedNIST data and generates items for training, validation or test. It is based on CacheDataset to accelerate the training process.

Parameters
  • root_dir (str) – target directory to download and load MedNIST dataset.

  • section (str) – expected data section, can be: training, validation or test.

  • transform (Callable[…, Any]) – transforms to execute operations on input data. The default transform is LoadPNGd, which loads data into a numpy array with [H, W] shape. For further usage, use AddChanneld to convert the shape to [C, H, W].

  • download (bool) – whether to download and extract the MedNIST dataset from the resource link, default is False. If the expected file already exists, downloading is skipped even when this is set to True. The user can manually copy the MedNIST.tar.gz file or the MedNIST folder to the root directory.

  • seed (int) – random seed used to split the data into training, validation and test datasets, default is 0.

  • val_frac (float) – fraction of the whole dataset used for validation, default is 0.1.

  • test_frac (float) – fraction of the whole dataset used for test, default is 0.1.

  • cache_num (int) – number of items to be cached, default is sys.maxsize. The number actually cached is the minimum of (cache_num, data_length x cache_rate, data_length).

  • cache_rate (float) – fraction of the total data to cache, default is 1.0 (cache all). The number actually cached is the minimum of (cache_num, data_length x cache_rate, data_length).

  • num_workers (int) – the number of worker threads to use. If 0, a single thread is used. Default is 0.

Raises
  • ValueError – root_dir must be a directory.

  • RuntimeError – can not find dataset directory, please use download=True to download it.
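
The interaction of seed, val_frac and test_frac can be illustrated with a minimal sketch. It is not the library's implementation: the function name assign_sections and the per-item draw are assumptions made for illustration, and Python's random.Random stands in for whatever random state the dataset uses internally. It only shows how a fixed seed yields a reproducible three-way split with roughly the requested fractions.

```python
import random
from typing import List


def assign_sections(num_items: int, seed: int = 0,
                    val_frac: float = 0.1, test_frac: float = 0.1) -> List[str]:
    """Hypothetical helper: label each item as training, validation or test."""
    rng = random.Random(seed)  # private generator, so the split depends only on seed
    sections = []
    for _ in range(num_items):
        r = rng.random()  # uniform draw in [0, 1)
        if r < test_frac:
            sections.append("test")
        elif r < test_frac + val_frac:
            sections.append("validation")
        else:
            sections.append("training")
    return sections


labels = assign_sections(1000, seed=0)
print({name: labels.count(name) for name in ("training", "validation", "test")})
```

Because the fractions are applied per item, the section sizes in this sketch are approximate rather than exact; the same seed always reproduces the same assignment.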

randomize()[source]

Within this method, self.R should be used, instead of np.random, to introduce random factors.

All self.R calls happen here, so that errors in synchronizing the random state are easier to identify.

This method can optionally take additional arguments so that the random factors are generated based on properties of the input data.

Raises

NotImplementedError – Subclass {self.__class__.__name__} must implement the compute method
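
The self.R convention described above can be sketched with a small stand-alone class. Everything here is illustrative: the class RandomFlipSketch and its set_random_state method are hypothetical names, and random.Random is used as a stand-in for the random state object held in self.R. The point is only that every random draw is confined to randomize(), so seeding self.R makes the behaviour reproducible.

```python
import random
from typing import List, Optional


class RandomFlipSketch:
    """Hypothetical transform that reverses a sequence with probability prob."""

    def __init__(self, prob: float = 0.5) -> None:
        self.prob = prob
        self.R = random.Random()  # private random state (stand-in for illustration)
        self._do_flip = False

    def set_random_state(self, seed: Optional[int] = None) -> "RandomFlipSketch":
        self.R = random.Random(seed)  # reseed only this object's state
        return self

    def randomize(self) -> None:
        # All draws from self.R happen here and nowhere else.
        self._do_flip = self.R.random() < self.prob

    def __call__(self, items: List[int]) -> List[int]:
        self.randomize()
        return list(reversed(items)) if self._do_flip else list(items)


first = RandomFlipSketch().set_random_state(42)
second = RandomFlipSketch().set_random_state(42)
# Two objects seeded identically make identical random decisions.
print([first([1, 2, 3]) for _ in range(5)] == [second([1, 2, 3]) for _ in range(5)])
```

Using a per-object state rather than the global generator means that other code drawing random numbers cannot shift the sequence this object sees.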

class monai.apps.DecathlonDataset(root_dir, task, section, transform=<monai.transforms.io.dictionary.LoadNiftid object>, download=False, seed=0, val_frac=0.2, cache_num=9223372036854775807, cache_rate=1.0, num_workers=0)[source]

A dataset that automatically downloads the data of the Medical Segmentation Decathlon challenge (http://medicaldecathlon.com/) and generates items for training, validation or test. It is based on monai.data.CacheDataset to accelerate the training process.

Parameters
  • root_dir (str) – user’s local directory for caching and loading the MSD datasets.

  • task (str) – which task to download and execute, one of: “Task01_BrainTumour”, “Task02_Heart”, “Task03_Liver”, “Task04_Hippocampus”, “Task05_Prostate”, “Task06_Lung”, “Task07_Pancreas”, “Task08_HepaticVessel”, “Task09_Spleen”, “Task10_Colon”.

  • section (str) – expected data section, can be: training, validation or test.

  • transform (Callable[…, Any]) – transforms to execute operations on input data. The default transform is LoadNiftid, which loads NIfTI format data into a numpy array with [H, W, D] or [H, W, D, C] shape. For further usage, use AddChanneld or AsChannelFirstd to convert the shape to [C, H, W, D].

  • download (bool) – whether to download and extract the Decathlon dataset from the resource link, default is False. If the expected file already exists, downloading is skipped even when this is set to True. The user can manually copy the tar file or the dataset folder to the root directory.

  • seed (int) – random seed used to split the data into training, validation and test datasets, default is 0.

  • val_frac (float) – fraction of the training section used for validation, default is 0.2. Decathlon data only contains a training section with labels and a test section without labels, so a random fraction of the training section is selected as the validation section.

  • cache_num (int) – number of items to be cached, default is sys.maxsize. The number actually cached is the minimum of (cache_num, data_length x cache_rate, data_length).

  • cache_rate (float) – fraction of the total data to cache, default is 1.0 (cache all). The number actually cached is the minimum of (cache_num, data_length x cache_rate, data_length).

  • num_workers (int) – the number of worker threads to use. If 0, a single thread is used. Default is 0.

Example:

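from monai.apps import DecathlonDataset
from monai.transforms import AddChanneld, Compose, LoadNiftid, ScaleIntensityd, ToTensord

# load image and label, add a channel dimension, rescale the image, convert to tensors
transform = Compose(
    [
        LoadNiftid(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        ScaleIntensityd(keys="image"),
        ToTensord(keys=["image", "label"]),
    ]
)

# download Task09_Spleen into the current directory if needed and use its validation section
data = DecathlonDataset(
    root_dir="./", task="Task09_Spleen", transform=transform, section="validation", download=True
)

print(data[0]["image"], data[0]["label"])

Raises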
  • ValueError – root_dir must be a directory.

  • ValueError – unsupported task.

  • RuntimeError – can not find dataset directory, please use download=True to download it.
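
The rule stated for cache_num and cache_rate, the minimum of (cache_num, data_length x cache_rate, data_length), can be worked through with a short sketch. The function name effective_cache_num is hypothetical, and truncating the product to an integer with int() is an assumption. The default of 9223372036854775807 shown in the signature is the value of sys.maxsize on a 64-bit build.

```python
import sys


def effective_cache_num(data_length: int, cache_num: int = sys.maxsize,
                        cache_rate: float = 1.0) -> int:
    """Hypothetical helper: number of items that end up cached."""
    # int() truncation of the fractional count is an assumption of this sketch.
    return min(cache_num, int(data_length * cache_rate), data_length)


print(effective_cache_num(100))                               # defaults: cache everything
print(effective_cache_num(100, cache_rate=0.25))              # limited by cache_rate
print(effective_cache_num(100, cache_num=10, cache_rate=0.5)) # limited by cache_num
```

With the defaults both limits are inactive, so the whole dataset is cached; setting either parameter lower caps the cache at the smaller of the two limits.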

randomize()[source]

Within this method, self.R should be used, instead of np.random, to introduce random factors.

All self.R calls happen here, so that errors in synchronizing the random state are easier to identify.

This method can optionally take additional arguments so that the random factors are generated based on properties of the input data.

Raises

NotImplementedError – Subclass {self.__class__.__name__} must implement the compute method

Utilities

monai.apps.check_md5(filepath, md5_value=None)[source]

Check the MD5 signature of the specified file.

Parameters
  • filepath (str) – path of the file whose MD5 is to be verified.

  • md5_value (Optional[str]) – expected MD5 value of the file.
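
An MD5 check of this kind can be sketched with the standard library alone. This is not the source of check_md5: the name md5_matches, the chunk size, the boolean return value, and treating a missing md5_value as a pass are all assumptions made for illustration.

```python
import hashlib
from typing import Optional


def md5_matches(filepath: str, md5_value: Optional[str] = None) -> bool:
    """Hypothetical helper: True if the file's MD5 equals md5_value."""
    if md5_value is None:
        return True  # assumption: with nothing to compare against, the check passes
    digest = hashlib.md5()
    with open(filepath, "rb") as f:
        # Read in 1 MiB chunks so large archives never need to fit in memory.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == md5_value.lower()
```

Hashing in chunks keeps memory use constant, which matters for dataset archives that can be several gigabytes in size.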

monai.apps.download_url(url, filepath, md5_value=None)[source]

Download a file from the specified URL, with support for a progress bar and an MD5 check.

Parameters
  • url (str) – source URL link to download file.

  • filepath (str) – target filepath to save the downloaded file.

  • md5_value (Optional[str]) – expected MD5 value to validate the downloaded file. if None, skip MD5 validation.

Raises
  • RuntimeError – MD5 check of existing file {filepath} failed, please delete it and try again.

  • URLError – See urllib.request.urlopen

  • IOError – See urllib.request.urlopen

  • RuntimeError – MD5 check of downloaded file failed, URL={url}, filepath={filepath}, expected MD5={md5_value}.
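
The order of checks implied by the errors listed above can be sketched as follows. This is a simplified illustration rather than the library's code: the names fetch and _md5 are hypothetical, the progress bar is omitted, and skipping the download when a valid file already exists is an assumption inferred from the first RuntimeError message.

```python
import hashlib
import os
import urllib.request
from typing import Optional


def _md5(path: str) -> str:
    # Reads the whole file at once; acceptable for a sketch, not for huge files.
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def fetch(url: str, filepath: str, md5_value: Optional[str] = None) -> None:
    """Hypothetical helper: download url to filepath with optional MD5 checks."""
    if os.path.exists(filepath):
        # An existing file is validated and reused instead of downloaded again.
        if md5_value is not None and _md5(filepath) != md5_value:
            raise RuntimeError(
                f"MD5 check of existing file {filepath} failed, please delete it and try again."
            )
        return
    os.makedirs(os.path.dirname(os.path.abspath(filepath)), exist_ok=True)
    urllib.request.urlretrieve(url, filepath)  # may raise URLError or IOError
    if md5_value is not None and _md5(filepath) != md5_value:
        raise RuntimeError(
            f"MD5 check of downloaded file failed, URL={url}, "
            f"filepath={filepath}, expected MD5={md5_value}."
        )
```

Checking an existing file first avoids repeated downloads of large archives, while the second check catches truncated or corrupted transfers.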

monai.apps.extractall(filepath, output_dir, md5_value=None)[source]

Extract a file to the output directory. Supported file types are: zip, tar.gz and tar.

Parameters
  • filepath (str) – the file path of the compressed file.

  • output_dir (str) – target directory to save extracted files.

  • md5_value (Optional[str]) – expected MD5 value to validate the compressed file. if None, skip MD5 validation.

Raises
  • RuntimeError – MD5 check of compressed file {filepath} failed.

  • TypeError – unsupported compressed file type.
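
Dispatching on the three supported file types can be sketched with the standard zipfile and tarfile modules. The function name unpack and the use of the filename suffix to select the format are assumptions for illustration; the MD5 validation step is left out.

```python
import tarfile
import zipfile


def unpack(filepath: str, output_dir: str) -> None:
    """Hypothetical helper: extract a zip, tar or tar.gz archive into output_dir."""
    if filepath.endswith(".zip"):
        with zipfile.ZipFile(filepath) as zf:
            zf.extractall(output_dir)
    elif filepath.endswith((".tar", ".tar.gz")):
        # The default read mode detects gzip compression automatically.
        with tarfile.open(filepath) as tf:
            tf.extractall(output_dir)  # only extract archives from trusted sources
    else:
        raise TypeError("unsupported compressed file type.")
```

A call such as unpack("MedNIST.tar.gz", "./data") would place the archive's contents under ./data, creating the directory if needed.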

monai.apps.download_and_extract(url, filepath, output_dir, md5_value=None)[source]

Download a file from a URL and extract it to the output directory.

Parameters
  • url (str) – source URL link to download file.

  • filepath (str) – the file path of the compressed file.

  • output_dir (str) – target directory to save extracted files. default is None to save in current directory.

  • md5_value (Optional[str]) – expected MD5 value to validate the downloaded file. if None, skip MD5 validation.