Applications

Datasets

class monai.apps.MedNISTDataset(root_dir, section, transform=<monai.transforms.io.dictionary.LoadPNGd object>, download=False, seed=0, val_frac=0.1, test_frac=0.1, cache_num=9223372036854775807, cache_rate=1.0, num_workers=0)[source]

Dataset that automatically downloads the MedNIST data and generates items for training, validation or test. It is based on CacheDataset to accelerate the training process.

Parameters
  • root_dir (str) – target directory to download and load MedNIST dataset.

  • section (str) – expected data section, can be: training, validation or test.

  • transform (Union[Sequence[Callable], Callable]) – transforms to apply to the input data. The default transform is LoadPNGd, which loads data into a numpy array with shape [H, W]. For further processing, use AddChanneld to convert the shape to [C, H, W].

  • download (bool) – whether to download and extract MedNIST from the resource link, default is False. If the expected file already exists, downloading is skipped even when this is set to True. The user can manually copy the MedNIST.tar.gz file or the MedNIST folder to the root directory.

  • seed (int) – random seed used to split the data into training, validation and test datasets, default is 0.

  • val_frac (float) – fraction of the whole dataset used for validation, default is 0.1.

  • test_frac (float) – fraction of the whole dataset used for test, default is 0.1.

  • cache_num (int) – number of items to be cached. Default is sys.maxsize. The number actually cached is the minimum of (cache_num, data_length x cache_rate, data_length).

  • cache_rate (float) – fraction of the total data to cache, default is 1.0 (cache all). The number actually cached is the minimum of (cache_num, data_length x cache_rate, data_length).

  • num_workers (int) – the number of worker threads to use. If 0, a single thread will be used. Default is 0.

Raises
  • ValueError – When root_dir is not a directory.

  • RuntimeError – When dataset_dir doesn’t exist and downloading is not selected (download=False).
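
The interplay of seed, val_frac and test_frac can be illustrated with a minimal sketch. It assumes a shuffle-then-slice strategy and uses a hypothetical helper, split_indices, built on the standard library; MedNISTDataset may partition its items differently, so this only shows why a fixed seed yields a reproducible split.

```python
import random


def split_indices(n_items, seed=0, val_frac=0.1, test_frac=0.1):
    """Hypothetical helper: partition range(n_items) into three index lists."""
    rng = random.Random(seed)  # private generator, so the split is reproducible
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_val = int(n_items * val_frac)
    n_test = int(n_items * test_frac)
    return {
        "validation": indices[:n_val],
        "test": indices[n_val:n_val + n_test],
        "training": indices[n_val + n_test:],
    }


sections = split_indices(1000, seed=0)
print({name: len(idx) for name, idx in sections.items()})
# prints {'validation': 100, 'test': 100, 'training': 800}
```

Calling split_indices twice with the same seed returns identical lists, which is the property the seed parameter is meant to provide.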

randomize(data=None)[source]

Within this method, self.R should be used, instead of np.random, to introduce random factors.

All self.R calls happen here, which makes it easier to identify errors in synchronizing the random state.

This method can generate the random factors based on properties of the input data.

Raises

NotImplementedError – When the subclass does not override this method.

Return type

None
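
The pattern described for randomize can be sketched as follows. RandomShift is a hypothetical class, not part of MONAI, and the standard-library random.Random stands in for the NumPy random state that self.R is assumed to hold; the point is only that every random draw goes through self.R inside randomize.

```python
import random


class RandomShift:
    """Hypothetical randomizable operation: adds one random offset to all values."""

    def __init__(self, max_shift=10.0, seed=None):
        self.max_shift = max_shift
        # Stand-in for the random state object; an assumption for this sketch.
        self.R = random.Random(seed)
        self._shift = 0.0

    def randomize(self, data=None):
        # All random factors are drawn here, and only through self.R.
        self._shift = self.R.uniform(-self.max_shift, self.max_shift)

    def __call__(self, values):
        self.randomize(values)
        # The deterministic part of the operation only reads the stored factor.
        return [v + self._shift for v in values]


first = RandomShift(seed=1)([0.0, 1.0])
second = RandomShift(seed=1)([0.0, 1.0])
print(first == second)  # same seed, same random factor
```

Because no other method touches a random generator, seeding self.R is enough to reproduce the whole operation.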

class monai.apps.DecathlonDataset(root_dir, task, section, transform=<monai.transforms.io.dictionary.LoadNiftid object>, download=False, seed=0, val_frac=0.2, cache_num=9223372036854775807, cache_rate=1.0, num_workers=0)[source]

Dataset that automatically downloads the data of the Medical Segmentation Decathlon challenge (http://medicaldecathlon.com/) and generates items for training, validation or test. It is based on monai.data.CacheDataset to accelerate the training process.

Parameters
  • root_dir (str) – user’s local directory for caching and loading the MSD datasets.

  • task (str) – which task to download and use, one of: “Task01_BrainTumour”, “Task02_Heart”, “Task03_Liver”, “Task04_Hippocampus”, “Task05_Prostate”, “Task06_Lung”, “Task07_Pancreas”, “Task08_HepaticVessel”, “Task09_Spleen”, “Task10_Colon”.

  • section (str) – expected data section, can be: training, validation or test.

  • transform (Union[Sequence[Callable], Callable]) – transforms to apply to the input data. The default transform is LoadNiftid, which loads NIfTI format data into a numpy array with shape [H, W, D] or [H, W, D, C]. For further processing, use AddChanneld or AsChannelFirstd to convert the shape to [C, H, W, D].

  • download (bool) – whether to download and extract the Decathlon data from the resource link, default is False. If the expected file already exists, downloading is skipped even when this is set to True. The user can manually copy the tar file or the dataset folder to the root directory.

  • seed (int) – random seed used to split the data into training, validation and test datasets, default is 0.

  • val_frac (float) – fraction of the training section used for validation, default is 0.2. Decathlon data only contains a training section with labels and a test section without labels, so a random fraction of the training section is selected as the validation section.

  • cache_num (int) – number of items to be cached. Default is sys.maxsize. The number actually cached is the minimum of (cache_num, data_length x cache_rate, data_length).

  • cache_rate (float) – fraction of the total data to cache, default is 1.0 (cache all). The number actually cached is the minimum of (cache_num, data_length x cache_rate, data_length).

  • num_workers (int) – the number of worker threads to use. If 0, a single thread will be used. Default is 0.

Raises
  • ValueError – When root_dir is not a directory.

  • ValueError – When task is not one of [“Task01_BrainTumour”, “Task02_Heart”, “Task03_Liver”, “Task04_Hippocampus”, “Task05_Prostate”, “Task06_Lung”, “Task07_Pancreas”, “Task08_HepaticVessel”, “Task09_Spleen”, “Task10_Colon”].

  • RuntimeError – When dataset_dir doesn’t exist and downloading is not selected (download=False).
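
The rule stated for cache_num and cache_rate, the minimum of (cache_num, data_length x cache_rate, data_length), can be written out as a small sketch. The helper name effective_cache_count is hypothetical, and truncating the product with int() is an assumption about how the fractional count is rounded.

```python
import sys


def effective_cache_count(data_length, cache_num=sys.maxsize, cache_rate=1.0):
    """Hypothetical helper: how many items end up cached under the stated rule."""
    # int() truncation is an assumption of this sketch.
    return min(cache_num, int(data_length * cache_rate), data_length)


print(effective_cache_count(1000))                                # 1000
print(effective_cache_count(1000, cache_rate=0.25))               # 250
print(effective_cache_count(1000, cache_num=50, cache_rate=0.25)) # 50
```

With the defaults every item is cached; lowering either parameter caps the count, and the stricter of the two limits wins.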

Example:

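from monai.apps import DecathlonDataset
from monai.transforms import AddChanneld, Compose, LoadNiftid, ScaleIntensityd, ToTensord

# load image and label, add a channel dimension, rescale the image intensities, convert to tensors
transform = Compose(
    [
        LoadNiftid(keys=["image", "label"]),
        AddChanneld(keys=["image", "label"]),
        ScaleIntensityd(keys="image"),
        ToTensord(keys=["image", "label"]),
    ]
)

# download=True fetches and extracts the task data into root_dir unless it already exists
data = DecathlonDataset(
    root_dir="./", task="Task09_Spleen", transform=transform, section="validation", download=True
)

print(data[0]["image"], data[0]["label"])
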
randomize(data=None)[source]

Within this method, self.R should be used, instead of np.random, to introduce random factors.

All self.R calls happen here, which makes it easier to identify errors in synchronizing the random state.

This method can generate the random factors based on properties of the input data.

Raises

NotImplementedError – When the subclass does not override this method.

Return type

None

Utilities

monai.apps.check_md5(filepath, md5_value=None)[source]

Check the MD5 signature of the specified file.

Parameters
  • filepath (str) – path of source file to verify MD5.

  • md5_value (Optional[str]) – expected MD5 value of the file.

Return type

bool
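
As an illustration of what such a check involves, the following is a minimal sketch using hashlib. The name md5_matches, the chunk_size parameter, and the choice to return True when no expected value is given are assumptions of this sketch, not a description of check_md5 itself.

```python
import hashlib


def md5_matches(filepath, md5_value=None, chunk_size=1024 * 1024):
    """Hypothetical helper: compare a file's MD5 digest with an expected value."""
    if md5_value is None:
        return True  # assumption: nothing to compare against counts as a pass
    digest = hashlib.md5()
    with open(filepath, "rb") as f:
        # Read in chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == md5_value.lower()
```

Reading in fixed-size chunks keeps memory use constant, which matters for multi-gigabyte dataset archives.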

monai.apps.download_url(url, filepath, md5_value=None)[source]

Download a file from the specified URL, with support for a progress bar and an MD5 check.

Parameters
  • url (str) – source URL link to download file.

  • filepath (str) – target filepath to save the downloaded file.

  • md5_value (Optional[str]) – expected MD5 value to validate the downloaded file. If None, MD5 validation is skipped.

Raises
  • RuntimeError – When the MD5 validation of an existing file at filepath fails.

  • RuntimeError – When a network issue or denied permission prevents the file download from url to filepath.

  • URLError – See urllib.request.urlretrieve.

  • HTTPError – See urllib.request.urlretrieve.

  • ContentTooShortError – See urllib.request.urlretrieve.

  • IOError – See urllib.request.urlretrieve.

  • RuntimeError – When the MD5 validation of the file downloaded from url fails.

Return type

None
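
The control flow implied by the Raises list (validate an existing file, otherwise download and then validate) can be sketched as below. The helper names fetch and _md5_ok are hypothetical, the progress bar is omitted, and errors from urllib.request.urlretrieve are simply allowed to propagate; download_url may handle these cases differently.

```python
import hashlib
import os
import urllib.request


def _md5_ok(filepath, md5_value):
    """Return True when no checksum is given or the file's MD5 matches it."""
    if md5_value is None:
        return True
    with open(filepath, "rb") as f:
        return hashlib.md5(f.read()).hexdigest() == md5_value


def fetch(url, filepath, md5_value=None):
    """Hypothetical helper: download url to filepath unless it already exists."""
    if os.path.exists(filepath):
        if not _md5_ok(filepath, md5_value):
            raise RuntimeError(f"MD5 check of existing file {filepath} failed.")
        return  # reuse the existing file and skip the download
    os.makedirs(os.path.dirname(os.path.abspath(filepath)), exist_ok=True)
    # May raise URLError, HTTPError, ContentTooShortError or OSError.
    urllib.request.urlretrieve(url, filepath)
    if not _md5_ok(filepath, md5_value):
        raise RuntimeError(f"MD5 check of downloaded file {filepath} failed.")
```

Checking the existing file first is what lets a repeated call return quickly without touching the network.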

monai.apps.extractall(filepath, output_dir, md5_value=None)[source]

Extract a file to the output directory. Supported file types are: zip, tar.gz and tar.

Parameters
  • filepath (str) – the path of the compressed file.

  • output_dir (str) – target directory to save extracted files.

  • md5_value (Optional[str]) – expected MD5 value to validate the compressed file. If None, MD5 validation is skipped.

Raises
  • RuntimeError – When the MD5 validation of the compressed file at filepath fails.

  • ValueError – When the filepath file extension is not one of [“zip”, “tar.gz”, “tar”].

Return type

None
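
Dispatching on the file extension, as the ValueError condition above suggests, might look like the following sketch. The helper unpack is hypothetical, omits the MD5 step, and relies only on the standard-library zipfile and tarfile modules.

```python
import tarfile
import zipfile


def unpack(filepath, output_dir):
    """Hypothetical helper: extract a zip, tar or tar.gz archive into output_dir."""
    name = str(filepath)
    if name.endswith(".zip"):
        with zipfile.ZipFile(name) as archive:
            archive.extractall(output_dir)
    elif name.endswith((".tar", ".tar.gz")):
        # Mode "r" lets tarfile detect gzip compression by itself.
        with tarfile.open(name, "r") as archive:
            archive.extractall(output_dir)
    else:
        raise ValueError(f"Unsupported archive type: {name}")
```

Archives from untrusted sources can contain paths that escape output_dir, so a checksum or other validation before extraction is advisable.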

monai.apps.download_and_extract(url, filepath, output_dir, md5_value=None)[source]

Download a file from a URL and extract it to the output directory.

Parameters
  • url (str) – source URL link to download file.

  • filepath (str) – the path of the compressed file.

  • output_dir (str) – target directory to save extracted files. Default is None, which saves to the current directory.

  • md5_value (Optional[str]) – expected MD5 value to validate the downloaded file. If None, MD5 validation is skipped.

Return type

None