Auto3dseg#

class monai.auto3dseg.Algo[source]#

An algorithm in this context is loosely defined as a data processing pipeline consisting of multiple components such as image preprocessing, followed by deep learning model training and evaluation.

get_output_path(*args, **kwargs)[source]#

Returns the algo output path, i.e. the location of the algo's scripts.

get_score(*args, **kwargs)[source]#

Returns the model quality measurement based on training and validation datasets.

predict(predict_files, predict_params=None)[source]#

Read test data and output model predictions.

Parameters
  • predict_files (list) – list of files for the predicting pipeline.

  • predict_params (Optional[dict]) – key-value pairs of input parameters for the predicting pipeline.

set_data_stats(*args, **kwargs)[source]#

Provide dataset (and summaries) so that the model creation can depend on the input datasets.

train(params)[source]#

Read training/validation data and output a model.

Parameters

params (dict) – key-value pairs of input parameters for the training pipeline.

class monai.auto3dseg.AlgoGen[source]#

A data-driven algorithm generator. It optionally takes the following inputs:

  • training dataset properties (such as data statistics from monai.auto3dseg.analyzer),

  • previous algorithm’s scores measuring the model quality,

  • computational budgets,

and generates Algo instances. The generated algos are to be trained with the training datasets:

                          scores
                +------------------------+
                |   +---------+          |
+-----------+   +-->|         |    +-----+----+
| Dataset,  |       | AlgoGen |--->|   Algo   |
| summaries |------>|         |    +----------+
+-----+-----+       +---------+          ^
      |                                  |
      +----------------------------------+

This class also maintains a history of previously generated Algo and their corresponding validation scores. The Algo generation process may be stochastic (using Randomizable.R as the source random state).

generate()[source]#

Generate new Algo – based on data_stats, budget, and history of previous algo generations.

get_budget(*args, **kwargs)[source]#

Get the current computational budget.

get_data_stats(*args, **kwargs)[source]#

Get current dataset summaries.

get_history(*args, **kwargs)[source]#

Get the previously generated algo.

run_algo(*args, **kwargs)[source]#

Launch the Algos. This is useful for light-weight Algos where there’s no need to distribute the training jobs.

If the generated Algos require significant scheduling of parallel executions, a job scheduler/controller implemented separately is preferred to run them. In this case the controller should also report back the scores and the algo history, so that the future AlgoGen.generate can leverage the information.

set_budget(*args, **kwargs)[source]#

Provide a computational budget so that the generator outputs algorithms that require reasonable resources.

set_data_stats(*args, **kwargs)[source]#

Provide dataset summaries/properties so that the generator can be conditioned on the input datasets.

set_score(*args, **kwargs)[source]#

Provide feedback from the previously generated algo; the score can be used for new Algo generations.
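The generate/train/score feedback loop described for AlgoGen can be sketched in plain Python. This is a toy illustration that only mirrors the method names documented here; `ToyAlgo`, `ToyAlgoGen`, the `lr` attribute, the `target_lr` parameter and the scoring rule are all hypothetical and not part of MONAI.

```python
import random


class ToyAlgo:
    """Hypothetical stand-in for an Algo: 'training' just computes a score."""

    def __init__(self, lr):
        self.lr = lr
        self.score = None

    def train(self, params):
        # Pretend that learning rates closer to params["target_lr"] score higher.
        self.score = 1.0 - abs(self.lr - params["target_lr"])

    def get_score(self):
        return self.score


class ToyAlgoGen:
    """Hypothetical generator that keeps a history of (algo, score) pairs."""

    def __init__(self, seed=0):
        self.R = random.Random(seed)  # random state used for stochastic generation
        self.budget = 0
        self.history = []

    def set_budget(self, n_algos):
        self.budget = n_algos

    def generate(self):
        # Sample around the best learning rate seen so far, if there is one.
        if self.history:
            best_algo, _ = max(self.history, key=lambda pair: pair[1])
            center = best_algo.lr
        else:
            center = 0.5
        lr = min(1.0, max(0.0, center + self.R.uniform(-0.2, 0.2)))
        return ToyAlgo(lr=lr)

    def set_score(self, algo, score):
        self.history.append((algo, score))

    def run_algo(self, params):
        # Light-weight case: train each generated algo in-process.
        for _ in range(self.budget):
            algo = self.generate()
            algo.train(params)
            self.set_score(algo, algo.get_score())


gen = ToyAlgoGen(seed=0)
gen.set_budget(5)
gen.run_algo({"target_lr": 0.3})
best_score = max(score for _, score in gen.history)
print(len(gen.history), round(best_score, 3))
```

Each call to `generate` can look at the history, which is the role the scores arrow plays in the diagram.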

class monai.auto3dseg.Analyzer(stats_name, report_format)[source]#

The Analyzer component is a base class. Classes that inherit from it provide a callable with the same class name that produces one pre-formatted dictionary for the input data. The format is pre-defined by the init function of the inheriting class. Function operations can also be registered before the callable is run.

Parameters

report_format (dict) – a dictionary that outlines the key structures of the report format.

get_report_format()[source]#

Get the report format by resolving the registered operations recursively.

Returns

a dictionary with {keys: None} pairs.

resolve_format(report)[source]#

Resolve the format of the pre-defined report.

Parameters

report (dict) – the dictionary to resolve. Values will be replaced in-place.

static unwrap_ops(func)[source]#

Unwrap a function value and generate the same set of keys in a dict that the function produces when it is actually called at runtime.

Parameters

func – Operation sub-class object that represents statistical operations. The func object should have a data dictionary which stores the statistical operation information. For some operations (ImageStats for example), it may also contain the data_addon property, which is part of the update process.

Returns

a dict with a set of keys.

update_ops(key, op)[source]#

Register a statistical operation to the Analyzer and update the report_format.

Parameters
  • key (str) – value key in the report.

  • op – Operation sub-class object that represents statistical operations.

update_ops_nested_label(nested_key, op)[source]#

Update operations for nested label format. Operation value in report_format will be resolved to a dict with only keys.

Parameters
  • nested_key (str) – str that has format of ‘key1#0#key2’.

  • op – Operation sub-class object that represents statistical operations.
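As a rough illustration of the ‘key1#0#key2’ nested key format, the sketch below walks such a key through a nested report and replaces the value with None, leaving a keys-only structure. The helper `set_nested` is hypothetical, and it assumes that `#` separates the parts and that numeric parts index into lists; it is not the Analyzer implementation.

```python
def set_nested(report, nested_key, value, sep="#"):
    """Write value at the location named by a 'key1#0#key2' style key."""
    parts = [int(p) if p.isdigit() else p for p in nested_key.split(sep)]
    node = report
    for part in parts[:-1]:
        node = node[part]  # dict lookup for str parts, list index for int parts
    node[parts[-1]] = value
    return report


# Placeholder string stands in for a registered operation object.
report_format = {"label": [{"image_intensity": "<operations>"}]}
set_nested(report_format, "label#0#image_intensity", None)
print(report_format)
```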

class monai.auto3dseg.FgImageStats(image_key, label_key, stats_name='image_foreground_stats')[source]#

Analyzer to extract foreground label properties for each case (image and label).

Parameters
  • image_key (str) – the key to find image data in the callable function input (data)

  • label_key (str) – the key to find label data in the callable function input (data)

Examples:

import numpy as np
from monai.auto3dseg import FgImageStats

input = {}
input['image'] = np.random.rand(1,30,30,30)
input['label'] = np.ones([30,30,30])
analyzer = FgImageStats(image_key='image', label_key='label')
print(analyzer(input)["image_foreground_stats"])

class monai.auto3dseg.FgImageStatsSumm(stats_name='image_foreground_stats', average=True)[source]#

This summary analyzer processes the values of the specific key stats_name in a list of dicts. Typically, the list of dicts is the output of the case analyzer with a similar name (FgImageStats).

Parameters
  • stats_name (str) – the key of the to-process value in the dict.

  • average (Optional[bool]) – whether to average the statistical value across different image modalities.

class monai.auto3dseg.FilenameStats(key, stats_name)[source]#

This class finds the file path for the loaded image/label and writes the info into the data pipeline as a MONAI transform.

Parameters
  • key (str) – the key to fetch the filename (for example, “image”, “label”).

  • stats_name (str) – the key to store the filename in the output stats report.

class monai.auto3dseg.ImageStats(image_key, stats_name='image_stats')[source]#

Analyzer to extract image stats properties for each case (image).

Parameters

image_key (str) – the key to find image data in the callable function input (data)

Examples:

import numpy as np
from monai.auto3dseg import ImageStats
from monai.data import MetaTensor

input = {}
input['image'] = np.random.rand(1,30,30,30)
input['image'] = MetaTensor(np.random.rand(1,30,30,30))  # MetaTensor
analyzer = ImageStats(image_key="image")
print(analyzer(input)["image_stats"])

Notes

If the image data is a NumPy array, the spacing stats will be [1.0] * ndims, where ndims is the lesser of the image dimension and 3.

class monai.auto3dseg.ImageStatsSumm(stats_name='image_stats', average=True)[source]#

This summary analyzer processes the values of the specific key stats_name in a list of dicts. Typically, the list of dicts is the output of the case analyzer with the same prefix (ImageStats).

Parameters
  • stats_name (str) – the key of the to-process value in the dict.

  • average (Optional[bool]) – whether to average the statistical value across different image modalities.

class monai.auto3dseg.LabelStats(image_key, label_key, stats_name='label_stats', do_ccp=True)[source]#

Analyzer to extract label stats properties for each case (image and label).

Parameters
  • image_key (str) – the key to find image data in the callable function input (data)

  • label_key (str) – the key to find label data in the callable function input (data)

  • do_ccp (Optional[bool]) – performs connected component analysis. Default is True.

Examples:

import numpy as np
from monai.auto3dseg import LabelStats

input = {}
input['image'] = np.random.rand(1,30,30,30)
input['label'] = np.ones([30,30,30])
analyzer = LabelStats(image_key='image', label_key='label')
print(analyzer(input)["label_stats"])

class monai.auto3dseg.LabelStatsSumm(stats_name='label_stats', average=True, do_ccp=True)[source]#

This summary analyzer processes the values of the specific key stats_name in a list of dicts. Typically, the list of dicts is the output of the case analyzer with a similar name (LabelStats).

Parameters
  • stats_name (str) – the key of the to-process value in the dict.

  • average (Optional[bool]) – whether to average the statistical value across different image modalities.

class monai.auto3dseg.Operations(**kwargs)[source]#

Base class of operation interface

evaluate(data, **kwargs)[source]#

For key-value pairs in the self.data, if the value is a callable, then this function will apply the callable to the input data. The result will be written under the same key under the output dict.

Parameters

data (Any) – input data.

Return type

dict

Returns

a dictionary which has the same keys as self.data, for the keys whose value is callable.
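The behaviour described for evaluate, applying each callable value to the input and writing the result under the same key, can be sketched without MONAI. The free function `evaluate` and the `ops` dict below are illustrative assumptions, not the class itself.

```python
import statistics


def evaluate(ops, data):
    """Apply every callable in ops to data; skip non-callable values."""
    return {key: fn(data) for key, fn in ops.items() if callable(fn)}


ops = {"max": max, "min": min, "mean": statistics.mean, "note": "not callable"}
result = evaluate(ops, [1.0, 2.0, 6.0])
print(result)  # {'max': 6.0, 'min': 1.0, 'mean': 3.0}
```

The non-callable "note" entry is dropped, so the output only has keys whose value could be applied to the data.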

class monai.auto3dseg.SampleOperations[source]#

Apply statistical operation to a sample (image/ndarray/tensor).

Notes

The percentile operation uses a partial function that embeds different kwargs (q). To print the result nicely, data_addon is added to map the numbers generated by percentile to different keys (“percentile_00_5” for example). The key suffix encodes the percentage used for the percentile computation. For example, _00_5 means 0.5% and _99_5 means 99.5%.

Example

# use the existing operations
import numpy as np
from monai.auto3dseg import SampleOperations

op = SampleOperations()
data_np = np.random.rand(10, 10).astype(np.float64)
print(op.evaluate(data_np))

# add a new operation
op.update({"sum": np.sum})
print(op.evaluate(data_np))

evaluate(data, **kwargs)[source]#

Applies the callables to the data, and convert the numerics to list or Python numeric types (int/float).

Parameters

data (Any) – input data

Return type

dict
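To make the percentile key naming from the SampleOperations notes concrete, here is a small standard-library sketch. It binds q with `functools.partial` and derives keys such as percentile_00_5 from q. The helpers `percentile` and `percentile_key`, and the choice of q values, are assumptions for illustration; the actual class may compute and name things differently.

```python
from functools import partial


def percentile(data, q):
    """Linear-interpolation percentile of a non-empty sequence, q in [0, 100]."""
    ordered = sorted(data)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_key(q):
    """Format 0.5 as 'percentile_00_5' and 99.5 as 'percentile_99_5'."""
    whole, tenth = divmod(round(q * 10), 10)
    return f"percentile_{whole:02d}_{tenth}"


qs = [0.5, 10.0, 90.0, 99.5]  # illustrative percentages
ops = {percentile_key(q): partial(percentile, q=q) for q in qs}
data = list(range(201))  # 0, 1, ..., 200
stats = {key: op(data) for key, op in ops.items()}
print(stats)
```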

class monai.auto3dseg.SegSummarizer(image_key, label_key, average=True, do_ccp=True, hist_bins=None, hist_range=None, histogram_only=False)[source]#

SegSummarizer serializes the operations for data analysis in the Auto3Dseg pipeline. It loads two types of analyzer functions and executes them differently. The first type of analyzer is CaseAnalyzer, which is similar to traditional MONAI transforms. It can be composed with other transforms to process a data dict that has image/label keys. The second type of analyzer is SummaryAnalyzer, which works only on a list of dictionaries. Each dictionary is the output of the case analyzers on a single dataset.

Parameters
  • image_key (str) – a string that the user specifies for the image. The DataAnalyzer will look it up in the datalist to locate the image files of the dataset.

  • label_key (str) – a string that the user specifies for the label. The DataAnalyzer will look it up in the datalist to locate the label files of the dataset. If label_key is None, the DataAnalyzer will skip looking for labels and all label-related operations.

  • do_ccp (bool) – apply the connected component algorithm to process the labels/images.

  • hist_bins (Optional[list]) – list of positive integers (one for each channel) for setting the number of bins used to compute the histogram. Defaults to [100].

  • hist_range (Optional[list]) – list of lists of two floats (one for each channel) setting the intensity range to compute the histogram. Defaults to [-500, 500].

  • histogram_only (bool) – whether to only compute histograms. Defaults to False.

Examples

# imports

summarizer = SegSummarizer("image", "label")
transform_list = [
    LoadImaged(keys=keys),
    EnsureChannelFirstd(keys=keys),  # this creates label to be (1,H,W,D)
    ToDeviced(keys=keys, device=device, non_blocking=True),
    Orientationd(keys=keys, axcodes="RAS"),
    EnsureTyped(keys=keys, data_type="tensor"),
    Lambdad(keys="label", func=lambda x: torch.argmax(x, dim=0, keepdim=True) if x.shape[0] > 1 else x),
    SqueezeDimd(keys=["label"], dim=0),
    summarizer,
]
...
# skip some steps to set up data loader
dataset = data.DataLoader(ds, batch_size=1, shuffle=False, num_workers=n_workers, collate_fn=no_collation)
transform = Compose(transform_list)
stats = []
for batch_data in dataset:
    d = transform(batch_data[0])
    stats.append(d)
report = summarizer.summarize(stats)

add_analyzer(case_analyzer, summary_analyzer)[source]#

Add new analyzers to the engine so that the callable and summarize functions will utilize the new analyzers for stats computations.

Parameters
  • case_analyzer – analyzer that works on each data.

  • summary_analyzer – analyzer that works on list of stats dict (output from case_analyzers).

Examples

from copy import deepcopy

from monai.auto3dseg import Analyzer, SampleOperations, SegSummarizer
from monai.auto3dseg.utils import concat_val_to_np

class UserAnalyzer(Analyzer):
    def __init__(self, image_key="image", stats_name="user_stats"):
        self.image_key = image_key
        report_format = {"ndims": None}
        super().__init__(stats_name, report_format)

    def __call__(self, data):
        # case analyzer: store the per-case report under stats_name
        d = dict(data)
        report = deepcopy(self.get_report_format())
        report["ndims"] = d[self.image_key].ndim
        d[self.stats_name] = report
        return d

class UserSummaryAnalyzer(Analyzer):
    def __init__(self, stats_name="user_stats"):
        report_format = {"ndims": None}
        super().__init__(stats_name, report_format)
        self.update_ops("ndims", SampleOperations())

    def __call__(self, data):
        # summary analyzer: aggregate "ndims" over the list of case reports
        report = deepcopy(self.get_report_format())
        v_np = concat_val_to_np(data, [self.stats_name, "ndims"])
        report["ndims"] = self.ops["ndims"].evaluate(v_np)
        return report

# SegSummarizer requires image_key and label_key
summarizer = SegSummarizer("image", "label")
# pass analyzer instances (callables), not the classes themselves
summarizer.add_analyzer(UserAnalyzer(), UserSummaryAnalyzer())

Return type

None

summarize(data)[source]#

Summarize the input list of data and generate a report ready for json/yaml export.

Parameters

data (List[Dict]) – a list of data dicts.

Returns

a dict that summarizes the stats across data samples.

Examples

stats_summary:
    image_foreground_stats:
        intensity: {…}
    image_stats:
        channels: {…}
        cropped_shape: {…}
        …
    label_stats:
        image_intensity: {…}
        label:
        - image_intensity: {…}
        - image_intensity: {…}
        - image_intensity: {…}
        - image_intensity: {…}

class monai.auto3dseg.SummaryOperations[source]#

Apply statistical operations to summarize a dict. The keys look like: {“max”, “min”, “mean”, …}. Each value may contain multiple values in a list format, and the operation is applied to that list. Typically, the dict is generated by multiple SampleOperations and concat_multikeys_to_dict functions.

Examples

import numpy as np
from monai.auto3dseg import SummaryOperations

data = {
    "min": np.random.rand(4),
    "max": np.random.rand(4),
    "mean": np.random.rand(4),
    "sum": np.random.rand(4),
}
op = SummaryOperations()
print(op.evaluate(data)) # "sum" is not registered yet, so it won't contain "sum"

# register with a dict (key: callable), not a set
op.update({"sum": np.sum})
print(op.evaluate(data)) # output has "sum"

evaluate(data, **kwargs)[source]#

Applies the callables to the data, and convert the numerics to list or Python numeric types (int/float).

Parameters

data (Any) – input data

Return type

dict

monai.auto3dseg.algo_from_pickle(pkl_filename, **kwargs)[source]#

Import the Algo object from a pickle file

Parameters
  • pkl_filename (str) – name of the pickle file

  • algo_templates_dir – the algorithm script folder which is needed to instantiate the object. If it is None, the function will use the internal algo_templates_dir in the object dict.

Returns

algo: the Algo-like object.

Raises

ValueError – if the pkl_filename does not contain a dict, or the dict does not contain template_path or algo_bytes.

monai.auto3dseg.algo_to_pickle(algo, **algo_meta_data)[source]#

Export the Algo object to pickle file

Parameters
  • algo (Algo) – Algo-like object

  • algo_meta_data – additional keyword to save into the dictionary. It may include template_path which is used to instantiate the class. It may also include model training info such as acc/best_metrics

Return type

str

Returns

filename of the pickled Algo object
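The export/import pair can be pictured as a pickle round trip of a dict that holds the serialized object plus metadata. The sketch below uses only the standard library; `to_pickle`, `from_pickle`, the plain-dict stand-in for an Algo, and the exact payload layout are assumptions made for illustration, with only the `algo_bytes` key name taken from the Raises description above.

```python
import os
import pickle
import tempfile


def to_pickle(algo, filename, **meta):
    # Store the pickled object next to free-form metadata in one dict.
    payload = {"algo_bytes": pickle.dumps(algo), **meta}
    with open(filename, "wb") as f:
        pickle.dump(payload, f)
    return filename


def from_pickle(filename):
    with open(filename, "rb") as f:
        payload = pickle.load(f)
    if not isinstance(payload, dict) or "algo_bytes" not in payload:
        raise ValueError(f"{filename} does not hold a dict with 'algo_bytes'")
    algo = pickle.loads(payload.pop("algo_bytes"))
    return algo, payload  # remaining entries are the metadata


toy_algo = {"name": "toy_algo", "epochs": 3}  # hypothetical stand-in for an Algo
with tempfile.TemporaryDirectory() as tmp:
    path = to_pickle(toy_algo, os.path.join(tmp, "algo.pkl"), best_metric=0.9)
    restored, meta = from_pickle(path)
print(restored, meta)
```

As with any pickle-based format, only load files from sources you trust, since unpickling can execute arbitrary code.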

monai.auto3dseg.concat_multikeys_to_dict(data_list, fixed_keys, keys, zero_insert=True, **kwargs)[source]#

Get the nested value in a list of dictionaries that share the same structure, iterating over all keys. It returns a dictionary that maps each key to the found values as an np.ndarray.

Parameters
  • data_list (List[Dict]) – a list of dictionaries {key1: {key2: np.ndarray}}.

  • fixed_keys (List[Union[str, int]]) – a list of keys that records the path to the value in the dict elements.

  • keys (List[str]) – a list of string keys that will be iterated to generate a dict output.

  • zero_insert (bool) – insert a zero in the list so that it can find the value in element 0 before getting the keys

  • flatten – if True, numbers are flattened before concat.

Returns

a dict that maps each key to an np.ndarray of the concatenated values.

monai.auto3dseg.concat_val_to_np(data_list, fixed_keys, ragged=False, allow_missing=False, **kwargs)[source]#

Get the nested value in a list of dictionaries that share the same structure.

Parameters
  • data_list (List[Dict]) – a list of dictionaries {key1: {key2: np.ndarray}}.

  • fixed_keys (List[Union[str, int]]) – a list of keys that records the path to the value in the dict elements.

  • ragged (Optional[bool]) – if True, numbers can be in list of lists or ragged format so concat mode needs change.

  • allow_missing (Optional[bool]) – if True, it will return a None if the value cannot be found.

Returns

np.ndarray of the concatenated values.
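The idea of following fixed_keys into each dictionary and concatenating what is found can be sketched with plain lists. The helpers `get_by_path` and `concat_values` are hypothetical; unlike the documented function, this simplified version returns a Python list rather than an array and simply skips records where the value is missing.

```python
def get_by_path(record, fixed_keys):
    """Follow a list of dict keys / list indices down into a nested record."""
    node = record
    for key in fixed_keys:
        node = node[key]
    return node


def concat_values(data_list, fixed_keys, allow_missing=False):
    """Collect the value at fixed_keys from each record into one flat list."""
    out = []
    for record in data_list:
        try:
            value = get_by_path(record, fixed_keys)
        except (KeyError, IndexError, TypeError):
            if allow_missing:
                continue  # simplification: skip records without the value
            raise
        out.extend(value if isinstance(value, list) else [value])
    return out


cases = [
    {"image_stats": {"intensity": {"mean": 1.5}}},
    {"image_stats": {"intensity": {"mean": 2.5}}},
    {"image_stats": {}},  # this case has no intensity entry
]
means = concat_values(cases, ["image_stats", "intensity", "mean"], allow_missing=True)
print(means)  # [1.5, 2.5]
```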

monai.auto3dseg.datafold_read(datalist, basedir, fold=0, key='training')[source]#

Read a list of data dictionaries (the datalist).

Parameters
  • datalist (Union[str, Dict]) – the name of a JSON file listing the data, or a dictionary.

  • basedir (str) – directory of image files.

  • fold (int) – which fold to use (0..1 if in training set).

  • key (str) – usually ‘training’, but can try ‘validation’ or ‘testing’ to get the list data without labels (used in challenges).

Return type

Tuple[List, List]

Returns

A tuple of two arrays (training, validation).
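A minimal sketch of how a datalist might be split into the (training, validation) pair follows. It assumes that each entry carries a `fold` field and that entries matching the requested fold become the validation set, which is not stated above; `split_by_fold`, the file names and `/data` are hypothetical.

```python
import os


def split_by_fold(entries, basedir, fold=0):
    """Return (training, validation) lists, joining file names onto basedir."""
    training, validation = [], []
    for entry in entries:
        item = dict(entry)
        for key in ("image", "label"):
            if key in item:
                item[key] = os.path.join(basedir, item[key])
        # Assumption: entries tagged with the chosen fold form the validation set.
        (validation if item.get("fold") == fold else training).append(item)
    return training, validation


datalist = {
    "training": [
        {"image": "img0.nii.gz", "label": "seg0.nii.gz", "fold": 0},
        {"image": "img1.nii.gz", "label": "seg1.nii.gz", "fold": 1},
        {"image": "img2.nii.gz", "label": "seg2.nii.gz", "fold": 1},
    ]
}
train, val = split_by_fold(datalist["training"], "/data", fold=0)
print(len(train), len(val))  # 2 1
```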

monai.auto3dseg.get_foreground_image(image)[source]#

Get a foreground image by removing all-zero rectangles on the edges of the image. Note for the developer: update select_fn if the foreground is defined differently.

Parameters

image (MetaTensor) – ndarray image to segment.

Returns

ndarray of foreground image by removing all-zero edges.

Notes

The size of the output is no larger than the input, and is smaller whenever all-zero edges are removed.
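Removing all-zero edges amounts to cropping to the bounding box of the non-zero values. The sketch below shows this on a 2D list of lists using only the standard library; `crop_foreground` is a hypothetical helper and does not reflect how the function handles tensors or channels.

```python
def crop_foreground(image):
    """Crop a 2D list-of-lists to the bounding box of its non-zero values."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:
        return []  # no foreground at all
    return [row[cols[0]:cols[-1] + 1] for row in image[rows[0]:rows[-1] + 1]]


image = [
    [0, 0, 0, 0],
    [0, 5, 0, 0],
    [0, 2, 7, 0],
    [0, 0, 0, 0],
]
cropped = crop_foreground(image)
print(cropped)  # [[5, 0], [2, 7]]
```

Zeros inside the bounding box are kept; only the all-zero border rows and columns are dropped.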

monai.auto3dseg.get_foreground_label(image, label)[source]#

Get foreground image pixel values and mask out the non-labeled area.

Parameters
  • image – ndarray image to segment.

  • label – ndarray annotation of the image input, with class IDs.

Return type

MetaTensor

Returns

1D array of foreground image with label > 0

monai.auto3dseg.get_label_ccp(mask_index, use_gpu=True)[source]#

Find all connected components and their bounding shape. Backend can be CuPy/cuCIM or NumPy depending on the hardware.

Parameters
  • mask_index (MetaTensor) – a binary mask.

  • use_gpu (bool) – a switch to use GPU/CUDA or not. If GPU is unavailable, CPU will be used regardless of this setting.

Return type

Tuple[List[Any], int]
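For readers unfamiliar with connected component analysis, the following CPU-only sketch finds 4-connected components in a small 2D binary mask and reports each component's bounding-box shape together with the component count. The function `label_ccp`, the 2D restriction and the 4-connectivity are assumptions for illustration; the library backends and their connectivity rules may differ.

```python
from collections import deque


def label_ccp(mask):
    """Return (bounding-box shapes, component count) for a 2D 0/1 mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    shapes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            rmin = rmax = r0
            cmin = cmax = c0
            while queue:
                r, c = queue.popleft()
                rmin, rmax = min(rmin, r), max(rmax, r)
                cmin, cmax = min(cmin, c), max(cmax, c)
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            shapes.append((rmax - rmin + 1, cmax - cmin + 1))
    return shapes, len(shapes)


mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]
shapes, ncomponents = label_ccp(mask)
print(shapes, ncomponents)  # [(2, 2), (2, 1)] 2
```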

monai.auto3dseg.verify_report_format(report, report_format)[source]#

Compares the report and the report_format that has only keys.

Parameters
  • report (dict) – dict that has real values.

  • report_format (dict) – dict that only has keys and list-nested values.
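One way to picture the comparison is a recursive key check in which a one-element list in the format describes every item of the matching list in the report. The function `verify_format` below is a hypothetical sketch under that assumption and may not match the library's exact rules.

```python
def verify_format(report, report_format):
    """Check that report has every key of report_format, recursing into lists."""
    for key, fmt_value in report_format.items():
        if key not in report:
            return False
        value = report[key]
        if isinstance(fmt_value, list) and fmt_value:
            # Assumption: the first list element is the template for all items.
            if not isinstance(value, list):
                return False
            if not all(verify_format(item, fmt_value[0]) for item in value):
                return False
    return True


report_format = {"ndims": None, "label": [{"image_intensity": None}]}
good = {"ndims": 3, "label": [{"image_intensity": 0.4}, {"image_intensity": 0.7}]}
bad = {"ndims": 3, "label": [{"pixel_count": 10}]}
print(verify_format(good, report_format), verify_format(bad, report_format))  # True False
```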