# Metrics

## Mean Dice

monai.metrics.compute_meandice(y_pred, y, include_background=True)

Computes the Dice score metric from full-size tensors and collects the average.

Parameters
• y_pred (Tensor) – input data to compute, typical segmentation model output. It must be one-hot format and first dim is batch, example shape: [16, 3, 32, 32]. The values should be binarized.

• y (Tensor) – ground truth to compute mean dice metric. It must be one-hot format and first dim is batch. The values should be binarized.

• include_background (bool) – whether to include Dice computation on the first channel of the predicted output. Defaults to True.

Return type

Tensor

Returns

Dice scores per batch item and per class (shape [batch_size, n_classes]).

Raises

ValueError – when y_pred and y have different shapes.
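The computation described above can be illustrated with a minimal NumPy sketch. It is not MONAI's implementation: the function name `dice_per_class` is hypothetical, and returning NaN when both prediction and ground truth are empty for a class is an assumption of this sketch.

```python
import numpy as np


def dice_per_class(y_pred, y, include_background=True):
    """Illustrative Dice per sample and class (hypothetical helper, not MONAI code)."""
    y_pred = np.asarray(y_pred, dtype=float)
    y = np.asarray(y, dtype=float)
    if y_pred.shape != y.shape:
        raise ValueError("y_pred and y must have the same shape")
    if not include_background:
        # Drop channel 0, which is assumed to be the background class.
        y_pred, y = y_pred[:, 1:], y[:, 1:]
    # Sum over every spatial axis, keeping the batch and channel axes.
    spatial_axes = tuple(range(2, y_pred.ndim))
    intersection = (y_pred * y).sum(axis=spatial_axes)
    denominator = y_pred.sum(axis=spatial_axes) + y.sum(axis=spatial_axes)
    # Assumption of this sketch: an empty prediction and ground truth give NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(denominator > 0, 2.0 * intersection / denominator, np.nan)


# One sample, two one-hot channels, 2 x 2 image.
pred = np.array([[[[1, 1], [0, 0]], [[0, 0], [1, 1]]]])
truth = np.array([[[[1, 0], [0, 0]], [[0, 1], [1, 1]]]])
scores = dice_per_class(pred, truth)  # shape [batch_size, n_classes]
```

The result has one row per batch item and one column per class, matching the documented return shape.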

class monai.metrics.DiceMetric(include_background=True, reduction=<MetricReduction.MEAN: 'mean'>)

Computes the average Dice score between two tensors. It supports both multi-class and multi-label tasks. Input y_pred (BNHW[D], where N is the number of classes) is compared with ground truth y (BNHW[D]). y_pred is expected to hold binarized predictions and y should be in one-hot format. You can first apply suitable transforms from monai.transforms.post to obtain binarized values. The include_background parameter can be set to False for an instance of DiceMetric to exclude the first category (channel index 0), which is by convention assumed to be background. If the non-background segmentations are small compared to the total image size, they can be overwhelmed by the signal from the background, so excluding it in such cases helps convergence.

Parameters
• include_background (bool) – whether to include Dice computation on the first channel of the predicted output. Defaults to True.

• reduction (Union[MetricReduction, str]) – {"none", "mean", "sum", "mean_batch", "sum_batch", "mean_channel", "sum_channel"} Defines how the results computed for one batch of data are reduced. Defaults to "mean".
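As a rough illustration of what some of the reduction modes could do to a [batch_size, n_classes] score array, here is a NumPy sketch. The interpretation of each mode name (and the choice to ignore NaN entries) is an assumption of this sketch, and `reduce_scores` is a hypothetical helper rather than a MONAI function; the "sum" variants are omitted.

```python
import numpy as np


def reduce_scores(scores, reduction="mean"):
    """Hypothetical reduction over a [batch, class] array, ignoring NaN entries."""
    scores = np.asarray(scores, dtype=float)
    if reduction == "none":
        return scores
    if reduction == "mean_batch":
        # Assumed meaning: average over the batch axis, one value per class.
        return np.nanmean(scores, axis=0)
    if reduction == "mean_channel":
        # Assumed meaning: average over the class axis, one value per sample.
        return np.nanmean(scores, axis=1)
    if reduction == "mean":
        # Assumed meaning: average over classes first, then over the batch.
        return np.nanmean(np.nanmean(scores, axis=1))
    raise ValueError(f"unsupported reduction in this sketch: {reduction}")


scores = np.array([[0.5, 1.0], [0.7, np.nan]])
per_class = reduce_scores(scores, "mean_batch")
per_sample = reduce_scores(scores, "mean_channel")
overall = reduce_scores(scores, "mean")
```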

## Area under the ROC curve

monai.metrics.compute_roc_auc(y_pred, y, to_onehot_y=False, softmax=False, other_act=None, average=<Average.MACRO: 'macro'>)

Computes the Area Under the Receiver Operating Characteristic Curve (ROC AUC). See also sklearn.metrics.roc_auc_score.

Parameters
• y_pred (Tensor) – input data to compute, typically the output of a classification model. It must be in one-hot format with the batch as the first dimension, for example shape [16] or [16, 2].

• y (Tensor) – ground truth to compute the ROC AUC metric; the first dimension is the batch. For example, shape [16, 1] will be converted into [16, 2] (where 2 is inferred from y_pred).

• to_onehot_y (bool) – whether to convert y into the one-hot format. Defaults to False.

• softmax (bool) – whether to add softmax function to y_pred before computation. Defaults to False.

• other_act (Optional[Callable]) – callable function to replace softmax as the activation layer if needed. Defaults to None. For example: other_act = lambda x: torch.log_softmax(x, dim=1).

• average (Union[Average, str]) – {"macro", "weighted", "micro", "none"} Type of averaging performed if not binary classification. Defaults to "macro".

  • "macro": calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

  • "weighted": calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).

  • "micro": calculate metrics globally by considering each element of the label indicator matrix as a label.

  • "none": the scores for each class are returned.

Raises
• ValueError – When y_pred dimension is not one of [1, 2].

• ValueError – When y dimension is not one of [1, 2].

• ValueError – When softmax=True and other_act is not None. Incompatible values.

• TypeError – When other_act is not an Optional[Callable].

• ValueError – When average is not one of ["macro", "weighted", "micro", "none"].

Note

ROC AUC expects y to consist of 0s and 1s. y_pred must be either probability estimates or confidence values.

Return type

Union[ndarray, List[float], float]
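To make the "macro" averaging concrete, the following pure-Python sketch computes a binary ROC AUC through the pairwise ranking identity (the fraction of positive/negative pairs ordered correctly, with ties counted as one half) and then takes the unweighted mean over classes. The helper names are hypothetical and this is not how MONAI computes the value internally.

```python
def binary_roc_auc(scores, labels):
    """AUC as the probability that a positive sample outranks a negative one."""
    positives = [s for s, t in zip(scores, labels) if t == 1]
    negatives = [s for s, t in zip(scores, labels) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute ROC AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correctly ordered pair
    return wins / (len(positives) * len(negatives))


def macro_roc_auc(y_pred, y_onehot):
    """Unweighted mean of the per-class AUC values ("macro" averaging)."""
    n_classes = len(y_pred[0])
    per_class = [
        binary_roc_auc([row[c] for row in y_pred], [row[c] for row in y_onehot])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes, per_class


# Four samples, two classes; y is already one-hot.
y_pred = [[0.9, 0.1], [0.6, 0.4], [0.65, 0.35], [0.2, 0.8]]
y = [[1, 0], [1, 0], [0, 1], [0, 1]]
macro_auc, class_aucs = macro_roc_auc(y_pred, y)
single_auc = binary_roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

The pairwise form is quadratic in the number of samples, so it is only meant for reading, not for large evaluation sets.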

## Confusion matrix

monai.metrics.get_confusion_matrix(y_pred, y, include_background=True)

Computes the confusion matrix. A tensor of shape [B, C, 4] is returned, where B is the batch size and C is the number of classes to be computed. The third dimension holds the number of true positive, false positive, true negative and false negative values for each channel of each sample in the input batch.

Parameters
• y_pred (Tensor) – input data to compute. It must be one-hot format and first dim is batch. The values should be binarized.

• y (Tensor) – ground truth to compute the metric. It must be one-hot format and first dim is batch. The values should be binarized.

• include_background (bool) – whether to include metric computation on the first channel of the predicted output. Defaults to True.

Raises

ValueError – when y_pred and y have different shapes.
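The [B, C, 4] layout described above can be sketched in NumPy as follows. The function name `confusion_counts` is hypothetical and the code is an illustration under the stated input assumptions (binarized, one-hot, batch first), not MONAI's implementation.

```python
import numpy as np


def confusion_counts(y_pred, y, include_background=True):
    """Return an array of shape [B, C, 4] holding TP, FP, TN, FN per sample and class."""
    y_pred = np.asarray(y_pred).astype(bool)
    y = np.asarray(y).astype(bool)
    if y_pred.shape != y.shape:
        raise ValueError("y_pred and y must have the same shape")
    if not include_background:
        # Drop channel 0, which is assumed to be the background class.
        y_pred, y = y_pred[:, 1:], y[:, 1:]
    spatial_axes = tuple(range(2, y_pred.ndim))
    tp = (y_pred & y).sum(axis=spatial_axes)
    fp = (y_pred & ~y).sum(axis=spatial_axes)
    tn = (~y_pred & ~y).sum(axis=spatial_axes)
    fn = (~y_pred & y).sum(axis=spatial_axes)
    # Stack in the documented order: TP, FP, TN, FN.
    return np.stack([tp, fp, tn, fn], axis=-1)


# One sample, two one-hot channels, 2 x 2 image.
pred = np.array([[[[1, 1], [0, 0]], [[0, 0], [1, 1]]]])
truth = np.array([[[[1, 0], [0, 0]], [[0, 1], [1, 1]]]])
counts = confusion_counts(pred, truth)
```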

class monai.metrics.ConfusionMatrixMetric(include_background=True, metric_name='hit_rate', compute_sample=False, reduction=<MetricReduction.MEAN: 'mean'>)

Computes confusion-matrix-related metrics. It supports all metrics listed on the Wikipedia page "Confusion matrix", for both multi-class and multi-label classification and segmentation tasks. y_pred is expected to hold binarized predictions and y should be in one-hot format. You can first apply suitable transforms from monai.transforms.post to obtain binarized values. The include_background parameter can be set to False for an instance to exclude the first category (channel index 0), which is by convention assumed to be background. If the non-background segmentations are small compared to the total image size, they can be overwhelmed by the signal from the background, so excluding it in such cases helps convergence.

Parameters
• include_background (bool) – whether to include metric computation on the first channel of the predicted output. Defaults to True.

• metric_name (Union[Sequence[str], str]) – ["sensitivity", "specificity", "precision", "negative predictive value", "miss rate", "fall out", "false discovery rate", "false omission rate", "prevalence threshold", "threat score", "accuracy", "balanced accuracy", "f1 score", "matthews correlation coefficient", "fowlkes mallows index", "informedness", "markedness"] Some of the metrics have multiple aliases (as shown on the Wikipedia page mentioned above), and you can use those names instead. Besides a single metric, multiple metrics are supported by passing a sequence of metric names, such as ("sensitivity", "precision", "recall"). If compute_sample is True, multiple f and not_nans values are returned in the same order as the input names when the class is called.

• compute_sample (bool) – if True, each sample's metric is computed first. If False, the confusion matrix for each image (the output of get_confusion_matrix) is returned. In that case, users should collect the confusion matrices for all images during an epoch and then use compute_confusion_matrix_metric to calculate the metric. Defaults to False.

• reduction (Union[MetricReduction, str]) – {"none", "mean", "sum", "mean_batch", "sum_batch", "mean_channel", "sum_channel"} Defines how the results computed for one batch of data are reduced. Reduction is only applied when compute_sample is True. Defaults to "mean".
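The workflow described for compute_sample=False (collect per-image confusion counts over an epoch, then derive the metric) can be sketched in pure Python. The helper `pooled_metrics` is hypothetical and only covers a few of the listed metrics using their standard definitions; it assumes every denominator is non-zero.

```python
def pooled_metrics(per_image_counts):
    """Derive metrics from (TP, FP, TN, FN) tuples pooled over several images."""
    tp = sum(c[0] for c in per_image_counts)
    fp = sum(c[1] for c in per_image_counts)
    tn = sum(c[2] for c in per_image_counts)
    fn = sum(c[3] for c in per_image_counts)
    # Standard definitions; this sketch assumes no denominator is zero.
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1 score": 2 * tp / (2 * tp + fp + fn),
    }


# Counts for one class collected from two images during an epoch.
epoch_counts = [(8, 2, 85, 5), (6, 1, 90, 3)]
metrics = pooled_metrics(epoch_counts)
```

Pooling the counts before dividing gives a different result from averaging per-image metrics, which is one reason to keep the raw confusion matrices when compute_sample is False.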

## Hausdorff distance

monai.metrics.compute_hausdorff_distance(y_pred, y, include_background=False, distance_metric='euclidean', percentile=None, directed=False)

Compute the Hausdorff distance.

Parameters
• y_pred (Union[ndarray, Tensor]) – input data to compute, typical segmentation model output. It must be one-hot format and first dim is batch, example shape: [16, 3, 32, 32]. The values should be binarized.

• y (Union[ndarray, Tensor]) – ground truth to compute the distance. It must be one-hot format and first dim is batch. The values should be binarized.

• include_background (bool) – whether to include distance computation on the first channel of the predicted output. Defaults to False.

• distance_metric (str) – ["euclidean", "chessboard", "taxicab"] the metric used to compute surface distance. Defaults to "euclidean".

• percentile (Optional[float]) – an optional float number between 0 and 100. If specified, the corresponding percentile of the Hausdorff distance is returned instead of the maximum. Defaults to None.

• directed (bool) – whether to calculate directed Hausdorff distance. Defaults to False.
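The roles of distance_metric, percentile and directed can be illustrated on small point sets in pure Python. This is a brute-force sketch with hypothetical helper names; it takes coordinates directly instead of extracting surfaces from masks, and combining the two directions with a maximum in the undirected percentile case is an assumption of the sketch.

```python
import math


def point_distance(a, b, distance_metric="euclidean"):
    """Distance between two coordinate tuples under the chosen metric."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if distance_metric == "euclidean":
        return math.sqrt(sum(d * d for d in diffs))
    if distance_metric == "chessboard":
        return max(diffs)
    if distance_metric == "taxicab":
        return sum(diffs)
    raise ValueError(f"unsupported distance_metric: {distance_metric}")


def percentile_of(values, q):
    """Percentile with linear interpolation between the closest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    low, high = math.floor(rank), math.ceil(rank)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def hausdorff(points_a, points_b, distance_metric="euclidean", percentile=None, directed=False):
    """Brute-force Hausdorff distance between two non-empty point sets."""

    def one_way(src, dst):
        # Distance from every source point to its nearest destination point.
        nearest = [min(point_distance(s, d, distance_metric) for d in dst) for s in src]
        return max(nearest) if percentile is None else percentile_of(nearest, percentile)

    forward = one_way(points_a, points_b)
    if directed:
        return forward
    return max(forward, one_way(points_b, points_a))


a = [(0, 0), (0, 1)]
b = [(0, 0), (3, 4)]
hd = hausdorff(a, b)
hd_directed = hausdorff(a, b, directed=True)
hd_chessboard = hausdorff(a, b, distance_metric="chessboard")
hd_taxicab = hausdorff(a, b, distance_metric="taxicab")
hd_median = hausdorff(a, b, percentile=50)
```

The directed value only measures how far points of the first set are from the second, so it can be much smaller than the undirected value when the second set contains an outlier.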

class monai.metrics.HausdorffDistanceMetric(include_background=False, distance_metric='euclidean', percentile=None, directed=False, reduction=<MetricReduction.MEAN: 'mean'>)

Computes the Hausdorff distance between two tensors. It supports both multi-class and multi-label tasks, and both directed and non-directed Hausdorff distance calculation. In addition, specifying the percentile parameter returns the given percentile of the distance. Input y_pred (BNHW[D], where N is the number of classes) is compared with ground truth y (BNHW[D]). y_pred is expected to hold binarized predictions and y should be in one-hot format. You can first apply suitable transforms from monai.transforms.post to obtain binarized values.

Parameters
• include_background (bool) – whether to include distance computation on the first channel of the predicted output. Defaults to False.

• distance_metric (str) – ["euclidean", "chessboard", "taxicab"] the metric used to compute surface distance. Defaults to "euclidean".

• percentile (Optional[float]) – an optional float number between 0 and 100. If specified, the corresponding percentile of the Hausdorff distance is returned instead of the maximum. Defaults to None.

• directed (bool) – whether to calculate directed Hausdorff distance. Defaults to False.

• reduction (Union[MetricReduction, str]) – {"none", "mean", "sum", "mean_batch", "sum_batch", "mean_channel", "sum_channel"} Defines how the results computed for one batch of data are reduced. Defaults to "mean".

## Average surface distance

monai.metrics.compute_average_surface_distance(y_pred, y, include_background=False, symmetric=False, distance_metric='euclidean')

Computes the Average Surface Distance from y_pred to y under the default setting. If symmetric = True is set, the average symmetric surface distance between the two inputs is returned instead.

Parameters
• y_pred (Union[ndarray, Tensor]) – input data to compute, typical segmentation model output. It must be one-hot format and first dim is batch, example shape: [16, 3, 32, 32]. The values should be binarized.

• y (Union[ndarray, Tensor]) – ground truth to compute the distance. It must be one-hot format and first dim is batch. The values should be binarized.

• include_background (bool) – whether to include distance computation on the first channel of the predicted output. Defaults to False.

• symmetric (bool) – whether to calculate the symmetric average surface distance between y_pred and y. Defaults to False.

• distance_metric (str) – ["euclidean", "chessboard", "taxicab"] the metric used to compute surface distance. Defaults to "euclidean".
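The difference between the default and symmetric settings can be seen in a small pure-Python sketch on point sets. The helper name is hypothetical, only the Euclidean metric is shown, and pooling the distances from both directions before averaging is an assumption of this sketch about how the symmetric variant is formed.

```python
import math


def average_surface_distance(points_pred, points_gt, symmetric=False):
    """Mean nearest-neighbour distance between two non-empty point sets."""

    def nearest_distances(src, dst):
        # Euclidean distance from each source point to its closest destination point.
        return [min(math.dist(s, d) for d in dst) for s in src]

    distances = nearest_distances(points_pred, points_gt)
    if symmetric:
        # Assumption: pool the distances of both directions, then average.
        distances = distances + nearest_distances(points_gt, points_pred)
    return sum(distances) / len(distances)


pred_surface = [(0, 0), (0, 1)]
gt_surface = [(0, 0), (3, 4)]
asd = average_surface_distance(pred_surface, gt_surface)
assd = average_surface_distance(pred_surface, gt_surface, symmetric=True)
```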

class monai.metrics.SurfaceDistanceMetric(include_background=False, symmetric=False, distance_metric='euclidean', reduction=<MetricReduction.MEAN: 'mean'>)

Computes the surface distance between two tensors. It supports both multi-class and multi-label tasks, and both symmetric and asymmetric surface distance calculation. Input y_pred (BNHW[D], where N is the number of classes) is compared with ground truth y (BNHW[D]). y_pred is expected to hold binarized predictions and y should be in one-hot format. You can first apply suitable transforms from monai.transforms.post to obtain binarized values.

Parameters
• include_background (bool) – whether to include distance computation on the first channel of the predicted output. Defaults to False.

• symmetric (bool) – whether to calculate the symmetric average surface distance between y_pred and y. Defaults to False.

• distance_metric (str) – ["euclidean", "chessboard", "taxicab"] the metric used to compute surface distance. Defaults to "euclidean".

• reduction (Union[MetricReduction, str]) – {"none", "mean", "sum", "mean_batch", "sum_batch", "mean_channel", "sum_channel"} Defines how the results computed for one batch of data are reduced. Defaults to "mean".

## Occlusion sensitivity

monai.metrics.compute_occlusion_sensitivity(model, image, label, pad_val=0.0, margin=2, n_batch=128, b_box=None, stride=1, upsample_mode='nearest')

This function computes the occlusion sensitivity for a model’s prediction of a given image. By occlusion sensitivity, we mean how the probability of a given prediction changes as the occluded section of an image changes. This can be useful to understand why a network is making certain decisions.

The result is given as baseline (the probability of a certain output) minus the probability of the output with the occluded area.

Therefore, higher values in the output image mean there was a greater drop in certainty, indicating the occluded region was more important in the decision process.

See: R. R. Selvaraju et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. https://doi.org/10.1109/ICCV.2017.74

Parameters
• model (Module) – classification model to use for inference

• image (Tensor) – image to test. Should be a tensor with a batch size of 1; can be 2D or 3D.

• label (Union[int, Tensor]) – classification label to check for changes (normally the true label, but doesn’t have to be)

• pad_val (float) – value used to fill the occluded part of the image.

• margin (Union[int, Sequence]) – we’ll create a cuboid/cube around the voxel to be occluded. if margin==2, then we’ll create a cube that is +/- 2 voxels in all directions (i.e., a cube of 5 x 5 x 5 voxels). A Sequence can be supplied to have a margin of different sizes (i.e., create a cuboid).

• n_batch (int) – number of images in a batch before inference.

• b_box (Optional[Sequence]) – Bounding box on which to perform the analysis. The output image will also match in size. There should be a minimum and maximum for all dimensions except batch: [min1, max1, min2, max2,...].

  • By default, the whole image will be used. Decreasing the size will speed the analysis up, which might be useful for larger images.

  • Min and max are inclusive, so [0, 63, …] will have size (64, …).

  • Use a negative value to use 0 for min values and im.shape[x]-1 for the xth dimension.

• stride (Union[int, Sequence]) – Stride for performing occlusions. Can be single value or sequence (for varying stride in the different directions). Should be >= 1.

• upsample_mode (str) – If stride != 1 is used, the output is upsampled so that the size of its voxels matches the input. Upsampling is done with torch.nn.Upsample, and the mode can be set to nearest, linear, bilinear, bicubic or trilinear. Defaults to nearest.

Return type

ndarray

Returns

Numpy array. If no bounding box is supplied, this will be the same size as the input image. If a bounding box is used, the output image will be cropped to this size.
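The idea of "baseline minus occluded probability" can be sketched for a 2D image in NumPy. This is a simplified illustration, not MONAI's implementation: `occlusion_map` and `toy_model` are hypothetical, the model is any callable returning class probabilities, and batching, stride, bounding boxes and upsampling are left out.

```python
import numpy as np


def occlusion_map(model, image, label, pad_val=0.0, margin=2):
    """Drop in probability of `label` when a patch around each pixel is occluded."""
    baseline = model(image)[label]
    result = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            occluded = image.copy()
            # Patch of +/- margin pixels around (i, j), clipped at the image border.
            occluded[max(i - margin, 0): i + margin + 1,
                     max(j - margin, 0): j + margin + 1] = pad_val
            result[i, j] = baseline - model(occluded)[label]
    return result


def toy_model(img):
    """Hypothetical classifier whose class-1 probability is the top-left pixel value."""
    p = float(np.clip(img[0, 0], 0.0, 1.0))
    return [1.0 - p, p]


image = np.ones((4, 4))
sensitivity = occlusion_map(toy_model, image, label=1, margin=1)
```

With this toy model, only occlusions that cover the top-left pixel reduce the class-1 probability, so the non-zero part of the map is confined to the top-left corner.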