Metrics
Mean Dice

monai.metrics.compute_meandice(y_pred, y, include_background=True, to_onehot_y=False, mutually_exclusive=False, sigmoid=False, other_act=None, logit_thresh=0.5)
Computes the Dice score metric from full-size tensors and collects the average.
 Parameters
y_pred (Tensor) – input data to compute, typically the output of a segmentation model. It must be in one-hot format with the batch as the first dimension, for example shape [16, 3, 32, 32].
y (Tensor) – ground truth used to compute the mean Dice metric, with the batch as the first dimension. With to_onehot_y=True, an input of shape [16, 1, 32, 32] is converted into [16, 3, 32, 32]. Alternatively, pass shape [16, 3, 32, 32] and set to_onehot_y=False to use 3-class labels directly.
include_background (bool) – whether to include the first channel of the predicted output in the Dice computation; set to False to skip it. Defaults to True.
to_onehot_y (bool) – whether to convert y into the one-hot format. Defaults to False.
mutually_exclusive (bool) – if True, y_pred will be converted into a binary matrix using a combination of argmax and to_onehot. Defaults to False.
sigmoid (bool) – whether to apply a sigmoid function to y_pred before computation. Defaults to False.
other_act (Optional[Callable]) – callable function to replace sigmoid as the activation layer if needed, for example: other_act = torch.tanh. Defaults to None.
logit_thresh (float) – the threshold value used to convert y_pred into a binary matrix (applied after sigmoid, for example, if sigmoid=True). Defaults to 0.5.
 Raises
ValueError – When sigmoid=True and other_act is not None. Incompatible values.
TypeError – When other_act is not an Optional[Callable].
ValueError – When sigmoid=True and mutually_exclusive=True. Incompatible values.
 Return type
Tensor
 Returns
Dice scores per batch and per class, (shape [batch_size, n_classes]).
Note
 This method provides two options to convert y_pred into a binary matrix:
when mutually_exclusive is True, it uses a combination of argmax and to_onehot;
when mutually_exclusive is False, it uses the threshold logit_thresh (optionally with a sigmoid function before thresholding).
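The two binarization options in the note above can be illustrated with a small plain-Python sketch. It is not MONAI's implementation: the helper names `binarize` and `dice_per_class` are hypothetical, a single sample is represented as a list of channels holding flat per-pixel scores, the threshold comparison is assumed to be `>=`, and returning NaN for a class that is empty in both inputs is an assumption.

```python
def binarize(y_pred, mutually_exclusive=False, logit_thresh=0.5):
    """Binarize one sample given as [n_classes][n_pixels] scores (hypothetical helper)."""
    n_classes, n_pixels = len(y_pred), len(y_pred[0])
    if mutually_exclusive:
        # argmax over the class axis, then one-hot encode the winning class
        winners = [max(range(n_classes), key=lambda c: y_pred[c][i]) for i in range(n_pixels)]
        return [[1 if winners[i] == c else 0 for i in range(n_pixels)] for c in range(n_classes)]
    # independent threshold on every channel (">=" is an assumption of this sketch)
    return [[1 if v >= logit_thresh else 0 for v in channel] for channel in y_pred]


def dice_per_class(pred_bin, y_onehot):
    """Dice = 2 * |P ∩ T| / (|P| + |T|), computed per channel."""
    scores = []
    for p, t in zip(pred_bin, y_onehot):
        intersection = sum(a * b for a, b in zip(p, t))
        denominator = sum(p) + sum(t)
        # NaN for a class absent from both inputs (assumption of this sketch)
        scores.append(2.0 * intersection / denominator if denominator else float("nan"))
    return scores


y_pred = [[0.9, 0.6, 0.2, 0.1],   # channel 0 scores for 4 pixels
          [0.1, 0.7, 0.8, 0.3]]   # channel 1 scores for 4 pixels
y = [[1, 1, 0, 0],
     [0, 0, 1, 1]]

thresholded = dice_per_class(binarize(y_pred), y)
exclusive = dice_per_class(binarize(y_pred, mutually_exclusive=True), y)
print(thresholded, exclusive)
```

With thresholding, a pixel can be assigned to several channels at once (the second pixel here), whereas the argmax path assigns every pixel to exactly one channel, so the two options generally give different scores.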

class monai.metrics.DiceMetric(include_background=True, to_onehot_y=False, mutually_exclusive=False, sigmoid=False, other_act=None, logit_thresh=0.5, reduction=<MetricReduction.MEAN: 'mean'>)
Compute the average Dice score between two tensors. It supports both multi-class and multi-label tasks. Input logits y_pred (BNHW[D], where N is the number of classes) are compared with ground truth y (BNHW[D]). Axis N of y_pred is expected to hold logit predictions for each class rather than image channels, while the same axis of y can be 1 or N (one-hot format). The include_background class attribute can be set to False for an instance of DiceMetric to exclude the first category (channel index 0), which by convention is assumed to be background. If the non-background segmentations are small compared to the total image size, they can get overwhelmed by the signal from the background, so excluding it in such cases helps convergence.
 Parameters
include_background (bool) – whether to include the first channel of the predicted output in the Dice computation; set to False to skip it. Defaults to True.
to_onehot_y (bool) – whether to convert y into the one-hot format. Defaults to False.
mutually_exclusive (bool) – if True, y_pred will be converted into a binary matrix using a combination of argmax and to_onehot. Defaults to False.
sigmoid (bool) – whether to apply a sigmoid function to y_pred before computation. Defaults to False.
other_act (Optional[Callable]) – callable function to replace sigmoid as the activation layer if needed, for example: other_act = torch.tanh. Defaults to None.
logit_thresh (float) – the threshold value used to convert y_pred into a binary matrix (applied after sigmoid, for example, if sigmoid=True). Defaults to 0.5.
reduction (Union[MetricReduction, str]) – {"none", "mean", "sum", "mean_batch", "sum_batch", "mean_channel", "sum_channel"} Defines the mode used to reduce the computation result of one batch of data. Defaults to "mean".
 Raises
ValueError – When sigmoid=True and other_act is not None. Incompatible values.
Area under the ROC curve

monai.metrics.compute_roc_auc(y_pred, y, to_onehot_y=False, softmax=False, other_act=None, average=<Average.MACRO: 'macro'>)
Computes the Area Under the Receiver Operating Characteristic Curve (ROC AUC). Referring to: sklearn.metrics.roc_auc_score.
 Parameters
y_pred (Tensor) – input data to compute, typically the output of a classification model. It must be in one-hot format with the batch as the first dimension, for example shape [16] or [16, 2].
y (Tensor) – ground truth used to compute the ROC AUC metric, with the batch as the first dimension. With to_onehot_y=True, an input of shape [16, 1] is converted into [16, 2] (where 2 is inferred from y_pred).
to_onehot_y (bool) – whether to convert y into the one-hot format. Defaults to False.
softmax (bool) – whether to apply a softmax function to y_pred before computation. Defaults to False.
other_act (Optional[Callable]) – callable function to replace softmax as the activation layer if needed, for example: other_act = lambda x: torch.log_softmax(x). Defaults to None.
average (Union[Average, str]) – {"macro", "weighted", "micro", "none"} Type of averaging performed if not binary classification. Defaults to "macro".
"macro": calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
"weighted": calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).
"micro": calculate metrics globally by considering each element of the label indicator matrix as a label.
"none": the scores for each class are returned.
 Raises
ValueError – When y_pred dimension is not one of [1, 2].
ValueError – When y dimension is not one of [1, 2].
ValueError – When softmax=True and other_act is not None. Incompatible values.
TypeError – When other_act is not an Optional[Callable].
ValueError – When average is not one of ["macro", "weighted", "micro", "none"].
Note
ROC AUC expects y to consist of 0s and 1s. y_pred must be either probability estimates or confidence values.
 Return type
Union[ndarray, List[float], float]
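To make the averaging modes concrete, here is a minimal plain-Python sketch. It does not reproduce MONAI's or scikit-learn's code: it uses the rank-statistic view of ROC AUC (the fraction of positive/negative pairs in which the positive sample scores higher, with ties counted as one half), and the function names `binary_roc_auc` and `roc_auc` are hypothetical. Inputs are assumed to be rows of per-class scores with one-hot labels.

```python
def binary_roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute ROC AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(pos) * len(neg))


def roc_auc(y_pred, y_onehot, average="macro"):
    """y_pred and y_onehot are [n_samples][n_classes] lists (hypothetical helper)."""
    n_classes = len(y_pred[0])
    per_class = [
        binary_roc_auc([row[c] for row in y_pred], [row[c] for row in y_onehot])
        for c in range(n_classes)
    ]
    if average == "none":
        return per_class
    if average == "macro":
        return sum(per_class) / n_classes
    if average == "weighted":
        # weight each class by its support (number of true instances)
        support = [sum(row[c] for row in y_onehot) for c in range(n_classes)]
        return sum(a * w for a, w in zip(per_class, support)) / sum(support)
    if average == "micro":
        # treat every element of the label indicator matrix as its own sample
        flat_scores = [v for row in y_pred for v in row]
        flat_labels = [v for row in y_onehot for v in row]
        return binary_roc_auc(flat_scores, flat_labels)
    raise ValueError('average must be one of "macro", "weighted", "micro", "none"')


y_pred = [[0.9, 0.1], [0.6, 0.4], [0.65, 0.35], [0.2, 0.8]]
y = [[1, 0], [1, 0], [0, 1], [0, 1]]
print(roc_auc(y_pred, y, "none"), roc_auc(y_pred, y, "macro"), roc_auc(y_pred, y, "micro"))
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; sort-based implementations are the usual choice for real data.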
Confusion Matrix

monai.metrics.compute_confusion_matrix(y_pred, y, to_onehot_y=False, activation=None, bin_mode='threshold', bin_threshold=0.5, metric_name='hit_rate', average=<Average.MACRO: 'macro'>, zero_division=0)
Compute confusion-matrix-related metrics. This function can calculate all metrics mentioned in the Wikipedia page "Confusion matrix". Before the calculation, an activation function and/or a binarization step can be applied to preprocess the original inputs. Zero division is handled by replacing the result with a single value. Referring to: sklearn.metrics.
 Parameters
y_pred (Tensor) – predictions. For classification tasks, y_pred should have the shape [B] or [BN]. For segmentation tasks, the shape should be [BNHW] or [BNHWD].
y (Tensor) – ground truth, the first dim is batch.
to_onehot_y (bool) – whether to convert y into the one-hot format. Defaults to False.
activation (Union[str, Callable, None]) – ["sigmoid", "softmax"] Activation method. If specified, an activation function will be applied to y_pred. Defaults to None. The parameter can also be a callable function, for example: activation = lambda x: torch.log_softmax(x).
bin_mode (Optional[str]) – ["threshold", "mutually_exclusive"] Binarization method. If specified, a binarization step will be applied to y_pred.
"threshold": a single threshold or a sequence of thresholds should be set.
"mutually_exclusive": y_pred will be converted by a combination of argmax and to_onehot.
bin_threshold (Union[float, Sequence[float]]) – the threshold for binarization. It can be a single value or a sequence of values in which each value is the threshold for one class.
metric_name (str) – ["sensitivity", "specificity", "precision", "negative predictive value", "miss rate", "fall out", "false discovery rate", "false omission rate", "prevalence threshold", "threat score", "accuracy", "balanced accuracy", "f1 score", "matthews correlation coefficient", "fowlkes mallows index", "informedness", "markedness"] Some of the metrics have multiple aliases (as shown on the Wikipedia page mentioned above), and those names are also accepted.
average (Union[Average, str]) – ["macro", "weighted", "micro", "none"] Type of averaging performed if not binary classification. Defaults to "macro".
"macro": calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
"weighted": calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).
"micro": calculate metrics globally by considering each element of the label indicator matrix as a label.
"none": the scores for each class are returned.
zero_division (int) – the value to return when there is a zero division, for example, when all predictions and labels are negative. Defaults to 0.
 Raises
AssertionError – when data shapes of y_pred and y do not match.
AssertionError – when an activation function and the mutually_exclusive mode are specified at the same time.
 Return type
Union[ndarray, List[float], float]
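As a rough illustration of what sits behind a few of the metric names and the zero_division parameter, the following plain-Python sketch handles the binary case for a single class. It is not MONAI's implementation: the helpers `threshold`, `confusion_counts` and `confusion_metric` are hypothetical, only five of the listed metrics are covered, and the binarization is assumed to use `>=`.

```python
def threshold(scores, bin_threshold=0.5):
    """Binarize scores with a single threshold (">=" is an assumption)."""
    return [1 if s >= bin_threshold else 0 for s in scores]


def confusion_counts(pred_bin, y_bin):
    """Return (tp, fp, tn, fn) for binary predictions and labels."""
    tp = sum(1 for p, t in zip(pred_bin, y_bin) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred_bin, y_bin) if p == 1 and t == 0)
    tn = sum(1 for p, t in zip(pred_bin, y_bin) if p == 0 and t == 0)
    fn = sum(1 for p, t in zip(pred_bin, y_bin) if p == 0 and t == 1)
    return tp, fp, tn, fn


def confusion_metric(pred_bin, y_bin, metric_name="sensitivity", zero_division=0):
    """Compute one metric, returning zero_division when the denominator is 0."""
    tp, fp, tn, fn = confusion_counts(pred_bin, y_bin)
    formulas = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "precision": (tp, tp + fp),
        "accuracy": (tp + tn, tp + fp + tn + fn),
        "f1 score": (2 * tp, 2 * tp + fp + fn),
    }
    if metric_name not in formulas:
        raise ValueError(f"unsupported metric in this sketch: {metric_name}")
    numerator, denominator = formulas[metric_name]
    return numerator / denominator if denominator != 0 else zero_division


scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.05, 0.4]
labels = [1, 1, 1, 0, 0, 0, 0, 0]
pred = threshold(scores)
for name in ("sensitivity", "specificity", "precision", "accuracy", "f1 score"):
    print(name, confusion_metric(pred, labels, name))

# All-negative predictions and labels: sensitivity has a zero denominator.
print(confusion_metric([0, 0], [0, 0], "sensitivity", zero_division=0))
```

For multi-class inputs the same counts would be computed per class and then combined according to the average parameter.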
Hausdorff Distance

monai.metrics.compute_hausdorff_distance(seg_pred, seg_gt, label_idx, distance_metric='euclidean', percentile=None, directed=False)
Compute the Hausdorff distance. The user can choose between the directed and the non-directed Hausdorff distance. By default, the non-directed Hausdorff distance is calculated. In addition, specifying the percentile parameter returns the corresponding percentile of the distance.
 Parameters
seg_pred (Union[ndarray, Tensor]) – the predicted binary or label-field image.
seg_gt (Union[ndarray, Tensor]) – the actual binary or label-field image.
label_idx (int) – for label-field images, convert to binary with seg_pred = seg_pred == label_idx.
distance_metric (str) – ["euclidean", "chessboard", "taxicab"] the metric used to compute surface distance. Defaults to "euclidean".
percentile (Optional[float]) – an optional float number between 0 and 100. If specified, the corresponding percentile of the Hausdorff distance is returned rather than the maximum. Defaults to None.
directed (bool) – calculate the directed Hausdorff distance. Defaults to False.
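The difference between the directed and non-directed distance, and between the three distance metrics, can be sketched on small point sets. This is an illustration under simplifying assumptions, not MONAI's implementation: the inputs are taken to be already-extracted surface points given as coordinate tuples, the percentile option is left out, and the function names are hypothetical.

```python
import math


def point_distance(a, b, distance_metric="euclidean"):
    """Distance between two coordinate tuples under the chosen metric."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if distance_metric == "euclidean":
        return math.sqrt(sum(d * d for d in diffs))
    if distance_metric == "chessboard":
        return max(diffs)
    if distance_metric == "taxicab":
        return sum(diffs)
    raise ValueError(f"unknown distance_metric: {distance_metric}")


def directed_hausdorff(points_a, points_b, distance_metric="euclidean"):
    """Largest distance from a point in A to its nearest point in B."""
    return max(
        min(point_distance(a, b, distance_metric) for b in points_b)
        for a in points_a
    )


def hausdorff(points_a, points_b, distance_metric="euclidean", directed=False):
    """Non-directed distance is the larger of the two directed distances."""
    d_ab = directed_hausdorff(points_a, points_b, distance_metric)
    if directed:
        return d_ab
    return max(d_ab, directed_hausdorff(points_b, points_a, distance_metric))


pred_points = [(0, 0), (0, 1)]
gt_points = [(0, 0), (3, 4)]
print(hausdorff(pred_points, gt_points, directed=True))   # pred -> gt only
print(hausdorff(pred_points, gt_points))                  # both directions
print(hausdorff(pred_points, gt_points, "chessboard"))
print(hausdorff(pred_points, gt_points, "taxicab"))
```

The directed distance is not symmetric: here every predicted point lies close to the ground truth, but the ground-truth point (3, 4) has no nearby predicted point, which only the non-directed variant picks up.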
Average Surface Distance

monai.metrics.compute_average_surface_distance(seg_pred, seg_gt, label_idx, symmetric=False, distance_metric='euclidean')
By default, this function computes the Average Surface Distance from seg_pred to seg_gt. If symmetric = True is set, the average symmetric surface distance between the two inputs is returned instead.
 Parameters
seg_pred (Union[ndarray, Tensor]) – first binary or label-field image.
seg_gt (Union[ndarray, Tensor]) – second binary or label-field image.
label_idx (int) – for label-field images, convert to binary with seg_pred = seg_pred == label_idx.
symmetric (bool) – whether to calculate the symmetric average surface distance between seg_pred and seg_gt. Defaults to False.
distance_metric (str) – ["euclidean", "chessboard", "taxicab"] the metric used to compute surface distance. Defaults to "euclidean".
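A small sketch of the asymmetric and symmetric variants, under stated assumptions: the inputs are already-extracted surface points given as coordinate tuples, only the Euclidean metric is shown, and the symmetric value is taken as the mean of the two directed averages. That last convention is an assumption of this sketch; other definitions pool all distances before averaging. The function name is hypothetical and requires Python 3.8+ for `math.dist`.

```python
import math


def average_surface_distance(points_pred, points_gt, symmetric=False):
    """Mean nearest-neighbour distance from pred to gt, optionally symmetrized."""

    def mean_min_distance(src, dst):
        # for each source point, distance to the closest destination point
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    forward = mean_min_distance(points_pred, points_gt)
    if not symmetric:
        return forward
    backward = mean_min_distance(points_gt, points_pred)
    # assumed convention: average of the two directed means
    return (forward + backward) / 2.0


pred_points = [(0, 0), (0, 1)]
gt_points = [(0, 0), (0, 3)]
print(average_surface_distance(pred_points, gt_points))
print(average_surface_distance(pred_points, gt_points, symmetric=True))
```

Compared with the Hausdorff distance, which reports the single worst mismatch, the average is less sensitive to isolated outlier points.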
Occlusion sensitivity

monai.metrics.compute_occlusion_sensitivity(model, image, label, pad_val=0.0, margin=2, n_batch=128, b_box=None)
This function computes the occlusion sensitivity for a model's prediction of a given image. By occlusion sensitivity, we mean how the probability of a given prediction changes as the occluded section of an image changes. This can be useful to understand why a network is making certain decisions.
The result is given as baseline (the probability of a certain output) minus the probability of the output with the occluded area. Therefore, higher values in the output image mean there was a greater drop in certainty, indicating the occluded region was more important in the decision process.
See: R. R. Selvaraju et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. https://doi.org/10.1109/ICCV.2017.74
 Parameters
model (Module) – classification model to use for inference.
image (Tensor) – image to test. Should be a tensor consisting of 1 batch; can be 2D or 3D.
label (Union[int, Tensor]) – classification label to check for changes (normally the true label, but doesn't have to be).
pad_val (float) – the value to put in the image when occluding part of it.
margin (Union[int, Sequence]) – we'll create a cuboid/cube around the voxel to be occluded. If margin==2, then we'll create a cube that is +/- 2 voxels in all directions (i.e., a cube of 5 x 5 x 5 voxels). A Sequence can be supplied to have a margin of different sizes (i.e., create a cuboid).
n_batch (int) – number of images in a batch before inference.
b_box (Optional[Sequence]) – Bounding box on which to perform the analysis. The output image will also match in size. There should be a minimum and maximum for all dimensions except batch: [min1, max1, min2, max2,...].
* By default, the whole image will be used. Decreasing the size will speed the analysis up, which might be useful for larger images.
* Min and max are inclusive, so [0, 63, …] will have size (64, …).
* Use a negative value to select 0 for a min value and im.shape[x]-1 for the max value of the xth dimension.
 Return type
ndarray
 Returns
Numpy array. If no bounding box is supplied, this will be the same size as the input image. If a bounding box is used, the output image will be cropped to this size.
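
The baseline-minus-occluded computation described above can be sketched in plain Python for a 2D image. This is a simplified illustration, not MONAI's implementation: the model is any callable returning a list of class probabilities, `toy_model` is a made-up stand-in, the image is a list of rows, margin is a single integer, and the n_batch and b_box options are left out.

```python
def occlusion_sensitivity(model, image, label, pad_val=0.0, margin=2):
    """Return a map of baseline probability minus occluded probability."""
    height, width = len(image), len(image[0])
    baseline = model(image)[label]
    heatmap = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            occluded = [row[:] for row in image]  # copy so the input is untouched
            # overwrite a square of +/- margin pixels around (r, c), clipped to the image
            for rr in range(max(0, r - margin), min(height, r + margin + 1)):
                for cc in range(max(0, c - margin), min(width, c + margin + 1)):
                    occluded[rr][cc] = pad_val
            heatmap[r][c] = baseline - model(occluded)[label]
    return heatmap


def toy_model(image):
    """Hypothetical classifier: class 1 probability is the mean of the top-left 2x2 patch."""
    p1 = (image[0][0] + image[0][1] + image[1][0] + image[1][1]) / 4.0
    return [1.0 - p1, p1]


image = [[1.0] * 4 for _ in range(4)]
for row in occlusion_sensitivity(toy_model, image, label=1, margin=1):
    print(row)
```

Because the toy model only looks at the top-left patch, the largest values in the map appear where the occluding square covers that patch, and positions whose square never touches it stay at zero.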