Semantic segmentation models implementing the DeepLabV3 architecture from the paper "Rethinking Atrous Convolution for Semantic Image Segmentation". These models use Atrous Spatial Pyramid Pooling (ASPP) to capture multi-scale context, and are available with ResNet-50 and ResNet-101 backbones.
All models are trained on a 20-class subset of COCO that corresponds to Pascal VOC categories, plus background (21 classes total).
| Model | mIoU | Pixel Acc | Params | GFLOPS | File Size | Weights Used |
|---------------------------|-------|-----------|--------|--------|-----------|---------------------------|
| model_deeplabv3_resnet50 | 66.4% | 92.4% | 42.0M | 178.72 | 161 MB | COCO_WITH_VOC_LABELS_V1 |
| model_deeplabv3_resnet101 | 67.4% | 92.4% | 61.0M | 258.74 | 233 MB | COCO_WITH_VOC_LABELS_V1 |

All models use COCO_WITH_VOC_LABELS_V1 weights, trained on COCO with the
20 Pascal VOC categories (plus background, 21 classes in total).
Backbone weights default to IMAGENET1K_V1 (supervised ImageNet-1k) when
pretrained = FALSE and pretrained_backbone = TRUE.
When pretrained = TRUE, backbone weights are overridden by the full
segmentation model weights and pretrained_backbone is ignored.
The auxiliary classifier branch (aux_loss) is enabled automatically when
loading pretrained weights; set it explicitly when training from scratch.
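As a rough illustration of the training-from-scratch case described above, the sketch below builds a randomly initialised model with the auxiliary head requested explicitly and runs one forward pass. It assumes the torch and torchvision packages are installed; `num_classes = 5` and the 64 x 64 random input are arbitrary illustrative values, not recommendations.

```r
library(torch)
library(torchvision)

# No segmentation weights and no backbone weights are requested,
# so nothing should need to be downloaded.
# num_classes = 5 is an arbitrary value chosen for illustration.
model <- model_deeplabv3_resnet50(
  pretrained = FALSE,
  pretrained_backbone = FALSE,
  num_classes = 5,
  aux_loss = TRUE  # no pretrained weights to infer this from, so set it explicitly
)
model$eval()  # eval mode: batch norm layers use their running statistics

# Random stand-in for a normalized image batch: (batch, channels, H, W)
x <- torch_randn(1, 3, 64, 64)
output <- with_no_grad(model(x))

names(output)     # inspect which heads are returned
dim(output$out)   # main segmentation logits
```

The second dimension of `output$out` holds one channel of logits per class, so it should match `num_classes`.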
model_deeplabv3_resnet50(
pretrained = FALSE,
progress = TRUE,
num_classes = 21,
aux_loss = NULL,
pretrained_backbone = FALSE,
...
)
model_deeplabv3_resnet101(
pretrained = FALSE,
progress = TRUE,
num_classes = 21,
aux_loss = NULL,
pretrained_backbone = FALSE,
...
)
pretrained: (bool) If TRUE, returns a model pre-trained on COCO with Pascal VOC labels (COCO_WITH_VOC_LABELS_V1 weights).
progress: (bool) If TRUE, displays a progress bar of the download to stderr.
num_classes: Integer. Number of output segmentation classes including
background. Default: 21 (Pascal VOC). Set to NULL to infer from
pretrained weights.
aux_loss: Logical or NULL. If TRUE, adds an auxiliary FCN classifier
head at an intermediate backbone layer, used as a secondary loss during
training. If NULL (default), inferred from pretrained weights.
pretrained_backbone: Logical. If TRUE and pretrained = FALSE, loads
IMAGENET1K_V1 weights for the backbone only. Ignored when pretrained = TRUE.
Default: FALSE.
...: Other parameters passed to the resnet model.
model_deeplabv3_resnet50(): DeepLabV3 with ResNet-50 backbone
model_deeplabv3_resnet101(): DeepLabV3 with ResNet-101 backbone
Other semantic_segmentation_model:
model_convnext_segmentation,
model_fcn_resnet
if (FALSE) { # \dontrun{
library(torchvision)
library(magrittr)
norm_mean <- c(0.485, 0.456, 0.406)
norm_std <- c(0.229, 0.224, 0.225)
url <- paste0("https://upload.wikimedia.org/wikipedia/commons/thumb/",
"e/ea/Morsan_Normande_vache.jpg/120px-Morsan_Normande_vache.jpg")
img <- base_loader(url)
input <- img %>%
transform_to_tensor() %>%
transform_resize(c(520, 520)) %>%
transform_normalize(norm_mean, norm_std)
batch <- input$unsqueeze(1) # Add batch dimension: (1, 3, H, W)
# --- ResNet-50 backbone ---
model <- model_deeplabv3_resnet50(pretrained = TRUE)
model$eval()
output <- model(batch)
segmented <- draw_segmentation_masks(input, output$out$squeeze(1))
tensor_image_browse(segmented)
# Show most frequent class
mask_id <- output$out$argmax(dim = 2) # (1, H, W)
class_contingency_with_background <- mask_id$view(-1)$bincount()
class_contingency_with_background[1] <- 0L # zero out the count for the background class (id 1)
top_class_index <- class_contingency_with_background$argmax()$item()
cli::cli_inform("Majority class {.pkg ResNet-50}: {.emph {pascal_voc_classes(top_class_index)}}")
# --- ResNet-101 backbone ---
model <- model_deeplabv3_resnet101(pretrained = TRUE)
model$eval()
output <- model(batch)
segmented <- draw_segmentation_masks(input, output$out$squeeze(1))
tensor_image_browse(segmented)
# Show most frequent class
mask_id <- output$out$argmax(dim = 2) # (1, H, W)
class_contingency_with_background <- mask_id$view(-1)$bincount()
class_contingency_with_background[1] <- 0L # zero out the count for the background class (id 1)
top_class_index <- class_contingency_with_background$argmax()$item()
cli::cli_inform("Majority class {.pkg ResNet-101}: {.emph {pascal_voc_classes(top_class_index)}}")
} # }