Construct Mask R-CNN model variants for instance segmentation. Mask R-CNN extends Faster R-CNN with a mask prediction branch that outputs a segmentation mask for each detected object.

model_maskrcnn_resnet50_fpn(
  pretrained = FALSE,
  progress = TRUE,
  num_classes = 91,
  score_thresh = 0.05,
  nms_thresh = 0.5,
  detections_per_img = 100,
  ...
)

model_maskrcnn_resnet50_fpn_v2(
  pretrained = FALSE,
  progress = TRUE,
  num_classes = 91,
  score_thresh = 0.05,
  nms_thresh = 0.5,
  detections_per_img = 100,
  ...
)

Arguments

pretrained

Logical. If TRUE, loads pretrained weights from a local file.

progress

Logical. Whether to show a progress bar during download (currently unused).

num_classes

Number of output classes (default: 91 for COCO).

score_thresh

Numeric. Minimum score threshold for detections (default: 0.05).

nms_thresh

Numeric. Non-Maximum Suppression (NMS) IoU threshold for removing overlapping boxes (default: 0.5).

detections_per_img

Integer. Maximum number of detections per image (default: 100).

...

Other arguments (unused).
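
The score_thresh, nms_thresh, and detections_per_img arguments correspond to the usual detection post-processing steps: drop low-scoring candidates, suppress overlapping boxes, and cap the number of results. The base R sketch below illustrates that idea on plain matrices. It is not this package's implementation: the helpers box_iou() and filter_detections() are hypothetical names introduced for illustration, boxes are assumed to be in (x1, y1, x2, y2) format, and the suppression here is class-agnostic, whereas detectors commonly apply it per class.

```r
# IoU between one box and each row of a matrix of boxes.
# Assumes (x1, y1, x2, y2) corner format.
box_iou <- function(box, boxes) {
  ix1 <- pmax(box[1], boxes[, 1])
  iy1 <- pmax(box[2], boxes[, 2])
  ix2 <- pmin(box[3], boxes[, 3])
  iy2 <- pmin(box[4], boxes[, 4])
  inter <- pmax(0, ix2 - ix1) * pmax(0, iy2 - iy1)
  area_box <- (box[3] - box[1]) * (box[4] - box[2])
  area_boxes <- (boxes[, 3] - boxes[, 1]) * (boxes[, 4] - boxes[, 2])
  inter / (area_box + area_boxes - inter)
}

# Hypothetical helper: score filter, greedy NMS, then a cap on the count.
# Returns the row indices of the boxes that survive.
filter_detections <- function(boxes, scores, score_thresh = 0.05,
                              nms_thresh = 0.5, detections_per_img = 100) {
  # 1. Keep candidates above the score threshold, highest score first.
  candidates <- which(scores > score_thresh)
  candidates <- candidates[order(scores[candidates], decreasing = TRUE)]
  keep <- integer(0)
  # 2. Greedy NMS: keep the best box, drop boxes that overlap it too much.
  while (length(candidates) > 0) {
    best <- candidates[1]
    keep <- c(keep, best)
    rest <- candidates[-1]
    if (length(rest) == 0) break
    iou <- box_iou(boxes[best, ], boxes[rest, , drop = FALSE])
    candidates <- rest[iou <= nms_thresh]
  }
  # 3. Cap the number of detections.
  head(keep, detections_per_img)
}

boxes <- rbind(
  c(10, 10, 110, 110),    # high score
  c(12, 12, 112, 112),    # near-duplicate of the first box
  c(200, 200, 300, 300),  # separate object
  c(50, 50, 60, 60)       # score below score_thresh
)
scores <- c(0.90, 0.80, 0.70, 0.01)
kept <- filter_detections(boxes, scores)
print(kept)
```

With the toy data above, the second box is suppressed because its IoU with the first is about 0.92, and the fourth is dropped by the score threshold, so rows 1 and 3 remain. Raising nms_thresh keeps more overlapping boxes; raising score_thresh returns fewer, more confident detections.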

Value

A maskrcnn_model nn_module.

Functions

  • model_maskrcnn_resnet50_fpn(): Mask R-CNN with ResNet-50 FPN

  • model_maskrcnn_resnet50_fpn_v2(): Mask R-CNN with ResNet-50 FPN V2

Task

Instance segmentation over images with bounding boxes, class labels, and segmentation masks.

Input Format

Input images should be torch_tensors of shape (batch_size, 3, H, W), where H and W are typically around 800.

Output Format

Returns a list with:

  • features: Feature maps from the backbone

  • detections: List containing:

    • boxes: Bounding boxes (N, 4)

    • labels: Class labels (N)

    • scores: Confidence scores (N)

    • masks: Segmentation masks (N, 28, 28)
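
As a rough illustration of working with the detections fields listed above, the base R sketch below keeps only confident detections and binarizes their masks. It uses plain R vectors and arrays as stand-ins for the tensors (with real output, each field would first need converting, for example with torch's as_array()). The values, the min_score cutoff, and the 0.5 mask threshold are assumptions for illustration; in particular, it assumes mask values are probabilities in [0, 1], which this page does not state.

```r
set.seed(1)

# Stand-ins for the converted detection fields, with N = 3 detections.
scores <- c(0.92, 0.40, 0.75)
labels <- c(1L, 18L, 3L)
masks  <- array(runif(3 * 28 * 28), dim = c(3, 28, 28))

# Hypothetical cutoff: keep detections scoring at least 0.5.
min_score <- 0.5
keep <- scores >= min_score

kept_labels <- labels[keep]
# drop = FALSE preserves the (N, 28, 28) layout even if one detection remains.
kept_masks <- masks[keep, , , drop = FALSE]

# Binarize each mask (assumes values are probabilities in [0, 1]).
binary_masks <- kept_masks > 0.5

# Number of foreground cells in each 28 x 28 mask.
mask_pixels <- apply(binary_masks, 1, sum)
print(kept_labels)
print(dim(binary_masks))
```

Note that the masks have a fixed 28 x 28 size per detection, so placing one on the original image would additionally require resizing it to the extent of its bounding box.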

Available Models

  • model_maskrcnn_resnet50_fpn()

  • model_maskrcnn_resnet50_fpn_v2()

Examples

if (FALSE) { # \dontrun{
library(torchvision)
library(magrittr)
norm_mean <- c(0.485, 0.456, 0.406)
norm_std  <- c(0.229, 0.224, 0.225)

# Load an image
url <- paste0("https://upload.wikimedia.org/wikipedia/commons/thumb/",
              "e/ea/Morsan_Normande_vache.jpg/120px-Morsan_Normande_vache.jpg")
img <- base_loader(url)

input <- img %>%
  transform_to_tensor() %>%
  transform_resize(c(800, 800)) %>%
  transform_normalize(norm_mean, norm_std)
batch <- input$unsqueeze(1)

# Mask R-CNN ResNet-50 FPN
model <- model_maskrcnn_resnet50_fpn(pretrained = TRUE)
model$eval()
pred <- model(batch)$detections

# Access predictions
boxes <- pred$boxes
labels <- pred$labels
scores <- pred$scores
masks <- pred$masks  # Segmentation masks (N, 28, 28)

# Visualize up to the first five boxes (fewer if fewer were detected)
n_boxes <- boxes$size(1)
if (n_boxes > 0) {
  boxed <- draw_bounding_boxes(input, boxes[1:min(5, n_boxes), , drop = FALSE])
  tensor_image_browse(boxed)
}
} # }