Loads the MS COCO dataset for object detection and segmentation.
Root directory where the dataset is stored or will be downloaded to.
Logical. If TRUE, loads the training split; otherwise, loads the validation split.
Character. Dataset version year. One of "2014" or "2017".
Logical. If TRUE, downloads the dataset if it's not already present in the root
directory.
Optional transform function applied to the image.
Optional transform function applied to the target (labels, boxes, etc.).
An object of class coco_detection_dataset. Each item is a list:
x: a (C, H, W) torch_tensor representing the image.
y$boxes: a (N, 4) torch_tensor of bounding boxes in the format c(x_min, y_min, x_max, y_max).
y$labels: an integer torch_tensor with the class label for each object.
y$area: a float torch_tensor indicating the area of each object.
y$iscrowd: a boolean torch_tensor, where TRUE marks the object as part of a crowd.
y$segmentation: a list of segmentation polygons for each object.
y$masks: a (N, H, W) boolean torch_tensor containing binary segmentation masks.
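The raw COCO annotation files store each box as (x, y, width, height), whereas the items described above use corner coordinates. A minimal base R sketch of that conversion follows. The helper name xywh_to_xyxy and the sample values are hypothetical, and the code works on a plain matrix for illustration; it is not the package's internal implementation.

```r
# Hypothetical helper: convert COCO-style (x, y, width, height) boxes
# to the c(x_min, y_min, x_max, y_max) layout.
xywh_to_xyxy <- function(boxes) {
  stopifnot(is.matrix(boxes), ncol(boxes) == 4)
  out <- cbind(
    boxes[, 1],               # x_min
    boxes[, 2],               # y_min
    boxes[, 1] + boxes[, 3],  # x_max = x + width
    boxes[, 2] + boxes[, 4]   # y_max = y + height
  )
  colnames(out) <- c("x_min", "y_min", "x_max", "y_max")
  out
}

# Two made-up boxes, one per row (N = 2).
raw <- matrix(c(10, 20, 30, 40,
                 5,  5, 10, 10), ncol = 4, byrow = TRUE)
xyxy <- xywh_to_xyxy(raw)
print(xyxy)
```

Each row of the result corresponds to one object, matching the (N, 4) shape of y$boxes.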
Each item returned by the dataset has the S3 classes "image_with_bounding_box" and "image_with_segmentation_mask", which enables automatic dispatch by visualization functions such as draw_bounding_boxes() and draw_segmentation_masks().
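Dispatch on these classes relies on R's standard S3 mechanism. The sketch below illustrates the idea with a hypothetical generic, describe_item(), and a mock item built from plain R objects. Neither is part of the package, and the real drawing functions may be implemented differently.

```r
# Hypothetical generic, used only to illustrate S3 dispatch.
describe_item <- function(item, ...) UseMethod("describe_item")

# Method selected when the object carries the bounding-box class.
describe_item.image_with_bounding_box <- function(item, ...) {
  paste("item with", nrow(item$y$boxes), "bounding boxes")
}

# Fallback for objects without a matching class.
describe_item.default <- function(item, ...) {
  "item without box annotations"
}

# Mock item: plain arrays and matrices stand in for torch tensors.
mock_item <- structure(
  list(
    x = array(0, dim = c(3, 4, 4)),
    y = list(boxes = matrix(c(0, 0, 2, 2,
                              1, 1, 3, 3), ncol = 4, byrow = TRUE))
  ),
  class = c("image_with_bounding_box", "image_with_segmentation_mask")
)

msg <- describe_item(mock_item)
print(msg)
```

Because the class vector lists both classes, a generic with a method for either one can accept the same item.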
The returned image is in CHW format (channels, height, width), matching the torch convention.
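Image arrays in R are often laid out as height, width, channels. As a sketch of what the CHW layout means, base R's aperm() can reorder such an array; the array below is synthetic and only serves to show the axis permutation.

```r
# Synthetic image in HWC order: 2 rows, 3 columns, 3 channels.
hwc <- array(seq_len(2 * 3 * 3), dim = c(2, 3, 3))

# Move the channel axis to the front to obtain CHW order.
chw <- aperm(hwc, c(3, 1, 2))

# Inspect the new shape.
print(dim(chw))

# The same pixel value is reachable under the permuted index.
same_value <- hwc[1, 2, 3] == chw[3, 1, 2]
print(same_value)
```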
The target y contains object detection annotations such as bounding boxes, labels, areas, crowd indicators, and segmentation masks from the official COCO annotations.
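One common use of these fields is to drop crowd annotations before training. The sketch below does this on a mock target built from plain R vectors and matrices. drop_crowd() is a hypothetical helper and the values are made up; the real y holds torch tensors, so indexing details may differ.

```r
# Hypothetical helper: keep only objects not flagged as crowd.
drop_crowd <- function(y) {
  keep <- !y$iscrowd
  list(
    boxes = y$boxes[keep, , drop = FALSE],
    labels = y$labels[keep],
    area = y$area[keep],
    iscrowd = y$iscrowd[keep]
  )
}

# Mock target with three objects; the second is a crowd region.
y <- list(
  boxes = matrix(c(0, 0, 10, 10,
                   5, 5, 50, 40,
                   2, 3,  8,  9), ncol = 4, byrow = TRUE),
  labels = c(1L, 1L, 18L),
  area = c(100, 1575, 36),
  iscrowd = c(FALSE, TRUE, FALSE)
)

filtered <- drop_crowd(y)
print(nrow(filtered$boxes))
print(filtered$labels)
```

Using drop = FALSE keeps boxes as a matrix even when only one object remains.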
if (FALSE) { # \dontrun{
library(torchvision)

# Load the 2017 validation split, downloading it if it is not present
ds <- coco_detection_dataset(
  train = FALSE,
  year = "2017",
  download = TRUE
)

# Fetch the first item (image x and annotations y)
item <- ds[1]

# Visualize bounding boxes
boxed <- draw_bounding_boxes(item)
tensor_image_browse(boxed)

# Visualize segmentation masks (if present)
masked <- draw_segmentation_masks(item)
tensor_image_browse(masked)
} # }