Loads the MS COCO dataset for image captioning.
Root directory where the dataset is stored or will be downloaded to.
Logical. If TRUE, loads the training split; otherwise, loads the validation split.
Character. Dataset version year. One of "2014".
Logical. If TRUE, downloads the dataset if it's not already present in the root directory.
Optional transform function applied to the image.
Optional transform function applied to the target (labels, boxes, etc.).
An object of class coco_caption_dataset. Each item is a list:
x: an (H, W, C) numeric array containing the RGB image.
y: a character string with the image caption.
Other caption_dataset: flickr_caption_dataset
if (FALSE) { # \dontrun{
ds <- coco_caption_dataset(
train = FALSE,
download = TRUE
)
example <- ds[1]
# Access image and caption
x <- example$x
y <- example$y
# Prepare image for plotting
image_array <- as.numeric(x)
dim(image_array) <- dim(x)
plot(as.raster(image_array))
title(main = y, col.main = "black")
} # }
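The item structure described above can be explored without downloading anything. The following is a minimal base-R sketch, not taken from the package: the `item` list is a synthetic stand-in for one dataset element, its pixel values are assumed to lie in [0, 1], and `lower_caption` is a hypothetical helper showing the kind of function that could be supplied as a target transform.

```r
# Synthetic stand-in for one dataset item (NOT real COCO data).
# Assumption: x is an (H, W, C) numeric array with values in [0, 1].
set.seed(1)
item <- list(
  x = array(runif(4 * 6 * 3), dim = c(4, 6, 3)),
  y = "  A Placeholder Caption "
)

# Hypothetical target transform: trim whitespace and lower-case the caption.
lower_caption <- function(caption) tolower(trimws(caption))

# Check the documented layout: three dimensions, three colour channels.
stopifnot(length(dim(item$x)) == 3, dim(item$x)[3] == 3)

# as.raster() accepts an (H, W, 3) array and returns an H x W raster.
img <- as.raster(item$x)
stopifnot(identical(dim(img), c(4L, 6L)))

caption <- lower_caption(item$y)
plot(img)
title(main = caption, col.main = "black")
```

If real items hold values on a different scale (for example 0 to 255), the `max` argument of `as.raster()` would need to be set accordingly.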