The Pascal Visual Object Classes (VOC) dataset is a widely used benchmark for object detection and semantic segmentation tasks in computer vision.

pascal_voc_classes(class_id = 1:21)

Arguments

class_id

Integer vector of 1-based class identifiers, each in the range 1 to 21. Defaults to 1:21, which selects all classes.

Details

This dataset provides RGB images along with per-pixel class segmentation masks for 20 object categories, plus a background class. Each pixel in the mask is labeled with a class index corresponding to one of the predefined semantic categories.

The VOC dataset was released in yearly editions from 2007 to 2012, with slight variations in data splits and annotation formats. Only the 2007 edition includes a separate test split; the 2008–2012 editions provide only the train, val, and trainval splits.
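The split rule above can be written as a small check. This is only a sketch in base R: the helper name valid_voc_splits is hypothetical and not part of the package, and it encodes nothing beyond the rule stated in this section.

```r
# Hypothetical helper (not a package function): lists the splits that this
# page says are available for a given VOC edition.
valid_voc_splits <- function(year) {
  year <- as.character(year)
  stopifnot(length(year) == 1, year %in% as.character(2007:2012))
  base_splits <- c("train", "val", "trainval")
  # Per the text above, only the 2007 edition has a separate test split.
  if (year == "2007") c(base_splits, "test") else base_splits
}

valid_voc_splits(2007)
valid_voc_splits("2012")
```

A check like this could be run before constructing a dataset, so that an unsupported year and split combination fails early with a clear error.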

The dataset defines 21 semantic classes: "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "dining table", "dog", "horse", "motorbike", "person", "potted plant", "sheep", "sofa", "train", and "tv/monitor". The class names are also available through the classes variable of the dataset object.
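To illustrate how 1-based identifiers relate to the class names, here is a minimal sketch in base R. It is not the package's implementation: the names voc_classes and lookup_voc_class are hypothetical, and the sketch assumes that identifiers follow the order of the list above, so that 1 is "background" and 21 is "tv/monitor".

```r
# Hypothetical stand-in for the class table, copied from the list above.
# Assumption: identifiers follow this order (1 = "background").
voc_classes <- c(
  "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
  "car", "cat", "chair", "cow", "dining table", "dog", "horse",
  "motorbike", "person", "potted plant", "sheep", "sofa", "train",
  "tv/monitor"
)

# Sketch of a 1-based lookup with the range check described under Arguments.
lookup_voc_class <- function(class_id = 1:21) {
  stopifnot(
    is.numeric(class_id),
    all(class_id >= 1),
    all(class_id <= length(voc_classes))
  )
  voc_classes[class_id]
}

lookup_voc_class(c(1, 16))  # "background" "person"
```

Because R vectors are 1-based, the identifier indexes the name vector directly. If mask values were stored 0-based (an assumption to verify against the actual data), they would need to be shifted by one before the lookup.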

See also