R/models-mobilenetv3.R, R/models-mobilenetv3_large.R
model_mobilenet_v3.Rd

MobileNetV3 is a lightweight convolutional neural network architecture designed for mobile and embedded vision applications. This implementation follows the design and optimizations presented in the original paper: Searching for MobileNetV3.
This function mirrors torchvision::quantization::mobilenet_v3_large and
loads quantized weights when pretrained is TRUE.
model_mobilenet_v3_large(
pretrained = FALSE,
progress = TRUE,
num_classes = 1000,
width_mult = 1
)
model_mobilenet_v3_small(
pretrained = FALSE,
progress = TRUE,
num_classes = 1000,
width_mult = 1
)
model_mobilenet_v3_large_quantized(pretrained = FALSE, progress = TRUE, ...)

pretrained: (bool) If TRUE, returns a model pre-trained on ImageNet.

progress: (bool) If TRUE, displays a progress bar of the download to stderr.

num_classes: number of output classes (default: 1000).

width_mult: width multiplier for model scaling (default: 1.0).

...: Other parameters passed to the model implementation.
The model includes two variants:
model_mobilenet_v3_large()
model_mobilenet_v3_small()
Both variants utilize efficient blocks such as inverted residuals, squeeze-and-excitation (SE) modules, and hard-swish activations for improved accuracy and efficiency.
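As a minimal sketch (assuming the torchvision package is attached), either variant can be instantiated with a custom classification head, e.g. for transfer learning; `num_classes` replaces the default 1000-way ImageNet classifier:

```r
library(torchvision)

# Randomly initialized small variant with a 10-class head
model <- model_mobilenet_v3_small(pretrained = FALSE, num_classes = 10)
model$eval()

# A dummy batch of one 224x224 RGB image
x <- torch::torch_randn(1, 3, 224, 224)
out <- model(x)
out$shape  # batch of 1, with 10 class scores
```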
| Model | Top-1 Acc | Top-5 Acc | Params | GFLOPS | File Size | Notes |
|------------------------|-----------|-----------|---------|--------|-----------|-------------------------------------|
| MobileNetV3 Large | 74.04% | 91.34% | 5.48M | 0.22 | 21.1 MB | Trained from scratch, simple recipe |
| MobileNetV3 Small | 67.67% | 87.40% | 2.54M | 0.06 | 9.8 MB | Improved recipe over original paper |

model_mobilenet_v3_large(): MobileNetV3 Large model with about 5.5 million parameters.
model_mobilenet_v3_small(): MobileNetV3 Small model with about 2.5 million parameters.
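The parameter counts in the table above can be checked directly. This sketch assumes the torchvision package is attached and relies on the torch module API (`$parameters` returning a list of tensors, each with a `$numel()` method):

```r
library(torchvision)

# Sum the element counts of all parameter tensors in a module
count_params <- function(m) {
  sum(sapply(m$parameters, function(p) p$numel()))
}

large <- model_mobilenet_v3_large()
small <- model_mobilenet_v3_small()
count_params(large)  # roughly 5.5 million, per the table above
count_params(small)  # roughly 2.5 million
```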
Other classification_model:
model_alexnet(),
model_convnext,
model_efficientnet,
model_efficientnet_v2,
model_facenet,
model_inception_v3(),
model_maxvit(),
model_mobilenet_v2(),
model_resnet,
model_vgg,
model_vit
if (FALSE) { # \dontrun{
# 1. Download sample image (cat)
norm_mean <- c(0.485, 0.456, 0.406) # ImageNet normalization constants, see
# https://pytorch.org/vision/stable/models.html
norm_std <- c(0.229, 0.224, 0.225)
img_url <- "https://en.wikipedia.org/wiki/Special:FilePath/Felis_catus-cat_on_snow.jpg"
img <- base_loader(img_url)
# 2. Convert to tensor (RGB only), resize and normalize
input <- img %>%
transform_to_tensor() %>%
transform_resize(c(224, 224)) %>%
transform_normalize(norm_mean, norm_std)
batch <- input$unsqueeze(1)
# 3. Load pretrained models
model_small <- model_mobilenet_v3_small(pretrained = TRUE)
model_small$eval()
# 4. Forward pass
output_s <- model_small(batch)
# 5. Convert logits to probabilities and take the Top-5
probs <- nnf_softmax(output_s, dim = 2)
topk <- probs$topk(k = 5, dim = 2)
indices <- as.integer(topk[[2]][1, ])
scores <- as.numeric(topk[[1]][1, ]) * 100
# 6. Show Top-5 predictions
glue::glue("{seq_along(indices)}. {imagenet_label(indices)} ({round(scores, 2)}%)")
# 7. Same with large model
model_large <- model_mobilenet_v3_large(pretrained = TRUE)
model_large$eval()
output_l <- model_large(batch)
probs <- nnf_softmax(output_l, dim = 2)
topk <- probs$topk(k = 5, dim = 2)
indices <- as.integer(topk[[2]][1, ])
scores <- as.numeric(topk[[1]][1, ]) * 100
glue::glue("{seq_along(indices)}. {imagenet_label(indices)} ({round(scores, 2)}%)")
} # }