TL;DR: EfficientNets are a class of convolutional neural networks that achieve SotA accuracy on ImageNet while being ~8x smaller and ~6x faster at inference than the previous best architectures. They are based on a neural architecture search (NAS) approach that optimizes for both accuracy and compute efficiency. The output of this architecture search is then scaled up in a smart way to achieve better performance.
Imagine that you have a convolutional network that does not reach the accuracy you'd like on a given problem. Just increase the model size, right?
How do you scale the depth, width and resolution, though? Which of these do you increase by how much?
Tan and Le propose to scale models along all of these dimensions at the same time according to a compound scaling coefficient: if you want to increase your resource usage by \(2^N\), increase your depth by \(\alpha^N\), your width by \(\beta^N\) and your image size by \(\gamma^N\), where \(\alpha, \beta, \gamma\) are constants identified by a grid search on the original baseline model under the constraint \(\alpha \cdot \beta^2 \cdot \gamma^2 \approx 2\). They demonstrate that this approach works well to scale up MobileNets and ResNets.
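To make the scaling rule concrete, here is a minimal Python sketch. It uses the coefficients the paper reports for its baseline (\(\alpha = 1.2\), \(\beta = 1.1\), \(\gamma = 1.15\)). The function names, the normalized base depth and width of 1.0, and the base resolution of 224 are assumptions made for illustration. The FLOPS of a convolutional network grow roughly with depth times width squared times resolution squared, which is why the product \(\alpha \cdot \beta^2 \cdot \gamma^2\) is the quantity of interest.

```python
# Coefficients reported by Tan and Le for their baseline network.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(n, base_depth=1.0, base_width=1.0, base_resolution=224):
    """Return (depth, width, resolution) after scaling with exponent n.

    The base values are illustrative assumptions: depth and width are
    expressed as multipliers, resolution in pixels.
    """
    depth = base_depth * ALPHA ** n
    width = base_width * BETA ** n
    resolution = int(round(base_resolution * GAMMA ** n))
    return depth, width, resolution


def flops_multiplier(n):
    """Approximate FLOPS growth: depth * width^2 * resolution^2."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** n


if __name__ == "__main__":
    for n in range(4):
        d, w, r = compound_scale(n)
        print(f"N={n}: depth x{d:.2f}, width x{w:.2f}, "
              f"resolution {r}px, FLOPS x{flops_multiplier(n):.2f}")
```

With these coefficients each step of \(N\) multiplies the FLOPS by about 1.92, close to the factor of 2 the rule aims for. The released models use rounded layer counts, channel widths and image sizes, so their exact numbers may differ from what this sketch prints.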
The authors then use a neural architecture search approach similar to Tan et al. (2019) to find a baseline architecture that maximizes an objective function combining model accuracy and FLOPS.
They then apply the scaling approach described above to the result of the architecture search (which is relatively similar to MnasNet) to generate a range of larger models with correspondingly better performance:
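As a rough illustration of such an objective, the sketch below implements the form given in the paper, \(ACC(m) \times [FLOPS(m)/T]^w\), with a target of \(T = 400\)M FLOPS and \(w = -0.07\). The function name `nas_reward` and the example accuracy values are hypothetical; the actual search also involves a controller and a training loop that are not shown here.

```python
def nas_reward(accuracy, flops, target_flops=400e6, w=-0.07):
    """Accuracy weighted by a soft FLOPS penalty.

    Models above the FLOPS target are penalized and models below it are
    rewarded; the negative exponent w controls how strong the trade-off is.
    """
    return accuracy * (flops / target_flops) ** w


if __name__ == "__main__":
    # Hypothetical candidates: (top-1 accuracy, FLOPS)
    candidates = {
        "on_target": (0.75, 400e6),
        "twice_the_flops": (0.76, 800e6),
        "half_the_flops": (0.73, 200e6),
    }
    for name, (acc, flops) in candidates.items():
        print(f"{name}: reward = {nas_reward(acc, flops):.4f}")
```

Because \(w\) is small in magnitude, doubling the FLOPS only costs about 5% of the reward, so the search can still favour a larger model if the accuracy gain is big enough.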
Impressively, this approach yields models that match large ResNets in their ImageNet performance at 10x fewer FLOPS, 5x fewer parameters and about 6x faster inference.
The authors point out that EfficientNets also perform well in transfer learning settings.
Modifications of EfficientNets are still among the top 10 best-performing models on ImageNet on Papers with Code.
An official TensorFlow implementation with pre-trained checkpoints is available. There are also numerous unofficial PyTorch implementations, for instance Luke Melas-Kyriazi's or Ross Wightman's.