MCUNet embeds deep neural networks on off-the-shelf microcontrollers while sharply reducing memory usage.
Artificial Intelligence is a technology under constant, intensive research. Researchers around the world are working to make the application and implementation of AI faster and better. Over the years, AI has led to potential breakthroughs, be it in the early detection of heart disease or in uncovering historical events; the field has come far since its inception.
In a move to ease routine chores, researchers at the Massachusetts Institute of Technology (MIT) and National Taiwan University have collaborated to embed deep neural networks on microcontrollers. This means AI in the form of tiny chips can be built into smart wearable devices and home appliances, bringing a closer integration of Internet-of-Things (IoT) devices and AI. The research paper, titled “MCUNet: Tiny Deep Learning on IoT Devices,” is set to be presented at the Conference on Neural Information Processing Systems in December. Through this approach, the researchers expect to perform data analytics near the sensors of IoT devices, thus widening the scope of AI applications.
The framework created by the researchers is termed MCUNet. It enables ImageNet-scale deep learning on off-the-shelf microcontrollers. ImageNet is an image database in which each node of its hierarchy is depicted by thousands of images. In MCUNet, the deep learning model design and the inference library are jointly optimized to work within the limited on-chip memory of traditional microcontrollers and to reduce memory usage.
TinyNAS is the deep learning design component: a two-stage neural architecture search (NAS) method that handles the tiny and diverse memory constraints of various microcontrollers. The paper explains that TinyNAS first automatically optimizes the search space to fit the tiny resource constraints, and then performs neural architecture search within that optimized space. TinyNAS generates different search spaces by scaling the input resolution and the model width, then collects the computation (FLOPs) distribution of the networks that satisfy the memory constraint within each search space to evaluate that space's priority. This relies on the insight that a search space which can accommodate higher FLOPs under the memory constraint tends to produce better deep learning models. Experiments show that the optimized space leads to better accuracy for the NAS-searched model. TinyNAS automatically handles the diverse constraints of traditional microcontrollers, such as device, latency, energy, and memory, at low search cost.
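The search-space selection idea above can be sketched roughly as follows. This is an illustrative toy, not the paper's code: the FLOPs and memory formulas, the candidate resolutions and width multipliers, and the 320 KB budget are all invented for the example.

```python
import random

# Toy sketch of TinyNAS's first stage: among candidate search spaces
# (input resolution x width multiplier), prefer the space whose
# memory-feasible sampled models achieve the highest FLOPs.

MEMORY_BUDGET_KB = 320  # assumed SRAM budget, illustrative only

def estimate_model(resolution, width, depth):
    """Crude stand-in for profiling one sampled network (toy formulas)."""
    flops = resolution * resolution * width * depth * 1e-3
    peak_memory_kb = resolution * resolution * width * 4 / 1024
    return flops, peak_memory_kb

def space_priority(resolution, width, samples=200):
    """Mean FLOPs over memory-feasible models sampled from one search space."""
    feasible = []
    for _ in range(samples):
        depth = random.randint(8, 20)  # random architecture knob
        flops, mem = estimate_model(resolution, width, depth)
        if mem <= MEMORY_BUDGET_KB:
            feasible.append(flops)
    return sum(feasible) / len(feasible) if feasible else 0.0

random.seed(0)
spaces = [(r, w) for r in (96, 128, 160, 176) for w in (0.3, 0.5, 0.75)]
best = max(spaces, key=lambda s: space_priority(*s))
print("chosen search space (resolution, width):", best)
```

The second stage, searching for a concrete architecture inside the chosen space, would then run a conventional NAS procedure restricted to that resolution and width.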
The researchers state that TinyEngine, the memory-efficient inference library, eliminates unnecessary memory overhead, so the search space can be expanded to fit a larger deep learning model with higher accuracy. Because interpreter-based inference libraries require extra runtime memory, TinyEngine instead uses a code generator-based method that eliminates this overhead, and it adapts memory scheduling to the whole model rather than optimizing layer by layer, a better strategy for reducing memory usage. Lastly, it performs specialized computation optimizations, namely loop tiling, loop unrolling, and operator fusion, for different layers, which accelerates inference.
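To make two of those kernel optimizations concrete, here is a toy Python sketch of loop tiling and operator fusion. TinyEngine itself generates optimized C code for each layer; this example only demonstrates the concepts on a small matrix multiply.

```python
def matmul_tiled_fused_relu(A, B, bias, tile=2):
    """C = relu(A @ B + bias), computed block by block with the ReLU fused in."""
    n, k = len(A), len(B)
    m = len(B[0])
    C = [[0.0] * m for _ in range(n)]
    # Loop tiling: walk the output in small blocks so the working set
    # stays in fast memory (cache/SRAM) instead of streaming the whole matrix.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for i in range(i0, min(i0 + tile, n)):
                for j in range(j0, min(j0 + tile, m)):
                    acc = bias[j]
                    for p in range(k):  # a real kernel would unroll this loop
                        acc += A[i][p] * B[p][j]
                    # Operator fusion: apply the ReLU here instead of in a
                    # separate pass, avoiding an extra buffer and traversal.
                    C[i][j] = max(acc, 0.0)
    return C

A = [[1.0, -2.0], [3.0, 4.0]]
B = [[1.0, 0.0], [0.0, 1.0]]  # identity, so relu(A) is the expected result
print(matmul_tiled_fused_relu(A, B, [0.0, 0.0]))
```

Loop unrolling, the third optimization mentioned, would replace the inner `for p` loop with repeated straight-line statements to cut loop-control overhead on the microcontroller.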
The researchers observed that, compared to traditional deep learning deployments, MCUNet makes better use of limited resources through system-algorithm co-design. They conclude that the model achieved a record ImageNet accuracy of 70.7% on off-the-shelf microcontrollers.