Google’s AI Lab Unveils A New Framework For Efficient Chip Design

Apollo, from Google AI, uses machine learning to select chip-design parameters, producing designs that improve inference performance by nearly 25%

Google Brain, Google's AI lab, has advanced its in-house development of custom chips with deep learning, which can outperform human engineers at decisions such as how to lay out circuitry on a chip.

In March, Google AI unveiled Apollo, its new framework for optimizing AI accelerator chip design. Apollo uses machine-learning algorithms to choose chip parameters that minimize deep learning inference latency while also keeping chip area to a minimum. With Apollo, researchers found designs that achieved a 24.6% speedup over designs selected by a baseline algorithm.
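That optimization target pairs two goals: low inference latency and small chip area. One common way to fold both into a single black-box objective is to score feasible designs by latency and penalize designs that exceed an area budget. The Python sketch below illustrates the idea; the function names, penalty scheme, and numbers are assumptions for illustration, not Apollo's actual formulation.

```python
def objective(config, simulate_latency, estimate_area, area_budget_mm2):
    """Score a candidate accelerator: its simulated latency if it fits the
    area budget, otherwise a penalty that grows with the overshoot so the
    search is steered back toward the feasible region."""
    area = estimate_area(config)
    if area > area_budget_mm2:
        return 1e9 + (area - area_budget_mm2)  # infeasible: heavily penalized
    return simulate_latency(config)

# Toy stand-ins so the sketch runs end to end (made up, not Apollo's models):
latency = objective(
    {"processor_units": 8},
    simulate_latency=lambda c: 10.0 / c["processor_units"],  # pretend latency model
    estimate_area=lambda c: 2.5 * c["processor_units"],      # pretend area model
    area_budget_mm2=25.0,
)
print(latency)  # 1.25: a feasible 20 mm^2 design, scored by its latency
```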

According to a blog post by research scientist Amir Yazdanbakhsh, Apollo searches for the set of hardware parameters, such as memory size, I/O bandwidth, and the number of processor units, that yields the best inference performance for a given deep learning model. Apollo explores the parameter space efficiently using evolutionary algorithms and transfer learning, reducing both overall design time and production cost.
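A minimal sketch of how an evolutionary search over such hardware parameters can work is shown below in Python. The parameter names, value ranges, and stand-in simulator are illustrative assumptions; Apollo's actual parameter set, simulator, and transfer-learning machinery are internal to Google.

```python
import random

# Illustrative design space: each parameter offers a small menu of values.
PARAM_SPACE = {
    "memory_size_kb":    [256, 512, 1024, 2048],
    "io_bandwidth_gbps": [16, 32, 64],
    "processor_units":   [2, 4, 8, 16],
}

def simulate_latency(config):
    """Stand-in for the expensive software simulator that scores a
    candidate design on a given model (a made-up analytic form)."""
    return (1000.0 / config["processor_units"]
            + 100.0 / config["io_bandwidth_gbps"]
            + 50_000.0 / config["memory_size_kb"])

def random_config():
    return {k: random.choice(v) for k, v in PARAM_SPACE.items()}

def mutate(config):
    """Copy a parent and re-sample one randomly chosen parameter."""
    child = dict(config)
    key = random.choice(list(PARAM_SPACE))
    child[key] = random.choice(PARAM_SPACE[key])
    return child

def evolutionary_search(population_size=20, generations=30):
    """Keep the fittest quarter each generation; refill with mutants."""
    population = [random_config() for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=simulate_latency)
        parents = ranked[: population_size // 4]
        children = [mutate(random.choice(parents))
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return min(population, key=simulate_latency)

best = evolutionary_search()
print(best, simulate_latency(best))
```

Because each real simulation is costly, sample efficiency rather than raw exploration is what makes such a search practical, which is where transfer learning across related design studies comes in.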

"We believe that this research is an exciting path forward to further explore ML-driven techniques for architecture design and co-optimization (like compiler, mapping, and scheduling) across the computing stack to invent efficient accelerators with new capabilities for the next generation of applications", said Yazdanbakhsh.

Deep learning models are being developed to solve problems ranging from computer vision to natural language processing (NLP). The drawback is that these models demand heavy computation and memory at inference time, straining the hardware limits of edge and mobile devices. Custom accelerator hardware can improve model inference latency, but often at the cost of significant modifications to the model.

In contrast, the Apollo team's strategy is to tailor the accelerator hardware to a particular deep learning model. The accelerator is based on a 2D array of processing elements (PEs), each containing multiple single-instruction, multiple-data (SIMD) cores. This basic template is specialized by choosing values for design parameters such as the size of the PE array and the number of cores per PE. In all, the design space contains nearly 500 million parameter combinations. Because each proposed accelerator design must be simulated in software, judging its performance on a deep learning model is compute- and time-intensive.
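To see how a design space reaches hundreds of millions of points, it is enough to multiply the number of options per parameter. In the sketch below, only the roughly 500-million overall scale comes from the article; the individual parameter names and option counts are made-up assumptions.

```python
from math import prod

# Hypothetical option counts per design parameter; chosen so the product
# lands near the ~5e8 scale the article cites.
options = {
    "pe_array_rows": 32,
    "pe_array_cols": 32,
    "cores_per_pe": 8,
    "simd_lanes_per_core": 8,
    "local_memory_size": 16,
    "io_bandwidth_setting": 12,
    "dataflow_schedule": 40,
}
design_points = prod(options.values())
print(f"{design_points:,} candidate designs")  # 503,316,480

# Every point needs a software simulation to score, so exhaustive search
# is hopeless: at one minute per simulation this is ~958 simulator-years.
years = design_points / (60 * 24 * 365)
print(f"~{years:.0f} simulator-years at 1 minute per design")
```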

Apollo builds on Vizier, Google's internal service for black-box optimization; Vizier's Bayesian optimization method serves as the baseline against which Apollo's performance is evaluated. The Apollo framework supports a range of optimization strategies, including random search, evolutionary search, and a method called population-based black-box optimization.
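A Vizier-style black-box service decouples the search strategy from the expensive evaluation: the optimizer only proposes trial configurations and consumes their scores. The Python sketch below illustrates that loop with random search as one pluggable strategy; the interface, names, and toy scorer are assumptions for illustration, not Vizier's actual API.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, int]
Trial = Tuple[Config, float]

def run_study(propose: Callable[[List[Trial]], Config],
              score: Callable[[Config], float],
              budget: int) -> Config:
    """Generic black-box loop: the strategy proposes the next trial from
    the history so far, the expensive scorer evaluates it, and the result
    is appended. Swapping `propose` changes the strategy (random search,
    evolutionary search, ...) without touching the loop."""
    history: List[Trial] = []
    for _ in range(budget):
        config = propose(history)
        history.append((config, score(config)))
    return min(history, key=lambda t: t[1])[0]

SPACE = {"memory_size_kb": [256, 512, 1024], "processor_units": [2, 4, 8, 16]}

def random_strategy(history: List[Trial]) -> Config:
    # Random search ignores the history entirely; smarter strategies
    # (Bayesian, evolutionary, population-based) condition on it.
    return {k: random.choice(v) for k, v in SPACE.items()}

def toy_score(cfg: Config) -> float:
    # Made-up stand-in for the chip simulator.
    return 1e5 / cfg["memory_size_kb"] + 100.0 / cfg["processor_units"]

print(run_study(random_strategy, toy_score, budget=100))
```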

The design of accelerator hardware for scaling AI inference is an active area of study. Apple's new M1 processor includes a neural engine designed to speed up AI computations. Researchers at Stanford recently published an article in the journal Nature describing a system called Illusion, which uses a network of smaller chips to mimic a single large accelerator.

In the same space as the accelerators Google's Apollo targets, Movidius's Myriad 2 is a multi-core chip that supports computational imaging and visual awareness for mobile devices, wearables, and embedded applications. One of its features is sustained performance efficiency across a range of computational imaging and vision applications, including those with latency requirements in the milliseconds.
