Ten cool, trending processors for IoT applications
The Internet of Things (IoT) has sparked a proliferation of connected devices. These devices house sensors that collect data for day-to-day activities or monitoring purposes and are built around microcontroller and microprocessor chips. The chip is chosen to match the sensor data and the assigned task, so there is no one-size-fits-all processor architecture. For example, some devices perform only a limited amount of processing on readings such as temperature, humidity, pressure, or gravity; more complicated systems, however, must handle multiple high-resolution audio or video streams. While high performance is a priority, low power consumption is a must too. To reduce power consumption, engineers employ techniques such as adaptive voltage scaling, power gating, and multiple reduced-power operating modes. Engineers are also designing processor chips that help bridge artificial intelligence with IoT. Here are 10 of the most promising processors on the market today.
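To see why reduced-power operating modes matter so much, consider a simple duty-cycle model of a sensor node that sleeps between measurements. The current figures below are illustrative assumptions, not datasheet values for any chip in this article:

```python
# Back-of-the-envelope model of a duty-cycled IoT sensor node.
# All current figures are illustrative assumptions, not datasheet values.

def average_current_ma(active_ma, sleep_ma, active_s, period_s):
    """Average current of a node awake for active_s out of every period_s."""
    duty = active_s / period_s
    return active_ma * duty + sleep_ma * (1.0 - duty)

# A node that samples a sensor for 50 ms once per minute:
always_on = average_current_ma(10.0, 10.0, 0.05, 60.0)     # never sleeps
duty_cycled = average_current_ma(10.0, 0.002, 0.05, 60.0)  # deep-sleeps between samples

print(f"always on:   {always_on:.3f} mA")
print(f"duty-cycled: {duty_cycled:.3f} mA")
```

Under these assumptions the duty-cycled node draws roughly a thousandth of the always-on current, which is exactly the gap that sleep modes and power gating are designed to exploit.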
1. Intel Movidius Myriad X

Developed by the Irish startup Movidius, which Intel acquired in 2016, the Myriad X is the company’s third-generation vision-processing unit and the first to feature a dedicated neural network compute engine, offering one tera operations per second (TOPS) of dedicated deep neural network (DNN) compute. The neural compute engine directly interfaces with a high-throughput intelligent memory fabric to avoid any memory bottleneck when transferring data. It supports FP16 and INT8 calculations. The Myriad X also features a cluster of 16 proprietary SHAVE cores and upgraded and expanded vision accelerators.
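The FP16/INT8 support is worth a closer look: running a network in INT8 halves the memory traffic versus FP16 at the cost of some precision. A minimal sketch of affine INT8 quantization, the scale-plus-zero-point scheme commonly used for this (the values here are illustrative, not from any real model):

```python
# Affine INT8 quantization sketch: map a float range onto [-128, 127]
# with a scale and zero point, then recover an approximation.

def quantize_params(xmin, xmax, qmin=-128, qmax=127):
    scale = (xmax - xmin) / (qmax - qmin)
    zero_point = round(qmin - xmin / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))      # clamp to the INT8 range

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

scale, zp = quantize_params(-1.0, 1.0)
x = 0.4567
q = quantize(x, scale, zp)
x_hat = dequantize(q, scale, zp)
print(q, round(x_hat, 4))  # round trip loses at most ~scale/2 of precision
```

Each stored value shrinks from 16 bits to 8, and the worst-case rounding error is half the scale step, which is why INT8 inference works well for networks that tolerate small perturbations.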
The Myriad X is available in Intel’s Neural Compute Stick 2, effectively an evaluation platform in the form of a USB thumb drive. It can be plugged into any workstation to allow AI and computer-vision applications to be up and running on the dedicated Movidius hardware very quickly.
2. Nvidia Jetson Xavier NX

Nvidia revealed the Jetson Xavier NX system-on-chip last fall as the “world’s smallest supercomputer,” offering “server-class performance” in a 10-watt power envelope for a variety of IoT form factors. It is the smallest form factor in Nvidia’s Jetson computing board lineup, measuring roughly the size of a credit card, and it comes with 384 CUDA cores and 48 tensor cores, allowing it to deliver up to 21 tera operations per second.
Thanks to Nvidia’s engineering and design, the Jetson Xavier NX provides up to 15 times higher performance than the Jetson TX2 in a smaller form factor with the same power draw. The Jetson Xavier NX also comes with Nvidia’s Deep Learning Accelerator, a six-core Carmel Arm CPU, support for up to six CSI cameras over 12 lanes of the MIPI CSI-2 camera serial interface, 8 GB of 128-bit LPDDR4x memory, gigabit Ethernet, and Ubuntu-based Linux.
3. Texas Instruments TDA4VM

Part of the Jacinto 7 series for automotive advanced driver-assistance systems (ADAS), the TDA4VM is TI’s first system-on-chip (SoC) with a dedicated deep-learning accelerator on-chip. This block is based on the C7x DSP plus an in-house-developed matrix multiply accelerator (MMA) and can achieve 8 TOPS.
The SoC can handle a video stream from a front-mounted camera at up to 8 MP or a combination of four to six 3-MP cameras plus radar, LiDAR, and ultrasonic sensors. The MMA might be used to perform sensor fusion on these inputs in an automated valet parking system, for example. The TDA4VM is designed for ADAS systems with power budgets between 5 and 20 W.
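As an illustration of the sensor-fusion idea (not TI’s actual algorithm), a standard way to merge independent distance estimates from camera, radar, and ultrasound is inverse-variance weighting, where more trustworthy sensors get proportionally more say:

```python
# Inverse-variance weighting: fuse independent estimates of one quantity.
# The readings below are invented for illustration.

def fuse(estimates):
    """estimates: list of (value, variance) pairs -> (fused value, fused variance)."""
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total

# Illustrative readings of the same obstacle distance in metres:
readings = [(4.90, 0.25),   # camera: noisier at range
            (5.05, 0.04),   # radar: accurate
            (5.20, 0.49)]   # ultrasonic: near its range limit
dist, var = fuse(readings)
print(f"fused distance: {dist:.2f} m, variance: {var:.3f}")
```

Note that the fused variance is lower than that of the best single sensor, which is the payoff that justifies carrying multiple sensor modalities in the first place.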
The device is still in pre-production, but development kits are available now.
4. Renesas RZ/A2M

The RZ/A2M combines a proprietary accelerator for processing image data with a 528-MHz Arm Cortex-A9 and 4 MB of SRAM for machine-vision jobs.
Renesas designed a dynamically reconfigurable processor (DRP) made up of multiple cores that can exploit the parallelism in imaging algorithms. It expects that the DRP, described as similar to a GPU, will handle a wide variety of jobs, initially around inference tasks. Future products will target neural-net training at the edge.
As with all parallel processors, programming can be the big bugaboo. Renesas says that its DRP can be programmed in C using compilers and tools that it provides.
5. Renesas RX23W

The RX23W is a 32-bit MCU with Bluetooth 5.0 for IoT endpoint devices such as home appliances and health-care equipment. The MCU also includes Renesas’s Trusted Secure IP, featured in its RX MCU family, to address Bluetooth security risks such as eavesdropping, tampering, and viruses.
The RX23W is based on Renesas’s RXv2 core, which achieves a high 4.33 CoreMark/MHz, with an improved floating-point unit (FPU) and DSP functions. The chip operates at a maximum clock frequency of 54 MHz. Optimized for system control and wireless communication, the RX23W provides full Bluetooth 5.0 Low Energy support, including long-range and mesh-networking functions, and claims the industry’s lowest peak current in reception mode, at 3 mA. The RX23W is available now in 7 × 7-mm 56-pin QFN and 5.5 × 5.5-mm 85-pin BGA packages with 512 KB of on-chip flash memory.
6. Kneron KL520
The first offering from American-Taiwanese startup Kneron is the KL520 neural network processor, designed for image processing and facial recognition in applications such as smart homes, security systems, and mobile devices. It’s optimized to run convolutional neural networks (CNNs), the type commonly used in image processing today.
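The cost of running a CNN is dominated by multiply-accumulate (MAC) operations, which is precisely the workload a neural processing unit like the KL520’s accelerates. This tiny pure-Python 2D convolution makes the MAC count explicit; it is a teaching sketch, not production code:

```python
# Naive 2D convolution (valid padding) with an explicit MAC counter,
# to show where a CNN's compute actually goes.

def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    macs = 0
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
                    macs += 1
            out[i][j] = acc
    return out, macs

image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]  # 4x4 ramp
edge = [[1.0, -1.0]]  # 1x2 horizontal-difference kernel
out, macs = conv2d(image, edge)
print(out[0], "MACs:", macs)
```

Even this toy example needs one MAC per kernel tap per output pixel; scale that to megapixel images and hundreds of multi-channel layers and the appeal of a dedicated MAC array becomes obvious.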
Kneron launched the KL520 AI system-on-chip last fall; it combines dual Arm Cortex-M4 CPUs with the company’s neural processing unit to provide high-performance inference in low-power devices such as smart locks, security cameras, and intelligent home appliances. Thanks to Kneron’s Reconfigurable Artificial Neural Network technology, the chip can adapt on the fly to processing and analyzing audio, 2D images, and 3D images, while also supporting AI frameworks such as TensorFlow and PyTorch and neural networks such as ResNet and MobileNet. The chip is available in edge AI modules made by Asus-owned AAEON.
The KL520 can run 0.3 TOPS and consumes 0.5 W (equivalent to 0.6 TOPS/W), which the company said is sufficient for accurate facial recognition, given that the chip’s MAC efficiency is high (over 90 percent).
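The efficiency figures quoted above are easy to sanity-check. The 0.3 TOPS, 0.5 W, and >90% MAC-efficiency numbers come from the article; the rest is arithmetic:

```python
# Sanity check of the KL520 efficiency figures quoted in the article.

peak_tops = 0.3
power_w = 0.5
efficiency = peak_tops / power_w   # TOPS per watt
usable_tops = peak_tops * 0.90     # throughput at >90% MAC utilisation

print(f"{efficiency:.1f} TOPS/W, ~{usable_tops:.2f} usable TOPS")
```

The MAC-efficiency point matters because a chip’s headline TOPS is a peak figure; a high utilisation means nearly all of that peak is available to a real network rather than lost to idle MAC units waiting on memory.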
7. CEVA-X1

The CEVA-X1 is a multi-purpose combined DSP and control processor aimed at multi-mode IoT hub devices, handling cellular, LPWA, short-range communication, positioning, always-on sensor fusion, and speech processing concurrently.
The CEVA-X1 is ideal for M2M protocol stack and baseband PHY control, including LTE Cat-NB1, Cat-M1, Sigfox, LoRa, Wi-Fi 802.11n, 802.11ah, Bluetooth, Bluetooth Low Energy, and Zigbee/Thread. It also supports positioning and motion-sensing functions, including GNSS (GPS, Beidou, GLONASS, Galileo), a fusion of multiple indoor positioning and activity sensors, voice activation, and sound processing.
The CEVA-X1 has been designed explicitly as a single-core IoT hub solution, with dedicated instructions that optimize overall system power, performance, and chip area for baseband channel coding/decoding as well as fusion of multiple always-on sensors. Thanks to these optimizations, 5- to 10-year operation from a single battery is achievable at very low cost.
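A multi-year single-battery claim boils down to average current draw. A rough check, using an illustrative battery capacity and current budget that are assumptions on my part, not CEVA figures:

```python
# Rough battery-life estimate for an always-on sensor hub.
# Capacity and current figures are illustrative assumptions, not CEVA's.

def battery_life_years(capacity_mah, avg_current_ma):
    hours = capacity_mah / avg_current_ma
    return hours / (24 * 365)

# A ~3600 mAh lithium primary cell and a 50 uA average draw:
years = battery_life_years(3600.0, 0.050)
print(f"{years:.1f} years")
```

Under these assumptions the node lasts about eight years, so hitting the 5-to-10-year window requires keeping the whole system’s average draw in the tens of microamps, which is what the single-core integration and dedicated instructions are aimed at.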
8. Arm Cortex-M55

The new Arm Cortex-M55 offers the enhanced machine-learning performance and efficiency needed for the next generation of microcontrollers. The British chipmaker called the Cortex-M55 its most AI-capable Cortex-M processor to date, improving machine-learning performance by up to 15 times and digital-signal-processing performance by up to five times compared with previous Cortex-M generations. The company also revealed the Ethos-U55, its first micro neural processing unit, which can be paired with the Cortex-M55 to provide up to 480 times higher machine-learning performance over previous Cortex-M chips. The Ethos-U55 is highly configurable and uses advanced compression techniques to lower energy use and reduce machine-learning model sizes.
9. MediaTek MT3620

Microsoft worked with MediaTek to make this processor the reference chip for Azure Sphere, its all-in-one node-to-cloud IoT offering announced in April. It’s part of a wave of integrated solutions emerging from cloud providers, including Alibaba and Amazon.
Microsoft distinguished its approach by defining a so-called Pluton security block, implemented in the MT3620 on an Arm Cortex-M4F core that handles security operations. The part also includes a 500-MHz Cortex-A7 applications processor with 4 MB of SRAM, a Wi-Fi subsystem, and support for 16 MB of external flash.
10. AMD Ryzen Embedded V1000

AMD Ryzen Embedded V1000 processors provide an ideal balance of performance and power, with integrated graphics enabling IoT gateways with learning and decision-making capabilities on a robust security platform.
Delivering discrete-GPU-caliber graphics, multimedia processing, and compute performance of up to 3.61 TFLOPS at a thermal design power (TDP) configurable from as low as 12 W to as high as 54 W, AMD Ryzen Embedded V1000 SoCs equip system designers to achieve new levels of processing efficiency and design versatility. An industrial-temperature option is also available that can operate at temperatures as low as -40°C. Integrating a high-performance CPU, GPU, and extensive I/O in a single SoC enables high performance in a small form factor: a smaller board and lower power for a lower total cost of ownership.