
Domestic edge AI chips move into real-world AI deployment

Released on: Oct 13, 2021

Edge AI chip
Today, the flood of data produced by the rise of the Internet of Things is pushing the processing of important IoT sensor data closer and closer to where the data originates, creating demand for machine learning based on edge computing. Over the past two years, the theme of AI development has been very clear: real-world deployment. Across AI technologies, whether upper-level algorithms and applications or end products, everything ultimately depends on the underlying computing power, that is, the AI chip.
Edge AI market deployment trends
Since last year, the edge AI market has entered a period of explosive growth, and the number of AIoT terminals has surpassed that of smartphones. In terms of deployment scenarios, the two major workloads for AI chips are training and inference. Of the two, the market share of inference is growing quickly and catching up with that of training.
An algorithm model can now be as small as 2KB yet outperform a 4MB neural network. This miniaturization of deep learning algorithms indicates that super-large models such as GPT-3 are not well suited to the enterprise market.
In the new IoT framework, data can be processed locally on the device. The design of the underlying chip therefore has to account for the constraints under which algorithm models are deployed in different scenarios, including the allocation of computing power, power consumption, and silicon area. Measured by cost-effectiveness per unit of computing power, inference better represents the direction the enterprise market needs.
Challenges in deploying edge AI chips
First, edge AI scenarios are diverse, and different scenarios place different demands on a chip's power consumption and performance. At the same time, the market size of these fragmented scenarios is uncertain, so chip research and development has to balance engineering cost against return. This is a test every chip manufacturer faces.
Obtaining high-quality data is another major difficulty: how to filter reliable data out of big data. Big data does not mean high-quality data. Running deep learning tasks on AI chips also requires continuous optimization of the hardware sensors.
In addition, for traditional customers, smart products carry usage costs on top of the purchase cost. A chip's power consumption and how easily the product can be deployed both affect the adoption and promotion of AI. How chip manufacturers tailor chips with different computing power to different scenarios is another pain point in deployment.
At present, edge AI chips lack a highly usable development platform, software compilation tools are complicated to design, and the barrier to development and use is high. These problems can be expected to ease through continued iteration as deployment proceeds.
Commercial Edge AI Chip
Sunrise series
For AIoT, Horizon has launched the Sunrise series of edge AI chips. Through the joint optimization of IC design and software, the series balances performance, power consumption, flexibility, and cost. The Sunrise 2 edge AI chip adopts the BPU Bernoulli 1.0 architecture, which provides 4 TOPS of equivalent computing power and performs real-time detection and accurate recognition of multiple types of targets. Sunrise 2 integrates a dual-core Cortex-A53, which can efficiently support a variety of mainstream AI tasks. It also supports eMMC and SPI Flash.
Sunrise 3, also from Horizon, is a new-generation AIoT edge AI chip focused on low power consumption and high performance. It integrates Horizon's most advanced AI engine (BPU), based on the Bernoulli 2.0 architecture, which provides 5 TOPS of equivalent computing power.
The new BPU architecture greatly improves support for advanced CNN architectures and greatly reduces the DDR bandwidth occupied by AI operations. Together with the Horizon Tiangong Kaiwu AI development platform, it greatly simplifies algorithm development and deployment and lowers the cost of bringing AI products to market.
Under the Bernoulli 2.0 BPU architecture, DDR utilization is increased by 5 times. The advanced ISP processing algorithm makes it possible to capture high-quality 12-megapixel images in wide-dynamic-range and low-light scenes. Sunrise 3 can process 4 to 8 camera sensor inputs of different resolutions simultaneously and supports a variety of image post-processing. It also supports H.264/H.265 encoding and decoding, with a performance of [email protected]

(Sunrise 3)
Kanzhi K210 / K510
The first-generation chip, the Kanzhi K210, is designed specifically for machine vision tasks. Its floating-point computing capability reaches 1.28 TFLOPS, comparable to mainstream development options in the embedded field. Its own power consumption is only 0.3 W, and power consumption in typical working scenarios stays below 1 W, so the power consumed per unit of computing power is low, making it a very economical choice. The second-generation Kanzhi K510 was upgraded on the basis of real deployments and customer feedback. Its IP core was re-architected to match the differing requirements for computing resources, storage, and bandwidth at different layers of a neural network, raising the data reuse rate and lowering the chip's power consumption.
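The remark about power consumption per unit of computing power can be made concrete with a little arithmetic. The Python sketch below uses only the figures quoted above (1.28 TFLOPS, 0.3 W, and the 1 W bound for typical scenarios). The function name watts_per_tflops is made up for illustration, and quoted peak throughput rarely matches sustained workloads, so the result is a rough estimate rather than a measured efficiency.

```python
# Figures as quoted in the article for the Kanzhi K210 (not independently verified).
PEAK_TFLOPS = 1.28       # quoted floating-point computing capability
CHIP_POWER_W = 0.3       # quoted power consumption of the chip
SCENARIO_POWER_W = 1.0   # quoted upper bound for typical working scenarios


def watts_per_tflops(power_w: float, tflops: float) -> float:
    """Power spent per unit of computing power; lower is better."""
    if tflops <= 0:
        raise ValueError("tflops must be positive")
    return power_w / tflops


# Chip alone, and the worst case implied by the "less than 1 W" scenario figure.
chip_only = watts_per_tflops(CHIP_POWER_W, PEAK_TFLOPS)
scenario_bound = watts_per_tflops(SCENARIO_POWER_W, PEAK_TFLOPS)

print(f"chip only:      {chip_only:.3f} W per TFLOPS")
print(f"scenario bound: {scenario_bound:.3f} W per TFLOPS")
```

On the quoted figures this comes to roughly 0.23 W per TFLOPS for the chip alone, and at most about 0.78 W per TFLOPS at the 1 W scenario bound.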
To improve the speed and quality of data acquisition, the Kanzhi K510 is equipped with a brand-new vision module. Compared with the first-generation chip, the K510 has been greatly optimized in frame rate and peripherals, and its frame rate per TOPS of computing power has reached an industry-leading level. The K510 also supports BF16 floating-point computation, which gives it an advantage over similar products in scenarios where model quantization is not suitable.

(Schematic diagram of Video subsystem)
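One way to see why BF16 support matters where quantization is a poor fit is to compare how the two formats treat values with a wide dynamic range. The sketch below is a pure-software illustration in Python and says nothing about how the K510 implements BF16 in hardware. The sample values are hypothetical, to_bf16 and int8_round_trip are illustrative helper names, and the INT8 scheme shown (symmetric, one scale per tensor) is just one common quantization choice.

```python
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value and return it as a Python float.

    bfloat16 keeps float32's sign bit and 8 exponent bits but only 7 mantissa
    bits. NaN and infinity are not handled specially in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, on the 16 low bits that get dropped.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def int8_round_trip(values):
    """Symmetric per-tensor INT8 quantization followed by dequantization."""
    scale = max(abs(v) for v in values) / 127.0
    if scale == 0.0:
        return [0.0 for _ in values]
    return [max(-127, min(127, round(v / scale))) * scale for v in values]


# Hypothetical activations spanning five orders of magnitude.
samples = [0.001, 0.02, 0.5, 100.0]
bf16_values = [to_bf16(v) for v in samples]
int8_values = int8_round_trip(samples)

for v, b, q in zip(samples, bf16_values, int8_values):
    print(f"{v:>8}: bf16={b:.6g}  int8={q:.6g}")
```

With these samples, the INT8 round trip maps 0.001 and 0.02 to zero because the scale is set by the largest value, while BF16 keeps every sample within about 0.4% relative error. Per-channel scales or calibration can narrow this gap, but not every model tolerates them.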
In the long run, edge AI chips will lift enterprise IoT applications to a whole new level. Smart devices driven by AI chips will help expand existing markets while changing how value is distributed across industries such as manufacturing, construction, logistics, agriculture, and energy.