Google Coral Edge TPU competes with NVIDIA Jetson Nano! This article compares the two newly launched Edge AI chips and analyzes their respective advantages and disadvantages.
Edge intelligence is called the last mile of artificial intelligence.
Google launched the Coral Edge TPU in March. It comes as a development board (Coral Dev Board) priced at under RMB 1,000, consisting of an Edge TPU module and a baseboard. The parameters are as follows:
NVIDIA also released its latest product, the NVIDIA Jetson Nano, last month. The Jetson Nano is an embedded computer similar to the Raspberry Pi. It has a quad-core Cortex-A57 processor, an NVIDIA Maxwell-based GPU with 128 CUDA cores, 4GB of LPDDR4 memory, and 16GB of eMMC 5.1 storage, and it supports 4K 60Hz video decoding.
So far there are few evaluation reports on these two products. Today we bring you a review of both by Sam Sterckval. For comparison, he also tested an i7-7700K + GTX1080 (2,560 CUDA cores), a Raspberry Pi 3B+, and a 2014 MacBook Pro with an i7-4870HQ (no CUDA-capable cores).
Sam uses MobileNetV2 as the classifier, pre-trained on the ImageNet dataset and taken directly from Keras with TensorFlow as the backend. The floating-point weights are used on the GPU, and the 8-bit quantized tflite version is used on the CPU and the Coral Edge TPU.
First, he loads the model and an image of a magpie. He then runs one prediction as a warm-up, because he found that the first prediction always takes noticeably longer than the following ones. Next he sleeps for 1 second so that all thread activity has finished, and then classifies the same image 250 times. Using the same image for every classification ensures that the data stays close to the data bus throughout the test, so the test measures inference speed rather than data loading.
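The procedure above can be sketched as a small timing harness. This is a minimal sketch, not Sam's actual script: `benchmark` and `dummy_predict` are hypothetical names, and the stand-in classifier exists only so the example runs without TensorFlow. In practice you would pass in the real model's prediction call.

```python
import time

def benchmark(predict, image, runs=250, settle_seconds=1.0):
    """Time repeated classification of one image; returns (fps, last_result)."""
    # Warm-up: the first prediction is slower, so it is excluded from timing.
    predict(image)
    # Pause so background thread activity from the warm-up can finish.
    time.sleep(settle_seconds)

    start = time.perf_counter()
    result = None
    for _ in range(runs):
        result = predict(image)
    elapsed = time.perf_counter() - start
    return runs / elapsed, result


# Stand-in classifier (hypothetical): returns the index of the largest score.
def dummy_predict(image):
    return max(range(len(image)), key=image.__getitem__)

fps, label = benchmark(dummy_predict, [0.1, 0.7, 0.2], runs=250, settle_seconds=0.0)
print(f"{fps:.1f} FPS, predicted class {label}")
```

Timing the whole loop and dividing by the number of runs gives an average frame rate that is less sensitive to a single slow call than timing each prediction individually.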
Sam found that the scores of the quantized tflite model on the CPU differ from the others, although it always seems to return the same prediction as the other setups. He suspects that something is odd in the model, but he is fairly confident that it does not affect performance.
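For readers unfamiliar with what "8-bit quantized" means, the idea can be illustrated with a generic affine quantization sketch. This is an assumption-laden illustration, not the exact procedure the TensorFlow Lite converter uses, and it is not offered as an explanation of the score difference Sam observed. The function names are hypothetical. It only shows that mapping floats onto 256 integer levels introduces small rounding errors.

```python
def quantize_params(x_min, x_max):
    """Scale and zero point that map the float range [x_min, x_max] onto 0..255."""
    # The range is widened to include 0 so that 0.0 has an exact 8-bit code.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / 255.0
    zero_point = round(-x_min / scale)
    return scale, zero_point

def quantize(x, scale, zero_point):
    """Map a float to its 8-bit code."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))  # clamp to the uint8 range

def dequantize(q, scale, zero_point):
    """Map an 8-bit code back to an approximate float."""
    return scale * (q - zero_point)

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # illustrative values, not real model weights
scale, zp = quantize_params(min(weights), max(weights))
codes = [quantize(w, scale, zp) for w in weights]
restored = [dequantize(q, scale, zp) for q in codes]
print(codes)
print([round(r, 4) for r in restored])
```

Each restored value lies within one quantization step (`scale`) of the original, which is why a quantized model can report slightly different scores while still ranking the same class first.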
Comparative analysis
In the first histogram, three results stand out: two were achieved with the Google Coral Edge TPU USB accelerator, and the third with an Intel i7-7700K assisted by an NVIDIA GTX1080.
Looking more closely, the GTX1080 actually stands no chance against Google's Coral. Keep in mind that the GTX1080 draws up to 180W, while the Coral Edge TPU draws only 2.5W.
The NVIDIA Jetson Nano does not score highly. Although it has a CUDA-capable GPU, it is only slightly faster than the quad-core, hyper-threaded i7-4870HQ in the 2014 MacBook Pro. However, the i7 draws 50W, while the Jetson Nano stays at about 12.5W. That is 75% less power for 10% more performance.
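The arithmetic behind that comparison can be checked in a few lines. The wattages and the 10% figure are the ones quoted above. The performance-per-watt ratio is derived here from those figures and does not appear in the review itself.

```python
def percent_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline * 100.0

i7_watts = 50.0     # i7-4870HQ figure quoted in the article
nano_watts = 12.5   # Jetson Nano figure quoted in the article
speedup = 1.10      # "10% more performance", relative to the i7

power_saving = percent_reduction(i7_watts, nano_watts)
# Performance per watt relative to the i7: more speed from less power.
perf_per_watt_gain = speedup * (i7_watts / nano_watts)

print(f"power reduced by {power_saving:.0f}%")
print(f"performance per watt: {perf_per_watt_gain:.1f}x the i7")
```

Under these figures the Jetson Nano delivers roughly 4.4 times the performance per watt of the i7, which is a more useful number for an edge device than raw frame rate.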
NVIDIA Jetson Nano
Although the Jetson Nano does not post an impressive FPS figure with the MobileNetV2 classifier, its advantages are obvious:
It is cheap and draws little power. More importantly, it runs TensorFlow-gpu or any other ML framework, just like the machines we normally use. As long as a script does not depend on details of the CPU architecture, the exact same script that runs on an i7 + CUDA GPU runs on the Nano, and the Nano can also train models. Sam strongly feels that NVIDIA should ship L4T with TensorFlow preinstalled.
Google Coral Edge TPU
Sam makes no secret of his admiration for the careful design and high efficiency of the Google Coral Edge TPU. Below we can compare how small the Edge TPU is. The Edge TPU is an ASIC (application-specific integrated circuit), which means that its small electronic components, such as FETs, are fabricated directly on the silicon in exactly the arrangement needed to speed up inference. The Edge TPU cannot, however, perform backpropagation.
Inference in a network such as MobileNetV2 is dominated by convolutions. This means multiplying each element (pixel) of the image with each element of the kernel and then adding up these results to create a new "image" (feature map). This is the main job of the Edge TPU: multiply everything at the same time, then add everything up at tremendous speed. No CPU is involved; you only have to keep pumping data into the input buffers.
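The multiply-and-add operation described above can be written out in plain Python. This is an illustrative sketch with a hypothetical function name, a tiny image, and no padding or stride. It follows the deep learning convention of not flipping the kernel. The Edge TPU performs the same multiply-accumulate work in parallel in hardware, whereas this loop does it one element at a time.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Multiply each kernel weight with the pixel under it, then sum.
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map

image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, -1],
]
print(conv2d_valid(image, kernel))  # [[-4, -4], [-4, -4]]
```

Every output value depends only on a small window of the input, so all of them can be computed independently. That independence is what dedicated hardware exploits.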
We have seen that Coral is in a class of its own when it comes to performance per watt. It is a set of electronics designed to perform exactly the bitwise operations required, with essentially no overhead.
Summary
Why is there no 8-bit model for the GPU?
A GPU is inherently designed as a fine-grained parallel floating-point calculator, so floating-point work is what it does best. The Edge TPU is designed for 8-bit operations, and CPUs have ways of handling 8-bit data faster than full-width floating-point numbers, because they have to deal with such data in many cases.
Why choose MobileNetV2?
The main reason is that MobileNetV2 is one of the pre-compiled models provided by Google for Edge TPU.
What other models does the Edge TPU support?
Previously, only different versions of MobileNet and Inception were available. As of last weekend, Google released an update that allows custom TensorFlow Lite models to be compiled, but only TensorFlow Lite models. The Jetson Nano, in contrast, has no such restriction.
Raspberry Pi + Coral compared to others
Why does Coral look so much slower when connected to a Raspberry Pi? Because the Raspberry Pi only has USB 2.0 ports.
The i7-7700K is faster with both Coral and Jetson Nano, but it still cannot match the latter two. Sam therefore concludes that the bottleneck is the data rate, not the Edge TPU.
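A rough calculation shows why the link speed can matter. This sketch rests on assumptions that are not in the review: one uncompressed 224x224 RGB frame per inference (MobileNetV2's default input size) and the nominal signaling rates of USB 2.0 High Speed and USB 3.0 SuperSpeed. Real throughput is lower because of protocol overhead, so the results are lower bounds, not measurements.

```python
def transfer_ms(num_bytes, link_bits_per_second):
    """Lower bound on the time to move `num_bytes` over a link, in milliseconds."""
    return num_bytes * 8 / link_bits_per_second * 1000.0

# Assumption: one 224x224 RGB frame at 8 bits per channel.
frame_bytes = 224 * 224 * 3

USB2_BPS = 480e6  # USB 2.0 High Speed nominal signaling rate
USB3_BPS = 5e9    # USB 3.0 SuperSpeed nominal signaling rate

usb2_ms = transfer_ms(frame_bytes, USB2_BPS)
usb3_ms = transfer_ms(frame_bytes, USB3_BPS)
print(f"USB 2.0: at least {usb2_ms:.2f} ms per frame")
print(f"USB 3.0: at least {usb3_ms:.2f} ms per frame")
```

Even under these idealized assumptions, moving a frame over USB 2.0 takes about ten times as long as over USB 3.0. For an accelerator that finishes an inference in a few milliseconds, the transfer alone can become a significant share of the time per frame.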