ST is launching today the VD55G4 and VD65G4 800 x 700 CMOS image sensors with new statistical output capabilities, enabling developers to more easily compute certain image characteristics or more quickly adjust the sensor’s settings for their applications. The sensors also carry over the automatic white balance, automatic exposure, and automatic wake-up features of the previous generation, while adding optimizations that reduce power consumption to only 0.9 mW. Moreover, they support I3C at up to 12 MHz and SPI at up to 42 MHz. These new interfaces enabled ST to demonstrate a new face detection demo running on an STM32U3, which supports two I3C buses.
Underestimating the role of the CMOS image sensor

It’s about more than the eye can see
While most engineers understand that computer vision applications remain challenging, too few realize how critical a role the CMOS sensor itself plays in the viability of these systems. Many focus on the software framework, the MCU’s computational throughput, or the underlying neural networks. And when teams do examine the sensor itself, most concentrate on a few headline specifications, such as the image’s resolution. However, the reality is that the amount of data processed by the sensor is far greater than the image our eyes perceive. In fact, one issue in our industry today is that too few developers know how to use the statistical analysis from an advanced image sensor.
It’s about more than bandwidth can transport
The industry is experiencing a transitional period. As resolution increases, previous interfaces just can’t handle the amount of information traveling from the sensor to the MCU. A typical data rate for an I2C interface is about 3.4 Mb/s in high-speed mode, which is about one tenth of what’s available over a MIPI CSI-2 over I3C interface, although both use the same number of wires (i.e., two wires, SDA and SCL). This gap matters even more as developers require additional information from the sensor, such as more advanced statistical analysis. In such an instance, being able to assign various data streams to different channels can vastly optimize operations. However, this is impossible without support at the sensor level.
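The tenfold gap is easy to make concrete with a little arithmetic. The sketch below uses illustrative raw bit rates and ignores protocol overhead; it simply computes how long a single 800 x 700 frame at 8 bits per pixel would take to move over each link:

```c
#include <assert.h>

/* Seconds needed to move one payload over a serial link at a given raw
 * bit rate. Figures are illustrative; real-world throughput is lower
 * because of protocol overhead (addressing, ACKs, framing). */
double transfer_time_s(double payload_bits, double rate_bps)
{
    return payload_bits / rate_bps;
}
```

With a 4.48 Mbit frame (800 x 700 x 8 bits), `transfer_time_s` gives roughly 1.3 s per frame at I2C’s 3.4 Mb/s high-speed rate, versus roughly 0.13 s at a tenfold-faster I3C-class rate, which is why higher-bandwidth interfaces become unavoidable as resolution grows.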
It’s about more than capturing light
Similarly, global shutter sensors are not only about the image they capture but also about how they capture it. For instance, it is common to stream at a low frame rate, such as one frame per second, to reduce power consumption and lower the application’s computational requirements. However, if the sensor takes too long to analyze the frame or perform its statistical analysis, it could miss a frame or produce a suboptimal one that differs significantly from the next. Concretely, that means computer vision applications may suffer errors or perform poorly. Consequently, developers may decide to rely on the host MCU to compensate for those shortcomings, but inflate their power consumption in the process.
Relying on the features of the VD55G4 and VD65G4

Statistics
The VD55G4 and VD65G4 are the first VDx5Gx CMOS sensors to make their statistical analysis available to developers, enabling them to use information previously kept within the device. This is also why we had to come up with new part numbers. We needed to create new devices capable of sending this data without negatively impacting performance or boot time. It was also an opportunity to create an RGB version of our sensor, as the previous generation, the VD55G1, only had a monochrome variant. And while monochrome remains a popular choice for computer vision applications, the popularity of the RGB version shows that many AI systems now rely on color and therefore require the VD65G4.
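To illustrate what host code might do with statistics that were previously kept inside the device, here is a hypothetical sketch. The zone count and the idea of a per-zone mean luma value are assumptions invented for the example, not the actual VD55G4/VD65G4 register map:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical statistics layout: assume the sensor reports one mean
 * luma value (0-255) per zone of a grid. STAT_ZONES is an assumption
 * for illustration only. */
#define STAT_ZONES 16

/* Collapse the per-zone means into a single scene-level brightness
 * figure the application can act on (e.g., to retune exposure). */
uint32_t scene_mean_luma(const uint8_t zone_mean[STAT_ZONES])
{
    uint32_t sum = 0;
    for (int i = 0; i < STAT_ZONES; i++)
        sum += zone_mean[i];
    return sum / STAT_ZONES;
}
```

The point is that such figures arrive as a handful of bytes rather than a full frame, so the host can react to scene changes without ever processing the image itself.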
Interfaces
This new generation of VDx5Gx devices is also the first one to offer an SPI interface capable of streaming 800 x 700 RGB images at eight frames per second. Specifically, it enables designers to use cost-effective microcontrollers for face detection applications, thereby making them far more accessible. Additionally, the new sensors use an independent I2C interface to output their statistical analysis. As a result, developers can enjoy a high-bandwidth communication channel for their image and a separate one for the data that could help them tweak their application. Systems with tighter power constraints can also use an I3C link for both the image and parameter tweaks. Put simply, the new generation offers a lot more flexibility.
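A quick back-of-the-envelope check shows why a 42 MHz SPI clock is sufficient for that stream. The sketch below is illustrative only: it assumes 8 bits per pixel of raw data and ignores protocol overhead:

```c
#include <assert.h>

/* Returns 1 if a raw video stream fits within a link's bit rate,
 * 0 otherwise. Illustrative: assumes the stated bits per pixel and
 * no protocol overhead. */
int stream_fits(unsigned width, unsigned height, unsigned bits_per_pixel,
                unsigned fps, double link_bps)
{
    double needed_bps = (double)width * height * bits_per_pixel * fps;
    return needed_bps <= link_bps;
}
```

Under these assumptions, 800 x 700 at 8 bits per pixel and eight frames per second needs about 35.8 Mb/s, which fits under a 42 Mb/s SPI link, while ten frames per second (about 44.8 Mb/s) would not.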
Auto exposure and white balance
The new devices build on the foundations laid out by the VD55G1. We find the same 40-nm bottom-layer 3D-stacking architecture, which enables greater data-processing capabilities on the sensor without sacrificing the light sensitivity of the photosites on the top layer. These global shutter sensors also come with automatic white balance, automatic exposure, and automatic wake-up. Automatic white balance and automatic exposure use sensor statistics to adjust the image before rendering the first frame. The sensor uses the results of multiple exposures to determine the most accurate brightness and white balance. Because everything happens on the device, the sensor only needs to send the correct image to the host MCU, thereby significantly reducing the data transmitted.
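ST does not publish the on-chip algorithm, but the classic gray-world method gives a feel for how channel statistics can drive white balance: scale the red and blue channels so their averages match the green average. This is a textbook technique shown for illustration, not the sensor’s actual implementation:

```c
#include <assert.h>

/* Gray-world white balance sketch (illustrative, not the device's
 * algorithm): derive per-channel gains so that the average red and
 * blue levels match the average green level. */
void grayworld_gains(double mean_r, double mean_g, double mean_b,
                     double *gain_r, double *gain_b)
{
    *gain_r = mean_g / mean_r;  /* boost or cut red toward green */
    *gain_b = mean_g / mean_b;  /* boost or cut blue toward green */
}
```

For instance, channel means of 100 (R), 120 (G), and 80 (B) yield gains of 1.2 for red and 1.5 for blue. Because the correction relies only on three averages, it is exactly the kind of computation that per-frame statistics make cheap to run on the sensor itself.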
Auto wake-up
As for auto wake-up, the sensors can detect motion in a single frame, thanks in part to information from the automatic exposure feature and the ability to segment a frame into zones, thereby enabling low-power, latency-free motion sensing. The feature also runs entirely on the sensor, meaning that it doesn’t require the assistance of the host MCU. This is possible thanks to an always-on differential mode that can track changes between frames and immediately raise an interrupt to wake the host MCU. That means that it’s possible to have an automatic wake-up system that only requires between 0.9 mW and 1.5 mW.
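The differential idea can be sketched in a few lines. The example below is a hypothetical illustration of zone-based frame differencing: the zone count and threshold are invented for the example and do not reflect the device’s internal parameters:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Differential wake-up sketch (illustrative): compare per-zone
 * brightness between two frames and report motion when any zone
 * changes by more than a threshold. WAKE_ZONES is an assumption. */
#define WAKE_ZONES 16

int motion_detected(const uint8_t prev[WAKE_ZONES],
                    const uint8_t curr[WAKE_ZONES], uint8_t threshold)
{
    for (int i = 0; i < WAKE_ZONES; i++)
        if (abs((int)curr[i] - (int)prev[i]) > threshold)
            return 1;  /* on the sensor, this is where the interrupt fires */
    return 0;
}
```

Comparing a handful of zone values is far cheaper than comparing full frames, which is what makes an always-on mode in the 0.9 mW to 1.5 mW range plausible.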
How to get started?
The best way to get started is to grab the STEVAL-CAM-M0I evaluation board or the STEVAL-EVK-U0I USB kit, along with one of the camera modules. VD65G4 modules with standard, wide, and ultra-wide fields of view will be available in a few weeks. VD55G4 modules with wide and standard fields of view will be available later this year. Both modules will also have dedicated software packages with drivers and implementation examples.
