Camera data formats - An Introduction
Updated: Dec 13, 2020
As we know, cameras broadly come in two types: monochrome and color.
Let's start with monochrome. Monochrome sensors collect only luminance data; they are not capable of capturing color information. Thus, the video formats from monochrome cameras are represented in terms of Y. When we look at the specifications of monochrome cameras, we come across formats such as Y8, Y16, etc. What do they mean?
Every sensor has a specification called bit depth. It indicates the number of bits used to represent the digitized value of the electric charge stored in a pixel after exposure. We find sensors with bit depths of 8 bits, 10 bits and 12 bits. A 12-bit sensor resolves the signal into the finest steps, so it has the lowest quantization noise and captures the most tonal detail. However, an 8-bit sensor can support faster frame rates, as each frame carries less data.
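The relationship between bit depth and tonal resolution is simple arithmetic: each extra bit doubles the number of distinct gray levels a pixel can represent. A quick sketch:

```python
# Distinct gray levels for the common sensor bit depths mentioned above.
for bits in (8, 10, 12):
    levels = 2 ** bits
    print(f"{bits}-bit sensor: {levels} gray levels")
# 8-bit sensor: 256 gray levels
# 10-bit sensor: 1024 gray levels
# 12-bit sensor: 4096 gray levels
```

So a 12-bit sensor distinguishes 16 times as many intensity steps as an 8-bit one, at the cost of more data per frame.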
When we work with monochrome cameras, the video format follows from the bit depth of the sensor. With cameras based on 8-bit sensors, the video output we come across is Y8, where each pixel is represented by 8 bits. However, when we are working with 10-bit or 12-bit sensors, we come across video outputs such as Y8 or Y16. In the case of Y8, each pixel is truncated to 8 bits and the least significant bits are discarded. In the case of Y16, dead (padding) bits are added on the LSB side to extend the representation to 16 bits. Host applications understand video only in whole bytes, so the data has to be accommodated in either 8 bits or 16 bits.
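The truncation and padding described above can be sketched with bit shifts. This is a minimal illustration for a 12-bit sample; actual cameras may place the sample and padding bits differently, so treat the exact bit positions as an assumption:

```python
def to_y8(sample12):
    """Truncate a 12-bit luminance sample to Y8: drop the 4 LSBs."""
    return (sample12 >> 4) & 0xFF

def to_y16(sample12):
    """Extend a 12-bit luminance sample to Y16: pad 4 zero bits on the LSB side."""
    return (sample12 << 4) & 0xFFFF

print(hex(to_y8(0xABC)))   # 0xab  -> lowest 4 bits of information lost
print(hex(to_y16(0xABC)))  # 0xabc0 -> all 12 bits preserved, 4 dead LSBs
```

Note that Y8 loses the 4 least significant bits, while Y16 preserves the full 12-bit value at the cost of transmitting dead bits.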
Let's get into color cameras. As we know, color cameras are designed using RAW Bayer sensors. The RAW Bayer output is passed through a pipeline of ISP functions to produce color video or image output. The color video formats we come across most often are YUV422, RGB888 and RGB565. What do they mean?
YUV is a color encoding scheme that assigns brightness and color values to each pixel. 'Y' represents the brightness (luma) value and 'UV' represents the color (chroma) values. RGB, on the other hand, is a color encoding scheme in which each pixel is represented by red, green and blue values.
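To make the Y/UV split concrete, here is a sketch of one common RGB-to-YUV conversion, using approximate full-range BT.601 coefficients (the exact coefficients vary by standard, so this is illustrative rather than the conversion any particular camera uses):

```python
def rgb_to_yuv(r, g, b):
    """Approximate full-range BT.601 RGB -> YUV conversion (8-bit values)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b          # luma: weighted brightness
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128   # chroma: blue difference
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128    # chroma: red difference
    return round(y), round(u), round(v)

print(rgb_to_yuv(255, 255, 255))  # white -> (255, 128, 128): full luma, neutral chroma
print(rgb_to_yuv(0, 0, 0))        # black -> (0, 128, 128): zero luma, neutral chroma
```

Notice that for any gray pixel the chroma values sit at the neutral midpoint of 128, which is why monochrome content carries all its information in Y alone.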
YUV444: Each pixel is represented by 24 bits, with unique Y, U and V values of 1 byte each. In the data stream, the bytes are ordered so that we can identify the values corresponding to each pixel.
YUV422: Each pixel is represented by 16 bits. The smaller data size enables faster frame rates. The Y value is unique for each pixel and is represented by 1 byte, but the U and V values are shared between 2 pixels. Thus, the host buffers 2 pixels' worth of data before decoding each pixel.
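The 2-pixel sharing can be sketched as follows, assuming the common packed YUYV byte ordering (other orderings such as UYVY exist, so the layout here is an assumption):

```python
def decode_yuyv_pair(y0, u, y1, v):
    """Expand one packed YUYV macropixel (4 bytes, 2 pixels) into two full
    YUV pixels. Both pixels share the same U and V chroma values."""
    return (y0, u, v), (y1, u, v)

pair = decode_yuyv_pair(0x10, 0x80, 0x20, 0x80)
print(pair)  # ((16, 128, 128), (32, 128, 128))
```

Four bytes yield two pixels, giving the average of 16 bits per pixel, which is why the decoder must see both pixels of the pair before either can be fully reconstructed.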
YUV411: Here too, the smaller data size enables faster frame rates. The Y value is unique for each pixel and is represented by 1 byte, but the U and V values are shared between 4 pixels, bringing the average down to 12 bits per pixel. Thus, the host buffers 4 pixels' worth of data before decoding each pixel.
Planar YUV: In this format, the U and V values are grouped together. Unlike the packed formats above, where Y, U and V values are interleaved in the stream, here we get all the Y values first, followed by the U and V planes. This makes the format more compressible.
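A quick way to see the planar layout is to compute where each plane starts in the frame buffer. This sketch assumes a planar 4:2:2 arrangement (full-resolution Y plane, then half-width U and V planes); other planar variants use different subsampling:

```python
def planar_422_offsets(width, height):
    """Byte offsets of the Y, U and V planes in an assumed planar 4:2:2
    buffer: full-res Y first, then half-width U, then half-width V."""
    y_size = width * height
    uv_size = (width // 2) * height  # U and V are half horizontal resolution
    return 0, y_size, y_size + uv_size

print(planar_422_offsets(640, 480))  # (0, 307200, 460800)
```

Because each plane is a contiguous run of similar values, compressors and ISPs can process the luma and chroma planes independently.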
RGB888: As the name indicates, each pixel is represented by 24 bits: 8 bits each for the R, G and B values. This format carries the most information among the RGB formats.
RGB565: Each pixel is represented by 16 bits: 5 bits for R, 6 bits for G and 5 bits for B. It is relatively common among RGB formats because it uses fewer bits per pixel, allowing higher frame rates.
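Converting RGB888 to RGB565 is a matter of truncating each channel and packing the results into one 16-bit word. A minimal sketch:

```python
def rgb888_to_rgb565(r, g, b):
    """Pack 8-bit R, G, B channels into a 16-bit RGB565 word by
    truncating LSBs: 5 bits R, 6 bits G, 5 bits B."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

print(hex(rgb888_to_rgb565(255, 255, 255)))  # 0xffff (white)
print(hex(rgb888_to_rgb565(0, 0, 0)))        # 0x0 (black)
```

Green keeps the extra sixth bit because the human eye is most sensitive to green, so the loss of precision is least visible there.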
At Regami Solutions, we can assist you with the right recommendations for your use case and help you with the right imaging solutions.
Please feel free to get in touch with us in case of any queries.