H.264
1. Version
2010.10.21 v1.0 gang.wang, initial
2010.10.29 v1.1 gang.wang, add H.264 rate control
Add more pictures to describe H.264
2. Target audience
RD or GES.
3. Purpose
Build a knowledge base for H.264 so that others can understand it quickly.
4. Terms
N/A
5. Introduction
Currently we focus on two basic topics:
1) H.264 video file format (byte-stream based & packet-based)
2) What different bit rates mean for H.264 streaming [TBD]
6. H.264 video file format (concept)
Generally, H.264 video is encapsulated in an MPEG-2 transport stream (TS, defined in MPEG-2 Part 1). TS is a standard format for the transmission and storage of audio, video, and data, and is used in broadcast systems such as DVB and ATSC.
TS is not our focus here; we focus on the H.264 video stream itself.
A video stream is composed of frames, and a frame is composed of pixels. Both are described in detail below.
6.1 How to represent a pixel
A frame is composed of pixels, so we must first know how to represent a pixel. Two representations are commonly mentioned: YUV and YCbCr.
YUV is used for a specific analog encoding of color information in television systems.
YCbCr is used for digital encoding of color information in video and image compression and transmission, such as MPEG and JPEG, and is the most common way to express color in digital video/image systems. So we will focus on it as follows:
1) What is YCbCr
2) YCbCr subsample story
3) YCbCr subsample pattern notation
What is YCbCr
For each pixel, there are three numerical values that collectively describe its color. They are identified as Y, Cb, and Cr.
1) Y is the luma value, describing its luminance (brightness)
2) Cb and Cr collectively form the chrominance value, describing its color.
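As a minimal sketch of how the three values relate to an RGB pixel, the conversion below uses the full-range BT.601 coefficients found in JFIF/JPEG. This choice is an assumption: the text above does not say which matrix or range is meant, and H.264 streams often use a limited ("video") range instead. The function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr.

    Assumption: full-range BT.601 coefficients (as used by JFIF).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b               # luma (brightness)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b    # blue-difference chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b    # red-difference chroma

    def clamp(v):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cb), clamp(cr)


if __name__ == "__main__":
    for rgb in [(0, 0, 0), (255, 255, 255), (100, 100, 100)]:
        print(rgb, "->", rgb_to_ycbcr(*rgb))
```

For any gray pixel (R = G = B) both chroma values sit at the midpoint 128, which shows that all brightness information is carried by Y alone.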
YCbCr subsample story (Chrominance subsampling)
Why can we do chrominance subsampling, and why do we need it? In short, it removes visual redundancy and reduces bandwidth:
During the early work on color television systems (analog, of course), it was noted that the human eye can discern finer detail conveyed by differences in luminance than detail conveyed by differences in chromaticity.
The encoding scheme adopted there separately conveys the luminance-related value (luma) and the chromaticity-related value (chroma, or chrominance) over "subchannels" having different bandwidths (and thus supporting different levels of resolution), with the chrominance subchannel having reduced resolution.
The result was a system that matched human perceptual response well, allowing quality images to be conveyed with less overall bandwidth than if equal bandwidth were allocated to luma and chrominance information.
YCbCr subsampling pattern notation
Subsampling patterns are written in a form such as 4:2:2. The following describes the common patterns; we do not explain every pattern, only some of them.
4:4:4 --> no chroma subsampling; each pixel has Y, Cb and Cr values.
4:2:2 --> Cb and Cr are subsampled horizontally by a factor of 2.
4:1:1 --> Cb and Cr are subsampled horizontally by a factor of 4.
4:2:0 --> Cb and Cr are subsampled both horizontally and vertically by a factor of 2.
4:1:1 and 4:2:0 are mostly used in JPEG and MPEG.
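To make the bandwidth saving concrete, the sketch below computes the raw size of one frame for each pattern listed above. It assumes 8-bit samples, planar storage, and frame dimensions that divide evenly by the subsampling factors; the names `SUBSAMPLING` and `frame_bytes` are hypothetical.

```python
# (horizontal divisor, vertical divisor) applied to each chroma plane
SUBSAMPLING = {
    "4:4:4": (1, 1),  # no chroma subsampling
    "4:2:2": (2, 1),  # half horizontal chroma resolution
    "4:1:1": (4, 1),  # quarter horizontal chroma resolution
    "4:2:0": (2, 2),  # half horizontal and half vertical chroma resolution
}


def frame_bytes(width, height, pattern, bytes_per_sample=1):
    """Raw size of one frame: one luma plane plus two chroma planes (Cb, Cr)."""
    h_div, v_div = SUBSAMPLING[pattern]
    luma = width * height
    chroma = (width // h_div) * (height // v_div)
    return (luma + 2 * chroma) * bytes_per_sample


if __name__ == "__main__":
    for pattern in SUBSAMPLING:
        print(pattern, frame_bytes(1920, 1080, pattern))
```

Under these assumptions 4:2:0 and 4:1:1 both need half the bytes of 4:4:4, while 4:2:2 needs two thirds.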
6.2 How to represent a Frame/Picture
In H.264, in order to code and compress video, pixels are first grouped into blocks, and blocks are grouped into macroblocks. Macroblocks are then grouped into slices, multiple slices may be grouped into a slice group, and finally one or more slice groups make up a picture. The unit order is as follows:
Pixel --> block --> macroblock --> slice --> slice group --> picture
6.2.1 Pixels
Pixels were already introduced above.
6.2.2 Block
H.264 supports many block sizes. Why? Because pictures differ in character, a single block size may not compress two different pictures equally well.
H.264 currently supports the following block sizes.
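As a rough illustration, the luma partition sizes that the H.264 standard allows for inter prediction (motion compensation) can be enumerated as below. This list comes from the standard rather than from this document, and the constant and function names are hypothetical.

```python
# Luma partition sizes (width, height) for inter prediction in H.264.
# A 16x16 macroblock can be split into one of these partition shapes:
MACROBLOCK_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]
# When 8x8 partitions are chosen, each one can be split further:
SUB_MACROBLOCK_PARTITIONS = [(8, 8), (8, 4), (4, 8), (4, 4)]


def partitions_per_macroblock(size):
    """Number of partitions of the given size that tile one 16x16 macroblock."""
    width, height = size
    return (16 // width) * (16 // height)


if __name__ == "__main__":
    for size in MACROBLOCK_PARTITIONS + SUB_MACROBLOCK_PARTITIONS[1:]:
        print(f"{size[0]}x{size[1]}: {partitions_per_macroblock(size)} per macroblock")
```

Smaller partitions follow fine detail and complex motion more closely but cost more bits to signal, which is why an encoder chooses among them per macroblock.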
6.2.3 Macroblock
Multiple blocks are grouped into a macroblock, whose size is 16x16 pixels.
- Basic syntax & processing unit
- Contains 16x16 luma samples and 2 x 8x8 chroma samples (for 4:2:0 subsampling)
- Macroblocks within a slice depend on each other
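The macroblock figures above can be checked with a little arithmetic. The sketch below counts the 16x16 macroblocks needed to cover a picture and the samples in one 4:2:0 macroblock. It assumes that partial macroblocks at the picture edge are rounded up to whole ones; the function names are hypothetical.

```python
MB_SIZE = 16  # a macroblock covers 16x16 luma samples


def macroblock_grid(width, height):
    """Return (columns, rows) of macroblocks covering the picture.

    Assumption: dimensions that are not multiples of 16 are rounded up.
    """
    cols = (width + MB_SIZE - 1) // MB_SIZE
    rows = (height + MB_SIZE - 1) // MB_SIZE
    return cols, rows


def samples_per_macroblock_420():
    """Samples in one 4:2:0 macroblock: 16x16 luma plus two 8x8 chroma blocks."""
    luma = 16 * 16
    chroma = 2 * 8 * 8
    return luma + chroma


if __name__ == "__main__":
    cols, rows = macroblock_grid(1920, 1080)
    print(cols, rows, cols * rows)
    print(samples_per_macroblock_420())
```

For a 1920x1080 picture this gives a 120 x 68 grid, because 1080 is not a multiple of 16 and the last row is only partly filled.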