in ways that are intended to be minimally perceptible. In particular, the quantization parameter QP regulates how much spatial detail is saved. When QP is very small, almost all of that detail is retained. As QP is increased, some of that detail is aggregated so that the bit rate drops, but at the price of some increase in distortion and some loss of quality. Figure 1 suggests that relationship for a particular input picture: if you want to lower the bit rate, you can do so by raising QP, at a cost of increased distortion. Figure 2 suggests that as source complexity varies during a sequence, you move from one such curve to another.
Figure 1. For a particular source frame
Figure 2. But when source complexity varies….
10.2 General Rate Control Model
Here we introduce the VBR (Variable Bit Rate) and CBR (Constant Bit Rate) rate control models.
10.2.1 VBR Rate Control Model
Figure 3 illustrates open loop (or VBR) operation of a video encoder. The user supplies two key inputs: the uncompressed video source and a value for QP. As the source sequence progresses, you will get compressed video of fairly constant quality, but the bitrate may vary dramatically. Because the complexity of pictures is continually changing in a real video sequence, it is not obvious what value of QP to pick. If you fix QP for an "easy" part of the sequence having slow motion and uniform areas, then the bit rate will go up dramatically when you reach the "hard" parts.
Figure 3. Open Loop Encoding (VBR)
10.2.2 CBR Rate Control Model
In reality, constraints imposed by decoder buffer size and network bandwidth force us to encode video at a more nearly constant bitrate. To do this, Figure 4 suggests that we must dynamically vary QP based upon estimates of the source complexity, so that each picture (or group of pictures) gets an appropriate allocation of bits to work with. Rather than specifying QP as input, the user specifies demanded bitrate instead.
10.2.3 H.264 Rate Controller Framework
Here we give a conceptual Rate Controller Framework.
User Interface:
This interface provides three parameters that affect the QP value.
1) Demanded Bitrate, which is our target bitrate, determined by the customer's requirements and the network environment
2) Buffer Capacity and initial buffer occupancy, which control the virtual buffer model's capability
Encoder Interface:
This interface provides two parameters that affect the QP value.
1) Basic Unit Residuals
2) Actual bits
Rate Controller Components
1) Rate-Quantization Model
The heart of the algorithm is a quantitative model describing Figure 2: the relationship between QP, actual bitrate and a surrogate for encoding complexity. However, the bits and complexity terms should be associated only with the residuals. Why? Because the quantization parameter QP can only influence the detail of information carried in the transformed residuals. QP has no direct effect on the bitrates associated with overhead, prediction data, or motion vectors. The Mean Absolute Difference (or MAD) of the prediction error is used for this purpose.
The model takes an algebraic form such as
ResidualBits = C1 * MAD / QP + C2 * MAD / QP^2    (2)
But it may take a simpler form (with C2 = 0) or a more complicated form involving exponentials or other basis curves for fitting. This equation corresponds to equation 2-84 of [6] and to equation 1 of [2]; our term ResidualBits is synonymous with the term Texture Bits used by other authors [2]. The free coefficients C1 and C2 may be estimated empirically, by providing hooks in the encoder for extracting the residual coefficients, as well as the number of residual bits needed to transmit them.
Having established the model in (2), we can solve for the demanded QP when the target value of ResidualBits is supplied by the Bit Allocation modules in Figure 5.
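As a rough illustration of solving the model for QP, the sketch below substitutes x = 1/QP, which turns the model into an ordinary quadratic in x. The function names, the coefficient values used in the usage note, and the treatment of QP as a continuous value clamped to the H.264 range are assumptions made for this example, not part of [2] or [6].

```python
import math

def residual_bits(qp, mad, c1, c2):
    """Evaluate the model: C1*MAD/QP + C2*MAD/QP^2."""
    return c1 * mad / qp + c2 * mad / (qp * qp)

def solve_qp(target_bits, mad, c1, c2, qp_min=1.0, qp_max=51.0):
    """Invert the model for QP, given a target number of residual bits.

    Hypothetical helper: falls back to qp_max when no positive
    solution exists, and clamps the result to [qp_min, qp_max].
    """
    if target_bits <= 0 or mad <= 0:
        return qp_max
    if c2 == 0:
        # Simpler first-order form: ResidualBits = C1*MAD/QP
        qp = c1 * mad / target_bits
    else:
        # With x = 1/QP: (C2*MAD)*x^2 + (C1*MAD)*x - target_bits = 0
        a, b = c2 * mad, c1 * mad
        disc = b * b + 4.0 * a * target_bits
        if disc < 0:
            return qp_max
        x = (-b + math.sqrt(disc)) / (2.0 * a)
        if x <= 0:
            return qp_max
        qp = 1.0 / x
    return min(max(qp, qp_min), qp_max)
```

For example, with the made-up values C1 = 1000, C2 = 4000 and MAD = 5, a target of 300 residual bits maps back to QP = 20, since the model gives 250 + 50 = 300 bits at that QP.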
2) Complexity Estimation
As indicated above, we need a simple metric that reflects the encoding complexity associated with the residuals. The MAD of the prediction error is a convenient surrogate for this purpose, that is, the average over the pixels of the absolute difference between the source and its prediction.
This MAD is an inverse measure of the predictor's accuracy and (in the case of inter prediction) the temporal similarity of adjacent pictures.
Ideally, the MAD would be estimated after encoding the current picture, but that would require us to encode the picture again after the QP is selected – quite a burden for a computationally intensive standard like H.264! Instead, we can usually assume that this complexity surrogate varies gradually from picture to picture, and estimate it based upon data extracted from the encoder for previous pictures. Note that this assumption fails at a scene change.
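The complexity estimate described above can be sketched as follows. The `mad` function computes the mean absolute difference between a source block and its prediction. `predict_mad` is a hypothetical linear predictor from the previous picture's MAD, and its coefficients `a1` and `a2` are assumptions for illustration (the defaults simply reuse the previous value).

```python
def mad(source, prediction):
    """Mean absolute difference between two equally sized 2-D pixel blocks."""
    total = 0
    count = 0
    for src_row, pred_row in zip(source, prediction):
        for s, p in zip(src_row, pred_row):
            total += abs(s - p)
            count += 1
    return total / count

def predict_mad(previous_mad, a1=1.0, a2=0.0):
    """Estimate the current picture's MAD from the previous picture's MAD.

    Assumes complexity varies gradually between pictures, which
    does not hold at a scene change.
    """
    return a1 * previous_mad + a2
```

In a real encoder the coefficients would be refitted as new pictures are coded, so that the estimate tracks slow drifts in complexity.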
3) QP-Limiter
Figures 4 and 5 represent a closed loop control system which must be appropriately damped to guarantee stability and to minimize perceptible variations in quality. For difficult sequences having rapid changes in complexity, QP-demand may oscillate noticeably, so a rate limiter is applied which typically limits changes in QP to no more than ±2 units between pictures.
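A minimal sketch of such a limiter is shown below. The function name and the default bounds are assumptions for this example: the ±2 step comes from the text above, and 0 to 51 is the QP range of 8-bit H.264 video.

```python
def limit_qp(demanded_qp, previous_qp, max_delta=2, qp_min=0, qp_max=51):
    """Clamp the demanded QP to within max_delta of the previous picture's QP."""
    lowest = max(qp_min, previous_qp - max_delta)
    highest = min(qp_max, previous_qp + max_delta)
    return min(max(demanded_qp, lowest), highest)
```

If the previous picture used QP 28 and the model now demands 35, the limiter returns 30, so the remaining change is spread over the following pictures.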
Virtual Buffer Model
Any compliant decoder is equipped with a buffer to smooth out variations in the rate and arrival time of incoming data. The corresponding encoder must produce a bitstream that satisfies constraints of the decoder, so a virtual buffer model is used to simulate the fullness of the real decoder buffer.
The change in fullness of the virtual buffer is the total number of bits encoded into the stream, less a constant removal rate assumed to equal the bandwidth (or demanded bitrate). The buffer fullness is bounded by zero from below and by the buffer capacity from above. The user must specify appropriate values for buffer capacity and initial buffer fullness, consistent with the decoder levels supported.
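The buffer update just described can be sketched as a small class. The class and attribute names are hypothetical, and the sketch assumes the buffer is updated once per picture, so the constant removal per update is the demanded bitrate divided by the frame rate.

```python
class VirtualBuffer:
    """Toy virtual buffer: fills with encoded bits, drains at a constant rate."""

    def __init__(self, capacity_bits, initial_fullness_bits, bitrate, frame_rate):
        self.capacity = capacity_bits
        self.fullness = initial_fullness_bits
        # Bits removed per picture interval at the demanded bitrate.
        self.drain_per_picture = bitrate / frame_rate

    def update(self, encoded_bits):
        """Add one picture's bits, remove one interval's drain, then clamp."""
        fullness = self.fullness + encoded_bits - self.drain_per_picture
        # Fullness is bounded by zero from below and capacity from above.
        self.fullness = min(max(fullness, 0.0), self.capacity)
        return self.fullness
```

With a 1 Mbit buffer starting at 200,000 bits, a demanded bitrate of 1 Mbit/s and 25 pictures per second, a 60,000-bit picture raises the fullness to 220,000 bits, because 40,000 bits drain in the same interval. A real controller would also treat hitting either bound as a signal to adjust QP rather than silently clamping.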
4) QP Initializer
QP must be initialized at the start of a video sequence. An initial value may be input manually, but a better approach is to estimate it from the demanded bits per pixel, i.e.,
DemandedBitsPerPixel = DemandedBitrate / (FrameRate * height * width)
Equation 2-67 of [6] provides a recommended table relating initial QP to DemandedBitsPerPixel.
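The calculation can be sketched as below. The lookup is driven by a table passed in by the caller, and the thresholds and QP values in `EXAMPLE_TABLE` are illustrative placeholders only. Consult [6] for the recommended table.

```python
def demanded_bits_per_pixel(demanded_bitrate, frame_rate, height, width):
    """DemandedBitrate / (FrameRate * height * width)."""
    return demanded_bitrate / (frame_rate * height * width)

def initial_qp(bits_per_pixel, table):
    """Return the QP of the first table row whose threshold covers the value.

    `table` is a list of (upper_threshold, qp) pairs sorted by threshold;
    the last row is used as a catch-all.
    """
    for upper_threshold, qp in table:
        if bits_per_pixel <= upper_threshold:
            return qp
    return table[-1][1]

# Placeholder values for illustration; not a substitute for the table in [6].
EXAMPLE_TABLE = [(0.1, 35), (0.3, 25), (0.6, 20), (float("inf"), 10)]
```

For a 352 x 288 picture at 30 frames per second and 384,000 bits per second, the demand is about 0.126 bits per pixel, which falls in the second row of the placeholder table.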
5) GOP Bit Allocation
Based upon the demanded bit rate and the current fullness of the virtual buffer, a target bit rate for the entire group of pictures (GOP) is determined, and QP for the GOP's I-picture and first P-picture is also determined.
The GOP Target is fed into the next block for detailed bit allocation to pictures or to smaller basic units.
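One plausible form of the GOP target, given here only as a sketch, is the nominal bit budget for the GOP at the demanded bitrate, corrected by how far the virtual buffer currently sits from a reference level. The function name, the `reference_fullness` parameter and the exact correction are assumptions for this example. See [6] for the recommended formulas.

```python
def gop_target_bits(demanded_bitrate, frame_rate, pictures_in_gop,
                    buffer_fullness, reference_fullness=0.0):
    """Bit budget for one GOP, reduced when the buffer is above its reference level."""
    nominal = demanded_bitrate / frame_rate * pictures_in_gop
    correction = buffer_fullness - reference_fullness
    # Never hand a negative budget to the next allocation stage.
    return max(nominal - correction, 0.0)
```

At 1 Mbit/s and 25 pictures per second, a 12-picture GOP nominally gets 480,000 bits. If the buffer is 30,000 bits above its reference level, the target drops to 450,000 bits.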
6) Basic Unit Bit Allocation
Bit allocation here follows the "basic unit" rate control recommendations [6]. With this approach, scalable rate control may be pursued to different levels of granularity, such as picture, slice, macroblock row or any contiguous set of macroblocks. That level is referred to as a "basic unit", at which rate control is resolved, and for which distinct values of QP are calculated.
If the basic unit is smaller than a picture, then this block in Figure 5 actually breaks out into two layers – one for the picture itself and another for the basic unit. Figure 5 and our discussion are limited to the case where the picture itself is the basic unit. For details on how to treat smaller basic units, please see [5] or [6].
For H.264, the emphasis is on computing QP for each stored picture (usually a P-picture). (Strictly speaking, the H.264 standard allows B-pictures to be used as reference pictures. However, this is not expected to be common usage.) The QPs for non-stored pictures (ordinarily B-pictures) are then interpolated (and offset) from the QP values of their neighboring P-pictures. First, considering the MAD of the picture, one can determine a target level for the buffer fullness. Then, using the buffer target level, it is easy to calculate the target bits for the picture.
11. Diagram from ebook for a good understanding of H.264
One-way Video coding scenarios