H.264_understanding_kb_v1.1(5)

2018-12-20 22:47

8. H.264 Profile, Level and Encoder

8.1 Profile

A profile is usually a set of algorithmic features.

8.2 Level

A level is usually a degree of capability.

8.3 Software Encoder

9. What different video streaming bitrates mean

After some research, this turns out to be a complicated topic. Bitrate is closely related to video quality: for a given video and encoder, a lower bitrate generally means lower video quality.

For how a specified bitrate is achieved for a given video, see section 10, H.264 Rate Control.

10. H.264 Rate Control

Rate control is a complicated topic. Every H.264 transcoding chipset vendor may implement its own algorithm, which is usually proprietary IP, so only a general description collected from the internet is given here.

10.0 Terminology

Prediction. Both H.264 and MPEG-* may predict a macroblock by traditional inter (temporal) prediction, i.e., motion estimation from previous reference pictures followed by transmission of the motion vector. Additionally, H.264 supports advanced intra (spatial) prediction of a macroblock from the values of neighboring pixels that have already been encoded (e.g., in raster-scan order).

Residual. The difference between the source and prediction signals is called the residual, or the prediction error. A spatial transform is then applied to the residual to produce transformed coefficients that carry any spatial detail that is not captured in the prediction itself or its reference pictures.

Distortion. Distortion refers to the difference between the original source image $x$ and the reconstructed image $y$ after it has been decoded. In H.264, the sum of squared differences is used to quantify distortion as $\frac{1}{N}\sum_{i=1}^{N} |y_i - x_i|^2$, for any set of $N$ pixels.
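The distortion measure can be sketched numerically. This is a minimal illustration, assuming pixels are given as flat lists of integer sample values; the function name `mean_squared_difference` is hypothetical and not taken from any encoder.

```python
def mean_squared_difference(source, reconstructed):
    """Return (1/N) * sum(|y_i - x_i|^2) over N co-located pixels."""
    if len(source) != len(reconstructed) or not source:
        raise ValueError("need two non-empty pixel lists of equal length")
    total = sum((y - x) ** 2 for x, y in zip(source, reconstructed))
    return total / len(source)


# Four source pixels and their decoded counterparts (made-up values).
source = [100, 102, 98, 101]
reconstructed = [101, 100, 98, 103]
distortion = mean_squared_difference(source, reconstructed)
print(distortion)
```

A perfect reconstruction gives a distortion of zero; coarser quantization generally pushes the value up.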

Complexity. As the saying goes, I can't define complexity, but I know it when I see it! A single source picture is complex if it is spatially busy; spatial activity is synonymous with source complexity in this case. For a video sequence, however, the meaning of complexity is, well, more complex! For example, if a video sequence consists of one busy object that translates slowly across the field of view, it may not require very many bits, because temporal prediction can easily capture the motion using a single reference picture and a series of motion vectors. It is difficult to define an inclusive video complexity metric that is also easy to calculate. See MAD below.

MAD: Mean Absolute Difference of Prediction Error. For rate control, what is more important is the encoding complexity of the residuals that are left over after the inter or intra prediction process is finished. The Mean Absolute Difference of Prediction Error is usually closely related to encoding complexity. Suppose $x_i$ is the source value for the $i$th pixel and $\hat{x}_i$ is its predicted value; then, over $N$ pixels: $\mathrm{MAD} = \frac{1}{N}\sum_{i=1}^{N} |x_i - \hat{x}_i|$.
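As a rough illustration, MAD is the average absolute gap between each source pixel and its predicted value. The sketch below assumes both are available as flat lists; the function name and sample values are made up for the example.

```python
def mean_absolute_difference(source, predicted):
    """Return (1/N) * sum(|x_i - p_i|), the mean absolute prediction error."""
    if len(source) != len(predicted) or not source:
        raise ValueError("need two non-empty pixel lists of equal length")
    return sum(abs(x - p) for x, p in zip(source, predicted)) / len(source)


# Source pixels and the values an inter or intra predictor produced for them.
source_block = [52, 55, 61, 59]
predicted_block = [50, 56, 60, 63]
mad = mean_absolute_difference(source_block, predicted_block)
print(mad)
```

A small MAD means the prediction already explains most of the block, so few bits are needed for the residual.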

Spatial Activity. This term is used to quantify the amount of spatial variation within a part of the picture, normally a block of $N$ pixels. Suppose the $N$ pixel values are $x_i$, $i = 1, \ldots, N$. Then the activity for those $N$ pixels is $\frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2$, where $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$. In other words, the spatial activity is the sample variance of a block's values. It is the measure for local complexity used in MPEG-2.
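Spatial activity is just the variance of a block's pixel values, which a few lines can show. This is a sketch with hypothetical names, comparing a flat block against a strongly textured one.

```python
def spatial_activity(block):
    """Return the variance of the pixel values: (1/N) * sum((x_i - mean)^2)."""
    if not block:
        raise ValueError("block must contain at least one pixel")
    n = len(block)
    mean = sum(block) / n
    return sum((x - mean) ** 2 for x in block) / n


flat_block = [128] * 16          # uniform gray: no spatial variation
textured_block = [0, 255] * 8    # alternating black and white pixels

print(spatial_activity(flat_block))
print(spatial_activity(textured_block))
```

The flat block has zero activity, while the alternating block has a very large one, matching the intuition that "busy" regions are harder to encode.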

Bitrate. Bitrate refers to the bits per second consumed by a sequence of pictures, i.e., bitrate = (average bits per picture) × (frames per second). In practice, it is equated to the reliable network bandwidth that is provisioned or available for the stream.
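Because bitrate is measured in bits per second, it is the product of the average picture size and the frame rate; equivalently, a target bitrate divided by the frame rate gives the average bit budget per picture. A small worked example, with illustrative function names and numbers:

```python
def bitrate_bps(avg_bits_per_picture, frames_per_second):
    """Bits per second = (average bits per picture) * (pictures per second)."""
    return avg_bits_per_picture * frames_per_second


def avg_bits_per_picture(target_bitrate_bps, frames_per_second):
    """Average per-picture bit budget implied by a target bitrate."""
    return target_bitrate_bps / frames_per_second


# A 2 Mbit/s stream at 25 frames per second leaves 80,000 bits per picture
# on average.
budget = avg_bits_per_picture(2_000_000, 25)
print(budget)
print(bitrate_bps(80_000, 25))
```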

Quantization Parameter (QP). Residuals are transformed into the spatial frequency domain by an integer transform that approximates the familiar Discrete Cosine Transform (DCT). The Quantization Parameter determines the step size for associating the transformed coefficients with a finite set of steps. Large values of QP represent big steps that crudely approximate the spatial transform, so that most of the signal can be captured by only a few coefficients. Small values of QP more accurately approximate the block's spatial frequency spectrum, but at the cost of more bits. In H.264, each unit increase of QP lengthens the step size by 12% and reduces the bitrate by roughly 12%.
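The 12% figure follows from the fact that the H.264 quantizer step size doubles for every increase of 6 in QP, so a single QP step scales it by $2^{1/6} \approx 1.12$. The sketch below uses the approximation Qstep ≈ 0.625 · 2^(QP/6); the standard tabulates slightly rounded values, and the function name is only illustrative.

```python
def approx_qstep(qp):
    """Approximate H.264 quantizer step size for QP in 0..51.

    Assumption: Qstep is modeled as 0.625 * 2**(qp / 6), which doubles every
    6 QP; the real table uses slightly rounded values.
    """
    if not 0 <= qp <= 51:
        raise ValueError("H.264 QP must be in the range 0..51")
    return 0.625 * 2 ** (qp / 6)


per_qp_ratio = approx_qstep(27) / approx_qstep(26)   # about 1.12
six_qp_ratio = approx_qstep(32) / approx_qstep(26)   # about 2.0
print(round(per_qp_ratio, 4), round(six_qp_ratio, 4))
```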

Group of Pictures (GOP). The Group of Pictures concept is inherited from MPEG and refers to an I picture followed by all the P and B pictures until the next I picture. A typical MPEG GOP structure might be IBBPBBPBBI. Although H.264 does not strictly require more than one I picture per video sequence, the recommended rate control approach does require a repeating GOP structure to be effective. Thus, H.264 rate control will not work properly if the IntraPeriod parameter is set to 0.
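To make the GOP pattern concrete, the following sketch builds the display-order picture types for one GOP. The parameters `intra_period` and `b_frames` are hypothetical names chosen for this illustration; they only loosely mirror encoder settings such as IntraPeriod and are not the reference encoder's exact semantics.

```python
def gop_pattern(intra_period, b_frames):
    """Return display-order picture types for one GOP, e.g. 'IBBPBBPBB'.

    intra_period: number of pictures from one I picture to the next.
    b_frames: number of B pictures between consecutive I/P pictures.
    """
    if intra_period < 1 or b_frames < 0:
        raise ValueError("intra_period must be >= 1 and b_frames >= 0")
    types = []
    for i in range(intra_period):
        if i == 0:
            types.append("I")
        elif i % (b_frames + 1) == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


# One GOP of nine pictures plus the I picture that opens the next GOP.
pattern = gop_pattern(9, 2) + "I"
print(pattern)
```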

Basic unit. The authors of references [4] and [5] introduced this useful term that expresses the granularity on which QP is adjusted in the feedback control loop. If the basic unit is a picture, then the rate controller's adjustments to QP are uniform across the picture. In MPEG-2, the basic unit is a macroblock. Initially, most H.264 applications will probably use the picture as basic unit, but ultimately a full or partial row of macroblocks is expected to yield the best compromise between uniform bitrate and uniform quality.

10.1 Introduction

A rate control algorithm dynamically adjusts encoder parameters to achieve a target bitrate. It allocates a budget of bits to each group of pictures, individual picture and/or sub-picture (macroblock) in a video sequence. Rate control is not a part of the H.264 standard, but the standards group has issued non-normative guidance to aid in implementation.
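The feedback idea can be illustrated with a deliberately simplified loop. This is not the non-normative algorithm from the standards group, nor any vendor's method: the "encoder" is a toy model in which the bits produced equal a complexity value divided by an approximate quantizer step size, the basic unit is a whole picture, and all names are hypothetical.

```python
def toy_encode(complexity, qp):
    """Stand-in for an encoder: bits produced fall as the step size grows."""
    qstep = 0.625 * 2 ** (qp / 6)   # approximate H.264 step size (assumption)
    return complexity / qstep


def run_rate_control(complexities, target_bitrate, fps, qp=26):
    """Nudge QP by one after each picture to track the per-picture budget."""
    budget = target_bitrate / fps    # average bits allowed per picture
    history = []
    for complexity in complexities:
        bits = toy_encode(complexity, qp)
        history.append((qp, bits))
        if bits > budget * 1.1:      # overspent: quantize more coarsely
            qp = min(51, qp + 1)
        elif bits < budget * 0.9:    # underspent: quantize more finely
            qp = max(0, qp - 1)
    return history


# 40 pictures of equal (made-up) complexity, 1 Mbit/s target at 25 fps.
history = run_rate_control([1_000_000] * 40, target_bitrate=1_000_000, fps=25)
final_qp, final_bits = history[-1]
print(final_qp, round(final_bits))
```

With these numbers the loop starts well over budget at QP 26 and raises QP step by step until the picture size falls inside the 10% band around the 40,000-bit budget. Real rate controllers also predict complexity (for example from MAD) and budget bits across the GOP rather than reacting one picture at a time.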

Block-based hybrid video encoding schemes such as the MPEG and H.26* families are inherently lossy processes. They achieve compression not only by removing truly redundant information from the bitstream, but also by making small quality compromises
