H.264_understanding_kb_v1.1(2)

2018-12-20 22:47

Macroblock coding:

+------+------+-------+--------+-----+----+----+-----+----+
| ADDR | TYPE | QUANT | VECTOR | CBP | b0 | b1 | ... | b5 |
+------+------+-------+--------+-----+----+----+-----+----+

• ADDR: address of the macroblock within the image.

• TYPE: identifies the type of macroblock (intra-frame, inter-frame, or bidirectional inter-frame).

• QUANT: quantization value, used to vary the quantization.

• VECTOR: motion vector.

• CBP: Coded Block Pattern. Some blocks in a macroblock match well and some match poorly; the CBP is a bit mask indicating which blocks are present.

• b0, b1 … b5: the blocks themselves.
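The CBP bit mask can be illustrated with a minimal sketch. It assumes a 6-bit pattern in which b0 maps to the most significant bit, as in MPEG-style 4:2:0 macroblocks; the function name `coded_blocks` is hypothetical and the bit order should be checked against the codec in use.

```python
def coded_blocks(cbp, num_blocks=6):
    """Return the indices of the blocks that the CBP marks as present.

    Assumption: bit (num_blocks - 1 - i) of cbp flags block b<i>,
    so b0 is the most significant bit of the pattern.
    """
    return [i for i in range(num_blocks) if cbp & (1 << (num_blocks - 1 - i))]


# 0b110001 flags b0, b1 and b5; b2..b4 are skipped in the bitstream.
print(coded_blocks(0b110001))
```

A decoder would read block data only for the returned indices and treat the remaining blocks as having no coded coefficients.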

6.2.4 Slices & Slice Group

• Slices are self-contained: each slice is a self-decodable unit that can be decoded independently. One advantage is that a client can decode a slice as soon as it receives it, without waiting for the whole picture to download. Another is that losing one slice during transmission does not affect the other slices.

• A slice is a sequence of macroblocks.

• A slice group may contain one or more slices.

• Slice group rule: different slice groups may be coded at different video quality. For example, with multiple foreground slice groups and a single background slice group, the foreground can use a low compression ratio and the background a high compression ratio, because the human eye is mostly interested in the foreground area.

• FMO (Flexible Macroblock Ordering): this feature was introduced in H.264. The following diagram shows how slices differ between MPEG-2, MPEG-4, and H.264.

6.2.5 FMO Slice Group Type

When using FMO, the macroblocks of the image can be divided according to different scan patterns. There are seven FMO map types, referred to as Type 0 through Type 6. Type 6 is the most general one and gives the user full flexibility. The others follow specific pattern rules, which can be exploited to represent the MBAmap (macroblock allocation map) more efficiently:

• Type 0: uses run lengths that are applied consecutively until the map is complete. Only the run lengths are therefore needed to rebuild the image on the decoder side.

• Type 1: also known as scattered slices. It uses a mathematical function, known to both the encoder and the decoder, to spread the macroblocks. The distribution in the figure, in which the macroblocks are spread to form a chessboard, is one common case (see Applications).

• Type 2: marks rectangular areas, so-called regions of interest. The top-left and bottom-right coordinates of the rectangles are stored in the MBAmap.

• Types 3-5: dynamic types that let the slice groups grow and shrink over successive pictures in a cyclic way. Only the growth rate, the direction, and the position in the cycle have to be known.

Note: Each color represents a slice group
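The Type 0 and Type 1 maps described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the Type 1 expression is modeled on the dispersed-map formula as commonly quoted from the H.264 specification, and both should be verified against the standard before use.

```python
def interleaved_map(run_lengths, num_mbs):
    """Type 0 sketch: apply one run length per slice group, in turn,
    repeating the cycle until every macroblock has a group."""
    if not run_lengths or any(run < 1 for run in run_lengths):
        raise ValueError("each slice group needs a run length of at least 1")
    mb_map = []
    while len(mb_map) < num_mbs:
        for group, run in enumerate(run_lengths):
            mb_map.extend([group] * run)
    return mb_map[:num_mbs]


def dispersed_map(width_mbs, height_mbs, num_groups):
    """Type 1 sketch: a fixed function of the macroblock position,
    known to encoder and decoder, decides the slice group."""
    return [
        [(x + (y * num_groups) // 2) % num_groups for x in range(width_mbs)]
        for y in range(height_mbs)
    ]


print(interleaved_map([2, 3], 8))
for row in dispersed_map(4, 2, 2):
    print(row)
```

With two slice groups the Type 1 expression reduces to `(x + y) % 2`, which is the chessboard pattern mentioned in the list.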

Next we describe some of the possible applications of FMO in H.264/AVC video coding:

• FMO Type 1 can be useful for maintaining privacy in videoconferences. The image is divided into two slice groups whose macroblocks are spread in a chessboard pattern, and each slice group is sent in a different packet. To decode the videoconference, an eavesdropper would have to know exactly which two packets carry the information.

• FMO Type 1 can also be used in transmission environments with a high packet loss rate (see the example in Experimental Results).

• FMO Type 2 can also be very useful. Suppose we want to transmit a news bulletin video at a low bit rate. If we encode the whole image with the bits distributed equally among all macroblocks, the results will be poor. Humans are particularly sensitive to image errors in faces, so we can mark the region of the newsreader's face to be encoded with more bits. The background of the set is then encoded with fewer bits and the newsreader with many more, while the total bit rate stays the same as before. This significantly enhances the subjective visual quality.
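The Type 2 use case can be sketched as a map built from rectangles plus a simple bit split. Everything here is illustrative: the function names, the rule that lower-numbered groups win overlaps, and the weighting scheme (each region-of-interest macroblock gets `roi_weight` times the bits of a background macroblock) are assumptions of this sketch, not a rate-control method taken from the source.

```python
def foreground_map(width_mbs, height_mbs, rects):
    """Build a Type 2 style map. rects holds one (left, top, right, bottom)
    tuple per foreground group, in macroblock units, inclusive.
    Macroblocks outside every rectangle fall into the last (background) group."""
    background = len(rects)
    mb_map = [[background] * width_mbs for _ in range(height_mbs)]
    # Paint from the last rectangle to the first so that, in this sketch,
    # lower-numbered groups take priority where rectangles overlap.
    for group in range(len(rects) - 1, -1, -1):
        left, top, right, bottom = rects[group]
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                mb_map[y][x] = group
    return mb_map


def bits_per_mb(mb_map, background, total_bits, roi_weight):
    """Split a fixed picture budget so the total stays unchanged."""
    n_bg = sum(row.count(background) for row in mb_map)
    n_roi = sum(len(row) for row in mb_map) - n_bg
    base = total_bits / (n_bg + roi_weight * n_roi)
    return {"background": base, "roi": base * roi_weight}


face = foreground_map(4, 3, [(1, 0, 2, 1)])
print(face)
print(bits_per_mb(face, background=1, total_bits=2400, roi_weight=4))
```

In this toy picture of 12 macroblocks, a uniform split would give each macroblock 200 bits; the weighted split moves bits from the 8 background macroblocks to the 4 face macroblocks without changing the 2400-bit total.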

6.2.6 H.264 Slice Type

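H.264/AVC slices can be classified by coding type as follows:

1) I-slice: every MB in the slice is coded using intra-prediction.

2) P-slice: the MBs in the slice are coded using intra-prediction and inter-prediction, but each inter-prediction block can use at most one motion vector.

3) B-slice: similar to a P-slice, but each inter-prediction block can use two motion vectors. Notably, the 'B' in B-slice stands for Bi-predictive, which differs considerably from the Bi-directional concept of the MPEG-2/-4 B-frame. An MPEG-2/-4 B-frame can only be inter-predicted from the preceding and the following I- (or P-) frame, whereas an H.264/AVC B-slice can be predicted not only from I- (or P-, B-) slices of the preceding and following pictures but also from I- (or P-, B-) slices of two different preceding pictures.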
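The difference between one and two motion vectors per block can be shown with a small sketch. It assumes integer-pel motion vectors that stay inside the reference picture, frames stored as lists of rows, and plain rounded averaging of the two predictions; the function names are hypothetical, and sub-pel interpolation and weighted prediction are left out.

```python
def fetch_block(ref, x, y, mv, size):
    """Copy a size x size block from ref at (x, y) displaced by mv = (dx, dy)."""
    dx, dy = mv
    return [row[x + dx : x + dx + size] for row in ref[y + dy : y + dy + size]]


def predict_p(ref, x, y, mv, size):
    """P-style prediction: one motion vector, one reference block."""
    return fetch_block(ref, x, y, mv, size)


def predict_b(ref0, mv0, ref1, mv1, x, y, size):
    """B-style prediction: two motion vectors, two reference blocks, averaged.
    ref0 and ref1 may both be earlier pictures; nothing requires one past
    and one future reference."""
    p0 = fetch_block(ref0, x, y, mv0, size)
    p1 = fetch_block(ref1, x, y, mv1, size)
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)] for r0, r1 in zip(p0, p1)]


ref_a = [[10 * r + c for c in range(4)] for r in range(4)]
ref_b = [[100] * 4 for _ in range(4)]
print(predict_p(ref_a, 0, 0, (1, 1), 2))
print(predict_b(ref_a, (1, 1), ref_b, (0, 0), 0, 0, 2))
```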

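H.264/AVC also adds two special slice types:

(1) SP-slice: the Switching P slice, a special type of P-slice used to splice two bitstreams of different bitrates.

(2) SI-slice: the Switching I slice, a special type of I-slice. Besides splicing two bitstreams with different content, it can also be used for random access to provide network VCR functionality.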

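These two special slice types mainly target Video-On-Demand streaming. For a given piece of video content, the server stores compressed versions at several bitrates in advance and, when the bandwidth changes, sends the version whose bitrate suits the current bandwidth. The traditional approach has to wait for a suitable point in time to send a new I-slice (which is much larger than a P-slice). Because the bandwidth has dropped, the I-slice takes longer to transmit, and the picture on the client side is delayed. SP-slices make it easy to splice bitstreams that carry the same content at different bitrates more smoothly (Figure 4): the new bitstream can be sent right away, and because the transmitted P-slice is small, no delay occurs.

When a user on the client side switches to a new channel, the new bitstream differs from the one currently being sent in both content and bitrate. The traditional approach makes the client buffer a stretch of the new channel's content again (Figure 5) in order to receive an I-slice of the new channel's bitstream, and only then are the following P-slices of that bitstream sent, so the client again experiences a delay. Likewise, when the client performs fast-forward, rewind, or random access, the traditional approach cannot respond in real time. With SI-slices, H.264/AVC achieves this easily.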
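The server-side choice of bitstream, and the reason a large I-slice delays the switch, can be sketched with simple arithmetic. The function names, bitrates, and slice sizes below are hypothetical values chosen for illustration, not measurements from the source.

```python
def pick_stream(available_bitrates, bandwidth_bps):
    """Pick the highest stored bitrate that fits the current bandwidth,
    falling back to the lowest stored bitrate when none fits."""
    fitting = [b for b in available_bitrates if b <= bandwidth_bps]
    return max(fitting) if fitting else min(available_bitrates)


def transmit_seconds(slice_bits, bandwidth_bps):
    """Time needed to send one slice over the given bandwidth."""
    return slice_bits / bandwidth_bps


stored = [500_000, 1_000_000, 2_000_000]  # bits per second, hypothetical
bandwidth = 500_000                       # bandwidth after the drop

print(pick_stream(stored, bandwidth))
# Hypothetical slice sizes: a switch via an I-slice versus via an SP-slice.
print(transmit_seconds(400_000, bandwidth))  # I-slice
print(transmit_seconds(50_000, bandwidth))   # SP-slice
```

With these numbers the I-slice occupies the link for 0.8 s while the SP-slice needs 0.1 s, which is the delay difference the paragraph describes.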

6.2.7 Picture/Frame

Multiple slice groups make up one picture or frame.

How to represent an H.264 video stream

Each coded video stream consists of successive GOPs; a GOP (group of pictures) is a group of successive pictures within a coded video stream.

A GOP can contain the following picture types:

• I-picture or I-frame (intra coded picture): a reference picture that represents a fixed image and is independent of other picture types. Each GOP begins with this type of picture.

• P-picture or P-frame (predictive coded picture): contains motion-compensated difference information from the preceding I- or P-frame.

• B-picture or B-frame (bidirectionally predictive coded picture): contains difference information from the preceding and following I- or P-frame within a GOP.

• D-picture or D-frame (DC direct coded picture): serves fast-advance playback.

A GOP always begins with an I-frame. Several P-frames follow, each separated from the previous one by some number of frames. The remaining gaps are filled with B-frames. A few video codecs allow more than one I-frame in a GOP.
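The layout just described can be generated with a short sketch. It assumes display order, a single I-frame per GOP, and a fixed spacing between anchor frames; `gop_pattern` and its parameter names are hypothetical.

```python
def gop_pattern(gop_size, anchor_distance):
    """Return the frame types of one GOP in display order.

    gop_size: number of pictures in the GOP.
    anchor_distance: spacing between consecutive I/P anchor frames.
    """
    frames = []
    for i in range(gop_size):
        if i == 0:
            frames.append("I")        # every GOP starts with an I-frame
        elif i % anchor_distance == 0:
            frames.append("P")        # P-frames at regular distance
        else:
            frames.append("B")        # B-frames fill the remaining gaps
    return "".join(frames)


print(gop_pattern(12, 3))
```

Setting `anchor_distance` to 1 yields a GOP with no B-frames at all, such as `IPPP`.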

The I-frames contain the full image and do not require any additional information to reconstruct it. Therefore any errors within the GOP structure are corrected by the next I-frame. B-frames within a GOP propagate errors only in H.264, where B-frames can be referenced by other pictures in order to increase compression efficiency.
