Audio        Video        System Control        Security        Data

VIDEO                                                                  

            H.323 does not require terminals to have video capabilities. If video capabilities are provided, they must support the H.261 codec in QCIF mode. Support for other video compression schemes and resolutions is optional.

            H.261 is one of the most widely used international standards for video compression. It defines the algorithms for encoding and decoding video at both CIF and QCIF resolutions. CIF is the Common Intermediate Format, with a luminance signal of 352 pixels per line and 288 lines (i.e. 352 x 288 samples per frame) and two chrominance signals of 176 x 144 samples per frame each. The frame rate for CIF is approximately 30 fps. QCIF means Quarter CIF, with a luminance resolution of 176 x 144 pixels. The video format is YCrCb with 4:2:0 sampling: each chrominance signal has half the luminance resolution both horizontally and vertically.

 

Figure 1. CIF Picture

 

 

 

 

 

 

Figure 2. QCIF Picture  

Frame Organization

In the H.261 standard, video frames are processed by first dividing the image into smaller sections or “blocks”. A block is defined as an area with dimension 8 x 8 in pixels.

A CIF picture therefore measures 44 x 36 blocks in its luminance signal. Blocks are grouped into larger units known as macroblocks. A macroblock consists of six blocks: four for the Y luminance information, one for the Cr chrominance information, and one for the Cb chrominance information. Together these six blocks describe a 16 x 16 pixel area: the four Y blocks cover it at full resolution, while the Cr and Cb blocks each cover the same area at half resolution. Thirty-three macroblocks (11 across by 3 down) form a larger unit known as a GOB, or Group of Blocks. A CIF picture contains 12 GOBs, arranged 2 across by 6 down; a QCIF picture contains 3.
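The block, macroblock, and GOB counts follow directly from the picture dimensions. The short sketch below works through that arithmetic for the luminance plane; the function and constant names are illustrative only and do not come from the standard.

```python
# Frame-organization arithmetic for H.261 pictures (luminance plane only).
BLOCK = 8          # a block is 8 x 8 pixels
MACROBLOCK = 16    # a macroblock covers 16 x 16 luminance pixels
MB_PER_GOB = 33    # a GOB holds 33 macroblocks


def frame_layout(width, height):
    """Return block, macroblock, and GOB counts for a luminance frame."""
    blocks = (width // BLOCK, height // BLOCK)
    macroblocks = (width // MACROBLOCK, height // MACROBLOCK)
    total_macroblocks = macroblocks[0] * macroblocks[1]
    return {
        "blocks": blocks,                 # (across, down)
        "macroblocks": macroblocks,       # (across, down)
        "gobs": total_macroblocks // MB_PER_GOB,
    }


CIF = frame_layout(352, 288)
QCIF = frame_layout(176, 144)
print("CIF :", CIF)
print("QCIF:", QCIF)
```

For CIF this gives 44 x 36 blocks, 22 x 18 macroblocks, and 12 GOBs; for QCIF it gives 22 x 18 blocks, 11 x 9 macroblocks, and 3 GOBs.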

H.261 compression and encoding

The H.261 compression algorithm is DCT-based and resembles MPEG to some degree. DCT stands for Discrete Cosine Transform. It is a popular method because it concentrates most of the information in a block into a few low-frequency coefficients, so the result can be compressed more efficiently using run-length encoding (RLE). The DCT transforms a block of pixel intensities into a block of frequency coefficients. The transform is applied block by block until the entire image is transformed. Huffman/RLE encoding can then be performed on the processed data.
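To make the transform concrete, here is a minimal, unoptimized sketch of a forward 8 x 8 DCT in plain Python. It uses the common orthonormal DCT-II scaling; real codecs use fast integer or fixed-point implementations, and the function name `dct_2d` is an assumption made for this example.

```python
import math

N = 8  # H.261 transforms 8 x 8 blocks


def dct_2d(block):
    """Forward 2-D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        # The zero-frequency basis function gets an extra 1/sqrt(2) factor.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * scale(u) * scale(v) * total
    return out


# A perfectly flat block has no detail, so only the DC term is non-zero.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 3))
```

A flat block of value 100 produces a single DC coefficient of 800 and 63 coefficients that are zero to within rounding error, which is exactly the kind of output that run-length encoding handles well.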

            A major difference between H.261 and MPEG is that the quantization value used is variable and is determined by the amount of data reduction required to fit the available video bandwidth. H.261 has rate control, which allows it to cope with a variable video bandwidth. H.261 trades picture quality against motion, so scenes with a lot of motion have poorer image quality than still scenes.
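The text does not say how the quantization value is chosen, and rate-control strategies differ between encoders. The sketch below is therefore purely hypothetical: it assumes a simple linear mapping from output-buffer fullness to a quantizer value in the range 1 to 31, just to illustrate the feedback idea. The function name and the linear rule are assumptions, not part of H.261.

```python
Q_MIN, Q_MAX = 1, 31  # assumed quantizer range for this sketch


def choose_quantizer(buffer_bits, buffer_capacity):
    """Map output-buffer fullness to a quantizer value (hypothetical rule).

    A fuller buffer means the channel is falling behind, so a coarser
    quantizer is chosen to produce fewer bits.
    """
    fullness = min(max(buffer_bits / buffer_capacity, 0.0), 1.0)
    return Q_MIN + round(fullness * (Q_MAX - Q_MIN))


for bits in (0, 500, 1000):
    print(bits, "->", choose_quantizer(bits, 1000))
```

When there is a lot of motion, more bits are generated, the buffer fills, and the quantizer becomes coarser. This is the quality-versus-motion trade-off described above.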

            The H.261 encoding process uses past frames to encode differences, much as MPEG does. Unlike MPEG, H.261 has only two types of frames: intra-coded frames and inter-coded frames. MPEG uses three frame types: the I or intra-frame, the P or predicted frame, and the B or bi-directional frame. H.261 uses only past frames as references for its motion estimation algorithm, whereas MPEG’s B-frame uses both the nearest preceding I or P frame and the next future I or P frame as references. H.261’s intra-coded frames are encoded in their entirety, with no reference to any other frame. Inter-coded frames are frames whose encoding is based on the previous frame. The H.261 standard states that each macroblock must be intra-coded at least once every 132 frames to prevent errors from accumulating. Terminals may also ask for a complete picture update, in which case an intra-coded frame is sent.
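One way an encoder could honor the 132-frame refresh rule is to keep a counter per macroblock. The sketch below assumes that approach; the class name `RefreshTracker` and its interface are hypothetical, and a real encoder would also intra-code the very first frame and weigh coding cost when choosing a mode.

```python
REFRESH_LIMIT = 132  # each macroblock must be intra-coded at least this often


class RefreshTracker:
    """Track how long each macroblock has gone without being intra-coded."""

    def __init__(self, num_macroblocks):
        self.since_intra = [0] * num_macroblocks

    def mode_for(self, mb_index, wants_intra=False):
        """Return 'intra' or 'inter' for one macroblock and update its counter."""
        # Force intra coding when the encoder asks for it, or when another
        # inter-coded frame would break the refresh limit.
        if wants_intra or self.since_intra[mb_index] >= REFRESH_LIMIT - 1:
            self.since_intra[mb_index] = 0
            return "intra"
        self.since_intra[mb_index] += 1
        return "inter"


tracker = RefreshTracker(num_macroblocks=1)
modes = [tracker.mode_for(0) for _ in range(264)]
print(modes.count("intra"), "forced refreshes in 264 frames")
```

A request for a complete picture update would correspond to calling `mode_for` with `wants_intra=True` for every macroblock of the next frame.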

 

Figure 3. H.261 encoder

          If a frame is being encoded as an intra-frame, the macroblocks go through a DCT transformation, then quantization using a value determined by the rate control feedback, and finally a zigzag scan and Huffman encoding to produce the final bitstream output. The encoder also decodes this output internally so that the reconstructed frame can be used as the reference by the inter-encoder.
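The zigzag scan reorders the quantized coefficients so that the low-frequency values come first and the zeros cluster at the end, which is what makes run-length coding effective. The following sketch illustrates that idea under simplifying assumptions: it treats every coefficient alike, emits plain (zero run, level) pairs, and does not reproduce the standard's variable-length code tables. Both function names are illustrative.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in zigzag scan order for an n x n block."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk the anti-diagonals; alternate direction on each one.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def run_level_pairs(block):
    """Scan a quantized block in zigzag order and emit (zero_run, level) pairs."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")  # all trailing zeros collapse into one end-of-block marker
    return pairs


quantized = [[0] * 8 for _ in range(8)]
quantized[0][0] = 12
quantized[0][1] = 5
quantized[2][0] = -3
pairs = run_level_pairs(quantized)
print(pairs)
```

In this example 64 coefficients shrink to three pairs plus an end-of-block marker; a Huffman-style variable-length code would then assign short codewords to the most common pairs.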

Figure 4. H.261 decoder

          The H.261 decoder is much simpler than the encoder. In fact, the encoder contains a built-in decoder, which decodes the encoder's own output to recreate the reference frames needed for motion compensation. A macroblock takes one of two paths through the decoder, depending on whether it is intra-coded or inter-coded. An intra-coded macroblock is decoded by simply reversing the encoding steps: variable-length decoding, inverse quantization, and inverse DCT. The decoded data is then used to build up the frame, which can then be displayed. A copy of this frame is also stored for use as a reference when decoding inter-coded macroblocks and frames.

          Inter-coded macroblocks are decoded using the motion vector and the reference frame. The result is then filtered to improve its appearance before the decoded macroblock is incorporated into the new frame.
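The inter-coded path can be sketched as two small steps: fetch the area of the reference frame that the motion vector points to, then add the decoded difference signal. The example below is a simplified illustration that assumes whole-pixel vectors which stay inside the picture, and it leaves out the filtering step; the function names are hypothetical.

```python
def predict_block(reference, top, left, mv, size=16):
    """Copy the size x size area of the reference frame displaced by mv=(dy, dx)."""
    dy, dx = mv
    return [row[left + dx: left + dx + size]
            for row in reference[top + dy: top + dy + size]]


def reconstruct(prediction, residual):
    """Add the decoded residual to the prediction and clip to the 0..255 range."""
    return [[min(255, max(0, p + e)) for p, e in zip(p_row, e_row)]
            for p_row, e_row in zip(prediction, residual)]


# Tiny 8 x 8 reference frame and a 2 x 2 "macroblock" to keep the output short.
reference = [[10 * r + c for c in range(8)] for r in range(8)]
prediction = predict_block(reference, top=2, left=2, mv=(1, -1), size=2)
decoded = reconstruct(prediction, [[1, -1], [300, -100]])
print(prediction)
print(decoded)
```

The reconstructed macroblock is written into the new frame, and that frame in turn becomes the reference for the next inter-coded frame.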

H.263

             The H.263 video standard is based on H.261 and is designed to compress moving pictures at lower bit rates.

 The main elements in the basic H.263 compression algorithm  are:

1.      inter-picture prediction/motion compensation

2.      block transformation

3.      quantization

4.      variable length coding (VLC)

 A decoder would then have the following components:

1.      variable length decoding

2.      inverse quantization

3.      inverse block transformation

4.      motion compensation

       

            The coding algorithm of H.263 includes several improvements and changes over H.261, and it can often achieve the same video quality as H.261 with less than half the number of bits in the coded stream. This is why it is preferred over H.261 as a video codec.

 

            The main factors that contributed to H.263’s significant improvement over H.261 are:

1.      half-pixel motion vector prediction – half-pixel precision is used for motion compensation, as opposed to H.261’s use of full-pixel precision and a loop filter.

2.      negotiable options – Four negotiable coding options are included in H.263. These are:

a.       Unrestricted Motion Vector mode - In this mode motion vectors are allowed to point outside the picture. "Non-existing" pixels outside the picture are reconstructed from the edge pixels. This mode offers a significant advantage when there is movement along the edges of a picture (including camera movement).

b.      Advanced prediction mode - Instead of a single motion vector for a 16x16 macroblock, four vectors (one per 8x8 block) can be used for some macroblocks. The encoder decides which type of vector to use. Four vectors take more bits but offer better prediction.

c.       Syntax-based arithmetic coding mode - Instead of VLC coding, this mode uses arithmetic coding. It improves the compression ratio by 3-4% while keeping the SNR at the same level.

d.      PB-frames mode - In this mode two consecutive pictures are coded as one unit, similar to MPEG compression. One frame (P) is predicted from the last decoded frame, and one frame (B) is predicted bidirectionally from the last decoded frame and the P-frame currently being decoded. For simple video sequences this allows the frame rate to be doubled without increasing the bandwidth.

3.      support for new picture resolutions – H.263 defines new frame formats such as Sub-QCIF (128 x 96), 4CIF (704 x 576, four times the area of CIF), and 16CIF (1408 x 1152, sixteen times the area of CIF).

4.      optional datastream elements – some parts of the datastream structure are now optional, so the codec can be configured for a lower bitrate or for better error recovery.
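Two of the improvements listed above, half-pixel motion compensation and the Unrestricted Motion Vector mode, can be illustrated with one small sketch. It assumes bilinear interpolation with rounding for half-pixel positions and clamps out-of-picture coordinates to the nearest edge pixel. The function names and the half-pixel coordinate convention are assumptions made for this example.

```python
def pixel(frame, y, x):
    """Read a pixel, clamping coordinates to the nearest edge of the picture."""
    y = min(max(y, 0), len(frame) - 1)
    x = min(max(x, 0), len(frame[0]) - 1)
    return frame[y][x]


def half_pel_sample(frame, y2, x2):
    """Sample the frame at (y2 / 2, x2 / 2); coordinates are in half-pixel units."""
    y, fy = y2 // 2, y2 % 2   # whole-pixel part and half-pixel flag
    x, fx = x2 // 2, x2 % 2
    a = pixel(frame, y, x)
    b = pixel(frame, y, x + fx)
    c = pixel(frame, y + fy, x)
    d = pixel(frame, y + fy, x + fx)
    # Bilinear average with rounding; reduces to `a` at whole-pixel positions.
    return (a + b + c + d + 2) // 4


frame = [[10, 20],
         [30, 40]]
print(half_pel_sample(frame, 0, 1))    # halfway between 10 and 20
print(half_pel_sample(frame, 1, 1))    # centre of all four pixels
print(half_pel_sample(frame, -3, -3))  # outside the picture: nearest edge pixel
```

Sampling between pixels lets the predictor follow motion that is not a whole number of pixels per frame, and the clamping lets a motion vector follow an object that is partly outside the picture.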

 It is expected that H.263 will replace H.261 in many applications.

    
