B-Frames and Bidirectional Encoding

Up to three types of frames can exist in a DivX stream:
- I-Frames (Intra)
- P-Frames (Predicted)
- B-Frames (Bidirectional)

In DivX Codec 5, only I- and P-frames were used for encoding.

I-Frames are encoded using only the information contained within the frame itself, with no reference to any other frame. Conceptually, an I-frame is the same as compressing a single image in JPEG format.

P-Frames (Predicted) are forward predicted and may refer to either an I-frame or a P-frame; they are encoded relative to the frame that precedes them. If a video sequence consists of a chain of similar images, only the few pixels that change from one image to the next are stored. Example: a TV news broadcast. The background changes very little, and only the presenter's small movements are recorded (remember there are 25 frames per second). So instead of treating each frame separately, as if saving it as a JPEG, P-frames exploit the redundancy between those frames. Fundamentally, a P-frame records where a block of the previous frame has moved in the current frame. Instead of encoding the frame spatially (as JPEG does), the P-frame simply says: "Hey, that block from the previous frame can now be found at position (X, Y)", which reduces the amount of information required. In effect, the codec transmits mostly coordinate information rather than picture information (see I-Frames).

DivX Pro 5.0 also introduces the ability to create B-frames. B-frames let the DivX codec predict frames ahead, choosing the best prediction between two reference frames instead of only one.
B-frames are coded not only from forward-predicted frames but also from backward-predicted frames, which can be an I- or a P-frame. Using B-frames reduces the amount of data needed to code a frame and improves quality, most noticeably in areas where moving objects reveal previously hidden areas.

Global Motion Compensation

Global Motion Compensation (GMC) helps improve complex scenes that contain zooming and panning. Since there is a commonality within panning and zooming scenes, the codec can exploit it to compensate more efficiently for what would otherwise be handled as a group of individual blocks, reducing the amount of data needed from one frame to the next.

Quarter Pixel

As explained in the "B-frames" summary, data is reduced when the difference between two frames (the prediction error) is transmitted instead of the entire image. The difference between successive frames is generally computed on a macroblock-by-macroblock basis (16x16 pels) or on a block-by-block basis (8x8 pels). For example, a part of an image located in a block at grid location (1, 1) may move to grid location (1, 2) in the next frame. As you may realize, an image in one block will likely need more accuracy than the ability to move on a block-by-block basis limited to integer pixel units (1, 1). DivX has extended the previous half-pel accuracy (1.5, 1.5) with "Quarter Pel" (1.25, 1.75) accuracy in this codec release. Quarter Pel performs a specific filtering on each block to produce a virtual block that represents how the original block should appear if it is moved by a quarter of a pixel unit.
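The "virtual block" idea above can be sketched in a few lines. This is only an illustration: the codec uses the specific interpolation filters defined by the standard, while the sketch below uses plain bilinear interpolation, and the function names are my own.

```python
# Sketch: producing a "virtual" block at a sub-pixel offset via bilinear
# interpolation. The real codec uses standardized interpolation filters;
# this simplified version only illustrates quarter-pel accuracy.

def sample(frame, y, x):
    """Sample a frame at fractional coordinates using bilinear interpolation."""
    y0, x0 = int(y), int(x)
    fy, fx = y - y0, x - x0
    y1 = min(y0 + 1, len(frame) - 1)
    x1 = min(x0 + 1, len(frame[0]) - 1)
    top = frame[y0][x0] * (1 - fx) + frame[y0][x1] * fx
    bot = frame[y1][x0] * (1 - fx) + frame[y1][x1] * fx
    return top * (1 - fy) + bot * fy

def predict_block(frame, y, x, size, mv_y, mv_x):
    """Predict a size x size block displaced by a (possibly fractional) motion vector."""
    return [[sample(frame, y + r + mv_y, x + c + mv_x) for c in range(size)]
            for r in range(size)]
```

With an integer motion vector this reproduces original pixels exactly; with a vector such as (0, 0.25) it blends neighboring pixels, which is what lets the prediction track motion finer than one pixel.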
"New" Quarter Pixel

In version 5.0.3, Quarter Pixel motion estimation has been updated to the new version of the 14496-2 DCOR1 standard. (Versions older than 5.0.3 can still decode QPel.)

Psychovisual Enhancements

By exploiting what we know about the Human Visual System (HVS), we have increased the efficiency of allocating video data, helping to increase the perceived quality of the video. For example, if the human visual system has very low sensitivity to a specific characteristic of an image, we may decrease the amount of data spent at that location and re-allocate it to a location where the human visual system is much more sensitive. The psychovisual enhancements are applied on both a frame and a macroblock basis. One important factor in evaluating psychovisual modeling is to compare a full sequence, NOT just a single frame. An image may look worse or better when a single frame is examined, but the key to reducing data is to reduce it in a way that the human visual system does not notice over a video sequence running at full frame rate (e.g. 30 frames per second). Psychovisual modeling is a fairly new field when applied to real videos or movies. This area is full of possibilities that we have only just started to explore.
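The macroblock-level re-allocation described above can be illustrated with a toy model. DivX's actual psychovisual model is not published; the sketch below assumes (as classic adaptive-quantization schemes such as MPEG-2's Test Model 5 do) that the eye tolerates more distortion in busy, textured blocks, and all names and constants here are illustrative only.

```python
# Illustrative psychovisual bit allocation at the macroblock level.
# Assumption (not DivX's actual model): distortion is less visible in
# high-activity blocks, so they get a coarser quantizer and the saved
# bits effectively move to flat areas where the eye is more sensitive.

def activity(block):
    """Mean absolute deviation from the block mean -- a crude texture measure."""
    flat = [p for row in block for p in row]
    mean = sum(flat) / len(flat)
    return sum(abs(p - mean) for p in flat) / len(flat)

def perceptual_quant(base_q, block, avg_activity):
    """Scale the base quantizer up for busy blocks, down for flat ones."""
    act = activity(block)
    # Normalization in the style of MPEG-2 TM5; scale stays within [0.5, 2].
    scale = (2 * act + avg_activity) / (act + 2 * avg_activity)
    return max(1, round(base_q * scale))
```

A flat block ends up with a finer quantizer than the frame average, and a noisy, textured block with a coarser one, without changing the total bit budget much.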
Pre-Processing

Video noise is often referred to as "specks", "snow", or "hair" within a video (i.e. the "snow" that is visible when watching TV over an antenna). Any number of the processes of video production and distribution can add noise into the video. Some of the worst video noise can be seen in old or poorly recorded movies. Noise can be a big problem when it comes to compressing the video, as the noise consumes a large proportion of the bits available for the wanted video. The pre-processing filter uses digital signal processing techniques to remove the noise from the source material prior to encoding. Broadly, there are two classes of filter that can reduce noise: temporal and spatial. To explain how they work, let's consider a single pixel somewhere in the image. Spatial filtering looks at the neighboring pixels within the pixel's own frame and applies a smoothing, or low-pass, function. A temporal filter smoothes pixels at the same position over a few consecutive frames to reduce the effect of noise. By using these techniques to reduce noise prior to video encoding we can, for certain content, increase our compression ratio and improve quality. There are 4 settings for pre-processing:
1. Light
2. Normal
3. Strong
4. Extreme
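The two filter classes above can be sketched for a single pixel. A real pre-processing filter is adaptive (it avoids smoothing genuine detail and motion); these fixed averages only show where each filter class draws its information from, and the function names are my own.

```python
# Sketch of the two noise-reduction filter classes described above.

def spatial_filter(frame, y, x):
    """Low-pass one pixel using its 3x3 neighborhood within the same frame."""
    total, count = 0, 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < len(frame) and 0 <= nx < len(frame[0]):
                total += frame[ny][nx]
                count += 1
    return total / count

def temporal_filter(frames, y, x):
    """Average the same pixel position over a few consecutive frames."""
    return sum(f[y][x] for f in frames) / len(frames)
```

Random noise averages toward zero under both filters, but the spatial filter also blurs edges within the frame, while the naive temporal filter smears moving objects; that is why real filters switch between the two adaptively.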
As with all features, some content is affected more than others. Generally, old noisy content can see dramatic reductions in file size along with quality improvements. "Normal" pre-processing should not introduce any visible degradation of the source; we have also provided a "Light" setting for very tricky sources. The "Strong" and "Extreme" settings will wash out the source a little, but they remove the most data and should be used when file size is more important than quality.

Keyframe

The DivX encoder automatically inserts a keyframe every time it detects a scene change. However, long intervals between scene changes are possible, and when they occur the encoder automatically inserts keyframes at a user-specified frequency. Keyframes are the largest of all frames, so the frequency of their placement can have a drastic effect on the encoded file. We have found 300 frames to be the maximum interval the encoder should go without inserting a keyframe. This corresponds to at least one keyframe every 10 seconds in a 30 fps stream. Also, depending on the player used, the maximum keyframe interval may determine the maximum interval for seeking; this occurs when players are designed to seek to "I" frames (keyframes). Reducing the keyframe interval can also reduce delays and improve the quality of streaming content.

Deinterlace

Interlacing, invented in the 1940s, is probably the earliest form of video compression. Instead of transmitting a complete video frame 60 times every second, engineers discovered that they could halve the bandwidth needed by the TV signal if they sent alternating odd and even "fields", each field comprising just the odd or even picture lines. Interlacing is most commonly found on material intended for TV broadcast, or material created by consumer camcorders.
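The keyframe placement rule described above (a keyframe at every scene change, plus a forced keyframe whenever the maximum interval elapses) can be sketched directly; the function and its inputs are illustrative, not the encoder's internal API.

```python
# Sketch of the keyframe placement policy described above: keyframe on
# every detected scene change, plus a forced keyframe whenever the
# user-specified maximum interval (300 frames here) elapses without one.

MAX_KEYFRAME_INTERVAL = 300  # ~10 seconds at 30 fps

def place_keyframes(scene_changes, total_frames, max_interval=MAX_KEYFRAME_INTERVAL):
    """Return the frame numbers that become keyframes."""
    keyframes = []
    last = -max_interval  # guarantees frame 0 is a keyframe
    changes = set(scene_changes)
    for n in range(total_frames):
        if n in changes or n - last >= max_interval:
            keyframes.append(n)
            last = n
    return keyframes
```

Note how a scene change resets the interval counter: content with frequent cuts never hits the forced-interval path, while static content gets a keyframe exactly every 300 frames.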
Interlacing is not a problem if it is correctly displayed on an interlaced display device, i.e. a television. An interlaced video camera running at 30 fps captures the odd-numbered lines of a frame in 1/60th of a second, and the even-numbered lines in the next 1/60th of a second. When viewed on a progressive display device, such as a PC, two fields are interlaced to create one frame. Because half of the frame's lines are captured a fraction of a second later than the other half, fast-moving objects may appear jagged, the result of the object advancing slightly within 1/60th of a second. The "progressive" format is preferred for PC playback, since the entire frame is captured at once and no de-interlacing is required. It is possible to remove the jagged-edge interlace artifacts by applying a process known as "de-interlacing" to the video. The DivX codec is able to de-interlace the source video prior to encoding. For this to work correctly, it is important that the video has not been resized vertically outside the codec; resizing within the codec does not affect the operation of the codec's de-interlacing. The DivX® codec has two main options for de-interlacing:

"All frames are progressive" - This is the default setting, where de-interlacing and IVTC are never used. It is suitable for material that is already in a progressive format.

"All frames are interlaced" - The codec will use an adaptive algorithm to de-interlace every frame prior to resizing and encoding. The video should not be cropped or resized prior to encoding; resizing within the codec will cause no problems.

Interlaced Video Support

Encoding and decoding of interlaced content is now supported. If the content you're encoding is interlaced, you can either de-interlace it so that it is progressive or preserve the interlaced fields.
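To make the de-interlacing idea concrete, here is a minimal sketch of one of the simplest approaches: keep one field and rebuild the other field's lines by averaging the lines above and below. The codec's adaptive algorithm is far more sophisticated (it only de-interlaces where combing is detected); this just shows how the jagged comb between two fields can be removed.

```python
# Minimal "keep top field" de-interlacer: each bottom-field (odd) line is
# replaced by the average of its neighboring top-field lines. Purely
# illustrative; real de-interlacers adapt per block or per pixel.

def deinterlace_keep_top_field(frame):
    """Return a copy of the frame with odd lines interpolated from even lines."""
    h = len(frame)
    out = [row[:] for row in frame]
    for y in range(1, h, 2):
        above = frame[y - 1]
        below = frame[y + 1] if y + 1 < h else frame[y - 1]
        out[y] = [(a + b) / 2 for a, b in zip(above, below)]
    return out
```

The cost of this simple method is halved vertical resolution in moving areas, which is why adaptive methods that preserve static detail are preferred.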
Preserving the interlaced fields may sometimes result in better video quality during playback, but at the cost of a bigger file size. Interlace support is compliant with the MPEG-4 standard and uses block-level decisions to choose between progressive and interlaced coding. This means that interlaced coding is used when interlace artifacts are detected, and progressive coding is used when motion is very low. Due to the very nature of interlaced video, the minimum number of lines is 480 for NTSC and 576 for PAL (the actual number can be lower, but the result is not guaranteed). Interlaced content is usually found on TV or captured with a video camera (DV sources are usually interlaced).

Data Rate Control Parameters (RC)

The data rate control parameters can only be changed through the operating system registry. The DivX codec uses a patent-pending dual asymmetric rate control. It uses dual-period control loops to achieve the best balance: reacting and adjusting to variations on a short time scale while controlling and averaging the bitrate on the long time scale. Essentially, it is well balanced, as it adapts dynamically to the content of the scene, providing optimal allocation of bandwidth. It is flexible and easily adjustable for different application scenarios. The DivX rate control algorithm was created by testing many real full-length movies against the DivX codec in multiple user environments (i.e. TV, PC, PDA, etc.). There are several settings that may be experimented with. We highly recommend that only experienced users change these settings, since minor changes can cause significant effects.
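The DivX rate control itself is proprietary, so the following is only a generic sketch of the dual-time-scale idea: a fast-moving average reacts to scene changes while a slow one keeps the long-term bitrate on target, and the quantizer is nudged from a blend of both. Every constant and name here is an assumption for illustration.

```python
# Generic sketch of a dual-time-scale feedback rate controller (NOT the
# DivX algorithm): a fast loop reacts to short-term bitrate swings, a
# slow loop anchors the long-term average, and the quantizer is scaled
# by their blended deviation from the target bits per frame.

def next_quantizer(q, frame_bits, target_bits, short_avg, long_avg,
                   short_w=0.5, long_w=0.05):
    """Update both running averages, then nudge q toward the target rate."""
    short_avg += short_w * (frame_bits - short_avg)   # fast loop
    long_avg += long_w * (frame_bits - long_avg)      # slow loop
    error = (0.7 * short_avg + 0.3 * long_avg) / target_bits
    q = min(31, max(1, round(q * error)))             # MPEG-4 quantizer range 1..31
    return q, short_avg, long_avg
```

When frames consistently overshoot the target, the quantizer climbs (coarser coding, fewer bits); when they undershoot, it falls. The asymmetric weights on the two loops are exactly the kind of knob the registry parameters discussed below expose.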
Maximum and Minimum Quantizers

The quantizer is one of the most important parameters in video coding. The quantizer controls how finely the encoder codes the video sequence. The rule of thumb is: for the same frame, a smaller quantizer means better quality and higher bit consumption, while a larger quantizer means lower bit consumption and inferior quality. Since every frame has a different amount of complexity, a subjectively equal quality can be seen among different frames even with varying quantizers. Basically, the quantizer is what the rate control operates on. Balancing the quality of the video against bit consumption can be quite an art form. Note: the RC settings are truly "for the adventurous souls". The default settings should give near-optimum results.
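The quantizer trade-off above can be demonstrated numerically on a list of transform coefficients: a larger quantizer zeroes out more coefficients (cheaper to code) but reconstructs the survivors less precisely. This is a simplified uniform quantizer, not the codec's exact quantization matrix.

```python
# Sketch of the quantizer rule of thumb: larger q -> fewer nonzero
# coefficients (lower bit consumption) but larger reconstruction error
# (inferior quality). Uniform quantization for illustration only.

def quantize(coeffs, q):
    return [round(c / (2 * q)) for c in coeffs]

def dequantize(levels, q):
    return [l * 2 * q for l in levels]

def cost_and_error(coeffs, q):
    levels = quantize(coeffs, q)
    recon = dequantize(levels, q)
    nonzero = sum(1 for l in levels if l != 0)   # rough proxy for bit cost
    error = sum(abs(c - r) for c, r in zip(coeffs, recon))
    return nonzero, error
```

Running this on the same coefficients with q=2 and q=8 shows the trade-off directly: the larger quantizer yields fewer nonzero levels and a larger total error, which is exactly the lever the rate control pulls.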
RC Averaging

RC Averaging controls how fast the rate control forgets the rate history. Larger values usually result in better high-motion scenes and worse low-motion scenes.

Rate Control Down/Up Reaction

RC Down/Up Reaction controls the relative sensitivity of the reaction to high- or low-motion scenes. Larger values usually result in better high-motion scenes, but larger bit consumption. All these parameters are inter-correlated; the effect of any one setting is approximate and often depends on the settings of the others.

Data Partitioning

Data partitioning may be useful in any situation where transmission errors may occur, such as streaming or broadcast environments. Data partitioning is a different way of organizing the data in the stream. A frame is composed of adjacent macroblocks, and each macroblock usually includes motion vector (prediction) and texture information. In this mode, the motion vectors and the texture are separated (not interleaved within each macroblock) and grouped into video packets, which makes the stream more resilient to transmission errors. Each video packet is an independent entity inside the stream and can be decoded separately from the others. Use of data partitioning can also permit the activation of a series of tools that allow for error recovery and packet resynchronization.
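The reorganization that data partitioning performs can be sketched on a toy stream. The field names and the tuple representation below are illustrative, not the MPEG-4 bitstream syntax, but the grouping matches the description above: all motion vectors first, a resync marker, then all texture data.

```python
# Sketch of data partitioning: compare the normal interleaved layout
# (mv, texture per macroblock) with the partitioned layout (all mvs,
# then a marker, then all texture). Names are illustrative only.

def interleaved(macroblocks):
    """Normal stream order: each macroblock carries its own mv + texture."""
    out = []
    for mb in macroblocks:
        out += [("mv", mb["mv"]), ("tex", mb["tex"])]
    return out

def partitioned(macroblocks):
    """Partitioned packet: motion vectors grouped ahead of texture data."""
    mvs = [("mv", mb["mv"]) for mb in macroblocks]
    texs = [("tex", mb["tex"]) for mb in macroblocks]
    return mvs + [("marker", None)] + texs
```

The error-resilience benefit follows from the layout: if the texture half of a packet is corrupted, all motion vectors before the marker are still intact, so the decoder can at least motion-compensate every macroblock instead of losing everything after the first error.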
Performance/Quality

There are 5 settings available for Performance/Quality. Essentially, if more quality is desired, more CPU is needed. There should rarely be a time when you need to pick any quality setting other than "Slowest", as it produces the BEST quality. Accuracy in motion estimation is sacrificed to increase the performance of encoding. With today's CPUs and the efficiency of the DivX codec, encoding at up to full-screen resolutions at real-time speeds is possible. However, lower quality settings can be useful when there is not enough CPU power and a sacrifice in quality can be justified. Generally, real-time or faster-than-real-time encoding speeds are only necessary when broadcasting real-time video feeds, yet the faster the encoder, the lower the cost of encoding. Leave this setting at "Slowest" unless otherwise necessary.

DivX 5.0.3 CLI parameters

The command line interface parameters are automatically updated as you use the GUI to change the codec parameters. The opposite is also true: when you type parameters into the CLI, the GUI will show the changes after you press the Tab or Enter key. This makes the CLI an easy shorthand for advanced users to manage their settings.