Producing H.264 Video
Encoders in Action
As you would suspect, not all encoding tools address these options the same way. For example, when encoding with the Apple H.264 codec, Apple Compressor only supports the Baseline and Main Profiles, and it does so with a somewhat anonymous check box labeled "Frame Reordering" (Figure 2). Check the box and you get the Main Profile; if you don't, you get Baseline. Compressor does not let you designate a level or encode using the High profile.
As mentioned, the Adobe Media Encoder lets you choose both the profile and level, as you can see in Figure 3. Note that neither Compressor nor Adobe Media Encoder provides access to the encoding parameters discussed in the next section.
Another H.264-specific encoding option enabled in some encoding tools is entropy coding, which designates how the compressed data is packed in the final video file. As you can see in the screenshot of Sorenson Squeeze shown in Figure 4, there are two choices, CAVLC and CABAC, with the latter available only when producing using the Main or High Profiles.
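The profile restriction described above can be written down as a small lookup. This is only a minimal sketch in Python: the table and the function name are illustrative and are not taken from Squeeze or any other encoder's API.

```python
# Entropy coding modes per H.264 profile, as described in the text:
# CAVLC is always available, CABAC only with the Main or High Profile.
ALLOWED_ENTROPY = {
    "baseline": {"CAVLC"},
    "main": {"CAVLC", "CABAC"},
    "high": {"CAVLC", "CABAC"},
}


def entropy_supported(profile: str, mode: str) -> bool:
    """Return True if the given profile can use the given entropy coding mode."""
    return mode.upper() in ALLOWED_ENTROPY[profile.lower()]
```

A preset builder could call entropy_supported("Baseline", "CABAC") before offering the option and fall back to CAVLC when it returns False.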
Like many advanced encoding options, the advanced option (CABAC) produces a higher-quality file that's harder to decode on the playback platform. To oversimplify, CABAC packs the data more tightly, so it delivers better quality at the same data rate. The downside is that the file takes more CPU horsepower to unpack and display on the viewing station. The obvious questions are, "How much better is the quality?" and, "How much harder is the file to decode?"
In my tests comparing similarly configured files (720p at 800Kbps video data rate) encoded with CAVLC and CABAC, the quality difference was noticeable in some hard-to-compress scenes, and I've seen some experts claim that CABAC delivers similar quality at 12%-15% lower data rates. On the decode side, playing back the CABAC file took three-fifths of a percent (that's 0.6%) more CPU horsepower on my HP 8710w Mobile Workstation running a 2.2GHz Core 2 Duo CPU, and 4% more on an older pre-Intel dual 2.7GHz PPC G5 Mac.
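To put the 12%-15% claim in concrete terms for the 800Kbps test files, here is a quick back-of-the-envelope calculation in Python. The function name is illustrative, and the savings figures are the experts' claim quoted above, not a measurement of mine.

```python
def equivalent_cabac_kbps(cavlc_kbps: int, savings_percent: int) -> float:
    """Data rate at which CABAC would match CAVLC quality, given a claimed saving."""
    return cavlc_kbps * (100 - savings_percent) / 100


# Claimed range of 12% to 15% applied to the 800Kbps test file.
best_case = equivalent_cabac_kbps(800, 15)
worst_case = equivalent_cabac_kbps(800, 12)
```

If the claim holds, a CABAC file at roughly 680Kbps to 704Kbps should look about the same as the 800Kbps CAVLC file.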
Since the quality advantage is meaningful and the playback difference negligible, I always use CABAC when producing with profiles (and encoding tools) that support it. As you can see in Figure 5, an analysis of an H.264 video file downloaded from YouTube, YouTube does as well and also uses the High Profile, which is a nice validation of both recommendations. The utility that provided this analysis is MediaInfo, which is free and runs on Windows, Mac, and Linux platforms. It's the one tool that I install on every computer that I own, and you can download it at http://mediainfo.sourceforge.net/en.
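If you save a MediaInfo report as text, checking the profile and entropy coding can be scripted. The Python sketch below is a rough illustration under several assumptions: the sample report is made up for this example (it is not the data behind Figure 5), the field labels follow the text layout I'd expect from MediaInfo and can vary between versions, and the function names are hypothetical.

```python
# Illustrative report only; labels and values are assumptions, not Figure 5.
SAMPLE_REPORT = """\
General
Format                                   : MPEG-4
Format profile                           : Base Media / Version 2

Video
Format                                   : AVC
Format profile                           : High@L3.1
Format settings, CABAC                   : Yes
Format settings, ReFrames                : 3 frames
"""


def parse_report(report: str) -> dict:
    """Split a text report into {section: {field: value}}."""
    sections = {}
    current = None
    for line in report.splitlines():
        key, sep, value = line.partition(":")
        if not line.strip():
            continue
        if not sep:
            # A non-empty line without a colon starts a new section.
            current = sections.setdefault(line.strip(), {})
        elif current is not None:
            current[key.strip()] = value.strip()
    return sections


def video_summary(report: str) -> dict:
    """Pull the profile, level, and CABAC flag out of the Video section."""
    video = parse_report(report)["Video"]
    profile, _, level = video["Format profile"].partition("@L")
    return {
        "profile": profile,
        "level": level,
        "cabac": video.get("Format settings, CABAC") == "Yes",
    }
```

Run against the sample, video_summary(SAMPLE_REPORT) reports the High Profile at level 3.1 with CABAC enabled.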
Like most advanced video compression technologies, H.264 uses interframe compression to eliminate redundancy between frames, which is why talking-head sequences encode much more easily than World Cup matches. H.264 implements interframe compression using the same three frame types deployed by MPEG-2: I-frames, B-frames, and P-frames.
Briefly, I-frames (also called keyframes) are encoded without reference to any other frame, using only intraframe compression, which is conceptually similar to a still-image technique such as JPEG. P-frames can look backward to previous I-frames or P-frames for redundancies, while B-frames can look forward and backward for redundancies, making B-frames the most efficient frame type. Like CABAC coding, however, this efficiency comes at a cost: files encoded with B-frames have higher CPU playback requirements.
This triggers the same analysis as with CABAC: "How much better is the quality, and how much higher are the CPU requirements?" Regarding the first question, files encoded with B-frames enjoy slightly higher quality than those encoded without B-frames, but only on high-motion files produced at the lowest possible bitrate. In terms of required CPU horsepower on the playback side, files with B-frames can consume up to 10% more CPU horsepower, but the difference is usually 5% or less.
For this reason, I recommend using B-frames when the profile supports them. As shown in Figure 6, from Telestream Episode Pro, typical B-frame-related parameters include the number of B-frames and the number of reference frames. The number of B-frames is the number of consecutive B-frames inserted between I- and P-frames. So at a setting of 3, the frame sequence would be IBBBPBBBPBBB ... and so on until the next I-frame.
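The frame sequence for any B-frame setting can be generated with a few lines of Python. This is a simplified sketch: the function name and the gop_length parameter are illustrative, it lists frames in display order, and it ignores adaptive frame-type decisions that real encoders may make.

```python
def frame_pattern(b_frames: int, gop_length: int) -> str:
    """Frame types for one group of pictures, starting at the I-frame."""
    types = []
    for position in range(gop_length):
        if position == 0:
            types.append("I")          # the keyframe opens the group
        elif position % (b_frames + 1) == 0:
            types.append("P")          # a P-frame closes each run of B-frames
        else:
            types.append("B")
    return "".join(types)
```

Calling frame_pattern(3, 12) gives IBBBPBBBPBBB, matching the sequence above, while frame_pattern(0, 4) gives IPPP, which is what you get when B-frames are disabled.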
The number of reference frames is the number of frames searched for interframe redundancies. Here you balance encoding time against the potential for quality improvement, since searching for these redundancies takes time. In most videos, redundancies occur most frequently in the frames immediately surrounding the frame being encoded, so reference frame values higher than three to five typically provide little additional quality. Figure 6 shows the settings I use and recommend for most encodings, a B-frame setting of 3 with three reference frames.
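For readers who encode from the command line, the same settings might translate into something like the following. This is a sketch under a big assumption: it targets ffmpeg with the libx264 encoder, neither of which is covered in this article, and the helper name and file names are hypothetical. The options used (-profile:v, -b:v, -bf, -refs, -coder) exist in ffmpeg builds I'm familiar with, but check `ffmpeg -h encoder=libx264` on your own build before relying on them.

```python
def h264_encode_args(src: str, dst: str, kbps: int = 800, profile: str = "high",
                     b_frames: int = 3, ref_frames: int = 3,
                     cabac: bool = True) -> list:
    """Build an ffmpeg/libx264 argument list (assumed toolchain, see lead-in)."""
    if profile == "baseline" and (b_frames > 0 or cabac):
        # Baseline supports neither B-frames nor CABAC.
        raise ValueError("Baseline needs b_frames=0 and cabac=False")
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",
        "-profile:v", profile,
        "-b:v", f"{kbps}k",
        "-bf", str(b_frames),        # consecutive B-frames between I/P frames
        "-refs", str(ref_frames),    # reference frames searched
        "-coder", "cabac" if cabac else "cavlc",
        dst,
    ]
```

The defaults mirror the settings recommended above (three B-frames, three reference frames, CABAC, High Profile); the list can be handed to subprocess.run if ffmpeg is installed.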