November 1, 2017
By Jan Ozer Contributing Editor

One Title at a Time: Comparing Per-Title Video Encoding Options

The blooper on the right is a bitrate graph (in the Bitrate Viewer application) showing the output of a synthetic test clip that mixes 30 seconds of talking head with 30 seconds of ballet. The concern is the swings in data rate, which can result in frequent ladder switches and stoppages (see my article “Bitrate Control and QoE—CBR is Better”). With capped CRF, you have no control over data rate swings within the file, which is a significant QoE concern for me. As a counterpoint, OVP JW Player uses capped CRF for its per-title encodes, and obviously the company wouldn’t use it if it caused significant issues.
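For readers unfamiliar with capped CRF, here is a minimal sketch of what such an encode might look like with FFmpeg and x264. FFmpeg is my assumption here, not necessarily the tool used for the tests, and the function name, CRF value, and cap are purely illustrative. The CRF value sets the quality target, while the maxrate and bufsize options cap the data rate; nothing in these settings limits how far the data rate can fall below the cap, which is where the swings come from.

```python
def capped_crf_args(src, dst, crf=23, cap_kbps=4500):
    """Build an FFmpeg argument list for a capped CRF encode with x264.

    The values are illustrative assumptions, not settings from the tests.
    """
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",
        "-crf", str(crf),                # quality target for the encode
        "-maxrate", f"{cap_kbps}k",      # ceiling on the data rate
        "-bufsize", f"{2 * cap_kbps}k",  # VBV buffer over which the cap applies
        dst,
    ]


# Print the command line rather than running it.
print(" ".join(capped_crf_args("input.mp4", "output.mp4")))
```

The list can be handed to subprocess.run() if FFmpeg is installed.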

Capella Systems

Capella Systems’ SABL is a feature of the company’s Cambria FTC encoder, which I tested in a review. SABL assesses video complexity by running a CRF encode of the target clip, and it allows the user to set the duration of the measurement period. If set to a short duration, like 10 seconds, the most complex 10 seconds in the clip determine the data rate of the entire clip. If set to a longer duration, say, 2 minutes, then a short, complex section won’t impact the overall data rate that much, but might look a bit degraded after the encode.
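To see why the measurement duration matters, consider this rough sketch. It is not Capella's code; the function name and the per-second data rates are hypothetical, and I'm assuming the complexity value behaves like the highest average data rate found in any window of the chosen duration.

```python
def peak_window_kbps(per_second_kbps, window_s):
    """Return the highest average data rate over any window of window_s seconds."""
    if window_s <= 0 or not per_second_kbps:
        raise ValueError("need a positive window and at least one sample")
    n = len(per_second_kbps)
    window_s = min(window_s, n)  # a window longer than the clip covers the whole clip
    total = sum(per_second_kbps[:window_s])
    best = total
    # Slide the window one second at a time, tracking the largest sum.
    for i in range(window_s, n):
        total += per_second_kbps[i] - per_second_kbps[i - window_s]
        best = max(best, total)
    return best / window_s


# Hypothetical CRF output: 30 seconds of talking head, then 30 seconds of ballet.
clip = [1000] * 30 + [5000] * 30
print(peak_window_kbps(clip, 10))   # short window: the ballet section dominates
print(peak_window_kbps(clip, 120))  # long window: the whole clip is averaged
```

With the 10-second window, the complex section alone sets the value; with the 2-minute window, the easy footage pulls it down.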

Besides setting the measurement duration, you also set how the data rate of the CRF encode, or complexity value, impacts the original encoding ladder. This is shown in Figure 2, where you can see that a value of more than 7000Kbps directs the encoder to shift the data rate of each rung in the encoding ladder upward by 50 percent. At the other end of the spectrum, Cambria will adjust the data rate of the encoding ladder downward by 50 percent if the complexity value is less than 2000Kbps.

Figure 2. Cambria adjusts the encoding ladder according to complexity value.
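The two ends of that mapping can be sketched in a few lines. This is not the Cambria script itself; the function names are hypothetical, I'm assuming the shift above 7000Kbps is upward, and I've left values between the two thresholds unchanged, which is a simplification of whatever intermediate steps the actual script contains.

```python
def scale_for_complexity(complexity_kbps, high=7000, low=2000, step=0.5):
    """Map a complexity value (Kbps) to a multiplier for the ladder's data rates."""
    if complexity_kbps > high:
        return 1 + step  # complex clip: raise every rung by 50 percent
    if complexity_kbps < low:
        return 1 - step  # simple clip: lower every rung by 50 percent
    return 1.0           # assumption: values in between leave the ladder as-is


def adjust_ladder(ladder_kbps, complexity_kbps):
    """Apply the multiplier to each rung, keeping the number of rungs fixed."""
    factor = scale_for_complexity(complexity_kbps)
    return [round(rate * factor) for rate in ladder_kbps]


ladder = [4000, 2000, 1000]  # hypothetical three-rung ladder
print(adjust_ladder(ladder, 8000))
print(adjust_ladder(ladder, 1500))
```

Adding more thresholds, or a rule that drops rungs below a floor, would follow the same pattern.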

The script shown in Figure 2 is completely customizable, and it includes the ability to adjust video resolution based on the complexity value and even to cut rungs below a certain data rate. Unfortunately, I didn’t test either of these capabilities because I didn’t know they were available, and didn’t think to ask until I analyzed the Brightcove results. By that point, it was too late to retest.

So, for this article, I compared the before and after results using the same number of rungs and resolutions. While this article is moving through the publishing process, I’ll rerun the analysis, and will update the results in the Cambria review on StreamingMedia.com.

To conclude our feature table analysis, Cambria’s data rate control is completely independent of complexity value; you can use CBR, VBR, or a mix on the different rungs in your encoding ladder.

Cambria had a perfect 14–0 record on our box score, and played error-free ball with two home runs. While Cambria’s savings lagged behind those delivered by both other technologies, note that this is a function of the script, not the product. Had I included lower adjustment values (if under 1200, cut to 20 percent of the original bitrate), I could have recouped much of the savings, and I’ll try this for the updated results. There were no saves in our current analysis, but SABL may cut some rungs during our updated tests, which would count as saves.

Cambria had two home runs, which look identical to the left side of Figure 1, and no bloopers, and overall proved to be a steady, reliable performer. Check the Streaming Media website to see if the updated tests show superstar potential.

Brightcove’s Context Aware Encoding

Brightcove’s Context Aware Encoding (CAE) is the newest product in the group, and it’s still in beta, or in the parlance of this article, the preseason. As you can see in Table 3, CAE hit the most home runs and earned the second-most saves, but also recorded the most errors, exclusively in the form of encoding ladder gaps in excess of 2.05x. So it will be interesting to see how CAE performs in the regular season (the product is scheduled to ship in Q4 2017).
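The ladder-gap error is easy to check for yourself. Here's a minimal sketch, assuming a gap is measured as the ratio between the data rates of adjacent rungs; the function name and the sample ladder are hypothetical.

```python
def ladder_gaps(ladder_kbps, max_ratio=2.05):
    """Return (higher, lower, ratio) for adjacent rungs spaced wider than max_ratio."""
    rates = sorted(ladder_kbps, reverse=True)
    gaps = []
    # Compare each rung with the next one down the ladder.
    for higher, lower in zip(rates, rates[1:]):
        ratio = higher / lower
        if ratio > max_ratio:
            gaps.append((higher, lower, round(ratio, 2)))
    return gaps


# Hypothetical ladder: only the jump from 2000Kbps to 4500Kbps is too wide.
print(ladder_gaps([4500, 2000, 1000, 500]))
```

An empty list means every rung sits within 2.05x of its neighbor.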

Technically, CAE considers four factors when producing its standard encoding ladder:

The properties of the content
The distribution of user devices (connected TVs, PCs, smartphones, tablets, etc.)
The properties of user devices and networks
The constraints specific to video codecs, like encoding profiles

To evaluate content, the CAE profile generator runs several “probe” encodes over the video, measuring quality with SSIM and other metrics, and then assigns a mathematical model to the video.

Interestingly, within the context of Brightcove’s core OVP business, the two middle parameters mean that the devices that you distribute to and their effective bandwidth will impact your encoding ladder. So, if you’re targeting Android phones over 3G connections in a developing country, you’ll get one ladder; if targeting 4G in Scandinavia, you’ll get another.

As shown in Table 1, CAE offers the broadest feature set of the four new technologies, with the ability to add or cut rungs from the encoding ladder, change output resolutions, and apply post-encode quality checks. CAE is also the most configurable. When defining the parameters of the dynamic output, you can specify the minimum and maximum number of renditions, min/max resolution, max frame rate, key frame rate, min/max bitrate, max first rendition bitrate, min/max SSIM for the quality check, and the H.264 profiles for the encoded files. Then you can also specify all of the normal encoding parameters like bitrate control technique, b-frame and reference frame parameters, and the like.

CAE is a complex work in progress, and several of the issues discovered during our testing have already been resolved. So I’ll focus on the positives of Brightcove’s approach and reserve a full consideration of strengths and weaknesses for when the product ships.

The best results were shown by the Tutorial file, a PowerPoint-based video that has been deployed by corporations around the globe. The original encoding ladder is shown as Table 1; the CAE ladder is on the left in Figure 3, along with how the values changed from the control to the CAE output.

Figure 3. One of the many home runs hit by Brightcove

Because the underlying content was so simple, CAE encoded this clip in only three rungs, with the 1080p rung so compact that viewers down to the fifth rung could view it at a lower data rate than the original file in the control ladder (which was 900Kbps). Despite dropping the 1080p data rate by more than 534 percent, CAE retained a PSNR value of over 45, and the video proved crisp and artifact-free. Benefits in lower rungs were even greater, with PSNR values boosted by as much as 41 percent and VMAF scores that would translate to vastly superior QoE. Overall, Brightcove also shaved 33 rungs from the 13 encoding ladders that it completed, accounting for the 33 saves.

Brightcove produced 14 videos, but kicked one out because some of the rungs could not be produced at the quality level specified in the script. In practice, you would reencode at a lower quality level, but we didn’t have time to get that done for this article.

From a commercial perspective, Brightcove expects to offer the functionality as an add-on service for its Video Cloud customers, but pricing is still TBD. At press time, Brightcove hadn’t decided whether to offer a standalone version of the service for Zencoder customers.

FASTech.io’s Video Optimizer

We took only a cursory glance at FASTech, which is a startup hosted at the Qualcomm Institute Innovation Space and at StartR, an accelerator at the Rady School of Management at the University of California–San Diego. According to a brochure on its website, FASTech’s Video Optimizer “employs proprietary technology to build a model for each video based on FASTech’s large-scale quantitative analyses and subjective perceptual tests.” This is used to “find optimal compression settings and obtain high compression rates while keeping excellent perceptual quality.” The technology is currently available as a cloud service, with pricing based upon bandwidth savings or a fixed license.

Referring to Table 1, FASTech deploys a fixed output ladder in terms of the number and resolution of rungs, adjusting only file data rate. We tested three files with FASTech. Results were good in the higher motion file, but FASTech overcooked the two lower motion files, dropping the data rate of the screencam file by 118 percent, but also dropping PSNR from 56.99 to 39.32 and VMAF by 7.14. However, I didn’t have time to deploy the available post-encode quality checks, which could have avoided these issues.

FASTech’s technology was originally deployed to optimize single files for archiving and similar purposes, and the company only recently added the ability to create encoding ladders. So consider it a work in progress. That said, if you’re building a cloud-based encoding workflow, and want to add per-title capabilities, FASTech is your only option.

[This article appears in the October 2017 issue of Streaming Media Magazine as "One Title at a Time."]