Buyers' Guide: Per-Title Encoding
Per-title encoding is one of the simplest and least expensive ways to reduce streaming bandwidth, increase viewer QoE, or accomplish both. It's becoming a standard feature in cloud-encoding services, and in this buyers' guide, I'll cover what you need to know to choose the best per-title option for your cloud-encoding scenario.
In per-title encoding, an encoding service creates a unique encoding ladder for each video file. While available for live streaming, per-title encoding is most often utilised with video on demand (VOD), and this will be the focus of this article.
As with all buyers' guides, company mentions are meant to be representative, not exhaustive. If your company isn't mentioned, feel free to add a note below.
Per-Title Encoding Overview
When you compress video using per-title encoding, the encoder analyses the file to gauge complexity and then creates an encoding ladder specifically for that file. Some services use artificial intelligence to gauge complexity; others use simple mechanisms like Constant Rate Factor encoding. But all incorporate the file-analysis component.
From there, services differ in the per-title controls they make available to the users, the inputs the service considers when creating the ladder, and the configuration items the service actually customizes according to these inputs. I'll dedicate a section to each of these differences and conclude with tests you can run to gauge the performance of individual per-title technologies.
Per-Title Encoding Controls
Per-title encoding controls comprise the configuration options you can customize when inputting the file. This information should be available in the service's user interface or documentation. With some services, you have no ability to impact the output; you upload the file, the service analyses it and creates the ladder, and you get what you get. Most publishers need more control than this. At the very least, you need the ability to set the minimum and maximum file data rates. Otherwise, the lowest rung of the encoding ladder may have a data rate that's higher than the minimum connection speed you want to support, and the maximum data rate may exceed your budgeted goals.
As an example, Figure 1 shows the options available when configuring a per-title encode using AWS Elemental's Automated ABR encoding feature from the program's user interface. (Options available via AWS Elemental's API may be different.) You can see the minimum and maximum data rate, which is important for the reasons previously stated.
Figure 1. Input configuration options available for AWS Elemental’s Automated ABR per-title encoding
You can also see the maximum renditions, which control the maximum number of rungs on the ladder. This is important for two reasons. First, since services charge for encoding by the output rung, controlling the number of rungs controls encoding costs. Second, it controls the efficiency of rung switching.
To explain, the general rule of thumb is that the data rate in the encoding ladder should increase by 1.5x–2x per rung. So, if Rung 3 is 1,000Kbps, Rung 4 should be between 1,500Kbps and 2,000Kbps. If the rungs are closer, the quality difference is minimal, obviating the need for the switch. If they are too far apart, the encoding ladder could strand viewers connecting via low bitrate connections on ugly rungs.
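As a rough illustration of that rule of thumb, the Python sketch below computes the step between adjacent rungs and labels each one. The function names and the sample ladder are hypothetical and chosen only to show the check; they don't come from any encoding service.

```python
def step_ratios(bitrates_kbps):
    """Return the ratio between each pair of adjacent rungs, lowest rung first."""
    ladder = sorted(bitrates_kbps)
    return [higher / lower for lower, higher in zip(ladder, ladder[1:])]


def classify_steps(bitrates_kbps, low=1.5, high=2.0):
    """Label each step against the 1.5x-2x rule of thumb."""
    labels = []
    for ratio in step_ratios(bitrates_kbps):
        if ratio < low:
            labels.append("too close")   # minimal quality gain from switching
        elif ratio > high:
            labels.append("too far")     # viewers may get stranded on a low rung
        else:
            labels.append("ok")
    return labels


# Hypothetical ladder in Kbps, used only to exercise the check.
example_ladder = [400, 1000, 1200, 2000]
print(classify_steps(example_ladder))  # ['too far', 'too close', 'ok']
```

The 1.5 and 2.0 bounds are parameters, so you can tighten or loosen them to match your own switching targets.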
If you specify a range like 200Kbps–6Mbps, as in Figure 1, a maximum rung count of seven is sufficient to produce rungs that meet the 1.5x–2x target. If the service doesn't let you limit the number of rungs, you could end up boosting your encoding costs with no real benefit to QoE.
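To see why seven rungs works for that range, here is a minimal Python sketch that spaces rungs evenly on a logarithmic scale between the minimum and maximum data rates. Even spacing is an assumption made for illustration; a real per-title service chooses rung positions from its own content analysis, and the function names are hypothetical.

```python
import math


def geometric_ladder(min_kbps, max_kbps, rungs):
    """Return (step, bitrates) for rungs spaced by a constant ratio."""
    if rungs < 2:
        raise ValueError("need at least two rungs")
    step = (max_kbps / min_kbps) ** (1 / (rungs - 1))
    return step, [round(min_kbps * step ** i) for i in range(rungs)]


def rung_count_range(min_kbps, max_kbps, low=1.5, high=2.0):
    """Fewest and most rungs that keep an even step within [low, high]."""
    span = math.log(max_kbps / min_kbps)
    eps = 1e-9  # guards against floating-point error at exact powers
    fewest = math.ceil(span / math.log(high) - eps) + 1
    most = math.floor(span / math.log(low) + eps) + 1
    return fewest, most


step, ladder = geometric_ladder(200, 6000, 7)
print(round(step, 2))                # 1.76, inside the 1.5x-2x target
print(ladder)
print(rung_count_range(200, 6000))   # (6, 9)
```

Under this even-spacing assumption, anything from six to nine rungs keeps the step within the target for a 200Kbps to 6Mbps range, so a cap of seven leaves room without paying for rungs that add little.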
You may also want the ability to set the minimum and maximum resolution of the ladder rungs and the ability to force ladder rungs at specific resolutions. For example, if you deploy 640x360 viewing windows on your website, you may want multiple rungs encoded at that resolution.
It's also useful to be able to set a quality level for your per-title encodes, which not all services enable. As previously mentioned, all services analyse the incoming file to determine the optimal ladder configuration. Some services, like AWS Elemental and Tencent, simply create the ladder from there. Others, like Bitmovin and Brightcove, allow you to choose a quality level that adjusts both quality and data rate up or down.
This is useful in several instances. First, most publishers have a preselected target quality level for their top rung, usually between 93 and 95 Video Multimethod Assessment Fusion (VMAF) points. If the per-title service doesn't hit that target consistently using default settings, you can adjust that with the quality setting. Also, if you have different quality targets, say, for free and subscriber videos or user-generated and premium content, you can use the quality configuration to adjust the quality for these different targets.
Creating a Per-Title Encoding Ladder
The vast majority of cloud services analyse the input file, gauge complexity, and create the ladder. Some services, like Brightcove's Context Aware Encoding, take into account "constraints associated with the delivery network and device being used to view the content." This makes perfect sense, as you'd create a totally different ladder if you were serving Android phones connecting via 3G or smart TVs connecting via a high-speed connection, even if both ladders were delivering the same source file.
Obviously, you need the distribution data, which means that this capability is only available when the service encoding your files is also distributing them. None of the other services mentioned herein offer this capability, although you can use the configuration options previously discussed to customise the ladder for the audience.
Adjusting Per-Title Configuration Options
All per-title encoding services customise the data rate of video files when customising an encoding ladder, but several don't adjust the number of rungs in the ladder or ladder resolution. This static approach degrades quality in many scenarios.
For example, assume a service uses three rungs for every file. This might work well for animated files that might need a top rung of 1,500Kbps, but it would be a poor alternative for a soccer match that needs a top rung of 6,000Kbps to achieve VMAF 94. A fixed ladder with seven rungs would deliver the soccer match well but would create unneeded rungs for the animation.
You typically also want to use larger resolutions for simple files like talking heads and animations and smaller resolutions for high-motion videos with lots of details. Encoding ladders that only adjust data rates can't optimise for these different videos.
Most of the major players adjust the data rate, rungs, and resolution, and this should be a minimum requirement for the services that you consider. Beyond this, advanced source formats raise additional requirements that some services may not meet.
For example, if you're distributing High Dynamic Range (HDR) content, make sure that your candidate services can handle HDR, and check whether they can automatically switch from HDR to SDR within the ladder. Ditto if you're publishing high-frame-rate video like 60 fps sports content. At some point in the ladder, you'll probably want to switch from 60 fps to 30 fps. If you're working with HDR or high-frame-rate content, ask about these capabilities early in the process.
One other consideration relates to compatibility issues like codec profiles. With some services, you can choose the H.264 profile (baseline, main, or high) for the entire ladder but not for individual rungs. Others let you specify a profile for each rung, which is essential for those who still wish to produce lower rungs using the baseline or main profile for legacy devices.
Other Per-Title Encoding Issues
A couple of points in the all-important "other" category: first is price. Some services charge a premium for per-title, which makes sense since it may need another analysis pass. Others require that you use certain encoding parameters that cost more and increase the overall price of encoding. Since cost is always important, inquire about the price of per-title early in the process.
The second issue relates to shot-based encoding, which is what per-title encoding will evolve to over the next few years. As the name suggests, with shot-based encoding, the encoder divides the input video into the various shots that comprise it, and it customises encoding for each shot rather than the title as a whole. Netflix switched to shot-based encoding in 2018.
Shot-based encoding is superior to per-title because it's more efficient than using arbitrary keyframe intervals. Specifically, Netflix reports that shot-based encoding could deliver the same quality as per-title at a 17.1% lower bitrate or 3.7 additional VMAF points at the same bitrate. As far as I know, none of the services discussed herein offer shot-based encoding, but it should be available from these or other vendors sometime in 2022 or 2023.
How to Compare Per-Title Encoding Services
Once you have a short list, here are some tests that you can perform to further compare the services.
First, find two files that are vastly different in terms of complexity, say, a soccer match and a simple animation. Clips that are 1–2 minutes long should suffice. Upload them both to the service, and encode using the per-title encoding feature. Download the encoded files, and do the following.
To begin, compute VMAF for the top rung of both ladders. If the difference between the two files is greater than two to three points, it may mean that the service's complexity measurement isn't that accurate, so you can't rely on it to produce consistent quality across all content types.
I performed this exercise with two files and AWS Elemental MediaConvert using the default settings, which means there were no limits on the lower or upper bitrate or the number of ladder rungs. The results are shown in Table 1. The VMAF score for the cartoon (at 1.77Mbps) was 94.02, while the VMAF score for the soccer match, encoded at over four times the data rate, was 96.61. The 2.59-point difference is within the two-to-three-point range, which tells us that AWS Elemental MediaConvert's quality gauge delivers consistent quality over very disparate file types.
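The comparison is simple enough to script once you have the scores. The Python sketch below uses the two top-rung VMAF scores reported above; the function names are hypothetical, and the 3.0-point default tolerance is an assumption taken from the upper end of the two-to-three-point guideline.

```python
def top_rung_spread(vmaf_scores):
    """Difference between the highest and lowest top-rung VMAF scores."""
    values = vmaf_scores.values()
    return max(values) - min(values)


def is_consistent(vmaf_scores, tolerance=3.0):
    """True when the spread stays within the chosen tolerance (an assumption)."""
    return top_rung_spread(vmaf_scores) <= tolerance


# Top-rung scores from the MediaConvert test described in the article.
scores = {"cartoon": 94.02, "soccer": 96.61}
print(round(top_rung_spread(scores), 2), is_consistent(scores))  # 2.59 True
```

With a stricter tolerance of 2.0, the same pair would be flagged, so pick the threshold that matches how much top-rung variation you're willing to accept.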
Table 1. Per-title encoding by AWS Elemental MediaConvert for a simple cartoon and a soccer clip
Next, gauge whether the bitrate/quality score delivered by the service matches your targets, particularly if the service doesn't offer quality adjustments. The VMAF scores delivered by AWS Elemental MediaConvert are appropriate for premium content but may be too high for user-generated content.
Then, you need to ascertain the bitrates and resolutions of all files in the encoding ladders with MediaInfo or a similar tool and plug the values into a spreadsheet like Table 1.
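If you'd rather build that spreadsheet with a script, the Python sketch below turns per-rung values into CSV text that any spreadsheet program can open. It assumes you have already read the bitrate and resolution of each file from MediaInfo by hand or by other means; the field names and functions are hypothetical, and the two sample rows are the only rung values quoted in this article.

```python
import csv
import io


def ladder_rows(rungs):
    """Sort rungs by title, then bitrate, and add a 'WxH' resolution column."""
    ordered = sorted(rungs, key=lambda r: (r["title"], r["kbps"]))
    return [
        {
            "title": r["title"],
            "kbps": r["kbps"],
            "resolution": f'{r["width"]}x{r["height"]}',
        }
        for r in ordered
    ]


def to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["title", "kbps", "resolution"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Values quoted in the article; add one entry per rung for each ladder you test.
rungs = [
    {"title": "soccer", "kbps": 606, "width": 480, "height": 270},
    {"title": "animation", "kbps": 469, "width": 960, "height": 540},
]
print(to_csv(ladder_rows(rungs)))
```

Sorting by title and then bitrate keeps each ladder's rungs together, which makes it easier to compare rung counts and resolutions side by side.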
AWS Elemental MediaConvert customises the number of rungs depending on the content (which is good). It will also customise the resolution for the different content types, deploying higher-resolution rungs further down into the encoding ladder. At 469Kbps, the animated file was 960x540; at 606Kbps, the soccer match was 480x270. This is good as well, since it will optimise quality for the different file types.
If you want to consistently serve data rates below 400Kbps or so, you'll have to configure the minimum bitrate setting, as shown in Figure 1.
These simple tests should only take a few minutes to complete and will go a long way toward identifying the best cloud-based per-title encoding alternative for your service.
[Editor's note: This article first appeared in the 2022 Streaming Media Industry Sourcebook.]