
How Netflix Pioneered Per-Title Video Encoding Optimization


The only tool I’m aware of that can perform cross-resolution, device-aware testing is the SSIMWave Video Quality-of-Experience Monitor (SQM), which can deliver an SSIMplus rating (see the review). However, that capability applies only to its proprietary SSIMplus algorithm, not to PSNR, which the tool can also calculate. Here’s an excerpt from the SQM manual on the topic.


Figure 3. The only cross-resolution test tool I know can’t produce cross-resolution PSNR values.

When we asked about this, Ronca responded, “We use scaled PSNR for all quality metrics so the encoded video would be scaled back to the original resolution.” Essentially, this means that Netflix scales lower-resolution files, like the 720x480 variant, back up to 1080p to compare to the original 1080p file. After a bit of research and input from compressionists with experience in this area (thanks, Fabio), I learned that this is pretty standard fare. A commenter to the original article also shared this explanation. Mystery solved.
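To make the scaled-PSNR idea concrete, here is a minimal sketch in Python with NumPy. It upscales the lower-resolution frame back to the source resolution (nearest-neighbor for simplicity; Netflix does not disclose its scaling filter, and a real pipeline would likely use bicubic or the player's actual scaler) and then computes standard PSNR against the original. The function names and the nearest-neighbor choice are my own illustration, not Netflix's implementation.

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two same-size frames."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(max_val ** 2 / mse)

def scaled_psnr(ref, low_res):
    """Upscale a lower-resolution decoded frame back to the reference
    resolution, then compare. Nearest-neighbor upscaling (via np.kron)
    is used purely for illustration; production tools use better filters."""
    ry, rx = ref.shape
    ey, ex = low_res.shape
    upscaled = np.kron(low_res, np.ones((ry // ey, rx // ex)))
    return psnr(ref, upscaled)
```

Because every variant is compared at the same (source) resolution, the scores for 480p, 720p, and 1080p encodes become directly comparable, which is what makes the cross-resolution analysis in the Netflix post possible.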

Why PSNR?

At this point, I should probably also express my bias against PSNR as the basis for making compression-related decisions. I explain part of this in my post, "Why I like VQM better than PSNR or SSIM," and confess to a growing appreciation for the SSIMplus metric, which ties to anticipated viewer ratings, can perform cross-resolution testing, and is device specific. To be fair, Netflix acknowledges PSNR's deficits by stating:

Although PSNR does not always reflect perceptual quality, it is a simple way to measure the fidelity to the source, gives good indication of quality at the high and low ends of the range (i.e. 45 dB is very good quality, 35 dB will show encoding artifacts), and is a good indication of quality trends within a single title.

I agree that PSNR is a good indication of quality trends in a file, but if it doesn’t “always reflect perceptual quality,” why not base this analysis on a different metric like VQM (scaled as Netflix is scaling PSNR) or SSIMplus? You can browse through some test results that compared PSNR, SSIM, VQM, and SSIMplus in a post titled, The SSIMplus Index for Video Quality-of-Experience Assessment. Note that it was produced by employees of SSIMWave, the developer of SSIMplus.

Interestingly, in the paper Optimal Set of Video Representations in Adaptive Streaming, a dense work mentioned in a comment on the Netflix blog, the researchers stated that, “we model the satisfaction function as an Video Quality Metric (VQM) score [19], which is a full-reference metric that has higher correlation with human perception than other MSE-based metrics.” MSE means mean squared error, and PSNR is such a metric.

In short, if SSIMplus is better, and VQM is better, why use PSNR, particularly if you're drawing quality-related conclusions from the scores, not just quality trends? Interestingly, in the blog post, Netflix referenced the VMAF quality metric, which it is co-developing. So we asked whether Netflix was using PSNR, or VMAF, to dictate their quality decisions. Ronca responded, “We use both VMAF and PSNR. VMAF is still in development and the bitrate resolution decisions are made using PSNR. VMAF will get promoted when we have confidence.”

As a follow-up, we asked, “Why PSNR as opposed to VQM or SSIM or SSIMplus?” Ronca commented, “Good question, and other metrics may have provided better results.” Then he continued, “PSNR was already built into our tools so we get the data [computationally] free and it required minimal dev. Thus far, it appears to be working well, with VMAF and internal subjective tests confirming our recipe. Also, we limit use to a prediction metric to drive some codec recipe decisions, where some of the issues with PSNR maybe not so important. As I mentioned, we will eventually use VMAF to drive the codec decisions.”

The Caveats

The caveats are fairly clear. First, Netflix is a subscription service, so all bandwidth related to file delivery is fully funded. When Netflix ascertains the highest bitrate for a title, quality is the critical factor for most files.

But what about non-subscription services? Of course, all videos are funded one way or another, whether by advertising or by dipping into the marketing or training budget. In most of these non-subscription cases, the maximum data rate is dictated by bandwidth cost, not quality. When cost determines the maximum bitrate, the analysis becomes which resolution delivers the best quality at that data rate, which is why effective cross-resolution testing is so essential. That is, if you say 3Mbps is the limit, the analysis becomes whether 1080p, 720p, or 540p delivers the best quality at that rate.
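That decision reduces to a simple comparison once you have cross-resolution (scaled) quality scores. The sketch below uses hypothetical scaled-PSNR numbers for one title encoded at a 3Mbps cap; the values and the function are illustrative only, not measurements from Netflix or any real tool.

```python
# Hypothetical scaled-PSNR scores (dB) for one title encoded at a
# fixed 3 Mbps cap at three resolutions. Values are invented to
# illustrate the decision, not measured results.
measurements = {
    "1080p": 34.8,  # too many pixels for the bits: visible artifacts
    "720p":  37.2,  # the sweet spot for this hypothetical title
    "540p":  36.5,  # clean per pixel, but upscaling softens detail
}

def best_rung(scaled_scores):
    """Pick the resolution with the highest quality score. Because all
    variants were scaled back to the source resolution before scoring,
    the numbers are directly comparable across resolutions."""
    return max(scaled_scores, key=scaled_scores.get)

print(best_rung(measurements))  # → 720p
```

The point is that without cross-resolution testing, the three scores live on different scales and the comparison is meaningless; with scaling, picking the rung is a one-line max.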

Also, as an OTT service, Netflix displays most videos at full screen. In contrast, most producers delivering shorter content produce for a smaller display window. This is why Apple’s TN2224 has two bitrates at the 640x360 window, while Netflix has none. The first rule of producing for adaptive streaming is to have at least one stream for each window size on your website, and many producers use two or more.
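The window-size rule above is easy to check mechanically. This sketch flags display windows that lack a matching rung in the ladder; the ladder and window lists are illustrative (width, height) pairs, not taken from TN2224 or Netflix.

```python
def uncovered_windows(ladder, windows):
    """Return the display windows that have no exact-resolution rung
    in the encoding ladder. Both arguments are lists of (width, height)
    tuples; the sample data below is hypothetical."""
    rungs = set(ladder)
    return [w for w in windows if w not in rungs]

# A site with a 640x360 player needs a 640x360 rung, per the rule above.
ladder = [(1920, 1080), (1280, 720), (960, 540)]
windows = [(1280, 720), (640, 360)]
print(uncovered_windows(ladder, windows))  # → [(640, 360)]
```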

So while the Netflix blog post breaks new ground in justifying content-aware encoding, few producers should adopt its findings whole cloth.

Summary, Conclusion, and Pitch

Overall, the Netflix post draws a bold line in the sand: a single encoding ladder is insufficient for companies distributing disparate types of videos. As mentioned above, even if you’re distributing relatively homogenous videos, an encoding ladder not customized for your video type is almost certainly suboptimal.

In short, TN2224 is dead (at least for broad-brush implementations). Welcome to the new era of content-aware encoding.

Content Aware Encoding: The Webinar

I’m all-in on content-aware encoding and, truth be told, was before the Netflix post (as an editor and several encoding clients can attest). But the Netflix post crystallized my thoughts on the matter and added some valuable procedural workflows. On January 28, 2016, at 2:00 PM EST, I’ll present a webinar detailing a simple but effective technique for implementing content-aware encoding for your content.

In the webinar, you will learn:

• Lessons learned from the Netflix blog post, including some not shared above

• A simple procedure for identifying the optimal encoding ladder for each category of content (even if you only have one type)

• How to verify the quality of the various stream compositions with objective benchmarks like PSNR, SSIMplus, and VQM, as well as tools like the SSIMWave Quality of Experience Monitor

• How encoding with HEVC changes the equation

During the webinar, you’ll see how I applied the procedure to produce content-aware encoding ladders for high-motion video (Tears of Steel), simple animations (Big Buck Bunny), complex animations (Sintel), talking head video (yours truly), screencam videos, and videos combining PowerPoint slides with talking heads.

You’ll walk away knowing how to test the encoding complexity of your own footage and create a content-aware encoding ladder. You’ll also receive encoding ladder templates you can immediately deploy if you have content in the above-described categories.

The webinar will cost $30.72, and you can read more about the content and sign up at this post on StreamingLearningCenter.
