How to Find the Optimal Data Rate for Encoding
Watch the complete panel from Streaming Media East Connect, How to Fine-Tune Your Encoding with Objective Quality Metrics on the Streaming Media YouTube channel.
Read the complete transcript of this clip:
Jan Ozer: The first thing I do when I'm trying to come up with an encoding ladder--whether it's an encoding ladder for a single file, or it's an encoding ladder for a category of files--is to find the ceiling. The ceiling is the lowest full-resolution data rate that delivers acceptable quality. If you're working with a 1080p source file, the highest-quality 1080p encode that you're actually going to ship is the ceiling. To find the ceiling, I first use a technology called CRF, or constant rate factor. Then I get confirmation from VMAF, and we'll look at Hollywood proof as well. And then, once we figure out the ceiling, it becomes a simple matter to choose the rungs of the ladder and choose their data rates.
So what is constant rate factor encoding? Constant rate factor encoding is an encoding mode in x264, x265, and VP9. It adjusts the data rate to achieve a target quality as opposed to adjusting quality to achieve a target data rate. When we encode to CBR or VBR, we say, "Give me 500 kilobits per second," and the quality goes up and down to match that data rate. With CRF, it's just the opposite. It's "Give me this quality level and adjust the data rate up and down to give me that quality level."
The quality range for CRF is 1-51, and the lower the score, the higher the quality. So if you're looking at a simple FFmpeg script, what you would see is that this is a data rate-focused encoding string where I'm saying, "Here's the input file, here's the output file. Give me 500 kilobits per second and vary the quality if you need to, in order to achieve that target data rate." And then here's CRF again, input/output. You say, "Give me CRF 23 quality and vary the data rate up and down to deliver that."
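The two encoding strings described above can be sketched as follows. This is a minimal illustration, not the exact script from the slide: the file names are placeholders and the helper names bitrate_command and crf_command are hypothetical. It assumes FFmpeg with the libx264 encoder, where -b:v sets a target video bitrate and -crf sets a target quality level.

```python
def bitrate_command(src, dst, kbps=500):
    """Data rate-focused encode: hold the data rate, let quality vary."""
    return ["ffmpeg", "-i", src, "-c:v", "libx264", "-b:v", f"{kbps}k", dst]


def crf_command(src, dst, crf=23):
    """Quality-focused encode: hold the CRF level, let the data rate vary."""
    return ["ffmpeg", "-i", src, "-c:v", "libx264", "-crf", str(crf), dst]
```

Passing either list to subprocess.run would start the encode. For example, " ".join(crf_command("input.mp4", "output.mp4")) gives the string "ffmpeg -i input.mp4 -c:v libx264 -crf 23 output.mp4".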
In this case, CRF is giving us a measure of encoding complexity, meaning it's telling us how hard that file is to encode. From that, we can figure out what the ceiling is.
So how do we do that? I wrote a book called Video Encoding by the Numbers. In the book, I used these eight files to measure pretty much every key decision you'd make in developing an encoding ladder, such as, "What's the ceiling? What's the keyframe interval, bit rate control, B-frame interval, reference frame?"--all that stuff.
So I used a bunch of different files because we all know that files encode differently. We've got some movie files in here. We've got some animated files, some synthetic files for business use, and then we've got a music video and some simple talking-head files. I encoded the files using CRF 23, as we just saw on the previous slide. And what I found was that the data rate varied from 1 megabit per second to over 6 megabits per second.
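One simple way to read the data rate of a CRF encode is to divide the output size by its duration. The sketch below assumes you already know both numbers. The function name is hypothetical, and because it uses the whole file size, the result includes audio and container overhead as well as video.

```python
def average_data_rate_mbps(size_bytes, duration_seconds):
    """Average data rate of an encoded file in megabits per second."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    bits = size_bytes * 8  # bytes to bits
    return bits / duration_seconds / 1_000_000  # bits per second to Mbps
```

As a worked example, a 60-second file of 7,500,000 bytes averages 1 megabit per second, and a 60-second file of 45,000,000 bytes averages 6 megabits per second.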
For the PowerPoint and talking head tutorial--which would look very much like what you're looking at now--I could get a data rate of 1 megabit per second, and a VMAF score of 96.68. Any score above 93 is going to be pretty much perfect to the viewer.
With a tutorial, 1 megabit per second did it; with a movie-like production, 6 megabits per second gave me 92.74. So I'm almost there; that would probably be a pretty high-quality video file.
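VMAF scores like these can be computed with FFmpeg's libvmaf filter, assuming an FFmpeg build that includes libvmaf. A minimal sketch follows. The helper name and file names are hypothetical, and it assumes the encoded file and the source share the same resolution and frame rate.

```python
def vmaf_command(encoded, source):
    """Build an FFmpeg command that scores `encoded` against `source`.

    The libvmaf filter takes the encoded (distorted) file as the first
    input and the source (reference) as the second. The output is
    discarded, and the score appears in FFmpeg's log.
    """
    return ["ffmpeg", "-i", encoded, "-i", source,
            "-lavfi", "libvmaf", "-f", "null", "-"]
```

Running the resulting command through subprocess.run would print the VMAF score at the end of FFmpeg's log output.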
But what we're seeing is that the data rate varied by over 600%, while the VMAF ratings ranged from 92.74 to 96.88. That's a standard deviation of 1.39, which is pretty minimal considering that the VMAF scores go from 0 to 100. With a talking head, I can use half the bitrate and get pretty much the same VMAF score as a haunted house-type video, which for me validates the benefit of per-title encoding. If you're not looking at per-title--if you're using a fixed bitrate ladder for a range of input sources--then you're either wasting bandwidth on one hand, not delivering sufficient quality on the other hand, or some measure of both.
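The fixed-ladder problem can be illustrated with a small comparison. This is a sketch under stated assumptions: the 3 Mbps fixed top rung is a made-up figure for illustration, the function name is hypothetical, and the 1 and 6 Mbps values are the approximate CRF 23 data rates mentioned above for the tutorial and the movie-like file.

```python
def fixed_rung_mismatch(needed_mbps, fixed_mbps):
    """Compare a fixed top rung with the data rate a title needed at CRF 23."""
    if needed_mbps <= 0:
        raise ValueError("needed_mbps must be positive")
    ratio = fixed_mbps / needed_mbps
    if ratio > 1:
        return f"wasting bandwidth: {ratio:.1f}x the needed rate"
    if ratio < 1:
        return f"quality at risk: only {ratio:.0%} of the needed rate"
    return "matched"


# Hypothetical fixed rung of 3 Mbps applied to both kinds of content.
tutorial = fixed_rung_mismatch(needed_mbps=1.0, fixed_mbps=3.0)
movie = fixed_rung_mismatch(needed_mbps=6.0, fixed_mbps=3.0)
```

With that assumed 3 Mbps rung, the tutorial gets three times the data rate it needs while the movie gets half, which is the "some measure of both" case.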
So, if you're trying to find the ceiling, encode the file or a category of files at CRF 23, which is going to correspond to a VMAF score of anywhere between 93 and 96. That should be sufficient quality to ship without noticeable degradation in the file.
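The ceiling check described here can be sketched as a small helper. It assumes you have already encoded each file at CRF 23 and measured its data rate and VMAF score. The function name, the dictionary layout, and the default threshold of 93 (taken from the "any score above 93" remark) are illustrative choices, and the 6.0 figure for the movie is the approximate rate quoted above.

```python
def ceiling_from_crf(measurements, min_vmaf=93.0):
    """Turn CRF 23 measurements into per-title ceiling candidates.

    measurements maps a title to (data rate in Mbps, VMAF score).
    """
    report = {}
    for title, (mbps, vmaf) in measurements.items():
        report[title] = {
            "ceiling_mbps": mbps,               # data rate CRF 23 produced
            "meets_target": vmaf >= min_vmaf,   # confirmed by VMAF or not
        }
    return report


report = ceiling_from_crf({
    "tutorial": (1.0, 96.68),
    "movie": (6.0, 92.74),
})
```

Here the tutorial's 1 Mbps ceiling is confirmed by VMAF, while the movie at 92.74 falls just under the threshold, matching the "almost there" description, so it might call for a slightly lower CRF value.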