How to Balance Encoding Time and Quality
Watch the complete panel from Streaming Media East Connect, How to Fine-Tune Your Encoding with Objective Quality Metrics, on the Streaming Media YouTube channel.
Learn more about optimizing your encoding at Streaming Media West 2020.
Read the complete transcript of this clip:
Choosing the optimal encoding time and quality trade-off. In this case, we're going to be looking at this within the context of an x264 preset. All encoders and codecs have configuration options that trade off quality versus time. Most HEVC encoders have this, and some H.264 encoders do, particularly in FFmpeg. The technique I'm going to use lets you visualize how they differ and choose the best option. As I said, we're going to be looking at this within the context of choosing the preset for x264.
So, what are presets? Presets are simple ways to adjust multiple parameters to trade off encoding speed versus quality. And if you've got an x264 encoder, whether it's FFmpeg or one of the cloud encoders, you're going to have access to these parameters. Medium is generally the default preset. But is Medium necessary? It's going to be much slower than some presets. It's going to be much faster than others. Is this the quality/encoding time trade-off that you want? Any time I look at an encoder, any time I look at a new codec, this is the kind of analysis that I apply. And what I do first is choose a test file.
Then I encode to different presets that are targeting around 96 VMAF points. I want to target that because that's the effective zone where people are going to be encoding their files for the most part. And that means that all files are going to be encoded at different bitrates. Because of the variation here, as we saw with the CRF calculation, I can't encode them all at 3-6 megabits per second. I've got to customize them to hit that 93 to 95 VMAF point.
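The bitrate customization described above can be sketched as a simple search. This is a minimal illustration, not the workflow used in the talk: `build_encode_command`, `find_bitrate_for_target`, the output naming scheme, and the `fake_measure` stand-in are all hypothetical, and the search assumes that VMAF does not drop as bitrate rises. The ffmpeg options shown (`-c:v libx264`, `-preset`, `-b:v`, `-an`) are standard ones.

```python
def build_encode_command(source, preset, bitrate_kbps):
    """Build an ffmpeg argument list for one libx264 encode.

    The output file name is a hypothetical naming scheme.
    """
    output = f"{preset}_{bitrate_kbps}k.mp4"
    return [
        "ffmpeg", "-y", "-i", source,
        "-c:v", "libx264", "-preset", preset,
        "-b:v", f"{bitrate_kbps}k",
        "-an", output,
    ]


def find_bitrate_for_target(measure_vmaf, target, low_kbps, high_kbps,
                            tolerance=0.25, max_steps=20):
    """Bisect on bitrate until the score is within tolerance of the target.

    measure_vmaf is a caller-supplied function (hypothetical here) that
    encodes at the given bitrate and returns the average VMAF score.
    Assumes the score does not decrease as bitrate increases.
    Returns the last (bitrate_kbps, score) pair tried, or None if the
    search range is empty.
    """
    best = None
    for _ in range(max_steps):
        if low_kbps > high_kbps:
            break
        mid = (low_kbps + high_kbps) // 2
        score = measure_vmaf(mid)
        best = (mid, score)
        if abs(score - target) <= tolerance:
            break
        if score < target:
            low_kbps = mid + 1   # quality too low: try higher bitrates
        else:
            high_kbps = mid - 1  # quality too high: try lower bitrates
    return best


def fake_measure(kbps):
    """Made-up stand-in for a real encode-and-measure step."""
    return 100.0 - 20000.0 / kbps


bitrate, score = find_bitrate_for_target(fake_measure, 95.0, 500, 10000)
print(build_encode_command("test.mp4", "medium", bitrate))
```

In practice, the measuring function would run the encode and a VMAF comparison for each preset, so every preset ends up with its own bitrate.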
Then I measure encoding time, I measure average VMAF, and I measure low-frame VMAF--the lowest-quality frame in that encode. So this is average VMAF, and these are the x264 presets here. Red is bad. Green is good. So this tells us that in terms of average quality, Ultrafast and Superfast--no surprise there--give us the worst quality for these policies here. Very Slow, not Placebo, gives us the best in all cases. So this immediately tells you that if you're using Placebo, not only are you extending time significantly, you're also getting lesser quality than you get with Very Slow. But interestingly, the total delta here, the difference in quality between the green and the red, is only about 6%. So it's not a life-changing event. Although, going back to the six VMAF points here, it's 89 here and 96 here. Most viewers would notice that difference.
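The two quality measurements can be sketched from a list of per-frame VMAF scores. This is only an illustration: the function names are hypothetical, and the percentage-delta formula (best minus worst, divided by best) is an assumption, since the talk does not state how its percentages were computed.

```python
def summarize_vmaf(frame_scores):
    """Return the average and low-frame VMAF for one encode.

    frame_scores is a list of per-frame VMAF values, however obtained.
    """
    if not frame_scores:
        raise ValueError("need at least one frame score")
    return {
        "average": sum(frame_scores) / len(frame_scores),
        "low_frame": min(frame_scores),  # the single worst frame
    }


def percent_delta(best, worst):
    """Gap between best and worst score as a percentage of the best.

    Assumed formula; the talk does not specify its calculation.
    """
    return (best - worst) / best * 100.0


# Made-up per-frame scores for one encode.
stats = summarize_vmaf([96.0, 94.0, 50.0])
print(stats, percent_delta(96.0, 89.0))
```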
And here's low-frame VMAF, which is the lowest VMAF score for any frame in the video. And down here, we're seeing concentration in terms of overall low, except "Tutorial," which is the PowerPoint video. And we're seeing a scattering here, but we're seeing a 33% difference in overall delta, meaning the difference between the lowest quality here and the highest quality here averages 33%. And we see some scores in the 50s. So that tells us that you may not care so much about overall quality, but low-frame quality is a particular issue with Ultrafast and Superfast, which is why you typically try to avoid them in all instances, except for live encoding, when you need to use them to produce the frame.
Then we plot all this out with encoding time. So this is encoding quality as a percentage of the total quality. So it starts out pretty high. And then this is low-frame quality. And we see this down here in a range that's pretty scary. And then this is encoding time. So this tells us that Very Slow takes 31% of the encoding time of Placebo. Again, if you're using Placebo, not only are you wasting time, you're tripling the time of the encode. You're losing slight quality--not significant, but you are losing a bit of quality. So the average quality here at Superfast and Ultrafast is probably okay, but you probably wouldn't want to use these unless you absolutely had to, because of the low-frame quality here. Faster is probably the first acceptable preset from both an overall quality and a low-frame quality perspective.
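Putting quality and encoding time on one chart means expressing each as a percentage of its maximum. A minimal sketch of that normalization follows; the function name and the encoding times are hypothetical, with the times chosen only so that Very Slow lands at the 31% of Placebo mentioned in the talk.

```python
def as_percent_of_max(values):
    """Scale each value to a percentage of the largest value in the dict."""
    peak = max(values.values())
    return {name: 100.0 * value / peak for name, value in values.items()}


# Hypothetical encoding times in seconds, not measured results.
encode_seconds = {"medium": 105.0, "veryslow": 310.0, "placebo": 1000.0}
time_percent = as_percent_of_max(encode_seconds)
print(time_percent)
```

The same function applies to average VMAF and low-frame VMAF, so all three series share a 0 to 100 scale.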
It makes very little sense to go beyond Medium. So if you're using Slow, Slower, or Very Slow, you're getting minor quality improvements. Realistically, here, you're cutting your encoding time by 50%. Here, you're coming close to doubling it. And if you're not paying by the encode like this--if you're just using a service that you pay for per minute of video--maybe you don't care. On the other hand, if you're running your own encoder or your own encoding farm, these types of decisions can add significant capacity. If you're encoding using Very Slow and you cut to Medium, you almost triple your encoding capacity. And very few people would notice the difference.
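The capacity claim is simple arithmetic: on fixed hardware, throughput scales with the inverse of encoding time. A minimal sketch, using a hypothetical function name and made-up encoding times chosen to match the "almost triple" figure:

```python
def capacity_multiplier(current_seconds, new_seconds):
    """How many times more encodes fit in the same machine time."""
    if new_seconds <= 0:
        raise ValueError("encoding time must be positive")
    return current_seconds / new_seconds


# Hypothetical per-file encoding times in seconds, not measured results.
very_slow_seconds = 300.0
medium_seconds = 105.0
gain = capacity_multiplier(very_slow_seconds, medium_seconds)
print(f"Switching from Very Slow to Medium: {gain:.2f}x capacity")
```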
And then, once I go through this analysis, I'd also want to go through and see if there were any visible differences in the frame plots. This is Ultrafast versus Medium. And we see not only is the overall score much lower, there are a lot of scary areas here. Whereas if you look at Faster, which I said before was probably the lowest acceptable preset, you see very few areas where the Faster file really drops below the Medium file, indicating a potential quality issue. So the conclusion regarding x264 is, Faster is the best preset if you're seeking maximum throughput, it makes very little sense to go beyond Medium, and Placebo never seems to make sense.
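The frame-plot comparison can be approximated numerically by flagging frames where one encode falls well below another. This is a minimal sketch rather than the tool used in the talk: the function name is hypothetical, and the default 6-point threshold is an assumption borrowed from the talk's remark that viewers would notice a gap of that size.

```python
def find_quality_drops(candidate, reference, threshold=6.0):
    """Return frame indexes where candidate trails reference by >= threshold.

    Both arguments are per-frame VMAF lists for encodes of the same source,
    so they must have the same length.
    """
    if len(candidate) != len(reference):
        raise ValueError("score lists must cover the same frames")
    return [
        index
        for index, (cand, ref) in enumerate(zip(candidate, reference))
        if ref - cand >= threshold
    ]


# Made-up per-frame scores for a faster preset and for Medium.
fast_scores = [95.0, 80.0, 94.0, 60.0]
medium_scores = [96.0, 95.0, 95.0, 95.0]
drops = find_quality_drops(fast_scores, medium_scores)
print(drops)
```

The flagged frames are the ones worth checking by eye, since a low score alone does not prove the drop is visible.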