Moscow State Reports: FFmpeg Still Tops for H.264, but Falters for HEVC and AV1
Every year the Moscow State University (MSU) Graphics and Media Lab releases multiple codec comparisons that are essential reading for any company that delivers streaming video to its viewers. This year, MSU released three reports, which I detail below. All offer free downloads containing a snapshot of the data, and the full enterprise report, with all data for all three reports, costs $950.
All reports evaluate multiple codecs of different types using two use cases—offline encoding (1 fps) and online encoding (30 fps)—with the number of test clips and objective metrics deployed varying by report. The first report, Part 1: Full HD, Objective, released in July 2020, tested 20 codecs representing H.265, AV1, H.264, and other codecs, with 50 full HD videos, and reported results for SSIM, VMAF, PSNR log and average MSE, and the best quality/speed tradeoff using SSIM as the metric.
Table 1 shows what you get in the free version as compared to the enterprise version, with some overall details also provided in the online report description linked to above. There's a lot of value in the enterprise version, including the encoder settings used for the tests, which are created with input from each developer.
Table 1. What's in the free and enterprise versions
Figure 1 shows the VMAF ratings for all tested codecs using the x265 codec as the reference at 100%. Codecs to the left can deliver the same quality as x265 at the data rate shown in the chart, so the BVC2.0 codec offers a 25% bitrate reduction at the same quality while the Aurora AV1 Encoder offers a 19% reduction. Codecs to the right of x265 need a higher data rate to deliver the same quality, so the rav1e AV1 encoder needs 37% more data to deliver x265 quality, which puts it 56 percentage points above Aurora on this scale.
Figure 1. Average bitrate for fixed quality at 1 fps using the VMAF metric
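To make the arithmetic behind these relative-bitrate comparisons concrete, here is a minimal Python sketch. It uses only the percentages quoted above (x265 as the 100% reference); the dictionary and function names are my own illustrative choices, not anything taken from the MSU report.

```python
# Relative bitrate needed to match x265 quality (VMAF, 1 fps use case).
# The figures are the ones quoted in the text; all names are illustrative.
relative_bitrate = {
    "BVC2.0": 75,       # 25% reduction vs. x265
    "Aurora AV1": 81,   # 19% reduction vs. x265
    "x265": 100,        # reference
    "rav1e": 137,       # 37% more data than x265
}


def savings_vs_reference(percent):
    """Bitrate saving relative to the reference, in percent (negative means more data)."""
    return 100 - percent


def point_gap(codec_a, codec_b):
    """Gap between two codecs, in percentage points of the reference bitrate."""
    return relative_bitrate[codec_a] - relative_bitrate[codec_b]


def ratio_gap(codec_a, codec_b):
    """Extra data codec_a needs, as a percentage of codec_b's own bitrate."""
    return (relative_bitrate[codec_a] / relative_bitrate[codec_b] - 1) * 100


if __name__ == "__main__":
    for name, percent in relative_bitrate.items():
        print(f"{name}: {savings_vs_reference(percent):+d}% vs. x265")
    print(point_gap("rav1e", "Aurora AV1"))
    print(round(ratio_gap("rav1e", "Aurora AV1"), 1))
```

The gap between two non-reference codecs can be read two ways: as a difference in percentage points of the x265 bitrate, or as a ratio of one codec's bitrate to the other's. The two readings give different numbers, so it helps to say which one you mean.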
Table 2 shows the overall scoring for both the offline and online tests in all reported categories. Note that there are no AV1 codecs in the 30 fps category because none were able to encode at that rate.
One frustration with the report is that not all codecs are commercially available, either for licensing or through an on-prem or cloud encoder, which makes their inclusion largely irrelevant to report buyers looking for the best codec for their streaming operation. For example, the BVC2.0 codec is from ByteDance, the owner of TikTok, and appears to be for internal use only. The same goes for the Tencent V265, which is even more confusing because the developer is listed as Tencent (a large Chinese conglomerate) while a company named Shannon Lab is marketing the Tencent Shannon Encoder (T265). It would be great if the reports designated whether each codec is commercially available and provided contact information for those that are.
Table 2. Part 1 winners
Another frustration is that the codecs change for each report. For example, Part II, Full HD Content, Subjective Evaluation, which evaluated 11 codecs using eight video sequences, doesn't include Aurora, and subjective ratings would have been valuable for any company considering deploying that AV1 codec. That said, the work involved in gathering the subjective data is staggeringly impressive; MSU used its crowdsourced quality platform, Subjectify.us, to gather 236,736 comparisons from over 61,000 unique participants.
Overall, in the 1 fps tests, the ByteDance codec led the pack, with Alibaba placing second, and the Kingsoft AV1 encoders coming in third and fourth.
Figure 2. Subjective ratings for 1 fps encodes
Table 3 contains the overall winners in the subjective ratings, again with no AV1 codecs in the right-hand column because none could achieve 30 fps encoding.
Table 3. Part 2 winners
Moving on to 4K and 10-Bit
The third report, entitled Part III: 4K & 10-bit Content, Objective, compared 12 codecs using 12 UHD videos encoded at 8 and 10 bits per pixel. In these tests, the MainConcept HEVC encoder performed very well, coming in second behind BVC1 in the 8-bit encoding trials shown in Figure 3.
Figure 3. Subjective ratings at 1 fps encodes
Table 4 shows the overall results. Note that only four codecs participated in the 10-bit trials, which MainConcept won by a 17% margin over x265, the nearest competitor. If you're a premium content distributor pumping out HDR 4K streams, this should put MainConcept on your short list.
Table 4. Overall results in the 4K 8-bit/10-bit trials
For me, the strongest lesson from these results is that you're going to have to step outside the FFmpeg box to get the best-quality HEVC and AV1 encoders. With H.264, the x264 codec included in FFmpeg and most encoders based on FFmpeg was rightfully acknowledged to be best in class. That isn't the case with HEVC and AV1, where the codec in FFmpeg is outclassed by several other codecs.