AV1: A First Look
Table 2 shows the BD-Rate and BD-Quality calculations. Briefly, BD-Rate shows the data rate savings associated with using the AV1 codec. Looking at the overall average, AV1 could deliver the same quality as x265 with a 34.88% data rate reduction, the same as VP9 with a 37.69% reduction, and the same as x264 with a 54.82% reduction. Looking at the Football clip, the numbers are 24.89%, 35.5%, and 50%, which roughly track the observations made about crossing the 93 VMAF threshold above.
BD-Quality reverses the variables: for equivalent bandwidth, it finds the average quality improvement. So, on average, at the same bitrate as HEVC, VP9, and x264, AV1 would deliver 1.25, 1.48, and 3.15 additional VMAF points, respectively. To put this in perspective, a VMAF change of six equals a “just noticeable difference” that should be detected by more than 75% of viewers. That means that if you substituted AV1 for any of the three codecs at the same data rate, most viewers would not notice the difference. This is likely more the unfortunate result of the lack of range in encoding scores than true qualitative differences in the clips, but the numbers are the numbers.
Table 2. BD-Rate and BD-Quality scores versus AV1.
Basically, the BD-Rate numbers tell you that AV1 can save significant bandwidth when substituted in for x265, VP9, and especially x264. However, the perceptible quality differential between videos encoded by the respective codecs isn’t that significant for the tested clips at the tested encoding configurations.
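For readers who want to see how figures like these are typically derived, here is a minimal Python sketch of the common Bjontegaard-delta approach: fit a cubic polynomial to each rate-distortion curve, then average the gap between the two fits over the range where they overlap. The article does not say which tool produced Table 2, so treat this as an illustration of the general technique, not the author's method. The function names and the sample numbers are hypothetical, and each curve needs at least four rate/quality points.

```python
import numpy as np


def _mean_of_cubic_fit(x, y, lo, hi):
    """Fit y = cubic(x) and return the mean of the fit over [lo, hi]."""
    coeffs = np.polyfit(x, y, 3)
    antideriv = np.polyint(coeffs)
    area = np.polyval(antideriv, hi) - np.polyval(antideriv, lo)
    return area / (hi - lo)


def bd_rate(rate_ref, qual_ref, rate_test, qual_test):
    """Average bitrate change (percent) of test vs. reference at equal quality.

    A negative result means the test codec needs less bitrate.
    """
    q_ref = np.asarray(qual_ref, dtype=float)
    q_test = np.asarray(qual_test, dtype=float)
    log_ref = np.log(np.asarray(rate_ref, dtype=float))
    log_test = np.log(np.asarray(rate_test, dtype=float))
    # Only compare over the quality range that both curves cover.
    lo, hi = max(q_ref.min(), q_test.min()), min(q_ref.max(), q_test.max())
    avg_ref = _mean_of_cubic_fit(q_ref, log_ref, lo, hi)
    avg_test = _mean_of_cubic_fit(q_test, log_test, lo, hi)
    return (np.exp(avg_test - avg_ref) - 1.0) * 100.0


def bd_quality(rate_ref, qual_ref, rate_test, qual_test):
    """Average quality change (metric points) of test vs. reference at equal bitrate."""
    log_ref = np.log(np.asarray(rate_ref, dtype=float))
    log_test = np.log(np.asarray(rate_test, dtype=float))
    # Only compare over the bitrate range that both curves cover.
    lo, hi = max(log_ref.min(), log_test.min()), min(log_ref.max(), log_test.max())
    avg_ref = _mean_of_cubic_fit(log_ref, np.asarray(qual_ref, dtype=float), lo, hi)
    avg_test = _mean_of_cubic_fit(log_test, np.asarray(qual_test, dtype=float), lo, hi)
    return avg_test - avg_ref


# Synthetic data, not the article's measurements: the "test" codec reaches
# each quality level at exactly half the reference bitrate.
ref_kbps = [1000, 2000, 4000, 8000]
ref_vmaf = [70, 80, 88, 94]
test_kbps = [500, 1000, 2000, 4000]
print(f"BD-Rate: {bd_rate(ref_kbps, ref_vmaf, test_kbps, ref_vmaf):.2f}%")  # roughly -50
```

Rates are compared on a log scale so that halving the bitrate counts the same at 1Mbps as at 6Mbps. BD-Quality simply swaps the axes, which is why the two numbers can tell different stories when the curves are flat near the top of the quality range.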
I discussed these results with Nigel Lee from Euclid IQ, who’s helped me understand how to apply and interpret BD-Rate and other objective metrics in the past. He recommended that we test more challenging clips at more challenging data rates. So, despite the long encoding times, I decided to take Nigel’s advice.
Objective Quality—Round 2
Specifically, I added two test clips: the initial runway sequence of Zoolander, which is very challenging, and a different 30-second section of the Football clip with much more motion. I also decided to test at more aggressive data rates, in six steps from 1Mbps to 3.5Mbps inclusive. Figure 4 is the rate distortion curve, which shows much more differentiation between the codecs: not only between AV1 and HEVC, but also between HEVC and VP9, and between H.264 and all other codecs.
Figure 4. Updated results for two more challenging clips at more aggressive data rates.
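As a small illustration of the test ladder, the sketch below generates six data rates from 1Mbps to 3.5Mbps inclusive. The text gives only the endpoints and the number of steps, so the even spacing here is an assumption, and the function name is hypothetical.

```python
def bitrate_ladder(start_mbps, stop_mbps, steps):
    """Return `steps` evenly spaced data rates from start to stop, inclusive."""
    if steps < 2:
        raise ValueError("need at least two steps to include both endpoints")
    step = (stop_mbps - start_mbps) / (steps - 1)
    return [start_mbps + i * step for i in range(steps)]


# Assumed even spacing; the article states only the range and the step count.
ladder = bitrate_ladder(1.0, 3.5, 6)
print(ladder)
```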
Table 3 shows the numbers. Though the BD-Rate differential between AV1 and x265 shrank significantly, the BD-Quality value almost tripled, so at the same data rate, AV1 would produce a VMAF score averaging 2.99 points higher. That is still not visible to most viewers, but it is getting closer. At 5.55 points for libvpx, many viewers would notice a difference between videos encoded at the same data rate, while the 16.79 score for x264 indicates that most prudent video publishers wouldn’t attempt these data rates with x264 (or any H.264 codec).
Table 3. Round 2 BD-Rate and BD-Quality scores versus AV1.
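To put the Round 2 BD-Quality values on a common scale, the sketch below expresses each one in units of the six-point VMAF "just noticeable difference" cited earlier. The three deltas are the figures quoted in the text; the helper names are hypothetical, and treating one full JND as a hard cutoff is a simplification for illustration.

```python
VMAF_POINTS_PER_JND = 6.0  # threshold cited earlier in the article


def jnd_units(vmaf_delta):
    """Express a VMAF difference as a multiple of one just noticeable difference."""
    return vmaf_delta / VMAF_POINTS_PER_JND


def reaches_jnd(vmaf_delta):
    """True when the difference is at least one full JND."""
    return jnd_units(vmaf_delta) >= 1.0


# Round 2 BD-Quality deltas versus AV1, as quoted in the text.
round2_deltas = {"x265": 2.99, "libvpx": 5.55, "x264": 16.79}
for codec, delta in round2_deltas.items():
    print(f"{codec}: {jnd_units(delta):.2f} JND, full JND reached: {reaches_jnd(delta)}")
```

Seen this way, the x265 gap is about half a JND, the libvpx gap sits just under one, and the x264 gap is close to three.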
What did we learn about testing? You should focus your tests on the data rates at which your video will most likely be deployed. At this point, H.264 and any newer codec should produce near-perfect quality at 6Mbps, making that data rate irrelevant for forward-looking testing. HEVC and VP9 take the near-perfect quality level down to between 3.5Mbps and 4Mbps, and AV1 and future codecs should bring this down into the 2Mbps to 3.5Mbps range. For this reason, it makes the most sense to test in the range covered in the second round.
What about clip type? If a significant differential only appears in challenging clips, is it relevant for the vast majority of easier-to-encode clips? I would say yes. Remember that both test clips in the second series were challenging sequences from longer videos that were easier to encode on average than the tested segments. Even talk shows have challenging sections, whether the opening logo or quick cuts to the applauding audience. So, while the overall quality difference may be minor on generally easy-to-encode videos, AV1 should be able to cut the overall data rate and preserve the quality in hard-to-encode sequences within these clips.
Decoding speed is shown in Figure 5. To assess this, I converted the 6Mbps Elektra file to Y4M format using a simple FFmpeg script and recorded approximate performance as measured by FFmpeg during the conversion. As you can see in the figure, AV1 decoded at 0.66x real time, with HEVC at 8x real time, VP9 at 10.5x, and H.264 at 14x.
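FFmpeg reports its processing rate in the `speed=` field of its progress output, so one way to record a figure like 0.66x is to capture that output and read the last value. The article does not show the script that was used, so the sketch below is only an assumption about how such a measurement could be collected; the function name is hypothetical and the sample log lines are illustrative, not taken from the author's run.

```python
import re

# Matches fields such as "speed=0.66x" or "speed=   8x"; "speed=N/A" is skipped.
_SPEED_FIELD = re.compile(r"speed=\s*([0-9]*\.?[0-9]+)x")


def last_reported_speed(ffmpeg_log):
    """Return the final speed multiple found in FFmpeg progress output, or None."""
    matches = _SPEED_FIELD.findall(ffmpeg_log)
    return float(matches[-1]) if matches else None


# Illustrative progress lines only; real output has more fields and varies by build.
sample_log = (
    "frame=   60 fps= 20 q=-0.0 size=  180000kB time=00:00:02.00 speed=0.64x\n"
    "frame=  150 fps= 20 q=-0.0 Lsize=  455626kB time=00:00:05.00 speed=0.66x\n"
)
print(last_reported_speed(sample_log))
```

A value below 1.0 means the decoder cannot keep up with playback; 0.66x means one second of video takes about 1.5 seconds to decode.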
You can also see that with CPU utilization stuck at about 20%, AV1 was using only one of the available four cores on the system. If AOM can recode the player so it can address more than a single core, real-time playback of 1080p video may be possible on the Zbook, though CPU utilization will be much higher than for the other formats, which means diminished battery life.
I also attempted to play all four files using FFplay while recording CPU utilization in Performance Monitor. H.264, HEVC, and VP9 all played in real time with minimal impact on CPU, which suggests some form of GPU-based decoding or hardware decoding in the CPU. AV1 wouldn’t play at all, which is probably not surprising for a codec still in experimental status.
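A back-of-the-envelope calculation shows why multi-core decoding could be enough for real-time playback. The sketch below assumes a simple linear scaling model, in which speed grows with core count times a parallel efficiency factor. That model and the function names are assumptions for illustration; real decoders rarely scale perfectly.

```python
def projected_speed(single_core_speed, cores, efficiency=1.0):
    """Estimated decode speed (multiple of real time) under linear scaling."""
    return single_core_speed * cores * efficiency


def min_parallel_efficiency(single_core_speed, cores, target_speed=1.0):
    """Efficiency needed across `cores` to reach the target speed."""
    return target_speed / (single_core_speed * cores)


# 0.66x on one core and four cores available, per the measurements above.
print(f"Ideal four-core speed: {projected_speed(0.66, 4):.2f}x")
print(f"Efficiency needed for real time: {min_parallel_efficiency(0.66, 4):.0%}")
```

Under these assumptions, a four-core decoder would need to reach only about 38% parallel efficiency to hit real time, though it would be running the CPU far harder than the other codecs do.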
Figure 5. AV1 decode suffers from the lack of hardware acceleration.
Absent a more efficient decoder, these decoding numbers don’t bode well for playback performance of AV1 on devices without GPU or some other form of hardware acceleration that likely won’t hit the market until early 2020. The E3-1505M v5 Xeon processor in the Zbook is a pretty robust CPU, and it doesn’t look like it could play a 1080p file.
These tests revealed glimpses of very alluring quality as compared to existing codecs, but at a current encoding cost that’s far beyond what the vast majority of video publishers can afford to pay. How many devices will play AV1 without some form of hardware acceleration is also in question, though again, this may be easily fixable. While you should expect encoding and decoding performance to improve pretty quickly, it’s hard to see AV1 as relevant for most producers for at least 12 to 18 months.
The author wishes to thank Nigel Lee, Chief Science Officer at EuclidIQ, for his high-level review of the quality discussion. Dr. Lee did not review the measurements or calculations, so any errors are those of the author.