The Great UHD Codec Debate: Google's VP9 Vs. HEVC/H.265
Which delivers better quality, encoding time, and CPU performance—HEVC or VP9? We put them to the test to decide once and for all.
As of today, the great UHD codec debate involves two main participants: Google’s VP9 and HEVC/H.265. Which one succeeds—and where—involves a number of factors and will likely differ in various streaming-related markets, which I discuss in the “State of Codecs” article in the 2015 Streaming Media Sourcebook. While the actual performance of the two codecs is a consideration, it’s generally not the deciding factor—certainly it wasn’t with VP8 and H.264. Still, codec performance matters, and in this article I’ll analyze just that, looking at three criteria: quality, encoding time, and the CPU required to play back the encoded streams.
By way of background, as I reported back in November, there have been several HEVC/VP9 comparisons; some found the two codecs even, with others finding VP9 on par with H.264 and much less effective than HEVC. I reported on some quick and dirty comparisons at Streaming Media West and expanded upon those for this article. The short answer is that the quality produced by each codec is very similar.
Here’s a brief overview of my testing. I selected three 4K source clips. The first, which I called the New clip, was a collection of footage I shot with a RED camera for a consulting project; this clip is designed to represent real-world footage. The second was a short section from Blender.org’s movie Tears of Steel that represents traditional movie content. The third was a short section from Blender.org’s movie Sintel that represents animated footage. Working in Adobe Premiere Pro, I produced very high data rate H.264 mezzanine clips in 4K, 1920x1080, and 1280x800 resolutions. These mezzanine clips were the starting points for all encodes.
I then supplied the clips and detailed encoding specifications to three recipients: Google, for encoding to VP9; a contact at MulticoreWare, which is coordinating the x265 project; and a supplier of HEVC IP who chose to remain nameless. All three encoded the clips to my specifications and returned them to me. I also contacted MainConcept, which supplied a copy of its flagship encoder, TotalCode Studio, and advised me to use the standard presets modified for the appropriate resolutions, data rates, and frame rates.
I quickly learned that TotalCode has two limitations. First, it can only encode to mod-16 resolutions (frame dimensions that are multiples of 16), which meant it couldn’t match the target resolutions for the 4K outputs of the two Blender movies. Since the objective testing that I performed requires input and output resolutions to match precisely, I couldn’t test TotalCode on either of those 4K clips.
The second limitation is that the encoder currently supports only single-pass encoding. As we’ll see, this definitely affected output quality in all encoded clips.
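To make the mod-16 constraint concrete, here is a minimal Python sketch of the check an encoding workflow might run before choosing an encoder. The function names and the sample frame sizes are illustrative assumptions; they are not taken from TotalCode Studio or from the test specifications.

```python
def is_mod16(width: int, height: int) -> bool:
    """Return True when both dimensions are exact multiples of 16."""
    return width % 16 == 0 and height % 16 == 0


def round_down_mod16(value: int) -> int:
    """Round a single dimension down to the nearest multiple of 16."""
    return (value // 16) * 16


# Hypothetical frame sizes, chosen only to illustrate the check.
for width, height in [(3840, 2160), (4096, 1714)]:
    status = "mod-16" if is_mod16(width, height) else "not mod-16"
    print(f"{width}x{height}: {status}")
```

A frame size that fails the check would have to be cropped, padded, or scaled before a mod-16-only encoder could handle it, and any of those changes breaks the exact input/output resolution match that objective comparison requires.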
Once I received or created all the test clips, I confirmed that they met the data rate limits, then compared their respective video quality metric (VQM) scores using the Moscow State University Video Quality Measurement Tool. You can see the results in Table 1, with lower scores being better.
Table 1. HEVC vs. VP9 results with lower scores better
The three columns on the left show the results from the three HEVC participants, with VP9 in the last column on the right. The lowest (and best) score for each clip is highlighted in green, and the average score for each technology is shown in the total row at the bottom of the table. As you can see, VP9 had the lowest (and best) average score of all the tested codecs, though the difference between x265 and VP9 was negligible and commercially irrelevant, as was the difference between x265 and the IP Provider. I comment on MainConcept’s performance later in this article.
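The data rate check that preceded these quality comparisons is simple arithmetic, and a rough Python sketch is shown here. It assumes the average rate is derived from file size and duration, and the 5 percent tolerance is a hypothetical allowance rather than the limit used in these tests.

```python
def average_bitrate_kbps(file_size_bytes: int, duration_seconds: float) -> float:
    """Average data rate of a file in kilobits per second.

    This counts every byte in the file, so audio and container
    overhead are included along with the video stream.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return file_size_bytes * 8 / duration_seconds / 1000


def within_limit(actual_kbps: float, target_kbps: float,
                 tolerance: float = 0.05) -> bool:
    """True when the actual rate does not exceed target * (1 + tolerance).

    The default tolerance is an assumption made for this example.
    """
    return actual_kbps <= target_kbps * (1 + tolerance)


# Hypothetical clip: 15,000,000 bytes lasting 60 seconds.
rate = average_bitrate_kbps(15_000_000, 60)
print(rate, within_limit(rate, target_kbps=2000))
```

Running a check like this on every submitted clip guards against one encoder gaining a quality advantage simply by spending more bits than the specification allows.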
After running the objective tests, I viewed each clip in real time to observe any motion-related artifacts that might not have been picked up in the objective ratings. Vanguard Video’s Visual Comparison Tool is particularly useful for this, with a split-screen view that lets you play two videos at once and drag the centerline during playback to shift from viewing one video to the other (Figure 1), or overlay one video atop another and switch between them using the Ctrl+Tab keystroke combination. I saw nothing during these trials that contradicted the objective VQM findings.
Figure 1. Comparing the quality of x265 and MainConcept via Vanguard Video’s Visual Comparison Tool
At present, VP9 delivers the same level of performance as HEVC, which isn’t surprising given that VP8 performed very similarly to H.264. So if you’re choosing between the two, pure codec quality shouldn’t be a major differentiator.
Regarding MainConcept, the ratings are not surprising, since the current version of TotalCode Studio only supports single-pass encoding. Since the clips were produced using 125 percent constrained VBR encoding, the technologies with two passes could allocate more data to the hard-to-encode regions in all test clips.
As an example, Figure 2 shows how MainConcept compared to x265 over the duration of the 1920x1080 New clip, with MainConcept in blue and x265 in red. Lower scores are better. As you can see, the quality of the two codecs was very similar except for the region on the extreme right: the high-motion, high-detail shot shown in Figure 1 that was the hardest to encode in the entire clip. With the benefit of two-pass encoding, TotalCode Studio could have allocated more data to this region, which likely would have resulted in much closer scores.
Figure 2. MainConcept proved very close to x265 in all but the highest motion regions of the clip (MainConcept is blue, x265 red, lower is better).
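For readers unfamiliar with the term, the 125 percent constrained VBR setting mentioned above can be sketched in a few lines of Python. This assumes the common interpretation that the peak data rate is capped at 125 percent of the target. The function name and the 4,000 kbps target are hypothetical and are not the values used in these tests.

```python
def constrained_vbr_peak_kbps(target_kbps: float,
                              constraint_percent: float = 125.0) -> float:
    """Peak data rate allowed under constrained VBR.

    Assumes the constraint is expressed as a percentage of the
    target (average) data rate, as in "125 percent constrained VBR".
    """
    if target_kbps <= 0:
        raise ValueError("target_kbps must be positive")
    return target_kbps * constraint_percent / 100.0


# Hypothetical target rate, used only for illustration.
print(constrained_vbr_peak_kbps(4000))
```

A two-pass encoder can analyze the whole clip first and then spend that headroom on the hardest scenes, while a single-pass encoder has to decide without knowing what comes later in the clip.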
As a complement to the main testing, I encoded a talking head clip with consistent motion throughout using both x265 and MainConcept, testing the theory that consistent motion would negate the benefit of two-pass encoding. Despite the two-pass encoding advantage enjoyed by x265, MainConcept won this trial with a VQM score of 0.410, while x265 scored 0.488. In addition, in the single-pass encoding trials I ran for the aforementioned HEVC configuration story, MainConcept was slightly ahead in most tests.