The State of Codecs 2017
With the launch of HEVC Advance and the formation of the Alliance for Open Media, 2015 was the most disruptive year in recent history as it related to the adoption and deployment of video codecs. While 2016 was not quite so seismic, there were a number of significant events that will impact how we choose and use video codecs over the next few years, with one particularly unhappy holiday present for all H.264 users. I’ll discuss them here in mostly chronological order.
January 2016: U.S. HDR Market Converges on Two-Technology Approach
On Jan. 5, Dolby announced that LG’s 2016 OLED TVs and Super UHD TVs would feature Dolby Vision HDR technology, signaling that Dolby Vision will likely become the de facto premium HDR technology in the U.S., with HDR10 as the royalty-free, but less featured, alternative. Admittedly, this date is arbitrary; we could have selected March, when Dolby announced that Amazon would support Dolby Vision in its streaming 4K titles, or June, when Wal-Mart’s Vudu did the same. The key point is that support for Dolby Vision HDR is coalescing. Virtually all Dolby Vision-capable TVs will also support HDR10, which made it safe for TV manufacturers to build, and consumers to buy, 4K TVs with HDR.
This is significant, because the advantages of 4K-only solutions over 1080p TVs are minimal under most normal viewing conditions. In contrast, HDR significantly improves the viewing experience under virtually all conditions. Unfortunately, actual HDR deployment by content developers has been slowed by the lack of a de facto standard.
Dolby Vision has multiple advantages over HDR10, including 12-bit support as compared with 10-bit, and dynamic metadata that can be adjusted on a scene-by-scene or even frame-by-frame basis. In contrast, HDR10 uses static metadata sent at the beginning of the movie. Dolby Vision can also support up to 10,000 nits of brightness as compared with 4,000 nits for HDR10, though it’s unclear when hardware that can display 10,000 nits will be available.
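To put the 12-bit versus 10-bit difference in concrete terms, here is a minimal sketch that counts the nominal code values available per color channel at each bit depth. The function name is hypothetical, and the count ignores practical details such as reserved code ranges and the transfer function, so treat it as a rough illustration rather than a statement about either format's actual signal encoding.

```python
def code_values(bit_depth: int) -> int:
    """Nominal number of distinct code values per channel at a given bit depth."""
    return 2 ** bit_depth


hdr10_levels = code_values(10)         # 10-bit, as used by HDR10
dolby_vision_levels = code_values(12)  # 12-bit, as supported by Dolby Vision

print(hdr10_levels, dolby_vision_levels, dolby_vision_levels // hdr10_levels)
# prints: 1024 4096 4
```

In other words, each extra two bits quadruples the number of steps available to describe brightness and color, which is what helps reduce visible banding across a wide brightness range.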
Fortunately, content producers can master for Dolby Vision with fallback support for both HDR10 and even standard dynamic range displays, simplifying production and distribution. Overall, 2017 should become the year it’s actually safe to invest in a high-end 4K HDR-capable TV set, at least here in the U.S. In Europe, the BBC is pushing the Hybrid Log-Gamma (HLG) HDR technology, which it developed with Japanese broadcaster NHK. So far, while there are some projectors that support HLG, few if any TVs do. However, unlike Dolby Vision, which requires hardware support in the device, HLG should be field upgradeable via a firmware update. So if HLG ever becomes popular in the States, many HDR TVs should be able to support it.
Dolby Vision HDR appeared to become the de facto premium HDR standard.
March 2016: Beamr Acquires Vanguard
In March, Beamr, an Israel-based video optimization vendor, bought codec vendor Vanguard Video. This is significant for several reasons. From a technology perspective, the combination will enable Beamr’s optimization technology to be used within Vanguard’s already highly regarded H.264 and H.265 codecs, rather than being applied as a post-process. This is more efficient from both quality and workflow perspectives, though Beamr hasn’t yet announced the availability of this solution.
From a pure heft perspective, the combination, which was enabled by a $15 million round of private funding, gives the combined company the ability to compete with larger encoding companies like Elemental Technologies and Harmonic. The merger is already bearing fruit; in December, Verizon Ventures, the venture capital arm of Verizon Communications, owner of Verizon Digital Media Services and other related companies, invested $4 million into Beamr. This investment likely foreshadowed the use of Beamr’s codec-related technology by Verizon Digital Media Services, and others in the family tree.
June 2016: Apple Supports Fragmented MP4 in HLS
At its 2016 Worldwide Developers Conference, Apple announced that HTTP Live Streaming (HLS) will support the Common Media Application Format (CMAF), a file packaging specification jointly authored by Apple and Microsoft. Why is this significant? Previously, HLS only supported files packaged in the MPEG-2 Transport Stream container, while the Dynamic Adaptive Streaming over HTTP standard (DASH), and proprietary technologies like Microsoft’s Smooth Streaming and Adobe’s HTTP Dynamic Streaming (HDS), supported files packaged in the fragmented MP4 container (fMP4). This meant that many publishers needing DASH and HLS support for their various playback targets had to produce and store files in both formats. CMAF enables a single format compatible with HLS and DASH.
The fly in the ointment is that CMAF enables two incompatible common encryption modes: cipher block chaining (CBC) for Apple’s FairPlay digital rights management (DRM) technology, and counter mode (CTR) for PlayReady, Widevine, and other DRMs. Content encrypted with CBC can’t be decrypted by PlayReady-compatible clients, while content encrypted with CTR can’t be decrypted by FairPlay clients.
Google has minimized the issue by supporting CBC in Widevine, which covers the Android platform, as well as Chrome and Firefox. Until Microsoft does the same for PlayReady and Edge, however, we’re stuck with two fragmented MP4 data silos, one encrypted in CBC, the other in CTR. More likely, until CMAF plays on all platforms, most producers will stick with HLS content encapsulated in an MPEG-2 transport stream wrapper, and DASH in fMP4.
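The two-silo problem can be sketched as a small lookup. The support table below simply restates the situation described above (FairPlay on CBC, PlayReady on CTR, Widevine on both); the function names are hypothetical, and the table is a snapshot of that description, not a reference for any DRM vendor's current capabilities.

```python
from itertools import combinations

# Encryption-mode support as described in the text above (an assumption, not a vendor reference).
SUPPORTED_MODES = {
    "FairPlay": {"CBC"},
    "PlayReady": {"CTR"},
    "Widevine": {"CBC", "CTR"},
}


def drms_for_mode(mode):
    """DRM systems able to decrypt content encrypted with the given mode."""
    return sorted(drm for drm, modes in SUPPORTED_MODES.items() if mode in modes)


def modes_needed(target_drms):
    """Smallest set of encryption modes that covers every targeted DRM."""
    all_modes = sorted({m for modes in SUPPORTED_MODES.values() for m in modes})
    for size in range(0, len(all_modes) + 1):
        for combo in combinations(all_modes, size):
            chosen = set(combo)
            if all(SUPPORTED_MODES[drm] & chosen for drm in target_drms):
                return chosen
    return set(all_modes)


print(drms_for_mode("CBC"))                    # ['FairPlay', 'Widevine']
print(modes_needed(["FairPlay", "Widevine"]))  # {'CBC'}
print(len(modes_needed(["FairPlay", "PlayReady"])))  # 2
```

A publisher targeting only FairPlay and Widevine clients could get by with one CBC-encrypted copy, while adding PlayReady forces a second, CTR-encrypted copy.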
August 2016: Apple Quietly Deprecates TN2224
Apple’s Tech Note TN2224 has been the Rosetta Stone for encoding professionals since its initial publication in March 2010, providing general guidance regarding how to formulate an encoding ladder, as well as specific guidance on encoding for HTTP Live Streaming (HLS). On Aug. 2, Apple unceremoniously deprecated TN2224 with a short note stating, “Important: This document is concerned with practices and with the rationale behind them. For detailed requirements please refer to the HLS Authoring Specification for Apple Devices.” Well, at least Apple said it was important.
Interestingly, until Dec. 12, the HLS Authoring Specification for Apple Devices had been named the Apple Specification for Apple TV. On that date, Apple amended the title to apply to all Apple Devices. The net effect is that where you used to look to TN2224 for specific guidance for encoding for HLS, now you should look to the Apple Devices spec.
If you haven’t looked at TN2224 for a while, recent changes to it, and those engendered by the Apple Devices spec, have been dramatic. Apple now recommends a keyframe interval of 2 seconds (as compared with 3 seconds) and a segment length of 6 seconds (as compared with 9 seconds). The encoding ladder itself has been completely revamped, and Apple now approves the use of up to 200 percent constrained VBR, perhaps the most significant change of all.
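To see what these recommendations mean for encoder settings, here is a rough sketch that converts them into numbers. It assumes that "200 percent constrained VBR" means the peak bitrate is capped at twice the target, and that the segment length is a whole multiple of the keyframe interval; the function and key names are hypothetical and do not correspond to any particular encoder's options.

```python
def hls_encoding_params(target_kbps, frame_rate,
                        keyframe_seconds=2, segment_seconds=6,
                        vbr_cap_percent=200):
    """Derive GOP size and peak bitrate from the recommendations above."""
    if segment_seconds % keyframe_seconds != 0:
        raise ValueError("segment length should be a multiple of the keyframe interval")
    return {
        # Frames between keyframes, e.g. 2 seconds at 30 fps = 60 frames.
        "gop_frames": round(frame_rate * keyframe_seconds),
        # Every segment then starts on a keyframe and holds this many GOPs.
        "keyframes_per_segment": segment_seconds // keyframe_seconds,
        "target_kbps": target_kbps,
        # Constrained VBR: peak limited to a percentage of the target.
        "max_kbps": target_kbps * vbr_cap_percent // 100,
    }


print(hls_encoding_params(2000, 30))
# prints: {'gop_frames': 60, 'keyframes_per_segment': 3, 'target_kbps': 2000, 'max_kbps': 4000}
```

Passing the older values (a 3-second keyframe interval and 9-second segments) gives a 90-frame GOP at 30 fps, with the same three keyframes per segment.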
In addition, the Apple Devices spec recommends using the High profile, rather than Baseline or Main, obsoleting several older iPhone models, and dictates that all 30i content should be deinterlaced to 60p rather than 30p, a curious preference for smoothness over detail. Finally, the Apple Devices spec states that the 2,000 kbps variant in the encoding ladder should be the first variant listed in the master playlist file, making it the first retrieved by the player. The bottom line is that if you’re encoding HLS, you should check out the Apple Devices spec, particularly if streaming to an app that requires App Store approval.
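The playlist-ordering point is easy to get wrong when a packager lists variants by ascending bitrate, so here is a minimal sketch of a master playlist writer that moves the 2,000 kbps variant to the top. The ladder rungs, resolutions, and URIs are hypothetical placeholders, and the BANDWIDTH attribute is filled with the nominal bitrate for simplicity (a real playlist would normally declare the peak rate).

```python
def master_playlist(variants, first_kbps=2000):
    """Build a master playlist, listing the `first_kbps` variant before the rest.

    variants: iterable of (kbps, width, height, uri) tuples.
    """
    ordered = sorted(variants, key=lambda v: v[0])
    first = [v for v in ordered if v[0] == first_kbps]
    rest = [v for v in ordered if v[0] != first_kbps]

    lines = ["#EXTM3U"]
    for kbps, width, height, uri in first + rest:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={kbps * 1000},RESOLUTION={width}x{height}"
        )
        lines.append(uri)
    return "\n".join(lines) + "\n"


# Hypothetical ladder, for illustration only.
ladder = [
    (730, 640, 360, "360p/prog.m3u8"),
    (2000, 960, 540, "540p/prog.m3u8"),
    (4500, 1280, 720, "720p/prog.m3u8"),
]
print(master_playlist(ladder))
```

The remaining variants are left in ascending order here; that ordering is a choice made for this sketch, not something the text above prescribes.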
August 2016: Netflix Loudly Proclaims HEVC Superior to VP9, Then Quietly Recants
On Aug. 31, Netflix’s Jan De Cock presented the results of a massive Netflix internal study to the SPIE Applications of Digital Image Processing conference in a presentation titled, “A Large-Scale Video Codec Comparison of x264, x265 and libvpx for Practical VOD Applications.” One of the most significant findings was that when measured with Netflix’s VMAF benchmark, x265 proved about 20 percent more efficient than VP9 (libvpx) at all tested resolutions.