Navigating a Multi-Codec World
Video distribution has grown into a huge ecosystem, so it is only natural that new video formats arise and that companies have a variety of reasons to launch new standards. MPEG, in particular when joined by ITU-T, has been extremely successful in bringing together companies, technology and IPR in the field. This led to success stories such as MPEG-2 and H.264/AVC, and domination in most of the ‘traditional’ broadcast business.
On the Internet, there has always been a proliferation of formats, where small and large tech companies played a role. Some of these formats, such as Windows Media Video 9, were standardized (as SMPTE VC-1). Other formats, such as MPEG-4 Visual, became prevalent in the (illegal) download market. In the late 90s and 00s, we saw those co-exist with formats such as RealVideo, TrueMotion VPx, Theora and H.263. With the introduction of adaptive streaming over the Internet, AVC became the preferred format for most streaming services.
In both broadcast and Internet distribution, AVC has led to a rare consolidation, and can be considered as one of the biggest technology successes in recent decades. It led to a dominance in technology rarely seen, similar to JPEG for still image coding (the "alien technology from the future," as Tim Terriberry from Mozilla called it). For a long time, the question was not, which video standard are you using, but rather, which AVC profile are you using.
More recently we’re seeing increasing fragmentation again. Already from the start of HEVC standardization in 2010, it was known that significantly superior technology to AVC existed, but licensing uncertainty delayed the transition to HEVC. There is some momentum around HEVC now, primarily for premium content. But at the same time, a parallel move towards formats such as VP9 and AV1 has gained a foothold. And further efforts have led to formats such as VVC, EVC, and LCEVC. In the Chinese market, parallel standardization efforts have led to the AVS series of standards (Figure 1).
Figure 1. Evolution of video compression standards
There seems to be industry consensus that we will continue to live in a multi-codec world. Although there is a lot of money involved, it is not a true war, as is often stated. Rather, this is a business reality of co-existing formats with differing use cases, and companies supporting formats for different reasons. The neutral user should not care about all these formats though, and not all formats that we’re currently seeing will play a significant role. Some formats will find niche use cases, while others will continue to dominate for a long time and will gradually be replaced by future successors. Others will silently fade out.
This is a situation that will live on for at least some time – although there could be ways towards consolidation again.
What Caused the Original Fragmentation?
AVC was a huge success, with a very strong offering on all accounts. The jump in compression efficiency over MPEG-2 and MPEG-4 Visual was significant, timing was aligned with the transition to HD, and licensing terms were certainly reasonable. Companies that were part of the Joint Video Team at the time, a relatively small group, have been reaping the benefits of their standardization participation for the last two decades. AVC entered the market in 2003, and while it took some years for efficient software and hardware codecs to arise, they are still extremely popular (think of x264 on the encoding side). Up to today, AVC is the dominant format for both broadcast and bits sent over the Internet, and it will continue to serve as the greatest common denominator format for the coming years.
Everything was set to repeat the same success when HEVC was launched in 2012. A huge number of companies, institutes and researchers contributed to the standard, but once it was finalized, the waiting game started. Waiting for real-time encoders, first in software, then hardware… but mainly, for the patent holders to declare their IPR, and for patent pools to announce reasonable licensing terms.
In 2015 the uncertainty was still there, even though decent HEVC encoders were already on the market. Moving to HEVC at that time seemed a risky endeavor. In that year, Google launched VP9 and suddenly, there was a move by several companies (including Google/YouTube and Netflix) away from the MPEG family of codecs. With the right encoder implementation, VP9 is a strong competitor to HEVC. But still, there was a consensus that an even better format was possible, and that with the right consortium of companies, and on the foundation of VP9, a strong joint codec could be built. This is what gave birth to AOMedia.
In the end, this move was an economic reality. The founders of AOMedia realized that huge chunks of Internet traffic could be saved by going beyond AVC. Licensing reasons were blocking them from moving forward, but technological advances along with the willingness to be better shepherds of the Internet, led to growing momentum around alternatives. In general, if traditional organizations do not move fast enough, they will be overtaken by industry organizations that have stronger technological power, and are more agile. The video market is too big, and its impact on overall Internet traffic is too large to be blocked by a handful of patent holders. The financial impact for individual large companies is big enough to warrant substantial investment in next-gen formats that are not encumbered by unreasonable licensing terms.
Although the “openness” of AOMedia has sometimes been questioned, with AV1 we are seeing an open standard with several (open-source) implementations that came to market much faster than before. AV1 is as versatile as VVC, and is already being launched for streaming, web conferencing (see e.g. Cisco’s announcement on AV1 support for WebEx) and low-latency applications.
What Are the Essential Elements for a Standard to be Successful?
New video formats pop up all the time, and companies or research institutes have varying reasons to introduce them. Some offer significant compression advantage, while others are tailored to specific use cases, yet others claim to be royalty free. Whatever the rationale behind them is, not all formats have what it takes to become successful. Some do, but they fail to create momentum around them. Successful formats tend to have some specific characteristics in common.
Compression efficiency
To move towards a new standard, a certain jump in compression efficiency is required. What perceptual quality can you offer your viewers at the same bitrate? Can you provide higher quality of experience to your viewers? Can you deliver the next step in resolution at an acceptable bitrate? Often, the question is reversed and becomes, what bitrate reduction is obtained at the same quality, and therefore, what cost savings can the new format bring. Hence the question is around a threshold that makes sense from an investment perspective. Often the 50% threshold is used, and this is still true for certain use cases. For the broadcast world, the cost of upgrading hardware is usually immense, and the cost/channel is an important benchmark. 50% bitrate reduction at equivalent quality is probably a sane investment threshold in this world, given the large overhead in replacing systems such as head-ends. Also here though, we’re seeing increasing transition to software, leading to shortened adoption cycles, with a lower investment threshold.
For OTT video, it can make sense to innovate faster. There, a 20-30% bitrate reduction could be sufficient when trading off the costs vs. benefits. In this case, the cost of encoding is usually less of a bottleneck, once reasonably fast software encoders are available. Distribution/CDN costs and Quality of Experience (QoE) will be essential here. On the one hand, you reduce bits over the network and potentially improve visual quality. On the other hand, you might get a hit in caching efficiency, client QoE and storage costs. (Caching efficiency goes down when offering more formats; rebuffers might go up, and battery life could be affected; storage goes up since you’re not necessarily replacing your catalog, but rather adding new profiles.)
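The OTT trade-off described above can be made concrete with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the function names, parameters and all numbers are assumptions rather than figures from a real deployment, and the model deliberately ignores encoding cost, cache efficiency and client QoE.

```python
def net_monthly_saving(egress_tb, cdn_cost_per_tb, bitrate_reduction,
                       adoption_share, catalog_tb, storage_cost_per_tb):
    """Rough monthly saving from adding a more efficient format (hypothetical model).

    egress_tb           -- monthly CDN egress in the existing format (TB)
    cdn_cost_per_tb     -- CDN price per delivered TB
    bitrate_reduction   -- e.g. 0.25 for 25% fewer bits at equal quality
    adoption_share      -- fraction of traffic whose clients can decode the new format
    catalog_tb          -- size of the existing catalog (TB)
    storage_cost_per_tb -- storage price per TB per month
    """
    # Bits are only saved on the traffic that actually switches format.
    cdn_saving = egress_tb * adoption_share * bitrate_reduction * cdn_cost_per_tb
    # The new profiles are added next to the old ones, at the reduced size.
    extra_storage = catalog_tb * (1.0 - bitrate_reduction) * storage_cost_per_tb
    return cdn_saving - extra_storage


def break_even_share(egress_tb, cdn_cost_per_tb, bitrate_reduction,
                     catalog_tb, storage_cost_per_tb):
    """Adoption share at which CDN savings equal the added storage cost."""
    extra_storage = catalog_tb * (1.0 - bitrate_reduction) * storage_cost_per_tb
    return extra_storage / (egress_tb * bitrate_reduction * cdn_cost_per_tb)


if __name__ == "__main__":
    # Made-up numbers: 1000 TB egress, 25% bitrate reduction, 200 TB catalog.
    for share in (0.4, 0.8):
        saving = net_monthly_saving(1000, 20, 0.25, share, 200, 20)
        print(f"adoption {share:.0%}: net saving {saving:+.0f} per month")
    print("break-even adoption share:", break_even_share(1000, 20, 0.25, 200, 20))
```

With these made-up inputs, the new format only pays off once a sufficiently large share of clients can decode it, which is why client support weighs so heavily in the decision.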
Support for new use cases
Fortunately, most recent formats are ‘versatile’ enough to support all common use cases. Some standards offer extensions for use cases such as multi-view/3D and scalable coding, which could be required in niche cases. Typically though, standards are optimized up to a certain resolution, and that level of optimization could be the trigger to adopt a new format. That explains why you still see the choice of format go hand-in-hand with screen resolutions, and transitions from e.g. SD to HD, HD to UHD, and UHD to 8K. The timing is probably right at the moment, with 5G, interactive applications, and 8K (major events such as 2021 Olympics) as possible drivers for next-gen formats such as AV1 and VVC.
Reasonable licensing terms
For some formats, licensing terms were unknown for too long, or patent pools became overly demanding. And true, you can put quotes around “royalty-free,” since legally speaking there are never guarantees. But still, the message towards more clarity around licensing has been received, and there is growing momentum around royalty-free initiatives, with giant companies rallying behind these formats. In the case of AOMedia, this resulted in the formation of a huge consortium and patent pool. In MPEG, this led to a royalty-free profile in the EVC standard.
Critical mass
If critical mass is not present at the start, you can still create it with a strong offering in the previous items, but it will be an uphill battle. With the large number of proponents around VVC, its ecosystem received an instant boost. The same holds for AV1 within AOMedia. For AV1’s successor, AV2, it is expected that there will be an even more active involvement from a larger AOMedia representation from the start. Essential in creating this critical mass is a solid representation of the client ecosystem and having the right software/hardware implementation companies on board. For Internet distribution, browser support has been critical in the past. For the growing mobile market, support on the prevalent operating systems (iOS, Android) and mobile chipsets can mean a big boost.
What Will the Future Bring?
None of us has a crystal ball, so there is no use in presenting absolute numbers or detailed forecasts. Rather, there are trends based on strengths of the different formats and expectations around their installed base, which are described here, and which are represented in the “adoption curves” below (Figures 2 and 3).
For the current line-up of formats, the strengths mentioned in the previous section are mostly present in AV1 and VVC. In different line-ups, VVC has been shown to offer the best compression efficiency. On top of that, it has a large contributor base. Still, its popularity will depend, to a large extent, on its licensing terms.
AV1, on the other hand, is already growing in deployment. In general, there is a serious momentum around royalty-free codecs, with a large consortium that not only dominates the Internet in terms of streaming bits, but also represents the major browsers and consumer devices. AOMedia not only has their original members that are often cited, but in the last two years, other giants have followed, such as Apple, Facebook and Samsung. The combined AOMedia patent portfolio is stunning. Add to that hardware companies, encoder developers, and a growing community of (open-source) developers. The only thing that was missing was a strong presence of broadcasters, which is one of the reasons why Synamedia joined AOMedia.
How formats will be adopted depends on the use case, and we still expect a split between broadcast and OTT oriented video. Nonetheless, there is growing convergence driven by a tendency towards more flexible software-based solutions in both worlds.
Figure 2. Expectations around format adoption in the OTT ecosystem.
On the OTT side (Figure 2), AVC is still dominant, and considered as the fallback option, with which you can reach nearly all devices. HEVC usage has been relatively limited, but has seen a recent uptick, now that licensing terms are better understood, and with more than a billion chips out there supporting HEVC decoding. Typically, it is considered a format for premium content, including 4K and HDR/WCG content. The HEVC market share is expected to grow over the next years, but then to decline again in favor of VVC and AV1. Given its large decoder support, VP9 is still relevant. Among others, Netflix, Twitch and YouTube have deployed VP9 (including for Google’s recently launched Stadia online gaming platform), leading it to occupy a reasonable share of bits over the Internet. Strong encoder and decoder implementations are available, in both hardware and software. With the right encoder implementation, VP9 can be considered a close alternative to HEVC in terms of compression efficiency.
Obviously, AV1 is the up-and-coming format in the OTT world, with a growing list of AV1 support announcements. Several implementations are available that keep getting more efficient, both in terms of compression efficiency, and cycles (implementations include libaom, SVT-AV1, Google libgav1, Mozilla rav1e, and proprietary encoders such as Visionular and EVE-AV1). In the early days of AV1, it took a ridiculous amount of CPU cycles to encode AV1, but those days are gone. Significant gains over HEVC can now be achieved with reasonable CPU cycles. Even as a live video format AV1 is gaining traction (see for example advances in WebRTC, and announcements by e.g. WebEx), and hardware support is growing. AV1 has the efficiency to replace AVC, VP9 and HEVC since it combines all their strengths. We expect that in a couple of years, AV1 will become dominant for Internet video traffic, with a long tail of AVC for legacy devices, and HEVC for premium content on iOS devices.
On the broadcast side (Figure 3), the situation is somewhat different. In many cases, hardware that’s slower to upgrade, such as set-top boxes, has to be taken into account. In other cases, broadcasters will follow national or regional standards such as ATSC or DVB. The breakthrough of new formats will hence strongly be influenced by choices made in governments and standardization organizations.
Figure 3. Expectations around format adoption in the broadcast ecosystem
Clearly AVC is still dominant for broadcasting, but there is growing momentum around HEVC, particularly towards UHD channels. MPEG-2 is still around for legacy reasons, and there actually are reasons to keep innovating on MPEG-2 encoding (the ATSC transition is playing a role and is a motivation for us to make our MPEG-2 encoding even more efficient). So far, AV1 has been less of a focus there, although it has the potential to replace HEVC. VVC clearly has strong points to take a large share of the broadcast market in the future and offers a significant enough compression gain over HEVC to be considered its successor. Still, it will be up to the IPR holders and the licensing terms that will come out of the ongoing discussions around MC-IF (the Media Coding Industry Forum, an initiative to foster the formation of a VVC patent pool; the first results have been announced at https://www.mc-if.org/). Similar to 4K being a driver for HEVC adoption, 8K could be an incentive for VVC adoption. Although the industry will move slowly towards such high resolutions, we’re already seeing an uptick in the sales of 8K TVs, and sports events might accelerate adoption.
Our expectation is that in 5 years, AVC and HEVC will still be splitting the market, with a growing presence of VVC, a small chunk of AV1, and a long tail of MPEG-2.
How to Choose Formats for OTT Deployment
The multi-codec world is here to stay, at least for some time. For OTT implementors and distributors, it makes sense to offer multiple formats, to have the highest QoE for every platform and user. Still, there is a breaking point in terms of supported formats. Supporting 2-3 formats is probably sustainable, and offering those alternatives in your catalog can be a healthy trade-off. Supporting 4-5 formats or more becomes very expensive, and leads to implementation and/or support headaches, high storage costs, and degrading cache efficiency on your CDN. (More cache misses will lead to degrading QoE and higher costs, even when supporting a more efficient, but rarely used format.)
In the end, you want to reach your customers with the highest QoE, at a reasonable cost. Choices have to be made, and as discussed, client support is key here. The no-brainer here is to offer AVC streams as the greatest common denominator. If UHD streams need to be offered, potentially with Dolby Vision and/or HDR10 support, HEVC might be a good choice, to reach high-end mobile devices and recent TV sets. For Android devices and browsers, VP9 is still a good format, but we’re reaching the point where AV1 should be the preferred format over VP9, with an even wider reach and additional compression performance gains.
Can HEVC be removed from the list above? Probably not, given the number of devices boasting HEVC decoders, and as long as iOS doesn’t support VP9 or AV1. Can both VP9 and AV1 be removed from the list above? Probably not, unless you want to fall back to AVC on browsers. For now, the dichotomy in client support still warrants two highly efficient formats (such as HEVC and AV1), on top of a long-tail format (AVC). And in any case, it’s too early to think about VVC, as efficient client decoders are still 1-2 years out.
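The reasoning above amounts to a simple per-client selection rule: serve the most efficient format that is both in your catalog and decodable by the client, and fall back to AVC otherwise. A minimal sketch follows, assuming a hypothetical preference order and hypothetical lower-case format labels; a real player would obtain decoder support from the platform rather than from a hard-coded set.

```python
from typing import Set

# Assumed preference order, most efficient first; adjust to your own catalog.
PREFERENCE = ["av1", "hevc", "vp9", "avc"]


def pick_format(client_decoders: Set[str], offered: Set[str]) -> str:
    """Return the preferred format that is both offered and decodable.

    Falls back to "avc", which is assumed to be playable everywhere.
    """
    for fmt in PREFERENCE:
        if fmt in offered and fmt in client_decoders:
            return fmt
    return "avc"


if __name__ == "__main__":
    # Catalog with a long-tail format plus two efficient ones.
    catalog = {"avc", "hevc", "av1"}
    print(pick_format({"avc", "hevc"}, catalog))        # client without AV1 or VP9
    print(pick_format({"avc", "vp9", "av1"}, catalog))  # client with AV1
    print(pick_format({"avc", "vp9"}, catalog))         # VP9-only client, no VP9 in catalog
```

The third case shows the cost of trimming the catalog: a client that only decodes VP9 next to AVC drops back to AVC when neither VP9 nor AV1 is something it can use.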
Wait a Minute, What About the Other Standards?
The careful reader has noticed that some formats are missing from the discussion above. Does that mean that they cannot play a role? Not necessarily. A format such as EVC is backed by technology companies that control a huge share of the mobile chipset market. Still, the question remains where their strategy lies, given that AV1 and VVC have strengths over both profiles of EVC. Silicon area is expensive, so an important question is which formats will be supported on major chipsets. Recently, Qualcomm was reported not to include AV1 support in its Snapdragon 888 series (although it does support VP9). Will Samsung and Qualcomm decide to invest in transistors for EVC, or simply skip to AV1 and/or VVC in the future? In case of the former, this might give a boost to EVC usage on mobile clients. EVC and other formats will need to prove that they can create sufficient critical mass or benefits in certain niche areas.
Is There a Way Out?
MPEG recently went through a major shake-up. Still, it is expected that MPEG will continue to produce high-quality standards and further extensions and successors to VVC. In the meantime, AOMedia has started its preparation towards AV2. On top of that, companies might continue to push their proprietary formats into standards, with the hope of monetizing their IPR. Hence, the multi-codec world will continue to exist.
In general, it’s hard to ignore the huge critical mass around AOMedia, even more so since Samsung and Apple have joined the original founding members. Both bring huge additions to the client ecosystem, which might be the missing pieces of the puzzle. If all current members contribute to AV2 from the start, and make sure all their use cases and concerns are addressed, this could give rise to a very powerful AV2 format with unseen industry support before it’s even finalized. The future could be bright, and simpler than it is today.