Bitmovin Pushes AV1 Forward, Joins Alliance for Open Media
Just when it seemed that the Alliance for Open Media's AV1 codec would be on the back burner through the end of 2017, Bitmovin announced that it is adding AV1 to its VoD and live encoding service and demonstrating live AV1 encoding and playback at NAB. In a blog post about the announcement, Bitmovin details AV1's progress to date and reveals how far the codec needs to go to become usable and how quickly it can get there. Long story short, like objects in your rearview mirror, AV1 may be closer than it appears.
The AV1 Players
Bitmovin was founded in Austria in 2013, initially to market a DASH-based HTML5 player, then adding cloud encoding and analytics. Earlier this year, Bitmovin announced a managed on-premise encoding service based on Docker and Kubernetes.
The Alliance for Open Media (AOM) launched in 2015 to consolidate the open source codec development efforts of several of its founders, including Google (VP10), Mozilla (Daala), Cisco (Thor), Microsoft, and Intel, with Amazon and Netflix also joining as founding members. In 2016, AMD, ARM, and NVIDIA joined the group, and the AOM website currently shows 23 members, not including Bitmovin, which joined in April 2017.
The primary focus of AOM is to develop "media codecs, media formats, and related technologies to address marketplace demand for an open standard for video compression and delivery over the web." In short, a royalty-free codec to use instead of the exceedingly royalty-bearing HEVC. The quality target was "50 percent over VP9/HEVC with reasonable increases in encoding and playback complexity," and in April 2016 we wrote that the target date for freezing the bitstream was "sometime between the end of 2016 and March 2017." Since then, sources had advised that the target date might slip to the end of 2017. Among other details, Bitmovin states in its blog post that "The AV1 bitstream freeze should be in Q3 2017."
Given AOM's membership, things can move quickly from there. That is, Google, Microsoft, and Mozilla can quickly add decode to their respective browsers, while Netflix, Google, and Amazon can start encoding trials for content distribution. While hardware support for AV1 will obviously take longer, the hardware partners in the Alliance have worked alongside the codec developers (more below) during the development process, which should accelerate the availability of AV1 encode/decode in CPUs, GPUs, SoCs, and other hardware.
What Bitmovin Announced
On April 18, Bitmovin announced that its cloud encoding and managed on-premise offering "now supports AV1 encoding for VoD and Live." The company also announced that it would "showcase the first ever AV1 livestream, delivering 1080p playback at 1.5Mbps in broadcast quality at our booth at the NAB Show in Las Vegas (SU9007CM) from April 24-27." Let's deal with the second announcement, then circle back and address the first.
The live AV1 workflow Bitmovin will present at NAB starts with a 12Mbps 1080p 30fps stream generated by live encoding program Open Broadcaster Software (OBS). This is sent to the Google Cloud via RTMP where it's encoded by the Bitmovin cloud encoder to "broadcast quality" 1.5Mbps AV1. Regarding quality, the blog post states, "you would need around 4 to 15Mbps with traditional codecs like H264 to deliver the same quality." Once encoded, the video will be streamed to a desktop for playback via the AOM player and FFmpeg.
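To put the quoted bitrates in perspective, the arithmetic can be sketched as follows. This is a back-of-the-envelope calculation, not a quality measurement: it takes the blog post's figures (1.5Mbps for AV1, 4 to 15Mbps for H.264) at face value, and the function names are hypothetical.

```python
def bitrate_savings(new_mbps: float, old_mbps: float) -> float:
    """Fraction of bandwidth saved when moving from old_mbps to new_mbps."""
    return 1.0 - new_mbps / old_mbps


def megabytes_per_hour(mbps: float) -> float:
    """Data delivered in one hour at a constant bitrate (8 megabits = 1 MB)."""
    return mbps * 3600 / 8


AV1_MBPS = 1.5                 # figure quoted in the Bitmovin blog post
H264_RANGE_MBPS = (4.0, 15.0)  # range quoted for "traditional codecs like H264"

for h264_mbps in H264_RANGE_MBPS:
    saved = bitrate_savings(AV1_MBPS, h264_mbps)
    print(f"{h264_mbps} Mbps H.264 -> {AV1_MBPS} Mbps AV1: {saved:.1%} less bandwidth")

print(f"One hour at {AV1_MBPS} Mbps is about {megabytes_per_hour(AV1_MBPS):.0f} MB")
```

If the claim holds, the saving would range from 62.5 percent at the low end of the H.264 range to 90 percent at the high end, which is why the width of that range matters so much when reading the claim.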
Figure 1 illustrates the difference between encoding the stream on a single notebook and encoding it in the Bitmovin cloud. The 40-second Tears of Steel teaser that Bitmovin will demonstrate at NAB would take close to nine hours to encode on a notebook powered by a 4-core i7-4800MQ CPU. In contrast, the Bitmovin cloud can produce the same output in 34.5 seconds.
Figure 1. Bitmovin cloud compared to a 4-core notebook.
The obvious question is how many cores in the cloud are necessary to achieve this performance. At NAB, the company expects that it will require up to 200 cores. However, Bitmovin CTO Christopher Mueller commented that "it's really early stage and that we mainly want to show the flexibility of our encoding stack...I am very confident that we can bring down the hardware requirements to encode a single 1080p AV1 stream to 8 to 32 cores sooner than later." This makes sense, given that most heavy optimization work won't start until after the bitstream freezes.
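The Figure 1 numbers can be turned into a speed-up and a real-time factor with a few lines of arithmetic. The sketch below assumes that "close to nine hours" means exactly nine hours, which slightly overstates the notebook time; the function names are hypothetical.

```python
NOTEBOOK_SECONDS = 9 * 3600  # "close to nine hours", rounded to 9 h (assumption)
CLOUD_SECONDS = 34.5         # cloud encode time reported in Figure 1
CLIP_SECONDS = 40            # length of the Tears of Steel teaser


def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """How many times faster the fast encode is than the slow one."""
    return slow_seconds / fast_seconds


def realtime_factor(clip_seconds: float, encode_seconds: float) -> float:
    """Values above 1 mean the encode finishes faster than the clip plays."""
    return clip_seconds / encode_seconds


print(f"Cloud vs. notebook speed-up: {speedup(NOTEBOOK_SECONDS, CLOUD_SECONDS):.0f}x")
print(f"Cloud real-time factor: {realtime_factor(CLIP_SECONDS, CLOUD_SECONDS):.2f}")
print(f"Notebook is {NOTEBOOK_SECONDS / CLIP_SECONDS:.0f}x slower than real time")
```

Under that assumption the cloud encode is roughly 940 times faster than the notebook and only slightly faster than real time, which is the threshold a live stream has to clear.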
However, at least in the short term, the real gate to deployment isn't the encoding side; it's the playback side. That is, any company attempting to stream AV1 video would have to supply a player or plug-in until Google, Microsoft, and Mozilla enable browser-based playback. For most video services, that's a non-starter, particularly if browser-based playback is coming, perhaps as soon as the fourth quarter of 2017. Note that Bitmovin does offer an Android SDK with AV1 decode, so if Android playback provides sufficient economic justification to start deploying AV1, you can start sooner than year's end.
Given these realities, let's consider Bitmovin's claim that its cloud and managed platforms now support AV1 encode. While undoubtedly true, the short-term lack of playback options probably makes this relevant for only a very small group of streaming producers.
Beyond describing the detailed inner workings of its NAB presentation, the Bitmovin blog post also provides interesting details regarding the AV1 development process, and how AV1 quality compares to HEVC, VP9, and H.264. Let's take a quick look at both.
The blog post explains that, "the AV1 codec has its roots in the codebase of Google's VP9/VP10 codec with an additional 77 experimental coding tools that have been added and are under consideration. Out of that 77 experimental coding tools, only 8 are currently enabled by default (adapt_scan, ref_mv, filter_7bit, reference_buffer, delta_q, tile_groups, rect_tx, cdef), but the performance of the codec is already appealing."
In other words, the AV1 base is mostly VP10, with different experimental encoding techniques being added to improve quality and/or performance. Of the 77 available, Bitmovin enabled only the eight currently enabled by default in the codec. While this gives us a preliminary look at performance, we won't be able to assess final performance until the bitstream is frozen and encoding recommendations are provided. By the same token, any previous looks at AV1 based upon older code and enabled experiments are almost certainly not representative of AV1's final quality.
The blog post goes on to describe "the high-level process on how experiments can be added to the AV1 codec:"
- Coding tools are added as experiments into the AV1 codebase. They are controlled at build-time by flags (e.g., --enable-experimental --enable-<experiment-name>).
- The hardware team (group of hardware members inside of AOMedia) reviews the experiments to ensure it can be implemented in hardware.
- Each experiment needs to pass an IP review to ensure no IPs are violated.
- Once reviews are passed, the experiment can be enabled by default.
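The build-time flag pattern in the first step can be illustrated with a small helper that turns experiment names into flags. This is a minimal sketch that follows only the pattern quoted in the blog post; the helper name is hypothetical, and how the actual AV1 build scripts parse or normalize these names is not verified here.

```python
def experiment_flags(experiments):
    """Build configure-style flags for the given experiment names.

    Follows the pattern quoted in the blog post:
    --enable-experimental --enable-<experiment-name>
    """
    if not experiments:
        return []  # nothing experimental requested, so no flags needed
    flags = ["--enable-experimental"]
    flags.extend(f"--enable-{name}" for name in experiments)
    return flags


# Two of the experiment names the blog post lists as enabled by default.
print(" ".join(experiment_flags(["cdef", "rect_tx"])))
```

Because experiments are switched on at build time, two encoder builds from the same source tree can produce different results, which is worth keeping in mind when comparing published AV1 tests.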
First, as mentioned above, the hardware members in the group were apprised of codec developments in real time, so they won't be starting from scratch when the bitstream is frozen, and all experiments should be hardware-friendly. This will accelerate the availability of hardware-based encoding/decoding.
Second, the group is clearly hoping to ensure that it doesn't step on anyone's IP, so that free and open source will remain free and open source. No one can predict with any level of assurance whether AV1 will or won't draw infringement claims, but these types of procedures are reassuring.
The blog post then describes comparisons between H.264, HEVC, VP9, and AV1, though it's difficult to generalize from the results. For example, one set of comparisons involved PSNR and SSIM analysis, but these were performed on the animated movie Sintel (Figure 2). Since codecs often perform very differently on animated and real-world footage, it's impossible to predict AV1's comparative performance with real-world video from these results.
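For readers unfamiliar with the metric, PSNR compares a decoded frame to the original sample by sample. The sketch below uses the standard formula for 8-bit video, 10 * log10(255^2 / MSE), on made-up values; it is not Bitmovin's test procedure, which the blog post does not describe in detail, and it leaves out SSIM, which is considerably more involved.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(distorted):
        raise ValueError("frames must have the same number of samples")
    if not reference:
        raise ValueError("frames must not be empty")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames have no measurable error
    return 10 * math.log10(max_value ** 2 / mse)


# Toy 2x2 "frames" of 8-bit luma samples, flattened; the values are made up.
original = [52, 55, 61, 59]
decoded = [54, 55, 60, 57]
print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

Higher values mean the decoded frame is numerically closer to the source, but the score depends heavily on the content being encoded, which is why results from a single animated title are hard to extend to other footage.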
Figure 2. AV1 proved best in PSNR comparisons on the Sintel video.
The blog post also includes comparison frames from Tears of Steel (Figure 3), but the test case was so limited (1080p at 500Kbps and 24 fps) that it's again tough to draw any broad-based conclusions.
Figure 3. Ditto here in this limited comparison with HEVC.
I raised these concerns with Mueller, who promised that Bitmovin will "extend our experiments and publish more results after the NAB with a typical test set containing sports, animation and movies." While I expect AV1 to perform well in these comparisons, we won't know until the results are released.
What's It All Mean?
Overall, Bitmovin's announcement is a very impressive technology demonstration by a relatively small company that revives hopes that AV1 will be meaningful in the relatively short term. However, you should resist the urge to proclaim AV1 the quality king until more comprehensive comparisons are available.