
Software Decoding and the Future of Mobile Video


With AV1 hardware decode on mobile devices stuck in the mid-to-low teens as of 2025, and with VVC at zero, it’s clear that the race to supplant H.264 and HEVC will be contested with software-only decoding. As we sit here in 2025, we know that Meta has aggressively started distributing AV1 streams for software-only playback and has even co-created an open source project called VCAT (Video Codec Acid Test) to help benchmark mobile devices. We’ve heard that some VVC IP owners like Kwai, ByteDance, and Tencent are deploying VVC with software decode, but there’s little data available.

What is clear is that there’s a huge market for $80 Android phones in regions where the bandwidth savings that AV1 and VVC deliver can improve both publisher margins and delivery outcomes. Equally clear is that $80 phones can only include so much codec-specific decoding hardware, and codecs that require hardware playback just won’t compete.

The big questions are, of course, how much more efficient are AV1 and VVC than HEVC and H.264, and how efficiently do they play without hardware? So, those are the two topics I sought to address.

Let me say up front that to a certain extent, this is a fool’s errand. AV1 and VVC are codecs, and we can only compare implementations of those codecs, in this case, SVT-AV1 3.0 and VVenC 1.13. I’ll defend this choice by noting that these open source versions of the codecs will be used by a substantial number of developers. I’ll also note that other vendors offer alternative codecs that deliver far greater savings than those available in the tested versions.

The playback side is even more complicated. There are no open source or commercial VVC players for iOS and Android, and I don’t have 60 Android devices around for testing. I did test AV1 playback on a few devices, but that supplies only half the picture. In addition, the comparative tests I ran on older Windows and Mac computers revealed that player efficiency varies as greatly as codec implementation efficiency, if not more so; therefore, even these results provide only basic guidance.

So, the TL;DR outcome is that my tests showed VVC to be about 7% more efficient encoding-wise than AV1, but it required about 2x–2.5x the CPU to play back the videos. That translates to fewer devices that can play VVC, which means less overall savings than AV1 can deliver by sending slightly larger streams to many more devices.

Now that you know the end of the story, why should you bother reading the article? Well, if you’ll be encoding with SVT-AV1 or VVC, you’ll learn a bit about how to optimize your encodes, particularly the trade-offs that presets deliver, and how many logical processors to use. With SVT-AV1, I also explored the quality and playback efficiency of the fast decode option, although I didn’t find much to show for it. You can also peruse the FFmpeg command strings used for all encodes.


In addition, you’ll learn how much more efficient AV1 and VVC are than H.264, which may finally provide enough motivation to bite off that second codec you’ve been thinking about. Plus, at the end, you’ll get a good laugh at how crude my playback benchmarking efforts were.

Quality Testing Overview

I tested encoding quality with 12 10-second files, including sports, movies, concerts, and animations. Usually, I prefer to test with longer files, but this was untenable given the lengthy encoding times of the advanced codecs using the tested configurations.

Next, I identified the configuration options I would use for each codec, which are presented in Table 1. I decided to test for VOD distribution, so I configured all of the encodes to 2-second GOPs using 200% constrained VBR with default settings for I-frames at scene changes.
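
As a point of reference, here’s a minimal sketch of that rate-control configuration as an FFmpeg command string, using libx265 as the example. The 5Mbps target, 10Mbps cap (200% constrained VBR), and -g 60 (2-second GOPs at 30 fps) are placeholder assumptions, not the actual ladder rungs used in these tests; scene-change I-frames are left at the encoder defaults.

  ffmpeg -i source.mp4 -c:v libx265 -preset veryslow \
    -b:v 5000k -maxrate 10000k -bufsize 10000k -g 60 \
    -an output_hevc.mp4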

Table 1. Critical configuration parameters for each codec

Note that SVT-AV1 doesn’t support constrained VBR, so I tested using average bitrate mode, which may have given SVT-AV1 a slight advantage over the other codecs since it didn’t have to adhere to any maximum bitrate constraints. Given that the test files were generally homogeneous and only 10 seconds long, this advantage is likely pretty slight.
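
The SVT-AV1 equivalent simply drops the maxrate/bufsize caps. This sketch assumes FFmpeg’s libsvtav1 wrapper maps a bare -b:v to the encoder’s average bitrate mode; the bitrate and GOP length are again placeholders.

  ffmpeg -i source.mp4 -c:v libsvtav1 -preset 1 -b:v 5000k -g 60 -an output_av1.mp4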

Next, I focused on preset, threads, and tuning, which I’ll present on a codec-by-codec basis, starting with SVT-AV1. For perspective, as I covered in “The Correct Way to Choose an x264 Preset,” when encoding for even moderate-volume VOD distribution, it makes the most economic sense to encode using the settings that deliver the best possible overall quality. The rationale is this: Most publishers choose a target quality level for their top rung, perhaps 95 VMAF. If you use an encoding configuration that delivers lower quality, you have to boost your bitrate to hit the target.

So, it costs more to deliver that file than if you encoded with higher-quality settings. Even if an encoding configuration takes 10 times longer to encode, if it’s even slightly more efficient, given sufficient distribution volume, it makes economic sense to spend the money and encode at top quality. That’s why Netflix deployed its brute force convex hull encoding method, which encoded each file dozens of times to find the optimal output configuration. And that’s why YouTube encodes its most popular files using AV1 instead of VP9 and did so back when AV1 encoding took hundreds of times longer than VP9.

Intuitively, if you’re only streaming your files to even a few thousand viewers, bandwidth costs aren’t a major consideration, and you’re probably not thinking about upgrading beyond H.264 or HEVC. If your view count is six figures or beyond, it starts to make sense, and at these volumes, it pays to encode at the highest possible quality to minimize bandwidth costs. Accordingly, that’s how I chose most encoding parameters, and I’ll show you the figures later.

Encoding with SVT-AV1

Job number one with any new codec or encoder is figuring out the quality/encoding time trade-offs of its presets. That’s what you see in Figure 1. To produce this, you would do the following (a command-line sketch of the sweep appears after the list):

  • Encode one or more files using all presets and otherwise identical encoding parameters. (I used two 10-second files.)
  • Measure encoding time and whichever metrics you care to track. I used VMAF computed via its Harmonic Mean and the Low-Frame score, or the lowest-quality frame in the video file, which is a potential measure of transient quality issues.
  • Represent all three as a percentage of the longest encoding time or highest-quality score, and plot as shown in Figure 1.
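
Here’s a rough shell sketch of that sweep for SVT-AV1, assuming an FFmpeg build that includes libsvtav1 and the libvmaf filter. The source file, bitrate, and GOP length are placeholders; the pooled harmonic mean and the per-frame (low-frame) VMAF scores come from the JSON log that libvmaf writes for each preset.

  # Sweep every SVT-AV1 preset, timing each encode and scoring it with VMAF
  for P in $(seq 0 13); do
    /usr/bin/time -o time_p$P.txt ffmpeg -y -i source.mp4 -c:v libsvtav1 \
      -preset $P -b:v 5000k -g 60 -an enc_p$P.mp4
    # First input is the encode (distorted), second is the source (reference)
    ffmpeg -i enc_p$P.mp4 -i source.mp4 \
      -lavfi "[0:v][1:v]libvmaf=log_fmt=json:log_path=vmaf_p$P.json" -f null -
  done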

Figure 1. Finding the optimal preset for SVT-AV1

For the record, note that the longest average encoding time for the two 10-second files was about 10 minutes, which dropped to 4 seconds at preset 12. This is an excellent range of quality/throughput options that enables SVT-AV1 to serve in multiple live and VOD workflows. In contrast, the equivalent numbers for VVC were 46 minutes, dropping to 39 seconds, which obviously takes most live applications off the table without extensive parallelization.

As Figure 1 shows, preset 1 (between 0 and 2 on the right) delivered 100% of the quality in about 50% of the encoding time of preset 0. Preset 2 is pretty alluring, as you only lose about 0.2 harmonic mean VMAF points and even less on the low-frame score, while cutting encoding costs by 60%. But I stuck to my guns and tested using preset 1.

SVT-AV1 Level of Parallelism

The next configuration option considered was SVT-AV1’s level of parallelism, which “controls the number of threads to create and the number of picture buffers to allocate (higher level means more parallelism).”

I tested this because it’s one of the few options that works similarly on most input files, in contrast to B-frames, reference frames, psychovisual optimizations, and others that may be appropriate in some instances but not in others. Testing is straightforward: encode using preset 1 at the various LP values shown in the columns of Table 2 (in my case, testing on a Windows workstation with 32 threads). Check the bitrate to ensure rough equivalency, then measure encoding time and the two quality metrics. Green is good, and yellow is bad, so four logical processors delivered both the best overall and low-frame quality. In general, with threads, you’d expect a value of one to deliver both the best encoding quality and the longest encoding time, so four was an unexpected but happy result, since it accelerated the remainder of my test encodes.
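
To vary the level of parallelism from the FFmpeg command line, you pass lp through -svtav1-params. A sketch, with the rate-control values as placeholders:

  ffmpeg -i source.mp4 -c:v libsvtav1 -preset 1 -b:v 5000k -g 60 \
    -svtav1-params "lp=4" -an out_lp4.mp4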

Table 2. SVT-AV1’s level of parallelism

SVT-AV1 and Tuning

Tuning is like the crazy uncle you don’t like to acknowledge, but you have to explain to your new significant other before you take them to meet the family. Here’s the CliffsNotes explanation. In this comparison, I gauged quality using the VMAF metric. Like all full reference metrics, VMAF compares the encoded file to the source and derives a score. The more similar the encoded file is to the source, the higher the score. The more different it is, the lower the score.

Certain encoding parameters, like adaptive quantization, are psychovisual optimizations that encode different regions of the frame using different settings, in theory to improve the subjective quality of the video, which is how it would be gauged by human eyes. So, the encoder might apply greater compression to a smooth region in the video while applying lower compression to a complex region. Most video quality metrics, including VMAF to a degree, see these adjustments as “differences” that reduce the metric score, even though the frame might look better to a human.

Codec developers know that researchers will use metrics to compare their codecs. So, they create simple switches that reverse the psychovisual optimizations that reduce the score. They don’t add configurations to boost the score (and game the system); they remove configuration options that artificially lower the score. For this reason, whenever you compare codecs using metrics, you should always check the codec documentation and apply any configurations that “tune” for these metrics.

VMAF is an odd case because it’s designed to incorporate aspects of the human visual system and is correlated with subjective results via machine learning. What looks better to humans should look better to VMAF.

In contrast, PSNR and SSIM are primarily difference-based metrics. You should always tune for PSNR and SSIM, but with VMAF, it’s less clear. However, after chatting with Fraunhofer (which doesn’t think VMAF fairly judges its codec anyway), I decided to tune with all of the metrics. I realize this opens up a rabbit hole for discussion, hence the crazy uncle reference.

Most codecs provide guidance regarding tuning mechanisms. This is from the SVT-AV1 docs: “Specifies whether to use PSNR or VQ as the tuning metric [0 = VQ, 1 = PSNR, 2 = SSIM].” I tried all three values plus no tuning mechanism. A setting of 2 didn’t work, and 1 provided the highest VMAF score on the tested files, so I used 1 in my comparisons.
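
Putting the preset, parallelism, and tuning decisions together, the SVT-AV1 command string I settled on looks roughly like this; the bitrate and GOP length are placeholders, and tune=1 selects the PSNR tuning discussed above.

  ffmpeg -i source.mp4 -c:v libsvtav1 -preset 1 -b:v 5000k -g 60 \
    -svtav1-params "lp=4:tune=1" -an out_av1.mp4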

Fraunhofer's VVenC Encoder

Rinse and repeat the same analysis with VVenC. Figure 2 shows the preset analysis. Unlike SVT-AV1 (and x264/x265), the highest-quality preset (slower) does deliver the top quality. Note the significant trade-off, however: essentially 4x the encoding time for a 0.5% increase in VMAF score and 0.82% for low frame. Still, I used the slower preset in all of my test encodes.

Figure 2. Finding the optimal preset for VVenC

Next, I analyzed how the number of threads impacted encoding time and quality, which you can see in Table 3. Again, it was a nice surprise because the eight-thread encode was about 4x faster than one thread, although only 20% faster than four threads, which raises an interesting point for those producing on Amazon Web Services or other cloud platforms. That is, Amazon Web Services charges linearly by cores, so an eight-core system costs twice that of a four-core system, but it will only deliver 20% greater throughput. So, even though I used eight cores for my comparison encodes, a four-core system would make more sense for production, and you might be tempted to ignore the 0.04 VMAF delta and deploy single-core systems for the lowest cost per stream.

Table 3. Testing multiple threads with VVenC

To explain, four single-core systems could produce four files in just more than 3 hours. In contrast, a four-core system (which costs the same as those four single-core systems) could produce only three files and change in the same 3 hours (3 x 56:47), while an eight-core system, at eight times the single-core cost, could produce only four files in the same 3 hours.

For tuning, VVenC offers a perceptual QP adaptation switch (-qpa) that improves “subjective video quality,” presumably to the detriment of metric scores.

The values are 0 (off) and 1 (on). I encoded both ways and found the VMAF scores to be slightly higher with this feature disabled, so I tested at 0.

I tested x264 and x265 for perspective only, using a late-2024 FFmpeg version. I’ve run through the preset/threads testing many times for these codecs, and veryslow is always the highest-quality preset (never placebo), and a single thread always delivers the best quality. In addition, tuning for SSIM always delivers the best quality, so that’s what I used for these encodes.
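
For reference, the x264/x265 configuration is straightforward, while the libvvenc options vary by FFmpeg build; in particular, the -vvenc-params key for disabling perceptual QP adaptation, and whether -g, -threads, and -b:v map cleanly onto the encoder, are assumptions here, so check ffmpeg -h encoder=libvvenc for your build. Bitrates and GOP length remain placeholders.

  # x265: veryslow preset, SSIM tuning, single thread
  ffmpeg -i source.mp4 -c:v libx265 -preset veryslow -tune ssim -threads 1 \
    -b:v 5000k -maxrate 10000k -bufsize 10000k -g 60 -an out_x265.mp4

  # VVenC: slower preset, eight threads, perceptual QP adaptation off
  # (the 200% maxrate constraint is omitted; how it is exposed depends on the wrapper)
  ffmpeg -i source.mp4 -c:v libvvenc -preset slower -threads 8 -b:v 5000k -g 60 \
    -vvenc-params "qpa=0" -an out_vvc.mp4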

Encoding Practices

When testing codec quality, you typically test at four or more bitrates with each file to produce both rate-distortion curves and Bjontegaard Delta-Rate (BD Rate) comparisons. For maximum relevance, it’s best to test within the typical usage range for that codec. For premium content producers, a quality level of 95 VMAF points is a common target.

When you’re encoding disparate files, this means four different bitrates for each clip. I tested different VVC bitrates with each file to create a range from 85 to 95 VMAF points. Then I encoded all of the other codecs at those same four bitrates.

There’s a bias against showing consolidated BD Rate curves for multiple files because outliers can disproportionately impact the average score. If you encode all 1080p30 files from 1Mbps to 15Mbps, animated files might push 95 VMAF at 1.5Mbps, while Crowdrun wouldn’t reach it at 15Mbps. Averaging the results produces a meaningless score. However, if you normalize the results for a specific target range, as I did here, it minimizes the potential for distortion. That’s the theory anyway. Figure 3 presents the summary rate-distortion curves for the 12 test files.

VVC is on top of SVT-AV1 by a slight margin, but what does that translate to in numbers? You can see that in Table 4, the BD Rate results. Here, VVC shows a 7.61% advantage over SVT-AV1. For perspective, note that when I compared VVenC to libaom back in 2021, VVC had a 5.55% advantage.

Table 4. BD Rate comparisons

Why do the x264 comparisons stop at -100% for SVT-AV1 and VVenC? I haven’t studied the macro, but if you look at Figure 3, you can see that x264 never intersects the minimum quality levels of either advanced codec, which seems to make it impossible to compute an accurate BD Rate comparison. If you want accurate BD Rate comparisons, you’ll have to extend the tested bitrates upward, downward, or both until H.264 intersects.

Figure 3. Consolidated rate distortion curves for all 12 test files

It’s worth noting that at the tested quality levels, VVenC took about 4.5x longer to encode than SVT-AV1. To keep the numbers simple, assume that VVenC costs $45/hour to encode, compared to $10/hour for SVT-AV1. On the plus side, VVenC would allow you to reduce your bitrate by 7.61% while delivering the same quality as SVT-AV1. If you assume a 1-hour file and a $0.01 delivery charge per GB, VVenC becomes the more affordable option after about 29,201 complete views.

You can follow the logic in Table 5, which you can download here. It’s simple. SVT-AV1 cuts the encoding cost by $35 but increases the 1-hour distribution cost by $0.001199. Divide $35 by this increase to get to 29,201; beyond that, SVT-AV1 is the more expensive alternative. As previously mentioned, although this sounds like a big number, most companies considering SVT-AV1 and VVC in 2025–2027 are top-of-the-pyramid streamers, for which 29,201 is relatively modest. If you’re a smaller streaming producer on the other side of this number, VVenC makes little sense.
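
To make the arithmetic explicit, using the assumed figures above: breakeven views = encoding savings ÷ per-view delivery penalty = $35 ÷ $0.001199 per view ≈ 29,200 views per 1-hour title, in line with Table 5’s breakeven figure. Plug in your own encoding rates, per-GB delivery pricing, and bitrate delta to see which side of the line your catalog falls on.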

Table 5. Breakeven in viewer hours

I recognize that this calculation ignores the carbon impact of VVenC, which is 4.5x that of SVT-AV1. But this opens another can of worms that’s well beyond the scope of this article. So, let’s venture to the playback side.

Playback Efficiency

It’s safe to say that most services care much more about mobile efficiency than efficient playback on computers. Data from both the AV1 and VVC camps is extremely positive. Speaking at ACM Mile-High Video 2025, Meta’s David Ronca reported that phones like the Motorola e13 play back 1080p30 video for up to 11 hours, which he deemed a “bad experience.” This increases to 45 hours at 720p30. As I mentioned in 2023, multiple VVC patent owners reported highly efficient 4K playback on mobile devices.

However, while the Alliance for Open Media and its members have worked together to provide an efficient player and great information on how to use it, VVC users haven’t done the same. So, we don’t have access to hyper-efficient players, nor do we understand how these UGC companies ensure viewer QoE.

In a perfect world, I would have tested VVC and AV1 on multiple mobile devices using a similar player and reported precise CPU usage and battery consumption tests. In this world, the lack of a player capable of Android/iOS playback of both VVC and AV1, not to mention having only a few devices available for testing, prevented this.

So, I tested on older Windows and Mac computers and found that VVC required at least 2x the CPU of AV1. Figure 4 shows playback using the open source MPV Windows player (go2sm.com/mpv) on a circa-2012 HP Compaq Pro 6300 with a 3.4 GHz Intel i7-3770 CPU and 16GB of RAM.
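
One minimal way to run this kind of test is to force software decoding in the player so a hardware decoder can’t mask the codec’s CPU cost; with mpv, that looks like the command below (the clip name is a placeholder, and this isn’t necessarily the exact configuration behind Figure 4). Loop the file and watch CPU usage in Task Manager, Activity Monitor, or top until the reading stabilizes.

  mpv --hwdec=no --loop-file=inf clip_vvc.mp4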

Figure 4. CPU utilization of VVC, AV1, and HEVC in software on an HP Compaq Pro 6300 with a 3.4 GHz Intel i7-3770 CPU

Figure 5 shows CPU utilization playing VVC, AV1, and HEVC with VLC Player on a mid-2015 MacBook Pro powered by a 2.2 GHz Quad-Core Intel Core i7 with 16GB of RAM. Each core has two threads, as shown in Figure 5. While VVC took a big chunk of power from all threads, HEVC and AV1 were relatively quiet on the even-numbered threads. Again, VVC is clearly at least twice as CPU-hungry as AV1 and orders of magnitude greater than HEVC. Note that neither machine has a hardware-based HEVC decoder.

Figure 5. CPU utilization playing VVC, AV1, and HEVC with VLC Player on a mid-2015 MacBook Pro powered by a 2.2 GHz Quad-Core Intel Core i7 with 16GB of RAM

While player efficiency is undoubtedly player-specific, until proven otherwise, this sets the expectation that VVC will require roughly double the CPU resources for decoding as AV1. This means acceptable QoE on fewer devices and a lower return on your VVC encoding dollars.

On a final note, I did test SVT-AV1’s fast decode option (Figure 6) but found it to have a negligible impact on either quality or playback efficiency. I did not use this switch in my quality benchmark encodes.
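
For anyone who wants to try it, fast decode is exposed through -svtav1-params; a sketch with placeholder rate-control settings follows. Recent SVT-AV1 builds accept fast-decode values beyond 1, so check the documentation for the version you’re running.

  ffmpeg -i source.mp4 -c:v libsvtav1 -preset 1 -b:v 5000k -g 60 \
    -svtav1-params "lp=4:tune=1:fast-decode=1" -an out_fd1.mp4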

Figure 6. SVT-AV1 normal versus fast decode
