Software Decoding and the Future of Mobile Video
With AV1 hardware decode on mobile devices stuck in the mid-to-low teens as of 2025, and with VVC at zero, it’s clear that the race to supplant H.264 and HEVC will be contested with software-only decoding. As we sit here in 2025, we know that Meta has aggressively started distributing AV1 streams for software-only playback and has even co-created an open source project called VCAT (Video Codec Acid Test) to help benchmark mobile devices. We’ve heard that some VVC IP owners like Kwai, ByteDance, and Tencent are deploying VVC with software decode, but there’s little data available.
What is clear is that there’s a huge market for $80 Android phones in regions where the bandwidth savings that AV1 and VVC deliver can improve both publisher margins and delivery outcomes. Equally clear is that $80 phones can only include so much codec-specific decoding hardware, and codecs that require hardware playback just won’t compete.
The big questions are, of course, how much more efficient AV1 and VVC are than HEVC and H.264, and how efficiently they play back without hardware support. Those are the two topics I sought to address.
Let me say up front that to a certain extent, this is a fool’s errand. AV1 and VVC are codecs, and we can only compare implementations of those codecs, in this case, SVT-AV1 3.0 and VVenC 1.13. I’ll defend this choice by noting that these open source implementations will be used by a substantial number of developers. I’ll also note that other vendors offer alternative implementations that deliver far greater savings than those available in the tested versions.
The playback side is even more complicated. There are no open source or commercial VVC players for iOS and Android, and I don’t have 60 Android devices around for testing. I did test AV1 playback on a few devices, but that supplies only half the picture. In addition, the comparative tests I ran on older Windows and Mac computers revealed that player efficiency varies as greatly as codec implementation efficiency, if not more so. Therefore, even these results provide only basic guidance.
So, the TL;DR outcome is that my tests showed VVC to be about 7% more efficient encoding-wise than AV1, but it required about 2x–2.5x the CPU to play back the videos. This translates to fewer devices that can play VVC, which means less overall savings than AV1 affords by delivering slightly larger streams to many more devices.
Now that you know the end of the story, why should you bother reading the article? Well, if you’ll be encoding with SVT-AV1 or VVC, you’ll learn a bit about how to optimize your encodes, particularly the trade-offs that presets deliver, and how many logical processors to use. With SVT-AV1, I also explored the quality and playback efficiency of the fast decode option, although I didn’t find much to show for it. You can also peruse the FFmpeg command strings used for all encodes.

In addition, you’ll learn how much more efficient AV1 and VVC are than H.264, which may finally provide enough motivation to bite off that second codec you’ve been thinking about. Plus, at the end, you’ll get a good laugh at how crude my playback benchmarking efforts were.
Quality Testing Overview
I tested encoding quality with 12 10-second files, including sports, movies, concerts, and animations. Usually, I prefer to test with longer files, but this was untenable given the lengthy encoding times of the advanced codecs using the tested configurations.
Next, I identified the configuration options I would use for each codec, which are presented in Table 1. I decided to test for VOD distribution, so I configured all of the encodes to 2-second GOPs using 200% constrained VBR with default settings for I-frames at scene changes.
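This configuration can be sketched as FFmpeg command strings. This is a hedged sketch, not the article’s exact command lines: the bitrate, frame rate, and file names are placeholders, and scene-change I-frame insertion is left at each encoder’s defaults.

```python
# Sketch of the FFmpeg command strings implied by Table 1. The bitrate,
# frame rate, and file names are placeholders, not the article's values.
target_kbps = 3500          # hypothetical top-rung bitrate
fps = 30                    # 2-second GOP => keyframe interval = 2 * fps

def x265_vod_cmd(src, out):
    """x265 encode: 2-second GOP, 200% constrained VBR."""
    return (
        f"ffmpeg -i {src} -c:v libx265 "
        f"-b:v {target_kbps}k -maxrate {target_kbps * 2}k "
        f"-bufsize {target_kbps * 2}k -g {fps * 2} {out}"
    )

def svtav1_vod_cmd(src, out):
    """SVT-AV1 encode: average bitrate mode (no maxrate, since SVT-AV1
    doesn't support constrained VBR), 2-second GOP."""
    return (
        f"ffmpeg -i {src} -c:v libsvtav1 "
        f"-b:v {target_kbps}k -g {fps * 2} {out}"
    )
```

Note that only the x265 command carries the 200% `-maxrate`/`-bufsize` cap; the SVT-AV1 command omits it, mirroring the average bitrate mode discussed below.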

Table 1. Critical configuration parameters for each codec
Note that SVT-AV1 doesn’t support constrained VBR, so I tested using average bitrate mode, which may have given SVT-AV1 a slight advantage over the other codecs since it didn’t have to adhere to any maximum bitrate constraints. Given that the test files were generally homogeneous and only 10 seconds long, this advantage is likely pretty slight.
Next, I focused on preset, threads, and tuning, which I’ll present on a codec-by-codec basis, starting with SVT-AV1. For perspective, as I covered in “The Correct Way to Choose an x264 Preset,” when encoding for even moderate-volume VOD distribution, it makes the most economic sense to encode using the settings that deliver the best possible overall quality. The rationale is this: Most publishers choose a target quality level for their top rung, perhaps 95 VMAF. If you use an encoding configuration that delivers lower quality, you have to boost your bitrate to hit the target.
So, it costs more to deliver that file than if you encoded with higher-quality settings. Even if an encoding configuration takes 10 times longer to encode, if it’s even slightly more efficient, given sufficient distribution volume, it makes economic sense to spend the money and encode at top quality. That’s why Netflix deployed its brute force convex hull encoding method, which encoded each file dozens of times to find the optimal output configuration. And that’s why YouTube encodes its most popular files using AV1 instead of VP9 and did so back when AV1 took well over a few hundred times longer than VP9 to encode.
Intuitively, if you’re only streaming your files to even a few thousand viewers, bandwidth costs aren’t a major consideration, and you’re probably not thinking about upgrading beyond H.264 or HEVC. If your view count is six figures or beyond, it starts to make sense, and at these volumes, it pays to encode at the highest possible quality to minimize bandwidth costs. Accordingly, that’s how I chose most encoding parameters, and I’ll show you the figures later.
Encoding with SVT-AV1
Job number one with any new codec or encoder is figuring out the quality/encoding time trade-offs of its presets. That’s what you see in Figure 1. To produce this, you would do the following:
- Encode one or more files using all presets and otherwise identical encoding parameters. (I used two 10-second files.)
- Measure encoding time and whichever metrics you care to track. I used VMAF computed via its Harmonic Mean and the Low-Frame score, or the lowest-quality frame in the video file, which is a potential measure of transient quality issues.
- Represent all three as a percentage of the longest encoding time or highest-quality score, and plot as shown in Figure 1.
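The normalization in the last step can be sketched in a few lines of Python. The preset numbers below are invented for illustration, not measured values.

```python
# Illustrative sketch of the normalization step: express each preset's
# encoding time and quality scores as a percentage of the maximum so all
# three series fit on one chart. Values are made up, not measured.
presets = {
    # preset: (encode_seconds, vmaf_harmonic_mean, vmaf_low_frame)
    0: (600.0, 95.0, 88.0),
    1: (300.0, 95.0, 87.9),
    2: (120.0, 94.8, 87.8),
}

def normalize(rows):
    max_time = max(t for t, _, _ in rows.values())
    max_hm = max(hm for _, hm, _ in rows.values())
    max_lf = max(lf for _, _, lf in rows.values())
    return {
        p: (t / max_time * 100, hm / max_hm * 100, lf / max_lf * 100)
        for p, (t, hm, lf) in rows.items()
    }

pct = normalize(presets)
```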

Figure 1. Finding the optimal preset for SVT-AV1
For the record, note that the longest average encoding time for the two 10-second files was about 10 minutes, which dropped to 4 seconds at preset 12. This is an excellent range of quality/throughput options that enables SVT-AV1 to serve in multiple live and VOD workflows. In contrast, the equivalent numbers for VVC were 46 minutes, dropping to 39 seconds, which obviously takes most live applications off the table without extensive parallelization.
As Figure 1 shows, preset 1 (between 0 and 2 on the right) delivered 100% of the quality in about 50% of the encoding time of preset 0. Preset 2 is pretty alluring, as you only lose about 0.2 harmonic mean VMAF points and even less low-frame scoring while cutting encoding costs by 60%. But I stuck to my guns and tested using preset 1.
SVT-AV1 Level of Parallelism
The next configuration option considered was SVT-AV1’s level of parallelism, which “controls the number of threads to create and the number of picture buffers to allocate (higher level means more parallelism).”
I tested this because it’s one of the few options that works similarly on most input files, in contrast to B-frames, reference frames, psychovisual optimizations, and others that may be appropriate in some instances but not in others. Testing is straightforward: encode using preset 1 at the various LP values shown in the columns of Table 2, in my case on a Windows workstation with 32 threads. Check the bitrate to ensure rough equivalency, then measure encoding time and the two quality metrics. Green is good, and yellow is bad; four logical processors delivered both the best overall and low-frame quality. In general, with threads, you’d expect a value of one to deliver both the best encoding quality and the longest encoding time, so four was an unexpected but happy result, since it accelerated the remainder of my test encodes.

Table 2. SVT-AV1’s level of parallelism
SVT-AV1 and Tuning
Tuning is like the crazy uncle you don’t like to acknowledge, but you have to explain to your new significant other before you take them to meet the family. Here’s the CliffsNotes explanation. In this comparison, I gauged quality using the VMAF metric. Like all full reference metrics, VMAF compares the encoded file to the source and derives a score. The more similar the encoded file is to the source, the higher the score. The more different it is, the lower the score.
Certain encoding parameters, like adaptive quantization, are psychovisual optimizations that encode different regions of the frame using different settings, in theory to improve the subjective quality of the video, which is how it would be gauged by human eyes. So, the encoder might apply greater compression to a smooth region in the video while applying lower compression to a complex region. Most video quality metrics, including VMAF to a degree, see these adjustments as “differences” that reduce the metric score, even though the frame might look better to a human.
Codec developers know that researchers will use metrics to compare their codecs. So, they create simple switches that reverse the psychovisual optimizations that reduce the score. They don’t add configurations to boost the score (and game the system); they remove configuration options that artificially lower the score. For this reason, whenever you compare codecs using metrics, you should always check the codec documentation and apply any configurations that “tune” for these metrics.

VMAF is an odd case because it’s designed to incorporate aspects of the human visual system and is correlated with subjective results via machine learning. What looks better to humans should look better to VMAF.
In contrast, PSNR and SSIM are primarily difference-based metrics. You always should tune with PSNR and SSIM, but with VMAF, it’s less clear. However, after chatting with Fraunhofer (which doesn’t think VMAF fairly judges its codec anyway), I decided to tune with all of the metrics. I realize this opens up a rabbit hole for discussion, hence the crazy uncle reference. Most codecs provide guidance regarding tuning mechanisms. This is from the SVT-AV1 docs: “Specifies whether to use PSNR or VQ as the tuning metric [0 = VQ, 1 = PSNR, 2 = SSIM].” I tried all three values plus no tuning mechanism. A setting of 2 didn’t work, and 1 provided the highest VMAF score on the tested files, so I used 1 in my comparisons.
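Pulling the preset, level of parallelism, and tuning choices together, here’s how they might appear in a single FFmpeg command string. A hedged sketch: `lp` and `tune` are documented SVT-AV1 parameters passed via `-svtav1-params`, but the file names are placeholders.

```python
# Hedged sketch combining the winning settings from the tests above
# (preset 1, lp=4, tune=1/PSNR). Input/output names are placeholders.
def svtav1_cmd(src, out, preset=1, lp=4, tune=1):
    # lp = level of parallelism; tune: 0 = VQ, 1 = PSNR, 2 = SSIM
    return (
        f"ffmpeg -i {src} -c:v libsvtav1 -preset {preset} "
        f"-svtav1-params lp={lp}:tune={tune} {out}"
    )
```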
Fraunhofer's VVenC Encoder
Rinse and repeat the same analysis with VVenC. Figure 2 shows the presets analysis. Unlike SVT-AV1 (and x264/x265), the highest-quality preset (slower) does deliver the top quality. Note the significant trade-off, however, essentially 4x the encoding time for a 0.5% increase in VMAF score and 0.82% for low frame. Still, I used the slower preset in all of my test encodes.

Figure 2. Finding the optimal preset for VVenC
Next, I analyzed how the number of threads impacted encoding time and quality, which you can see in Table 3. Again, it was a nice surprise because the eight-thread encode was about 4x faster than one thread, although only 20% faster than four threads, which raises an interesting point for those producing on Amazon Web Services or other cloud platforms. That is, Amazon Web Services charges linearly by cores, so an eight-core system costs twice that of a four-core system, but it will only deliver 20% greater throughput. So, even though I used eight cores for my comparison encodes, a four-core system would make more sense for production, and you might be tempted to ignore the 0.04 VMAF delta and deploy single-core systems for the lowest cost per stream.

Table 3. Testing multiple threads with VVenC
To explain, four single-core systems could produce four files in just over 3 hours. In contrast, a four-core system, which costs the same as those four single-core machines, could produce only three files and change in 3 hours (3 x 56:47), while an 8-core system, at double that cost, could produce only about four files in the same 3 hours.
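The cost-per-file arithmetic behind this can be sketched as follows, using the timings stated above (about 3 hours per file on one core, 56:47 on four, and a further 20% speedup on eight) and an arbitrary per-core hourly rate, since only the ratios matter.

```python
# Cost per file scales with cores x hours when cloud pricing is linear
# in core count. Timings are taken or derived from the article's figures;
# the hourly rate is an arbitrary unit since only the ratios matter.
HOURLY_RATE_PER_CORE = 1.0

encode_hours = {
    1: 3.0,                            # "just over 3 hours" per file
    4: (56 * 60 + 47) / 3600,          # 56:47 -> ~0.95 hours
    8: ((56 * 60 + 47) / 3600) / 1.2,  # 20% faster than four cores
}

cost_per_file = {
    cores: cores * HOURLY_RATE_PER_CORE * hours
    for cores, hours in encode_hours.items()
}
```

The single-core configuration comes out cheapest per file, which is the article’s point about minimizing cost per stream.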
For tuning, VVenC offers a Perceptual QP adaptation switch (-qpa) that improves “subjective video quality,” presumably to the detriment of metric scores.
The values are 0: off, and 1: on. I encoded both ways and found the VMAF scores to be slightly higher with this feature disabled, so I tested at 0.

I tested x264 and x265 for perspective, using a late 2024 FFmpeg version. I’ve run through the preset/threads testing many times for these codecs: veryslow is always the highest-quality preset (never placebo), and a single thread always delivers the best quality. In addition, tuning for SSIM always delivers the best quality, so that’s what I used for these encodes.
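For reference, the VVenC settings used above might be assembled into a vvencapp command string like this. A sketch under assumptions: `--preset`, `--qpa`, and `--threads` are documented vvencapp options, but the raw-video input details and file names are placeholders, not the article’s actual sources.

```python
# Hedged sketch of a vvencapp invocation with the tested settings
# (preset slower, qpa off, eight threads). Size, frame rate, and file
# names are placeholder assumptions.
def vvenc_cmd(src_yuv, out, width=1920, height=1080, fps=30):
    return (
        f"vvencapp -i {src_yuv} --size {width}x{height} "
        f"--framerate {fps} --preset slower --qpa 0 --threads 8 "
        f"-o {out}"
    )
```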
Encoding Practices
When testing codec quality, you typically test at four or more bitrates with each file to produce both rate-distortion curves and Bjontegaard Delta-Rate (BD Rate) comparisons. For maximum relevance, it’s best to test within the typical usage range for that codec. For premium content producers, a quality level of 95 VMAF points is a common target.
When you’re encoding disparate files, this means four different bitrates for each clip. I tested different VVC bitrates to create a range from 85 to 95 VMAF points, then encoded all other codecs at those four data points.
There’s a bias against showing consolidated BD Rate curves for multiple files because outliers can disproportionately impact the average score. If you encode all 1080p30 files from 1Mbps to 15Mbps, animated files might push 95 VMAF at 1.5Mbps, while Crowdrun wouldn’t reach it at 15Mbps. Averaging the results produces a meaningless score. However, if you normalize the results for a specific target range, as I did here, it minimizes the potential for distortion. That’s the theory anyway. Figure 3 presents the summary rate-distortion curves for the 12 test files.
VVC is on top of SVT-AV1 by a slight margin, but what does that translate to in numbers? You can see that in Table 4, the BD Rate results. Here, VVC shows a 7.61% advantage over SVT-AV1. For perspective, note that when I compared VVenC to libaom back in 2021, VVC had a 5.55% advantage.

Table 4. BD Rate comparisons
Why do the x264 comparisons stop at -100% for SVT-AV1 and VVenC? I haven’t studied the macro, but if you look at Figure 3, you can see that x264 never intersects the minimum quality levels of either advanced codec, which seems to make it impossible to compute an accurate BD Rate comparison. If you want accurate BD Rate comparisons, you’ll have to extend the tested bitrates upward, downward, or both until H.264 intersects.
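To see why non-overlapping curves break the calculation, here’s a simplified BD Rate sketch. This is not the standard Bjøntegaard method, which fits cubic polynomials to the rate-distortion points; this piecewise-linear stand-in just makes the overlap requirement explicit: if one codec’s quality range never intersects the other’s, there is no interval to integrate over.

```python
import math

def bd_rate_linear(ref, test):
    """Simplified BD Rate: average log-bitrate gap over the overlapping
    quality range, piecewise-linear instead of the usual cubic fit.
    ref/test: lists of (bitrate_kbps, vmaf), sorted by ascending quality."""
    def log_rate(points, q):
        # Linear interpolation of log(bitrate) as a function of quality.
        for (r0, q0), (r1, q1) in zip(points, points[1:]):
            if q0 <= q <= q1:
                t = (q - q0) / (q1 - q0)
                return math.log(r0) + t * (math.log(r1) - math.log(r0))
        raise ValueError("quality outside curve overlap")

    lo = max(ref[0][1], test[0][1])    # overlap of the two quality ranges
    hi = min(ref[-1][1], test[-1][1])
    n = 200
    gaps = [log_rate(test, lo + (hi - lo) * i / n) -
            log_rate(ref, lo + (hi - lo) * i / n) for i in range(n + 1)]
    return (math.exp(sum(gaps) / len(gaps)) - 1) * 100   # percent
```

A curve that reaches the same quality at half the bitrate scores -50%; identical curves score 0%.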

Figure 3. Consolidated rate distortion curves for all 12 test files
It’s worth noting that at the tested quality levels, VVenC took about 4.5x longer to encode than SVT-AV1. To keep the numbers simple, assume that VVenC costs $45/hour to encode, as compared to $10/hour for SVT-AV1. On the plus side, VVenC would allow you to reduce your bitrate by 7.61% while delivering the same quality as SVT-AV1. If you assume a 1-hour file and a delivery charge of $0.01 per GB, VVenC becomes the more affordable option after about 29,201 complete views.
You can follow the logic in Table 5, which you can download here. It’s simple. SVT-AV1 cuts the encoding cost by $35 but increases the 1-hour distribution cost by $0.001199. Divide $35 by this increase to get to 29,201; beyond that, SVT-AV1 is the more expensive alternative. As previously mentioned, although this sounds like a big number, most companies considering SVT-AV1 and VVC in 2025–2027 are top of the pyramid streamers, for which 29,201 is relatively modest. If you’re a smaller streaming producer on the other side of this number, VVenC makes little sense.
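The breakeven arithmetic can be reproduced as follows. The dollar figures come from the discussion above; the 3.5Mbps top-rung bitrate is my assumption, chosen because it reproduces the roughly $0.001199 per-view delivery delta cited in the text.

```python
# Breakeven views for VVenC vs. SVT-AV1. Dollar figures are from the
# article; the 3.5Mbps top-rung bitrate is an assumption that reproduces
# the ~$0.001199 per-view delivery delta.
ENCODE_COST_SVT = 10.0      # $ to encode 1 hour of content
ENCODE_COST_VVENC = 45.0    # 4.5x the SVT-AV1 encoding cost
BITRATE_MBPS = 3.5          # assumed top-rung bitrate
VVC_SAVINGS = 0.0761        # VVenC's 7.61% BD Rate advantage
COST_PER_GB = 0.01          # $ per GB delivered

gb_per_view = BITRATE_MBPS * 3600 / 8 / 1000         # one 1-hour view
delta_per_view = gb_per_view * VVC_SAVINGS * COST_PER_GB
breakeven_views = (ENCODE_COST_VVENC - ENCODE_COST_SVT) / delta_per_view
```

Below the breakeven view count, SVT-AV1’s cheaper encode wins; above it, VVenC’s smaller streams pay for the extra encoding cost.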

Table 5. Breakeven in viewer hours
I recognize that this calculation ignores the carbon impact of VVenC, which is 4.5x that of SVT-AV1. But this opens another can of worms that’s well beyond the scope of this article. So, let’s venture to the playback side.
Playback Efficiency
It’s safe to say that most services care much more about mobile efficiency than efficient playback on computers. Data from both the AV1 and VVC camps is extremely positive. Speaking at ACM Mile-High Video 2025, Meta’s David Ronca reported that phones like the Motorola e13 play back 1080p30 video for up to 11 hours, which he deemed a “bad experience.” This increases to 45 hours at 720p30. As I mentioned in 2023, multiple VVC patent owners reported highly efficient 4K playback on mobile devices.
However, while the Alliance for Open Media and its members have worked together to provide an efficient player and great information on how to use it, VVC users haven’t done the same. So, we don’t have access to hyper-efficient players or understand how these UGC companies ensure viewer QoE.
In a perfect world, I would have tested SVT-AV1 and AV1 on multiple mobile devices using a similar player and reported precise CPU usage and battery consumption tests. In this world, the lack of a player capable of Android/iOS playback of VVC/AV1, not to mention having only a few devices available for testing, prevented this.
So, I tested on older Windows and Mac computers and found that VVC required at least 2x the CPU of AV1. Figure 4 shows playback using the open source MPV Windows player (go2sm.com/mpv) on a circa 2012 HP Compaq Pro 6300 with a 3.4 GHz Intel i7-3770 CPU and 16GB of RAM.

Figure 4. CPU utilization of VVC, AV1, and HEVC in software on an HP Compaq Pro 6300 with a 3.4 GHz Intel i7-3770 CPU
Figure 5 shows CPU utilization playing VVC, AV1, and HEVC with VLC Player on a mid-2015 MacBook Pro powered by a 2.2 GHz Quad-Core Intel Core i7 with 16GB of RAM. Each core has two threads, as shown in Figure 5. While VVC took a big chunk of power from all threads, HEVC and AV1 were relatively quiet on the even-numbered threads. Again, VVC is clearly at least twice as CPU-hungry as AV1 and orders of magnitude greater than HEVC. Note that neither machine has a hardware-based HEVC decoder.

Figure 5. CPU utilization playing VVC, AV1, and HEVC with VLC Player on a mid-2015 MacBook Pro powered by a 2.2 GHz Quad-Core Intel Core i7 with 16GB of RAM
While player efficiency is undoubtedly player-specific, until proven otherwise, this sets the expectation that VVC will require roughly double
the CPU resources for decoding as AV1. This means acceptable QoE on fewer devices and a lower return on your VVC encoding dollars.
On a final note, I did test SVT-AV1’s fast decode option (Figure 6) but found it to have a negligible impact on either quality or playback efficiency. I did not use this switch in my quality benchmark encodes.

Figure 6. SVT-AV1 normal versus fast decode