How YouTube Encodes Videos
I recently wrote three blog posts about the codecs YouTube uses to encode videos uploaded to the site. Looking back, I realize I was flying blind for the first two posts. The third post, based on data gathered with an application called youtube-dl, provides the most useful hard data, including the exact encoding ladders Google uses for 4K videos with several million views.
But I’m getting ahead of myself. I should start by saying that I’m fascinated by how YouTube encodes its videos. It has the highest volume of incoming videos, a brilliant staff, a more than sufficient budget, and a clear business need to craft the optimum solution. One day this past July, I was creating a lesson for a new course and focused on the economics of codec deployment. Specifically, with AV1 taking about 18 times longer than H.264 to encode, I was wondering how many views it took to justify the cost of encoding with AV1. Where could I go for real-world evidence? YouTube, of course.
So, I started downloading 1080p videos and checking the codec in YouTube’s Stats for Nerds feature. As documented in my first blog post, I found that YouTube encoded videos with fewer than 3,000 views with H.264, started to deploy VP9 once videos accumulated 3,500-plus views, and encoded with AV1 primarily on videos with more than 34 million views (or longtail videos that would likely accumulate that many).
The two takeaways were that VP9 started to make economic sense with only a few thousand views, but AV1 required millions. Of course, YouTube recently launched Argos, a video transcoding chipset that accelerates VP9 encoding (but not AV1), so just because it makes sense for YouTube doesn’t mean that it makes sense for you.
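The break-even reasoning behind those takeaways can be sketched as simple arithmetic: a more expensive encode pays off once the bandwidth it saves across all views exceeds the extra encoding cost. The function below is a minimal sketch under that assumption. The function name, the cost figures, the savings fraction, and the per-view data volume are all hypothetical placeholders, not YouTube's numbers. The only value taken from this column is the roughly 18x encoding-time ratio between AV1 and H.264, and treating cost as proportional to encoding time is itself an assumption.

```python
def breakeven_views(extra_encode_cost, bitrate_savings, gb_per_view, cost_per_gb):
    """Return the view count at which bandwidth savings repay the extra encode cost.

    extra_encode_cost: added cost of the newer-codec encode over H.264 (currency units)
    bitrate_savings:   fraction of bandwidth saved per view (0.0 to 1.0)
    gb_per_view:       average data delivered per view with H.264, in GB
    cost_per_gb:       delivery cost per GB (currency units)
    """
    saved_per_view = bitrate_savings * gb_per_view * cost_per_gb
    if saved_per_view <= 0:
        raise ValueError("the newer codec must save a positive amount per view")
    return extra_encode_cost / saved_per_view


# Hypothetical inputs, for illustration only.
h264_encode_cost = 1.00                    # assumed cost of one H.264 encode
av1_encode_cost = 18 * h264_encode_cost    # assumes cost scales with the ~18x encode time
views = breakeven_views(
    extra_encode_cost=av1_encode_cost - h264_encode_cost,
    bitrate_savings=0.30,   # placeholder savings fraction
    gb_per_view=0.10,       # placeholder data delivered per view
    cost_per_gb=0.02,       # placeholder delivery cost
)
print(f"Break-even after roughly {views:,.0f} views")
```

In this model, anything that lowers the extra encoding cost, such as hardware acceleration, lowers the break-even view count proportionally. That is consistent with VP9 appearing after only a few thousand views, though the actual thresholds depend on costs that only YouTube knows.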
The origin of the second YouTube-related post was a question from a reader who asked whether YouTube created an H.264 version of all encodes so that viewers without VP9 or AV1 playback capabilities would always have videos to play. The answer was yes, but only up to 1080p.
I wondered whether YouTube encoded 2K/4K versions in AV1. Unlike with the 1080p videos, I didn’t see any 2K/4K files encoded with AV1, and I guessed that YouTube didn’t deploy AV1 for UHD videos because of the encoding cost. However, in a comment on this post, a LinkedIn reader noted that he downloaded a file list for a very popular 4K music video with youtube-dl and saw complete encoding ladders for AV1 and VP9, including 2K/4K, and that H.264 stopped at 1080p. In addition, a LinkedIn colleague provided a link to download youtube-dl and documentation.
Nothing motivates a researcher more than having a public guess proven wrong, so it was down the rabbit hole again. Data produced by youtube-dl led to the third post. I downloaded file lists for ten 4K music videos with multimillion view counts, as well as a file list for the Top Gun 2 trailer, then at about 18 million views. As previously mentioned, the file lists detailed all audio/video files created by YouTube, including resolution, data rate, and codec: a veritable treasure trove of encoding data. I also downloaded several videos from the file lists to spot-check the data.
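For readers who want to repeat this kind of analysis, youtube-dl prints a file list with `youtube-dl -F <URL>`, and its Python module exposes similar information as a list of format dictionaries. The sketch below groups such a list into one encoding ladder per codec. It is not the process used for the posts; the helper names are hypothetical, and the sample entries are invented values that only mimic the shape of youtube-dl's `vcodec`, `height`, and `tbr` (total bitrate, in kbps) fields.

```python
from collections import defaultdict


def codec_family(vcodec):
    """Map a codec string such as 'avc1.640028' to a readable codec family."""
    if vcodec.startswith("avc1"):
        return "H.264"
    if vcodec.startswith(("vp9", "vp09")):
        return "VP9"
    if vcodec.startswith("av01"):
        return "AV1"
    return "other"


def ladders_by_codec(formats):
    """Group video formats into per-codec ladders of (height, bitrate) rungs."""
    ladders = defaultdict(list)
    for fmt in formats:
        vcodec = fmt.get("vcodec")
        if not vcodec or vcodec == "none":   # skip audio-only entries
            continue
        ladders[codec_family(vcodec)].append((fmt.get("height"), fmt.get("tbr")))
    # Sort each ladder from the lowest rung to the highest.
    return {
        codec: sorted(rungs, key=lambda r: (r[0] or 0, r[1] or 0))
        for codec, rungs in ladders.items()
    }


# Invented sample data. A real list could come from, for example,
# youtube_dl.YoutubeDL().extract_info(url, download=False)["formats"].
sample_formats = [
    {"vcodec": "none", "acodec": "opus", "tbr": 130},
    {"vcodec": "avc1.640028", "height": 1080, "tbr": 4000},
    {"vcodec": "avc1.4d401f", "height": 720, "tbr": 2000},
    {"vcodec": "vp9", "height": 2160, "tbr": 15000},
    {"vcodec": "vp9", "height": 1080, "tbr": 2700},
    {"vcodec": "av01.0.12M.08", "height": 2160, "tbr": 11000},
    {"vcodec": "av01.0.08M.08", "height": 1080, "tbr": 2100},
]

for codec, rungs in ladders_by_codec(sample_formats).items():
    top_height = rungs[-1][0]
    print(f"{codec}: {len(rungs)} rungs, top rung {top_height}p")
```

Grouping by codec family first makes it easy to see at a glance where each ladder stops, which is the comparison that matters here.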
Some key findings were that, as my LinkedIn colleague suggested, YouTube creates full AV1/VP9 ladders for very high-volume 4K videos, including 2K/4K, but only encodes to 1080p with H.264. Also, at 1080p, AV1 delivers 47% savings over H.264 and 21% savings over VP9. At 4K, AV1 delivers 28% savings over VP9.
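Savings figures like these are ratios of data rates, so they can be combined with a little arithmetic. As a rough sketch, the code below defines savings as one minus the ratio of the new bitrate to the reference bitrate, then derives what the two 1080p figures would imply about VP9 versus H.264. That derived number is not a finding from the posts: it assumes both percentages were measured on the same videos at comparable quality, which may not hold. The function names are hypothetical.

```python
def savings(new_bitrate, reference_bitrate):
    """Fraction of bandwidth saved by the new encode relative to the reference."""
    return 1.0 - new_bitrate / reference_bitrate


def implied_savings(savings_vs_a, savings_vs_b):
    """Given one codec's savings over A and over B, return B's implied savings over A.

    If the codec needs (1 - savings_vs_a) of A's bitrate and (1 - savings_vs_b)
    of B's bitrate, then B needs (1 - savings_vs_a) / (1 - savings_vs_b) of A's.
    """
    return 1.0 - (1.0 - savings_vs_a) / (1.0 - savings_vs_b)


# 1080p figures from the column: AV1 saves 47% over H.264 and 21% over VP9.
vp9_over_h264 = implied_savings(0.47, 0.21)
print(f"Implied VP9 savings over H.264 at 1080p: {vp9_over_h264:.1%}")
```

Under those assumptions, VP9 would come out at about 33% savings over H.264 at 1080p, which suggests most of AV1's advantage over H.264 is already available with VP9.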
I found these bandwidth savings particularly significant. Back in June 2019, I wrote a blog post called "Hey AOM: Where’s the Beef?" This was a response to yet another member of the Alliance for Open Media loudly proclaiming it was deploying AV1 to "embrace the open spirit," without mentioning details about how much bandwidth savings AV1 actually delivered. Open spirit sounds great, but codecs don’t achieve widespread adoption until they demonstrate significant savings. I’m pretty sure that even the cantankerous Clara Peller would find a 47% savings over H.264 at 1080p and a 28% savings over VP9 at 4K quite meaty.