NAB 2019: Twitch Talks VP9, AV1 and its Five-Year Encoding Roadmap
Jan Ozer: Hi, Jan Ozer here. Here this morning with Yueshi Shen, who's a research engineer, video engineer with Twitch. And we're going to talk about what Twitch does from a codec perspective, which is a changing dynamic. Hey, Yueshi, thanks for meeting me this morning.
Yueshi Shen: Thank you, Jan. My name is Yueshi, I work for Twitch. So if you don't know Twitch, Twitch is a live-streaming platform. Basically, our content is more on esports and gaming. According to twitchtracker.com, the public information, at our peak, we have 140,000 concurrent channels, and our peak viewership is four million. So talking about codec strategy, our content is more interactive, low-latency, and because we have a lot of channels, basically we have two categories. For the head content, we are releasing VP9 next month, and in the future, we are thinking x264, VP9, and AV1, when the ecosystem becomes mature. For head content, it's okay to stream multiple formats because the viewership is huge, so streaming multiple formats, although it increases our cost, still actually saves our traffic cost, so it's still worthwhile. But for the tail content, it's very different. We can only afford to stream one single format, so our strategy is currently still doing H.264 using a high-density hardware solution, but we're hoping that towards 2024, 2025, the AV1 ecosystem is ready, and we want to switch to AV1 100%.
Jan Ozer: Did you say 2024 and 2025?
Yueshi Shen: 2024, this is our projection right now. But on the other hand, as I said, our AV1 release for the head content will be a lot sooner; we are hoping that in 2022-2023 we are going to release AV1 for the head content. But for the head content, we will continue to stream dual-format, AV1 and H.264. For the tail content, we are hoping that five years from now the whole AV1 ecosystem is there and every five-year-old device supports AV1. Then, we will be switching to AV1 100%.
Jan Ozer: So you are primarily a live platform and when you say head content, you mean the primary content that gets the most views?
Yueshi Shen: That's right. So for example, esports content and also head broadcasters.
Jan Ozer: Okay, so you've recently been considering switching to VP9, is that hardware, software, are you looking at both, what's the decision there?
Yueshi Shen: That's a very good question. For VP9, I mean, as I said, we are looking at compression gain, so for VP9 right now we are using FPGA. We have evaluated software, but we haven't got data showing that the software can deliver the same compression gain. And our bar uses x264 medium as the reference: at least 25% less bitrate than x264 medium. We also have a roadmap to 35%, but this is our bar.
Jan Ozer: Okay, so you are saying that the VP9 that was generated with the FPGA encoder was the same quality at a 25% lower data rate?
Yueshi Shen: That's right. That's right, yeah.
Jan Ozer: Okay, and that's a live transmission?
Yueshi Shen: That's a live transmission, yeah.
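[Editor's note: the 25% bar and the 35% roadmap figure described above can be turned into target bitrates with simple arithmetic. The sketch below is illustrative only; the 6,000 kbps x264 reference bitrate and the function names are assumptions, not figures or tooling from Twitch.]

```python
def target_kbps(reference_kbps: float, savings_pct: int) -> float:
    """Bitrate a new encoder must reach, at equal quality, to save savings_pct percent."""
    return reference_kbps * (100 - savings_pct) / 100


def meets_bar(candidate_kbps: float, reference_kbps: float, savings_pct: int = 25) -> bool:
    """True if the candidate bitrate is at or below the target for the given savings."""
    return candidate_kbps <= target_kbps(reference_kbps, savings_pct)


# Hypothetical x264 reference bitrate for one stream (assumption, not from the interview).
X264_REFERENCE_KBPS = 6000.0

bar_kbps = target_kbps(X264_REFERENCE_KBPS, 25)      # the stated bar
roadmap_kbps = target_kbps(X264_REFERENCE_KBPS, 35)  # the stated roadmap goal

print(f"25% bar: {bar_kbps:.0f} kbps, 35% roadmap: {roadmap_kbps:.0f} kbps")
```

[The comparison only makes sense when both encodes are judged to be the same quality, which is how the bar is framed in the answer above.]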
Jan Ozer: So one of the fun conversations you and I had a few months back was about how, because you're such a big platform, integrating VBR encoding, as opposed to CBR encoding, caused this massive wave. Can you go into the decision you make about VBR versus CBR as a live platform?
Yueshi Shen: That's about right, yeah. So one thing maybe people don't know is that Twitch actually maintains a private CDN; we have our own replication, our backbone, and edge servers. We signed peering contracts with ISPs. From an operations standpoint, we don't like VBR, and this has pretty much to do with the fact that if we book a pipe, basically a certain bandwidth, with the ISPs and the video is VBR, it's very difficult for us to control the quality of service, because our video-mapping system doesn't know how many viewers we can put into this pipe if the bitrate is changing. So this is very different from VOD; VOD is different people at the same time watching different content, but for live, it's different people, at the same time, watching the same content, so any VBR is actually going to confuse our video-mapping system. We don't know how many users we should put into a pipe.
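[Editor's note: a minimal sketch of the capacity problem described above. All numbers (a 10 Gbps pipe, a 6 Mbps CBR stream, a VBR stream averaging 4.5 Mbps with 9 Mbps peaks) are hypothetical, and the function is not Twitch's video-mapping system. The point is that in live streaming every viewer of a channel hits the same bitrate peak at the same moment, so peaks do not average out across viewers the way they tend to in VOD.]

```python
def viewers_per_pipe(pipe_mbps: float, stream_mbps: float) -> int:
    """Number of viewers that fit in a pipe if each one pulls stream_mbps."""
    return int(pipe_mbps // stream_mbps)


# Hypothetical figures, chosen only to illustrate the reasoning.
PIPE_MBPS = 10_000.0   # booked bandwidth to one ISP
CBR_MBPS = 6.0         # constant bitrate: average and peak are the same
VBR_AVG_MBPS = 4.5     # VBR average bitrate
VBR_PEAK_MBPS = 9.0    # VBR peak bitrate during a complex scene

# With CBR the answer is a single, stable number.
cbr_viewers = viewers_per_pipe(PIPE_MBPS, CBR_MBPS)

# With VBR the answer depends on which bitrate the mapping system plans for.
vbr_viewers_by_avg = viewers_per_pipe(PIPE_MBPS, VBR_AVG_MBPS)
vbr_viewers_by_peak = viewers_per_pipe(PIPE_MBPS, VBR_PEAK_MBPS)

# Planning on the average overloads the pipe when all live viewers peak together.
demand_at_peak_mbps = vbr_viewers_by_avg * VBR_PEAK_MBPS

print(f"CBR viewers: {cbr_viewers}")
print(f"VBR viewers (planned on average): {vbr_viewers_by_avg}")
print(f"VBR viewers (planned on peak): {vbr_viewers_by_peak}")
print(f"Demand at peak if planned on average: {demand_at_peak_mbps:.0f} Mbps")
```

[Under these assumptions, planning on the VBR average oversubscribes the pipe during peaks, while planning on the peak leaves capacity idle most of the time.]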
Jan Ozer: And for the most part, I'm sure this is clear to most viewers, but you're streaming, you're getting one stream in from a remote gamer, and then you're transcoding that into multiple streams. What does your typical ladder look like?
Yueshi Shen: Yeah, so at the moment, we are ingesting 1080p60 from our broadcasters.
Jan Ozer: What bitrate?
Yueshi Shen: So it varies depending on the broadcaster's upload bandwidth. Typically between 6 and 8.5 Mbps. Then we will transcode into different bitrates, 720p60 at 3 Mbps, then 720p30 at 2 Mbps, then 480p, all the way down to 160p at about 200 kbps.
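[Editor's note: the ladder described above can be written down as a small data structure. This is a sketch only; the `Rung` type and `pick_rung` helper are hypothetical, the source rung uses the low end of the quoted ingest range as an assumption, and the 480p bitrate is left unset because it is not given in the interview.]

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rung:
    name: str
    kbps: Optional[int]  # None when the interview does not give a bitrate


LADDER = [
    Rung("1080p60 (source)", 6000),  # assumption: low end of the 6-8.5 Mbps ingest range
    Rung("720p60", 3000),
    Rung("720p30", 2000),
    Rung("480p", None),              # bitrate not stated in the interview
    Rung("160p", 200),               # "about 200 kbps"
]


def pick_rung(bandwidth_kbps: int) -> Rung:
    """Pick the highest known-bitrate rung that fits the available bandwidth.

    Falls back to the lowest rung when nothing fits. Rungs without a stated
    bitrate are skipped rather than guessed.
    """
    candidates = [r for r in LADDER if r.kbps is not None and r.kbps <= bandwidth_kbps]
    if not candidates:
        return LADDER[-1]
    return max(candidates, key=lambda r: r.kbps)


print(pick_rung(2500).name)
print(pick_rung(10000).name)
```

[In practice rung selection is done by the player's adaptive bitrate logic, which typically also accounts for buffer level and throughput variation; the helper above only shows the basic fit-to-bandwidth idea.]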
Jan Ozer: Okay, what's your view of objective quality metrics? Which metrics do you use and which do you trust?
Yueshi Shen: Yeah, actually that's a very good research topic. Actually there's some active research going on right now. Right now, we are doing a combination of PSNR, SSIM, and VMAF, but what we trust most at this moment for Twitch is my eyes and my colleagues' eyes. So PSNR gives us some reference. It can pick up some obvious encoding errors, but still we rely more than 50% on our golden eyes.
Jan Ozer: Okay, I had a conversation with a compressionist from a big OTT house last night and he was talking about using different encoding parameters depending on the rung in the ladder, maybe more noise reduction in lower rungs, and maybe sharpness. Are you doing any of that, or are you looking into any of that?
Yueshi Shen: Right now, we are not doing image pre-processing. That is something we, right now, don't have resources to do, but that's a very interesting direction.
Jan Ozer: For the people that I write for, which is typically not your level, and certainly not the level of the guy I was speaking with last night, I stop at the preset, and here's what each preset does, it's got a combination of encoding parameters. But looking at the individual parameters, he just really opened up an interesting thought, because with the lower-quality streams you're going to have a lot more scaling, you're going to have a lot more. So what do you do to improve the quality of that stream that you wouldn't think about for the 1080p pass-through, or even the 720p stream? And I have no idea. I just wondered if you have looked at that.
Yueshi Shen: We haven't looked at that, and I probably need to evaluate the ROI, the return on investment, to quantify whether this is really helping us. Okay, so another thing is most of our viewers are watching 1080p60.
Jan Ozer: That's right.
Yueshi Shen: I don't have data to tell you how to quantify the return on investment.
Jan Ozer: So when you say most of your viewers, can you tell us the percentage, is that 95% or 62%?
Yueshi Shen: I don't have the exact number off the top of my head, but it is definitely more than 50%, and it depends on the region. Like the USA, it has pretty good internet.
Jan Ozer: What are your big regions other than USA, and what do the numbers look like there?
Yueshi Shen: So in Asia-Pacific, actually pretty good, like Singapore and Korea are very good. Then there are certain regions, Latin America, some of Eastern Europe, but nevertheless, still, across the board, more than 50% of viewers are watching 1080p60, and this number is a lot higher in the USA and Western Europe.
Jan Ozer: Yeah, that makes it tough to invest that much research into the lower-quality streams.
Yueshi Shen: Yeah, we actually did some work on the lowest quality. So before our 160p was about 500 kbps, we actually did some work. It's more on the audio side, we transcode audio to a lower bitrate.
Jan Ozer: Do you have any magic numbers for PSNR or SSIM or VMAF?
Yueshi Shen: It really depends on the content. Sorry, I don't have a number at hand right now.
Jan Ozer: All right, let's close it there, we have an appointment to get to. Yueshi, thanks for sharing all that information, and have a great show.
Yueshi Shen: All right, okay, thank you very much.