NAB 2019: Brightcove Talks Cost Savings and QoE Improvements From Context-Aware Encoding
Jan Ozer: Jan Ozer here from the NAB, I'd say show floor, but we're in a booth, in a Brightcove booth. I'm with Yuriy Reznik, who is a Technology Fellow for Brightcove, hey Yuriy.
Yuriy Reznik: My pleasure being here. Thanks for stopping by.
Jan Ozer: My pleasure. So one of the most interesting things that I'm seeing here at this NAB is the transition from content-aware encoding, where we encoded each file differently because of the underlying complexity, to more of a holistic look at not only incorporating the content, but incorporating the effective throughput to the viewers, what type of device they're viewing it on, and coming up with an encoding ladder that really makes sense for the entire infrastructure, rather than just the content. There are a couple of companies doing that here, and Brightcove has been the most aggressive, in my mind, in both publishing about it and disclosing what they're doing. Yuriy has published some brilliant papers, and he had a discussion here at NAB, and he's going to tell us what the multi-format delivery system architecture is all about, if that's the product name, and he's also going to tell us about the optimal ladder, kind of a smaller problem but one that's dear to me: the optimal ladder for hybrid streams, either DASH or HLS, that include both H.264 and HEVC. So we're going to start with the entire picture and then we'll move to the second, is that okay?
Yuriy Reznik: Yes. So what you can see here is a sketch of the architecture of the end-to-end delivery system that we have. At Brightcove we call it Video Cloud, and it was one of the very first online video platforms, back when that category of products was invented. It has evolved and been developed to include many additional technologies to live in the reality of a multi-format, multi-product world. One of these technologies is dynamic transmuxing, or what we call the dynamic delivery system, where we don't actually encode video to different DASH or HLS streams. We encode to an intermediate representation and then transmux dynamically. This saves dramatically on the amount of data that we need to store internally, and we've figured out how to couple it with CDNs such that the number of transmux operations that we have to execute is also minimized, and such that the whole system works very smoothly and reliably at a grand scale.
But what makes things truly unique, and this is something we started doing a few years ago, and I'm actually delighted to know that many other companies are now following the same path, is that we started incorporating end-to-end optimizations, where we collect data from analytics to see what the bandwidth distributions are when sending videos to different categories of devices, what those devices are, and what the percentages of those devices are, specific to each operator or to each geography of viewers. Once you know all these specifics of the operator's context, of course you can optimize the ladder to make delivery best for that particular operator. If you stream to somebody in India over very slow mobile networks, of course you would want to pick bitrates that are more densely centered around the low bitrates that will actually be used by this operator, as opposed to someone in Japan streaming to TV sets, where 1080p is the only thing you really need to worry about. So that's what we call context-aware encoding.
We did it; the project actually started in 2016, when we began working on this, and we shipped it in, I think, mid-2017, so it's been in the market for quite a while now. And of course, again, I'm delighted that the trend, the word, is reaching the masses and we see other guys following. But if you think about it, it means building a whole system. It's not just profile generation; it's taking into account the entire end-to-end delivery process, and doing it specific to your delivery architecture, to the customer and operator, and to the coupling with CDNs as they are set up, because all those details affect the statistics. All the characteristics of CDNs affect the statistics: how you connect them, geography, everything. So the system we have is highly flexible. It allows you to specify all these parameters, and it can optimize to every nuance of the system as you actually have it in practice. So that's the end-to-end aspect of it.
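Editor's note: the idea that a ladder should follow the operator's bandwidth distribution can be pictured with a small Python sketch. This is not Brightcove's optimizer. The function names, the ladders, and the bandwidth distribution below are all hypothetical, and the player model (always pick the highest rendition that fits the measured bandwidth) is a deliberate simplification.

```python
from bisect import bisect_right


def delivered_bitrate(ladder_kbps, bandwidth_kbps):
    """Highest rendition that fits the bandwidth (assumed player model).

    If even the lowest rendition exceeds the bandwidth, the lowest one is
    returned anyway; a real player would stall or rebuffer in that case.
    """
    ladder = sorted(ladder_kbps)
    index = bisect_right(ladder, bandwidth_kbps)
    return ladder[max(index - 1, 0)]


def mean_delivered(ladder_kbps, bandwidth_pmf):
    """Expected delivered bitrate over a discrete distribution {kbps: probability}."""
    return sum(
        share * delivered_bitrate(ladder_kbps, bandwidth)
        for bandwidth, share in bandwidth_pmf.items()
    )


# Hypothetical operator on slow mobile networks: bandwidth (kbps) -> share of sessions.
slow_mobile = {300: 0.3, 600: 0.4, 1200: 0.2, 3000: 0.1}

# Hypothetical ladders (kbps): a generic one, and one packed around low bitrates.
generic_ladder = [400, 1500, 4000, 8000]
dense_low_ladder = [250, 500, 1000, 2500]

if __name__ == "__main__":
    print("generic:", mean_delivered(generic_ladder, slow_mobile))
    print("dense-low:", mean_delivered(dense_low_ladder, slow_mobile))
```

With these made-up numbers the densely packed ladder delivers roughly 725 kbps on average against roughly 510 kbps for the generic one, because most of the generic ladder's rungs sit above what these viewers can pull.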
Jan Ozer: So let me ask a question. So, you're basically saying that if you were delivering to a high-bandwidth country you would change your encoding ladder to have a disproportionately high share of high-resolution, high-data-rate streams whereas if you were, as you said, if you were encoding for delivery in India, you would focus on the lower end and put more of your emphasis there. So and that's not a real-time thing, I mean, that's after N videos, you have enough experience to start to fine-tune your encoding settings to match your delivery experience.
Yuriy Reznik: Absolutely. You need to measure the statistics somehow, and if this is an operator that's already been working in the system with some profiles, you just learn all the characteristics from analytics as the data comes in. You can measure the bandwidth using logs from CDNs and also from client feedback. Sometimes you need to correlate those data, because client-level analytics and server-side analytics don't always match, but that's really where the loop gets closed.
Jan Ozer: So how would you measure overall success or failure of the system? Are you seeing, is it a higher, I'm sure it's a higher QoE, but how would you measure that?
Yuriy Reznik: It is higher efficiency at the end of the day. It might translate into better quality of experience if you're network-bound, or it could translate into lower bandwidth costs and lower CDN costs if your network is fantastic, but we just deliver the highest encoding efficiency possible. I actually have one slide to demonstrate it. Let me just scroll for you. It's from a talk I was giving. So there is some detail here that's probably not needed, but this one is actually good. Hold on. Just one more. So this is an example where we have three types of operators. One has very limited networks: about 3 megabits is the average bitrate across different categories of devices, and the most common category in this case is mobile. At the opposite end is an operator who is sending predominantly to TV sets, and his average bandwidth is 35 megabits, a factor of 10. When we do the encoding we compare to the Apple-recommended profile, which has 9 rungs, or 9 renditions, and you can see that for the operator with the best network we generated only five renditions. The reason is that the highest rendition is actually the only one that's really needed, but to stay within the guidelines, with 100% increments between adjacent rates and a ladder that starts with something very low, we need to insert a few more renditions, and the total number becomes just 5. So you're of course still wasting some of the renditions, but you're wasting much less compared with the entire 9-rendition encoding ladder in the Apple default guidelines. And if you look at the profile for operator one, which was bandwidth limited, there are more renditions, but they are more densely centered around lower bitrates like 125 kilobits, 223, 398, and the resolutions are smaller there to enable such low bitrates. So these are completely different profiles. And then if you look at the savings that we achieve, these are relative savings, so a minus sign means a reduction: the number of renditions goes down by 22% to 44%.
The storage of course goes down even more for operator three, because there are fewer renditions, but interestingly bandwidth also goes down across the board, though not uniformly. For operator three it's over 30% that we can save, because that network is perfect. Only the highest rendition is being pulled all the time, so what you save on that rendition is what matters most. But for operator one, which pulls a mix of those renditions in the middle, the bandwidth saving looks smaller, about 8%, but then if we look at average resolution, it's up by 27%. So we actually give much better quality in the context of this operator. And if you look at other characteristics like buffering ratios and startup times, they also improve across the board. So it's both quality of experience and savings in cost of delivery that we can bring in this context.
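Editor's note: the "insert a few renditions between a low starting rate and the top rate" step described above can be sketched as a geometric ladder in Python. This is only an illustration under stated assumptions: the function name, the example bitrates, and the choice of evenly spaced ratios are hypothetical, and the real profiles discussed in the interview come from an optimizer that also uses content and network statistics.

```python
import math


def geometric_ladder(lowest_kbps, highest_kbps, max_step_ratio=2.0):
    """Fewest bitrates from lowest to highest whose adjacent ratio stays
    at or below max_step_ratio (2.0 corresponds to 100% increments)."""
    if not 0 < lowest_kbps <= highest_kbps:
        raise ValueError("need 0 < lowest_kbps <= highest_kbps")
    if max_step_ratio <= 1.0:
        raise ValueError("max_step_ratio must be greater than 1")

    span = highest_kbps / lowest_kbps
    # Number of steps needed; the small epsilon guards against float noise
    # when the span is an exact power of the ratio.
    steps = math.ceil(math.log(span) / math.log(max_step_ratio) - 1e-9)
    if steps == 0:
        return [lowest_kbps]

    ratio = span ** (1.0 / steps)  # actual step, evenly spread across the span
    ladder = [round(lowest_kbps * ratio ** i) for i in range(steps)]
    ladder.append(highest_kbps)    # pin the top rung exactly
    return ladder


if __name__ == "__main__":
    # Hypothetical top rate of 8000 kbps, starting at 500 kbps, doubling each step.
    print(geometric_ladder(500, 8000))
    # A denser ladder for a slower network (hypothetical range and ratio).
    print(geometric_ladder(125, 4000, max_step_ratio=1.8))
```

With a doubling step, the hypothetical 500 to 8000 kbps range needs five rungs; a smaller step ratio over a lower range yields more rungs packed closer together, which is the shape described for the bandwidth-limited operator.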
Jan Ozer: Are these theoretical numbers or are these real numbers?
Yuriy Reznik: These are real numbers, measured for three different operators.
Jan Ozer: Okay, impressive. Okay, so let's go to the mixed encoding ladder, if you don't mind. So again, the problem that we're trying to solve here is: we know that HEVC is playable by a broad range of both Android and iOS devices, so how do we integrate HEVC into an encoding ladder so we can present it using both ABR formats? So that's a problem that you've considered. Have you rolled this out, or is it just something that's kind of in a nascent stage?
Yuriy Reznik: It is in trials; it is in the state of going from research to production, but I can show demos.
Jan Ozer: Okay.
Yuriy Reznik: But the fundamental problem here is that, yes, we have two codecs, but there are not two categories of devices; there are actually three categories of devices that can receive them.
Jan Ozer: Right.
Yuriy Reznik: There is a category of devices that can only receive the legacy H.264 stream. Then there is a category of devices that can receive either the H.264 ladder or the HEVC ladder, but cannot switch between them. That includes the majority of DASH devices, even though in DASH there is a way to enable switching across adaptation sets by using a so-called supplemental property; only a few players realistically support it as of now. And then there is a third category of devices that can play both codecs and can switch between them seamlessly. That's most new Apple devices. And once you realize that these three categories of devices exist, it turns out that you can generate a ladder that is optimal when you stream to a mix of these three categories, such that you don't need to use twice as many streams.
Jan Ozer: Which is what Apple recommends.
Yuriy Reznik: Yes, and the reason you don't need twice as many streams is that you can reuse the H.264 streams if you put them in intermediate positions between the HEVC streams. You can see that by intelligently picking those bitrates you can actually get a better approximation of the rate-distortion curve, going from the worst quality to the highest quality and picking different codecs along the way. And this gives you finer-grained adaptation for those devices that can switch seamlessly between the codecs.
Jan Ozer: I mean, HEVC is a more efficient codec, so shouldn't that get screwed up at times?
Yuriy Reznik: Well, if it were more efficient by something like 75%, that probably would prevent this from happening, but if it's only better by 10 or 15 or 25%, which is what we see for a lot of content, then you actually can switch effectively between the two.
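Editor's note: the three device categories and the reuse of H.264 streams as intermediate rungs can be pictured with a short Python sketch. Everything in it is hypothetical: the `Rendition` type, the category names, the bitrates, and the quality scores are made up for illustration and are not taken from the published ladder. The sketch also assumes that single-codec devices are given the HEVC ladder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    codec: str      # "h264" or "hevc"
    kbps: int       # encoded bitrate
    quality: float  # hypothetical quality score, higher is better


def ladder_for(renditions, category):
    """Renditions a device category can use, ordered by bitrate."""
    if category == "h264_only":
        usable = [r for r in renditions if r.codec == "h264"]
    elif category == "single_codec":
        # Assumption: these devices are served the HEVC ladder and cannot
        # cross over to the H.264 renditions mid-stream.
        usable = [r for r in renditions if r.codec == "hevc"]
    elif category == "seamless_switch":
        usable = list(renditions)
    else:
        raise ValueError(f"unknown category: {category}")
    return sorted(usable, key=lambda r: r.kbps)


def quality_increases(ladder):
    """True if every step up in bitrate is also a step up in quality."""
    return all(a.quality < b.quality for a, b in zip(ladder, ladder[1:]))


# Hypothetical mixed ladder: H.264 rungs sit between the HEVC rungs.
MIXED = [
    Rendition("h264", 300, 55.0),
    Rendition("hevc", 500, 66.0),
    Rendition("h264", 900, 72.0),
    Rendition("hevc", 1400, 80.0),
    Rendition("h264", 2500, 85.0),
    Rendition("hevc", 4000, 92.0),
]

if __name__ == "__main__":
    for category in ("h264_only", "single_codec", "seamless_switch"):
        ladder = ladder_for(MIXED, category)
        print(category, [(r.codec, r.kbps) for r in ladder])
```

In this toy setup six encoded streams serve all three categories, and the devices that can switch codecs see six rungs instead of three. The interleaving only works while each step up in bitrate is also a step up in quality, which is the condition `quality_increases` checks and the point of the answer above about the efficiency gap between the codecs.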
Jan Ozer: Do you have the actual encoding ladder in a table in the next slide?
Yuriy Reznik: This is just an illustration of rate allocation for this particular example. I don't have another slide to show.
Jan Ozer: Oh, so you don't have another ladder for, that's what I wondered. Okay, cut the interview, we don't have this, I'm teasing. Can you show us those rates or get us --
Yuriy Reznik: Yes, I could --
Jan Ozer: I mean, can you present, can I somehow get a ladder from there?
Yuriy Reznik: Yes, of course, this is actually published. There is a paper with all these examples --
Jan Ozer: So I should be able to figure it out?
Yuriy Reznik: Of course, but even from the numbers that we can see here, I would say that this rate, for example, and this one is encoded in H.264, is about 300 kilobits, and the next --
Jan Ozer: Okay. I'll send the table to you and you say whether it's totally off base or not. Okay, so it was probably two-and-a-half years ago, the end of 2015, when Netflix came out with per-title encoding and the encoding ladder was dead, and now it looks like per-title encoding is on the way out in favor of a more holistic approach that incorporates all the elements that Yuriy talked about during his talk. So listen, I really appreciate it. He's got some lovely white papers out there, but they're challenging for non-math majors like me, and I appreciate you taking the time to explain it to us.
Yuriy Reznik: It's my pleasure. Thank you so much.
Jan Ozer: Okay, signing off from NAB. I'm Jan Ozer, thank you.