Encoding at Scale for Live Video Streaming
Transcoding Today: Current Approaches and Alternatives
How do you deliver transcoding in the network today? Most of the transcoding and encoding done today uses software running on a CPU. The biggest benefit of doing it in software is flexibility (Figure 5, below). When you're not encoding or transcoding, you can use the same servers for your billing or your data analytics. But in reality, this approach gives you the lowest density and capacity. The resource requirements are huge, and there are real economic limits to the scalability of this type of architecture.
Figure 5. Technology Alternatives for Real-Time Video Encode/Transcode
Typically, in a one-rack-unit (1RU) server doing 1080p H.265 encoding, you might get five streams per server. So how are you going to scale that architecture to the thousands and thousands of encodes needed to meet the demand for user-generated content? Very quickly, especially for live streaming, you realize that you're going to need some sort of hardware assist. Intel offers Quick Sync Video and the media software kits that go with it. That provides some level of improvement, but the quality sometimes starts to deteriorate.
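To put the scaling problem in numbers, here is a back-of-the-envelope sketch using the five-streams-per-server figure above; the 10,000-stream load is an illustrative assumption, not a figure from the article:

```python
import math

def servers_needed(streams: int, streams_per_server: int = 5) -> int:
    """1RU servers required for software-only 1080p H.265 encoding."""
    return math.ceil(streams / streams_per_server)

# A modest user-generated-content load of 10,000 concurrent live streams
# would need 2,000 rack units of CPU-only encoders:
print(servers_needed(10_000))
```

At that rate, every new batch of user-generated streams translates directly into racks of servers, which is exactly the economic limit described above.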
A more common approach is adding GPUs. Again, they can improve your density and capacity, but GPUs are also notorious power hogs in the data center, so there's a limit to scalability there as well.
The most promising of these categories would be FPGAs, and a number of vendors are showing encoding and transcoding solutions on FPGAs. These are programmed with highly specialized register-transfer level (RTL) code rather than conventional software. You get very high densities and capacities, and the power efficiency starts to look a little better, but you do need to package that FPGA somewhere in the network, whether in a computer or in purpose-built hardware, so your flexibility starts to go down a little.
Finally, I want to make another point about all the options you see in Figure 5: you can buy them outright, or you can rent them on a pay-per-use model. Figure 5 also maps Amazon EC2 instances to these categories, including the F1 instances, which are FPGA-based.
I think it's commonly understood in the industry that doing encoding and transcoding in an ASIC gives you the best densities, the lowest cost, the best real-time performance, and roughly 80% less power. But the challenge the industry has had in introducing ASICs is a perception of scalability problems, because these products are typically built into purpose-built hardware. They're also less flexible, since it's hard to change the software on these types of chips, and sometimes the encoding quality is something you need to look at.
The ideal solution would be an ASIC that gets you the performance and density, but in an architecture that addresses these concerns.
Enter the Codensity T400 Video Transcoder
At Streaming Media West this year, we introduced the Codensity T400 Video Transcoder (Figure 6, below). I can hold it in my hand, and it can handle eight sessions of H.265 video. I'll talk a little more about the form factor later, but suffice it to say that it's designed for scalable H.264 and H.265 video encoding and transcoding.
Figure 6. Codensity T400 Video Transcoder
It's got killer capacity. One of these little modules can transcode 4K video at up to 60 frames per second, the equivalent of the eight 1080p streams I already mentioned, so it delivers more capacity at a lower cost and significantly less power: six watts, less than it takes to run a light bulb. Put this in your network, and you get capacity and low power.
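The "one 4Kp60 equals eight 1080p streams" equivalence follows directly from pixel throughput, assuming the 1080p streams run at 30 frames per second (the frame rate is my assumption; the article doesn't state it):

```python
def pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels per second the encoder must process for one session."""
    return width * height * fps

uhd_60 = pixel_rate(3840, 2160, 60)   # one 4Kp60 session
fhd_30 = pixel_rate(1920, 1080, 30)   # one 1080p30 session
print(uhd_60 // fhd_30)  # 8
```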
The other benefit of this is predictable low latency. With software encoding, when you load up those CPUs toward maximum utilization, it can do funny things to the latency, and we all know that latency is very, very important for live video streaming. With this, you get predictable low latency, with as little as one frame of delay through the box.
These are the benefits, so let's address some of the concerns the audience might have. First: how do I achieve scalability and flexibility? For that, we brought in some innovations from cloud storage architectures, and we're now applying them to the encoding and transcoding industry (Figure 7, below).
Figure 7. Scalability through innovations in SSD cloud storage
Many of you are probably familiar with the mega-trend in the storage world away from hard disk drives to solid-state drives (SSDs) based on NAND flash memory. With that trend has come a rethink of the protocol. In the old days, it was SATA and SAS hard drives, and those interfaces were far too slow for NAND memory, so the industry invented the Non-Volatile Memory Express (NVMe) protocol instead. It offers very high throughput and very low latency, but most importantly, it's extensible to other applications like encoding.
Now, let's talk about form factors. As people move to SSDs, the most common form factor is called U.2. There are U.2 devices from many vendors, and a growing class of servers called NVMe servers, pictured on the right-hand side of Figure 7. In the storage world, you buy your SSD drives and plug them into the NVMe server's slots, and that's how you build capacity. NVMe deployments can be NVMe servers, NVMe storage arrays, or even NVMe over Fabrics, so we can add very high-density encoding capacity with this type of architecture.
At the bottom of Figure 7, you see a picture of an NVMe server. It has 10 slots in the front, and 10 times 8 gives you 80 sessions of 1080p H.265 in a 1RU. The flexibility comes from this packaging: you buy a standard NVMe server, so you're not locked into proprietary hardware. Many vendors provide them, and if you stop using these modules, you can reuse that infrastructure capacity for storage. It's a unique idea that addresses a common pushback on these types of architectures.
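The density arithmetic behind that 1RU figure, with the software-only baseline from earlier for contrast (all counts taken from this article):

```python
SLOTS_PER_1RU_SERVER = 10      # U.2 bays in the NVMe server shown in Figure 7
SESSIONS_PER_T400 = 8          # 1080p H.265 sessions per T400 module
SOFTWARE_SESSIONS_PER_1RU = 5  # CPU-only baseline mentioned earlier

t400_density = SLOTS_PER_1RU_SERVER * SESSIONS_PER_T400
print(t400_density)                               # 80 sessions per 1RU
print(t400_density // SOFTWARE_SESSIONS_PER_1RU)  # 16x the software density
```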
The other thing I want to touch on is encoding quality. In the video-on-demand world, you have the benefit of time: you can choose slow presets and do two-pass encoding, so it's easy to get very high-quality results from a software-on-CPU encoding architecture. But for live streaming, you need to create that entire encoding ladder in real time.
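For context on what "the entire ladder in real time" means: a live encoder must produce every rung of an adaptive-bitrate ladder simultaneously. The ladder below is purely illustrative (the article doesn't specify one); the point is the aggregate pixel throughput per source stream:

```python
# Illustrative ABR ladder: (width, height, fps) per rung.
ladder = [
    (1920, 1080, 30),
    (1280, 720, 30),
    (854, 480, 30),
    (640, 360, 30),
]

# Every rung must be encoded at once, so the pixel rates add up.
total_px_per_sec = sum(w * h * f for w, h, f in ladder)
print(total_px_per_sec)  # aggregate real-time throughput for one source stream
```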
Figure 8 (below) shows the results of some internal benchmarking we've just done. It compares the PSNR quality score of our T400 solution against some comparable options: software encoding at the faster preset; an NVIDIA GPU, very commonly used to scale encoding capacity, against which we showed better quality scores; and Intel Quick Sync Video (QSV) at the bottom. Our solution delivers decent quality, and while this is internal benchmarking, it's been validated by some of our early customers as well.
Figure 8. Codensity T400 Video Transcoder – Encoding Quality
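PSNR scores like those in Figure 8 are computed from the mean squared error between the source frame and the encoded output. A minimal sketch for 8-bit video (flat pixel lists stand in for real frames):

```python
import math

def psnr(reference, distorted, peak: int = 255) -> float:
    """PSNR in dB between two same-sized 8-bit frames (flat pixel lists)."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return float("inf")  # identical frames: no distortion
    return 10 * math.log10(peak ** 2 / mse)

ref = [100, 110, 120, 130]
enc = [101, 109, 121, 129]  # every pixel off by 1, so MSE = 1
print(round(psnr(ref, enc), 2))  # 48.13 dB
```

Higher is better; real tools compute this per plane across every frame and average, but the formula is the same.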
The final thing I want to touch on is how we integrate the T400 into the network. This is another elegant solution: many people in the industry use FFmpeg, an open-source library of video processing and transcoding functions.
The host is a server with a multi-core Intel architecture, but because you're offloading the transcoding onto these devices, the utilization on that server stays very low. You run your FFmpeg there, and we add a software patch to FFmpeg's libavcodec library; through that patch, FFmpeg recognizes the T400 resources available in that particular server. If you're already using FFmpeg, it's a very straightforward way to add significant capacity and performance to your architecture (Figure 9, below).
Figure 9. Codensity T400 Video Transcoder – Software Integration
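In practice, that integration means FFmpeg command lines look almost unchanged; you simply select the hardware encoder. The encoder name "h265_t400" below is a placeholder I've invented for illustration; the real identifier would come from the vendor's libavcodec patch. A sketch that assembles such a command:

```python
def t400_transcode_cmd(src: str, dst: str, bitrate: str = "4M") -> list:
    """Build an FFmpeg argv list that offloads encoding to the T400.

    "h265_t400" is a hypothetical encoder name; the actual name is
    defined by the vendor's libavcodec patch.
    """
    return [
        "ffmpeg", "-y",
        "-i", src,            # decode the incoming stream on the CPU
        "-c:v", "h265_t400",  # route encoding to the ASIC module
        "-b:v", bitrate,      # target bitrate for this ladder rung
        dst,
    ]

print(t400_transcode_cmd("live_in.ts", "out_1080p.mp4"))
```

The list would typically be handed to `subprocess.run`; the key point is that only the `-c:v` selection changes from a software-encoding invocation.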
Early Customer Validation
We also have some early customers (Figure 10, below). We announced the product on November 13, and we've had some early feedback from customers we're working with in China. On the left-hand side of Figure 10 is a CDN operator called China Net Center, or Wangsu, which validated that, "Hey, this makes a lot of sense. I can see how this can deliver the density and performance at lower hardware cost."
Figure 10. Early customer validation
But what's more interesting here is YY.com. Most people in North America haven't heard of YY.com, but it is a huge live video streaming social media site in China. It's listed on NASDAQ with a market cap of $3.5 billion and 75 million annual users, and its business model is exactly what I was talking about earlier. They're working at that right side of the edge: many, many individuals streaming live video content, each with maybe tens or hundreds of followers watching those streams.
Their infrastructure was getting crushed by the encoding requirements, and they were absolutely thrilled when they started talking with us. We're continuing to do some early testing with them, but more importantly, I was pleased to hear them say, "Hey, the quality is pretty good. We're impressed. It's at least as good as or better than what we're doing today."
In summary, live video streaming volumes are growing 15X, and I think the encoding capacity problem is even bigger than that. It's going to keep growing because of user-generated content, increasing video resolutions, and the increasing use of high-compression codecs.
We've gone through some of today's encoding architectures and their inefficiencies. You're not going to be able to scale to what the industry needs by adding a rack server for every five sessions you need; it's just not going to work. You need to look at alternative approaches. We think ours has merit, because it delivers high quality, scalability, and performance at a lower cost than alternative solutions. We're the first to combine the innovations of ASICs with those cloud storage architectures, and we've already gotten positive feedback from early customers.
This article is Sponsored Content