Low Latency Video Streaming - How Low Can You Go?
Mention the word and you will strike fear into engineers and operations folks throughout the streaming industry. Devices tremble when they hear its name. Let's talk latency, a topic where the debate and definition of deficiency are as old as the industry itself. Consider that football game you watched last weekend. When Patrick Mahomes sidearmed that 25-yard touchdown pass and you smiled thinking, "How does that guy do it, every time?" you were roughly 45 seconds behind the game itself. When you walked into the living room and saw the same game on the television, you scratched your head at how far behind you were on your handset.
To understand latency, we need to examine the components in a live streaming workflow. It all begins with the live signal, whether it is a fully produced channel or a unique feed from a towercam, helicopter, or even a single camera. That signal is ingested into a piece of hardware generally called an encoder, where the signal is digitized. From there, the digitized version usually takes one of two paths: either it goes into a cloud-based workflow where the contribution stream is packaged into additional stream variations, or those same variations (called renditions) are created on the encoder itself for delivery. In both cases, the stream renditions are then either pushed to or pulled by a content delivery network (CDN), where viewers connect to watch the streams via a player on a range of devices. These various pieces and parts have generally delivered streams that are roughly 30 to 60 seconds behind the live broadcast feed. By comparison, cable and satellite channels are delivered to the home about 5 seconds behind the live feed. This delay between the live signal and the moment you are able to view it is referred to as latency.
While I won't get into the history of streaming formats, protocols, and the like, suffice it to say that the advent of HTTP Live Streaming (HLS) in 2009, along with its dependencies contributing to the great latency debate, painted us all into a corner. This ensured some latency even with a new format, given that the video segments introduced a storage challenge for legacy CDNs and an assembly challenge for players. To be clear, this is no longer a problem, and CDNs like Fastly's were designed from the ground up to handle the smaller file fragments native to formats like HLS. HLS also helped alleviate some of the expense in developing and deploying Flash, at the time the dominant video format, and Steve Jobs had some very hot takes on that format. A discussion for another time.
So What Is Low Latency?
So, as I've outlined above, if tens of seconds of latency is normal, what does it mean to say something is low latency? In the world of video streaming, low latency implies a glass-to-glass delay of five seconds or less, subjectively speaking. Still, some organizations and applications require even faster delivery. As a result, some are using new labels like 'ultra-low latency' and 'near real-time' where video can, in theory, be delivered in less than 1 second. This sector of streaming applications is generally reserved for interactive use cases, two-way chat, and real-time device control (this is common with live streaming from a drone).
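To make those labels concrete, here is a minimal Python sketch that sorts a measured glass-to-glass delay into the tiers described above. The thresholds (under 1 second, and 5 seconds or less) come straight from the working definitions in this article; the names `LatencyTier` and `classify_latency` are hypothetical and exist only for illustration, since the industry has no single agreed-upon cutoff.

```python
from enum import Enum


class LatencyTier(Enum):
    """Illustrative tiers; the labels are subjective, as noted in the article."""
    ULTRA_LOW = "ultra-low latency / near real-time"
    LOW = "low latency"
    STANDARD = "above the low-latency threshold"


# Thresholds in seconds, taken from the article's working definitions.
ULTRA_LOW_MAX_S = 1.0  # "less than 1 second"
LOW_MAX_S = 5.0        # "five seconds or less"


def classify_latency(glass_to_glass_s: float) -> LatencyTier:
    """Map a measured glass-to-glass delay (seconds) to an illustrative tier."""
    if glass_to_glass_s < 0:
        raise ValueError("latency cannot be negative")
    if glass_to_glass_s < ULTRA_LOW_MAX_S:
        return LatencyTier.ULTRA_LOW
    if glass_to_glass_s <= LOW_MAX_S:
        return LatencyTier.LOW
    return LatencyTier.STANDARD


if __name__ == "__main__":
    # Example delays: a typical stream, a cable feed, and an interactive feed.
    for seconds in (45.0, 5.0, 0.4):
        print(f"{seconds:>5.1f} s -> {classify_latency(seconds).value}")
```

Under these assumed cutoffs, a typical 45-second stream lands in the standard tier, a 5-second cable-style delay just qualifies as low latency, and only sub-second delivery counts as ultra-low.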
So scrub forward in your player bar (see what I did there?) to 2020, and we are on the cusp of many players in the video space implementing Apple's new low-latency HLS (LLHLS). This is great news for those who publish news, sports, and time-sensitive content, in that the specification provides a path down to 1 to 3 seconds of latency. For reference, your service provider (cable/satellite) generally delivers content to your television about 5 seconds behind the live broadcast. But LLHLS isn't mandatory to achieve a solid streaming experience for your customers or viewers. Good old HLS will suit you just fine for most scenarios that are not time-sensitive.
Why Low Latency Matters
If news, sports, or critically time-sensitive applications are part of your business, you absolutely want low latency delivery today. Why? Simply so that your viewers can be within 2 seconds of the live content.
Take, for example, Videon's recent work with Fastly. This workflow starts with the creation of the stream renditions on our Edgecaster device, eliminating the added latency that cloud processes like transcoding and packaging introduce. While there is nothing wrong with cloud transcoding approaches, the tradeoffs are clear: added latency and cost. The stream renditions are pulled from a high-performance web server that lives on the device and are delivered through the Fastly edge cloud. The final step is to couple the workflow with a low-latency enabled player. This approach will work with a number of players. In this case, Videon worked with NexPlayer technology and has seen great results across multiple platforms. The entire workflow is seen in the diagram below.
Using this configuration and workflow, we collectively observe latency in the 1.2- to 1.7-second range. This is outstanding performance across all components of the workflow. Fastly's director of product management Dima Kumets was impressed when he saw the real-time latency readings. "Fastly's 106Tbps global network (as of September 30, 2020) delivers video for some of the largest streaming services in the world. The results we saw when working with Videon were exceptional."
We partnered with Fastly because we knew its network could deliver exceptionally high performance in this scenario. While this solution can work with most major CDNs, we were very pleased with the performance and overall experience that Fastly provides. But one key takeaway is that more and more organizations are looking at this type of architecture for low latency delivery, where video can live at the edge, close to both the source feed and the edge network. This could be a key live strategy that grows in the months to come.
A New Approach to Video Processing
To be clear, we're not applying any 'special sauce' here, nor using a new proprietary process or plugin. As the old industry salt that I am, I've seen far too many very interesting approaches to new video processes that require the user and video processor to jump through far too many hoops to get to first frame. In today's hyperscale video world where video grows exponentially every hour, we're constantly challenged to raise the bar.
Raising the bar, at least this time around, means using a new approach that extends down to the chipset. The Qualcomm Snapdragon is a relatively mature ARM processing chip, and well known in the smartphone space, but it is an entirely new tool for creating and manipulating media and streams. The Snapdragon chip itself provides native audio/video functions that can be leveraged, which results in far less power draw while exponentially raising the amount of video that can be processed. This means no dependency on relics of the past like graphics processing units (GPUs) and lots of cooling fans. Sure, there are some approaches similar to Videon's Edgecaster that use different chipsets, GPUs, and more 'heavy iron', but for its capabilities as a video distribution and compute platform at its price point, Edgecaster and this approach are highly effective not only for low latency use cases, but for many others too. This is something we consider an unfair advantage in the market: Videon's unique relationship with Qualcomm and our ability to develop powerful and advanced edge and media processing capabilities on a device in a way that no one else in the world has been able to do.
On the Horizon
So no matter how you coin the phrase (ultra low latency, super low latency, or cold fusion low latency; I jest, of course), we've established what it means to have low latency and how to do it well. As Qualcomm's Snapdragon gains in power with subsequent chipsets, the ability to process more media at the edge and to simply do more will grow exponentially. This means increases in resolution, scale, volume, and more. Further, it also means bringing more processes ancillary to video processing down to the device. Metadata extraction and grouping, image recognition and basic analysis, and many more processes that today run in the cloud could run more cost effectively at the edge and on this device. And we're just scratching the surface. Turns out you can go low, very low, and with other media-centric functions, broad.
[Editor's note: This is a contributed article from Videon. Streaming Media accepts vendor bylines based solely on their value to our readers.]