Latency Sucks! So Which Companies Are Creating a Solution?
It's one of the biggest challenges facing live video streaming today, and it's remained frustratingly stubborn. Explore the technical solutions that could finally bring latency down to the 1-second mark.
What is latency? In a word, it means delay. Why is latency important? That’s a much longer—and, for streaming, a much more costly—answer. Because, when it comes to streaming, latency sucks.
To put latency in the context of streaming media, let’s first consider which of two latencies we are referring to. That’s a question that needs to be asked, whether you’re a live content producer looking to hire a streaming company to acquire a broadcast feed, or you’re planning to do it yourself.
The first type of latency is player startup time. At the request of numerous companies in the industry, I’ve had the chance to perform extensive tests around this type of latency, measuring the time difference between a user request to start a stream and the player’s actual response time.
It’s become a more rigorous and scientific endeavor in recent years, as time-to-start average rates have dropped from multiple seconds to less than 1 second. Gone are the days of the stopwatch—they’re replaced now by reams of data from logging capture tools. The methodology we employ is to run a series of tests, using our impartial test bed with standardized content, and then average out the results for each tested player.
These days, industry averages fall in the 900–1,200 millisecond (ms) range from request to playback-ready state, but some customized apps are showing averages of around 500–650 ms. These time-to-play tests, which run in two controlled environments using two types of internet service provider to filter out any single-location anomalies, tell only half the story, though.
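As a rough illustration of the averaging step in that methodology, here is a minimal Python sketch. The player names, environment labels, and millisecond values are all invented for the example and are not measurements from the tests described above.

```python
from statistics import mean

# Hypothetical time-to-start samples in milliseconds, keyed by player name and
# then by test environment. Every name and number here is invented.
samples_ms = {
    "player_a": {"site_1": [950, 1010, 1100], "site_2": [900, 1180, 1040]},
    "player_b": {"site_1": [520, 610, 580], "site_2": [640, 560, 630]},
}

def average_startup_ms(per_environment):
    """Average all runs for one player across every test environment."""
    all_runs = [ms for runs in per_environment.values() for ms in runs]
    return mean(all_runs)

averages = {player: average_startup_ms(envs) for player, envs in samples_ms.items()}
for player, avg in averages.items():
    print(f"{player}: {avg:.0f} ms")
```

Pooling runs from both environments before averaging is one simple way to keep a single location's anomalies from dominating a player's result.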
The other latency issue is what occurs both before and after the user clicks on a link to request a live stream. This latency could be referred to as a lag time, meaning the time the stream itself lags behind the actual event or live broadcast.
Maxim Erstein, an engineer at Unreal Streaming Technologies, offers a good reason why we need to differentiate between the two types of latencies.
“Stream startup time is the amount of time between the moment [when] a user clicks on the Play button and the moment the stream starts playing,” says Erstein. “So you can achieve very low stream startup time with chunk-based protocols like HLS and MPEG-DASH, but only if the segments are large and already reside on the server. The player will download them instantly and will start playing immediately.”
The problem, as Erstein told me, is that pre-existing chunks inherently mean significant lag times (aka high latencies). The reverse is also true.
“So when the latency is low, the startup time can be higher,” says Erstein, adding that startup times are also affected by the minimum number of video frames the browser needs before an HTML5 player begins playback. “It depends on how many frames the MSE [media source extensions] player needs to buffer before it can start playing. In Chrome it’s just [a] few frames, but in Internet Explorer it’s like 20–30 frames.”
Using over-the-air (OTA) broadcast as the baseline, most cable television delivery adds about half a second of delay, which equates to 500 ms, or roughly 15 frames at 30 frames per second. The lag is only noticeable with digital cable delivery, and it is short enough that it usually shows up in a multi-television household as an annoying echo between two rooms watching the same live linear content.
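The frame counts quoted above convert to time with simple arithmetic. The sketch below assumes a frame rate of 30 frames per second, which is what the 500 ms and 15 frames pairing implies; the function name is made up for illustration.

```python
def frames_to_ms(frames, fps=30.0):
    """Convert a frame count to milliseconds at the given frame rate."""
    return frames / fps * 1000.0

# Cable delay relative to over-the-air broadcast: 15 frames.
print(f"15 frames = {frames_to_ms(15):.0f} ms")

# The 20-30 frames Erstein cites for Internet Explorer's MSE buffer.
print(f"20 frames = {frames_to_ms(20):.0f} ms")
print(f"30 frames = {frames_to_ms(30):.0f} ms")
```

At that assumed frame rate, a 20–30 frame buffer alone would account for roughly two-thirds of a second to a full second before playback can begin.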
Streaming lag times, though, can be well in excess of 30 seconds and often in excess of a full minute, depending on which type of streaming protocol is used.
Protocols and Latency
What do protocols have to do with latency? The illustration below may best answer that question. Developed by Wowza Media Systems in conjunction with feedback from the author, the graphic shows a continuum of streaming latencies and use cases, with additional detail about the average latency to expect from each protocol. Each protocol is shown within the continuum arrow as one of a series of bars.
Protocols affect lag time tremendously, with true streaming protocols such as real-time transport protocol (RTP) and real-time messaging protocol (RTMP) being used for live streaming. In recent years, to address scalability plus ease and quality of delivery, the industry has moved toward segment-based HTTP delivery (essentially large video files parsed down into small file “chunks” or “segments” that are then delivered in sequential order via a standard HTTP web server).
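To make segment-based delivery concrete, the following Python sketch builds the kind of text playlist an HLS player polls to discover the next segments to fetch over HTTP. The tags are standard HLS playlist tags, but the function name, the segment file names, and the 6-second durations are assumptions chosen for illustration.

```python
import math

def build_media_playlist(segment_durations, first_sequence=0, prefix="segment"):
    """Return the text of a live HLS media playlist for the given segments."""
    # The target duration must be at least as long as the longest segment.
    target = math.ceil(max(segment_durations))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_sequence}",
    ]
    for offset, duration in enumerate(segment_durations):
        lines.append(f"#EXTINF:{duration:.3f},")  # duration of the next segment
        lines.append(f"{prefix}{first_sequence + offset}.ts")  # hypothetical file name
    return "\n".join(lines) + "\n"

# A live window of three 6-second segments, starting at sequence number 120.
playlist = build_media_playlist([6.0, 6.0, 6.0], first_sequence=120)
print(playlist)
```

Because a segment can only be listed once it has been fully written, the viewer is always at least one segment duration behind the live event, which is where the lag discussed in this article comes from.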
The upside of streaming protocols like RTP and RTMP is low latency, while the upside of HTTP delivery protocols such as MPEG-DASH and Apple’s HTTP Live Streaming (HLS) is scalability.
“Among Wowza customers, RTMP has long been a highly used protocol,” says Chris Knowlton, vice president and streaming media evangelist at Wowza. “In recent years, though, HLS use has risen dramatically for its extended reach and massive scalability. Yet RTMP usage is still high. Why?”
Knowlton says that, for some customers, HLS is only used to supplement their traditional streaming reach, by allowing them to reach more modern endpoints—such as iOS-based iPads or iPhones, or even modern Android devices—while their use of RTMP still allows them to reach older devices, desktops, and browsers.
“For many, though, RTMP still provides a reliable way to provide low latency, ultra-low latency, and real-time delivery,” says Knowlton, “for use cases such as webconferencing, gaming, voice chat, and interactive social media experiences.”
The downside of RTMP is that it requires specialized players—many based on Adobe Flash Player, as Adobe initially created the RTMP protocol—while the downside of segment-based HTTP delivery is latency, given the need to do preprocessing of the on-demand video file or the live stream in order to deliver the tens of thousands of small file segments to a given viewer’s screen of choice.
“With reduced support for Flash playback in devices and browsers, RTMP usage for delivery is likely to decline at an accelerating pace over the next 5 years,” says Knowlton.
So what we have here is a fundamental disconnect between technologies and use cases: For playback of on-demand content, HLS and DASH make perfect sense, while for live events RTMP and RTSP (real-time streaming protocol) make perfect sense.
Can HTTP Delivery Match RTMP?
But is there a way to get the benefits of both worlds? Is there a generic HTTP delivery method that approaches RTMP-like latencies? Maybe. Just maybe.
Content delivery networks (CDNs) like Akamai are considering how to best address latency, and a number of companies have been working on innovative approaches to low-latency HLS delivery (L-HLS) in various forms.
Shawn Michels, Akamai’s director of media product management, sees the latency problem through the lens of a triangle. “We do see a lot of different definitions or criteria in the market for what constitutes low latency for streaming,” he says. “What we’ve learned is when you look across all the market segments, there are three key use cases that we need to solve when looking at low latency: broadcasters, OTT-only content, and what we call ‘personal broadcast’ using the likes of Facebook Live or Periscope.”
Michels says these users might have needs beyond just streaming, and his perspective meshes well with another article I’ve written in this issue of Streaming Media magazine, on streaming’s fit into enterprise unified communications workflows.
“What we see is customers who want to deploy solutions similar to Periscope or Facebook Live, which we often refer to as personal broadcast,” says Michels. “These customers are looking for latency at 3 seconds or less. We refer to this as ultra-low latency.” In the case of ultra-low latency, at least from Akamai’s perspective, both the person or organization doing the live streaming, as well as the viewers, need as little delay as possible to allow for live chat, reactions, or even a question-and-answer portion of the webcast.
Some companies are innovating by modifying HLS to use shorter segment lengths (Apple recommends lengths of 2–10 seconds, with three segments encoded and downloaded to a user’s local viewing device), while others are assessing ways to sidestep the delay inherent in packaging segments into the antiquated transport stream technology that HLS uses: MPEG-2 Transport Stream (M2TS).
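A back-of-the-envelope estimate shows why segment length matters so much. The sketch below assumes the player holds three full segments before playback, per the recommendation above, and treats everything else (encoding, packaging, network transfer) as a single hypothetical overhead_s parameter. It gives a rough lower bound, not a measurement.

```python
def min_segment_lag_s(segment_s, buffered_segments=3, overhead_s=0.0):
    """Rough lower bound on lag, in seconds, from segment buffering alone."""
    return segment_s * buffered_segments + overhead_s

for segment_s in (10, 2, 1):
    lag = min_segment_lag_s(segment_s)
    print(f"{segment_s:>2}-second segments: at least {lag} seconds behind live")
```

Under these assumptions, 10-second segments put the viewer at least 30 seconds behind the live event, 2-second segments cut that to 6 seconds, and 1-second segments bring it to 3 seconds before any other delay is counted.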
Let’s look at just a few examples of how these lower-latency approaches might play out.
A conversation with Pieter-Jan Speelmans, chief technology officer of THEOPlayer, uncovered some interesting work that its partner, Periscope, is attempting by modifying HLS.
“With a latency of about 2,000 ms, including encoding, packaging, [and] network transfer,” says Speelmans, “this gets [L-HLS] on-par with RTMP.”
That’s a pretty bold claim, since RTMP has long been one of the gold standards for low-latency live streaming delivery. Speelmans says that Periscope is seeing these types of lower-end latencies in controlled testing, but the company is not yet free to talk about this from a marketing standpoint.
Speelmans did say that part of the approach is to use 1-second HLS segments. He also says the approach requires modifying HLS in several ways. “The current solution we have in production is actually a modified version of HLS,” says Speelmans. “This is both server and client side. On the client side, we have added a lot of improvements which also benefit other HLS users. This results in a better user experience due to less buffering, lower time to first frame.”
When Speelmans mentioned modifying HLS, my immediate question was one of compatibility, both with the “Pantos spec” and with the large number of HLS-based players in the marketplace. The Pantos spec, so named because of the last name of its sole editor, Apple engineer Roger Pantos, is the common name for an Internet Engineering Task Force (IETF) draft specification that Apple has been submitting and updating for almost 7.5 years. In 2016, though, authorship was expanded to include input from William May, senior principal engineer of Major League Baseball Advanced Media (MLBAM).
From Apps to Mobile Operating Systems
Because Android and iOS both support HLS natively in their respective mobile OSes, maintaining compatibility with the spec is very important. After all, if modifications go too far, the spec itself will become fragmented (no pun intended), and the industry as a whole cannot advance toward overall lower latencies for HTTP-based delivery.