The State of Real-Time Streaming 2023
Last year marked a turning point: organizations are now far more attuned to offering live streams. That’s one of the key takeaways from a recent survey that our organization, the Help Me Stream Research Foundation, conducted for Streaming Media magazine and survey sponsor Phenix Real Time Solutions. The late 2022 survey, titled “The Business Value of Real-Time Streaming” (go2sm.com/realtime), also found that the perceived business value of real-time streaming directly correlates with organizations that have higher-than-average annual revenues and concurrent viewers.
In this article, we’ll look at the state of real-time streaming a few months into 2023, including several use cases that hold longer-term promises. But before we do so, here’s a baseline definition of real-time streaming, as used in the survey: Real-time streaming is “device-synchronized delivery to hundreds of thousands of viewers at less than 500 milliseconds per user.”
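That definition combines two measurable constraints: per-viewer glass-to-glass delay and synchronization across devices. As an illustrative sketch (the function name and the drift threshold are my assumptions, not part of the survey definition), a monitoring tool might check both at once:

```python
# Illustrative check of the survey's definition of real-time streaming:
# every viewer sees the stream in under 500 ms, and devices stay in sync.
# The 100 ms drift threshold is an assumption for illustration only.

def is_real_time(latencies_ms: dict, max_drift_ms: float = 100.0) -> bool:
    """latencies_ms maps a device ID to its measured glass-to-glass delay."""
    values = list(latencies_ms.values())
    under_half_second = all(v < 500.0 for v in values)
    synchronized = (max(values) - min(values)) <= max_drift_ms
    return under_half_second and synchronized

print(is_real_time({"tv": 420.0, "phone": 450.0, "tablet": 430.0}))  # True: in spec
print(is_real_time({"tv": 420.0, "phone": 4500.0}))                  # False: phone lags
```

The key point the sketch makes: a feed can fail the definition on either axis, low latency alone or tight sync alone isn't enough.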
As we’ll see in the rest of the article, two concepts—lowered glass-to-glass delivery time, counted from camera image acquisition to end-user viewing screen, and synchronization across multiple devices, especially those within earshot of each other—are highly intertwined.
Use Cases for Lowered Latency
During a Streaming Media Connect 2023 session (go2sm.com/zerolatency), a panel of industry veterans from Europe and the U.S. discussed several of the use cases for real-time streaming. I’d encourage readers to view the panel session, but will touch on some of the highlighted use cases.
One use case that immediately springs to mind when low-latency streaming is discussed is sports. At the time of this writing, Super Bowl LVII had just taken place in Glendale, Ariz., with an estimated audience of 130 million worldwide. Some estimates say that 7 million viewed the big game via streaming, although as indicated by an image accompanying a Phenix tweet (bit.ly/3JjNbxi), customers may sometimes watch on multiple devices. “Our latest #SuperBowl latency study is out! #SportsFans were seeing 60+ sec delays,” the tweet said. “When will #SportsMedia prioritize #FanEngagement??? Phenix delivers video in <1/2 sec to millions around the globe; there’s no reason #SBLVII fans shouldn’t enjoy the same!”
The Phenix tweet lays bare an assumption—perhaps accurate, more likely not—that most fans want their sporting event delivered with the lowest possible delay. Yet that premise has two major flaws. First, the issue that Phenix and other “real-time solutions” attempt to address—at least based on publicly available data for its hot-off-the-press Super Bowl LVII latency report—isn’t delay as much as it is the synchronization of delivery, which was noted in our survey definition of real-time streaming but was not mentioned in the Phenix tweet.
I’ll delve more into various synchronization approaches later in this article, but let’s touch briefly on another aspect of the very nebulous methodology Phenix uses to claim superiority for its technology in the Super Bowl LVII report: The methodology does not actually compare the Phenix solution to over-the-air (OTA)—a baseline that would be expected for a company claiming in its tweet that it “delivers video in <1/2 sec to millions around the globe”—but rather compares streaming services against each other.
Second, the basic fact is that everyone in live-event broadcast knows that high-profile live events like the Super Bowl aren’t actually delivered in real time. “In talking to people who work with those networks, it tends to be in the 4–10 second [range],” said Loke Dupont, a solution architect at TV 2 in Denmark, during the Streaming Media Connect 2023 panel. “So, there are definitely delays even in traditional broadcast networks.”
In the U.S., broadcasts have three points of delay: inserting graphics; monitoring for words that cannot be spoken on broadcast television, enforced via a delay often called the “dump button”; and the actual transmission to a tower or satellite uplink, plus additional delay to the end user’s viewing device. In total, this adds about 8 seconds of overall glass-to-glass latency.
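Those three stages can be sketched as a simple sum. The per-stage numbers below are illustrative assumptions chosen to total roughly the 8 seconds cited above; they are not published measurements:

```python
# Back-of-the-envelope sum of U.S. broadcast delay stages.
# Per-stage values are assumed for illustration, not published figures.
delay_stages_s = {
    "graphics_insertion": 1.0,       # assumed
    "profanity_delay": 5.0,          # the "dump button" window, assumed
    "transmission_and_uplink": 2.0,  # tower/satellite path to viewer, assumed
}
total_s = sum(delay_stages_s.values())
print(f"approximate glass-to-glass delay: {total_s:.0f} s")  # ~8 s
```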
So, what’s the current state of the sports industry’s push toward lower latencies? Magnus Svensson, media solution specialist at Eyevinn Technology, noted in the panel session that recent sports videoconferences in Europe and North America have taken a measured approach. “What I hear in the industry so far is that we’re still prioritizing quality and stability over latency,” he said. “So, most of the viewers watching a sports event tend to want it in a good, high-quality fashion and also stable, not buffering or anything like that.”
Svensson and Dupont noted significant trade-offs when delivering across the public internet network versus an optimized video network. “The internet, of course, wasn’t designed for the same way as if you’re working in a cable network designed to do video distribution,” said Dupont. “Of course, in the best approach, you would design it to do that without having several buffers somewhere that it needs for reliability. But when we’re talking about the internet, there are a lot of hops on the way to get to our users that could delay the stream.”
Because the best-effort nature of the internet isn’t laser-focused on video delivery, tweaking to optimize latency has unintended consequences. “The more you start squeezing the latency down, the more risk you introduce with buffering and lower the video quality,” said Svensson.
With the continued push by major sports leagues in the U.S. to advocate for wagering on their events, the U.S. is following Europe’s early lead in delivering lower-latency streams to casinos and other establishments that cater to sports wagering. Sporting events have traditionally been delivered to sportsbook or casino venues via satellite video broadcast, which itself has latency inherent to the uplink, processing, and subsequent downlink of satellite-delivered video. Yet the delay is often only 2 seconds or less, and satellite has the benefit of being able to multicast—sending one stream that thousands can receive—versus the traditional unicast approach for streaming, in which each stream generates an additional load on the media delivery servers.
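The multicast advantage is easy to quantify: delivery-side egress for unicast scales with the audience, while a multicast (or satellite broadcast) feed is sent once regardless of audience size. A rough sketch, with an assumed 5 Mbps stream and a hypothetical 100,000-viewer audience:

```python
# Rough comparison of delivery-side bandwidth for unicast vs. multicast.
# Bitrate and audience size are assumed for illustration.
BITRATE_MBPS = 5.0
viewers = 100_000

unicast_egress_gbps = viewers * BITRATE_MBPS / 1000  # one stream per viewer
multicast_egress_gbps = 1 * BITRATE_MBPS / 1000      # one stream, received by all

print(f"unicast:   {unicast_egress_gbps:,.0f} Gbps")   # 500 Gbps
print(f"multicast: {multicast_egress_gbps:.3f} Gbps")  # 0.005 Gbps
```

The gap is why each additional unicast viewer adds load on media delivery servers, while a satellite downlink serves a thousandth viewer as cheaply as the first.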
When it comes to a sportsbook, though, we also have to factor in the economics of lowering the latency for the overall audience compared to just a select few who do real-time sports wagering outside a casino environment. “It’s still a small fraction of the viewers that actually do live betting at the same time,” said Svensson. That fact has led to discussions around the bifurcation of streaming, in which a smaller group of viewers gets a very low-latency stream, devoid of many or all of the key graphic elements the typical broadcast delivers, while the vast majority of viewers get the graphics-heavy, albeit longer-to-deliver, stream.
Svensson noted that one use case popping up more and more in discussions over the last 6 months is how to augment the in-stadium experience for fans who have paid to attend a sporting event. He explained that this is ideal for those who may only have one vantage point on a much longer racecourse. “You can use your mobile or tablet or whatever you bring to the stadium as your second device,” said Svensson, “following the rest of the event beyond your vantage point.”
One example Svensson gave was a cross-country skiing event, where viewers in the stands can’t see the complete racecourse. The same might be true for Formula 1 races or even cross-country running events. It’s natural for viewers to want to keep track of their favorite racers when, say, the primary pack of skiers is in the forest area of the race outside the stadium. “In this case, then, low or sub-second latency comes into play because you don’t want to be 20 or 30 seconds behind watching the second screen when you’re sitting at the race itself,” said Svensson.
Setting Expectations Around Synchronization
So do we lower the streaming latency for every viewer because of the needs of a select few watching from an in-stadium vantage point or betting in real time on the game? “That doesn’t mean that the whole broadcast of that event needs to be that low latency,” said Svensson, “but for the people in the stadium, it is actually quite an interesting use case to be able to watch the rest of the race following your skier or your driver at the same time as viewing the complete race.”
Yet even with Svensson’s example of lowering delay for in-stadium event viewing, it’s still a small subset of a small overall group of viewers who consume sports via streaming. And that brings us back around to the greater need: synchronization of content delivered to multiple devices in a given location. “I think if we can get down to a latency around 3, 4 seconds,” said Svensson, “we can remove the frustration of the tweets and the news tickers and others watching it a minute or more before we see the key play or event. I think that’s where it will settle in.”
For the general viewer, though, there’s another frustration: data delivery, such as texts, tweets, and graphics visualizing telemetry data from a race car or skier, can be accomplished at speeds well under a half-second. “I think one of the big challenges there is that you get your updates from not just the video stream that you’re watching, but plenty of sports have apps that will push events to you very, very soon after they happen,” said Dupont. “Nobody really wants to get a notification about the updated score and then afterward see the goal on video. You want to see the goal first and then get the notification.”
Dupont’s point mirrors several comments that respondents to “The Business Value of Real-Time Streaming” survey made, with one response summing up the issue succinctly: “Saw a live tweet about a massive play I hadn’t even seen the setup for yet.”
In addition, there’s the conversation around whether interactivity is best handled by lowering video latencies—and whether it’s even possible. “We also have examples like The X Factor show, where you vote,” said Dupont. “You would also want to have that at least be somewhat low latency because otherwise, you have a voting period that’s very long because you need to make sure that everybody’s actually seeing all of it before they vote.”
Don’t Drift Away
Synchronization must also account for the drift of live-event streams when a dynamic play or series of plays occurs and streaming viewership spikes. In that regard, the Phenix Super Bowl LVII report gets one key point right: Almost every streaming solution has a “drift” factor, in which a stream that started out delayed by around 20 seconds can end up more than a minute behind if hundreds of thousands of unicast streams have to be generated from an origin server.
That point further solidifies the need for synchronization. To emphasize it even more, the picture accompanying the Phenix tweet shows synchronization across three of four devices—including OTA delivery. The Phenix streaming solution, however, is noticeably absent, and the fourth device is so far off that it appears to be showing a completely different play.
“If you have no sort of time synchronization in the devices, you can end up with different delays of different devices of the same sort,” said Svensson. “I’ve actually been to sports bars where a different TV set has a different latency, which in the sports bar is very noticeable when half of the bar is 5–10 seconds ahead of the rest of the bar.”
Let’s Just Talk
Sometimes, this drift is dealt with by dropping the video completely as a way to get very timely messaging through. There are two key use cases for this: auctions and breaking news.
In discussions between the Help Me Stream Foundation and auction companies that we’ve asked to help sponsor our research into in-the-field interactivity, we hear consistently that the latencies of 15 seconds or higher with typical streaming video are too lengthy to allow adequate synchronized bidding. While it’s true that Zoom and other web conferencing solutions might work in a perfect world, the other feedback we receive has to do with the location of auctions, many of which are literally in a field (think cattle, farm, or horse auctions). As such, we’re told that a judgment call is often made to drop the video and go to an audio-only stream.
If connectivity is severely limited, auctioneers will fall back to a phone call on an audio bridge: they don’t want to disadvantage people listening to the audio-only bidding relative to bidders who are waiting several seconds for the video to come through at the point where the bidding has reached a particular dollar or euro level.
The other use case is disaster areas, where two things are critical: getting information out to as many people as possible and getting it across a medium that almost everyone will have access to. It’s unlikely in those situations that civilian communications networks will remain intact or operate at optimal throughput, which in turn means that an audio-centric approach—accompanied by key data such as weather conditions, visualizations of storm movement, etc.—provides a better option for potential disaster area victims who aren’t spending their time sitting in front of a large screen. This approach has the added benefit of lowering battery consumption for listeners who may need to continue to receive details of the disaster for hours on end. The latency goal from a streaming standpoint should be to try to match radio delays, which are much, much lower even than the sort of video delays we find in entertainment, enterprise, or educational settings.
Where Do We Go From Here?
Technologies that are currently in use, and some that are on the horizon, offer potential approaches to lowering latency.
With the vast majority of streams delivered from HTTP-based servers using either Apple’s HTTP Live Streaming (HLS) or the industry-standard DASH, exploring ways to lower latency at that point is a good place to start. In recent years, two new low-latency protocol extensions have been tested, one each for HLS and DASH. Early trials have focused on solving the latency issue by tuning manifest and chunk delivery across the entire chain: manifests are made available as early as possible, and “partial” chunks of the video are made available from the packager and sent out as soon as they are ready.
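In Low-Latency HLS, for example, the packager advertises those partial chunks directly in the media playlist so players can fetch them before the full segment is complete. A simplified playlist fragment (file names and durations here are illustrative, not from any real deployment) might look like this:

```
#EXTM3U
#EXT-X-VERSION:9
#EXT-X-TARGETDURATION:4
#EXT-X-SERVER-CONTROL:CAN-BLOCK-RELOAD=YES,PART-HOLD-BACK=1.0
#EXT-X-PART-INF:PART-TARGET=0.333
#EXT-X-MEDIA-SEQUENCE:266
#EXTINF:4.000,
segment266.m4s
#EXT-X-PART:DURATION=0.333,URI="segment267.part1.m4s",INDEPENDENT=YES
#EXT-X-PART:DURATION=0.333,URI="segment267.part2.m4s"
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="segment267.part3.m4s"
```

A player honoring PART-HOLD-BACK starts playback roughly 1 second behind the live edge instead of holding back three full 4-second segments, which is where most of the latency savings comes from.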
Media server solutions that are codec-agnostic are also offered, although many of these operate in a closed-loop system in which the player and media server both come from the same company. “There are also many proprietary solutions,” said Dupont. “THEO has HESP, and Softvelum has SLDP, and there are others. The problem with many of those is that they work in a way completely different from how any other streaming works. So, you have to have support on encoders. You have to have support on all sorts of players everywhere. And if it’s a closed ecosystem that depends on one vendor, it’s tough to get it out on the standard web.”
There has also been a push by some satellite providers to talk up their lower latency. But one of the underlying questions is how much traffic really needs low-latency delivery. Traditional satellite delivery providers like Viasat peg the amount of total traffic on their networks that requires low-latency delivery at around 10%. In Viasat’s comments to the Federal Communications Commission (FCC), the company noted that “the Commission has acknowledged that the impact of latency can be mitigated through appropriate network design,” adding that “the [International Telecommunication Union] has confirmed that the technical model underlying the 100 ms latency standard adopted in the [Connect America Fund] context likely overstates the impact of latency on service quality.”
A newer breed of satellite broadband providers, though, is taking the approach of saying that its delivery is on par with standard broadband latencies. SpaceX, for instance, has Starlink, a constellation of low Earth orbit (LEO) satellites for which it claims no more than 20 ms of added latency. SpaceX contends that Starlink can complete roughly 70 round trips of data transfer in the time it takes a geostationary satellite to complete one—approximately 20 ms versus the half-second or more required for geostationary satellites. The claims are disputed both by other satellite providers—many of which have their “birds,” or satellites, at 22,236 miles above the Earth’s surface to accommodate a wider swath of delivery—and by the FCC.
While Starlink’s LEO satellites sit at approximately 330 miles above the Earth—covering a limited area of the Earth’s surface and thus requiring many more satellites in orbit to achieve the same geographic coverage—they have the added benefit of delivering much lower-latency internet service than a geostationary satellite can provide. Still, the FCC noted at the beginning of a recent rural internet delivery auction that those seeking to bid as a low-latency provider using LEO satellite networks would face a “substantial challenge” demonstrating to FCC staff that “their networks can deliver real-world performance to consumers below the Commission’s 100ms low-latency threshold.” However, SpaceX overcame this challenge and received approval to bid on rural broadband initiatives; it was the only LEO satellite provider on the approved list of applicants.
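The physics behind those numbers is straightforward: propagation delay is distance divided by the speed of light, and a user-to-user round trip traverses the satellite link four times (up and down in each direction). A sketch using the altitudes above, ignoring processing and routing delays:

```python
# Propagation-only round-trip delay for GEO vs. LEO satellite links.
# A user -> satellite -> ground -> satellite -> user round trip crosses
# the link four times; processing and queuing delays are ignored here.
C_KM_PER_S = 299_792.458  # speed of light
MILES_TO_KM = 1.609344

geo_altitude_km = 22_236 * MILES_TO_KM  # ~35,786 km geostationary orbit
leo_altitude_km = 330 * MILES_TO_KM     # ~531 km Starlink-class orbit

def round_trip_s(altitude_km: float) -> float:
    return 4 * altitude_km / C_KM_PER_S

print(f"GEO: {round_trip_s(geo_altitude_km) * 1000:.0f} ms")  # ~477 ms
print(f"LEO: {round_trip_s(leo_altitude_km) * 1000:.1f} ms")  # ~7 ms
```

The result squares with the half-second figure for geostationary delivery and shows why LEO claims of roughly 20 ms (propagation plus routing overhead) are at least physically plausible.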
While web conferencing was a primary focus during the pandemic, even the big low-latency communications players like Zoom are now going through layoffs. But the general concept of real-time communications on the web—with the WebRTC bundle of protocols leading the charge—has been progressing through a series of technical advances. The most recent involve ingest (WHIP) and egress (WHEP) in a nod to the fact that there may be a number of traditional streaming providers that only want to use the near-zero-latency benefits of WebRTC for acquiring content while continuing to use HTTP-based delivery to the overall global streaming audiences. This is one to watch, as WHEP might create the technological path forward for the bifurcation mentioned earlier in the article.
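WHIP’s appeal is its simplicity: it reduces WebRTC ingest negotiation to a single HTTP POST carrying an SDP offer, with the server returning an SDP answer and a Location header for later session teardown. A minimal sketch of the client side, assuming a hypothetical endpoint URL (the function name and URL are mine, not from any vendor’s API):

```python
# Sketch of the client side of a WHIP (WebRTC-HTTP Ingestion Protocol)
# handshake: one HTTP POST with an SDP offer as the body. The endpoint
# URL below is hypothetical; a real one comes from the streaming provider.
import urllib.request

def build_whip_request(endpoint: str, sdp_offer: str) -> urllib.request.Request:
    """Construct the HTTP POST that carries the encoder's SDP offer."""
    return urllib.request.Request(
        endpoint,
        data=sdp_offer.encode("utf-8"),
        headers={"Content-Type": "application/sdp"},
        method="POST",
    )

req = build_whip_request("https://ingest.example.com/whip", "v=0\r\n...")
# A 201 Created response would carry the server's SDP answer in its body
# and the session resource (used for DELETE teardown) in its Location header.
```

Because the handshake is plain HTTP, a broadcaster can bolt WHIP ingest onto existing tooling while still delivering to viewers over HLS or DASH, which is exactly the hybrid model described above.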