Buyers' Guide to Media Servers 2017
What that means, in real-world terms, is that the “live” MP4 stream would incur a delay of at least twice the keyframe interval. So if the keyframes were set 5 seconds apart, the MistServer approach would be delayed by at least 10 seconds, on top of the actual delay from the encoder itself. In some ways, this was similar to the way HLS works, in that HLS needs at least two segments (of approximately 2–10 seconds each, per Apple’s best-practice guidance) to download to the end-user’s device before “live” playback begins.
What a difference a year makes, though, as Viëtor says that the most recent version of MistServer, released in mid-January 2017, has reduced latency to a static ~1500 milliseconds.
“That’s right, a static ~1500 ms, regardless of key frame interval,” says Viëtor. “You can watch a stream with a key frame every 20 seconds, with only 1.5 seconds latency, without graphical glitches, without stuttering, and in any quality, with no plug-ins or scripts whatsoever.”
This 1.5-second latency is, of course, in addition to encoder latency, which a media server cannot control. But the goal of the DDVTECH team is to reduce it further, to a static ~200 ms, moving much closer to real-time delivery through the media server.
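The arithmetic behind these figures is simple enough to sketch in a few lines of Python. This is only an illustration of the numbers quoted above (two buffered keyframe-aligned units before playback, versus MistServer's claimed fixed figure); the function name and the two-unit buffering rule are taken from the article's examples, not from any vendor's code.

```python
def segment_buffered_latency_s(keyframe_interval_s, buffered_segments=2):
    """Minimum added startup delay when whole keyframe-aligned segments
    must be buffered before 'live' playback begins."""
    return keyframe_interval_s * buffered_segments

# 5-second keyframes -> at least 10 seconds of added delay
print(segment_buffered_latency_s(5))   # 10

# The delay grows with the keyframe interval...
print(segment_buffered_latency_s(20))  # 40

# ...whereas MistServer's newer approach claims a constant figure,
# independent of the keyframe interval (per the article).
STATIC_LATENCY_S = 1.5
print(STATIC_LATENCY_S)  # 1.5
```

Either way, encoder latency sits on top of these numbers, since it happens before the media server ever sees the stream.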
How Now, Latency!
With additional work being done by all media server companies to reduce the latency of their core products, it’s no surprise that the Wowza Streaming Engine is also being optimized for a variety of new formats.
We covered a number of Wowza features in the recent “Latency Sucks!” article (go2sm.com/latency), but one of the key takeaways was Wowza’s focus on reducing latency in two key ways: WebRTC and WebSockets.
WebSockets address problems that arise when delivering streams over HTTP on top of TCP (using technology such as Apple’s HLS or MPEG-DASH). Essentially, WebSockets maintain a single persistent TCP connection so that smaller segment sizes (of 1 second or less) can be pushed to an OTT device without a separate request-and-acknowledgment cycle for each delivery.
Because a WebSocket is a persistent, full-duplex channel, the server and end-user player can both freely send messages in either direction without waiting for a request from the other side. Internally, the browser carries WebSocket traffic over a TCP connection, which sidesteps issues with delivering segments that may be smaller than the TCP window size set by the ISP, default browser settings, or even the media server itself.
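The full-duplex property described above can be illustrated with a plain socket pair in Python. This is a hedged sketch, not a WebSocket implementation: `socket.socketpair()` simply gives us the same kind of bidirectional byte channel that a WebSocket exposes to the browser, so either side can send at any time without first issuing a request.

```python
import socket

# A socketpair stands in for the persistent channel between a media
# server and a player: both ends may transmit without waiting for the
# other side to ask.
server, player = socket.socketpair()

server.sendall(b"segment-0001")      # server pushes a sub-second segment...
player.sendall(b"stats:buffer=ok")   # ...while the player sends unrelated data

print(player.recv(1024))  # b'segment-0001'
print(server.recv(1024))  # b'stats:buffer=ok'
```

In a real deployment, the handshake, framing, and masking defined by the WebSocket protocol (RFC 6455) sit on top of this kind of TCP channel, but the delivery model is the same: push-based, with no per-segment request.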
On the other side of the basic delivery protocol spectrum is UDP, an approach that doesn’t require any messaging acknowledgment at all. UDP has traditionally been used by standards-based, real-time streaming protocols such as RTSP. Given the fire-and-forget nature of UDP, many publishers have shunned it in favor of RTMP, which offered low-latency streaming over TCP.
The emergence of WebRTC, which we also covered at length in the “Latency Sucks!” article, essentially allows the benefits of UDP—including the ability to traverse firewalls and other potential network obstacles—to be used directly within the browser. According to one industry source, the combination of UDP and WebRTC “thus theoretically can achieve the absolute best latency of any in-browser method, hands down.”
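UDP’s fire-and-forget behavior is easy to demonstrate with Python’s standard `socket` module. This is a minimal sketch of the delivery model only (the addresses and payload are made up); real RTP or WebRTC media stacks add sequencing, timestamps, and encryption on top of exactly this kind of unacknowledged datagram.

```python
import socket

# Receiver: bind a UDP socket on loopback; the OS picks a free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
addr = receiver.getsockname()

# Sender: transmit a datagram with no handshake and no acknowledgment.
# sendto() returns immediately, which is what keeps latency low.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"video-frame", addr)

# On loopback this arrives reliably; on a real network it might not,
# and nothing in the protocol would tell the sender either way.
data, _ = receiver.recvfrom(1024)
print(data)  # b'video-frame'
```

The trade-off is visible in the comments: the sender never learns whether the frame arrived, so low latency comes at the cost of delivery guarantees.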
Media servers have offered support for closed-captioning solutions, but newer advances in live captioning are beginning to emerge. To understand how live captioning realities differ from traditional timed text—such as SAMI, SMIL, and the Timed Text Markup Language (TTML)—and the need for traditional CEA-608/CEA-708 compliance from acquisition through the media server or transcoder and on to the end-user’s player, see “Captioning Live Online Video” in the 2017 Streaming Media Sourcebook.
Web broadcasters and CE manufacturers alike face a challenge standardizing on a common timed-text standard, which means that media servers will play a significant role in converting between the various types of timed text for the foreseeable future.
In some ways, this support for CEA-708 is a chicken-and-egg conundrum. Even if the live captions can be acquired, if either the media server or the end-user player doesn’t fully support the standard, this important metadata won’t make it through the delivery process.
“YouTube and Ustream are the only big players that support [CEA 708] that I’ve found in the live space,” Hurford told me in an interview at Streaming Media West 2016. “Others will say they support it, but they don’t actually support the 708 standard, and they offer a different solution which oftentimes isn’t a great user experience, [such as] a scrolling transcript that pops up in a separate window.”
Far from being relegated to the fate of older streaming media technologies, media server options continue to grow. Innovation from newer companies with novel ideas, coupled with enhancements to legacy protocols and formats that allow live content to scale to television-sized audiences with fixed latency, means that 2017 should prove to be a year in which several key problems are solved.
Watch closely for the ratification of updated WebRTC standards by the Internet Engineering Task Force, and expect to see media server companies continue to push hard to limit the amount of delay between over-the-air broadcasts and OTT delivery to streaming devices.
Finally, now that 4K streams have been shown for key sporting events over the past several years, expect to see 4K live streaming become more commonplace. We have to thank media server companies for this too, as their research and development makes it possible for online video platforms to offer up adaptive bitrate streams at a variety of resolutions, from Ultra HD to 1080p to 720p.
This article appears in the March 2017 issue of Streaming Media magazine.