
HESP: Sub-second Latency, Fast Channel Change and Improved ABR over Standard CDNs

High efficiency streaming

Online viewers are more demanding than ever. They want low-latency video streaming for interactive use cases such as betting, fan engagement and polls. They also expect a lean-back TV experience over OTT, with channel changes that are on par with, or faster than, the zapping experience they have grown to love in broadcast.

However, media companies cannot meet these rising expectations with protocols such as HLS, DASH and WebRTC, because each of these typically forces a trade-off between live latency, fast channel change and the cost of scaling.

This is where the High Efficiency Streaming Protocol (HESP) comes in: an HTTP-based streaming protocol designed to meet these expectations. In addition to sub-second latency at scale, it brings improved adaptive bitrate (ABR) switching, up to 20% bandwidth savings, and channel change times as low as 100ms, which is on par with analogue zapping. On top of this, the HESP specification is interoperable with the CMAF standard and includes capabilities such as DRM content protection and subtitles, which are lacking in other ultra-low latency protocols such as WebRTC.

Figure 1: Increasing viewer QoE expectations require high efficiency streaming solutions

HESP explained

HTTP-based streaming protocols typically use a segment-based approach. This means a video is cut up into segments of a few seconds each, which requires video players to wait until the start of a new segment to start playback. This approach increases channel change times and introduces additional latency.
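The cost of segment boundaries can be illustrated with a small calculation. The sketch below is a toy model, not part of any protocol: it assumes a viewer joins at a random moment inside a segment and either waits for the next segment to start or begins at the start of the current one and plays behind live. The function names and the 6-second segment duration are illustrative assumptions.

```python
def segment_join_cost(segment_duration_s: float, join_offset_s: float) -> tuple[float, float]:
    """Return (wait_s, extra_latency_s) for a viewer joining mid-segment.

    wait_s: time until the next segment boundary (the channel change penalty).
    extra_latency_s: how far behind live the viewer is if playback instead
    starts at the beginning of the current segment.
    """
    if not 0 <= join_offset_s < segment_duration_s:
        raise ValueError("join_offset_s must lie inside the segment")
    # Joining exactly on a boundary costs nothing, hence the modulo.
    wait_s = (segment_duration_s - join_offset_s) % segment_duration_s
    extra_latency_s = join_offset_s
    return wait_s, extra_latency_s


def mean_wait(segment_duration_s: float, steps: int = 1000) -> float:
    """Average wait over evenly spaced join offsets within one segment."""
    offsets = [segment_duration_s * i / steps for i in range(steps)]
    return sum(segment_join_cost(segment_duration_s, o)[0] for o in offsets) / steps


if __name__ == "__main__":
    # Hypothetical 6-second segments, viewer joins 2 seconds in.
    print(segment_join_cost(6.0, 2.0))
    print(round(mean_wait(6.0), 2))
```

In this model the average wait is roughly half the segment duration, so shortening segments reduces the penalty but never removes it.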

HESP leverages a frame-based streaming approach, which does not require a trade-off between live latency and channel switching time. More specifically, HESP uses two streams:

  1. An Initialization Stream, which contains only key frames. This stream is not used during regular playback; it is only used when a new stream is started.
  2. A Continuation Stream, which is a regularly encoded, low-latency stream that can continue playback after any Initialization Stream image.

By leveraging chunked transfer encoding (CTE) with byte-range requests, a player can start a stream very quickly, or change quality when network conditions change. As a result, smaller player buffers are needed to deliver the same viewer quality of experience, so lower latencies can be achieved together with a fast channel change. Moreover, because HESP is an HTTP-based streaming protocol, it can scale easily and cost-efficiently to any audience size over standard CDNs.
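The interplay of the two streams can be sketched as a fetch plan. This is a conceptual model only: it does not reproduce the HESP manifest, wire format or any real HTTP request, and it assumes the Initialization Stream offers a key frame at every frame position. The `Frame` type, the function names and the quality labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    stream: str   # "init" (Initialization Stream) or "cont" (Continuation Stream)
    quality: str  # illustrative quality label, e.g. "720p"
    index: int    # frame position on the shared timeline


def playback_plan(start_index: int, end_index: int, quality: str) -> list[Frame]:
    """Frames to fetch when joining at start_index and playing to end_index."""
    # One key frame from the Initialization Stream starts playback...
    plan = [Frame("init", quality, start_index)]
    # ...and every later frame comes from the Continuation Stream.
    plan += [Frame("cont", quality, i) for i in range(start_index + 1, end_index + 1)]
    return plan


def switch_quality(plan: list[Frame], switch_index: int, new_quality: str) -> list[Frame]:
    """Re-plan from switch_index onward at new_quality via a single init frame."""
    end_index = plan[-1].index
    kept = [f for f in plan if f.index < switch_index]
    return kept + playback_plan(switch_index, end_index, new_quality)


if __name__ == "__main__":
    plan = playback_plan(10, 14, "720p")
    for frame in switch_quality(plan, 12, "1080p"):
        print(frame)
```

In this model a channel change or a quality switch costs exactly one Initialization Stream frame, regardless of where in the timeline it happens, which is why no segment boundary has to be awaited.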

Figure 2: HESP uses two complementary streams. Whenever a user starts a new video, an image (frame) is first fetched from the initialization stream. Images can be requested at any moment to start playback. Subsequent images are fetched from the continuation stream, which can play back at live latency after any initialization stream image.

HTTP-based streaming

A big advantage of HESP is that it’s an HTTP-based streaming protocol. This means that it can scale over standard CDNs, just like HLS and DASH, and hence it can reach any audience size. HESP brings a number of benefits over HLS and DASH though, such as lower latencies, fast channel change and improved ABR.

When all components of the workflow are optimized for low latency, including the encoder, HESP can also deliver sub-second latency, just like WebRTC. An important difference is that each WebRTC client requires a direct connection to the backend, so scaling means spinning up additional server infrastructure. This makes scaling WebRTC to bigger audiences complex and expensive. It is also more difficult to deal with flash crowds, which HTTP-based protocols such as HESP typically absorb in the CDN.

Just like most other HTTP-based streaming protocols, HESP is fully compatible with CMAF. It uses the CMAF container for media delivery and follows the CMAF media model. This makes it compatible with streaming protocols such as HLS and DASH and allows it to offer all CMAF-compatible features. This is a critical advantage, as it allows HESP streams to be protected with studio-compliant DRM such as Widevine, PlayReady and FairPlay using Common Encryption (CENC) in both cenc and cbcs modes. Another advantage over WebRTC is compatibility with subtitles: HESP supports standard subtitles in TTML or WebVTT, as used today in HLS and DASH. Thanks to these capabilities, HESP can replace existing pipelines today, without the need for proprietary solutions for content protection and subtitles.

Increasing HESP adoption

An increasing number of vendors are implementing HESP into their solutions, or ensuring compatibility, through the HESP Alliance. This includes vendors across the video workflow, from encoding/packaging to CDN, DRM and player. The HESP Alliance has a verification program for HESP-ready solutions to ensure compatibility across the ecosystem so that media companies can comfortably put together their own workflow with point solutions of HESP Alliance members.

HESP is available as an IETF specification, which details, for example, the HESP manifest, the continuation stream and the initialization stream. Because the HESP continuation stream is CMAF compatible, captions/subtitles, timed metadata and Digital Rights Management (DRM) are easy to handle; these items are also covered in the HESP IETF specification.

Secure high efficiency streaming

A big advantage of HESP is that it allows studio-approved DRM systems to be implemented through the common encryption standard. Many live streaming services have contractual obligations to use DRM but cannot reach sub-second latency, as they are currently limited to LL-DASH or LL-HLS implementations with 2-7 seconds of latency. When you want to create interactivity, however, or want to allow for betting, it's important to keep latency as low as possible to maximize your betting window. This is exactly what HESP can provide.

Figure 3: Secure high efficiency streaming demo setup

To demonstrate DRM capabilities, EZDRM, Synamedia and THEO have set up an end-to-end HESP workflow and measured the impact of DRM on latency and startup time. Measurements showed that DRM did not have an impact on end-to-end latencies with HESP-based streaming, and only a 250-400ms impact on startup time, related to the time needed to retrieve the DRM certificate & license, and CDM initialization time.

This article is Sponsored Content
