HEVC in HLS: 10 Key Questions for Streaming Video Developers
Most encoders offer some kind of trade-off between encoding complexity and quality. For example, the x265 codec uses the same presets as x264 (ultrafast to placebo), while MainConcept uses numbered levels from 1 to 30. Once you get familiar with these controls for your codec/encoder, you should be in good shape.
7. What Are the Requirements for HEVC?
The requirements fall into three rough classes:
HEVC Encoded Files: The HLS Authoring Specification states, “Profile, Level, and Tier for HEVC MUST be less than or equal to Main10 Profile, Level 5.0, High Tier.” Table 1, taken from the Wikipedia HEVC page, details the level and tier restrictions. Significantly, while you can encode 1080p video at frame rates as high as 128 frames per second, 4K resolutions are restricted to 30 fps or lower. Note that the HLS Authoring Specification prohibits frame rates beyond 60 fps for all codecs.
Table 1. Level and Tier restrictions for HEVC encoding
Another notable requirement from the Authoring Specification is that “The container format for HEVC video MUST be fMP4,” or fragmented MP4 files, which means that MPEG-2 transport streams are out. This should simplify delivering unencrypted HEVC-encoded video to DASH and HLS clients, since both should be able to use the same bitstreams. In the short term, differences between the PlayReady and FairPlay encryption schemes may prevent interoperability of encrypted fMP4 content between DASH and HLS endpoints, though Microsoft has committed to resolving this for compatible hardware devices in 2018 with the release of PlayReady 4.0.
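As a rough illustration of these constraints, here is a minimal Python sketch that assembles an FFmpeg command line for one HEVC rung packaged as fMP4 HLS. It assumes an FFmpeg build that includes libx265; the function name, file names, bitrate, and segment duration are hypothetical placeholders rather than values from the Authoring Specification, and the level and tier settings passed to x265 should be verified against your own encoder version.

```python
from typing import List


def build_hevc_hls_command(source: str, out_playlist: str,
                           preset: str = "medium",
                           bitrate_kbps: int = 4500,
                           segment_seconds: int = 6) -> List[str]:
    """Return an ffmpeg argument list for a single HEVC rung in fMP4 HLS.

    All defaults are illustrative assumptions, not spec values.
    """
    return [
        "ffmpeg", "-i", source,
        "-c:v", "libx265",
        "-preset", preset,              # x265 presets: ultrafast ... placebo
        "-pix_fmt", "yuv420p10le",      # 10-bit 4:2:0 input selects Main10
        "-x265-params", "level-idc=5.0:high-tier=1",  # target Level 5.0, High Tier
        "-b:v", f"{bitrate_kbps}k",
        "-tag:v", "hvc1",               # sample entry type Apple players expect
        "-c:a", "aac",
        "-f", "hls",
        "-hls_segment_type", "fmp4",    # fragmented MP4, not MPEG-2 TS
        "-hls_time", str(segment_seconds),
        "-hls_playlist_type", "vod",
        out_playlist,
    ]


if __name__ == "__main__":
    # Print the command instead of running it so it can be reviewed first.
    print(" ".join(build_hevc_hls_command("source.mov", "hevc_1080p.m3u8")))
```

Building the argument list separately from running it makes it easy to generate one command per rung and to inspect each before committing encoder time.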
The HLS Authoring Specification contains two bitrate ladders, one for video files, the other for trick play files used for scrubbing and scanning. The video bitrate ladder is included as Table 2. Note that the suggested bitrate ladder indicates that the frame rate for 2K and 4K resolutions should be the same as the source, which is identical to all other resolutions down to 540p.
However, if you’re working with 60 fps 4K source, the aforementioned Level 5 limitation restricts you to 30 fps as shown in Table 1. Unfortunately, Apple hasn’t posted any HLS examples with 2K/4K videos, which might resolve this seeming inconsistency. Until it is resolved, I recommend taking the conservative route and restricting 2K and 4K HEVC videos to 30 fps.
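The arithmetic behind that ceiling can be sketched in a few lines of Python. This is a simplified check that assumes the Level 5 limits as I understand them from the H.265 level tables (8,912,896 luma samples per picture and 267,386,880 luma samples per second) and uses plain width times height; real encoders also enforce bitrate and buffer limits that this sketch ignores. The function names are hypothetical.

```python
MAX_LUMA_PICTURE_SIZE = 8_912_896    # assumed Level 5 limit, luma samples per picture
MAX_LUMA_SAMPLE_RATE = 267_386_880   # assumed Level 5 limit, luma samples per second
HLS_MAX_FPS = 60                     # HLS Authoring Specification cap for all codecs


def max_fps_at_level_5(width: int, height: int) -> int:
    """Highest whole frame rate Level 5 allows at this resolution (0 if too large)."""
    luma_samples = width * height
    if luma_samples > MAX_LUMA_PICTURE_SIZE:
        return 0
    return MAX_LUMA_SAMPLE_RATE // luma_samples


def fits_hls_hevc(width: int, height: int, fps: float) -> bool:
    """True if the format fits both the Level 5 and the HLS frame rate limits."""
    return fps <= min(max_fps_at_level_5(width, height), HLS_MAX_FPS)


if __name__ == "__main__":
    for w, h, fps in [(1920, 1080, 60), (3840, 2160, 30), (3840, 2160, 60)]:
        print(f"{w}x{h}@{fps}: level-5 max {max_fps_at_level_5(w, h)} fps, "
              f"fits={fits_hls_hevc(w, h, fps)}")
```

Under these assumptions, 1080p tops out at 128 fps and 3840x2160 at 32 fps, so a 60 fps 4K source does not fit Level 5, which is why 30 fps is the conservative choice.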
H.264 Encoded Files: As mentioned above, the Authoring Specification requires that some videos should be encoded with H.264, but provides no further guidance. So we looked at the mixed HEVC/H.264 ladder on the Apple developer site, and saw that Apple provided completely separate encoding ladders for both HEVC and H.264, nine rungs each, just as specified in Table 2, though the highest resolution supported in either format was 1080p. Looking at the master M3U8 manifest file, the player selects the codec first, then the appropriate rung (note that the Apple playlist calls the rungs “gears”).
Table 2. Apple’s suggested encoding ladder for H.264, HEVC, and HDR
This is interesting, because before Apple provided its example, there were multiple theories for the optimal composition of an HEVC/H.264 ladder, including a ladder that provided H.264 for lower quality rungs and HEVC for the higher resolution rungs. At the session, several attendees and the two producers from RealEyes suggested that it would be tough for any software-based player to smoothly switch between H.264 and HEVC playback, which tends to support the Apple approach. The obvious downside is that it doubles your encoding costs and substantially increases storage costs.
At least until proven otherwise, I would recommend adopting Apple’s approach. You should also download the Master M3U8 file and mine this for other encoding and presentation details.
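One way to mine a master playlist is to group its variants by video codec, which makes the separate H.264 and HEVC ladders easy to compare. The following is a minimal sketch using only the Python standard library; the sample playlist, its URIs, and its bitrates are made up for illustration and are not copied from Apple's example.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Matches KEY=value pairs, where value may be a quoted string containing commas.
ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attributes(attr_text: str) -> Dict[str, str]:
    """Split an HLS attribute list into a dict, stripping surrounding quotes."""
    return {key: value.strip('"') for key, value in ATTR_RE.findall(attr_text)}


def codec_family(codecs: str) -> str:
    """Classify a CODECS string by its video codec entry."""
    entries = [c.strip() for c in codecs.split(",")]
    if any(c.startswith(("hvc1", "hev1")) for c in entries):
        return "HEVC"
    if any(c.startswith(("avc1", "avc3")) for c in entries):
        return "H.264"
    return "other"


def group_variants_by_codec(master_text: str) -> Dict[str, List[Tuple[int, str, str]]]:
    """Return {family: [(bandwidth, resolution, uri), ...]} sorted by bandwidth."""
    groups = defaultdict(list)
    lines = [ln.strip() for ln in master_text.splitlines() if ln.strip()]
    for i, line in enumerate(lines):
        if not line.startswith("#EXT-X-STREAM-INF:"):
            continue
        attrs = parse_attributes(line.split(":", 1)[1])
        uri = lines[i + 1] if i + 1 < len(lines) else ""  # URI follows the tag
        groups[codec_family(attrs.get("CODECS", ""))].append(
            (int(attrs.get("BANDWIDTH", "0")), attrs.get("RESOLUTION", ""), uri))
    return {family: sorted(variants) for family, variants in groups.items()}


# Hypothetical two-rung master playlist for demonstration only.
SAMPLE = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=2200000,CODECS="avc1.640020,mp4a.40.2",RESOLUTION=960x540
avc_540p/prog_index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1500000,CODECS="hvc1.2.4.L123.B0,mp4a.40.2",RESOLUTION=960x540
hevc_540p/prog_index.m3u8
"""

if __name__ == "__main__":
    for family, variants in group_variants_by_codec(SAMPLE).items():
        print(family, variants)
```

The same attribute parser can be pointed at other tags in the playlist, such as the I-frame stream entries, to extract their resolutions and bitrates.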
I-Frame/Trick Play Support: Apple added trick play support for fast forward and reverse playback, either in the video playback window or as thumbnails, in iOS 5, and detailed how to create I-frame playlists to support this feature in Apple Technical Note TN2288. In TN2288, Apple states, “you don’t need to produce special purpose content to support Fast Forward and Reverse Playback. All you need to do is specify where the I-frames are. I-frames, or Intra frames, are encoded video frames whose encoding does not depend on any other frame. To specify where the I-frames are, iOS 5 introduces a new I-frame only playlist.” According to TN2288, you don’t need to create a separate encoded file for trick play support, just a playlist that points to I-frames in existing content files.
In the HLS Authoring Specification, Apple modified this recommendation, stating, “You SHOULD have one frame per second ‘dense’ I-frame renditions. These are dedicated renditions that only contain I-frames. Alternatively, you MAY use the I-frames from your normal content, but trick play performance is improved with a higher density of I-frames.”
The spec also states, “If you provide multiple bit rates at the same spatial resolution for your regular video then you SHOULD create the I-frame playlist for that resolution from the same source used for the lowest bit rate in that group.” As further guidance, Apple provides the suggested encoding ladder shown in Table 3. As you would expect, the Apple sample presentation implemented these recommendations to the letter, with separate I-frame encoded files for both H.264 and HEVC at all suggested resolutions.
Table 3. The suggested trick play encoding ladder from the HLS Authoring Specification
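To make the playlist approach from TN2288 concrete, here is a minimal Python sketch that writes an I-frame-only media playlist pointing at byte ranges inside an existing content file. It assumes you already know each I-frame's duration, byte length, and byte offset (typically reported by your packager) and that the fMP4 initialization segment lives in a separate file; the function name, file names, and numbers are hypothetical. The resulting playlist would be referenced from the master playlist with an EXT-X-I-FRAME-STREAM-INF tag.

```python
import math
from typing import List, Tuple


def build_iframe_playlist(media_uri: str, init_uri: str,
                          iframes: List[Tuple[float, int, int]]) -> str:
    """Build an I-frame-only playlist.

    iframes holds (duration_seconds, byte_length, byte_offset) per I-frame,
    where duration is the time until the next I-frame.
    """
    if not iframes:
        raise ValueError("at least one I-frame entry is required")
    # Target duration must be an integer no smaller than any entry's duration.
    target = max(1, math.ceil(max(duration for duration, _, _ in iframes)))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:6",
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-PLAYLIST-TYPE:VOD",
        "#EXT-X-I-FRAMES-ONLY",            # marks every entry as a single I-frame
        f'#EXT-X-MAP:URI="{init_uri}"',    # fMP4 initialization segment
    ]
    for duration, length, offset in iframes:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(f"#EXT-X-BYTERANGE:{length}@{offset}")
        lines.append(media_uri)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two made-up I-frames, one second apart, inside a single content file.
    print(build_iframe_playlist("main.mp4", "init.mp4",
                                [(1.0, 5000, 0), (1.0, 4800, 90000)]))
```

The same function works whether the byte ranges point into your normal content or into a dedicated dense I-frame rendition; only the input list and media URI change.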
By my count, between H.264 and HEVC content and I-frame-only files, Apple encoded the source video to 28 separate files, which may strain the budgets of some producers. This is particularly true for 4K producers, since Apple’s ladder didn’t include 2K/4K iterations, which are the most expensive to encode, and would have swelled total encoding requirements to 31 files, with potentially 17 more required for HDR.
During the session, these requirements generated significant discussion among the attendees, many of whom had been producing HLS presentations for years. Most stated that they provided one or two trick play files, with few providing them at all resolutions, and most pointing to the I-frames in existing files rather than encoding separate, I-frame-only files. Producers will have to make their own cost/benefit analysis to decide upon the optimal approach for them.
8. Should I Use Apple’s Suggestions Verbatim?
Sometime during the last revision or two of the Authoring Specification, Apple addressed per-title encoding implementations, stating that “The above bitrates are initial encoding targets for typical content delivered via HLS. We recommend you evaluate them against your specific content and encoding workflow then adjust accordingly.” So Apple isn’t dictating a fixed encoding ladder.
Beyond data rates, if you study Apple’s ladder, you’ll note that it uses essentially the same resolutions for HEVC and H.264 for all rungs below 2K. At the preconference session, one of the more technically savvy attendees suggested that Apple’s ladder should have completely different rungs for HEVC to account for the codec’s greater efficiency with high-resolution videos. This led to the analysis presented in an article entitled, “Apple Got It Wrong: Encoding Specs for HEVC in HLS.”
Long story short, the article proposes that the optimal ladder for HEVC would eliminate several lower resolution rungs, and push higher resolution rungs lower in the ladder. This is shown in Table 5, which shows Apple’s suggested ladder on the left and the proposed ladder on the right (customized for the animated movie Sintel), along with VMAF scores rating the quality of both alternatives. For optimal QoE, you’ll get better results with the Should Be ladder than with the Was ladder designated by Apple.
Table 5. Apple’s HEVC encoding ladder on the left, proposed encoding ladder on the right
9. What Are My Live Options?
Live options are nascent but rapidly becoming available, and the presentation handout lists encoders from Bitmovin, Elemental, Harmonic, and Hybrik, as well as transcoding solutions from Wowza and Nimble Streamer. For developer-level producers, MulticoreWare, MainConcept, and Beamr all have SDKs, and the handout details how to produce output using FFmpeg and Bento4.
10. What Does the Spec Say About High Dynamic Range (HDR)?
The Authoring Specification states that HDR video must be encoded as either HDR10 or Dolby Vision, and that HDR-encoded streams should be provided at all resolutions. If you provide HDR content, you should also provide SDR content for the main video files and trick play files, as well as H.264 content, boosting the stream count to potentially dozens of individual files.
Note that Apple doesn’t yet provide an example file with HDR, leaving several questions unanswered, such as whether the required H.264 content can also serve as the SDR content, or whether producers should also supply separate HEVC-encoded SDR streams (and trick play files). I’m guessing that Apple will always supply the most expansive (and expensive) way to meet the requirements stated in the Authoring Specification, leaving developers to choose their own configuration based on cost and the desired QoE.
It’s early days for HEVC in HLS, and the topic and technology will be fast moving. Hopefully, these questions and answers have helped you get off to a quick start.
[This article appears in the January/February 2018 issue of Streaming Media Magazine as "HEVC in HLS: 10 Key Questions."]