Apple Changes Encoding Recommendations for HLS
Modifications to audio recommendations and segment size simplify things for producers across the board
From my perspective, Apple Technical Note TN2224 is the Rosetta Stone of adaptive streaming, a document that translates multiple factors into concrete recommendations that should be considered by all producers streaming to Apple devices. Which is to say, of course, virtually all streaming producers. So when Apple makes big changes in the Tech Note, as it did on February 28, 2014, it’s worth noting, even if I didn’t notice for a month. And there are two big changes.
First, and most significant, is that Apple now recommends using different audio bitrates for the different quality streams, starting at 64Kbps and scaling up to 128Kbps. In all previous versions of the Tech Note, Apple used 64Kbps across the board, which most authorities attributed to the potential for “popping” artifacts if the audio bitrate changed during a stream switch. Adobe’s adaptive presets took the same approach, and most white papers on the subject advised the same. Many of the producers who have shared their settings with Streaming Media over the years, including Turner Broadcasting and MTV Networks, use a single set of audio parameters for all streams in an adaptive group, though many others do boost audio quality in their higher-bitrate streams.
Adopting the most conservative path, I have always recommended that producers use the same audio parameters for all streams, as per TN2224. If they did scale the audio with the video, I recommended that they test to make sure that the artifacts didn’t occur. Going forward, it seems that increasing the quality of the audio with the video is the recommended course, though I would still test to make sure there is no popping or other artifacts.
The other major change is the adjustment of the recommended segment size from ten seconds to nine seconds, which cleared up the most enduring mystery of the Tech Note. As I wrote in How to Encode Video for HLS Delivery:
In terms of segment duration, the most confusing aspect of TN2224 is the recommendation of a segment size of ten seconds, and a keyframe interval of three seconds, as this wouldn’t seem to produce a keyframe at the start of each segment. Interestingly, the new default settings in Apple Compressor 4.1 follow these recommendations, creating a segment duration of ten seconds, but using a keyframe interval of three seconds.
In contrast, most authorities recommend making sure that the keyframe interval divides evenly into the segment size. For example, cloud encoder Zencoder’s well-written Best Practices for Encoding HLS Video states, “keyframe rate should be an even interval of the segment size.”
While I was writing the article, I had pinged an acquaintance on the TN2224 team, asking about this discrepancy, but never heard back. (The first rule of Fight Club is: You do not talk about Fight Club.) I’m not claiming any responsibility for the change, but it makes things a lot clearer for all producers.
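To make the arithmetic concrete, here is a minimal Python sketch of why ten-second segments and a three-second keyframe interval don't line up, while nine and three do. It assumes a fixed keyframe interval measured in whole seconds and an encoder that does not force extra keyframes at segment boundaries; the function names are mine, made up for illustration, and are not taken from TN2224 or any encoding tool.

```python
def interval_divides_segment(segment_seconds, keyframe_seconds):
    """True if the keyframe interval divides evenly into the segment size."""
    return segment_seconds % keyframe_seconds == 0


def segment_starts_on_keyframe(segment_seconds, keyframe_seconds, segment_count):
    """For each of the first segment_count segments, report whether its start
    time falls exactly on a keyframe.

    Assumption: keyframes occur only at multiples of keyframe_seconds,
    starting at time zero, with none forced at segment boundaries.
    """
    return [
        (index * segment_seconds) % keyframe_seconds == 0
        for index in range(segment_count)
    ]


if __name__ == "__main__":
    for segment_seconds in (10, 9):
        print(
            segment_seconds,
            interval_divides_segment(segment_seconds, 3),
            segment_starts_on_keyframe(segment_seconds, 3, 4),
        )
```

With a three-second interval, ten-second segments begin at 0, 10, 20, and 30 seconds, and only 0 and 30 are multiples of three, so the second and third segments would not open on a keyframe under these assumptions. Nine-second segments begin at 0, 9, 18, and 27 seconds, all of which are.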
Apple also dropped a stream from the group, specifically one of the 960x540 streams that I’m not sure many producers actually used anyway, and adjusted some of the data rates, including pushing the remaining 960x540 stream from 2.5Mbps to 3.5Mbps, which feels high to me. In general, all stream recommendations beyond 640x360 @ 1200Kbps are a bit rich for my blood data rate-wise, and probably could be ratcheted down a bit. Otherwise, TN2224 has always presented a logical and very useful starting point for producers seeking to produce one set of streams for mobile and desktop playback. Overall, if you based your adaptive encoding settings on TN2224, particularly the audio parameters or the segment size, it’s time to have another look.