The Algorithm Series: Video Player Performance
In addition, citing Sitaraman's 2013 work on network performance and its impact on viewers, the BOLA authors write, "We consider two primary performance metrics that affect the overall QoE of the user." The first is "time-average playback quality which is a function of the bitrates of the chunks viewed by the user," and the second is the fraction of overall viewing time not spent rebuffering.
BOLA, they posit, is a way to keep the buffer from either consistently being depleted (underruns) or filling up completely. (See Figure 2 below.) There's a finite buffer size, which can be measured in the number of chunks or segments that can be held in a queue for playback. If the buffer is full, the player waits to request the next chunk; if the available bandwidth drops in that gap between requests, however, the requested chunk (which would be at the higher data rate) may take longer to download. This could cascade into a buffer underrun. The authors treat this repeated cycling (whether underruns or download pauses due to a full buffer) as oscillations that are often, but not always, caused by fluctuations in available bandwidth.
Yet, lest we assume that bandwidth selection changes will not occur when a viewer is consuming content at a constant bitrate, the BOLA authors point out an issue that's confounded buffer-filling solutions as far back as Burst Technologies and its somewhat unsavory inclusion in Windows Media Player 9: stable-bandwidth buffering choices. "Having a stable network bandwidth and widely-spaced thresholds still does not avoid all bitrate switching," they write.
Using the example of a viewer with a constant 2Mbps bandwidth and two ABR renditions of a show, one at 1.5Mbps and one at 3Mbps, the authors note that the player's performance could actually be detrimental when the buffer fills up: "While the player downloads 1.5Mbps chunks, the buffer keeps growing. When the buffer crosses the threshold the player switches to 3Mbps, depleting the buffer. After the buffer gets sufficiently depleted, the player switches back to 1.5Mbps, and the cycle repeats."
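The cycle the authors describe can be reproduced with a toy simulation of a simple threshold-based player on a constant 2Mbps link with 1.5Mbps and 3Mbps renditions. Every name, threshold, and segment duration below is an illustrative assumption, not a value from the BOLA paper:

```python
# Toy simulation (assumed values, not from the paper) of bitrate
# oscillation under a *constant* 2 Mbps bandwidth with a naive
# threshold-based switching rule.

SEGMENT_SECONDS = 4           # duration of each chunk
BANDWIDTH_MBPS = 2.0          # constant network bandwidth
UPPER_THRESHOLD = 20.0        # buffer seconds above which the player steps up
LOWER_THRESHOLD = 8.0         # buffer seconds below which it steps back down

def simulate(num_segments=30):
    buffer_s, bitrate = 10.0, 1.5
    history = []
    for _ in range(num_segments):
        # Threshold rule: step up to 3 Mbps when the buffer is full enough,
        # step back down to 1.5 Mbps once it has drained.
        if buffer_s > UPPER_THRESHOLD:
            bitrate = 3.0
        elif buffer_s < LOWER_THRESHOLD:
            bitrate = 1.5
        # Downloading a segment takes playback time out of the buffer;
        # each finished segment puts 4 s of video back in.
        download_s = SEGMENT_SECONDS * bitrate / BANDWIDTH_MBPS
        buffer_s = max(0.0, buffer_s - download_s) + SEGMENT_SECONDS
        history.append((round(buffer_s, 1), bitrate))
    return history

for buf, rate in simulate():
    print(f"buffer={buf:5.1f}s  bitrate={rate}Mbps")
```

Running the sketch shows exactly the cycle the authors describe: at 1.5Mbps the buffer grows by 1 second per segment, the player crosses the threshold and jumps to 3Mbps, the buffer drains by 2 seconds per segment, and the player falls back, over and over.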
If the end viewer wants to maintain constant quality, he or she has two choices: view the entire show at the lower-quality 1.5Mbps rendition or employ the old Burst Technologies trick and watch the entire show at a bitrate higher than the bandwidth at his or her disposal. The BOLA authors call this a choice "to maximize utility and play a part of the video in the higher bitrate of 3Mbps at the cost of more oscillations," but offer up solutions that focus on either oscillations (BOLA-O) or utility (BOLA-U). See Figure 3 for an illustration of how the BOLA algorithm responds to buffer levels.
The BOLA algorithm's ability to choose between higher or lower oscillations when switching among available content bitrates is accomplished in the last part of the algorithm through the introduction of bitrate capping. I asked Spiteri if it's accurate to describe bitrate capping as limiting the selection of renditions in the MPD or manifest file to those with a bitrate lower than the bandwidth currently available to the video player's device. He confirmed that it is, and cautioned against the erroneous tendency of some to equate bitrate capping with "bandwidth throttling," the third-rail term of the Net Neutrality debate.
"BOLA-O verifies that the higher bitrate is sustainable by comparing it to the bandwidth as measured when downloading the previous chunk," the BOLA authors write. "Since the motive is to limit oscillations rather than to predict future bandwidth, this adaptation does not drop the bitrate to a lower level than in the previous download" as a way to limit the buffer from growing too large, as it would if the lower Mbps rendition were downloaded.
The second option is to use BOLA to intentionally choose a content bitrate above the sustained bandwidth, with BOLA-U acting in accordance with the overarching principle of not filling the buffer too much. "Excessive buffer growth is avoided by allowing the bitrate to be one level higher than the sustainable bandwidth," the authors write. "[T]he added stability of BOLA-U pays off when using a small buffer size and BOLA-U achieves a larger utility than BOLA-FINITE. … In practice the lost utility is limited by the distance between encoded bitrates; if the next lower bitrate level is not far from the network bandwidth, then little utility will be lost."
Spiteri explained it to me in a bit more detail. "BOLA-U occasionally uses the bitrate just higher than the device bandwidth, thus obtaining a higher average bitrate," he said. "Of course it has to be occasional; downloading at a higher bitrate all the time leads to rebuffering. BOLA-U only downloads at such a high bitrate when the buffer level is sufficiently high, so it does not risk rebuffering."
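Spiteri's description of BOLA-U can be sketched the same way. The buffer threshold and ladder below are assumptions chosen for illustration; the real algorithm derives its decisions from BOLA's utility function rather than a fixed cutoff:

```python
# Rough sketch (assumed names and thresholds, not the paper's algorithm)
# of BOLA-U's behavior: pick the bitrate one level *above* the
# sustainable bandwidth, but only when the buffer is high enough that
# the slower download cannot cause rebuffering.

RENDITIONS_MBPS = [0.75, 1.5, 3.0, 6.0]  # hypothetical bitrate ladder
HIGH_BUFFER_S = 20.0                     # assumed "sufficiently high" level

def bola_u_choice(measured_throughput_mbps, buffer_s):
    sustainable = [i for i, r in enumerate(RENDITIONS_MBPS)
                   if r <= measured_throughput_mbps]
    idx = max(sustainable) if sustainable else 0
    # Occasionally exceed the sustainable bitrate by one level, but only
    # when the buffer has enough headroom to absorb the slower download.
    if buffer_s >= HIGH_BUFFER_S and idx + 1 < len(RENDITIONS_MBPS):
        idx += 1
    return RENDITIONS_MBPS[idx]

print(bola_u_choice(2.0, buffer_s=25.0))  # high buffer -> picks 3.0 Mbps
print(bola_u_choice(2.0, buffer_s=10.0))  # low buffer -> stays at 1.5 Mbps
```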
Spiteri also said there is empirical evidence that users stay engaged when content is presented at a higher bitrate and resolution, citing the paper "Understanding the Impact of Video Quality on User Engagement," which was presented at ACM SIGCOMM 2011.
So does this distance between encoded bitrates cause practical issues? "Understanding Video Streaming Algorithms in the Wild," a January 2020 paper by Melissa Licciardello, Maximilian Grüner, and Ankit Singla, seems to indicate that there's room for improvement when it comes to using more of the available bandwidth to increase end-user viewing quality. It measures real-world use of ABR video streaming algorithms on players across various online platforms.
The authors say, "We … find evidence that most deployed algorithms are tuned towards stable behavior rather than fast adaptation to bandwidth variations, some are tuned towards a visual perception metric rather than a bitrate-based metric, and many leave a surprisingly large amount of the available bandwidth unused." The authors don't address the conscious choice that the BOLA maximum utility approach takes toward stability, but they do point out an additional wrinkle: visual perception metrics.
In some ways, this may be a semantic distinction. For instance, the BOLA authors discuss "empirical evidence that the user is more engaged and watches longer when the video is presented at a higher bitrate," but the discussion is framed around the difference between standard-definition and high-definition content, so the engagement is more likely due to the content being visually more pleasing than to the fact that it's delivered at a higher bandwidth.
Yet, the use of visual perception metrics to tune playback is potentially fraught with peril, especially with earlier metrics like peak signal-to-noise ratio (PSNR), which has notorious examples of a visually incorrect image being rated as acceptable if PSNR is the sole factor. (See these lighthouse side-by-side pics for a good visual example.)
Is there more algorithmic work to be done in tweaking player performance? Yes.
Licciardello, Grüner, and Singla recently wrote "Reconstructing Proprietary Video Streaming Algorithms," a paper that details their research attempts to reverse engineer a number of proprietary scheduling algorithms, including BOLA. They were scheduled to present it at the 2020 USENIX Annual Technical Conference in July.
Not that the BOLA algorithm is standing still. In fact, in 2019, two of the authors of the original BOLA paper (Spiteri and Sitaraman), along with Daniel Sparacio, a colleague they'd thanked in the paper, published "From Theory to Practice: Improving Bitrate Adaptation in the DASH Reference Player," a research paper based on the observation that player-scheduling algorithms for ABR content generally fall along two lines: throughput-based and buffer-based. A better model, they contend, is a hybrid algorithm that uses "both throughput prediction and buffer levels in an attempt to exploit the advantages of both."
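The throughput-versus-buffer split can be made concrete with a minimal sketch. The handover point and ladder-mapping logic below are assumptions for illustration, not the paper's hybrid algorithm:

```python
# Minimal sketch (assumed logic) of the hybrid idea: lean on throughput
# prediction while the buffer is thin and the estimate is all the player
# has, then lean on buffer occupancy once enough video is queued.

RENDITIONS_MBPS = [0.75, 1.5, 3.0, 6.0]  # hypothetical bitrate ladder
STARTUP_BUFFER_S = 12.0                  # assumed handover point

def hybrid_choice(predicted_throughput_mbps, buffer_s, max_buffer_s=30.0):
    if buffer_s < STARTUP_BUFFER_S:
        # Throughput-based: highest rendition the prediction can sustain.
        ok = [r for r in RENDITIONS_MBPS if r <= predicted_throughput_mbps]
        return max(ok) if ok else RENDITIONS_MBPS[0]
    # Buffer-based: map buffer occupancy onto the bitrate ladder.
    frac = min(buffer_s / max_buffer_s, 1.0)
    idx = min(int(frac * len(RENDITIONS_MBPS)), len(RENDITIONS_MBPS) - 1)
    return RENDITIONS_MBPS[idx]

print(hybrid_choice(2.0, buffer_s=5.0))   # thin buffer -> trusts the estimate
print(hybrid_choice(2.0, buffer_s=30.0))  # full buffer -> buffer-driven choice
```

The advantage the authors cite is that each signal covers the other's weakness: throughput estimates work at startup when the buffer carries no information, while buffer occupancy is a more stable signal once playback is under way.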
To help move the hybrid approach forward, the three authors updated the BOLA algorithm to include an enhanced version called BOLA-E. This version introduces concepts such as a "virtual segment" that contains no video data and can be used to change buffer levels, as well as a "placeholder algorithm" to better allow BOLA to make informed bitrate-switching decisions. More importantly, BOLA has now been implemented in dash.js, the reference video player championed by the DASH Industry Forum (DASH-IF).
In addition, a new algorithm developed by the authors, called fast switching, has been implemented in the DASH-IF reference player. Fast switching rests on a novel concept: improving video quality by replacing "lower-bitrate segments in the client buffer with higher-bitrate segments" if bandwidth suddenly improves and there's time to re-populate the buffer with these higher-quality segments. This has the potential to improve low-latency performance while not forcing the viewer to endure a lower-quality experience for the remainder of the content.
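The fast-switching idea described above can be sketched as a pass over the already-buffered segments. The segment duration, safety margin, and function names here are illustrative assumptions, and real players would also account for downloads happening sequentially:

```python
# Illustrative sketch (assumed values/names) of fast switching: when
# bandwidth improves, re-download already-buffered low-bitrate segments
# at a higher bitrate, but only those far enough from the playhead that
# the replacement can arrive before playback reaches them.

SEGMENT_SECONDS = 4
SAFETY_MARGIN_S = 8.0  # don't touch segments about to be played

def fast_switch(buffered_bitrates, new_bitrate, bandwidth_mbps):
    """buffered_bitrates: bitrate (Mbps) of each queued segment, in
    playback order. Returns the buffer contents after replacements."""
    upgraded = list(buffered_bitrates)
    for i, rate in enumerate(buffered_bitrates):
        playhead_distance_s = i * SEGMENT_SECONDS
        download_s = SEGMENT_SECONDS * new_bitrate / bandwidth_mbps
        # Replace only if the segment is lower quality and there is time
        # to re-download it before playback reaches it.
        if rate < new_bitrate and playhead_distance_s - download_s > SAFETY_MARGIN_S:
            upgraded[i] = new_bitrate
    return upgraded

# Bandwidth jumps to 6 Mbps with five 1.5 Mbps segments queued: the
# later segments get upgraded, the imminent ones play as downloaded.
print(fast_switch([1.5] * 5, new_bitrate=3.0, bandwidth_mbps=6.0))
```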
Finally, Spiteri informed me that an updated version of the 2016 BOLA paper has been published, which addresses a few more details around the theory section and compares BOLA to a number of other algorithms. It also includes a change in notation. "While the original version uses bitrate m=1 to indicate the highest bitrate, the new version uses bitrate m=1 to indicate the lowest bitrate," said Spiteri, adding that the shift was "mostly to be consistent with the dash.js player, where the lower bitrates have a lower index."
With the explosion in streaming in the first half of 2020, including on-demand content viewing at home during lockdowns and the increased use of low-latency, multiple-participant web-conferencing software, the need for player performance optimization has never been greater. Fortunately, as this article tries to explain in rudimentary terms, the math behind the magic of player performance continues to build on fundamental algorithms, while novel and enhanced versions are being demonstrated and tweaked to deliver ever-better end-user viewing experiences.
Michelle Fore-Siglin provided technical assistance with this article.