How to: Video Quality Optimization
The idea is that, with enough end users downloading a U-vMOS app, conducting their own video quality tests, and uploading the results, a common body of crowd-sourced test results would emerge.
A major issue with this approach, though, is that a smartphone or tablet app can be used in a wide variety of settings, so it would be difficult to baseline video quality opinion scores without knowing what ambient or environmental conditions the end user faced while scoring a particular video.
Should End-User Measurements Be Used?
Conviva and THX built on this basic concept in early 2017, joining forces to use Conviva’s real-user measurements (RUM) to scale out these kinds of tests.
The idea behind the partnership is to allow THX to pitch a certification model similar to the one it has applied to movie theaters over the past two decades. Conviva, for its part, would provide the data that THX could use to attempt to certify video streaming quality.
“The program will help streaming companies perfect their encoding and delivery methods for various challenges specific to streaming,” the companies noted in a Conviva press release, adding that this would assist in “guaranteeing consumers experience the highest possible quality of audio and visuals.”
It’s true that THX has expertise in “picture quality, encoding and compression techniques, and content packaging,” but it’s also true that the Conviva RUM data won’t contain information about ambient or environmental factors surrounding the end-user’s perception of quality of experience or delivery.
Can Machine Vision Mimic Human Vision?
In July 2016, the Journal of Visual Communication and Image Representation published a paper that may lead to common ground on how to perform HVS-based quality assessment.
Co-authored by Huawei researcher Yin Zhao and titled “Spatial Quality Index Based Rate Perceptual-Distortion Optimization for Video Coding,” the article describes two key elements: which HVS properties to use in quality measurements and how to build a spatial quality index so that an automated metric can “predict video quality much close[r] to subjective judgments.”
The study, led by Zhejiang University researchers, notes that there are two key HVS elements to consider when building a quality measurement tool: contrast masking effect (CME) and motion masking effect (MME).
These two HVS effects are then used to measure perceptual distortion in video coding, with the results applied to a rate perceptual-distortion optimization (RpDO) algorithm.
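The paper’s mathematics are beyond the scope of this article, but the general shape of the idea can be sketched: instead of minimizing the classic Lagrangian cost J = D + λR with raw distortion D, the distortion term is weighted by how visible it actually is, so blocks where contrast or motion masking hides errors can tolerate more distortion. The following is a minimal illustrative sketch, not the paper’s actual formulas; the function names and the weighting model are assumptions.

```python
# Illustrative sketch of rate perceptual-distortion optimization (RpDO).
# The masking model below is a stand-in, not the paper's actual math.

def perceptual_weight(contrast_masking, motion_masking):
    """Higher masking -> distortion is less visible -> lower weight."""
    return 1.0 / (1.0 + contrast_masking + motion_masking)

def rpdo_cost(distortion, rate, contrast_masking, motion_masking, lam=0.2):
    """Lagrangian cost J = w*D + lambda*R with a perceptual weight w."""
    w = perceptual_weight(contrast_masking, motion_masking)
    return w * distortion + lam * rate

def choose_mode(candidates, contrast_masking, motion_masking, lam=0.2):
    """Pick the (distortion, bits) coding mode with the lowest perceptual cost."""
    return min(candidates,
               key=lambda dr: rpdo_cost(dr[0], dr[1],
                                        contrast_masking, motion_masking, lam))

candidates = [(10.0, 100.0), (30.0, 40.0)]  # (distortion, bits) per mode
# Flat, static block: errors are visible, so the low-distortion mode wins.
print(choose_mode(candidates, contrast_masking=0.0, motion_masking=0.0))  # (10.0, 100.0)
# Textured, fast-moving block: masking hides errors, so the cheap mode wins.
print(choose_mode(candidates, contrast_masking=2.0, motion_masking=2.0))  # (30.0, 40.0)
```

The point of the weighting is visible in the example: the same two candidate modes produce different winners depending on how much the surrounding content masks the distortion.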
Per Video (or Even Per Scene)
To get to true optimization, as mentioned before, it will be necessary to move beyond a per-title approach and to a per-video approach—or even a per-scene approach, computing power permitting—either during the encoding or transcoding stages.
Some companies, such as EuclidIQ, a company I worked for in late 2015 and early 2016, take the approach of applying perceptual quality optimization at the time of initial encoding. This requires applying spatial quality algorithms within a frame and also looking ahead a few frames to apply temporal quality algorithms. To avoid slowing down the overall encoding process, the perceptual quality optimization should only look forward a few frames, rather than across the length of a typical HLS segment (2–10 seconds).
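That lookahead trade-off can be sketched simply: temporal analysis buffers only a small, fixed window of future frames, so added encode latency stays bounded no matter how long the segment is. The sketch below is illustrative only; the frame representation (a single mean-luma value per frame) and the activity measure are assumptions.

```python
# Sketch of bounded lookahead for temporal analysis: only a few future
# frames are buffered, instead of a whole 2-10 second HLS segment.
# Each "frame" here is just a mean-luma float, a deliberate simplification.

LOOKAHEAD = 4  # frames of future context, independent of segment length

def temporal_activity(frames, i, lookahead=LOOKAHEAD):
    """Mean absolute luma change over at most `lookahead` future frames."""
    window = frames[i:i + lookahead + 1]
    if len(window) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(window, window[1:])]
    return sum(diffs) / len(diffs)

# A static scene followed by a hard cut: activity rises only as the cut
# enters the lookahead window, and no measurement ever inspects more
# than LOOKAHEAD frames ahead.
frames = [16.0] * 8 + [200.0] * 8
print([round(temporal_activity(frames, i), 1) for i in range(len(frames))])
```

Because the window is capped at a few frames, the cost of the temporal analysis is constant per frame rather than growing with segment duration, which is the property the text describes.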
Other companies, such as Beamr, take a post-processing approach, which offers a bit more automation but requires considerable brute force: the file is re-encoded multiple times in an attempt to find the level at which the processing begins to introduce artifacts.
Going beyond that point is counterproductive. Nigel Lee, chief science officer at EuclidIQ, explains the video-bitrate breakdown (VBB) point in detail in a series of blog posts.
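The search for that breakdown point can be sketched as a simple loop: step the bitrate down, re-encode, measure, and stop just before a perceptual metric falls below an acceptability threshold. The sketch below is a toy illustration, not Beamr’s or EuclidIQ’s method; the mock quality function stands in for a real encode-and-measure pass.

```python
# Toy sketch of the brute-force breakdown-point search: keep lowering the
# bitrate while measured quality stays acceptable. In a real pipeline each
# quality() call would be a full re-encode plus a perceptual measurement.

def mock_quality(bitrate_kbps):
    """Stand-in for encode + perceptual scoring (monotonic in bitrate)."""
    return min(100.0, 60.0 + bitrate_kbps / 100.0)

def find_breakdown_bitrate(start_kbps, step_kbps, threshold, quality=mock_quality):
    """Return the lowest bitrate whose quality score stays >= threshold."""
    rate = start_kbps
    while rate - step_kbps > 0 and quality(rate - step_kbps) >= threshold:
        rate -= step_kbps
    return rate

# Starting at 6 Mbps and stepping by 500 Kbps, stop where quality would
# drop below the acceptability threshold on the next step.
print(find_breakdown_bitrate(6000, 500, threshold=90.0))  # 3000
```

The expense the text describes is visible in the structure: each loop iteration is a full re-encode and measurement, which is why this approach is brute force compared with optimizing during the initial encode.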
When Beamr acquired Vanguard Video in early 2016, the combination allowed Beamr to begin integrating at the encoding level. Vanguard is known in the industry for above-average codecs, and tight integration between optimizer and encoder is key for any company looking to reduce overall bandwidth while maintaining perceptual quality.
Other companies are exploring optimization as well, and 2017 will undoubtedly bring more clarity to what Streaming Media’s Dan Rayburn called the ecosystems surrounding encoder (and codec) advancements. The Alliance for Open Media (AOM) has already shown a bit of progress in optimizing the new AV1 codec, thanks to past work done by Mozilla and Xiph on the Daala codec and by Google on the VP10 codec.
This encoder-optimization approach is tailor-made to allow media publishing companies to pursue encoding on a per-video level, thanks to the tight integration between optimizing algorithms and well-respected encoders.
Yet to truly increase adoption and ease the burden on all those golden eyes from a quality control standpoint, the industry also needs a set of tools that truly mimic the human visual system. Until we have tools that move beyond PSNR and SSIM and toward more human-like perception of quality errors, the labor cost of optimizing videos to their best possible quality may prove too expensive for all but the highest-valued premium content.
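To see why PSNR in particular falls short, note that it is a pure pixel-difference measure: PSNR = 10·log10(MAX²/MSE), so it scores every error equally regardless of where a human eye would notice it. The sketch below uses the standard formula on toy pixel lists to show two different error patterns that PSNR cannot tell apart.

```python
import math

# PSNR weighs every pixel error the same: two images with identical MSE get
# identical PSNR, even if one hides its errors where masking makes them
# invisible and the other concentrates them where viewers notice instantly.

def psnr(ref, test, max_val=255.0):
    """PSNR = 10*log10(MAX^2 / MSE) over two equal-length pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / mse)

ref = [100] * 8
spread = [102, 102, 102, 102, 100, 100, 100, 100]  # small error, spread out
lumped = [104, 100, 100, 100, 100, 100, 100, 100]  # one concentrated error

# Same MSE (2.0), so PSNR is identical for both error patterns.
print(psnr(ref, spread) == psnr(ref, lumped))  # True
```

An HVS-aware metric would weight those two patterns differently depending on the surrounding content; PSNR, by construction, cannot, which is the gap the text argues the industry still needs to close.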
This article appears in the March 2017 issue of Streaming Media magazine.