SSIMWave SQM Review: Frustrating Video Quality Measurement
To run SQM in batch mode, create a text file specifying the test video and source file, with optional controls for setting the offsets and number of frames processed. Load the text file into the program, which checks your inputs before allowing you to run the batch, another useful feature. SQM saves the results of each analysis as individual .csv files.
I tested performance first. SSIMWave’s website claims that SQM “performs the QoE of a 4K resolution video at more than 100 frames per second.” I ran SQM on an HP Z840 workstation with two 3.1GHz Intel Xeon E5-2687W v3 CPUs and 64GB of RAM running Windows 7, and all analyzed files were stored on HP’s Turbo SSD G2 drives.
Working in the GUI, I analyzed 10 seconds of three files using five device settings, comparing each to its source MP4 file. A 10-second segment of a 4K HEVC file took 56 seconds to analyze, a 10-second segment of a 720p HEVC file took 15 seconds, and a 10-second segment of a 1080p MP4 file took 16 seconds. When I preconverted the 1080p files to YUV and tested again, I saved only 2 seconds.
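To put those timings on the same footing as the website's frames-per-second claim, here is a rough Python sketch that converts them into analysis throughput. The 30 fps frame rate is an assumption (the clip frame rates aren't stated here), and the timings cover everything the GUI did, including all five device settings, so treat the output as a ballpark figure rather than a benchmark.

```python
def analysis_fps(clip_seconds, frame_rate, wall_seconds):
    """Frames analyzed per second of wall-clock time."""
    return clip_seconds * frame_rate / wall_seconds

# Assumption: the review does not state the clips' frame rates.
ASSUMED_FRAME_RATE = 30.0

# Wall-clock seconds needed to analyze a 10-second segment (from the tests above).
timings = {"4K HEVC": 56, "720p HEVC": 15, "1080p MP4": 16}

throughput = {
    name: analysis_fps(10, ASSUMED_FRAME_RATE, seconds)
    for name, seconds in timings.items()
}

for name, fps in throughput.items():
    print(f"{name}: {fps:.1f} frames analyzed per second")
```

At any plausible frame rate, the 4K result lands well below 100 frames per second.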
I asked SSIMWave about these discrepancies and learned that the website claims reflect a GPU-based implementation that’s available “upon request” and relate only to the SSIMplus computation, not the demuxing, decoding, and display performed by the GUI tool. I didn’t test the GPU-based implementation, so I can’t verify the company’s claims, though the company representative said SSIMWave will adjust the language on its website to clarify performance-related expectations.
I ran through several quality-related scenarios that I’ve used the Moscow VQMT tool for in the recent past. The first is shown in Table 1, wherein I analyze the encoding quality of two clips, an animation and a real-world video. For each clip, I created three short files at 3850Kbps, 4300Kbps, and 5800Kbps, which were the presets used by the client, and then a 5-minute test for the 3850Kbps preset to verify the results of the shorter tests. I encoded all files in CBR and 125 percent constrained VBR mode, and then tested with both tools. The green highlighted box identifies the quality leader, which for VQM is the lowest score, while for SQM it’s the highest.
Table 1. Comparing the MSU VQMT tool and the SSIMWave SQM tool
I was testing to answer two questions. First, did 125 percent constrained VBR produce better quality than CBR, and second, were all three 1920x1080 iterations necessary? As to the first question, the numerical results were very similar between the two tools; overall, VQM showed VBR better by 0.09 percent, while SQM showed VBR better by 0.87 percent.
The issue here is that this and other comparisons often hinge not on overall quality but on the quality of one or more frames within the file. This is shown in Figure 3, where the CBR frame looks mangled, and the VBR frame looks only slightly blocky. In GUI mode, the VQM tool makes these results easy to see by presenting a quality graph for the duration of the clip and allowing you to move through the clip and view the actual source and encoded frames. To produce the same analysis with SQM, you would have to run the tests, scan through the CSV file results to identify low-quality frames, load each video into a player, navigate to the frames, and grab them.
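The CSV-scanning step in that workflow can at least be scripted. Below is a minimal Python sketch that pulls the lowest-scoring frames out of a per-frame results file. The column names `frame` and `score` are hypothetical, since the actual layout of SQM's CSV output isn't described here, so adjust them to match the real files.

```python
import csv
import heapq
import io

def lowest_scoring_frames(csv_file, frame_col="frame", score_col="score", count=3):
    """Return (frame, score) pairs for the `count` lowest scores.

    With SQM, higher scores are better, so the lowest scores mark
    the frames most worth inspecting by eye.
    """
    reader = csv.DictReader(csv_file)
    rows = [(float(row[score_col]), int(row[frame_col])) for row in reader]
    return [(frame, score) for score, frame in heapq.nsmallest(count, rows)]

# Made-up sample data standing in for one results file.
sample = io.StringIO("frame,score\n0,91.2\n1,90.8\n2,62.4\n3,88.9\n4,71.0\n")
worst = lowest_scoring_frames(sample, count=2)
print(worst)
```

For a real file, pass a handle opened with `open(path, newline="")` in place of the sample. You would still need to load the video into a player and navigate to those frames yourself.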
Figure 3. While the quality ratings are similar, CBR can produce the occasional ugly frame.
The second question I was testing for was whether the client needed all three 1080p streams; after all, if the quality difference between the three was minimal, why incur the costs of encoding and delivering the higher-quality streams? In the animated clip, the difference in quality between the 5800Kbps and 3850Kbps files using VBR encoding was 1.27 percent for VQM and 0.35 percent for SQM, so no major difference there. In the real-world video, quality improved by 1.53 percent with VQM, but by 5.47 percent with SQM, where the score improved from a high-good quality (79.87) to a low-excellent quality (84.49). This is another scenario in which the ability to actually see the quality differences would have been very valuable.
HEVC vs. VP9
The second analysis relates to HEVC versus VP9 quality comparisons that I’ve been tracking since December 2014. The results were originally published in “The Great UHD Codec Debate: Google’s VP9 vs. HEVC/H.265,” and I updated the results for a presentation given at Streaming Media East in May (you can watch the video and download the presentation at streamingmedia.com/ConferenceVideos), then finalized the tests in June.
Table 2 shows the results. Again, the best scores are in green, and the results are close with both metrics. Overall, x265 scored 8.69 percent better in VQM, where lower scores are better, and 0.97 percent higher in SQM, where higher scores are better, though both VP9 and HEVC rated in the excellent range (80–100) in the SQM test.
Table 2. VQM vs. SQM analysis of VP9 vs. x265
Multiple Resolution Tests
The final tests related to SQM’s ability to test multiple resolution renditions against a common source, an analysis that can’t be performed in the Moscow University tool. Table 3 shows how a 640x360 file compares to an 854x480 file when encoded to the same data rate. This answers the question: If you can have only one file encoded at 1050Kbps, should you encode at 640x360 or 854x480?
Table 3. Multiple resolution tests across multiple presets with the SQM.
The 640x360 file has lower resolution, but with more bits per pixel, which means the encoded quality of each pixel should be higher than that of the other file. The 854x480 file has greater resolution, or detail, but the quality of each pixel is lower because of the lower bits-per-pixel value. The results presented in Table 3 indicate that the smaller 640x360 file would be perceived as higher quality on all viewed devices. Again, these results assume that the video is watched at full screen, and it would have been interesting to see if testing at the native resolution of the video, which would have required custom presets from SSIMWave, would have changed the results.
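The bits-per-pixel tradeoff is easy to quantify. This short Python sketch computes the value for both resolutions at 1050Kbps, using the usual definition of data rate divided by pixels per second. The 30 fps frame rate is an assumption, since the frame rate of the test files isn't given, though the ratio between the two files is the same at any frame rate.

```python
def bits_per_pixel(bitrate_kbps, width, height, frame_rate):
    """Data rate divided by the number of pixels delivered per second."""
    return bitrate_kbps * 1000 / (width * height * frame_rate)

# Assumption: the frame rate of the test files is not stated.
ASSUMED_FRAME_RATE = 30.0

bpp_360 = bits_per_pixel(1050, 640, 360, ASSUMED_FRAME_RATE)
bpp_480 = bits_per_pixel(1050, 854, 480, ASSUMED_FRAME_RATE)

print(f"640x360: {bpp_360:.3f} bits per pixel")
print(f"854x480: {bpp_480:.3f} bits per pixel")
print(f"640x360 gets {bpp_360 / bpp_480:.2f}x the bits per pixel")
```

Because both files share the same data rate, the ratio reduces to the ratio of pixel counts, or roughly 1.78 times as many bits per pixel for the 640x360 file.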
So where does that leave us? SQM has many unique features, including an absolute grading system, the ability to rate quality on different devices, and multiple-resolution testing, but provides little ability to actually see the differences that it measures. SSIMWave is working to address many of these concerns by the end of 2015, including video seeking in playback mode and the ability to visualize results for multiple clips. Until then, if you consider SSIMplus the holy grail of video quality algorithms, you’ll find SQM a highly usable and efficient tool for obtaining these ratings. If you need to be seduced into a new algorithm by confirming quality scores with your own eyeballs, you may want to wait until SSIMWave provides these features.
This article originally appeared in the October 2015 issue of Streaming Media magazine as “Review: SSIMWave.”