Online Music Services Require Improved Audio
With Pandora continuing to grow, and rumored streaming music services on the way from Apple and Google, it's time to focus on the challenges of processing and encoding sound for the cloud
[This article appeared in the April/May issue of Streaming Media magazine under the title "Sound in the Cloud."]
Last year was the year of the Facebook movie (or at least the year of Mark Zuckerberg, its ubiquitous nerd). But 2011 is shaping up to be the year of streaming music, and there’s no monolithic personality behind this juggernaut, not even Steve Jobs, whose streaming music proposition for iTunes is reportedly launching this year. It will have to compete with a couple dozen other streaming music entities, including the smartly helmed Pandora, which has been cutting deals to get under our skin through whole-house automation systems and automotive audio. Spotify, the Swedish service that gave music streaming its best traction in Europe with 10 million subscribers, keeps delaying its arrival in the U.S., which remains the world’s biggest market, with music sales worth $4.6 billion in 2009. Last.fm, the U.K. service that CBS acquired in 2007, topped 40 million users and offers a well-rounded array of content besides music. Others have been waiting for the techno-entertainment culture to migrate from owning file-based media to pulling it down from anywhere, anytime.
The shift from downloading to streaming is tectonic by digital standards, and, like the transition from optical media to downloads before it, it will be accompanied by much teeth-gnashing and lawyer wrangling as royalty payment standards are worked out as much in the courts as in the smoke-free back rooms of Palo Alto, Calif. In fact, London’s Financial Times asserts that attempts by Spotify and others to launch in the U.S. have been delayed mainly by the complexity of negotiating licensing terms with record companies and music publishers, not by the underlying technology.
The Sound of Streaming Music
But what the transition means for the audio quality of music is less murky. Just as download files were finding an audiophile channel, with bulked-up 256Kbps versions and even lossless WAV, FLAC, and ALAC files making their way around the internet, streaming has come along and presented music with an even broader landscape.
“Wouldn’t it be nice if one of these days people who write music and people who write code could actually get together with each other?” asks Robert Reams, a tech entrepreneur who sold his most recent company, Neural Audio, to DTS, Inc., 2 years ago. He says that the processing of music is going to have to change to accommodate streaming. Reams says that spatial (imaging) and temporal (timing) differentials between the left and right channels of a mix are to blame for most of the artifacts that lossy codecs such as MP3 and AAC create when they process music.
“Dynamic spatial and energy offsets within the content affect the coding efficiency of modern lossy codecs,” Reams explains. “Any time the content is mixed [in such a way as to create these conflicts], the sound may be rendered needlessly flawed. Linear formats [i.e., magnetic media] are forgiving in regards to sloppy spatial mapping, perceptually irrelevant spectra, and noise. Lossy coding is not.”
The offsets Reams refers to are created as part of the music mix, which is inherently unbalanced. Differences of as little as 1 dB between the same information in each stereo channel can cause the codec to view it as an error rather than a work of art. Ideally, the effect of a codec should be addressed at the point the music is mixed (and MP3 developer Fraunhofer-Gesellschaft debuted something exactly like that at the NAMM Show in Las Vegas in January, which we’ll get to in a minute). But, Reams acknowledges, with the biggest news of last year being the arrival of The Beatles on iTunes, the vast majority of music that will be streamed will be legacy content, anything from a year old to 5 decades old.
“If the mastering [deck’s] dead azimuth was not perfectly aligned, the audio on the left channel may not be perfectly synchronized with the audio on the right, a little before or a little after. An offset of even one sample can generate enough energy to induce a problem with the [codec] processing,” he explains.
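The effect Reams describes is easy to see in miniature. The sketch below is an illustration only, not his analysis: the test signal, sample rate, and `side_energy` helper are invented for the example. It delays one channel of a stereo sine by a single sample and measures how much energy lands in the side (L - R) signal, energy a joint-stereo codec must then spend bits encoding:

```python
import numpy as np

# Illustrative sketch: a 1 kHz sine at a 44.1 kHz sample rate.
sr = 44100
t = np.arange(sr) / sr
left = np.sin(2 * np.pi * 1000 * t)

# Perfectly aligned stereo: the right channel is identical to the left.
right_aligned = left.copy()

# Azimuth-style error: the right channel lags by one sample.
# (np.roll wraps the last sample to the front; close enough for a demo.)
right_offset = np.roll(left, 1)

def side_energy(l, r):
    """Energy in the side (L - R) signal, which a joint-stereo
    codec must encode on top of the mid (L + R) signal."""
    return np.sum(((l - r) / 2) ** 2)

# Aligned channels leave the side signal empty; a one-sample
# offset fills it with energy that was never part of the mix.
print(side_energy(left, right_aligned))  # exactly 0.0
print(side_energy(left, right_offset))
```

The aligned pair produces a silent side channel, while the one-sample offset produces substantial side energy, which is the "problem with the processing" Reams refers to.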
Clean Streaming Music Tracks
Tim Carroll, founder and president of Linear Acoustic, Inc., which makes processing equipment used in streaming and file delivery, emphasizes that starting with a clean track is crucial to music surviving the streaming process with as much integrity as possible. “Most codecs are pretty benign these days—AAC at 224Kbps compared to what we’ve been used to is pretty spectacular, and the HE-AAC data rate can be lower but still maintain the same quality,” he says.
However, the data rates needed to stream fully lossless music files are high and are not going to be the norm in the streaming environment anytime soon, especially to mobile users. Thus, Carroll stresses, the music-mastering stage of the record-making process will be critical, and it will have to work even harder to avoid migrating the problems caused by the ongoing “loudness war” to the streaming environment. The phrase refers to an unspoken, but very real, competition on the part of music content distributors (i.e., record labels) to digitally master and release recordings with higher real, and perceived, levels of loudness. Labels and artists are looking for records that stick out aurally, simply by being louder, whether on the radio or in a club. The practice goes back decades, to the days when Motown Records founder Berry Gordy Jr. would have his staff engineers analyze the top 10 records every week, including how relatively loud they were, so he could have a benchmark for the loudness of Motown’s singles. The result is a loud record, perceptually speaking, but it is often sonic mush, artistically speaking, with the dynamic range of each individual instrument blunted and subsumed by the force of the track’s overall level.
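One common way to quantify what loudness-war mastering does is the crest factor, the ratio of peak level to RMS level. The sketch below is illustrative only (the toy "track," gain values, and `crest_factor_db` helper are invented for the example): pushing the gain and clipping the overs raises the average level while collapsing the distance between peaks and average, the blunting described above.

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr
# A toy "track": a steady 1 kHz tone with a kick-like transient on top.
track = 0.3 * np.sin(2 * np.pi * 1000 * t)
track[:200] += 0.7 * np.exp(-np.arange(200) / 40.0)

def crest_factor_db(x):
    """Peak-to-RMS ratio in dB; lower means less dynamic range."""
    return 20 * np.log10(np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2)))

# "Loudness war" mastering: crank the gain, hard-clip whatever overloads.
loud = np.clip(4.0 * track, -1.0, 1.0)

print(round(crest_factor_db(track), 1))  # the original dynamic range
print(round(crest_factor_db(loud), 1))   # several dB lower after clipping
```

The clipped version is louder on average yet has a crest factor many dB lower, which is exactly the trade the loudness war makes.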
“When you master a [track] to the point of clipping, every transient, like the kick drum, becomes a square wave, and the processing engine is going to have the hardest time trying to code that, because [clipping] creates harmonics that distract the codec,” Carroll explains. “That’s just bitrate wasting, because the codec is having to work harder. Codecs choke on clipping artifacts. In a low bitrate environment like streaming, the name of the game is cleanliness from the beginning, free of clipping and square waves and overloads; stuff you can’t recover from.”
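Carroll's point about clipping can be demonstrated in a few lines. This sketch is illustrative only (the tone, the clipping threshold, and the `harmonic_level` helper are invented for the example): hard-clipping a pure sine creates odd harmonics that were not in the original signal, spectral content the codec must then waste bits encoding.

```python
import numpy as np

sr = 48000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)  # a pure 1 kHz sine: one spectral line

# Hard-clip at half amplitude, as an overdriven master would.
clipped = np.clip(1.5 * tone, -0.75, 0.75)

def harmonic_level(signal, freq_hz):
    """Spectral magnitude at a given frequency (bins are 1 Hz wide here,
    since the signal is exactly one second long)."""
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
    return spectrum[int(freq_hz)]

# The clean sine has essentially no energy at the 3rd harmonic (3 kHz);
# the clipped version does -- new harmonics distracting the codec.
print(harmonic_level(tone, 3000))
print(harmonic_level(clipped, 3000))
```

The clean tone shows only numerical noise at 3 kHz, while the clipped tone shows a prominent third harmonic, the kind of square-wave content Carroll says codecs choke on.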