Sure-Fire Tips for Encoding High-Quality, Low-Bandwidth Audio, Part 2
In this tutorial, we explain how to get the best results when encoding audio for low bandwidths. By low bandwidths, we mean audio streams of up to 20Kbps for mobile phone networks and for 28.8Kbps and 56Kbps dial-up connections. Anyone with more bandwidth to spare might want to listen up, too: every bit you squeeze out of the stream while still delivering the audio quality your audience wants saves you money on bandwidth delivery costs.
Editor’s note: While the following tips, tricks and techniques have been tested in real-world settings, your results may vary slightly depending on your audio content. We suggest you use the following tutorial as a foundation with which to begin experimenting.
Last week, we covered what you’ll need to get started and some recording tips. This week, we outline editing and encoding tricks.
Tip 16: Choose uncompressed, digitized files such as WAV and AIFF files over MP3 or other compressed files to eliminate one more file conversion step. Also, avoid type I cassette and VHS tapes, which can be noisy. CD and DAT sources are preferable.
Tidbit: Music ripped from a CD or DAT has a better signal-to-noise ratio than audio brought in through the analog circuitry of a capture board. However, CD and DAT audio generally has a wider dynamic range and a higher sample rate (44.1kHz or 48kHz) than a codec can comfortably pass through low-bandwidth links. You’ll need to down-sample, equalize and normalize the audio before passing it through the encoder.
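As a rough illustration of the normalize step, the sketch below scales a block of samples so that its loudest peak sits at a chosen level. It assumes floating-point samples in the range -1.0 to 1.0. The function name `normalize_peak` and the -3dBFS default are illustrative choices, not values taken from this tutorial or from any particular editor.

```python
def normalize_peak(samples, target_dbfs=-3.0):
    """Scale samples so the largest absolute value lands at target_dbfs.

    samples: floats in the range -1.0..1.0 (assumed format).
    target_dbfs: desired peak level in dB relative to full scale
                 (hypothetical default, adjust to taste).
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silence (or an empty clip) has no peak to scale.
        return list(samples)
    target_amplitude = 10.0 ** (target_dbfs / 20.0)
    gain = target_amplitude / peak
    return [s * gain for s in samples]


quiet_clip = [0.05, -0.20, 0.10, -0.02]
louder_clip = normalize_peak(quiet_clip)
print(max(abs(s) for s in louder_clip))
```

Leaving a little headroom below full scale is one way to reduce the risk of clipping later in the chain.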
Tip 17: Use your audio editor to cut out the frequencies at the top and the bottom of the audio spectrum. Start at the low end of the spectrum. Cut low frequencies below 120Hz if it’s music, and below 200Hz to 300Hz if it’s voice only; voice carries little useful frequency information below about 250Hz. Then move to the high end of the spectrum. Cut high frequencies above 6kHz if it’s music, and above 5kHz if it’s human voice. A soft, first-order roll-off (6dB per octave) is preferable.
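For readers who want to see what a soft, first-order roll-off does to a signal, here is a minimal sketch using simple one-pole (RC-style) high-pass and low-pass filters. It assumes floating-point samples; the function names are made up for this example, and a real audio editor will use its own filter designs.

```python
import math


def first_order_high_pass(samples, sample_rate, cutoff_hz):
    """One-pole high-pass: about 6dB per octave roll-off below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in samples:
        prev_y = alpha * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out


def first_order_low_pass(samples, sample_rate, cutoff_hz):
    """One-pole low-pass: about 6dB per octave roll-off above cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = dt / (rc + dt)
    out, prev_y = [], 0.0
    for x in samples:
        prev_y = prev_y + alpha * (x - prev_y)
        out.append(prev_y)
    return out


def band_limit(samples, sample_rate, low_cut_hz, high_cut_hz):
    """Trim the bottom, then the top, of the spectrum."""
    trimmed = first_order_high_pass(samples, sample_rate, low_cut_hz)
    return first_order_low_pass(trimmed, sample_rate, high_cut_hz)


def tone(freq_hz, sample_rate, seconds):
    """Generate a test sine wave."""
    count = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(count)]


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Music settings from the tip: keep roughly 120Hz to 6kHz.
rate = 44100
for freq in (30.0, 1000.0, 15000.0):
    source = tone(freq, rate, 0.5)
    filtered = band_limit(source, rate, 120.0, 6000.0)
    half = len(source) // 2  # skip the filter's start-up transient
    print(freq, round(rms(filtered[half:]) / rms(source[half:]), 2))
```

Because the slope is gentle, content just outside the band is reduced rather than removed outright.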
Tip 18: If you hear a low-frequency hum (typically caused by AC power interference), apply a 60Hz notch filter in your editor to remove it.
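A notch filter of this kind can be sketched with a standard biquad section. The coefficients below follow the widely published "Audio EQ Cookbook" notch formulas. The Q value of 10 is an assumption chosen for illustration, and the hum frequency is a parameter because mains power runs at 50Hz rather than 60Hz in many regions.

```python
import math


def notch_coefficients(sample_rate, notch_hz=60.0, q=10.0):
    """Return normalized biquad coefficients (b0, b1, b2), (a1, a2) for a notch."""
    w0 = 2.0 * math.pi * notch_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0)
    a = (-2.0 * cos_w0 / a0, (1.0 - alpha) / a0)
    return b, a


def biquad(samples, b, a):
    """Run samples through a direct-form I biquad section."""
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def remove_hum(samples, sample_rate, hum_hz=60.0, q=10.0):
    """Attenuate a narrow band around hum_hz (q is an assumed setting)."""
    b, a = notch_coefficients(sample_rate, hum_hz, q)
    return biquad(samples, b, a)


def tone(freq_hz, sample_rate, seconds):
    """Generate a test sine wave."""
    count = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(count)]


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


rate = 22050
hum = tone(60.0, rate, 1.0)
cleaned = remove_hum(hum, rate)
half = len(hum) // 2  # measure after the filter has settled
print(rms(cleaned[half:]) / rms(hum[half:]))
```

A higher Q narrows the notch so that less of the neighboring bass is affected, at the cost of a longer settling time.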
Tidbit: The human ear is especially sensitive to frequencies between 1.5kHz and 4kHz (sensitivity falls off above and below this band, as the Fletcher-Munson curves show). These midrange frequencies get the most bang for the bandwidth, so record and edit with a preference for these frequencies.
Tip 19: Some voice-only clips can be made more intelligible by slightly boosting frequencies in the 1kHz to 4kHz range with your editor’s EQ function.
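To make the idea of a slight midrange boost concrete, here is a sketch of a peaking EQ built from the "Audio EQ Cookbook" biquad formulas. The specific settings are assumptions for illustration only: a 2kHz center (the geometric middle of 1kHz and 4kHz), a 3dB boost, and a Q of about 0.67 for a band roughly two octaves wide.

```python
import math


def peaking_coefficients(sample_rate, center_hz, gain_db, q):
    """Return normalized biquad coefficients (b0, b1, b2), (a1, a2) for a peaking EQ."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha / amp
    b = ((1.0 + alpha * amp) / a0, -2.0 * cos_w0 / a0, (1.0 - alpha * amp) / a0)
    a = (-2.0 * cos_w0 / a0, (1.0 - alpha / amp) / a0)
    return b, a


def biquad(samples, b, a):
    """Run samples through a direct-form I biquad section."""
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def boost_voice_band(samples, sample_rate, center_hz=2000.0, gain_db=3.0, q=0.67):
    """Apply a gentle midrange boost (all defaults are assumed settings)."""
    b, a = peaking_coefficients(sample_rate, center_hz, gain_db, q)
    return biquad(samples, b, a)


def tone(freq_hz, sample_rate, seconds):
    """Generate a test sine wave."""
    count = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(count)]


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


rate = 22050
midrange = tone(2000.0, rate, 0.2)
boosted = boost_voice_band(midrange, rate)
half = len(midrange) // 2  # measure after the filter has settled
print(rms(boosted[half:]) / rms(midrange[half:]))
```

Any boost raises peak levels, so it is worth checking for clipping afterward.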
Tip 20: Except in live streaming scenarios, use your software editor rather than the encoder to down-sample. The resampling function built into the encoder is generally designed for speed so that it can run in real time, whereas a software editor’s resampling feature is usually tuned for quality first and speed second.
Tip 21: Listen to your encoded audio on every player your audience is likely to use. Some players will hard clip audio, sharply cutting off peaks, even if no clipping occurred during encoding. If this is a recurring problem with a particular player, be especially careful to keep input levels low.
Tip 22: Get the file as small as possible using linear methods before applying codec compression to it. The general rule of thumb is to first change stereo to mono. Then bring the sample frequency down, and only as a last resort reduce the bit depth. A WAV file converted to 16-bit mono at a 22kHz sample frequency and then encoded for 20Kbps will sound better than the same file encoded from 8-bit stereo at a 44.1kHz sample frequency. To hear how this tip works in practice, check out the audio samples at: http://www.users.qwest.net/~ccadams/AudioDemo/
The AndyCompOriginal.wav is a voice and music sample from Warner Brothers’ "Man on the Moon" soundtrack CD.
The AndyComp22Kmono_20KbpsMono.rm file was down-sampled from the WAV’s original 44.1kHz to a 22kHz sample rate and converted to mono, then compressed to 20Kbps mono using Cool Edit Pro.
The AndyComp44Kstereo_20KbpsStereo.rm file was neither re-sampled nor converted to mono, but compressed directly to 20Kbps stereo using Cool Edit Pro. Notice the watery, wavering, fluttering sound compared with the better-sounding 22kHz mono file.
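The two source formats compared in Tip 22 can be put side by side with a little arithmetic. The sketch below works out the raw PCM data rate of each and how hard a 20Kbps codec has to squeeze it. It assumes that "22kHz" means 22,050Hz and that 1Kbps is 1,000 bits per second; the function names are made up for this example.

```python
def pcm_bits_per_second(sample_rate_hz, bit_depth, channels):
    """Raw (uncompressed) PCM data rate in bits per second."""
    return sample_rate_hz * bit_depth * channels


def compression_ratio(sample_rate_hz, bit_depth, channels, target_bps):
    """How many times smaller the encoded stream is than the raw PCM."""
    return pcm_bits_per_second(sample_rate_hz, bit_depth, channels) / target_bps


TARGET_BPS = 20_000  # 20Kbps stream, assuming 1Kbps = 1,000 bits per second

# 22,050Hz, 16-bit, mono: 22,050 x 16 x 1 = 352,800 bits per second
mono_rate = pcm_bits_per_second(22_050, 16, 1)

# 44,100Hz, 8-bit, stereo: 44,100 x 8 x 2 = 705,600 bits per second
stereo_rate = pcm_bits_per_second(44_100, 8, 2)

print(mono_rate, compression_ratio(22_050, 16, 1, TARGET_BPS))
print(stereo_rate, compression_ratio(44_100, 8, 2, TARGET_BPS))
```

Under these assumptions the 8-bit stereo source carries twice as much raw data as the 16-bit mono source, so the codec must discard roughly twice as much to hit the same 20Kbps target.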
Tip 23: Avoid setting the sample frequency below 22kHz. There’s a strong psychoacoustic breakpoint at 11kHz of audio bandwidth, which translates to a 22kHz sample frequency because a sampled signal can only carry frequencies up to half its sample rate. Human hearing is attuned psychoacoustically to frequencies up to 11kHz, so dropping the sample rate below 22kHz is far more noticeable to the ear than any reduction that stays above 22kHz. Settings below 22kHz can cause a whistling, bell-like noise.
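The relationship between sample frequency and audio bandwidth in Tip 23 is simply a factor of two. The short sketch below tabulates it for a few common sample rates; the 11,000Hz threshold stands in for the 11kHz breakpoint mentioned in the tip, and the function names are illustrative only.

```python
def audio_bandwidth_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2.0


def preserves_breakpoint(sample_rate_hz, breakpoint_hz=11_000.0):
    """True if the sample rate keeps audio up to the assumed 11kHz breakpoint."""
    return audio_bandwidth_hz(sample_rate_hz) >= breakpoint_hz


for rate in (44_100, 22_050, 11_025, 8_000):
    print(rate, audio_bandwidth_hz(rate), preserves_breakpoint(rate))
```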