Streaming Media


Are You So Focused On Video That You’re Neglecting Audio?

Even when your video is perfect, nothing sours a streaming experience more quickly than poor audio. Whether you are running the whole show yourself or receiving an audio feed from the "sound guy," there are several areas you need to pay attention to when combining audio into your live video stream.

This article is sponsored by Epiphan Systems.

When people think about video streaming, the picture is usually the first thing that comes to mind. But audio is also very important, and it plays a huge role in helping you convey the message of your video. This is a primer on audio to get you speaking the right lingo with the sound guys, and to help you understand what an audio waveform looks like and how to treat it properly. This will help ensure that you get your audio into your video in a clean and efficient manner.

It really does make a big difference for your audience when you have pleasing audio. Generally, when we refer to “pleasing audio,” we mean audio that’s nice and clear with no distortion. It's not so quiet that listeners have to strain to hear it, and it doesn't vary too much, with very quiet sections followed by very loud sections. Good audio gives your viewers a comfortable listening volume that holds throughout the whole presentation.

Understanding Audio Signals

To capture pleasing audio, you need to understand a little bit about audio signals: how to treat them, and how to record them properly without distorting them, so that you get a nice, clean signal. Figure 1 (below) shows a waveform of an audio clip that I took from one of our videos promoting one of our products. It looks very chaotic, but in fact it's very regular. The peaks you see represent the points where the voice is actually speaking. The quiet parts are the little bits in between that really don't make any sound.

Figure 1. A typical waveform from a video recorded with good audio.

When you speak about audio with sound engineers, it's important to use the language that they use. Let’s begin by talking about decibels (dB), the unit used to measure sound. A lot of circuit engineers, when they’re dealing with amplifiers and such, will refer to voltage, but in general, people talk about decibels. The reason we use decibels to talk about audio is that they more accurately reflect the way the human ear perceives sound. It’s a logarithmic scale that can manage very, very quiet sounds and very, very loud sounds.

Linear units, like volts, can't do that very easily. The normal range in audio processing, in decibels, is -60 dBu to +30 dBu (or -60 dB to +30 dB), a nice, manageable range with easy numbers. In voltage, that same range is about 0.0008 volts to about 25 volts. Decibels are a logarithmic scale rather than a linear one like volts, expressing a power ratio rather than a specific amount. If I increase volume by 6 dB, and I increase it by 6 dB again, and again, and again, each step is perceived as the same amount of change. By contrast, on a linear scale like volts, I need to add more each time to get the same perceived change. If I increase the signal by one volt, I hear a certain change; if I increase it by one volt again, it doesn't sound like the same amount of increase. Decibels more accurately reflect the way that you hear sound, and thus make it a lot easier to talk about sound.
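As a rough illustration of the numbers above, the sketch below converts dBu to volts. It assumes the standard definition of 0 dBu as about 0.7746 volts RMS; the function names are made up for this example and do not come from any particular audio library.

```python
import math

# Assumption: 0 dBu is defined as roughly 0.7746 volts RMS.
DBU_REFERENCE_VOLTS = 0.7746

def dbu_to_volts(dbu):
    """Convert a level in dBu to volts RMS."""
    return DBU_REFERENCE_VOLTS * 10 ** (dbu / 20)

def volts_to_dbu(volts):
    """Convert volts RMS to a level in dBu."""
    return 20 * math.log10(volts / DBU_REFERENCE_VOLTS)

# The two ends of the working range quoted in the text.
low = dbu_to_volts(-60)
high = dbu_to_volts(30)
print(f"-60 dBu is about {low:.6f} V, +30 dBu is about {high:.1f} V")

# Four levels, each 6 dB above the last: the voltage roughly doubles each time,
# so equal steps in dB need ever larger steps in volts.
steps = [dbu_to_volts(6 * n) for n in range(4)]
for n, volts in enumerate(steps):
    print(f"{6 * n:>2} dBu = {volts:.3f} V")
```

Each 6 dB step multiplies the voltage by about two, which is why a fixed one-volt increase sounds smaller and smaller as the level rises.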

Decibels are the unit that most engineers will use when they're talking about audio. There are two ways to use this unit of measure. The first is as an absolute reference, which tells you, literally, what the sound level is. You can take that and compare it to specifications. You can use it for metering. Line level on record outs is -10 dBV, with dBV representing decibels referenced to one volt. A volt is a very specific quantity, and that tells you what the level is.

Similarly, if I said the maximum rating on a recorder input is 12.3 volts peak, that's an absolute number. If I say the nominal level on a mixer is +4 dBu, that is also an absolute number that you can measure. But engineers also use decibels in a relative sense. When I say I need something quieter, or I say, "We've got 18 decibels of headroom," it's implied that the reference is relative to where I will clip.
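To make the absolute references concrete, here is a small sketch that turns the two levels mentioned above (-10 dBV and +4 dBu) into volts and compares them. It assumes the usual references of 1 volt RMS for dBV and about 0.7746 volts RMS for dBu; the helper names are hypothetical.

```python
import math

# Assumed reference voltages (volts RMS) for the two absolute units.
DBV_REF = 1.0      # 0 dBV
DBU_REF = 0.7746   # 0 dBu, approximately

def db_to_volts(level_db, reference_volts):
    """Convert an absolute dB level to volts RMS for a given reference."""
    return reference_volts * 10 ** (level_db / 20)

def volts_to_db(volts, reference_volts):
    """Convert volts RMS to an absolute dB level for a given reference."""
    return 20 * math.log10(volts / reference_volts)

line_out = db_to_volts(-10, DBV_REF)      # record-out line level
mixer_nominal = db_to_volts(4, DBU_REF)   # nominal mixer level

# Express the -10 dBV level in dBu so both sit on the same scale.
line_out_dbu = volts_to_db(line_out, DBU_REF)

print(f"-10 dBV = {line_out:.3f} V = {line_out_dbu:.2f} dBu")
print(f" +4 dBu = {mixer_nominal:.3f} V")
print(f"Gap between them: {4 - line_out_dbu:.2f} dB")
```

Because both numbers are tied to a fixed voltage, they can be compared directly, which is what makes them absolute rather than relative.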

The clipping point (the point at which the sound becomes overdriven and distorted) is the implied reference. This is not really an absolute number. That signal could be coming in at all kinds of different levels, depending on the equipment I'm using, and be 18 decibels away from the clipping point. It's relative to some implied reference. Similarly, if I said, "I need 10 dB of attenuation," that doesn't refer to what level the signal is at, or what level I need the signal to be. It just says I need it to be 10 dB different than what it is now.
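The relative use of decibels can be sketched the same way. In the example below, the clip levels and signal levels for the two rigs are invented purely for illustration; the point is that very different absolute levels can share the same 18 dB of headroom, and that "10 dB of attenuation" is only a scaling factor.

```python
def headroom_db(clip_level_db, signal_level_db):
    """Headroom is the distance, in dB, from the signal up to the clipping point."""
    return clip_level_db - signal_level_db

def attenuation_factor(attenuation_db):
    """Voltage multiplier that lowers a signal by the given number of dB."""
    return 10 ** (-attenuation_db / 20)

# Hypothetical rigs: the absolute levels differ, the headroom does not.
rig_a = headroom_db(clip_level_db=22, signal_level_db=4)
rig_b = headroom_db(clip_level_db=8, signal_level_db=-10)
print(f"Rig A headroom: {rig_a} dB, rig B headroom: {rig_b} dB")

# 10 dB of attenuation scales the voltage by the same factor
# no matter what level the signal started at.
factor = attenuation_factor(10)
print(f"10 dB of attenuation multiplies the voltage by {factor:.3f}")
```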

These are just some examples of how decibels are used, both in an absolute reference and in a relative reference to describe the signals. Even though we have absolute numbers measured in decibels to describe a signal, signals vary a lot over time.

As a person is speaking, or during a musical passage, if you look at an audio waveform such as the one in Figure 1, you’ll see the audio level go up and down drastically from moment to moment. It’s not possible to assign a single value and say, “This audio signal is at 4 dB,” and completely describe the nature of the signal, or tell you everything you need to know about the signal in order to record it and not have it clip.

So we break this into two measurements. The first measurement is an average value. It's a root mean square, or RMS value. It tells you the relative average level of that signal. That tells you roughly how loud it seems to the human ear. Then we look for the peak amplitude, which describes how much above that average value the signal will get at its highest point. The difference between those two is called the dynamic range (engineers also refer to this peak-to-average difference as the crest factor).
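The two measurements can be sketched in a few lines. The example uses a synthetic sine wave with a full-scale value of 1.0 as a stand-in for real audio, which is an assumption made only to keep the sketch self-contained; recorded speech would show a much larger gap between peak and average.

```python
import math

def rms(samples):
    """Root mean square: the square root of the mean of the squared samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

def to_db(ratio):
    """Express an amplitude ratio in decibels."""
    return 20 * math.log10(ratio)

# Synthetic test signal: ten full cycles of a sine wave, full scale = 1.0.
samples = [math.sin(2 * math.pi * n / 100) for n in range(1000)]

rms_db = to_db(rms(samples))      # average level relative to full scale
peak_db = to_db(peak(samples))    # peak level relative to full scale
peak_over_average_db = peak_db - rms_db

print(f"RMS level:  {rms_db:.2f} dB")
print(f"Peak level: {peak_db:.2f} dB")
print(f"Peak above average: {peak_over_average_db:.2f} dB")
```

For a pure sine wave the peak sits about 3 dB above the RMS level. The larger that gap is for a real signal, the more room you need to leave above the average level to avoid clipping.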

Dynamic range is important because it tells you how much the signal will vary. The average level is very easy to measure, and very easy to deal with, but that average is just that--an average. The waveform will also have peaks, which will come in much higher than that average value, and that’s where you have a real risk of clipping the signal. You want to avoid clipping the signal as much as possible.
