Streaming Media


Are You So Focused On Video That You’re Neglecting Audio?

Even when your video is perfect, nothing sours a streaming experience more quickly than poor audio. Whether you are running the whole show yourself or receiving an audio feed from the "sound guy," there are several areas you need to pay attention to when combining audio into your live video stream.


Figure 2 (below) illustrates what I mean by clipping. In the signal on the far left there's no clipping, so it's a nice rounded waveform. In the waveforms in the middle and on the right, you can see analog clipping and digital clipping. They're slightly different, but both cut off the top of the waveform. When you do that, you’re not just making that peak quieter. You’re actually changing the shape of the waveform, adding harmonics that weren't in the original signal. Those extra harmonics are what you hear as distortion, much like the effect you get when you run an electric guitar through a distortion pedal.

Figure 2. Non-clipped and clipped waveforms.

When you hear audio being overdriven or overloaded, that's the result of clipping. Every piece of equipment has some upper limit of signal level at which it starts to clip, and you need to know what that limit is in order to avoid clipping.
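The claim that clipping adds harmonics can be checked numerically. The following is a minimal sketch, not a measurement of any real equipment: it hard-clips one period of a pure sine wave and compares the strength of the 1st, 3rd, and 5th harmonics before and after. The function names, the 1.5x gain, and the ceiling of 1.0 are all assumptions chosen for illustration.

```python
import math

def hard_clip(sample, limit):
    """Clamp a sample to the range [-limit, +limit], like a hard digital clip."""
    return max(-limit, min(limit, sample))

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic, assuming the buffer holds exactly
    one period of the fundamental (a single-bin discrete Fourier sum)."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

N = 1024  # samples in one period (illustrative)
clean = [math.sin(2 * math.pi * i / N) for i in range(N)]

# Hypothetical scenario: the signal is driven 1.5x too hot into a ceiling of 1.0.
clipped = [hard_clip(1.5 * s, 1.0) for s in clean]

for k in (1, 3, 5):
    print(f"harmonic {k}: clean={harmonic_amplitude(clean, k):.4f} "
          f"clipped={harmonic_amplitude(clipped, k):.4f}")
```

The clean sine has energy only at the fundamental, while the clipped version shows measurable energy at the odd harmonics, which is the added content the ear hears as distortion.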

Clipping is not nice to hear, and it completely changes the texture of what's been recorded. In Figure 3 (below), you can see the difference visually between the original recording, shown small on the right-hand side, and the clipped one, which I've amplified, on the left-hand side. The peaks that play nicely in the original are all cut off in the amplified version. That's what gives you that really harsh, distorted sound once you've clipped it, and that's what we're trying to avoid as we set up our equipment.

Figure 3. Another view of clipped and non-clipped audio represented in a waveform.

There are a few things you can do to avoid clipping. The first is to know the nominal level, or RMS average level, of the signal you're dealing with in absolute terms (for example, approximately -10 dBu or dBV). The second is to know the dynamic range of that signal. How far above that average are the peaks, and how close will they come to the limit of the equipment that you're using?

Then you need to ensure sufficient headroom on that equipment. You need to calculate how much room you need, and then, with the equipment and the meters, which I'll show you a little later, you'll be able to detect whether or not you're actually achieving that.
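The headroom arithmetic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, and the example levels (a nominal level of -10 dBV, peaks 12 dB or 20 dB above it, and a clip point of +6 dBV) are made-up figures, not specifications of any particular device. The only fixed facts used are the standard references of about 0.7746 V RMS for 0 dBu and 1 V RMS for 0 dBV.

```python
import math

DBU_REF_VOLTS = 0.7746  # 0 dBu corresponds to about 0.7746 V RMS
DBV_REF_VOLTS = 1.0     # 0 dBV corresponds to 1 V RMS

def dbu_to_dbv(level_dbu):
    """Convert a level in dBu to dBV so both figures share one reference."""
    return level_dbu + 20 * math.log10(DBU_REF_VOLTS / DBV_REF_VOLTS)

def peak_margin_db(nominal_db, peak_above_nominal_db, clip_level_db):
    """dB remaining between the loudest expected peak and the clip point.

    All three arguments must use the same unit (all dBu or all dBV).
    A negative result means the peaks are expected to clip.
    """
    return clip_level_db - (nominal_db + peak_above_nominal_db)

# Hypothetical figures for illustration only.
print(round(dbu_to_dbv(-10), 2))    # a -10 dBu signal expressed in dBV
print(peak_margin_db(-10, 12, 6))   # modest peaks: some margin left
print(peak_margin_db(-10, 20, 6))   # larger peaks: negative margin, will clip
```

Converting everything to one unit first matters because -10 dBu and -10 dBV differ by a little over 2 dB, which can be the difference between a clean peak and a clipped one.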

As a last resort, there is a technology called "limiting," which essentially clips the signal for you, but in a gentler way. If you have a signal that is definitely going to clip, you can deal with it proactively, rather than letting the equipment do it reactively. Of course, avoiding clipping isn't as simple as adjusting signal levels. If you just turn things down so that the peaks don't clip, then you run into the problem where parts of the speech are too quiet to hear.

If you simply turn down the volume, you're not affecting only those peak levels, and moving them away from the clipping point, which is what you want; you're also taking the quieter passages and turning those down as well. Turning down the volume or turning down the level is one way to avoid clipping, but you can run into the problem with signals with high dynamic range where the quiet portions are too quiet. That's the first problem.
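The point that a level change moves quiet and loud passages together can be shown with simple dB arithmetic. This is a toy sketch: the helper names and the example levels (a quiet passage at -36 dB and a peak at 0 dB relative to the clip point) are hypothetical.

```python
def apply_gain_db(levels_db, gain_db):
    """Apply the same gain, in dB, to every level in the list."""
    return [level + gain_db for level in levels_db]

def dynamic_range_db(levels_db):
    """Difference between the loudest and quietest level, in dB."""
    return max(levels_db) - min(levels_db)

# Hypothetical levels relative to the clip point: [quiet passage, peak].
original = [-36.0, 0.0]
turned_down = apply_gain_db(original, -6.0)  # pull the peak 6 dB below clipping

print(original, dynamic_range_db(original))
print(turned_down, dynamic_range_db(turned_down))
```

The peak gains 6 dB of safety, but the quiet passage drops by the same 6 dB and the dynamic range is unchanged, which is why turning down alone does not solve the problem for wide-ranging material.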

The second problem is a scenario like a conference panel, where you've got multiple speakers and multiple microphones. One speaker might sound fine, and then another presenter, using a different mic, or speaking a little more loudly, will begin to clip. When you're producing a live event like that, you need to be aware that different microphones or even different speakers will produce different levels, and you need to be able to accommodate that. That's what a sound check is usually for.

Then, similarly, if you're doing things like streaming live bands, you might find that during the sound check, everything sounds great, but then when they get up on the stage, especially in the last set, everybody's all excited, they start singing a little louder, playing a little louder, and you start to clip. Getting your levels right at the outset is one thing, but there are other challenges around dynamic range that really make it difficult.


That's where compression comes in. The whole idea behind compression is to be able to reduce that dynamic range. If you've got peaks that are way above your average value, and you know they're going to clip, so you have to turn the signal down, what you want to do is bring up the low sections and bring down the high sections and kind of push them together. That's why it's called "compression."

In Figure 4 (below), you can see this signal is kind of compressed. It's the same waveform that you’ve seen in the previous figures, but now you don’t see those tiny little quiet parts. Those have been brought up to higher levels, and the peaks have been tamped down a bit. Bringing those levels together makes the audio more pleasing, and it makes it much easier to record. In the original clip I showed you, we had a dynamic range that was about 30 dB. It was very broad. In Figure 4 it’s been compressed down to 6 dB, so it looks very different.

Figure 4. The same audio waveform seen in other figures, with an original dynamic range of 30 dB, compressed to 6 dB to avoid clipping.
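One common way to describe a compressor is as a static curve with a threshold and a ratio, followed by make-up gain. The sketch below uses that model to show how a 30 dB range could shrink to 6 dB. The article does not say which settings produced Figure 4, so the threshold (-36 dB), ratio (5:1), make-up gain (+24 dB), and signal levels here are assumptions picked only to make the arithmetic match, and the function names are hypothetical.

```python
def compress_db(level_db, threshold_db, ratio):
    """Static downward compression: levels above the threshold rise only
    1 dB for every `ratio` dB of input; levels at or below it pass unchanged."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

def compress_with_makeup(level_db, threshold_db, ratio, makeup_db):
    """Compress, then raise the whole signal by a fixed make-up gain."""
    return compress_db(level_db, threshold_db, ratio) + makeup_db

# Hypothetical signal: quiet passages at -36 dB, peaks at -6 dB (30 dB range).
quiet_in, peak_in = -36.0, -6.0

# Assumed settings: threshold at the quiet level, 5:1 ratio, +24 dB make-up.
quiet_out = compress_with_makeup(quiet_in, -36.0, 5.0, 24.0)
peak_out = compress_with_makeup(peak_in, -36.0, 5.0, 24.0)

print("input range: ", peak_in - quiet_in)
print("output range:", peak_out - quiet_out)
print("quiet:", quiet_in, "->", quiet_out, " peak:", peak_in, "->", peak_out)
```

With these assumed settings the peak ends up where it started while the quiet passages come up by 24 dB, which matches the description of the low sections being raised and the range being squeezed. Real compressors also have attack and release behavior that this static sketch ignores.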

Classical music typically has a large dynamic range. By contrast, most pop music has much less dynamic range. Figure 5 (below) shows the differences, with a waveform from an opera on the top and one from a pop song on the bottom.

Figure 5. High dynamic range (top) and lower dynamic range (bottom)
