Levelling and Normalizing Audio for Better-Balanced Soundtracks
Create masterfully balanced audio for video with this guide. Learn about normalization and its limitations, sound like a pro with ducking, and create an immersive experience using the stereo field.
Why is it important to balance your audio? What’s the point of tweaking audio levels when the eventual viewer/listener will have a volume control? What are some of the tools you can use to achieve balance? Before we dive into that, let’s look at an even more important question.
For those new to video production, it might be hard to imagine that what you hear in a video is often more powerful than what you see. The right sound effects, soundtrack, and even ambient sound can bring the viewer further into the story, inspire emotion, and completely set the tone of your video. But just throwing some audio elements into a video isn’t enough. They all have to play well together, and that’s where audio balancing, or mixing, comes in.
What’s the point of balancing your audio and tweaking audio levels when the viewer will have control over the listening volume? Simply put, audio balancing has very little to do with the overall volume of the video. The overall volume gets set in the mastering phase, but that’s a subject for another article.
Audio balancing is the process of making sure that everything the viewer needs to hear can be heard in the video, no matter how loud they have the volume set. It ensures that the dialogue isn’t overpowered by the music, that the ambience is just loud enough to make the video sound realistic, and that the sound effects don’t blow out the viewer’s eardrums. A properly balanced soundtrack can breathe life into a video and make the viewer feel as though they are in the story with the characters.
There are many tools that you can use to get well-balanced audio in your videos. Here we’ll be looking at normalizing, ducking, panning, and stereo width.
The simplest, yet most important part of balancing your audio is setting the levels for each individual sound element. For this task, we use a combination of normalization and the basic level slider.
To get full use of both normalization and your sliders, it’s important to have your audio organized properly in your timeline. Each dialogue audio source should be on its own dedicated track. The same goes for sound effects, foley and music. When each audio source is put on its own dedicated audio track, it becomes much easier to set your levels correctly.
The first step to getting the proper levels is to normalize your audio clips. Normalization is a process that raises or lowers the volume of each audio clip so that its loudest peak sits at a set level. In theory, this would normalize the loudness of each audio clip, but there are some limitations that we’ll cover later.
Normalization is less important for sound effects, foley and music, but it is imperative for dialogue clips. Normalizing your dialogue will ensure that all these clips are around the same basic volume. Most DAWs and NLEs have a built-in normalizing plugin, and it’s just a matter of dropping that plugin onto your audio clips or onto the entire audio track, and setting the plugin to normalize your audio. At this stage, normalizing your dialogue to around -3 dB should be sufficient.
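To make the idea concrete, here is a minimal sketch of what a peak normalizer does under the hood. It assumes audio stored as a plain list of floating-point samples between -1.0 and 1.0; the function names and the sample values are hypothetical, and a real plugin will do more than this.

```python
import math

def db_to_gain(db):
    """Convert a level in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def peak_normalize(samples, target_db=-3.0):
    """Scale samples so the loudest absolute peak lands at target_db (dBFS)."""
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    gain = db_to_gain(target_db) / peak
    return [s * gain for s in samples]

# Hypothetical quiet dialogue take whose loudest sample is 0.10
quiet_take = [0.05, -0.10, 0.08, -0.02]
normalized = peak_normalize(quiet_take)

# Measure the new peak in dB relative to full scale (1.0)
new_peak_db = 20 * math.log10(max(abs(s) for s in normalized))
print(round(new_peak_db, 2))
```

Every sample in the clip is multiplied by the same gain, so the shape of the performance is untouched; only the overall level moves.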
The problem with normalizing is that most of the plugins available normalize based on the peaks in the audio, and not the general loudness. Any clips that are mostly quiet save for a few loud peaks might not be raised in level very much, while those that are consistent (without peaks) at a medium volume level may be turned up more!
This is why it’s important to not just rely on the tools you’re using, but also your ears. After normalizing your audio, review the dialogue and tweak the levels of each individual clip to make sure they have the same general volume.
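The peak-versus-loudness problem is easy to see with numbers. The sketch below peak-normalizes two made-up clips to -3 dB and then compares their RMS level, which is used here only as a rough stand-in for perceived loudness. The clip data and function names are assumptions for illustration.

```python
import math

def peak_db(samples):
    """Loudest absolute sample, in dB relative to full scale."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    """Root-mean-square level in dB, a rough proxy for average loudness."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

def normalize_peak(samples, target_db):
    """Scale the clip so its loudest peak sits at target_db."""
    gain = 10 ** (target_db / 20) / max(abs(s) for s in samples)
    return [s * gain for s in samples]

# Hypothetical clips: a mostly quiet take with one loud spike, and a steady take
spiky = [0.05] * 99 + [0.9]
steady = [0.3, -0.3] * 50

results = {}
for name, clip in (("spiky", spiky), ("steady", steady)):
    out = normalize_peak(clip, -3.0)
    results[name] = (peak_db(out), rms_db(out))
    print(name, "peak:", round(results[name][0], 1), "rms:", round(results[name][1], 1))
```

Both clips end up with the same peak, yet the steady clip has a much higher average level. That gap is what your ears will notice, and it is why a manual pass after normalizing is still needed.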
After you’ve normalized your audio, it’s time to dial in the levels of each audio track. Remember that the goal here is to make sure that nothing is overpowering anything else. If your music is too loud, it will distract the viewer from the dialogue. If sound effects are too quiet, they won’t have the desired effect.
Let’s pause for a moment to look at our volume meter. You’ll notice that along with decibel markers, there are also colors indicating how loud your sound is. As a general rule, the green zone is where ambience and music should live; the yellow zone is for dialogue and sound effects; and the red zone should be avoided.
Most audio professionals use mixing boards to set their levels, but most DAWs (digital audio workstations) and NLEs (non-linear editors) have sliders built into the program. Those are what we will be using to set the levels for our audio tracks.
First let’s look at the music. Use your sliders to bring the music up or down until it’s sitting at the maximum volume that it will be during your video. Later, we’ll use ducking to automatically bring down the volume of the music to compensate for the dialogue.
Next, do the same for the ambience. This should stay in the green zone of your audio meter, just a little higher than your music. After that, set your dialogue and your sound effects so that the general loudness sits between -18 dB and -12 dB. If the dialogue peaks into the red zone from time to time, that’s OK. We can use EQ, limiting and compression to fix that. We’ll cover those tools in a future article.
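If you want a sanity check on the dialogue target, the sketch below estimates how far a fader would need to move to bring a clip's average level into the -18 dB to -12 dB window. It uses RMS as a rough approximation of "general loudness", which is an assumption; your meter may measure something different. The test tone and function names are hypothetical.

```python
import math

def rms_dbfs(samples):
    """Average (RMS) level in dB relative to full scale; -inf for silence."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def fader_offset_db(samples, low_db=-18.0, high_db=-12.0):
    """dB change needed to move the clip's RMS level into [low_db, high_db]."""
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return 0.0  # silence: no fader move will help
    if level < low_db:
        return low_db - level   # too quiet: positive offset
    if level > high_db:
        return high_db - level  # too loud: negative offset
    return 0.0

# Hypothetical stand-in for a quiet dialogue clip: one second of a 220 Hz tone
dialogue = [0.05 * math.sin(2 * math.pi * 220 * n / 48000) for n in range(48000)]

offset = fader_offset_db(dialogue)
adjusted = [s * 10 ** (offset / 20) for s in dialogue]
print(round(rms_dbfs(dialogue), 1), round(offset, 1), round(rms_dbfs(adjusted), 1))
```

Treat the result as a starting point for the slider, then confirm by ear against the music and ambience.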
If your background audio is too loud or too similar to your dialogue, that dialogue can easily be drowned out. Audio ducking is the process of bringing down the volume of music when there is dialogue present, and bringing it back up during periods of no dialogue.
Typically, there are three ways to do this: using audio sliders, auto-ducking (using compression or audio ducking plugins), and keyframing. Since using sliders for audio ducking can be complicated, and not all DAWs and NLEs have a clear-cut auto-ducking feature, we’ll focus on keyframing in this article.
Using keyframes for audio ducking is easy. Simply set a keyframe on your music clip about 8 to 16 frames before the dialogue starts, another at the beginning of the dialogue, one at the end of the dialogue, and a final keyframe 8 to 16 frames after the end of the dialogue (you can set your first and last keyframe wherever you want, but 8 to 16 frames will give you a nice smooth transition). Bring the volume on the two inner keyframes down low enough that the music doesn’t interfere with the dialogue.
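The four-keyframe pattern described above can be sketched in a few lines. This is only an illustration of the idea, not how any particular NLE stores keyframes: the frame numbers, the 12-frame fade, and the -12 dB duck amount are all assumed values, and the editor is assumed to interpolate linearly between keyframes.

```python
def duck_keyframes(start, end, fade=12, duck_db=-12.0):
    """Four (frame, gain_db) keyframes for dialogue spanning start..end."""
    return [
        (start - fade, 0.0),   # music at full level before the fade
        (start, duck_db),      # fully ducked as dialogue begins
        (end, duck_db),        # still ducked as dialogue ends
        (end + fade, 0.0),     # music back at full level
    ]

def gain_at(frame, keyframes):
    """Linearly interpolate the dB gain at a frame; hold the end values outside."""
    if frame <= keyframes[0][0]:
        return keyframes[0][1]
    if frame >= keyframes[-1][0]:
        return keyframes[-1][1]
    for (f0, g0), (f1, g1) in zip(keyframes, keyframes[1:]):
        if f0 <= frame <= f1:
            t = (frame - f0) / (f1 - f0)
            return g0 + t * (g1 - g0)
    return keyframes[-1][1]

# Hypothetical dialogue clip running from frame 100 to frame 220
keys = duck_keyframes(100, 220)
print(gain_at(50, keys))   # 0.0, well before the dialogue
print(gain_at(94, keys))   # -6.0, halfway through the fade down
print(gain_at(150, keys))  # -12.0, under the dialogue
print(gain_at(300, keys))  # 0.0, after the music has recovered
```

A longer fade gives a gentler transition; a shorter one makes the dip more noticeable.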
One of the easiest ways to add a dynamic element to your videos is by panning your audio. Every professional-grade DAW and NLE gives you the ability to pan your audio from left to right. This simple yet powerful tool can completely immerse your viewer into the story and make them feel as if they are actually there with the characters.
The best way to pan your audio is to pay attention to what is happening in the scene. If you have a car driving from left to right, pan the audio from left to right. If you have two people talking in the scene, pan the dialogue from the person on the left slightly to the left, and the dialogue from the person on the right slightly to the right. A little bit of panning goes a long way toward making your video a more realistic experience for your viewers.
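For the curious, here is roughly what a pan control does to a mono sound. The sketch uses a constant-power pan law, which is one common choice but an assumption here; your DAW or NLE may use a different law. The function names and the stand-in "car" samples are hypothetical.

```python
import math

def pan_gains(pan):
    """Return (left, right) gains for pan in [-1, 1]: -1 left, 0 center, 1 right."""
    angle = (pan + 1) * math.pi / 4  # maps -1..1 onto 0..90 degrees
    return math.cos(angle), math.sin(angle)

def pan_sweep(mono, start_pan=-1.0, end_pan=1.0):
    """Move a mono clip steadily across the stereo field, sample by sample."""
    n = len(mono)
    stereo = []
    for i, sample in enumerate(mono):
        t = i / (n - 1) if n > 1 else 0.0
        left, right = pan_gains(start_pan + t * (end_pan - start_pan))
        stereo.append((sample * left, sample * right))
    return stereo

# Two speakers placed slightly left and slightly right of center
print([round(g, 3) for g in pan_gains(-0.3)])
print([round(g, 3) for g in pan_gains(0.3)])

# A car passing from left to right, using a constant signal as a stand-in
car = pan_sweep([1.0] * 5)
print([(round(l, 2), round(r, 2)) for l, r in car])
```

With this law the combined power of the two channels stays the same at every position, so the sound does not seem to get louder or quieter as it moves.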
If you really want to give your viewer an immersive experience, you need to be taking advantage of the stereo field. In order to understand what this means, it might be easier to look at how music is mixed.
In a basic music mix, the vocals are usually front-and-center, while the rest of the music sounds more spread out. This is done for two reasons: to give the song a fuller, more complete sound, and to make the vocals more present so they don’t get drowned out.
In video, the goal is the same: keep the dialogue front and center, while spreading out the rest of the audio elements in order to give the video a more complete soundscape. Each DAW and NLE has a different way of accomplishing this, but they all have built-in tools for manipulating the stereo width of each audio element.
There are no rules when it comes to setting the stereo width of your audio elements, but generally you want to keep your dialogue in the front, with a narrow width, the sound effects slightly wider, and the foley and music even wider.
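One common way stereo width controls work is mid/side processing: split the signal into what the two channels share (mid) and what differs between them (side), then scale the side part. The sketch below shows that idea on a single stereo sample. Whether your particular tool works this way is an assumption, and the width values are only examples.

```python
def set_width(left, right, width):
    """Adjust stereo width of one sample pair: 0 = mono, 1 = unchanged, >1 = wider."""
    mid = (left + right) / 2    # content common to both channels
    side = (left - right) / 2   # content that differs between channels
    side *= width               # narrow or widen by scaling the difference
    return mid + side, mid - side

sample = (0.8, 0.2)  # hypothetical stereo sample, louder on the left

narrow = set_width(*sample, 0.0)     # collapses to the center, as for dialogue
unchanged = set_width(*sample, 1.0)  # leaves the sample as it was
wide = set_width(*sample, 1.5)       # pushes the channels further apart

for label, pair in (("narrow", narrow), ("unchanged", unchanged), ("wide", wide)):
    print(label, tuple(round(v, 2) for v in pair))
```

Widening raises the level of one channel relative to the other, so it is worth checking your meters for clipping after pushing the width up.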
The difference between an amateur video and a professional video can often be found in the audio. By implementing these tools in your next video, you can start getting a more professional sound and giving your viewers a video they won’t soon forget.