How to Fix Bad Audio Quality from Video
You’ve recorded the most amazing interview or scene using your brand new cell phone which can shoot at 4K. It looks amazing, and the performances are perfect. But when you get it into the editing program, the sound quality is terrible. Echo. Distortion. Noise. Artifacts. Microphone rustle or bumping. No matter how good your video is, that horrible sounding audio ruins it. What are you to do?
The good news is that there are dozens of professional audio solutions available for nearly every kind of audio issue these days. And most of those solutions are either already built into your video editor or DAW or are incredibly affordable. There are even some “old-skool” hacks which can greatly assist in solving bad audio problems. Let’s take a look at some common issues found with shooting with phones, consumer recording equipment, or even professional recording equipment in noisy environments.
What is distortion? It’s what happens when the incoming sound is louder than the highest level the recording device or audio file can represent. The recording runs out of digital “bits” to record the levels of audio, and the peaks of the waveform get chopped off (“clipped”). As a result, a “crushing” or “harsh” sound is heard. Not too many years ago, if you had distortion of any kind in your recording, you had to re-record the audio. There was no solution. In fact, no one could imagine how distortion could ever be fixed. The best we could do was add equalization to try to minimize the “frying egg” sound which told everyone that “you recorded that too loudly.”
Using EQ never really worked, then or now. It only took distorted audio from “unusable” to “terrible.” Now, there are several affordable solutions which can so completely erase the distortion caused by both overloaded audio inputs and digital clipping that some folks don’t even watch their meters anymore. Of course, this is a terrible practice, but I have taken distorted voice audio which was so “fried” that you couldn’t understand the talent - and turned it around to 90% fidelity. How? Using modern “declipping” plugins, most notably Accusonus’ ERA 5 De-Clipper.
In the audio example below, take a listen to the original audio in all of its fried glory, and then to how De-Clipper makes audio clean up a breeze.
To be clear: using these kinds of audio plugins is a great way to fix distorted audio, but you will always get the best possible result by recording non-clipping audio in the first place.
The lesson is: always record your audio so that it peaks below zero dB on your meters (0 dBFS, digital full scale). But if you get distortion, De-Clipper is an excellent way to save the recording.
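To make the "stay below zero dB" advice concrete, here is a minimal Python sketch of a clipping check. It assumes the audio has already been decoded into a list of floats where full scale is +/-1.0; the function names `peak_dbfs` and `count_clipped_runs`, and the `ceiling` and `min_run` values, are made up for this illustration and are not how De-Clipper or any particular editor detects clipping.

```python
import math

def peak_dbfs(samples):
    """Peak level in dBFS, assuming float samples where full scale is +/-1.0."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return float("-inf")
    return 20.0 * math.log10(peak)

def count_clipped_runs(samples, ceiling=0.999, min_run=3):
    """Count runs of at least `min_run` consecutive samples stuck at the ceiling.

    A flat-topped run like this is a common sign of clipping. The ceiling
    and run length are illustrative guesses, not standard values.
    """
    runs = 0
    run_length = 0
    for s in samples:
        if abs(s) >= ceiling:
            run_length += 1
        else:
            if run_length >= min_run:
                runs += 1
            run_length = 0
    if run_length >= min_run:
        runs += 1
    return runs

# A 440 Hz test tone recorded "too hot": everything past full scale is flattened.
rate = 48000
hot = [1.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 10)]
clipped = [max(-1.0, min(1.0, s)) for s in hot]

# The same tone recorded with headroom (peaks around -6 dBFS).
safe = [s / 3.0 for s in hot]

print(peak_dbfs(clipped), count_clipped_runs(clipped))
print(round(peak_dbfs(safe), 2), count_clipped_runs(safe))
```

A check like this only tells you that clipping happened. Rebuilding the missing peaks is the hard part, and that is what a declipping plugin is for.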
What is room noise? Simply put, room noise is any ambient sound that is constant (and sometimes not so constant) in the room in which the video was shot. Sounds like refrigerators, lights, traffic, crickets, fans and even camera noise can all be considered room noise. This noise rarely ruins a recording completely, but it can make listening to dialog or voice more difficult, and even muffle the sound of the voice altogether.
As with distortion, in the old days we used EQ to help with runaway room noise. We did this by finding the frequencies of the noise and rolling them off. The problem with this procedure is that you’re also rolling off the good sound of the voice. And because we have to use very narrow bandwidths (a high “Q”), it creates weird phasing effects in the final sound of the recording.
A much better “old-skool” method to remove background noise from audio is to use an expander. An expander or noise gate (either one) opens as the voice crosses a threshold, and closes after it stops. This allows the noise to be masked by the sound of the voice and immediately be cut off when the voice stops. Ideally, the threshold of the expander/gate is set just below the lowest volume of the voice so that no part of the performance is accidentally “cut off.” Then a short attack with a quick release and a few ms of hold allow the expander/gate to open and the voice to be heard.
When the voice stops, the expander/gate closes tightly after the end of the last syllable or room ring. This method works well when the noise is moderate, and the recording is well done. However, if the recording is low volume, or the noise is excessive, expanders and gates sometimes exacerbate the situation by introducing “pumping” noise every time the voice is heard.
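The gate behavior described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular plugin works: it assumes float samples in the +/-1.0 range, uses straight-line fades for attack and release, and the function name `noise_gate` and all default settings are invented for the example.

```python
import math

def noise_gate(samples, rate, threshold_db=-40.0,
               attack_ms=1.0, hold_ms=20.0, release_ms=50.0):
    """Very simple noise gate: open above the threshold, hold, then fade shut."""
    threshold = 10.0 ** (threshold_db / 20.0)      # dB -> linear amplitude
    attack_step = 1.0 / max(1.0, attack_ms * rate / 1000.0)
    release_step = 1.0 / max(1.0, release_ms * rate / 1000.0)
    hold_samples = int(hold_ms * rate / 1000.0)

    gain = 0.0        # 0.0 = gate closed, 1.0 = gate fully open
    hold_left = 0
    out = []
    for s in samples:
        if abs(s) >= threshold:
            hold_left = hold_samples               # voice present: restart the hold timer
            gain = min(1.0, gain + attack_step)
        elif hold_left > 0:
            hold_left -= 1                         # brief dip (e.g. a zero crossing): stay open
            gain = min(1.0, gain + attack_step)
        else:
            gain = max(0.0, gain - release_step)   # voice gone: fade the gate shut
        out.append(s * gain)
    return out

# One second of quiet, steady hiss with a louder "voice" tone from 0.25 s to 0.5 s.
rate = 8000
signal = []
for n in range(rate):
    hiss = 0.001 if n % 2 else -0.001
    voice = 0.5 * math.sin(2 * math.pi * 200 * n / rate) if 2000 <= n < 4000 else 0.0
    signal.append(hiss + voice)

gated = noise_gate(signal, rate)
```

With these settings the hiss is silenced before and after the tone, while the tone passes through untouched. If the threshold is set too high, or the hiss is loud enough to be heard behind the voice, the same logic produces the “pumping” described above.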
The modern method of removing room noise from recordings is using plugins. For constant noise, an algorithmic de-noiser like the ERA 5 Noise Remover or ERA 5 Noise Remover Pro is a highly effective way to remove background noise from audio.
These plugins are powerful audio clean up tools, and they’re super easy to use!
In addition to room noise, another common ruiner of recordings is echo and reverb. Echo occurs when the sound of the artist/talent bounces around the room a few times and comes back to the microphone slightly delayed. This effect creates a repeating kind of sound like what you’d hear yelling in a canyon, “HELLO!, Hello...helllo…”
Of course, you would have to be in a large room indeed to have that kind of echo. Most of the time the rooms we shoot in are small, but the echo is there nonetheless. It repeats every few milliseconds, and it creates a weird phasing sound to the voice.
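A bit of arithmetic shows why small-room echoes land “every few milliseconds” and why they sound like phasing. The Python sketch below assumes sound travels at roughly 343 m/s and models only a single reflection. The function names and the example distances are made up for illustration.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value for air at room temperature

def reflection_delay_ms(direct_path_m, reflected_path_m):
    """How much later the reflection arrives than the direct sound, in ms."""
    extra_distance = reflected_path_m - direct_path_m
    return extra_distance / SPEED_OF_SOUND_M_PER_S * 1000.0

def first_notch_hz(delay_ms):
    """Lowest frequency cancelled when a sound is mixed with a delayed copy of itself."""
    return 1000.0 / (2.0 * delay_ms)

# Hypothetical setup: mic 0.5 m from the talent, with a wall bounce that
# travels 2.5 m in total before reaching the mic.
delay = reflection_delay_ms(0.5, 2.5)
print(round(delay, 2), "ms late")                       # roughly 5.8 ms
print(round(first_notch_hz(delay), 1), "Hz first notch")
```

Further notches repeat at odd multiples of that first frequency, which is the “comb” pattern behind the hollow, phasey sound of a voice recorded in a small room.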
Reverb is much the same as echo, only instead of a single echo bouncing around the room, it’s thousands of echoes happening simultaneously everywhere and coming back to the microphone at various times. This creates that “sound decay” that happens when you’re in a large cathedral or big arena. You clap your hands and there’s a long, “Paaaaaaaahhhhhhh” that diminishes as time goes on. In a small room this is even more destructive than echoes, because the sound is usually more dense. If you’re in some über reflective room like a bathroom, it can ruin performances.
There has never really been a good “old-skool” method for dealing with this except for the brutal use of an expander/gate cutting off after the end of the last word. It always sounds unnatural. But with Accusonus’ Reverb Remover, the simple turn of a knob can solve this with ease. Using Reverb Remover Pro also brings the added benefit of multiband de-reverbing and de-noising simultaneously or in parallel.
Muffled voice recordings can result from any number of problems: bad microphone placement, a low fidelity mic, a low sample rate, a covered microphone, etc. In nearly all cases (except a low sample rate), muffled voice recordings can be solved with good use of EQ. Using the high frequency shelf found on most EQs, you can bring back the crispness and high frequencies of a dull sounding voice recording as shown in the image below:
Unfortunately, doing this also strongly raises the noise floor at the worst frequencies: highs. Instead, using a dynamic EQ allows the unvarnished sound of the recording to be heard when there is no voice (and highs are left alone). When the voice comes in, the high shelf boost is reintroduced. The resulting clear sound generally masks the raised noise floor. Once the voice finishes, highs are returned to flat - and no one is the wiser.
You can do the same thing with a multiband compressor that has two or more bands. Take the lower band and set it to compress heavily, with a low threshold. This will reduce all frequencies but the highs. The result is a “virtual boost” of high frequencies via reducing all others.
Perhaps the easiest way to accomplish this is to simply use the ERA 5 Voice Auto EQ plugin and move the slider toward the “air” or “clarity” corners of the selector triangle, although the automatic adjustments the plugin makes can also be used to great effect on their own. The example of the Voice Auto EQ process below uses the default settings.
Here is our raw, muffled audio:
Here’s how it sounds with just a high shelf boost:
This is what it sounds like with some multi band compression:
And this is what Voice Auto EQ can do to it:
What is sibilance? It is the “shrill” sound heard on consonants such as “S” or “T.” Whether it comes from an overly bright microphone or from the talent’s vocal production being too bright, sometimes those “sibilances” can be really strong and annoying.
We can use the “old-skool” method of rolling off the high frequencies with an EQ, but, of course, that kills the high frequencies of the overall voice.
We can also use a compressor to cut out any shrill tones which are too loud, but because high frequencies have less energy than low frequencies, they are rarely pulled down sufficiently using this method.
Instead, we can use either a multiband compressor with two or more bands or a de-esser. A de-esser is essentially the same thing as the former except that it is dedicated to dealing with sibilance issues. It may also have machine learning programming to home in specifically on “S”/“T” vocal noises - not just a specific frequency band.
By setting the compressor or the de-esser to the middle of the offending frequency range - usually between 6kHz and 9kHz - whenever that range crosses the threshold, ONLY those frequencies are reduced. The rest of the sound is left alone. We’ve been using de-essers since the early clunky 80s varieties, but modern de-essers are a joy to use, and have all manner of extra bells and whistles which make the process of reducing sibilance nearly imperceptible. Using the ERA 5 De-Esser or De-Esser Pro, we can reduce only the sibilance, leaving a sound that feels natural.
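The split-band idea behind de-essing can be sketched in Python as follows. This is a deliberately crude illustration, not the ERA 5 De-Esser’s algorithm: it assumes float samples in the +/-1.0 range, splits the signal with a gentle one-pole filter (so the split is far less selective than a real de-esser’s), and treats everything above the split as the “sibilance” band. The name `simple_de_esser` and every default setting are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def simple_de_esser(samples, rate, split_hz=6000.0, threshold=0.1,
                    ratio=4.0, release_ms=20.0):
    """Turn down only the high band, and only while it is above the threshold."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * split_hz / rate)   # one-pole low-pass coefficient
    release = math.exp(-1.0 / (release_ms * rate / 1000.0))    # envelope decay per sample
    low = 0.0
    envelope = 0.0
    out = []
    for s in samples:
        low += alpha * (s - low)      # low band: smoothed copy of the input
        high = s - low                # high band: whatever the low-pass removed
        level = abs(high)
        envelope = level if level > envelope else envelope * release
        if envelope > threshold:
            # Compress the part of the high band that pokes above the threshold.
            target = threshold + (envelope - threshold) / ratio
            gain = target / envelope
        else:
            gain = 1.0
        out.append(low + high * gain)  # recombine the two bands
    return out

rate = 48000
ess = [0.8 * math.sin(2 * math.pi * 7000 * n / rate) for n in range(rate // 10)]
vowel = [0.8 * math.sin(2 * math.pi * 200 * n / rate) for n in range(rate // 10)]

tamed_ess = simple_de_esser(ess, rate)
tamed_vowel = simple_de_esser(vowel, rate)
print(round(rms(tamed_ess) / rms(ess), 2))   # below 1.0: the 7 kHz "ess" tone is turned down
```

The 200 Hz “vowel” tone passes through essentially unchanged because its high band never reaches the threshold, which is the whole point: only the offending range is reduced.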
One of the banes of mixing is getting lead vocals, podcasts and voice recordings to have a consistent volume level. We have loads of tools to do this including fader automation, compression, limiting, bussing etc. But the modern tools for doing this include “volume riding” plugins like the ERA 5 Voice Leveler. Instead of simply reducing the volume of loud sounds (compression) or raising the volume of quiet sounds (reverse expansion), volume riding is literally a “phantom hand on a fader” which algorithmically scans a file and implements appropriate volume changes to keep the overall volume (RMS) the same. It makes mixing dynamic vocals and audio clean up a snap.
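As a rough picture of what that “phantom hand on a fader” does, here is a minimal Python sketch of RMS-based volume riding. It is an assumption-laden toy, not Voice Leveler’s algorithm: it measures fixed 50 ms blocks, glides the gain between them in straight lines, and caps the boost so silence is not cranked up. The function name `ride_volume` and all of its parameters are invented for this example.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def ride_volume(samples, rate, target_rms=0.1, window_ms=50.0,
                max_gain=8.0, silence_floor=0.001):
    """Scan the audio in short blocks and nudge each block toward the target RMS."""
    window = max(1, int(window_ms * rate / 1000.0))

    # Pass 1: decide on one "fader position" (gain) per block.
    gains = []
    for start in range(0, len(samples), window):
        level = rms(samples[start:start + window])
        if level < silence_floor:
            gain = gains[-1] if gains else 1.0   # near silence: leave the fader where it was
        else:
            gain = min(max_gain, target_rms / level)
        gains.append(gain)

    # Pass 2: apply the gains, gliding from each block's gain toward the next.
    out = []
    for i, s in enumerate(samples):
        block, offset = divmod(i, window)
        current = gains[block]
        following = gains[min(block + 1, len(gains) - 1)]
        out.append(s * (current + (following - current) * offset / window))
    return out

# A quiet phrase followed by a phrase ten times louder (200 Hz test tones).
rate = 8000
quiet = [0.05 * math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
loud = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
leveled = ride_volume(quiet + loud, rate)

print(round(rms(leveled[2000:6000]), 3), round(rms(leveled[10000:14000]), 3))
```

Both phrases come out at about the same RMS level, even though one started ten times louder than the other. Unlike a compressor, nothing here reacts to individual peaks. The gain simply follows the average level, the way a hand on a fader would.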
Drop Voice Leveler into a channel and you’ve instantly got a near-flat volume mix. Other settings include the incredible ability to control breath noise automatically and add mid/high frequency emphasis to dull vocals. It’s a must-have plugin for anyone doing podcasts or interviews.