If you've heard the term "audio normalization," or been told something like "you should normalize your audio," you may well wonder "what the heck does that mean? Isn't my audio normal? Do I have abnormal audio?" And you would be right to wonder that, because the term is not really very self-explanatory. So what else is new in the audio recording world?
Here is a quick animated gif that kind of explains what normalization is in 2 seconds.
To answer the questions in the title - let's take them one at a time, but in reverse order, because it's easier that way:).
That was easy. Next.
You COULD just use the definition from Wikipedia here. But good luck with that. As is typical with audio terminology, that definition is super confusing.
It's actually pretty easy to understand and not really that easy to put into words. But I'll try. When you "normalize" an audio waveform (the blobs and squiggles), you are simply turning up the volume. Honestly, that's really it. The only question is "how much does it get turned up?"
The answer to THAT takes just a little tiny bit of explaining. First, let's recall that with digital audio, there is a maximum volume level. If the audio is somehow pushed beyond that boundary, the audio gets really ugly because it distorts/clips.
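To make the clipping idea a bit more concrete, here is a minimal sketch in Python. It assumes samples are stored as plain numbers where 1.0 and -1.0 are the maximum levels (a common convention, but an assumption here), and the name hard_clip is just made up for illustration.

```python
def hard_clip(samples, limit=1.0):
    """Flatten anything beyond +/- limit, which is roughly what digital clipping does."""
    return [max(-limit, min(limit, s)) for s in samples]

# A made-up wave that was pushed past the maximum level (assumed to be 1.0).
too_loud = [0.0, 0.6, 1.2, 0.6, 0.0, -0.6, -1.2, -0.6]

clipped = hard_clip(too_loud)
print(clipped)  # the 1.2 and -1.2 peaks are flattened to 1.0 and -1.0
```

Those flattened tops are the part of the wave that got thrown away, and that lost shape is what you hear as distortion.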
This maximum volume level I mentioned is at 0 decibels (or "dB" for short), by the way. Digital audio is upside down. Zero is the maximum. Average levels for music are usually somewhere between about -13 dB and -20 dB. Really quiet levels are down at like -70 dB. As the audio gets quieter, it sinks deeper into the negative numbers.
Yeah, digital audio is weird. But as long as you buy into the fact that 0 dB is the loudest the audio can get before clipping (distorting), you'll know all you need to know.
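If you like seeing the numbers, here is a rough sketch of where those negative dB values come from. It assumes the loudest possible sample has an amplitude of 1.0 (so that is the 0 dB point) and uses the standard 20 × log10 formula for amplitudes. The function name amplitude_to_db is hypothetical, not something from any particular audio program.

```python
import math

def amplitude_to_db(amplitude):
    """Convert a sample amplitude (1.0 = maximum, by assumption) to dB."""
    amplitude = abs(amplitude)
    if amplitude == 0:
        return float("-inf")  # total silence has no finite dB value
    return 20 * math.log10(amplitude)

# Smaller amplitudes sink deeper into the negative numbers.
for amp in (1.0, 0.5, 0.1, 0.0003):
    print(f"{amp:>7} -> {amplitude_to_db(amp):6.1f} dB")
```

The maximum amplitude comes out at 0 dB, half of it lands near -6 dB, and a tiny amplitude like 0.0003 ends up around -70 dB.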
In case you didn't know this, sound/audio is caused by waves in the air. Those waves cause air molecules to vibrate back and forth. The way a microphone is able to pick up audio is that those "air waves" ripple across the surface of a flat thing inside the mic. That causes the flat thing to move back and forth (in the case of a dynamic mic) or causes back-and-forth electrical pressure (in the case of a condenser mic). For more detail on this, check out my post What Is the Difference Between Condenser and Dynamic Microphones?
Audio waves show up in audio software in a weird sort of way. For instance, you might expect the quietest audio to be at the bottom and the loudest to be at the top. But that isn't the way it works with audio.
Because audio comes from waves - the back-and-forth motion of air molecules - the loudest parts of audio are shown at the top AND bottom, and absolute silence is in the middle. Since pictures make things easier, see Figure 1.
So rather than thinking of the maximum allowable volume level as a "ceiling," which is what I was going to do, let's think of it like a swim lane. And in this swim lane, BOTH edges are boundaries not to be crossed.
This also means that when looking for the loudest part of your audio, you have to look both up AND down. Admit it. That's a little weird. As it happens, the loudest part of the audio in our example below is in the bottom part of the audio.
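Here is a small sketch of what "looking both up AND down" means in practice. The sample values are made up for illustration, the maximum level is assumed to be an amplitude of 1.0, and peak_db is a hypothetical helper name. The trick is simply to ignore the sign of each sample before picking the biggest one.

```python
import math

def peak_db(samples):
    """Find the loudest sample, whether it points up or down, and report it in dB."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return float("-inf")  # nothing but silence
    return 20 * math.log10(peak)

# Made-up samples: the biggest swing (-0.5) is on the bottom half of the wave.
samples = [0.10, -0.25, 0.30, -0.50, 0.20]

print(max(samples))                 # 0.3, which misses the real peak
print(max(abs(s) for s in samples)) # 0.5, the real peak
print(round(peak_db(samples), 1))   # about -6 dB
```

Only checking the top half would report 0.3 as the loudest point, when the real peak is the 0.5 swing on the bottom.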
OK, back to normalization. Let's get back to the question of how much audio is turned up when it's being normalized. The normalization effect in audio software will find whatever the loudest point in your recorded audio is.
Once it knows the loudest bit of audio, it will turn that up to 0 dB (if you are going for the greatest amount of normalization). So the loudest part of your audio gets turned up to as loud as it can be before clipping.
Let's say the loudest part of your audio, which is a vocal recording in our example, is a part where you shout something. See Figure 2 for an example.
And let's say that shout is measured at -6 dB. The normalization effect will do some math here.
It wants to turn up that shouted audio to 0 dB. So it needs to know the difference between how loud the shout is, and the loudest it could possibly be before distorting.
The simple math (well, that is if negative numbers didn't freak you out too much) is that 0 minus -6 equals 6. So the difference is 6 dB.
Now that the software knows to turn up the loudest part of the audio by 6 dB, it then turns EVERYTHING up by that same amount.
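The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular audio program does it. It assumes samples are plain numbers with 1.0 as the maximum level, and the function name normalize_peak and the sample values are invented for the example.

```python
import math

def normalize_peak(samples, target_db=0.0):
    """Raise or lower every sample by the same amount so the peak hits target_db."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples), 0.0          # pure silence: nothing to turn up
    peak_level_db = 20 * math.log10(peak)  # how loud the loudest bit is
    gain_db = target_db - peak_level_db    # e.g. 0 - (-6) = 6 dB
    factor = 10 ** (gain_db / 20)          # the same gain as a plain multiplier
    return [s * factor for s in samples], gain_db

shout = 10 ** (-6 / 20)                    # an amplitude that measures -6 dB
samples = [0.05, -0.20, shout, -0.10]      # made-up vocal with one shout

louder, gain_db = normalize_peak(samples)
print(round(gain_db, 2))                   # the gain applied to EVERYTHING
print(max(abs(s) for s in louder))         # the shout now sits at the maximum
```

Notice that the quiet samples get the same 6 dB boost as the shout. Nothing about the balance between loud and quiet parts changes.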
Let's look at a "before" picture of our example audio.
And here is what it looks like AFTER normalization.
While most people use normalization to raise the overall volume of their audio, you can ALSO turn audio DOWN with normalization. Remember, all the program is doing is changing the level of the loudest bit of audio to a target you choose, and changing all the rest of the audio by that same amount.
So that means if you place the setting to a target that is LOWER than the loudest part (the shout in our example above), the normalizing will turn everything down by the amount it takes for the loudest part to meet the target.
So let's say the loudest part of your audio - the shout - is at -2 dB. If you set your normalization target to -3 dB, then the effect will LOWER everything by 1 dB, which is the amount of reduction you need to get your -2 dB peak down to -3 dB.
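Here is that same -2 dB to -3 dB example as a quick sketch. As before, it assumes a maximum amplitude of 1.0, and the function name normalize_to_target and the sample values are hypothetical. The only difference from turning audio up is that the gain comes out negative.

```python
import math

def normalize_to_target(samples, target_db):
    """Scale every sample by one common amount so the peak lands on target_db."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples), 0.0
    gain_db = target_db - 20 * math.log10(peak)  # negative means "turn down"
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples], gain_db

shout = 10 ** (-2 / 20)                # an amplitude that measures -2 dB
samples = [0.1, -0.3, shout, -0.2]     # made-up audio with a -2 dB peak

quieter, gain_db = normalize_to_target(samples, target_db=-3.0)
print(round(gain_db, 2))               # -1.0, so everything drops by 1 dB
```

A gain of -1 dB works out to multiplying every sample by roughly 0.89, so the whole recording gets slightly quieter and the shout ends up at the -3 dB target.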
So why would you want to do this? Certain services have maximum loudness standards. For example, if you are recording an audiobook for Audible (using their ACX marketplace), they will not accept audio that has a peak level above -3 dB. That means our audio with the shout that goes as high as -2 dB breaks their rule because that is louder than -3 dB. So you can use normalization to reduce your loudest peak by setting the target to just under -3 dB, like say -3.01 dB. (Remember, with these negative numbers, "under" means further from zero.)
A normalization effect might offer percentages as targets as well as specific dB targets. For example, normalizing to 100% is the same as raising the volume to the maximum of 0 dB.
A 50% target will be roughly equivalent to -6 dB. Yeah, another thing about measuring audio is that it isn't linear. For example, a 30% target is about -10 dB. 90% is about -1 dB. 80% is about -2 dB. 70% is about -3 dB, 60% is about -4.5 dB, etc.
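If you want to check those numbers yourself, here is a small sketch. It assumes the percentage means "percent of the maximum amplitude," which matches the figures above but may not be how every program defines it. The function name percent_to_db is made up for this example.

```python
import math

def percent_to_db(percent):
    """Convert a percentage of maximum amplitude (an assumption) to dB."""
    if percent <= 0:
        return float("-inf")  # 0% is silence
    return 20 * math.log10(percent / 100)

for pct in (100, 90, 80, 70, 60, 50, 30):
    print(f"{pct:>3}% -> {percent_to_db(pct):5.1f} dB")
```

The results are not evenly spaced: going from 100% down to 90% costs less than 1 dB, while going from 50% down to 30% costs more than 4 dB.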
I know. More weirdness.
The point here is that if you use a percentage, you might be setting the max loudness to lower than you think. If you have a choice to specify the actual dB level, do that. It will make things much easier to understand.
Yeah. That's it. Remember I said that audio normalization was really just turning it up? I know it took a fair amount of explaining, but yeah. The only reason to normalize your audio is to make sure that it is loud enough to be heard. That could be for whatever reason you want.
As I have preached again and again, noise is the enemy of good audio. Before you even normalize your audio, you'll want to be sure you've gotten rid of (or prevented) as much noise as possible from being in your audio recording. See my post series on how to do that here: Improve The Quality Of The Audio You Record At Home.
The reason it's so important is that by normalizing your audio, you are turning it up, as we have seen. But any noise that is present in your recording will ALSO get turned up by the same amount. So be very careful of that when using this tool. Audio normalization is powerful, and so can also be dangerous.
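A little arithmetic shows why. In this sketch the -12 dB peak and the -60 dB noise level are invented numbers, and after_normalizing is a hypothetical helper. It just applies the same gain to both the peak and the noise, which is what normalization does.

```python
def after_normalizing(peak_db, noise_db, target_db=0.0):
    """Return the new peak level, new noise level, and gain (all in dB)."""
    gain_db = target_db - peak_db          # the one gain applied to everything
    return peak_db + gain_db, noise_db + gain_db, gain_db

# Made-up recording: loudest peak at -12 dB, background noise at -60 dB.
new_peak, new_noise, gain = after_normalizing(-12.0, -60.0)

print(gain)       # 12.0 dB of boost
print(new_peak)   # 0.0 dB, as loud as it can go
print(new_noise)  # -48.0 dB, so the noise came up by 12 dB as well
```

The gap between the peak and the noise is 48 dB before and after, so normalizing never makes a recording cleaner. It only makes whatever is there, noise included, louder.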