ACX Audio Submission Requirements: What The Heck Do They Mean?
Trying to record an audiobook for the Audible Creation Exchange? Are you confused or frustrated by the "ACX Audio Submission Requirements"? You might wonder "why do I keep failing the ACX Check?" You are not alone!
ACX is Amazon's marketplace for audiobooks. If you're an author, you can find someone on the platform to record your book so you can have an audiobook version of it. If you are a voice over actor, you can sign up to be the one recording those audiobooks and getting paid for it. Also, if you've already created an audiobook, you can sell it on ACX.
If you use the free Audacity software for recording, you can download ACX Check here. It's a plugin specifically for Audacity that will tell you all the measurements you need to meet the ACX requirements.
Those Confusing ACX Audio Requirements
This article is for the folks trying to record an audiobook for ACX. It seems simple enough. Get a microphone. Attach it to your computer. Download the free Audacity recording software (see our course - The Newbies Guide To Audio Recording Awesomeness 1 for more on Audacity). Record the book.
But then you try to submit the audio and you are told it doesn't meet their requirements. Now what? When you go to their page on what those requirements are, you're presented with a bunch of technical jargon that may not make much (or any) sense to you whatsoever. The requirements are:
- 192 kbps or higher MP3, constant bit rate (CBR) at 44.1 kHz
- Average loudness must be between -23dB and -18dB RMS
- No peak values can exceed -3dB
- Have a maximum -60dB noise floor
Gah!! What? OK, if you are into audio recording, or are a recording engineer, you might know what all this means. But for the typical non-audio person, this can be super confusing. So let's just take these one by one.
The MP3 Stuff
Any program that allows you to export or save audio as an MP3 will have these options available for you to just put a check mark in. So you absolutely do not need to understand what it all means. Just make sure you choose these.
The only exception is the "44.1 kHz" thing, which needs to be set BEFORE you record. In Audacity, it's the default. So you don't really need to worry about it. See the pic on the right for the other settings in Audacity.
Being between -23dB and -18dB RMS
This is average volume. Seems simple when you say it like that, right? And it makes sense to want the average volume to be in a certain range so you can have consistency. So what does this particular set of numbers and letters mean?
First, let me clarify that digital audio is measured in negative numbers. One of the many weird things about it. Remember back to school math when you did negative numbers? It's like that. For audio, 0dB (dB = "decibels") is the maximum. It's the ceiling. Anything higher than that (anything in positive numbers) gets distorted and sounds awful. So audio at, say, -2dB is much LOUDER than audio at -18dB. Once you understand that, things get easier.
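If it helps to see actual numbers, here is a small Python sketch of how a dB value maps to signal size. It assumes the usual digital-audio convention where 0dB is a full-scale amplitude of 1.0 and every 20dB is a factor of ten. The function name is made up for illustration and has nothing to do with Audacity itself.

```python
def db_to_amplitude(db):
    """Convert a level in dB (relative to full scale) to a linear amplitude.

    Assumption: 0 dB corresponds to an amplitude of 1.0 (the ceiling).
    """
    return 10 ** (db / 20)


# Print a few of the levels mentioned in this article
for level in (0, -2, -3, -18, -23, -60):
    print(f"{level:>4} dB -> amplitude {db_to_amplitude(level):.4f}")
```

Under that assumption, -2dB works out to roughly 0.79 of full scale and -18dB to roughly 0.13, which is why -2dB is so much louder.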
Also, "RMS" (stands for root mean square) is just one of the ways average audio levels are calculated. Don't worry about what it means.
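If you ARE curious, though, the name spells out the recipe: square every sample, take the mean of those squares, then take the square root. Here is a rough Python sketch of that idea. It assumes samples are plain numbers between -1.0 and 1.0, and it may not match exactly how ACX Check or any other tool does its measuring.

```python
import math


def rms_db(samples):
    """Return the RMS level of a non-empty list of samples, in dB.

    Assumption: samples are floats between -1.0 and 1.0 (1.0 = full scale).
    """
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure digital silence has no finite dB value
    return 20 * math.log10(math.sqrt(mean_square))


# A made-up signal that swings between +0.1 and -0.1
print(round(rms_db([0.1, -0.1, 0.1, -0.1]), 1))
```

That made-up signal comes out at about -20dB, which would sit inside the -23dB to -18dB zone.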
So ACX audio must have an average level of at least -23dB. But it cannot be louder than -18dB. How do you get your average volume in the zone?
Well, there are several ways. The tools you can use include a combination of the Normalize and Amplify effects in Audacity. But to really understand this, it helps to think about your school years again.
Remember how to get an average? If you have 5 numbers, say 7, 3, 8, 7, and 6. You find the average by first adding them up 7+3+8+7+6 = 31. Then you just divide that total by however many numbers there are. There are 5 numbers. So 31 divided by 5 gives us an average of 6.2.
If someone told you that you needed an average of between 7 and 9, what could you do? Well, how about making one or two of those numbers bigger? If you change the 3 to an 11 (so you have 7, 11, 8, 7, and 6), your average goes up to 7.8. And that would be great. We're between 7 and 9.
With your audio loudness, if you find out that the average level is too low (say, -25dB), you would know that you need some higher numbers, just like in our above example. You do that by turning up some or all of the audio. In Audacity, you can do this using the "Amplify" control. It might take some trial and error. And even if you get your average in between -23dB and -18dB, you might then run into the problem of having a peak value that is too loud somewhere, which leads us to...
Peaks Cannot Exceed -3dB
Let's go back to our 5-number average calculation from above. Just for fun, we said the average of our 5 numbers needed to be between 7 and 9. And we adjusted numbers higher or lower until we got an average into the zone between 7-9.
But what if I ALSO told you that none of the 5 numbers could be higher than 10? We would have passed the test for AVERAGE value. But we would fail on PEAK volume because one of the numbers (the 11) was louder than our maximum of 10.
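For anyone who likes to see it run, here is the five-number example as a few lines of Python. The limits (an average between 7 and 9, nothing above 10) are just the made-up rules from this example, not ACX numbers, and the function names are only for illustration.

```python
def average(numbers):
    """Add the numbers up and divide by how many there are."""
    return sum(numbers) / len(numbers)


def passes(numbers, avg_low=7, avg_high=9, max_allowed=10):
    """Apply the two made-up rules: average in range, no number too big."""
    avg_ok = avg_low <= average(numbers) <= avg_high
    peak_ok = max(numbers) <= max_allowed
    return avg_ok, peak_ok


print(average([7, 3, 8, 7, 6]), passes([7, 3, 8, 7, 6]))    # original numbers
print(average([7, 11, 8, 7, 6]), passes([7, 11, 8, 7, 6]))  # the 3 changed to an 11
```

The original numbers fail the average rule but pass the maximum rule. The adjusted numbers do the opposite, which is exactly the bind described above.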
What do we do then? Well it's a bit more trial and error. You need to find the peaks in your audio that are louder than -3dB (between -3dB and 0dB), and turn THOSE down. If you were doing that manually, you'd go through the audio waveform and see where it got loud. The waveform would be quite close to the edge.
Then you could just select that section and lower the volume using the "Amplify" effect. Do this for every peak that is louder than -3dB, and you will pass the peaks test.
One quicker way to do this would be to use the Normalize effect (choose Normalize from the Effect drop-down menu). If you set "Normalize peak amplitude to:" at -3.0 dB, Audacity will find the loudest part of your audio, and turn everything up (or down!) until that loudest peak measures -3.0 dB. That way you know that nothing in your recording is louder than -3.0 dB.
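To show the idea behind that, here is a minimal Python sketch of peak normalization. It is only an illustration of the description above, not Audacity's actual code, and it assumes samples are numbers between -1.0 and 1.0 with 0dB equal to 1.0.

```python
def normalize_peak(samples, target_db=-3.0):
    """Scale every sample by the same factor so the loudest one hits target_db.

    Assumption: samples are floats where 1.0 is full scale (0 dB).
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # nothing but silence, so nothing to scale
    target_amplitude = 10 ** (target_db / 20)
    scale = target_amplitude / peak
    return [s * scale for s in samples]


quiet_take = [0.2, -0.5, 0.1]          # made-up samples, loudest is 0.5
louder_take = normalize_peak(quiet_take)
print(max(abs(s) for s in louder_take))  # loudest sample after scaling
```

Because every sample gets the same scale factor, the average level moves by the same number of dB as the peak does.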
However, by lowering volume in those areas, you will lower the overall average volume of the whole thing! It would be like making some of our 5 numbers smaller. So now what? Sheesh!
Run the ACX Check again and see if your average still passes. If it does, that's terrific. Your average AND peak values now pass.
If the average volume is too low now though, you'll need to raise the volume of all your audio. So highlight all your speaking parts and raise the level using the Amplify effect again. Just make sure you don't raise it so high that something pokes above -3dB. See what I mean by trial and error?
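You can take some of the guesswork out of that trial and error with a little arithmetic. The sketch below assumes you apply one Amplify setting to the whole recording, so the average and the loudest peak both move by the same number of dB. Given the two readings from your check, it works out which gain values (if any) satisfy both limits. The function name and the sample readings are made up for illustration.

```python
def amplify_range(rms_db, peak_db, rms_low=-23.0, rms_high=-18.0, peak_max=-3.0):
    """Return (min_gain, max_gain) in dB that meets both limits, or None.

    Assumption: the same gain is applied to the entire recording, so the
    RMS average and the loudest peak both shift by exactly that many dB.
    """
    min_gain = rms_low - rms_db                            # lifts the average to -23
    max_gain = min(rms_high - rms_db, peak_max - peak_db)  # stay under -18 and -3
    if min_gain > max_gain:
        return None
    return (min_gain, max_gain)


print(amplify_range(rms_db=-25.0, peak_db=-6.0))  # a gain window exists
print(amplify_range(rms_db=-28.0, peak_db=-4.0))  # no single gain works
```

When the result is None, no single Amplify value will pass both tests. In that case you would need to turn down the individual loud peaks first, as described above, and then measure again.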
Have a Maximum -60dB Noise Floor
Again, this is not as confusing as it sounds. All it means is that the quiet parts where you're not talking (or breathing) - the spaces between the phrases - need to be really quiet. They don't want a lot of background hiss, or computer fan noise, etc. And there obviously should not be any barking dogs, purring cats, lawn mowers, leaf blowers, etc.
The best way to do this is to make sure you have a quiet environment when you record. It doesn't have to be dead-silent. But it does have to be quieter than -60dB.
For most of us recording at home, there will be SOME noise either from your mic (especially USB mics) or your computer or the A/C, etc. I highly recommend using noise reduction to deal with that. See our post New Noise Reduction Tool In Audacity for how to do this. There's a video on that page too.
After running noise reduction, take a listen to your audio to make sure it still sounds good. If you overdo the amount of noise reduction you apply, or if the noise was too loud in relation to your voice, then your audio might sound weird afterward.
If that happens, Undo the noise reduction and run it again, lowering the number in the Noise Reduction amount box in Audacity. This probably won't be an issue as long as the noise was low enough to start with.
Putting it All Together
There is a free plugin for Audacity that you can use to check and see if your audio passes all the tests. It's called ACX Check. You can find it here. Instructions for installing Audacity plugins are here.
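To make the pass/fail logic concrete, here is a tiny Python sketch of the comparison step, using the three level limits listed near the top of this article. It is not the ACX Check plugin's code, and it assumes you already have the three measurements in dB. The function name and the sample readings are hypothetical.

```python
def check_acx(rms_db, peak_db, noise_floor_db):
    """Compare three measurements (all in dB) against the limits in this article."""
    return {
        "rms": -23.0 <= rms_db <= -18.0,       # average loudness in the zone
        "peak": peak_db <= -3.0,               # nothing louder than -3dB
        "noise_floor": noise_floor_db <= -60.0,  # quiet parts at -60dB or lower
    }


# Made-up readings: too quiet on average, one peak too loud, noise is fine
result = check_acx(rms_db=-25.0, peak_db=-2.1, noise_floor_db=-64.0)
print(result)
```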
Regardless of all these requirements, I highly recommend you not let them affect your performance while recording. If you think too much about things like that, it will be hard to focus on your performance.
So just do a level check before you start, to make sure your audio is loud enough but is not too loud (goes beyond the edges). Then just concentrate on your performance until you are done. Then AFTER you're done recording, use the above advice to make sure your audio meets ACX requirements.
I'll put together more stuff in the coming days here to help give you some details on how to do some of this stuff. But hopefully this post will help you to at least understand all the ACX stuff a little better.
THANK YOU FOR YOUR HELP! A year ago I decided to change directions, diversify (reinvent myself AGAIN) & start creating audio books as a new way to supplement my income (musician/songwriter). I was diligently working on every detail like a full time job but I could never hit the mark with the ACX requirements. I had to step away from the entire project due to a series of family emergencies which required us to move house, uprooting the studio and of course the computer which I had been working on gave up the ghost so... Moving my DAW to a new tower too. I find technical things a bit of a yawn but I do what I have to do. It has always been a creative killer for me. So... JUST yesterday, after finding your help here, I FINALLY have my first submission ready to upload. I am pretty happy and wanted to take the time to say THANK YOU. You made it very clear and easy to understand! CHEERS!
Hi Liz. Thanks for letting me know this. I am thrilled that you found it so helpful! Best of luck on your audiobook!
Hello, I’m just starting out in voice-overs and audiobook narration and got a Blue Yeti mic to start with. However, I recently saw something on the ACX site that said that ACX would not accept VO done with USB mics, but I can’t find it again to refer to. Can you help clarify for me please?
Thanks for the useful advice in the article!
Hi Rebecca. That is really interesting. I doubt this though, for no other reason than - how would they know what kind of mic was used? Now I can certainly see them saying something like we want the best quality audio, and too often, people use cheap USB mics like gaming headsets or karaoke mics, etc, which often sound terrible. So they might recommend not doing that. But it is VERY possible to record professional quality audio if you have a DECENT USB mic. I would put the Yeti in that category. I have done paid VO work using a Samson C01U, which is another large diaphragm condenser mic.
When audio is submitted via ACX, it has to pass a bunch of standards - as you probably know from the article above. But that stuff has more to do with making sure things aren't too loud or too noisy. One thing that many USB mics have in common is that they can be a bit noisy. Usually this is a steady hiss in the background. THAT background hiss COULD be too noisy for their noise floor measurement. But I ALWAYS recommend doing noise reduction on audio before submitting or using it for anything. And that steady kind of consistent hiss is the easiest kind of noise to get rid of using standard noise reduction. And that makes it easy to pass their noise floor test. So that is the only thing I could possibly imagine them referring to with USB mics not being recommended.
But there is no way for anyone to know if a finished recording came from a standard condenser mic recorded with an interface unit, or a USB mic. So I would not worry about that at all.
I hope that helps!
Your article was very good and very helpful. Thank you. I am having a problem with my noise floor rising when I adjust the RMS and Peak to fit ACX requirements. My noise floor is below -60 dB before and during recording, but once I adjust the RMS and Peak it jumps up into the -40s dB.
Hi Jenny. I think if you do noise reduction as your very first step, then the noise floor will remain well below the max. Have you tried that?
You lost me at the Math. How to get an average...... I hated Math in school and any mention of it sends me into the fetal position. I suppose I will just have to figure it out by trial and error. I have one book in production but I am fairly sure ACX will send it back for me to adjust, although they stated there were no issues with the original uploads. Thanks anyway.
You don't really NEED to know how to calculate an average for this. The ACX Check plugin will tell you what the average is. So if it's too high, it would be helpful to be able to see your audio "blobs" (the waveform) in Audacity and just check if there are any areas that are much bigger (louder) than the rest of the waveform. If so, you can lower the average JUST by lowering the area (or areas) that is much louder than the rest. This is better than trying to lower the volume of the entire audio file/waveform, because then the entire thing could be hard to hear just because a few loud words were too loud. I hope that makes sense.
Hi, Thank You For This! For in-between sentences and on the breaths, I like to go into EFFECTS and use the FADE OUT button on the breaths 2-3 times in-between sentences. (It gives the recording a real smooth sound.) Question: do you think this will pass the ACX standards? (I know it says to use the quiet part of the audio and use that on the breaths and in-between the sentences.) I just haven't installed the ACX plug-in yet on Audacity, I'm trying to figure that out! Thank you.
Hi Chris. Unless you have super loud background noise in your room, and the difference between silence and that noise behind the voice is jarring, then I just go with silence as my "quiet part" or "room sound." And that does pass the ACX check. The only requirement for that, though, is to put some of it before the audio starts and after it finishes. The only thing ACX checks for in between audio is the noise floor, and it averages that. So it would be fine to do as you say. I use the fade in and fade out tools so often for breaths that I have them mapped to my keyboard as A and S (fade in and fade out).
The ACX Quality Assurance team may reject titles that do not meet these standards, and their retail release may be delayed. The following requirements help ensure customers get a great listen. Consistency in audio levels, tone, noise level, spacing, and pronunciation gives the listener an enjoyable experience.
Are you aware of a dynamics processor plugin for Pro Tools (or otherwise), with a preset that will automatically compress to -23 to -18, with peaks not to exceed -3?
Yes. You can do it with Audacity and Reaper and just about any software, really - with just stock plugins. Here is one for Audacity: https://youtu.be/LaqJDTGZ_Hs. And here is one for Reaper: https://youtu.be/rBsWKJkDtBg.