Most microcontrollers have an ADC (Analog to Digital Converter) to get audio into your device, but how do you get audio out? You could use a CODEC to do the job, but this is more costly and more difficult. For a lot of applications, the onboard PWM (Pulse Width Modulation) is good enough. We will show how to use the PWM feature of the Arduino (ATmega328 microcontroller) to generate high quality audio, with a minimum of components. We will then go into all the gory details of how to optimize the settings to make the best DAC (Digital to Analog Converter) for your application.
1. What is PWM, and how does it work?
The most common microcontroller feature is a timer. This is a simple internal clock that counts up to some number, and then goes back down to zero. PWM is generated by these timers, by having an external pin go high when the timer hits zero, and then go low at some other number, which you can vary. In this manner, you can make an external pin stay high for a specific amount of time, without having to manually toggle it with the code. If you low-pass filter the output of this pin, you get the average value, which is proportional to how long it was high. A typical PWM output is shown below, along with its average value (in red).
The period length is the same for each PWM value, and is set by the TOP or MAX value that the counter counts to. This also sets the resolution of your PWM output. For example, if your TOP value is set to 255, the output will go high at 0, and then low again at some number between 0 and 255, so you get 256 possible positions, or 8 bit resolution (2^8 = 256). TOP also sets your PWM frequency (sometimes called the carrier frequency), as this is the number the counter must reach before it resets to zero. So, the higher the TOP value, the higher the bit depth, but the lower the PWM frequency (longer count time). This is the fundamental trade-off with PWMs: as you go faster, you get less precision. This is exactly the same problem we showed with the ATmega ADC. What follows is information that will help you decide what frequency is right for your application.
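As a rough illustration of the averaging and resolution described above, here is a minimal C sketch. The function names `pwm_average_volts` and `pwm_levels` and the supply voltage parameter are assumptions made for this example. It also assumes an ideal low-pass filter, and a pin that is high for `compare` ticks out of every `TOP + 1` ticks.

```c
/* Ideal average (low-pass filtered) voltage of a PWM output.
 * Assumes the pin is high for `compare` ticks out of every `top + 1`
 * ticks, and that the filter removes the carrier completely. */
double pwm_average_volts(unsigned compare, unsigned top, double vcc)
{
    return vcc * (double)compare / ((double)top + 1.0);
}

/* Number of distinct output positions for a given TOP value. */
unsigned long pwm_levels(unsigned top)
{
    return (unsigned long)top + 1UL;
}
```

With TOP = 255 and a 5V supply, a compare value of 128 averages to 2.5V, and there are 256 possible positions.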
2. PWM frequency / noise floor trade-off.
The smallest signal you can hear is determined by your noise floor. This is the low level “hiss” that you hear in the background of most signals. For audio applications, you want this to be below the level detectable by the ear. But, this is often not achievable with a PWM generator. Sometimes this noise is interesting, and sounds good – for example, 8 bit video game music. But, if you want “CD Quality” audio, you will need 16 bits of data. Exactly what level of noise is acceptable is up to you, but to get a lower noise floor, you need to use more bits, and the exact amount is set by the equation:
SNR(dB) = (Bit Depth)*6.02dB + 1.76dB.
A good explanation of this equation is given in this Analog Devices App. Note.
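To put numbers on the equation above, here is a small C sketch. The function names `snr_db` and `bits_for_snr` are made up for this example, and the result is the ideal figure for a perfect converter, not what a real PWM circuit will measure.

```c
/* Ideal signal-to-noise ratio in dB for a given bit depth,
 * using SNR = 6.02 * bits + 1.76 from the text. */
double snr_db(unsigned bits)
{
    return 6.02 * (double)bits + 1.76;
}

/* Smallest bit depth whose ideal SNR meets or exceeds target_db. */
unsigned bits_for_snr(double target_db)
{
    unsigned bits = 1;
    while (snr_db(bits) < target_db)
        bits++;
    return bits;
}
```

For example, 8 bits gives roughly 50dB and 16 bits roughly 98dB, so a 60dB target would need 10 bits.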
To get more bits, you have 2 options: lower your PWM frequency, or increase the number of PWMs you use. Ultimately, the PWM frequency must be at least twice the highest frequency of interest (due to the Nyquist sampling theorem). Furthermore, if the PWM frequency is in the audible range (less than 20kHz) you will need to filter it heavily to not hear a high pitched squeal behind all your sounds. This sets a hard ceiling on how many bits you can achieve with your Arduino without having to add a ton of extra circuitry.
Inside the ATmega328, the timers can be clocked as fast as the system clock (16MHz for the Arduino, 20MHz max otherwise). There are other microcontrollers in the Atmel line that use an internal PLL to achieve a higher timer clock (AT90PWM can run at 64MHz). 16MHz/20kHz = 800 counts per period, so you can get a maximum of 9 bits (TOP = 511, since 2^9 = 512 fits within 800 but 2^10 = 1024 does not) before you need more circuitry. At 20MHz you could probably get away with 10 bits (TOP = 1023, giving about 19.5kHz), and at 64MHz you could go all the way up to 11 bits (TOP = 2047). The graph below shows the bit depth/frequency trade-off for various options with a 16MHz CPU clock. If you want to calculate the PWM frequency for a different setup, it is Fpwm = (Fcpu/m)/(2^(B/n)) where B is the bit depth you want, m is the slope number (1 for Fast PWM, 2 for Phase Correct), and n is the number of PWMs used.
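The formula above is easy to turn into a calculator. The following C sketch is one way to do it. The function names `pwm_frequency_hz` and `max_bits` are assumptions made for this example, and it assumes the bit depth divides evenly between the PWMs (otherwise it rounds the per-PWM bit count up).

```c
/* PWM frequency from the formula Fpwm = (Fcpu/m) / 2^(B/n).
 * slopes:  1 for Fast PWM, 2 for Phase Correct (m in the text)
 * bits:    total bit depth B
 * n_pwms:  number of PWM outputs summed together (n in the text)
 * Assumes bits divides evenly by n_pwms; otherwise the per-PWM
 * bit count is rounded up. */
double pwm_frequency_hz(double f_cpu_hz, unsigned slopes,
                        unsigned bits, unsigned n_pwms)
{
    unsigned bits_per_pwm = (bits + n_pwms - 1) / n_pwms;
    return f_cpu_hz / (double)slopes / (double)(1UL << bits_per_pwm);
}

/* Largest total bit depth that keeps the PWM frequency at or above
 * f_min_hz (which must be greater than zero). */
unsigned max_bits(double f_cpu_hz, unsigned slopes,
                  unsigned n_pwms, double f_min_hz)
{
    unsigned bits = 0;
    while (pwm_frequency_hz(f_cpu_hz, slopes, bits + n_pwms, n_pwms) >= f_min_hz)
        bits += n_pwms;
    return bits;
}
```

With a 16MHz clock, 8 bit Fast PWM comes out at 62.5kHz and 8 bit Phase Correct at 31.25kHz. Note that `max_bits` applies the 20kHz limit strictly, so at 20MHz it reports 9 bits, since 10 bits lands just under the limit at about 19.5kHz.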
So what are all those different options shown above? First off, we have Fast PWM (Single slope) versus Phase Correct PWM (Dual Slope). With Fast PWM, the counter will increase to TOP, and then reset to zero, whereas Phase Correct PWM will reach TOP, and then count backwards to zero, where it will count up again. Phase Correct takes twice as long to complete a cycle, so it will only go half as fast for any given bit depth, but it is much, much higher fidelity, as will be explained later on. Single versus Dual PWMs refer to how many PWM outputs you sum together to create your analog signal, and that will be explained next.
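To make the two counting schemes concrete, here is a simplified C model of the counter over one period. The function name `counter_period` is made up for this example, and the model only lists counter values. It does not model the output pin or any real timer hardware.

```c
#include <stddef.h>

/* Fill buf with one period of counter values and return the period
 * length in timer ticks (never more than cap).
 * Fast PWM:      0, 1, ..., TOP, then reset.
 * Phase Correct: 0, 1, ..., TOP, TOP-1, ..., 1, then up again. */
size_t counter_period(unsigned top, int phase_correct,
                      unsigned *buf, size_t cap)
{
    size_t n = 0;
    for (unsigned c = 0; c <= top && n < cap; c++)
        buf[n++] = c;                       /* count up from 0 to TOP */
    if (phase_correct)
        for (unsigned c = top; c-- > 1 && n < cap; )
            buf[n++] = c;                   /* count back down from TOP-1 to 1 */
    return n;
}
```

With TOP = 3, Fast PWM gives 0, 1, 2, 3 (4 ticks) and Phase Correct gives 0, 1, 2, 3, 2, 1 (6 ticks). In this model the dual slope period is 2 x TOP ticks against TOP + 1 for single slope, which approaches the factor of two described above as TOP gets large.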
3. Dual and triple PWMs.
A single PWM output scheme uses just one output pin to create its analog signal. But, you could just as easily create a second analog signal and add it to the first one to get a higher resolution signal. And, by selecting your summing ratio correctly, you can make this second signal much smaller than the first, so that it represents the lower order bits. For example, if your resistor ratio were 1:256, then the first PWM could be the high 8 bits, and the second PWM could be the lower 8 bits, for a total of 16 bit resolution! An example of single and dual PWMs is shown below.
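In code, the dual PWM idea amounts to splitting each sample in two. The following C sketch shows the 8 + 8 bit case. The function names `split_sample` and `dual_pwm_fraction` are assumptions made for this example, and the second function assumes an exact 256:1 mixing ratio and ignores the overall gain of the resistor network.

```c
#include <stdint.h>

/* Split a 16 bit sample into two 8 bit PWM compare values. */
void split_sample(uint16_t sample, uint8_t *high, uint8_t *low)
{
    *high = (uint8_t)(sample >> 8);    /* upper 8 bits, summed at full weight */
    *low  = (uint8_t)(sample & 0xFF);  /* lower 8 bits, summed at 1/256 weight */
}

/* Ideal summed output as a fraction of full scale (0.0 up to just
 * under 1.0), assuming the two filtered outputs are mixed in an
 * exact 256:1 ratio. */
double dual_pwm_fraction(uint8_t high, uint8_t low)
{
    return ((double)high + (double)low / 256.0) / 256.0;
}
```

For a sample of 0xABCD, the first PWM gets 0xAB and the second gets 0xCD, and the ideal summed output is 0xABCD/65536 of full scale.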
This method of PWM mixing can be taken to triple and quad implementations, but the benefits quickly begin to dwindle. If you are interested in building one of these circuits, we have done an in-depth analysis of higher order PWMs, showing how to pick values, and listing common pitfalls. We also found an amazing version that went into space. So follow the preceding link to learn more!
4. Distortion in PWMs.
Just because you are getting a lot of bits at a frequency over 20kHz doesn’t mean you will have good sounding audio! Harmonic distortion and other unwanted frequencies can quickly destroy your signal. Luckily, there are things you can do to reduce these pesky problems. The first is to operate at a higher PWM frequency. PWM can be thought of as amplitude modulation (AM) of your carrier (PWM) frequency. As with any AM method, you end up creating sidebands spaced at multiples of your signal frequency. So, if you have a 30kHz carrier, and you are creating a 5kHz tone, you will have signals at 30kHz +/- 5kHz, 10kHz, 15kHz, etc. These sidebands decrease in amplitude the further away they get from the carrier, but they still exist. Usually it takes around 5 or 6 sidebands before they drop below the noise floor. So, for the above example, you will have 30kHz, 25kHz, 20kHz, 15kHz, 10kHz, and 5kHz signals generated, half of which are in the audible range. When your carrier is an exact multiple of your signal frequency (as above), this problem is really bad, as the signal’s harmonics add with the carrier’s sidebands.
Pushing your carrier to 60kHz would fix the above example, as the sidebands would fall at 55kHz, 50kHz, 45kHz, 40kHz, and 35kHz, none of which are in the audio band. A good rule of thumb is that the carrier should be 7 times higher than your largest frequency of interest (to keep the sidebands out of your band of interest). So if you want the full 20kHz, you need to have a PWM frequency of 140kHz, which is rarely attainable in practical applications. For really high fidelity applications, like Class D amplifiers, the carrier is usually between 200kHz and 500kHz. But, as with all things, you should test out what you can attain, and see if it is good enough.
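The sideband arithmetic above can be checked with a few lines of C. This is only a sketch of where the lower sidebands land, not of how loud they are. The function name `lower_sidebands` and the choice to count anything strictly below the limit as audible are assumptions made for this example.

```c
#include <stddef.h>

/* Write the first `count` lower sidebands (carrier - k * signal,
 * k = 1..count) into out[] and return how many of them fall below
 * audible_limit_hz. out[] must have room for `count` values. */
size_t lower_sidebands(double carrier_hz, double signal_hz, size_t count,
                       double audible_limit_hz, double *out)
{
    size_t audible = 0;
    for (size_t k = 1; k <= count; k++) {
        double f = carrier_hz - (double)k * signal_hz;
        if (f < 0.0)
            f = -f;               /* a negative frequency folds back to positive */
        out[k - 1] = f;
        if (f < audible_limit_hz)
            audible++;
    }
    return audible;
}
```

For a 30kHz carrier and a 5kHz tone, the first five lower sidebands are 25kHz down to 5kHz, three of which are below 20kHz. Moving the carrier to 60kHz puts all five at 35kHz or above. With a 140kHz carrier and a 20kHz tone, even the sixth sideband only comes down to 20kHz, which is where the factor of 7 comes from.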
The second effect of pushing your PWM frequency higher is that you increase the time resolution of your signal, and greatly reduce its harmonic distortion. To find out more about these effects, and details on the frequency/distortion trade-off, check out our PWM distortion analysis page. It is chock-full of practical information. And really, how else will you find out how far the rabbit hole goes?
5. Final Considerations.
There are a number of things to take into account when selecting your PWM scheme: number of PWMs summed, PWM frequency, Fast versus Phase Correct, etc. But, how do you select the perfect sweet spot for your application? Probably the best starting point is your data generation rate. How quickly will a new sample be ready for output? If you are doing straight playback, you can go to relatively high frequencies, as the microcontroller won’t need to do any calculations before outputting the data. But, if you are processing the audio, or generating it on the fly, as is done in an effects pedal or synthesizer, then you want as low a playback frequency as possible, in order to maximize the data processing time. For an 8 bit microcontroller, this is probably best kept between 30kHz and 80kHz (the latter leaving only 200 clock cycles per sample at 16MHz).
You can double your sample playback speed, and repeat samples (effectively oversampling the data), but unless you are very clever about the implementation and avoid servicing an interrupt to output the repeated sample, you will end up wasting too much time in interrupts, with little left over for processing. Furthermore, this oversampling technique does not simply double the PWM frequency. It merely attenuates the original PWM frequency, and adds a second component at double the frequency. The harmonic distortion is also not substantially improved. So it has very little benefit unless it can be implemented with no processing overhead.
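The processing budget is simple division, sketched below in C. The function name `cycles_per_sample` is made up for this example, and it counts raw clock cycles only, before any interrupt overhead is subtracted.

```c
/* CPU clock cycles available between one output sample and the next.
 * This is the raw budget; interrupt entry/exit and the code that
 * writes the sample to the PWM all come out of it. */
unsigned long cycles_per_sample(unsigned long f_cpu_hz,
                                unsigned long sample_rate_hz)
{
    return f_cpu_hz / sample_rate_hz;
}
```

At 16MHz, a 30kHz sample rate leaves 533 cycles per sample, 62.5kHz leaves 256, and 80kHz leaves 200.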
Next, the decision between single, dual, or triple PWM can be made. If you have a spare PWM output, it is very easy to implement a dual PWM, and it gives a great deal of improvement. The triple and quad PWMs are probably not worth the extra effort, as the PWM frequencies they require are quite fast, and the processor won’t be able to create new data in time. Also, the resistor mixing ratios need to be almost impossibly precise.
Finally, once you have your topology and frequency selected, you can see what the bit depth trade-off is for using Fast PWM or Phase Correct PWM. Unless the signals you are generating are going to be very low frequency, it is almost always better to sacrifice a bit or two of resolution for the reduced distortion that Phase Correct PWM gives you.
A few other odds and ends to keep in mind: 1. Generating and outputting 2 x 8 bit values is less computation than 2 x 7 bit values, and 2 x 4 bit values is also fairly easy to generate. 2. If your PWM frequency is out of the audible range, you won’t need to spend much effort filtering it, unless the amplifier you are putting it into doesn’t have a good low-pass filter at 20kHz. It is quite common on sound cards and mixing boards to use 30kHz or 40kHz as the low-pass frequency, so be wary. Your signal could easily distort the input amplifier stages, even though your signals are outside of the audible range. In these cases, a 3rd order active filter with a single opamp will solve the problem.
6. Setting up your ATmega328 (Arduino).
So, now that you have an idea of what PWM can do, here is a quick tutorial and Arduino sketch that has register settings for using Timer1 as a high quality, audio rate PWM generator. All you have to do is enter a few parameters, add a low-pass filter, and you’ll be making weird sounds in no time! So click the link above to get started (and sorry for all the external linking, there is just too much info to fit coherently on one page).
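For a feel of what such register settings look like, here is a hedged C sketch that works out Timer1 control values for a dual PWM using ICR1 as TOP. It is not the linked sketch. The struct `timer1_config` and the function `timer1_audio_config` are hypothetical names made up for this example, and the bit positions are written out by hand from the ATmega328 datasheet so that the code can be compiled and checked on a PC.

```c
#include <stdint.h>

/* Bit positions in the Timer/Counter1 control registers
 * (ATmega328 datasheet). */
enum { WGM10_BIT = 0, WGM11_BIT = 1, COM1B1_BIT = 5, COM1A1_BIT = 7 }; /* TCCR1A */
enum { CS10_BIT = 0, WGM12_BIT = 3, WGM13_BIT = 4 };                   /* TCCR1B */

/* Hypothetical container for the values to be written to the timer. */
struct timer1_config {
    uint8_t  tccr1a;
    uint8_t  tccr1b;
    uint16_t icr1;   /* TOP value */
};

/* bits: resolution per PWM output (1 to 16).
 * phase_correct: nonzero for Phase Correct, zero for Fast PWM. */
struct timer1_config timer1_audio_config(unsigned bits, int phase_correct)
{
    struct timer1_config cfg;
    /* Non-inverting output on OC1A and OC1B; WGM11 is set in both
     * of the modes that use ICR1 as TOP. */
    cfg.tccr1a = (uint8_t)((1u << COM1A1_BIT) | (1u << COM1B1_BIT) |
                           (1u << WGM11_BIT));
    /* WGM13 selects ICR1 as TOP, WGM12 picks Fast PWM (mode 14) over
     * Phase Correct (mode 10), and CS10 runs the timer with no prescaling. */
    cfg.tccr1b = (uint8_t)((1u << WGM13_BIT) | (1u << CS10_BIT) |
                           (phase_correct ? 0u : (1u << WGM12_BIT)));
    cfg.icr1 = (uint16_t)((1UL << bits) - 1UL);
    return cfg;
}
```

On the actual chip, these values would be written to TCCR1A, TCCR1B, and ICR1, with the samples going to OCR1A and OCR1B (Arduino pins 9 and 10, which must be set as outputs). For 8 bit Phase Correct, this works out to TCCR1A = 0xA2, TCCR1B = 0x11, and ICR1 = 255.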