Microcontroller polyphony

There are a variety of tricks for persuading a small microcontroller to make sounds.

Roughly in order from simplest to most difficult:

single beeps
... circuit drawing goes here ...

polyphony the "standard" DAC way
... circuit drawing goes here ...


 * (a) set up an interrupt to handle sound -- most people use either 8000 Hz (telephone-quality) or 44100 Hz (CD-quality).
 * (b) Every time the interrupt occurs, get a number for the sound source. Square waves are the simplest -- the number is +1 or -1. If you imagine this sound playing by itself out a speaker, this number indicates how far out the cone is extended (or how far in the cone is retracted).
 * (c) If you have 7 sound sources (for example, you want to simulate playing a chord of 6 notes on a piano, and also a flute), get 7 numbers, one from each source.
 * (d) add the numbers together to get a total.
 * (e) send the total out to the DAC
 * (f) Use op-amps to buffer the DAC output voltage, do low-pass filtering to eliminate ultrasonics, and send it to the speaker.
 * (g) In the background (perhaps in the main loop, or a lower-priority interrupt), when the time comes to finish the chord, turn off some or all of the sounds. (One of many ways to "turn off" a sound source is to set its delta to zero, and then set its phase accumulator to zero).
 * (h) When the time comes to start playing another note (perhaps in addition to the notes already playing), set the "delta" number for the corresponding sources to produce the appropriate frequency.

In practice, to reduce jitter, the interrupt first (at high priority) sends the total out to the DAC (e), and then (at low priority) calculates a total (b-d) that won't be used until the next sound interrupt.
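The ordering described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: `dac_write`, `sound_tick`, `note_on`, `note_off` and `pending_total` are made-up names, the DAC is a stand-in variable rather than real hardware, and every source is a plain square wave driven by a 16-bit phase accumulator.

```c
#include <stdint.h>

#define NUM_SOURCES 7

/* Hypothetical stand-in for real DAC hardware: remembers the last value written. */
static int16_t dac_last_value;
static void dac_write(int16_t value) { dac_last_value = value; }

static uint16_t phase[NUM_SOURCES];  /* phase accumulator for each source */
static uint16_t delta[NUM_SOURCES];  /* phase step per tick; 0 means silent */
static int16_t pending_total;        /* computed on the previous tick */

/* Step (b): square wave, +1 for the first half of the cycle, -1 for the second. */
static int16_t square_wave(uint16_t p) { return (p & 0x8000u) ? -1 : +1; }

/* Steps (g) and (h): called from the main loop, not from the interrupt. */
void note_off(int source) { delta[source] = 0; phase[source] = 0; }
void note_on(int source, uint16_t new_delta) { delta[source] = new_delta; }

/* Call once per sample period, e.g. from an 8000 Hz timer interrupt. */
void sound_tick(void)
{
    /* (e) comes first, so the output always changes at the same point
       in the interrupt, however long the mixing below takes. */
    dac_write(pending_total);

    /* (b)-(d): work out the total that the NEXT tick will send. */
    int16_t total = 0;
    for (int i = 0; i < NUM_SOURCES; i++) {
        if (delta[i] == 0)
            continue;                        /* a silent source adds nothing */
        phase[i] = (uint16_t)(phase[i] + delta[i]);
        total += square_wave(phase[i]);
    }
    pending_total = total;
}
```

On an 8-bit CPU a 16-bit `delta` cannot be written in one instruction, so real code would probably disable the sound interrupt briefly inside `note_on` and `note_off`; that detail is left out here.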

You've seen a sound-board mixer, right? The most general way to do step (d) is to multiply the number from each sound source by an independent loudness value, add the products together, then (if necessary) subtract the DC bias and (if necessary) clip to the maximum or minimum amplitude. Many embedded system programmers choose to pre-calculate things so that the bias has already been removed from, and the loudness already included in, the number from the sound source; step (d) then reduces to simple addition and (if necessary) clipping to the maximum or minimum amplitude.
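As a rough sketch of the general mixer, assuming unsigned 8-bit samples centred on 128 and loudness values from 0 to 255 (both assumptions, as are the names `mix`, `SAMPLE_BIAS`, `OUT_MAX` and `OUT_MIN`): the bias is removed from each sample before scaling, which gives the same result as subtracting the loudness-weighted bias after the sum.

```c
#include <stdint.h>

#define SAMPLE_BIAS 128    /* assumed: samples are unsigned, centred on 128 */
#define OUT_MAX     127
#define OUT_MIN     (-128)

/* Mix n sources; loudness[i] is 0..255, where 255 is (almost) full volume. */
int8_t mix(const uint8_t *sample, const uint8_t *loudness, int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++) {
        int32_t centred = (int32_t)sample[i] - SAMPLE_BIAS;  /* remove DC bias */
        sum += centred * loudness[i];                         /* apply the "fader" */
    }
    sum /= 256;                         /* undo the 8-bit loudness scaling */
    if (sum > OUT_MAX) sum = OUT_MAX;   /* clip to the output range */
    if (sum < OUT_MIN) sum = OUT_MIN;
    return (int8_t)sum;
}
```

With the pre-calculated approach, the loop body shrinks to a plain addition of signed samples and only the two clipping tests remain.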

A few programmers use some function for (b) to calculate the sound on-the-fly. But many programmers use a table containing a full cycle of that source's sound (i.e., one table for the flute, and one table for the piano), and a "phase accumulator" for each source (1 for the flute, and 6 phase accumulators, one for each finger pressing a note on the piano). Many programmers don't bother with "attack time" or "decay time" or other nuances of orchestral music, leading to characteristic "8-bit sound".

So every time the interrupt comes in (say, once every 1/8000 of a second), the software does something like

    finger_1_accumulator = (finger_1_accumulator + finger_1_delta) & 0xFFFF;
    ...
    finger_6_accumulator = (finger_6_accumulator + finger_6_delta) & 0xFFFF;
    flute_accumulator = (flute_accumulator + flute_delta) & 0xFFFF;

    finger_1_sound = piano_table[ hi_byte( finger_1_accumulator ) ];
    ...
    finger_6_sound = piano_table[ hi_byte( finger_6_accumulator ) ];
    flute_sound = flute_table[ hi_byte( flute_accumulator ) ];

    total = finger_1_sound + ... + finger_6_sound + flute_sound;
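The same per-tick work can be written as a loop over an array of sources. The sketch below is one possible arrangement, not the only one: `struct voice`, `voices`, `init_tables_and_voices` and `mix_next_sample` are illustrative names, and the two tables are filled with crude placeholder waveforms (a sawtooth and a triangle) rather than real piano or flute samples. Both placeholder tables start at 0, so a source that has been turned off (delta and accumulator both zero) contributes nothing.

```c
#include <stdint.h>

#define TABLE_SIZE 256
#define NUM_VOICES 7           /* 6 piano fingers + 1 flute */

struct voice {
    const int8_t *table;       /* one full cycle of this source's sound */
    uint16_t accumulator;      /* phase accumulator; high byte = table index */
    uint16_t delta;            /* added every tick; 0 means silent */
};

static int8_t piano_table[TABLE_SIZE];
static int8_t flute_table[TABLE_SIZE];
static struct voice voices[NUM_VOICES];

/* Placeholder waveforms only; real tables would hold sampled instruments. */
void init_tables_and_voices(void)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        /* sawtooth: 0..15, then -16..-1 */
        piano_table[i] = (int8_t)((i < 128) ? i / 8 : (i - 256) / 8);
        /* triangle: 0 up to 16, down to -16, back up to 0 */
        if (i < 64)       flute_table[i] = (int8_t)(i / 4);
        else if (i < 192) flute_table[i] = (int8_t)((128 - i) / 4);
        else              flute_table[i] = (int8_t)((i - 256) / 4);
    }
    for (int v = 0; v < NUM_VOICES; v++) {
        voices[v].table = (v < 6) ? piano_table : flute_table;
        voices[v].accumulator = 0;
        voices[v].delta = 0;
    }
}

/* One tick: advance every accumulator, look up each sample, add them up. */
int16_t mix_next_sample(void)
{
    int16_t total = 0;
    for (int v = 0; v < NUM_VOICES; v++) {
        /* uint16_t arithmetic wraps by itself, like the "& 0xFFFF" above */
        voices[v].accumulator = (uint16_t)(voices[v].accumulator + voices[v].delta);
        total += voices[v].table[voices[v].accumulator >> 8];   /* hi_byte */
    }
    return total;
}
```

Because every placeholder sample lies between -16 and +16, seven sources can never exceed the range of a signed 8-bit output, so no clipping is needed in this particular sketch.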

Once you know the frequency in Hz of the next note in the song (perhaps a musical frequency or a DTMF frequency; "Concert A" is 440 Hz), you can calculate the delta from that frequency and two constants: the number of entries in the sampled data (piano_table[] or flute_table[], often 256 entries) and the output sampling rate (often 8000 Hz):

    frequency = song[ next_note ];
    finger_2_delta = (number_of_entries * frequency * 0x100) / sampling_rate;

(Or, I suppose, you could pre-calculate these numbers and store all notes in terms of "delta" rather than "Hz".)

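The "0x100" here and the "hi_byte" function used above form a kind of fixed-point notation: with a 256-entry table, the high byte of the accumulator is the table index and the low byte is a fraction of one entry. That allows us to play very low-frequency sounds, which need to repeat each sample in the table more than once.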
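A small worked sketch of the delta calculation, assuming a 256-entry table and an 8000 Hz sampling rate (the function name `delta_for_frequency` and the two constants are illustrative). Note that the intermediate product needs 32 bits: 256 * 440 * 0x100 is already far larger than 65535.

```c
#include <stdint.h>

#define NUMBER_OF_ENTRIES 256u    /* assumed table length */
#define SAMPLING_RATE     8000u   /* assumed interrupt rate in Hz */

/* Result is 8.8 fixed point: high byte = whole table entries per tick,
   low byte = 1/256ths of an entry per tick.
   Only meaningful for frequencies up to SAMPLING_RATE / 2. */
uint16_t delta_for_frequency(uint32_t frequency_hz)
{
    uint32_t scaled = NUMBER_OF_ENTRIES * frequency_hz * 0x100u;  /* 32-bit */
    return (uint16_t)(scaled / SAMPLING_RATE);
}
```

For Concert A this gives 256 * 440 * 256 / 8000 = 3604 (0x0E14), so the table index advances about 14 entries per tick. For a 20 Hz tone it gives 163 (0x00A3): the high byte is zero, the index advances by only about 0.64 of an entry per tick, and most table entries get played once or twice before the index moves on.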

For more details on fetching the next note, see TRAXMOD.

polyphony the "clever" 1-bit way
... serial circuit drawing goes here ...

... parallel circuit drawing goes here ...

... 1-bit circuit drawing goes here ...

Serial DACs were once pretty expensive.

Parallel DACs made from an R2R ladder seem to cost less, but they require a bunch of digital input pins, which once required an expensive CPU with lots of pins to handle that *and* all the other stuff you wanted to hook to it.

To avoid that cost, many hobbyists have tried to generate sounds "directly" from one or two digital I/O pins of a CPU, without a DAC.

They've figured out several relatively clever ways of doing that -- but they all have drawbacks. In step (a), they require higher interrupt frequency. In step (e), they require a bit more work. In step (f), they generally require much more filtering. Pretty much all the other steps stay the same. Since the cost of high-pin-count CPUs and single-chip serial DACs has plummeted, you have to ask if the drawbacks of the "clever" method are worth it.
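The text above does not say which 1-bit methods it has in mind. As one possible illustration only, here is a first-order delta-sigma style output stage that could stand in for step (e): it is called at a rate well above the audio sampling rate, and over many calls the fraction of time the pin spends high is proportional to the sample value. The names `speaker_pin_write`, `pin_state` and `one_bit_output` are made up for this sketch, and the pin is a plain variable rather than a real I/O register.

```c
#include <stdint.h>

/* Hypothetical stand-in for a digital output pin. */
static int pin_state;
static void speaker_pin_write(int level) { pin_state = level; }

static uint16_t ds_accumulator;   /* running sum; always below 256 between calls */

/* sample is unsigned, 0..255 (DC bias still included).
   Call this much more often than the audio sampling rate. */
void one_bit_output(uint8_t sample)
{
    ds_accumulator += sample;
    if (ds_accumulator >= 256) {      /* "carry out": emit a 1 and keep the rest */
        ds_accumulator -= 256;
        speaker_pin_write(1);
    } else {
        speaker_pin_write(0);
    }
}
```

Over any 256 consecutive calls with a constant sample value s, starting from an empty accumulator, the pin is high for exactly s of them. The filter in step (f) then has to average this fast bit stream back into a voltage, which is where the extra filtering comes from.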

Often the hardware guy assumed that the speaker was just going to be used for simple beeps, and hooked the speaker to a single digital output pin (as with the Apple II speaker and the PC speaker). Then the software guy is forced to use a 1-bit method, especially when he absolutely *must* generate 2 simultaneous tones for DTMF.

BTc Sound Compression Algorithm: a system to encode sound to 1-bit format to be played on a PIC or other cheap micro.

MP3-quality sound
... Music Player ...