Arduino and AY-3-8910 – Part 3
I suggested in Part 2 that it might be possible to do some simple modulation of the amplitude of the AY-3-8910 channels rather than drive frequencies directly. This post takes a look at the possibilities of some kind of lo-fi direct digital synthesis using that as a basis.
https://makertube.net/w/uCSiBG5RBufGqspoHMYFPt
Warning! I strongly recommend using old or second hand equipment for your experiments. I am not responsible for any damage to expensive instruments!
These are the key tutorials for the main concepts used in this project:
- Arduino AY3891x Library: https://github.com/Andy4495/AY3891x
- Arduino Nano AY-3-8910 PCB: https://github.com/GadgetReboot/AY-3-8910
- AY-3-8910 on synth DIY wiki: https://sdiy.info/wiki/General_Instrument_AY-3-8910
If you are new to Arduino, see the Getting Started pages.
Parts list
- Arduino Uno.
- AY-3-8910 chip.
- Either GadgetReboot’s PCB or patch using solderless breadboard or prototyping boards.
- 5V compatible MIDI interface.
- Jumper wires.
Direct Digital Synthesis on the AY-3-8910
I’ve talked about direct digital synthesis before, so I won’t go into full detail again. For more, see Arduino R2R Digital Audio – Part 3 and Arduino PWM Sound Output.
But the top-level idea is to set the level of the signal according to a value in a wavetable. If this value is updated at a useful audio rate then it will be interpreted as sound.
There are some pretty major limitations with attempting to do this on the AY-3-8910 however. The biggest one is that the amplitude of each channel is a 4-bit value, so there are only 16 output levels (0 to 15, with 0 being silence).
So I’ll be working to the following properties:
- 4-bit resolution for the output.
- 8-bit wavetable.
- 8.8 fixed point accumulator to index into the wavetable.
- 8192 Hz sample rate.
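As a rough sketch of how those properties fit together, here is a host-side C++ illustration of an 8.8 fixed point accumulator stepping through a 256-entry wavetable. This is not the code from my sketch: the names phaseIncrement and Oscillator are made up for the example, 8192 Hz is assumed as the sample rate, and taking the top 4 bits of each 8-bit sample is just one plausible way to get down to a 4-bit level.

```cpp
#include <cstdint>

constexpr uint32_t kSampleRate = 8192;  // Hz, assumed sample rate
constexpr uint16_t kTableSize = 256;    // entries in the 8-bit wavetable

// 8.8 fixed point increment: table entries to advance per sample, times 256.
// Equivalent to freq * 65536 / sampleRate, rounded to the nearest integer.
constexpr uint16_t phaseIncrement(uint32_t freqHz) {
  return static_cast<uint16_t>((freqHz * 65536UL + kSampleRate / 2) / kSampleRate);
}

struct Oscillator {
  uint16_t acc = 0;  // high byte indexes the table, low byte is the fraction
  uint16_t inc = 0;  // added to acc once per sample

  // Return the 4-bit level for the current sample, then advance the phase.
  uint8_t next(const uint8_t (&table)[kTableSize]) {
    uint8_t level = table[acc >> 8] >> 4;    // keep the top 4 bits: 0..15
    acc = static_cast<uint16_t>(acc + inc);  // wraps at 65536, i.e. one full cycle
    return level;
  }
};
```

With these numbers phaseIncrement(440) works out as 3520, and an increment of 256 steps through the table exactly one entry per sample.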
YouTuber https://www.youtube.com/@inazumadenki5588 had a look at this and showed that the AY-3-8910 needs to be set up as follows:
- Frequency value for the channel should be set to the highest frequency possible.
- All channels should be disabled.
This is due to comments in the datasheet stating that the only way to fully disable a channel is to have 0 in the amplitude field.
Note: for an 8192Hz sample rate, that means writing out a sample to the AY-3-8910 registers approximately once every 122uS. With a 256 value wavetable, it takes just over 31mS to write a complete cycle at the native sample rate, which would be around a 32Hz output.
I’m not sure what the smallest number of samples per cycle that would still give a useful signal might be, but say it was 8 values from the wavetable per cycle (stepping through the table 32 entries at a time), then that would make the highest frequency supported around 1kHz (8192/8 = 1024Hz). Not great, but certainly audible, so worth a try.
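The arithmetic behind those two limits can be written down in a couple of lines. This is only a worked example with made-up helper names (outputHz, tableStep), again assuming an 8192 Hz sample rate and a 256-entry table.

```cpp
#include <cstdint>

constexpr uint32_t kSampleRate = 8192;  // Hz, assumed

// Output frequency when one cycle of the waveform is spread over this many samples.
constexpr uint32_t outputHz(uint32_t samplesPerCycle) {
  return kSampleRate / samplesPerCycle;
}

// Wavetable entries skipped per sample for a 256-entry table.
constexpr uint32_t tableStep(uint32_t samplesPerCycle) {
  return 256 / samplesPerCycle;
}
```

Playing every entry (256 samples per cycle) gives outputHz(256) = 32 Hz, and 8 samples per cycle gives outputHz(8) = 1024 Hz with tableStep(8) = 32 entries skipped each time.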
Setting up for DDS
I want a regular, reliable, periodic routine to output the levels from the wavetable, and the usual way to achieve this is using a timer and interrupt. As Timer 1 is already in use to generate the 1MHz clock for the AY-3-8910, I’m going to be configuring Timer 2 as follows:
- Timer 2 is an 8-bit timer.
- Use a prescaler of 32 which gives a 500kHz clock source (16MHz/32).
- Use CTC (clear timer on compare) mode.
- Generate a compare match interrupt.
- Do not enable any output pins.
The appropriate ATMega328 registers to enable this are:
// COM2A[1:0] = 00 No output
// WGM2[2:0] = 010 CTC mode
// CS2[2:0] = 011 Prescaler=32
ASSR = 0;
TCCR2A = _BV(WGM21);
TCCR2B = _BV(CS21) | _BV(CS20);
TCNT2 = 0;
OCR2A = 60;
TIMSK2 = _BV(OCIE2A);
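The value of 60 for OCR2A follows from the CTC behaviour: the counter runs from 0 up to OCR2A inclusive, so the compare match rate is the timer clock divided by (OCR2A + 1). A small host-side sketch of that calculation, with hypothetical helper names, looks like this:

```cpp
#include <cstdint>

// Compare value for a CTC timer, rounded to the nearest achievable rate.
// The counter counts 0..OCR inclusive, hence the "- 1".
constexpr uint8_t ctcCompareValue(uint32_t cpuHz, uint32_t prescaler, uint32_t rateHz) {
  return static_cast<uint8_t>((cpuHz / prescaler + rateHz / 2) / rateHz - 1);
}

// Interrupt rate actually produced by a given compare value.
constexpr double ctcActualRate(uint32_t cpuHz, uint32_t prescaler, uint8_t ocr) {
  return static_cast<double>(cpuHz) / prescaler / (ocr + 1);
}
```

For a 16MHz clock, a prescaler of 32 and a target of 8192 Hz this gives a compare value of 60, and the rate that actually results is 500000/61, or roughly 8197 Hz, so very slightly faster than the nominal figure.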
It is worth noting, though, that enabling OC2A can be quite useful for debugging. The following toggles the OC2A output (on D11) every time there is a compare match. The frequency seen on D11 will thus be half the anticipated sample frequency.
pinMode(11, OUTPUT);
TCCR2A |= _BV(COM2A0); // COM2A[1:0] = 01 for OC2A toggle
And this does indeed generate a signal. Here is a trace showing a timing GPIO pin and the AY-3-8910 output.
The problem is that this is meant to be a 440Hz sine wave, and whilst the shape isn’t too bad (it is a little distorted as the amplitude isn’t a true linear shape), the frequency is much nearer 100Hz than 440.
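The wavetable itself isn't listed here, so as an illustration only, this is one way a 256-entry, 8-bit sine table could be generated on a host machine (makeSineTable is a made-up name). Reducing each entry to its top 4 bits treats the 16 AY-3-8910 levels as if they were evenly spaced, which they are not, and that is where the slight distortion in the trace comes from.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Build a 256-entry unsigned 8-bit sine table centred on 127.5.
std::array<uint8_t, 256> makeSineTable() {
  const double kTwoPi = 6.283185307179586;
  std::array<uint8_t, 256> table{};
  for (int i = 0; i < 256; ++i) {
    double s = std::sin(kTwoPi * i / 256.0);
    // Scale -1..+1 to 0..255 and round to the nearest integer.
    table[i] = static_cast<uint8_t>(std::lround(127.5 + 127.5 * s));
  }
  return table;
}
```

The peak of the table (entry 64) is 255 and the trough (entry 192) is 0, so after shifting right by 4 the output swings across the full 0 to 15 range.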
Analysis of Performance
The clue is the other trace, which is a timing pin being toggled every time the interrupt routine is called. This is showing a 1kHz frequency, which means the ISR is being called with a 2kHz frequency rather than the anticipated 8192Hz. Curiously though I am getting an accurate 4kHz toggle on the timer output pin OC2A, indicating the timer is correctly counting with an 8kHz frequency.
No matter how I configured things, the interrupt routine just would not do anything at a faster rate. I had to drop the frequency right down to 2kHz to get the output pin and interrupt routine running together. This means that something in the interrupt routine seems to be taking ~ 450uS to run.
After a fair bit of prodding and probing and checking the ATMega328 datasheet and double checking the register values, I have to conclude that the AY3891x library is just too slow at updating the registers for it to be able to run from the interrupt routine at this speed.
Taking a look at the register write() function in the library, which I need to use to update the channel level, I can see the following is happening:
void AY3891x::write(byte regAddr, byte data) {
latchAddressMode(regAddr);
daPinsOutput(data);
noInterrupts();
mode010to110();
mode110to010();
interrupts();
daPinsInput();
}
void AY3891x::latchAddressMode(byte regAddr) {
mode010to000();
daPinsOutput(_chipAddress | regAddr); // Register address is 4 lsb
mode000to001();
mode001to000();
mode000to010();
}
void AY3891x::daPinsOutput(byte data) {
byte i;
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) pinMode(_DA_pin[i], OUTPUT);
}
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) {
digitalWrite(_DA_pin[i], data & 0x01);
data = data >> 1;
}
}
}
void AY3891x::daPinsInput() {
byte i;
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) pinMode(_DA_pin[i], INPUT);
}
}
And every one of those modeXXXtoYYY() functions is a call to digitalWrite(), so I make that 22 calls to digitalWrite() in order to write a single register value, plus 24 calls to pinMode() (8 for each of the two daPinsOutput() calls and 8 for daPinsInput()). There are also 5 loops each looping over 8 values.
One person measured the Arduino Uno digitalWrite() function and concluded that it takes 3.4uS to run, so that is a minimum of 75uS of processing in every run through the interrupt routine just for those calls alone. That doesn’t include the calls and other logic going on. It could easily be more than twice that when everything is taken into account.
Dropping in some temporary pin IO either side of the call to the AY write function itself, I’m measuring just over 250uS for the register update to happen, and that is just for one channel. This means that an interrupt with a period of 250uS or less leaves the processor no time to run anything else.
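To put those numbers side by side, here is the budget as a few lines of host-side C++. The helper names are invented for the example and the 3.4uS figure is simply the one quoted above, not something measured here.

```cpp
constexpr double kDigitalWriteUs = 3.4;  // per-call figure quoted above

// Time spent purely in digitalWrite() for a given number of calls.
constexpr double digitalWriteCostUs(int calls) { return calls * kDigitalWriteUs; }

// Time available between samples at a given sample rate.
constexpr double samplePeriodUs(double rateHz) { return 1e6 / rateHz; }

// Highest interrupt rate that still leaves room for workUs of processing.
constexpr double maxRateHz(double workUs) { return 1e6 / workUs; }
```

The 22 digitalWrite() calls alone come to about 75uS, the sample period at 8192 Hz is about 122uS, and a measured 250uS per register write caps the interrupt rate at 4000 Hz before any other code gets a look in.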
Measuring the Basic Performance
At this point I took a step back and created a free-running test sketch to really see what is going on.
#include "AY3891x.h"
AY3891x psg( 17, 8, 7, 6, 5, 4, 3, 2, 16, 15, 14);
#define AY_CLOCK 9 // D9
void aySetup () {
pinMode(AY_CLOCK, OUTPUT);
digitalWrite(AY_CLOCK, LOW);
TCCR1A = (1 << COM1A0);
TCCR1B = (1 << WGM12) | (1 << CS10);
TCCR1C = 0;
TIMSK1 = 0;
OCR1AH = 0;
OCR1AL = 7; // 16MHz / 8 = 2MHz Counter
psg.begin();
// Output highest frequency on each channel, but set level to 0
// Highest freq = 1000000 / (16 * 1) = 62500
psg.write(AY3891x::ChA_Amplitude, 0);
psg.write(AY3891x::ChA_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChA_Tone_Period_Fine_Reg, 0);
psg.write(AY3891x::ChB_Amplitude, 0);
psg.write(AY3891x::ChB_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChB_Tone_Period_Fine_Reg, 0);
psg.write(AY3891x::ChC_Amplitude, 0);
psg.write(AY3891x::ChC_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChC_Tone_Period_Fine_Reg, 0);
// LOW = channel is in the mix.
// Turn everything off..
psg.write(AY3891x::Enable_Reg, 0xFF);
}
int toggle;
void setup() {
pinMode(11, OUTPUT);
toggle = LOW;
digitalWrite(11, toggle);
aySetup();
}
void loop() {
toggle = !toggle;
digitalWrite(11, toggle);
for (int i=0; i<16; i++) {
psg.write(AY3891x::ChA_Amplitude, i);
}
}
All this is doing is continually writing 0 to 15 to the channel A level register whilst toggling a GPIO pin. Putting an oscilloscope trace on the IO pin and the AY-3-8910 channel A output gives me the following:
This is running with a period of 6.96mS, meaning each cycle of 16 writes takes 3.5mS, giving me almost 220uS per call to the AY write function which seems to align pretty well with what I was seeing before.
And this is generating an audible tone at around 280Hz, so regardless of any timer settings or waveform processing, this is going to be the baseline frequency on which everything else would have to rest, which isn’t great.
Optimising Register Writes
So at this point I have the choice of attempting to write to the AY-3-8910 myself using PORT IO to eliminate the time it takes for all those loops and digitalWrite() calls. Or I could try some alternative libraries.
The library I’m using aims for the most portable compatibility: “This library uses the generic digitalWrite() function instead of direct port manipulation, and should therefore work across most, if not all, processors supported by Arduino, so long as enough I/O pins are available for the interface to the PSG.”
It is a deliberate design choice, but does require all three bus control signals to be used: BDIR, BC1, BC2.
Alternatives are possible with fewer pin state changes, but much stricter timing requirements. Some options include:
- https://github.com/53175ddd/AY-3-8910_Arduino – uses a mixture of PORT IO and digitalWrite(). Assumes use of D0-D7 for data channel.
The following are projects that have not used a library, but just done their own thing:
- https://github.com/internalregister/AY-3-8910 – uses a mixture of digitalWrite and PORT IO. Assumes use of D0-D7 for the data channel.
- https://github.com/GaryA/TB-AY-3_MIDI – uses direct PORT IO for D2-D9
Unfortunately none of these really solves the problem as the PCB I’m using does not neatly map onto IO ports to allow the use of direct PORT IO for the data.
So to improve things whilst using this same PCB will require me to re-write the library myself.
As a test however, it is possible to take the IO pin definitions used with the PCB and write a bespoke, optimised register write routine as follows:
void ayFastWrite (byte reg, byte val) {
// Mode=Addr Latch
digitalWrite(BC1, HIGH);
digitalWrite(BDIR, HIGH);
// Latch address
// NB: Addresses are all in range 0..15 so don't need to
// worry about writing out bits 6,7 - just ensure set to zero
PORTD = (PORTD & 0x03) | ((reg & 0x3F)<<2); // Shift bits 0:5 to 2:7
PORTB = (PORTB & 0xFE);
PORTC = (PORTC & 0xF7);
// Mode = Inactive
digitalWrite(BC1, LOW);
digitalWrite(BDIR, LOW);
delayMicroseconds(10);
// Mode = Write
digitalWrite(BC1, LOW);
digitalWrite(BDIR, HIGH);
// Write data
PORTD = (PORTD & 0x03) | ((val & 0x3F)<<2); // Shift bits 0:5 to 2:7
PORTB = (PORTB & 0xFE) | ((val & 0x40)>>6); // Shift bit 6 to 0
PORTC = (PORTC & 0xF7) | ((val & 0x80)>>4); // Shift bit 7 to 3
// Mode = Inactive
digitalWrite(BC1, LOW);
digitalWrite(BDIR, LOW);
}
I’m using the following mapping of data pins to Arduino digital IO pins to PORTS:
Data pin | Arduino pin | Port
DA0-DA5 | D2-D7 | PORTD bits 2-7
DA6 | D8 | PORTB bit 0
DA7 | A3/D17 | PORTC bit 3
To make this happen I have to ensure that the right bits are set to OUTPUTs and that BC2 is held HIGH prior to using the ayFastWrite function.
digitalWrite(BC2, HIGH);
DDRD |= 0xFC; // D2-D7 (DA0-DA5) as outputs
DDRC |= 0x08; // A3 (DA7) as output
DDRB |= 0x01;
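The bit shuffling is easier to check away from the hardware. The following host-side sketch (PortBits, splitDataByte and mergePort are made-up names) splits a data byte into the three port contributions for this pin mapping, using a mask of 0x3F to select bits 0 to 5.

```cpp
#include <cstdint>

// The bits each port must receive for one data byte.
struct PortBits {
  uint8_t d;  // PORTD bits 2-7 carry DA0-DA5
  uint8_t b;  // PORTB bit 0 carries DA6
  uint8_t c;  // PORTC bit 3 carries DA7
};

constexpr PortBits splitDataByte(uint8_t val) {
  return PortBits{
      static_cast<uint8_t>((val & 0x3F) << 2),   // bits 0:5 -> 2:7
      static_cast<uint8_t>((val & 0x40) >> 6),   // bit 6 -> 0
      static_cast<uint8_t>((val & 0x80) >> 4)};  // bit 7 -> 3
}

// Combine new data bits with the port bits that must be left untouched.
constexpr uint8_t mergePort(uint8_t current, uint8_t keepMask, uint8_t bits) {
  return static_cast<uint8_t>((current & keepMask) | bits);
}
```

On the Arduino the equivalent of mergePort(PORTD, 0x03, bits.d) keeps D0 and D1 (the serial pins) as they were while replacing D2 to D7.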
This now improves on that previous 280Hz and gives me 1600Hz performance.
So can I do any better? Well there are still between 6 and 8 calls to digitalWrite going on to handle the control signals…
#define BC1LOW {PORTC &= 0xFE;} // A0 LOW
#define BC1HIGH {PORTC |= 0x01;} // A0 HIGH
#define BC2LOW {PORTC &= 0xFD;} // A1 LOW
#define BC2HIGH {PORTC |= 0x02;} // A1 HIGH
#define BDIRLOW {PORTC &= 0xFB;} // A2 LOW
#define BDIRHIGH {PORTC |= 0x04;} // A2 HIGH
void ayFastWrite (byte reg, byte val) {
// Mode=Addr Latch
BC1HIGH;
BDIRHIGH;
// Latch address
PORTD = (PORTD & 0x03) | ((reg & 0x3F)<<2); // Shift bits 0:5 to 2:7
PORTB = (PORTB & 0xFE);
PORTC = (PORTC & 0xF7);
// Need 400nS Min
delayMicroseconds(1);
// Mode = Inactive
BC1LOW;
BDIRLOW;
// Need 100nS settle then 50nS preamble
delayMicroseconds(1);
// Mode = Write
BC1LOW;
BDIRHIGH;
// Write data
PORTD = (PORTD & 0x03) | ((val & 0x3F)<<2); // Shift bits 0:5 to 2:7
PORTB = (PORTB & 0xFE) | ((val & 0x40)>>6); // Shift bit 6 to 0
PORTC = (PORTC & 0xF7) | ((val & 0x80)>>4); // Shift bit 7 to 3
// Need 500nS min
delayMicroseconds(1);
// Mode = Inactive
BC1LOW;
BDIRLOW;
// Need 100nS min
}
The timings come from the AY-3-8910 datasheet:
The actual minimum and maximum timings for the various “t” values are given in the preceding table. Most have a minimum value, but tBD has to be noted: the “associative delay time” is 50nS. This means that any changing of BC1, BC2 and BDIR has to occur within 50nS to be considered part of the same action.
There is no means of having a nano-second delay (well, other than just spinning code), so I’ve just used a delayMicroseconds(1) here and there. This isn’t reliably accurate on an Arduino, but as the longest delay I need is around half of that, it seems to be fine.
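If I did want to try spinning code instead, the number of CPU cycles to burn is easy to work out: at 16MHz one cycle is 62.5nS, and on the AVR a single nop instruction takes one cycle. The helper below (cyclesForNs is a made-up name) is a host-side sketch of that calculation only. I have not tested whether the AY-3-8910 is happy with delays this tight on real hardware.

```cpp
#include <cstdint>

// Smallest whole number of CPU cycles that lasts at least `ns` nanoseconds.
constexpr uint32_t cyclesForNs(uint32_t ns, uint32_t cpuHz) {
  // ceil(ns * cpuHz / 1e9) using 64-bit integer arithmetic.
  return static_cast<uint32_t>(
      (static_cast<uint64_t>(ns) * cpuHz + 999999999ULL) / 1000000000ULL);
}
```

For the minimums noted in the code comments this gives 7 cycles for 400nS, 8 cycles for 500nS and 2 cycles for 100nS, all well under the 16 cycles that a full microsecond represents.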
This now gives me the following:
This is now supporting a natural “as fast as possible” frequency of around 24kHz, meaning each call to the write function is now around 3uS. That is almost a 100x improvement over using all those pinMode and digitalWrite calls.
The downside of this method:
- It is ATMega328 specific.
- It is specific to the pin mappings and PORT usage of this PCB.
- It does not support reading or other chip operations between the writes.
It is also interesting to see that the traces show the high frequency oscillation (62.5kHz) that is being modulated regardless of the channel frequency and enable settings.
DDS Part 2
Success! At least with a single channel. This is now playing a pretty well in tune 440Hz A.
Notice how the frequency of the timing pin is now ~4.2kHz meaning that the ISR is now indeed firing at the required 8192 Hz.
Here is a close-up of the output signal. The oscilloscope was struggling to get a clean frequency reading, but this is one time I caught it reading something close! I checked the sound itself with a tuning fork (see video). It is indeed 440Hz.
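As a sanity check on the tuning, the output frequency of a 16-bit phase accumulator is just the increment times the sample rate divided by 65536. The sketch below assumes an increment of 3520 (the value that gives exactly 440 Hz at a nominal 8192 Hz) and uses a made-up helper name; the increment in the real sketch may differ.

```cpp
#include <cstdint>

// Output frequency of a 16-bit (8.8 fixed point) phase accumulator.
constexpr double ddsOutputHz(uint16_t inc, double sampleRateHz) {
  return inc * sampleRateHz / 65536.0;
}
```

At exactly 8192 Hz this gives 440 Hz. If the timer is really running at 500000/61 Hz, as an OCR2A value of 60 implies, the same increment gives about 440.25 Hz, which is close enough that a tuning fork comparison would not show it.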
Closing Thoughts
I wanted to get something put together to allow me to drive a DDS wavetable over MIDI, with different waveforms, and so on, but it turned out to be a little more involved getting this far than I anticipated, so I’ll leave it here for now.
But hopefully filling in the gaps won’t take too long and will be the subject of a further post.
Now that I have something that works, I’m actually quite surprised by how well it is working.
Kevin