Using DMA to transfer data with Embedded Rust

2019-04-23

Why DMA?

Usually when we are copying data in a program we'll use the CPU to execute copy instructions that move data between memory regions or device registers.

Sometimes this can create problems when we work with one or more of...

Large amounts of data: The CPU cannot continue to execute the program until it has completed the copy operation. (e.g. disk drives)
High speed data: The CPU may not be fast enough to produce/consume data at the rate a peripheral requires. (e.g. network interfaces)
Precisely timed data: The CPU may not be able to produce data at sufficiently accurate timing intervals for the requirements of the peripheral. (e.g. audio codecs)

Direct Memory Access (DMA) controllers are dedicated peripherals that solve these problems by providing a programmable memory access interface that works independently of the CPU.

A great analogy I came across recently is that DMA provides an asynchronous memcpy interface that lets you:

Specify a source peripheral or memory address.
Specify a destination peripheral or memory address.
Specify the behaviors that control how and when these transfers occur.
Specify an interrupt handler to be invoked when the transfer is complete or an error has occurred.
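To make the analogy concrete, here is a minimal host-side sketch in plain std Rust (not embedded code) in which a background thread stands in for the DMA controller. The async_memcpy function and its parameters are hypothetical names chosen for illustration. A real controller performs the copy in hardware without using any CPU time, which a thread cannot reproduce.

```rust
use std::thread::{self, JoinHandle};

/// Start a copy in the background and return immediately.
/// `on_complete` plays the role of the transfer-complete interrupt handler.
fn async_memcpy<F>(source: Vec<u32>, on_complete: F) -> JoinHandle<Vec<u32>>
where
    F: FnOnce(usize) + Send + 'static,
{
    thread::spawn(move || {
        // the stand-in "controller" moves the data while the caller keeps running
        let mut destination = vec![0u32; source.len()];
        destination.copy_from_slice(&source);
        // signal completion, passing the number of items transferred
        on_complete(destination.len());
        destination
    })
}
```

The call returns a handle straight away, so the caller can carry on with other work; calling join() on the handle later yields the copied data.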

In this article you will learn how to use the Direct Memory Access (DMA) controller to transfer data from memory to the built-in Digital to Analog Converter (DAC) on the STM32F3DISCOVERY board.

What do you need to know?

In order to implement DMA transfers to the built-in DAC you will need to know:

The mapping between the DAC outputs and the discovery's GPIO connectors
How to configure the TIM2 timer to trigger DAC conversions
How to configure the DAC and connect the trigger to the TIM2 timer
How to configure the DMA2 controller and enable transfer requests from the DAC
How to configure the DMA2 controller to fire an interrupt on transfer completion
How to generate a signal for the DAC to convert to audio

We'll be using the stm32f303 Peripheral Access Crate (PAC) as our primary interface to the board.

Mapping hardware connections and peripheral registers

Hardware connections

The STM32F303VCT6's DAC outputs are connected to pins PA4 and PA5. Later you will also use one of the GND pins to connect an output device such as an oscilloscope or speaker.

Internally, the processor's DAC trigger connects to the TIM2 timer and channel 3 of the DMA2 controller. (It's possible to remap these but this is beyond the scope of the article.)

Peripheral registers

We'll be interacting with the following peripherals and registers:

RCC_AHBENR: clock enable for GPIOA and DMA2
RCC_APB1ENR: clock enable for TIM2, DAC
TIM2_CR1: TIM2 control register 1
TIM2_CR2: TIM2 control register 2
TIM2_ARR: TIM2 auto-reload register
GPIOA_MODER: GPIO port mode register
GPIOA_PUPDR: GPIO port pull-up/pull-down register
DAC_CR: DAC control register
DMA2_CMAR3: DMA2 Channel 3 memory address register
DMA2_CPAR3: DMA2 Channel 3 peripheral address register
DMA2_CNDTR3: DMA2 Channel 3 number of data register
DMA2_CCR3: DMA2 Channel 3 configuration register
DMA2_ISR: DMA2 interrupt status register
DMA2_IFCR: DMA2 interrupt flag clear register
NVIC: Nested vectored interrupt controller

Implementation

For the implementation you will need to:

Declare global objects
Initialize the TIM2 timer
Initialize GPIOA and the DAC
Initialize the DMA2 controller
Implement the DMA2_CH3 interrupt handler
Generate a signal the DMA2 controller can feed to the DAC
Start DMA2 data transfer
Connect the DAC outputs to something useful

1. Declare global objects

The DMA2 data transfer will be copying data from a memory buffer to the DAC in response to requests made by the DAC.

If you want to change the contents of this buffer between transfers you will also need to access it from the DMA2_CH3 interrupt handler.

This means it will have to be declared globally:

const DMA_LENGTH: usize = 64;
static mut DMA_BUFFER: [u32; DMA_LENGTH] = [0; DMA_LENGTH];

There are no concurrency primitives wrapping DMA_BUFFER but it will only ever be accessed from the DMA2_CH3 interrupt handler. This means you can still build a safe API which avoids the unnecessary overhead of synchronization or access wrappers.

You'll also want to access the DMA2 register block both within the main loop of the program and the DMA2_CH3 interrupt handler.

If you've already worked through my article on Programming GPIO Interrupts you'll know to wrap it in a global Mutex to do this safely:

lazy_static! {
    static ref MUTEX_DMA2:  Mutex<RefCell<Option<stm32f303::DMA2>>>  = Mutex::new(RefCell::new(None));
}

2. Initialize the `TIM2` timer

The TIM2 timer drives the DAC conversion process.

Each time the timer fires, the DAC will request a sample's worth of data from the DMA2 controller and output it as an analog signal.

Most embedded development tends to follow a similar pattern when working with peripherals:

Enable peripheral clock to power it on.

Configure peripheral functions by setting the corresponding bits in the control register(s).

Configure connections to associated peripherals, if any.

Configure interrupts, if any.

Enable the peripheral.

Once you get used to this pattern you may find much of the seeming complexity in bare metal code to be somewhat reduced. Looking out for this pattern will also make it easier to figure out what's going on when you encounter a new platform or peripheral with no example code and only a datasheet for documentation!

Let's go through the relevant steps to initialize the TIM2 timer peripheral.

pub fn init_tim2(dp: &stm32f303::Peripherals) {

First you'll need to enable the peripheral clock to power it on. See if you can find the appropriate bit to set in the RCC (Reset and clock control) register to do this:

Also, take a look at the PAC documentation and find the corresponding value: stm32f303::rcc::apb1enr::W

You can now enable the TIM2 clock:

    // enable TIM2 clock
    let rcc = &dp.RCC;
    rcc.apb1enr.modify(|_, w| w.tim2en().set_bit());

Before you can configure TIM2 you'll need to calculate the rate the timer should run at.

The timer has a built-in counter which increases by one with every clock cycle. When this counter reaches the value configured in the auto-reload register TIM2_ARR the timer will fire and reset the counter back to zero.

We can therefore control the frequency of the timer (and by extension the output frequency of the DAC) by calculating an appropriate value:

    // calculate timer frequency
    let sysclk = 8_000_000;       // the STM32F3 Discovery board CPU runs at 8 MHz by default
    let fs = 44_100;              // we want an audio sampling rate of 44.1 kHz
    let arr = sysclk / fs;        // value to use for auto reload register (arr)
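As a rough check of the arithmetic, the sketch below works through the numbers on a host machine. The function names are made up for this example. It assumes the behaviour documented for STM32 general-purpose timers, where the counter runs from 0 up to the ARR value inclusive, so one period lasts ARR + 1 ticks, and it assumes no prescaler is configured.

```rust
/// Auto-reload value as computed in the article: integer division of the clocks.
fn auto_reload(sysclk: u32, fs: u32) -> u32 {
    sysclk / fs
}

/// Update rate in Hz for a given ARR value, assuming a period of
/// ARR + 1 timer ticks and no prescaler.
fn update_rate(sysclk: u32, arr: u32) -> f64 {
    sysclk as f64 / (arr as f64 + 1.0)
}
```

With sysclk = 8_000_000 and fs = 44_100, auto_reload gives 181 and update_rate(8_000_000, 181) comes out near 43 956 Hz. Writing 180 instead would give roughly 44 199 Hz. Either way an 8 MHz clock cannot divide down to exactly 44.1 kHz, so the output will be very slightly off pitch.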

To configure the timer you'll need to set the TIM2_CR1, TIM2_CR2 and TIM2_ARR registers. As before, take a look at the corresponding PAC documentation for each of them.

Now configure TIM2:

    // configure TIM2
    let tim2 = &dp.TIM2;
    tim2.cr2.write(|w| w.mms().update());       // output a trigger (TRGO) on every update event
    tim2.arr.write(|w| w.arr().bits(arr));      // set timer period (sysclk / fs)

Finally enable TIM2:

Why am I enabling the peripheral in a separate step? Can't we just set the cen bit during the configuration call to tim2.cr1.write()?

Maybe.

The reason I didn't is that many peripherals need to be disabled during the configuration phase.

Trying to configure such a peripheral while it is enabled can cause endless headaches where configuration settings don't seem to apply or, worse, behave differently every time you run the program.

    // enable TIM2
    tim2.cr1.modify(|_, w| w.cen().enabled());
}

3. Initialize `GPIOA` and the `DAC`

For the next step you'll need to enable the clocks for GPIOA and the DAC, configure the GPIOA pins and finally configure the DAC.

pub fn init_dac1(dp: &stm32f303::Peripherals) {

Again, clock control is managed by the RCC peripheral so you'll want to look at the RCC_AHBENR register for GPIOA and the RCC_APB1ENR register for the DAC:

    // enable GPIOA and DAC clocks
    let rcc = &dp.RCC;
    rcc.ahbenr.modify(|_, w| w.iopaen().set_bit());
    rcc.apb1enr.modify(|_, w| w.dacen().set_bit());

To configure the GPIOA pins appropriately you'll want them as analog pins in a floating configuration (pull-up or pull-down resistors would interfere with the analog output).

To configure the mode, set the GPIOA_MODER register: stm32f303::gpioa::moder::W

    // configure PA04, PA05 (DAC_OUT1 & DAC_OUT2) as analog, floating
    let gpioa = &dp.GPIOA;
    gpioa.moder.modify(|_, w| w.moder4().analog()
                               .moder5().analog());

For pull-up/down configuration you'll need to set the GPIOA_PUPDR register: stm32f303::gpioa::pupdr::W

Which should give you this:

    gpioa.pupdr.modify(|_, w| w.pupdr4().floating()
                               .pupdr5().floating());
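Under the hood these PAC calls are read-modify-write operations on 2-bit fields. The following host-side sketch shows the equivalent bit manipulation. set_field is a hypothetical helper, and the encodings (analog mode = 0b11, no pull-up/pull-down = 0b00) are the ones given in the STM32F3 reference manual for GPIOx_MODER and GPIOx_PUPDR.

```rust
const MODE_ANALOG: u32 = 0b11;   // GPIOx_MODER: analog mode
const PUPD_FLOATING: u32 = 0b00; // GPIOx_PUPDR: no pull-up, no pull-down

/// Return `reg` with the 2-bit field belonging to `pin` replaced by `value`.
fn set_field(reg: u32, pin: u32, value: u32) -> u32 {
    let shift = pin * 2;                   // each pin owns two adjacent bits
    let cleared = reg & !(0b11 << shift);  // clear the old field
    cleared | ((value & 0b11) << shift)    // write the new field
}
```

Starting from a register value of 0, set_field(set_field(0, 4, MODE_ANALOG), 5, MODE_ANALOG) yields 0x0000_0F00: bits 8 to 11 are set and all other pins are left untouched, which is why the article uses modify() rather than write() here.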

Next, DAC configuration is relatively straightforward:

disable the output buffer for improved SNR at the cost of limited output current. (AN4566, page 5)
enable the trigger inputs to drive the conversion and DMA requests
connect the trigger inputs to TIM2

This time we only need to look at a single register, DAC_CR: stm32f303::dac::cr::W

    // configure DAC
    let dac = &dp.DAC;
    dac.cr.write(|w| w.boff1().disabled()     // disable dac output buffer for channel 1
                      .boff2().disabled()     // disable dac output buffer for channel 2
                      .ten1().enabled()       // enable trigger for channel 1
                      .ten2().enabled()       // enable trigger for channel 2
                      .tsel1().tim2_trgo()    // set trigger for channel 1 to TIM2
                      .tsel2().tim2_trgo());  // set trigger for channel 2 to TIM2

Finally, enable the DAC:

    // enable DAC
    dac.cr.modify(|_, w| w.en1().enabled()   // enable dac channel 1
                          .en2().enabled()); // enable dac channel 2

4. Initialize the `DMA2` controller

Kicking off initialization by enabling power to DMA2 should be second nature by now:

pub fn init_dma2(cp: &mut cortex_m::peripheral::Peripherals, dp: &stm32f303::Peripherals) {
    // enable DMA2 clock
    let rcc = &dp.RCC;
    rcc.ahbenr.modify(|_, w| w.dma2en().set_bit());

Next you will need to know a) the memory address of the data to be transferred b) the peripheral memory address for the DAC and c) the number of items to transfer.

You can calculate the peripheral memory address for the DAC by looking at the register reference. We'll be using the DAC in dual-mode so you'll want the DAC_DHR12RD register:

    // dma parameters
    let ma = unsafe {
        DMA_BUFFER.as_ptr()
    } as usize as u32;            // source: memory address
    let pa = 0x40007420;          // destination: Dual DAC 12-bit right-aligned data holding register (DHR12RD)
    let ndt = DMA_LENGTH as u16;  // number of items to transfer
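The peripheral address above can be derived rather than hard-coded. A small sketch, assuming the DAC base address (0x4000_7400) and the DHR12RD register offset (0x20) listed in the STM32F303 reference manual; the constant and function names are chosen for this example only:

```rust
const DAC_BASE: u32 = 0x4000_7400;  // start of the DAC register block
const DHR12RD_OFFSET: u32 = 0x20;   // dual 12-bit right-aligned data holding register

/// Absolute address of DAC_DHR12RD, used as the DMA peripheral address.
const fn dac_dhr12rd_address() -> u32 {
    DAC_BASE + DHR12RD_OFFSET
}
```

Spelling out base and offset makes it easier to check the value against the register map, or to swap in a different data holding register later.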

Configuration for the DMA2 controller may look intimidating at first but it's actually quite straightforward once you break it down.

a) DMA2_CMAR3 sets the source memory address:

    // configure and enable DMA2 channel 3
    let dma2 = &dp.DMA2;
    dma2.cmar3.write(|w| w.ma().bits(ma));     // source memory address

b) DMA2_CPAR3 sets the peripheral register address:

    dma2.cpar3.write(|w| w.pa().bits(pa));     // destination peripheral address

c) DMA2_CNDTR3 sets the number of items to transfer:

    dma2.cndtr3.write(|w| w.ndt().bits(ndt));  // number of items to transfer

The channel configuration gets set up in DMA2_CCR3 and is reasonably self-explanatory:

    dma2.ccr3.write(|w| {
        w.dir().from_memory()   // source is memory
         .mem2mem().disabled()  // disable memory to memory transfer
         .minc().enabled()      // increment memory address every transfer
         .pinc().disabled()     // don't increment peripheral address every transfer
         .msize().bit32()       // memory word size is 32 bits
         .psize().bit32()       // peripheral word size is 32 bits
         .circ().enabled()      // dma mode is circular
         .pl().high()           // set dma priority to high
         .teie().enabled()      // trigger an interrupt if an error occurs
         .tcie().enabled()      // trigger an interrupt when transfer is complete
         .htie().enabled()      // trigger an interrupt when half the transfer is complete
    });

You may notice that we are enabling three different events that can trigger the DMA2_CH3 interrupt. The error event is obvious but why do we need two interrupts for the data transfer?

The reason for this is that we are going to be treating the DMA_BUFFER memory as a double buffered queue.

While the DMA2 controller sends one half of the buffer to the DAC we will be filling new data into the other half from our interrupt handler.

This is why we are enabling these two events: one that fires when the transfer is halfway through the buffer and the other once it has reached the end.
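The double-buffering scheme can be tried out on a host machine. The sketch below is a simulation only: a loop stands in for the DMA2 controller walking a circular buffer, and the refill happens instantly at the point where the real interrupt would fire. All names (LEN, Event, refill_range, simulate) are hypothetical, and unlike the article's code the buffer is pre-filled before the first pass so the output is easy to check.

```rust
use std::ops::Range;

const LEN: usize = 8; // small stand-in for DMA_LENGTH

#[derive(Clone, Copy)]
enum Event {
    HalfComplete, // first half has been sent
    Complete,     // second half has been sent, transfer wraps around
}

/// The half of the buffer that is safe to refill after `event`.
fn refill_range(event: Event) -> Range<usize> {
    match event {
        Event::HalfComplete => 0..LEN / 2,
        Event::Complete => LEN / 2..LEN,
    }
}

/// Walk the circular buffer `passes` times and return every value "sent".
fn simulate(passes: usize) -> Vec<u32> {
    let mut buffer = [0u32; LEN];
    let mut next = 0u32; // next sample value the producer will generate

    // pre-fill the whole buffer before the transfer starts
    for slot in buffer.iter_mut() {
        *slot = next;
        next += 1;
    }

    let mut sent = Vec::new();
    for _ in 0..passes {
        for i in 0..LEN {
            sent.push(buffer[i]);
            // raise an event at the midpoint and at the end of the buffer
            let event = match i + 1 {
                n if n == LEN / 2 => Some(Event::HalfComplete),
                n if n == LEN => Some(Event::Complete),
                _ => None,
            };
            // refill the half that has just been sent
            if let Some(event) = event {
                for slot in &mut buffer[refill_range(event)] {
                    *slot = next;
                    next += 1;
                }
            }
        }
    }
    sent
}
```

Because each half is refilled while the other half is being sent, the values returned by simulate form one unbroken sequence, no matter how many passes are made.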

Finally, enable the DMA2_CH3 interrupt in the NVIC (Nested vectored interrupt controller):

    // enable DMA interrupt
    let nvic = &mut cp.NVIC;
    nvic.enable(stm32f303::Interrupt::DMA2_CH3);

...and tell the DAC to request its data from DMA:

    // enable DMA for DAC
    let dac = &dp.DAC;
    dac.cr.modify(|_, w| w.dmaen1().enabled());
}

Phew!

5. Implement the `DMA2_CH3` interrupt handler

For the DMA2_CH3 interrupt handler we need to a) figure out which event has triggered the interrupt, b) clear the interrupt flags and c) handle each of the interrupt events.

You can get the interrupt event from the DMA2_ISR register and then clear the flags by writing to the corresponding bits in the DMA2_IFCR register:

Something like this:

#[interrupt]
fn DMA2_CH3() {
    // determine interrupt event
    let isr = cortex_m::interrupt::free(|cs| {
        let refcell = MUTEX_DMA2.borrow(cs).borrow();
        let dma2 = refcell.as_ref();

        // cache interrupt status register (before we clear the flags!)
        let isr = dma2.unwrap().isr.read();

        // clear interrupt flags
        dma2.unwrap().ifcr.write(|w| w.ctcif3().clear().chtif3().clear().cteif3().clear());

        isr
    });

Finally, invoke the correct handler for each event:

    // handle interrupt events
    if isr.htif3().is_half() {
        audio_callback(unsafe { &mut DMA_BUFFER }, DMA_LENGTH / 2, 0);
    } else if isr.tcif3().is_complete() {
        audio_callback(unsafe { &mut DMA_BUFFER }, DMA_LENGTH / 2, 1);
    } else if isr.teif3().is_error() {
        // handle dma error
    } else {
        // handle unknown interrupt
    }
}

Note the last parameter for the audio_callback() where we pass a hint about which half of the buffer should be filled next.
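For reference, the flag checks in the handler correspond to plain bit tests on the raw DMA2_ISR value. A host-side sketch, assuming the layout from the STM32F3 reference manual in which each channel owns four consecutive flag bits, placing the channel 3 flags at bits 8 to 11; the DmaEvent enum and decode_channel3 function are hypothetical names:

```rust
// Channel 3 flag bits in DMA_ISR (bit 8 is the global flag GIF3, unused here).
const TCIF3: u32 = 1 << 9;  // transfer complete
const HTIF3: u32 = 1 << 10; // half transfer
const TEIF3: u32 = 1 << 11; // transfer error

#[derive(Debug, PartialEq)]
enum DmaEvent {
    HalfTransfer,
    TransferComplete,
    Error,
    Unknown,
}

/// Decode a raw DMA_ISR value, checking flags in the same order as the handler.
fn decode_channel3(isr: u32) -> DmaEvent {
    if isr & HTIF3 != 0 {
        DmaEvent::HalfTransfer
    } else if isr & TCIF3 != 0 {
        DmaEvent::TransferComplete
    } else if isr & TEIF3 != 0 {
        DmaEvent::Error
    } else {
        DmaEvent::Unknown
    }
}
```

One consequence of the if/else chain is that only the first matching flag is reported. If the handler were ever delayed long enough for both the half-transfer and transfer-complete flags to be set, the second event would be dropped.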

6. Generate a signal the `DMA2` controller can feed to the `DAC`

The audio_callback() function provides a buffer, the length of the data that should be filled and an offset representing which half of the buffer to fill:

fn audio_callback(buffer: &mut [u32; DMA_LENGTH], length: usize, offset: usize) {

For example, you can use lookup tables to generate a sine wave and a sawtooth wave to feed to channel 1 and channel 2 of the DAC:

    static mut PHASE: f32 = 0.;
    let mut phase = unsafe { PHASE };

    let wt_length = wavetable::LENGTH;
    let wt_sin = wavetable::SIN;
    let wt_saw = wavetable::SAW;

    let dx = 261.6 * (1. / 44100.);  // phase increment per sample: 261.6 Hz (middle C) at 44.1 kHz

    for t in 0..length {
        let index = (phase * wt_length as f32) as usize;
        let channel_1 = wt_sin[index] as u32;
        let channel_2 = wt_saw[index] as u32;

        let frame = t + (offset * length);
        buffer[frame] = (channel_2 << 16) + channel_1;

        phase += dx;
        if phase >= 1.0 {
            phase -= 1.0;
        }
    }

    unsafe { PHASE = phase; }
}
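The wavetable module itself is not shown in this article. As an illustration of the idea, here is a host-side sketch that builds a sine table scaled to the DAC's 12-bit range and steps through it with a normalized phase. The table length of 64 and the names sine_table, advance_phase and lookup are assumptions made for this example, not the contents of the actual module.

```rust
use std::f32::consts::PI;

const WT_LENGTH: usize = 64; // hypothetical table size

/// Build one cycle of a sine wave scaled to the DAC's 12-bit range (0..=4095).
fn sine_table() -> [u16; WT_LENGTH] {
    let mut table = [0u16; WT_LENGTH];
    for (i, slot) in table.iter_mut().enumerate() {
        let angle = 2.0 * PI * i as f32 / WT_LENGTH as f32;
        // map -1.0..=1.0 onto 0.0..=4095.0
        *slot = ((angle.sin() * 0.5 + 0.5) * 4095.0).round() as u16;
    }
    table
}

/// Advance a normalized phase (0.0..1.0) by one sample and wrap it around.
fn advance_phase(phase: f32, frequency: f32, sample_rate: f32) -> f32 {
    let next = phase + frequency / sample_rate;
    if next >= 1.0 { next - 1.0 } else { next }
}

/// Look up the table entry for a normalized phase.
fn lookup(table: &[u16; WT_LENGTH], phase: f32) -> u16 {
    table[(phase * WT_LENGTH as f32) as usize]
}
```

Keeping the phase normalized to the range 0.0..1.0 means the same phase value can index tables of any length, which is how the article's callback reads the sine and sawtooth tables with a single index.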

There's a whole lot going on up there which could fill several articles of their own, but the important bits to note are:

We're using the DAC in dual mode which means that, for each sample, it's expecting two 12-bit words containing a value between 0 and 4095. One word for each channel.
Specifically, the DAC is reading your data from the DAC_DHR12RD register which contains two right-aligned 12-bit fields: channel 1 in bits 11:0 and channel 2 in bits 27:16.

Our buffer stores the samples for both channels in a single 32-bit word, which is why we need to encode our data as follows:

buffer[frame] = (channel_2 << 16) + channel_1;
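The encoding can be checked in isolation with a pair of small helpers. This is a sketch with hypothetical function names; the masking to 12 bits is an addition that the article's code does not perform, included so that an out-of-range sample cannot spill into the other channel's field or the reserved bits.

```rust
/// Pack two 12-bit samples into the DAC_DHR12RD layout:
/// channel 1 in bits 11:0, channel 2 in bits 27:16.
fn pack_dual(channel_1: u16, channel_2: u16) -> u32 {
    let ch1 = (channel_1 & 0x0FFF) as u32; // keep only the low 12 bits
    let ch2 = (channel_2 & 0x0FFF) as u32;
    (ch2 << 16) | ch1
}

/// Split a packed word back into (channel_1, channel_2).
fn unpack_dual(word: u32) -> (u16, u16) {
    ((word & 0x0FFF) as u16, ((word >> 16) & 0x0FFF) as u16)
}
```

For example, pack_dual(0x0ABC, 0x0123) produces 0x0123_0ABC, and unpack_dual reverses it.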

7. Start `DMA2` data transfer

At last we get to our main() function!

Given that the DMA2 controller is doing the bulk of the work and the CPU is literally only being used to initialize the peripherals and generate some noise for the audio buffer it ends up being really simple:

#[entry]
fn main() -> ! {
    let mut cp = cortex_m::Peripherals::take().unwrap();
    let dp = stm32f303::Peripherals::take().unwrap();

    // initialize peripherals
    init_leds(&dp);
    init_tim2(&dp);
    init_dac1(&dp);
    init_dma2(&mut cp, &dp);

    // wrap shared peripherals
    cortex_m::interrupt::free(|cs| {
        MUTEX_DMA2.borrow(cs).replace(Some(dp.DMA2));
    });

    // start dma transfer
    cortex_m::interrupt::free(|cs| {
        let refcell = MUTEX_DMA2.borrow(cs).borrow();
        let dma2 = refcell.as_ref().unwrap();
        dma2.ccr3.modify(|_, w| w.en().enabled());
    });

    // enter main loop
    loop {
        cortex_m::asm::wfi(); // wait for interrupt
    }
}

That's all there is to it!

8. Connect the `DAC` outputs to something useful

At full amplitude the DAC outputs 3.3 Volts peak to peak with sufficient current to drive a small speaker or headphones.

If you connect a speaker to GND and the PA4 or PA5 pins you should hear a tone at roughly 260 Hz (middle C).

If you have an oscilloscope you can see that PA4 is outputting a sine wave and PA5 a sawtooth wave.

Alternatively, you can connect the DAC outputs directly to your PC's sound card and look at the signal outputs with software.

Source code

You can find the code for this article in the GitHub repo:

git clone https://github.com/antoinevg/stm32f3-rust-examples.git
cd stm32f3-rust-examples
make deps

# run in one terminal
openocd -f openocd.cfg

# run in another terminal
cargo run --bin stm32f3-02-dma

ant @ flowdsp.io
