Updating STM32 audio bootloader WAV file writer to Python 3.8 and adding support for STM32H743xI devices

I know I could wait for the Beads firmware to drop someday… but I’m impatient and would like to use the stm32 audio bootloader very soon for a hobby project I’m working on with the H7… so I thought I’d give it a shot myself!

In addition, I’ve been struggling to get Python 2.5 and an older version of NumPy to run on my Windows machine, which prompted me to go through and update the code for Python 3.8.2 as well.

I feel like I’m super close but I’m noticing some audible differences between the .WAV files my script is creating and the firmware releases on the Mutable site. If anyone is interested in reviewing what I have thus far, it would be much appreciated. I will share all of this on git for others to use in the future.

Anyway… here is the most-recent version of the code. I’ve tried to tag all of my updates with some justification. You can run the script using run.bat on the testfirmware.bin file I’ve included:

When I run the script, I’m using the following settings:

py encoder.py -t stm32h7 -s 48000 -b 12000 -c 6000 -p 256 testfirmware.bin

The STM32H743xI has two banks of flash memory with eight 128 KB sectors per bank. 128 * 1024 = 131,072 bytes per sector. I’m storing the bootloader in Bank 1 Sector 0 and my application will be stored in sector 1 and onward.
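
For reference, the layout described above can be sketched in a few lines of Python. The constant names match the ones used in the encoder.py snippets in this post, but the base addresses (bank 1 at 0x08000000, bank 2 at 0x08100000) are assumptions worth double-checking against the reference manual for your exact part.

```python
# Sketch of the STM32H743xI flash layout described above.
# Assumptions: bank 1 starts at 0x08000000 and bank 2 at 0x08100000.
FLASH_BASE = 0x08000000
BANK_SIZE = 0x00100000            # 1 MB per bank
SECTORS_PER_BANK = 8
STM32H7_BLOCK_SIZE = 128 * 1024   # one sector = 131,072 bytes

STM32H7_SECTOR_BASE_ADDRESS = [
    FLASH_BASE + bank * BANK_SIZE + sector * STM32H7_BLOCK_SIZE
    for bank in range(2)
    for sector in range(SECTORS_PER_BANK)
]

# Bootloader lives in bank 1, sector 0; the application starts at sector 1.
STM32H7_APPLICATION_START = STM32H7_SECTOR_BASE_ADDRESS[1]

print(hex(STM32H7_APPLICATION_START))   # 0x8020000
```

With this layout every block address lands exactly on a sector boundary, so the "address in STM32H7_SECTOR_BASE_ADDRESS" test in the encoder is true for every block.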


So I added the following to encoder.py:



STM32H7_BLOCK_SIZE = 131072

Then, because the file() built-in has been removed in Py 3, I added:

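  # Read the firmware image once. (Calling data.read() and then data.read(1)
  # would leave dataList empty, because the first read consumes the file.)
  with open(args[0], 'rb') as data:
    dataList = data.read()
  dataLength = len(dataList)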

Then, I basically just created another set of options for stm32h7… but used the same ‘pause’ settings as the ones used for the stm32f4.

  elif options.target == 'stm32h7':
    for x in range(0, dataLength, STM32H7_BLOCK_SIZE):
      address = STM32H7_APPLICATION_START + x
      block = dataList[x:x+STM32H7_BLOCK_SIZE]
      pause = 3.5 if address in STM32H7_SECTOR_BASE_ADDRESS else 0.2
      for block in encoder.code(block, STM32H7_BLOCK_SIZE, pause):
        if len(block):
          writer.append(block)
    blank_duration = 5.0

There were various other updates required to explicitly define certain operations as integer operations instead of floating point… I tackled this by typecasting or using floor division in a couple of places. One of the most important spots seemed to be in the _encode_qpsk() method…

  def _encode_qpsk(self, symbol_stream):
    # Updated by JJ - used floor division here
    ratio = self._sr // self._br * 2
    # End JJ
    symbol_stream = numpy.array(symbol_stream)
    bitstream_even = 2 * self._upsample(symbol_stream % 2, ratio) - 1
    # Updated by JJ - used floor division here
    bitstream_odd = 2 * self._upsample(symbol_stream // 2, ratio) - 1
    # End JJ
    return bitstream_even / numpy.sqrt(2.0), bitstream_odd / numpy.sqrt(2.0)
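
To see why the floor division matters: in Python 2, / between two ints truncates, while in Python 3 it always returns a float, and a float is rejected wherever an integer count is needed. A quick check using the sample rate and bit rate from the command line above (the variable names are only for illustration):

```python
sample_rate = 48000   # -s
bit_rate = 12000      # -b

ratio_float = sample_rate / bit_rate * 2    # Python 3 true division -> float
ratio_int = sample_rate // bit_rate * 2     # floor division -> int

print(type(ratio_float).__name__, ratio_float)   # float 8.0
print(type(ratio_int).__name__, ratio_int)       # int 8

# A float is rejected wherever an integer count is required.
try:
    [0] * ratio_float
except TypeError as error:
    print("TypeError:", error)
```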

After running the code, my .wav file seems to be formatted correctly, which is good and I’m not getting any errors:

  • Chunk ID = 52/49/46/46 = ‘RIFF’ in ascii
  • ChunkSize = 0x009961a4 = 10,052,004 → 9816 kBytes
  • Format = 57/41/56/45 = ‘WAVE’ in ascii
  • Subchunk1 ID = 66/6d/74/20 = ‘fmt ’ in ascii
  • Subchunk 1 size = 0x00000010 (little endian) aka 16 bytes
  • AudioFormat = 0x0001 (little endian) aka PCM
  • NumChannels = 0x0001 (little endian) aka mono
  • SampleRate = 0x0000bb80 (little endian) aka 48000 Hz
  • ByteRate = 48000 * 1 * 16 / 8 = 96000 = 0x00017700 (little endian)
  • BlockAlign = 1 * 16 / 8 = 2 = 0x0002 (little endian)
  • BitsPerSample = 0x0010 (little endian) = 16 bits
  • Subchunk2ID = 64/61/74/61 = ‘data’ in ascii
  • Subchunk2size = 0x00996180 (little endian) = 10,051,968 → 9816 kBytes
  • Data = the actual data
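
These header values can be cross-checked with a few lines of Python. This is only a sketch: it rebuilds a canonical 44-byte PCM header from the numbers listed above and unpacks it again with struct. The function name is made up for illustration.

```python
import struct

HEADER_FORMAT = "<4sI4s4sIHHIIHH4sI"   # little-endian canonical PCM header

def build_pcm_header(sample_rate, channels, bits_per_sample, data_size):
    """Pack a canonical 44-byte RIFF/WAVE header for PCM audio."""
    block_align = channels * bits_per_sample // 8
    byte_rate = sample_rate * block_align
    return struct.pack(
        HEADER_FORMAT,
        b"RIFF", 36 + data_size, b"WAVE",
        b"fmt ", 16, 1, channels, sample_rate,
        byte_rate, block_align, bits_per_sample,
        b"data", data_size,
    )

header = build_pcm_header(48000, 1, 16, 0x00996180)
fields = struct.unpack(HEADER_FORMAT, header)
chunk_size, byte_rate, block_align, data_size = (
    fields[1], fields[8], fields[9], fields[12])

print(len(header))                # 44
print(hex(chunk_size))            # 0x9961a4
print(byte_rate, block_align)     # 96000 2
print(round(data_size / byte_rate, 1), "seconds of audio")
```

The ChunkSize is always the data size plus the 36 header bytes that follow the ChunkSize field, which matches 0x00996180 + 0x24 = 0x009961a4.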


Converting the .WAV file output data to .csv and plotting a small section of the output… things SEEM to look QPSK-ish? But the sound output is much more harsh than what I’m hearing on other Mutable releases so I feel like I’m not quite there yet.

This is where I’ve left off for now. Any advice would be much appreciated! I’m out of ideas at the moment and am going to sleep on it for now.

Have you had a look at this? float32/qpsk on GitHub: a C++ QPSK decoder for embedded systems. Might be a cleaner and more modern starting point!

This looks great! I’ve learned a lot going line by line through the Mutable wav file writer but I agree… this looks like it was written with Py 3 in mind instead of just applying some band aids to bring the Py 2.5 code into the Py 3 world.

I assume this is the work of @float32? Thank you so much for sharing your work with the world! Your repo is very well documented and a joy to look through. :blue_heart::blue_heart::blue_heart: Excited to dig in.

As I get my setup going with my STM32H7, I’ll be sure to share my results here!

Thanks! I hope you find it useful as a reference.

At a glance, your waveform plot looks correct to me. If the only problem is that it sounds unlike official Mutable wav files, it might just be because you aren’t enabling the scrambler in your encoder command (-k or --scramble flag)

Sure enough, that was it. I applied the scrambler (after making some adjustments to make it py 3 compatible) and then it sounded like what I was expecting. Still, the flexibility of your encoder and decoder is very appealing. I’ll spend some time reviewing your repo this week as I work through my code for the target. Thank you again! Your work is very inspiring. :two_hearts: :two_hearts:
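
For anyone wondering why the scrambler changes the sound so much: firmware images tend to contain long runs of identical bytes (0x00 or 0xFF padding), and repeated data comes out of the modulator as a steady, harsh tone. A scrambler XORs the data with a pseudo-random byte stream so the result sounds more like noise. The sketch below shows the general idea only. The generator constants, the seed and the function name are assumptions for illustration and are not taken from either encoder.

```python
def scramble(data, seed=0):
    """XOR each byte with the top byte of a 32-bit LCG state (illustrative)."""
    state = seed & 0xFFFFFFFF
    out = bytearray()
    for byte in data:
        out.append(byte ^ (state >> 24))
        state = (state * 1664525 + 1013904223) & 0xFFFFFFFF
    return bytes(out)

padding = bytes([0xFF]) * 16
scrambled = scramble(padding, seed=1)

print(len(set(padding)))          # 1: a single repeated value
print(len(set(scrambled)) > 1)    # True: the run has been broken up
# XOR is its own inverse, so running it again restores the input.
print(scramble(scrambled, seed=1) == padding)   # True
```

The decoder has to apply the same sequence with the same seed, which is why the flag has to match on both sides.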

Hey @float32 … I’ve been digging into your qpsk project over the weekend and I feel like I’ve got a good start integrating it into my bootloader; your example code for the F4 discovery board was very helpful.

However, I’m having some compilation issues due to my current setup. For ease of use I’ve been using STM32CubeIDE for development and compilation… and… sadly… it only supports C11/C++14. I’m guessing this is due to the fact that ARM/GNU has not released an official version of the compiler that supports C++17 yet, so STM32 doesn’t allow the use of it within the development environment. It sounds like there is a community version available that supports C++17 and I can redevelop my workspace around a makefile… but this might take some time… I’m not totally opposed to this but I’m wondering if I can come up with a simpler workaround.

I was able to get my code to compile and run using C++14 by removing the “inline” keyword from various variables and commenting out one line: “static_assert(std::atomic_bool::is_always_lock_free)”.

My question is… do you think I’ll have any issues running your decoder if I remove these inline keywords and this static_assert()?

If it compiles, it will probably work fine.

What version of the IDE are you using? The release notes for v1.6.1 indicate that the toolchain is a customized version of the official ARM toolchain version 9-2020-q2-update, which is based on GCC 9 and should have no problem with C++17.

Wellll whattaya know… I’m on 1.5.0. Let me look into updating and I’ll report back.

Thank you for taking the time to respond! :two_hearts:

@float32 actually, I was on version 1.1.0! The update did the trick and then I had access to C++17 compilation options. Now my code compiles with your original source code without issue. Thanks so much for the tip.

I’m trying to sort out some odd hard fault issue I’m getting during the copy from source to destination in HAL_FLASH_Program… but before the hard fault, it seems like the decoder is successfully decoding the data I’m getting from my audio codec and storing it in the data buffer, so that’s great!

I haven’t spent a ton of time working through the hard fault, but I’ll post my write_block code here in case anyone is interested in reviewing:

bool WriteBlock(uint32_t address, const uint32_t* data)
{
    SectorInfo sector_info;
    bool do_erase = false;

    for (auto& sector : kSectors) {
        if (address == sector.address) {
            sector_info = sector;
            do_erase = true;
        }
    }

    // Step 1: Clear all error flags, particularly PGSERR and INCERR

    // Step 2: Unlock the FLASH CR register

    // Step 3: Flash erase params
    uint32_t sector_error;
    FLASH_EraseInitTypeDef erase_params;
    erase_params.TypeErase = FLASH_TYPEERASE_SECTORS;
    erase_params.Banks = (sector_info.sector_num < 8) ? FLASH_BANK_1 : FLASH_BANK_2;
    erase_params.Sector = sector_info.sector_num & 7; // aka modulo 8 operation
    erase_params.NbSectors = 1;
    erase_params.VoltageRange = FLASH_VOLTAGE_RANGE_4;

    // Step 4: Erase flash bank
    if (do_erase) {
        HAL_FLASHEx_Erase(&erase_params, &sector_error);
        FLASH_WaitForLastOperation(HAL_MAX_DELAY, erase_params.Banks);
    }

    // Step 5: Write data to flash memory
    for (uint32_t i = 0; i < kBlockSize; i += 4) {
        if (HAL_OK != HAL_FLASH_Program(FLASH_TYPEPROGRAM_FLASHWORD, address + i, *data++)) {
            return false;
        }
        FLASH_WaitForLastOperation(HAL_MAX_DELAY, erase_params.Banks);
    }

    // Step 6: Lock FLASH so that the next unlock doesn't perma-lock the FLASH CR Reg
    if (HAL_OK != HAL_FLASH_Lock()) {
        return false;
    }

    return true;
}

What does your kSectors array look like? That needs to match your H7 chip - you probably can’t use the same values from the F407 example.

@float32 yooooo I (almost) got it :slightly_smiling_face:

I had the kSectors array set up correctly. It was mostly an issue with the HAL’s implementation of the flash program function. It wants to copy 256 bits/8 words at a time but I was trying to copy word by word.
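
To spell out the arithmetic: 256 bits is 32 bytes, or eight 32-bit words, so the write loop has to step through the block 32 bytes at a time instead of 4. A rough Python sketch of the resulting write addresses, assuming a 2048-byte block that starts at 0x08020000 (the function name is made up for illustration):

```python
FLASH_WORD_BYTES = 256 // 8   # one flash word = 256 bits = 32 bytes

def flash_word_writes(block_address, block_size):
    """Yield (flash address, offset into the block) for each flash-word write."""
    if block_address % FLASH_WORD_BYTES or block_size % FLASH_WORD_BYTES:
        raise ValueError("address and size must be multiples of 32 bytes")
    for offset in range(0, block_size, FLASH_WORD_BYTES):
        yield block_address + offset, offset

writes = list(flash_word_writes(0x08020000, 2048))
print(len(writes))            # 64 writes instead of 512 single-word writes
print(hex(writes[1][0]))      # 0x8020020
```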

There are still a couple of kinks to be worked out… I’m having an occasional issue with writing the first block of data to memory, sometimes it just skips the first 2048 bytes of the binary, but all the subsequent blocks are spot-on… nevertheless I was able to get it to flash perfectly one time this morning right before I ran out the door for work.

I’ll report back more later tonight after I’ve gotten the chance to look into it a little more… just thought I’d share that I’m really close! :blue_heart::white_heart::blue_heart::white_heart::blue_heart:


@float32 alright so I resolved my issue with the “occasional first 2048 bytes missing” by just increasing self._encode_blank to generate a slightly longer tone to start. 5 seconds seemed like it did the trick. Not sure if this is because my codec needs a second to warm up or what… but this seemed like an easy workaround. It flashes that first block every time now.

def _encode_intro(self):
    return self._encode_blank(5.0)

Now… I’m having one final little issue… every time during the last write, I’m getting an ERROR_OVERFLOW. I dumped the firmware from the device and ran a diff with the .bin and noticed that something was not quite right! That last byte isn’t being written…

Readback from device .bin is on the left while the original .bin is on the right:

Not sure what could be causing that… is it because I’m writing 8 words per flash write…? Or maybe I don’t have enough bytes to fill the last packet?

Here’s a link to the latest in case you’re curious…

Maybe it’s worth noting that I’m using a block based approach (since it seemed like your code supported that) and each block is 32 samples in length. Not sure if that makes a difference.

Also… my encoder.py args:

py encoder.py -s 48000 -y 6000 -b 2048 -w 500 -f 128k:2000:16 -a +0x20000 -x 0x08000000 -p 256 -e 0x420ACAB -t bin -i Tapes_H7_CPP_v0.1.bin -o testfirmware.wav

Does the overflow error happen before or after it attempts to write the block?

The encoder should pad out the binary to a multiple of the block size, but you might want to inspect the waveform to make sure it’s actually happening.
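
A quick way to check the padding on the Python side is sketched below. The helper name and the 0xFF fill byte are assumptions for illustration; the real encoder may pad differently.

```python
def pad_to_block_size(data, block_size, fill=0xFF):
    """Pad data so its length is a whole number of blocks."""
    remainder = len(data) % block_size
    if remainder == 0:
        return data
    return data + bytes([fill]) * (block_size - remainder)

firmware = bytes(5000)                      # stand-in for a .bin image
padded = pad_to_block_size(firmware, 2048)

print(len(padded), len(padded) // 2048)     # 6144 3
```

If the length of the binary is not already a multiple of the block size, the final block in the waveform should be the same length as all the others.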

Huh… idk what the deal is but I switched from playing my .wav file through Windows’ “Groove Music” app to just playing the .wav file through Ableton and…

Succcesssss!!! :slight_smile: No more error overflow. Maybe Groove Music adds some unexpected cross fading or something…

Also… I just realized that the “last missing byte” is actually just some weird artifact from vim when viewing binaries. Which means it’s workingggggggg. Yayyyyy.

So happy I was able to get this to work (for now)! I’ll continue to use this as my main firmware update method and I’ll report back if I find any other issues. Thank you so much for your help @float32 ! I’m very grateful.

A friend of mine and I are working on an STM32-based stompbox thing and this will make sending updates to them so much easier. If and when we ever finish it… I’ll be sure to share whatever we make!

If you ever need an extra set of eyes to review some code, a pcb layout, or something else embedded-related, feel free to reach out! Thanks again. :two_hearts: :two_hearts: :two_hearts:

@float32 just one final note… I actually got it to work with “Groove Music” too by just adding another intro/blank at the end of the encode method:

def encode(self, blocks):
    symbols = []
    symbols += self._encode_intro()

    for (data, time) in blocks:
        symbols += self._encode_block(data)
        symbols += self._encode_blank(time)

    symbols += self._encode_outro()
    symbols += self._encode_intro()  # added: extra intro/blank after the outro
    return symbols

Excellent, glad you got it working!