LPC Playback Formant-Shifting


#21

Cool, thank you!

    inline int32_t ThisBlepSample(uint32_t t) {
      if (t > 65535) {
        t = 65535;
      }
      return t * t >> 18;
    }

    inline int32_t NextBlepSample(uint32_t t) {
      if (t > 65535) {
        t = 65535;
      }
      t = 65535 - t;
      return -static_cast<int32_t>(t * t >> 18);
    }

Why the static cast on the NextBlepSample return value?


#22

The result is a negative number.
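
A small sketch of why the cast matters, using two hypothetical helper names that are not part of the code quoted above. `t * t >> 18` is a `uint32_t`, and unary minus on an unsigned value wraps modulo 2^32 instead of going negative. Converting that wrapped value back to `int32_t` typically gives the same negative number on two's-complement targets (guaranteed since C++20), so the cast is mostly about making the intent explicit and keeping the arithmetic signed from that point on.

```cpp
#include <cstdint>

// Hypothetical helper: negating an unsigned value wraps around modulo 2^32.
inline uint32_t NegateAsUnsigned(uint32_t x) {
  return -x;
}

// Hypothetical helper: cast to signed first, then negate, as in NextBlepSample.
inline int32_t NegateAsSigned(uint32_t x) {
  return -static_cast<int32_t>(x);
}
```

With an input of 5, the first helper returns 4294967291 and the second returns -5.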


#25

Can I ask if you use an integer or floating-point implementation of the LPC filter?

Does using floating point increase the subjective quality of the LPC filter at all?


#26

Made a little lookup table for rate quantised to notes (± 1 octave):

0.08333
0.08829
0.09354
0.09910
0.10499
0.11124
0.11785
0.12486
0.13228
0.14015
0.14848
0.15731
0.16667
0.17658
0.18708
0.19820
0.20999
0.22247
0.23570
0.24972
0.26457
0.28030
0.29697
0.31462
0.33333

#27

I used floating point everywhere. Maybe it makes things cleaner than on a Speak & Spell, but I didn’t bother comparing.


#28

Cool, I thought that might be the case.
Good to know.
I guess you had to convert the LPC coefficient table values to floating-point, too.

Did you normalise the table values to the 0.0f to 1.0f range?

Also, does your LPC filter return samples in the range -1.0f to +1.0f, as opposed to the original -512 to +512?


#29

LUT for ± 2 octaves:

0.04167
0.04414
0.04677
0.04955
0.05250
0.05562
0.05893
0.06243
0.06614
0.07007
0.07424
0.07866
0.08333
0.08829
0.09354
0.09910
0.10499
0.11124
0.11785
0.12486
0.13228
0.14015
0.14848
0.15731
0.16667
0.17658
0.18708
0.19820
0.20999
0.22247
0.23570
0.24972
0.26457
0.28030
0.29697
0.31462
0.33333
0.35315
0.37415
0.39640
0.41997
0.44495
0.47140
0.49944
0.52913
0.56060
0.59393
0.62925
0.66667
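
Both tables appear to follow rate = base x 2^(n / 12), with n the semitone offset and a base of 1/6. That base would match an 8 kHz filter rate relative to a 48 kHz host rate, but that reading is an assumption, as are the function and parameter names below. A minimal sketch that regenerates the values:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Builds a rate table covering +/- `octaves` octaves in semitone steps.
// `base_rate` is the rate at the centre of the table (assumed to be 1/6 here).
std::vector<double> SemitoneRateTable(int octaves, double base_rate = 1.0 / 6.0) {
  assert(octaves >= 0);
  std::vector<double> table;
  for (int semitone = -12 * octaves; semitone <= 12 * octaves; ++semitone) {
    table.push_back(base_rate * std::pow(2.0, semitone / 12.0));
  }
  return table;
}
```

`SemitoneRateTable(1)` gives the 25-entry table and `SemitoneRateTable(2)` the 49-entry one, before rounding to five decimals.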

#30

The coefficient table is still integers. It’s converted to floats at the step where I also do the crossfading between coefficients of adjacent frames.

The excitation pulse is a table of int8_t, but I have oversampled it by 32 (to do aliasing-free synthesis even at pitches that are not sr / N) and applied minimum phase reconstruction to make it more compact (allowing me to reach higher pitches).


#31

Interesting. I used the original TI 8-step bit-shift-based interpolation mechanism on the integer coefficient values.

Do you scale the coefficient values at this point, or keep them as-is and simply cast them to float?

I did a naive version of that. I simply copied each value in the chirp table 6 times. Then I rolled a little lookup table to convert LPC 8kHz pitch cycle-length values to the signed 27-bit note values Axoloti uses. This way, I can also play the chirp oscillator directly via MIDI.

Maybe I should lerp between the values in the extended loop table, rather than simply duplicating them.
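
A rough sketch of the two options, for comparison. The function names, the `int8_t` table type and the wrap-around at the end of the table (treating it as a loop) are assumptions for illustration, not the actual Axoloti code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Naive upsampling: each source value is simply repeated `factor` times.
std::vector<float> UpsampleRepeat(const std::vector<int8_t>& table, int factor) {
  assert(factor > 0);
  std::vector<float> out;
  for (int8_t value : table) {
    out.insert(out.end(), static_cast<std::size_t>(factor), static_cast<float>(value));
  }
  return out;
}

// Linear interpolation between neighbouring values. The last entry
// interpolates back towards the first one, assuming the table is looped.
std::vector<float> UpsampleLerp(const std::vector<int8_t>& table, int factor) {
  assert(factor > 0 && !table.empty());
  std::vector<float> out;
  for (std::size_t i = 0; i < table.size(); ++i) {
    const float a = static_cast<float>(table[i]);
    const float b = static_cast<float>(table[(i + 1) % table.size()]);
    for (int j = 0; j < factor; ++j) {
      const float fraction = static_cast<float>(j) / static_cast<float>(factor);
      out.push_back(a + (b - a) * fraction);
    }
  }
  return out;
}
```

Linear interpolation removes the stair-steps of the repeated version, though it is still not a band-limited interpolator.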


#32

No, they are divided by 128.

Since it’s done offline, it wouldn’t cost more to use a more proper interpolation method.

Here’s the whole story: original chirp, 32x FFT interpolation, minimum phase reconstruction, minimum phase reconstruction + some slight taper to 0 near the tail.

Doesn’t sound noticeably different from the original, but half the size in the time domain.


#33

Do you right-shift the integer values by 7 before casting to float, then, or cast to float, then do a floating-point divide?

Fascinating, thank you for the insights, Olivier.

So the final table ends up around 700 samples long?

Do you increment the phase-counter at 48kHz, and use the same phase-reset mechanism for changing the chirp/f0 pitch as used in the original TI systems?

The final table seems to have a DC offset not present on the original.


#34

Casting to float, then dividing by 128.0. This is compiled into the same ARM instruction.

Yes.

Yes, except that I don’t reset to the beginning of the table but to the 32x fractional sample at which the reset occurs. That’s why I have oversampled it.

No, the wiggles below 0 compensate the big bump above 0.
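
For the first answer in this post, a minimal sketch of the cast-then-divide conversion combined with a crossfade between adjacent frames. The `int16_t` storage type, the function names and the linear crossfade are assumptions; the thread only states that the table holds integers, that they are divided by 128, and that the conversion happens at the crossfading step.

```cpp
#include <cassert>
#include <cstdint>

// Cast first, then divide, so the fractional part is kept.
// (A right shift by 7 before the cast would throw it away.)
inline float CoefficientToFloat(int16_t coefficient) {
  return static_cast<float>(coefficient) / 128.0f;
}

// Linear crossfade between the coefficients of two adjacent frames.
// `t` runs from 0 (previous frame) to 1 (next frame).
inline float CrossfadedCoefficient(int16_t previous, int16_t next, float t) {
  assert(t >= 0.0f && t <= 1.0f);
  const float a = CoefficientToFloat(previous);
  const float b = CoefficientToFloat(next);
  return a + (b - a) * t;
}
```

For example, a stored value of 64 becomes 0.5, whereas `64 >> 7` would give 0.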


#35

Not sure I understand, unfortunately. I presume this is to smooth out the discontinuity when you reset the phase.

Would it also be possible to achieve something similar by applying the same blep algorithm to the non-oversampled table?

Ah, I see.


#36

You need the discontinuity when the period of the waveform is shorter than the duration of the chirp.

The upsampling is a different situation. Imagine that the period of your signal is 100 samples and your chirp is 20 samples long. So every 100th sample you “copy-paste” your chirp. Now what to do if your period is 100.2 samples? First you copy-paste your chirp, but 100 samples later, you copy-paste your chirp delayed by 0.2 samples. That’s where the pre-upsampled version comes in handy! In this case, I play the upsampled chirp table starting at 0.2 x 32 and with an increment of 32. For the next reset I’ll play the upsampled chirp table starting from 0.4 x 32. The fractional sample is phase / frequency at the wrap point.
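
The reset described above could be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: the function name, the `float` table type and the phase convention (a normalized phase in [0, 1) advanced by `frequency` cycles per sample) are made up for the example, and the exact fractional offset that comes out depends on that convention. When the period is shorter than the chirp, this version simply restarts the read position, which gives the discontinuity mentioned in the previous paragraph.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kOversampling = 32;

// `chirp32` is the chirp pre-upsampled by 32; `frequency` is in cycles per
// output sample. Returns `num_samples` samples of excitation.
std::vector<float> RenderExcitation(const std::vector<float>& chirp32,
                                    float frequency,
                                    std::size_t num_samples) {
  assert(frequency > 0.0f && frequency < 1.0f);
  std::vector<float> out(num_samples, 0.0f);
  float phase = 0.0f;
  std::size_t read_pos = 0;  // index into the upsampled table
  for (std::size_t i = 0; i < num_samples; ++i) {
    if (read_pos < chirp32.size()) {
      out[i] = chirp32[read_pos];
      read_pos += kOversampling;  // one output sample = 32 table entries
    }
    phase += frequency;
    if (phase >= 1.0f) {
      phase -= 1.0f;
      // Fraction of a sample elapsed since the wrap, in [0, 1).
      const float fraction = phase / frequency;
      // Restart at the matching sub-sample position, not at index 0.
      read_pos = static_cast<std::size_t>(fraction * kOversampling);
    }
  }
  return out;
}
```

With a period that is a whole number of samples, `fraction` is always 0 and this reduces to restarting at the beginning of the table.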


#37

I’ll pick my way through that, thanks very much for the explanation.

In the meantime, can I ask another, please?

Would there be any mileage in me simply smoothing my 6x-oversampled chirp wave table using minimum phase reconstruction, in the way you did, but without the 32x oversampling?


#38

Minimum phase reconstruction doesn’t smooth anything - it just produces a signal with a similar frequency content but more compact in the time domain. So it helps for the cases where your period becomes shorter than the chirp duration itself (a situation you can also deal with by letting the instances of the chirp overlap each other).
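
To make this concrete, here is a sketch of one common way to compute a minimum phase version of a signal offline, the folded-cepstrum (homomorphic) method. The thread does not say which method was actually used for the chirp, so treat this purely as an illustration; the function names are made up and the DFT is a naive O(N^2) one to keep the example self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive DFT (or inverse DFT); slow, but fine for a table built offline.
std::vector<Complex> Dft(const std::vector<Complex>& in, bool inverse) {
  const std::size_t n = in.size();
  const double two_pi = 2.0 * std::acos(-1.0);
  const double sign = inverse ? 1.0 : -1.0;
  std::vector<Complex> out(n);
  for (std::size_t k = 0; k < n; ++k) {
    Complex sum = 0.0;
    for (std::size_t t = 0; t < n; ++t) {
      const double angle =
          sign * two_pi * static_cast<double>((k * t) % n) / static_cast<double>(n);
      sum += in[t] * std::polar(1.0, angle);
    }
    out[k] = inverse ? sum / static_cast<double>(n) : sum;
  }
  return out;
}

// Returns a signal with (approximately) the same magnitude spectrum as `x`
// but with its energy packed towards the start. The length must be even.
std::vector<double> MinimumPhase(const std::vector<double>& x) {
  const std::size_t n = x.size();
  assert(n >= 2 && n % 2 == 0);
  std::vector<Complex> buf(x.begin(), x.end());
  buf = Dft(buf, false);
  // Log-magnitude spectrum (clamped to avoid log(0)).
  for (Complex& v : buf) v = std::log(std::max(std::abs(v), 1e-12));
  buf = Dft(buf, true);  // real cepstrum
  // Fold the cepstrum: keep n = 0 and n = N/2, double the first half,
  // zero the second half.
  for (std::size_t i = 0; i < n; ++i) {
    double gain = 0.0;
    if (i == 0 || i == n / 2) gain = 1.0;
    else if (i < n / 2) gain = 2.0;
    buf[i] = buf[i].real() * gain;
  }
  buf = Dft(buf, false);
  for (Complex& v : buf) v = std::exp(v);
  buf = Dft(buf, true);
  std::vector<double> out(n);
  for (std::size_t i = 0; i < n; ++i) out[i] = buf[i].real();
  return out;
}
```

As a sanity check, the two-tap signal {0.5, 1.0} (padded with zeros) has the same magnitude spectrum as {1.0, 0.5}, and the function should return something very close to the latter, with the larger tap moved to the front.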


#39

So it doesn’t produce a perceptually different result, in terms of audio quality, then, or reduce aliasing?


#40

I’ve managed to implement formant-shifting, based on your code above.

I decided not to bother making the shifting smooth, and stuck to semitone shifting, 12 semitones either way.

The formant parameter creates an index into two lookup tables, one for fractional filter sample-rate, the other for an offset for the chirp oscillator pitch.

Works well, though you’re right about it sounding bit-crushed. Not surprising, of course, since the filter is running at 4kHz, 1 octave down.


#41

The minimum phase reconstruction allows your chirp to be shorter, and thus your excitation signal to be pitched higher without having to overlap the tail of one chirp to the beginning of the one from the next period.


#42

I see. :slight_smile: