Simulating tape/analogue delay in DSP

This isn’t about how to get the sound, but rather the behaviour when the delay time is modulated. In a BBD delay, modulating the delay time changes both the read and write sample rates, effectively re-pitching the entire contents of the buffer in a way analogous to changing tape speed. Therefore, any sound going round in a feedback loop can be repitched (subject to the degradation of a BBD/tape delay).

In the usual digital delay implementations, the buffer is of fixed length and the write index always advances at the sample rate. The delay time is set by changing how many samples behind the write point to read from (with interpolation to allow for fractional delay times), so when the delay time is changed the pitch momentarily increases or decreases while the ‘read head’ is in motion, but returns to normal pitch when it stops. Further, if this is done with a feedback loop implemented the momentary pitch modulation gets recorded into all subsequent repeats. This would be analogous to moving the play head in a tape delay rather than changing the tape speed.
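For reference, here is a minimal sketch of that conventional scheme, assuming linear interpolation for the fractional read. The class name `FractionalDelay` and its parameters are made up for illustration and not taken from any particular library.

```python
import math


class FractionalDelay:
    """Conventional digital delay: the write index advances by one per sample,
    the delay time is the (fractional) distance from write to read position."""

    def __init__(self, max_delay_samples):
        # Two spare slots so the interpolated read never touches the slot
        # that is about to be overwritten.
        self.buf = [0.0] * (max_delay_samples + 2)
        self.write = 0

    def process(self, x, delay_samples):
        n = len(self.buf)
        self.buf[self.write] = x
        # The read position sits delay_samples behind the write position.
        pos = (self.write - delay_samples) % n
        i = int(math.floor(pos))
        frac = pos - i
        a = self.buf[i % n]
        b = self.buf[(i + 1) % n]
        self.write = (self.write + 1) % n
        # Linear interpolation between the two neighbouring samples.
        return a + frac * (b - a)
```

Moving `delay_samples` from call to call is the 'moving play head' case: the pitch bends only while the value is changing.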

It bugs me that a lot of ‘analogue modelled delays’ only simulate the signal degradation but not the modulation behaviour.

So my question is ‘How can I simulate the behaviour of the first type in a fixed sample rate environment?’

I can think of two theoretical methods but they both seem like they’d be problematic:

1. When the delay time is changed, instantaneously resample the entire contents of the buffer into a shorter/longer buffer. This seems impractical for all kinds of reasons, certainly ill suited to a smooth/continuous delay time change.

2. Increase the index increment for shorter delay times and write to (and read from) the buffer with some kind of interpolation… rather than resampling the entire buffer at once the resampling is sort of incorporated into the write and read. I have no idea how to do that though, let alone whether it is practical.

Can someone shed some light on the usual way people solve this problem, or at least tell me what I need to google?


Excellent! I’m glad I arrived at the ‘correct’ solution by just thinking about it. What is this method called? Googling for info about interpolated delays just finds me the usual kind of digital delay with fractional delay times.

Basically, I have no idea how to correctly write to the buffer with interpolation, nor do I know what to search for to learn about it.

Hi, interesting question :slight_smile: but I don’t quite understand what your technique would change with respect to the usual digital delay implem. Wouldn’t what you describe (repitching that is recorded into all subsequent repeats when feedback is on) happen also with a BBD?

Re. writing to the buffer “with interpolation”, it amounts to resampling your input I guess… what’s your issue with it?

The difference is that with tape/BBD/PT2399* the pitch change affects everything already in the delay, so if you halve the delay time it goes up by an octave. The record rate and the playback rate are the same as each other and both vary with the delay time.

With a ‘normal’ digital delay, pitch is only affected while you’re changing the delay time. The buffer is written to at a steady speed and read from at a steady speed, changing the delay time just changes the ‘spacing’ between the write pointer and the read pointer. The process of changing that spacing effectively causes the buffer to be read from faster or slower while it’s being moved, but as soon as you stop changing the spacing it’s reading from the buffer at normal rate again.

So if you have a note going round and round in the first kind of delay, you can freely repitch the entire thing by changing the delay time. In the second kind of delay, changing the delay time introduces a momentary ‘warble’ that will then appear on all subsequent repeats, but overall the note will stay at the same pitch.

I want to replicate the behaviour of the first kind of delay in a digital delay. I don’t know if this explanation is adequate, but hopefully you’ve played with both kinds of delay and noticed the difference.

As for my issue, I just don’t know how to do it.


*The PT2399 is a special case, since the delay time is defined by the rate the chip is clocked at, sort of like a digital BBD.

> As for my issue, I just don’t know how to do it.

Let’s say you want to simulate a 4096 stage BBD configured for a delay of 0.2s. Your initial sample rate is 48kHz.

  • Step 1: Resample the input audio from 48kHz to 4096 / 0.2 = 20480 Hz. You get a stream of samples at 20480 Hz.
  • Step 2: Write this to your buffer. The read/write pointer increments will always be 1.
  • Step 3: Read from your buffer @ (write_ptr + 4095) & 4095.
  • Step 4: Resample the output audio from 20480 Hz to 48kHz.

For step 1 and 4, if you want to go band-limited, you can have a look at how things are done in libsamplerate (there’s a big precomputed windowed sinc table, and it’s reading subsamples of it - that’s a direct implementation of this).

But BBD themselves rarely have anti-aliasing filters.

In any case, when you change the delay time, you don’t touch steps 2 and 3 - you just change the resampling ratios at steps 1 and 4.
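A rough sketch of the four steps, assuming plain linear interpolation for both resampling stages instead of a band-limited windowed sinc (so it will alias, much like an unfiltered BBD). The class name `BBDDelay` and its parameters are hypothetical. The output interpolation adds one extra clock tick of delay, so the total is (stages + 1) ticks.

```python
class BBDDelay:
    """BBD-style delay: a fixed number of stages clocked at a variable rate."""

    def __init__(self, stages=4096, sample_rate=48000.0):
        self.stages = stages
        self.fs = sample_rate
        self.buf = [0.0] * stages
        self.ptr = 0
        self.phase = 0.0      # BBD clock phase, in [0, 1)
        self.prev_in = 0.0    # previous host-rate input sample
        self.out_prev = 0.0   # the two most recent samples leaving the BBD
        self.out_curr = 0.0

    def process(self, x, delay_seconds):
        clock = self.stages / delay_seconds   # BBD clock rate in Hz
        inc = clock / self.fs                 # clock ticks per host sample
        self.phase += inc
        while self.phase >= 1.0:
            self.phase -= 1.0
            # Step 1: resample the input at the instant of the clock tick,
            # which falls between the previous and the current host sample.
            frac = 1.0 - self.phase / inc
            sample = self.prev_in + frac * (x - self.prev_in)
            # Steps 2 and 3: pointer increment is always 1; the slot about
            # to be overwritten holds the sample written `stages` ticks ago.
            self.out_prev = self.out_curr
            self.out_curr = self.buf[self.ptr]
            self.buf[self.ptr] = sample
            self.ptr = (self.ptr + 1) % self.stages
        self.prev_in = x
        # Step 4: resample back to the host rate by interpolating between
        # the last two BBD outputs according to the clock phase.
        return self.out_prev + self.phase * (self.out_curr - self.out_prev)
```

Changing `delay_seconds` between calls only changes `inc`, i.e. the two resampling ratios, so everything already sitting in the buffer gets re-pitched.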

> or at least tell me what I need to google?

I would use a fixed-length buffer with variable read and write speed. Imagine the buffer has a length of 44100 samples and the sampling rate of the whole system is 44.1kHz. If the delaytime is set to 1s, then the buffer will be filled and read from at the samplerate of 44.1kHz.

If the delay is set to 2s, then the buffer will be written to at 22.05kHz (downsampling the incoming audio by a factor of 0.5) and read from at 22.05kHz (upsampling by a factor of 2 so it can be output at the system’s native sampling frequency).
Or put differently: at 2s delaytime for each incoming 10 samples you downsample them to 5 samples and write them into the buffer. At the same time you take 5 samples out of the buffer and upsample them to 10 samples and send them to the output.

It works the same for shorter delaytimes, except that you will have to upsample when writing and downsample when reading.

Basically all you need is a fixed length buffer, and two sample-rate converters where the conversion factor of one is reciprocal to the other. The conversion factor is set by the following formula:

ConversionFactorOfWriteHead = WriteHeadSampleRate/NativeSampleRate = (BufferLength/DesiredDelayTime)/NativeSampleRate

This gives you a BBD “simulation”: it reduces the audio quality at longer delay times, and when you change the delay time you get the desired pitch-shifting effect. To increase the sound quality, you can increase the size of the buffer. There is no audio degradation if you ensure that the write head conversion factor stays >1.
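As a quick worked example of the formula, using the 44100-sample buffer from above (the function names are just for illustration):

```python
def write_conversion_factor(buffer_length, delay_seconds, native_rate):
    """Ratio of the write head sample rate to the native sample rate."""
    write_rate = buffer_length / delay_seconds
    return write_rate / native_rate


def read_conversion_factor(buffer_length, delay_seconds, native_rate):
    """The read-side converter uses the reciprocal ratio."""
    return 1.0 / write_conversion_factor(buffer_length, delay_seconds, native_rate)


for delay in (0.5, 1.0, 2.0):
    w = write_conversion_factor(44100, delay, 44100.0)
    r = read_conversion_factor(44100, delay, 44100.0)
    print(delay, w, r)
```

A 2s delay gives a write factor of 0.5 (downsample on the way in), and a 0.5s delay gives 2.0 (upsample on the way in).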

For the down/upsampling, you can google for “windowed sinc interpolation” or any other sample-rate conversion algorithm.

EDIT: Oops, pichenettes was writing at the same time. lol. At least our ideas are the same.

Thanks for pointing me in the right direction guys! I was stuck on the idea of changing the index increment but I now realise how silly that is when compared to resampling and writing steadily to the buffer.

I need to learn more about resampling anyway because it would also allow me to oversample things.

Hi, thanks for the explanation SirPrimalform. I actually never touched a bbd-like delay, and I’m still wrapping my head around the “functional” difference between the two delay models. Can you help me make sense out of your question?

Let’s ignore the question of quality loss when resampling for a minute (or consider that you have an infinite length BBD, or tape of infinitely fine granularity). I understand that in the digital model, the pitch-shifting is proportional to the derivative of the time parameter (that’s what you meant by “pitch is only affected when you change the delay time”), whereas in the BBD case, pitch shifting is directly proportional to the sampling rate (tape speed). Is that the only difference? In other words, if the knob on my digital delay controlled the integral of the delay time, and not the delay time itself, would I have the same results as with a BBD? Again, that’s ignoring the question of bounds of the delay line and quality of resampling.

Generally I’m looking for any tips on how to reason about these delay questions (mathematical models)…

I’m not sure if you can easily describe it with a mathematical formula. The pitch shifting only occurs to contents already in the buffer until they are overwritten. It’s a transient response and doesn’t really fit into a simple output=f(input) formula.

Yea, it can be described as a discrete time state-space model with a finite number of states ( equal to the number of samples in the buffer) or as a continuous time state space model with infinite states. But it’s so complex - what is it worth?

> It’s a transient response and doesn’t really fit into a simple output=f(input) formula.

Yes no, definitely not what I was looking for. What I’m lacking is a good algebraic theory that would allow me to predict these kinds of phenomena. For the moment, I only rely on examples/impulse responses, which are painful to go over by hand… I’ll ask on a DSP mailing list, see what comes up. Thank you for your help anyway!

I’m afraid I can’t answer your question as far as an actual mathematical model, but the difference between the two effects is analogous to the difference between phase modulation and frequency modulation.
The digital style is akin to phase modulation and the BBD style akin to frequency modulation. I find a square wave as a modulator is a good way to appreciate the difference.

With the ‘digital’ style, the square wave switching states would instantaneously change the point in the buffer that you’re reading from, causing a discontinuity rather than any kind of pitch modulation.

With the ‘analogue’ style, when the square wave switches state, it instantaneously changes the rate that the buffer is being written to and read from. A carefully chosen modulation depth with a square wave can cause interesting transposition/harmony effects.

That’s a good starting point indeed (I hadn’t thought of it as FM/PM); now the gap that still needs to be addressed is the writing into the buffer. I’ll think about it some more…

From a mathematical point of view, there’s a redundancy in the information of write speed and read speed. It’s not important what the actual speed of the tape is: it’s the ratio between write and read speed that matters. If you write a sine wave onto a tape at a speed of 1 and read it at a speed of 2, you double the frequency. The same happens for a write speed of 2 and a read speed of 4.

Maybe that can be a starting point.

Let m(t) be the modulation of the tape speed and v0 the initial tape speed; then v(t) = v0 + m(t) is the actual tape speed.
All you need is the ratio between the current read speed and the speed at which the information that is currently under the play head was written:

R = amount of frequency-shifting = v(t2) / v(t1)

where t1 is the time at which the information was written and t2 is the time at which it is read.

The desired solution is R (t2). For that, you need t1 and that’s the complicated part. I doubt that a general solution will result in a pretty formula. If you can specify m(t) a little further, you might be able to get to a solution.

Let’s assume m(t) = cos(omega*t). Let’s also assume that s is the distance between the write and read heads. The distance that a piece of information travels on the tape is x(t1, t2) = integral(v(t)) with the lower bound of the integral being t1 and the upper bound being t2. If you integrate this you get: x(t1, t2) = v0 (t2 - t1) + (sin(omega * t2) - sin(omega * t1)) / omega. We then want to solve for t1 under the condition x(t1, t2) == s. Here is where I stopped. I don’t think there is an easy solution to this equation. Maybe you can use an approximation with an infinite series (Taylor?).
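Even without a closed form, t1 is easy to find numerically. Here is a small sketch using bisection. It takes the antiderivative of cos(omega*t) to be sin(omega*t)/omega and assumes v0 > 1, so the tape always moves forwards and the travelled distance is monotonic in t1. All function names are made up for illustration.

```python
import math


def tape_travel(t1, t2, v0, omega):
    """Distance the tape moves between t1 and t2 for v(t) = v0 + cos(omega*t)."""
    return v0 * (t2 - t1) + (math.sin(omega * t2) - math.sin(omega * t1)) / omega


def solve_write_time(t2, s, v0, omega, iterations=60):
    """Find t1 such that tape_travel(t1, t2) == s, by bisection (needs v0 > 1)."""
    lo = t2 - s / (v0 - 1.0)   # slowest possible tape: earliest candidate t1
    hi = t2 - s / (v0 + 1.0)   # fastest possible tape: latest candidate t1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if tape_travel(mid, t2, v0, omega) > s:
            lo = mid           # tape travelled too far, so t1 must be later
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pitch_ratio(t2, s, v0, omega):
    """R = v(t2) / v(t1): read speed over the speed at the time of writing."""
    t1 = solve_write_time(t2, s, v0, omega)
    return (v0 + math.cos(omega * t2)) / (v0 + math.cos(omega * t1))
```

Sweeping t2 over one modulation period and plotting `pitch_ratio` gives the frequency shift curve for a chosen v0, omega and head distance s.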

Anyway, if you get t1 then you also get R. For other forms of m(t) you might get a simple solution. Try it.

For a practical way to “simulate” the pitch shifting and get R (t) for arbitrary m(t), you can use a tape delay and, instead of writing the actual audio signal, write the current tape speed. The value coming off the read head is then v(t1), so the current tape speed divided by the value from the read head is R (t). You can program this in pd or Reaktor or any other language.

Oh and I also can’t help but think of the Doppler effect. I struggle to relate it to this delay discussion but something tells me there is a relation.

There’s definitely a relation to the Doppler effect. It’s almost exactly what you’re getting when you change the delay by moving the read point offset, i.e. the phase modulation. The pitch change is proportional to the rate of change in delay time.

Hi all,
I’m getting interesting responses over there (thread towards the bottom of the list) if you’re interested.
@TheSlowGrowth sorry, I still have to digest your long answer; thank you in any case.

The first answer in the list is basically what I’ve written in my long post. The question really is: What do you want to achieve? A) A prediction of the system’s behaviour for a specific input, aka simulation? Then you can use the method I described at the bottom of my last post. It’s easy and it always works. B) A mathematical formula or solution to e.g. predict effects of audio rate modulation in the frequency domain? Then you must solve it analytically. As I showed above, that relies heavily on the assumptions for the modulator signal and in some cases might not be solvable at all.

Assume a continuous system.

We want to find out the delay time d(t) satisfying out(t) = in(t - d(t))

Variable read head delay: easy, d(t) is the write-read head distance, we manipulate it directly.

BBD delay: in the interval d(t) we had to travel through N samples (N is the length of the BBD), at an instantaneous rate of r(t) samples per second. Thus:

\int_0^{d(t)} r(t - \tau)\,d\tau = N.

Thus: R (t) - R (t - d(t)) = N

d(t) = t - R^{-1}(R (t) - N).

You can think of R, the primitive of r, as a kind of sample counter: how many BBD clock ticks have occurred so far. Because the BBD works in discrete time R^-1 is the discrete sequence of instants at which clock ticks occurred. And you can see that you only need access to the N most recent values of R^-1 to find d(t).

So if you want a somewhat impractical algorithm to simulate a BBD-like response with a plain variable read head delay:

  • You keep track of your BBD sample count integral R and its inverse R^-1 for the recent past. That is to say, you simulate the BBD clock as time goes, and every time a sample is clocked, you write its time t in a circular buffer.
  • At any time t, you set your “instantaneous” delay time to t minus the value you read N samples back in your circular buffer keeping track of clock tick instants.

Your circular buffer, which should be no more than N samples long, is simply keeping track of the arrival time of the samples residing in the N stages of your BBD, and you “look back” at the sample that corresponds to the arrival time stored N stages back.

This effectively allows you to look up solutions of the integral equation - but this is only possible because R is “steppy”, so it’s cheap to tabulate its inverse. With this, you also see why any change to the clock rate r will take N samples to propagate through the BBD.

This algorithm won’t simulate the variable sample-rate artefacts of a BBD (it doesn’t take into account the sampling of the signal that occurs at each clock tick), but only simulates the “how far back should we look” bit. It could allow a classic variable sample head delay to have a BBD-like response to modulations.
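A sketch of that bookkeeping, under a few assumptions of mine: the clock rate is treated as constant within each host sample, the stored tick time is read just before it is overwritten (so N slots are enough), and the lookup of R^-1 is linearly interpolated by the clock phase so the result does not jump at every tick. The class name `BBDDelayTime` and its method are hypothetical.

```python
class BBDDelayTime:
    """Tracks d(t) for an N-stage BBD by logging the instants of clock ticks."""

    def __init__(self, stages):
        self.stages = stages
        self.tick_times = [0.0] * stages   # arrival time of each stored sample
        self.ptr = 0
        self.phase = 0.0                   # fractional part of the tick count R
        self.ticks_seen = 0
        self.exit_time = 0.0               # tick instant stored N ticks ago

    def update(self, t, dt, clock_rate):
        """Advance by dt seconds, ending at time t. Returns the delay time in
        seconds, or None while the line has not yet filled up."""
        self.phase += clock_rate * dt
        while self.phase >= 1.0:
            self.phase -= 1.0
            tick_time = t - self.phase / clock_rate
            # Read the slot before overwriting it: it holds the tick instant
            # recorded exactly N ticks ago.
            self.exit_time = self.tick_times[self.ptr]
            self.tick_times[self.ptr] = tick_time
            self.ptr = (self.ptr + 1) % self.stages
            self.ticks_seen += 1
        if self.ticks_seen <= self.stages:
            return None
        # R^-1(R(t) - N), interpolated between the tick N back and the next one.
        next_time = self.tick_times[self.ptr]
        source_time = self.exit_time + self.phase * (next_time - self.exit_time)
        return t - source_time
```

The returned value can be fed as the delay time of an ordinary interpolated variable read head delay. After a change of `clock_rate` it takes N ticks for the result to settle at N / clock_rate.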