Fast (?) log2(x) routine without loop, branch etc

Hi everyone,

I’ve learned a lot by reading Émilie’s codebase and forum posts through the years.

To try to give back to the community a bit, I wanted to share a “new to me” level novel way of calculating log2(x) and pow2(x) without any branches / loops.

Best,
ihsan.

3 Likes

That’s a very neat trick! I didn’t know the clz instruction!

For your pow2 routine, maybe it could be retrofitted in Braids’ ugly ComputePhaseIncrement method (which got ported without much thoughts from the Shruthi codebase - where the idea of doing a division by 12 to get the argument to pow2 was insane).

The modules with FPUs all employ a routine for evaluating 2^(x/12.0), which uses the identity 2^(x / 12) = 2^(frac(x/12)) x 2^(int(x/12)), with LUTs (256 entries) for the multiplier and multiplicand.

Over time, I started calling this routine to compute 2^x too, which explains horrors like SemitonesToRatio(decay * 84.0f) instead of pow2f(decay * 7.0f) in the codebase.

In an upcoming update of Plaits, I use this trick (which works only for floats): use a polynomial approximation for the fractional part of x, then directly mess with the bits of the IEEE 754 representation to multiply by 2^int(x).

template<int order>
inline float Pow2Fast(float x) {
  int32_t x_integral = static_cast<int32_t>(x);
  if (x < 0.0f) {
    --x_integral;
  }
  x -= static_cast<float>(x_integral);

  union {
    float f;
    int32_t w;
  } r;

  if (order == 1) {
    r.f = 1.0f + x;
  } else if (order == 2) {
    r.f = 1.0f + x * (0.6565f + x * 0.3435f);
  } else if (order == 3) {
    r.f = 1.0f + x * (0.6958f + x * (0.2251f + x * 0.0791f));
  }
  r.w += x_integral << 23;
  return r.f;
}
2 Likes

Thank you for sharing. I was also not aware of the CLZ instruction. Great observation!

I have recently been using Paul Mineiro’s approach to calculate log2, log10, pow, pow2 (also utilizes properties of IEEE 754 representation like @pichenettes method) for a compressor to convert to and from decibels.