C# benchmark of 2^x optimizations

As promised last week, I'm going to compare three methods of optimizing the 2^x calculation:

C# implementations

Cache tables method

See the previous post. I tried to optimize it further using union structures, the Math.Floor(…) function and the BitConverter class, but all of these only decreased performance, so I'm sticking with the original code.
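
The cache idea, roughly: split x into integer and fractional parts, handle the integer part with a bit shift, and look the fractional part up in a precomputed table. The snippet below is a single-table sketch of that idea (names are mine; the actual implementation from the previous post uses up to three tables, ~3 * 2048 B in total, for much higher precision):

[csharp]using System;

public static class Pow2Cache
{
    const int Size = 256;
    static readonly double[] Table = new double[Size];

    // precompute 2 ^ (i / Size) for the fractional part
    static Pow2Cache()
    {
        for (int i = 0; i < Size; i++)
            Table[i] = Math.Pow(2.0, (double)i / Size);
    }

    public static double Pow2(double x)
    {
        // integer and fractional parts of x
        int integer = (int)x;
        double frac = x - integer;

        // map a negative fraction into [0..1)
        if (frac < 0) { integer--; frac++; }

        // 2 ^ Int(x) via bit shifts; valid for |integer| < 31
        double powInt = integer >= 0 ? 1 << integer : 1d / (1 << -integer);

        // table lookup for 2 ^ frac
        return powInt * Table[(int)(frac * Size)];
    }
}[/csharp]

A single 256-entry table like this caps the relative error at about 2^(1/256) - 1 ≈ 0.27%; the extra tables in the real implementation refine that.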

Polynomial approximation

I used the skeleton of my function, took the calculation from this source and the constants from here.
Although it's suggested that the function was fitted to the [-0.5…0.5] range, the relative error analysis indicates [0..1] (and that's what I assumed in the code):

Exp2() - polynomial approximations error graph

Exp2() - polynomial approximations relative error graph for various degrees

Excel spreadsheet calculation: Relative_error(x) = | 2 ^ x - PolyN(x) | / 2 ^ x

Polynomials used:

  • Poly2(x) = 1.0017247 + (x * (0.65763628 + x * 0.33718944))
  • Poly3(x) = 0.9999252 + (x * (0.69583356 + x * (0.22606716 + x * 0.078024521)))
  • Poly4(x) = 1.0000026 + (x * (0.69300383 + x * (0.24144275 + x * (0.052011464 + x * 0.013534167))))
  • Poly5(x) = 0.99999994 + (x * (0.69315308 + x * (0.24015361 + x * (0.055826318 + x * (0.0089893397 + x * 0.0018775767)))))
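
The spreadsheet's error column is easy to reproduce in code. A sketch (method name, step and sampling are my assumptions) that scans [0..1] and reports the maximum relative error for Poly3:

[csharp]using System;

public static class PolyErrorScan
{
    public static double MaxRelativeError()
    {
        double maxErr = 0;
        for (double x = 0; x <= 1.0; x += 0.001)
        {
            double exact = Math.Pow(2.0, x);
            // Poly3 from the list above
            double poly3 = 0.9999252 + (x * (0.69583356 + x * (0.22606716 + x * 0.078024521)));
            double err = Math.Abs(exact - poly3) / exact;   // Relative_error(x)
            if (err > maxErr) maxErr = err;
        }
        return maxErr;   // on the order of 1e-4 for Poly3
    }
}[/csharp]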

Example code for Poly3:

[csharp]public double Pow2Poly3(double x)
{
    // Integer part of x
    int integer = (int)x;

    // Fractional part of x
    double frac = x - integer;

    // 2 ^ Int(x) value
    double powInt;

    if (x >= 0)
    {
        // 2 ^ Int(x)
        powInt = 1 << integer;
    }
    else
    {
        // conversion of the negative fractional part -
        // remember that the polynomial is fit for [0…1]
        if (frac < 0)
        {
            integer--;
            frac++;
        }

        // 2 ^ Int(x) for x < 0,
        // taking advantage of the identity: 2 ^ (-x) = 1 / (2 ^ x)
        powInt = 1d / (1 << -integer);
    }

    return powInt * (0.9999252 + (frac * (0.69583356 + frac * (0.22606716 + frac * 0.078024521))));
}[/csharp]
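
A quick sanity check of the function (Pow2Poly3 is repeated here in condensed form so the snippet is self-contained):

[csharp]using System;

static double Pow2Poly3(double x)
{
    int integer = (int)x;
    double frac = x - integer;
    if (frac < 0) { integer--; frac++; }   // map the fraction into [0..1)
    double powInt = integer >= 0 ? 1 << integer : 1d / (1 << -integer);
    return powInt * (0.9999252 + (frac * (0.69583356 + frac * (0.22606716 + frac * 0.078024521))));
}

double approx = Pow2Poly3(3.5);
double exact = Math.Pow(2.0, 3.5);
Console.WriteLine(Math.Abs(approx - exact) / exact);   // stays within Poly3's ~1e-4 error bound

// negative arguments exercise the integer/frac correction branch
Console.WriteLine(Pow2Poly3(-2.5) * Pow2Poly3(2.5));   // close to 1.0
[/csharp]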

Compact IEEE-754

As suggested on Martin Ankerl’s blog:

[csharp]public static double PowerA(double a, double b)
{
    int tmp = (int)(BitConverter.DoubleToInt64Bits(a) >> 32);
    int tmp2 = (int)(b * (tmp - 1072632447) + 1072632447);
    return BitConverter.Int64BitsToDouble(((long)tmp2) << 32);
}[/csharp]

Benchmark

[iframe http://spreadsheets.google.com/pub?key=0AuMBNVyV5GXPdHJZT21FWHFkaVlLSDc1ZW0ta1g4ZXc&authkey=CICXvqIB&hl=en&output=html&widget=true 760 500]

Note: In my previous post I made the mistake of running the tests in debug mode, which caused additional delays. This time a release build was executed without any debugger attached. To remove JIT compilation overhead, each method is executed a few times before the proper benchmark starts.
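
The harness itself looks roughly like this (a hypothetical sketch of the methodology above, not the exact benchmark code):

[csharp]using System;
using System.Diagnostics;

static double Benchmark(Func<double, double> pow2, int iterations)
{
    // warm-up: force JIT compilation before timing
    for (int i = 0; i < 1000; i++)
        pow2(i * 0.01);

    double sink = 0;   // accumulate results so the loop isn't optimized away
    var sw = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++)
        sink += pow2((i % 2000) * 0.01 - 10.0);   // arguments spanning [-10..10)
    sw.Stop();

    Console.WriteLine(sink);   // keep the result observable
    return sw.Elapsed.TotalMilliseconds;
}[/csharp]

Called like Benchmark(x => Math.Pow(2.0, x), 10000000), in a release build with no debugger attached.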

2^x methods - time efficiency comparison graph

2^x methods - Precision (max relative error) vs Time efficiency [ms]

Conclusions

  • Compact IEEE-754 - sucks in C#.
    It provides the lowest quality, with times no better (or even worse) than the other methods.
    The most likely reason is the use of the BitConverter class; raw pointers would probably be faster.
    The method also doesn't handle negative values well - this could be fixed the same way it's done in the other two implementations. However, even with precision improved that way, the error can reach up to 25%. That is too much for my project.
  • Polynomials - Poly4: the best choice for precision up to 0.00001!
    Strangely, there is very little difference between Poly2, Poly3 and Poly4 (the last even seems to be faster than the others!). For some reason there is a performance drop for Poly5 - at that point it's more efficient to use tables.
  • Cache - Table3: the best choice for high precision, better than 0.0001 (down to 10^-9)!
    Time efficiency is similar for all versions, so it makes sense to use all 3 tables and get the best precision. Note that this solution can be boosted further - either by increasing the cache size (currently 3 * 2048 B) or by adding more tables (Table4 performance is yet to be determined).

These conclusions differ from those presented on Jose's blog. This is most likely due to a different cache implementation (more memory-efficient) and the magic of the .NET compiler and its IL optimizations. Rule of thumb: stick to basic math & binary operators, avoid library methods & type conversions.

Although the precision offered by Poly4 seems reasonable (humans can't distinguish pitch changes smaller than 5-6 cents), I'll go for the audiophile quality of the Table3 method - also to prevent larger errors from accumulating when the signal is processed multiple times.

Sound quality is the highest priority :)
