I have already written about approximations of e^x, log(x) and pow(a, b) in my post Optimized Exponential Functions for Java. Now I have more In particular, the pow() function is now even faster, simpler, and more accurate. Without further ado, I proudly give you the brand new approximation:
Approximation of pow() in Java

public static double pow(final double a, final double b) { final int x = (int) (Double.doubleToLongBits(a) >> 32); final int y = (int) (b * (x  1072632447) + 1072632447); return Double.longBitsToDouble(((long) y) << 32); } 
This is really very compact. The calculation only requires 2 shifts, 1 mul, 2 add, and 2 register operations. That’s it! In my tests it usually within an error margin of 5% to 12%, in extreme cases sometimes up to 25%. A careful analysis is left as an exercise for the reader. This is very usable for in e.g. metaheuristics or neural nets.
UPDATE, December 10, 2011
I just managed to make the above code about 30% faster than the one above on my machine. The error is a tiny fraction different (not better or worse).

public static double pow(final double a, final double b) { final long tmp = Double.doubleToLongBits(a); final long tmp2 = (long)(b * (tmp  4606921280493453312L)) + 4606921280493453312L; return Double.longBitsToDouble(tmp2); } 
This new approximation is about 23 times as fast as Math.pow() on my machine (Intel Core2 Quad, Q9550, Java 1.7.0_01b08, 64Bit Server VM). Unfortunately, microbenchmarks are difficult to do in Java, so your mileage may vary. You can download the benchmark PowBench.java and have a look, I have tried to prevent overoptimization, and substract the overhead introduced due to this preventation.
Approximation of pow() in C and C++
UPDATE, January 25, 2012
The code below is updated with using union, you do not need fnostrictaliasing any more for compiling. Also, here is a more precise version of the approximation.

double fastPow(double a, double b) { union { double d; int x[2]; } u = { a }; u.x[1] = (int)(b * (u.x[1]  1072632447) + 1072632447); u.x[0] = 0; return u.d; } 
Compiled on my PentiumM with gcc 4.1.2:

gcc O3 march=pentiumm fomitframepointer 
This version is 7.8 times faster than pow() from the standard library.
Approximation of pow() in C#
Jason Jung has posted a port of the this code to C#:

public static double PowerA(double a, double b) { int tmp = (int)(BitConverter.DoubleToInt64Bits(a) >> 32); int tmp2 = (int)(b * (tmp  1072632447) + 1072632447); return BitConverter.Int64BitsToDouble(((long)tmp2) << 32); } 
How the Approximation was Developed
It is quite impossible to understand what is going on in this function, it just magically works. To shine a bit more light on it, here is a detailed description how I have developed this.
Approximation of e^x
As described here, the paper “A Fast, Compact Approximation of the Exponential Function” develops a C macro that does a good job at exploiting the IEEE 754 floatingpoint representation to calculate e^x. This macro can be transformed into Java code straightforward, which looks like this:

public static double exp(double val) { final long tmp = (long) (1512775 * val + (1072693248  60801)); return Double.longBitsToDouble(tmp << 32); } 
Use Exponential Functions for a^b
Thanks to the power of math, we know that a^b can be transformed like this:
 Take exponential
 Extract b
Now we have expressed the pow calculation with e^x and ln(x). We already have the e^x approximation, but no good ln(x). The old approximation is very bad, so we need a better one. So what now?
Approximation of ln(x)
Here comes the big trick: Rember that we have the nice e^x approximation? Well, ln(x) is exactly the inverse function! That means we just need to transform the above approximation so that the output of e^x is transformed back into the original input.
That’s not too difficult. Have a look at the above code, we now take the output and move backwards to undo the calculation. First reverse the shift:

final double tmp = (Double.doubleToLongBits(val) >> 32); 
Now solve the equation

tmp = (1512775 * val + (1072693248  60801)) 
for val:
 The original formula

tmp = (1512775 * val + (1072693248  60801)) 
 Perform subtraction

tmp = 1512775 * val + 1072632447 
 Bring value to other side

tmp  1072632447 = 1512775 * val 
 Divide by factor

(tmp  1072632447) / 1512775 = val 
 Finally, val on the left side

val = (tmp  1072632447) / 1512775 
VoĆla, now we have a nice approximation of ln(x):

public double ln(double val) { final double x = (Double.doubleToLongBits(val) >> 32); return (x  1072632447) / 1512775; } 
Combine Both Approximations
Finally we can combine the two approximations into e^(ln(a) * b):

public static double pow1(final double a, final double b) { // calculate ln(a) final double x = (Double.doubleToLongBits(a) >> 32); final double ln_a = (x  1072632447) / 1512775; // ln(a) * b final double tmp1 = ln_a * b; // e^(ln(a) * b) final long tmp2 = (long) (1512775 * tmp1 + (1072693248  60801)); return Double.longBitsToDouble(tmp2 << 32); } 
Between the two shifts, we can simply insert the tmp1 calculation into the tmp2 calculation to get

public static double pow2(final double a, final double b) { final double x = (Double.doubleToLongBits(a) >> 32); final long tmp2 = (long) (1512775 * (x  1072632447) / 1512775 * b + (1072693248  60801)); return Double.longBitsToDouble(tmp2 << 32); } 
Now simplify tmp2 calculation:
 The original formula

tmp2 = (1512775 * (x  1072632447) / 1512775 * b + (1072693248  60801)) 
 We can drop the factor 1512775

tmp2 = (x  1072632447) * b + (1072693248  60801) 
 And finally, calculate the substraction

tmp2 = b * (x  1072632447) + 1072632447 
The Result
That’s it! Add some casts, and the complete function is the same as above.

public static double pow(final double a, final double b) { final int tmp = (int) (Double.doubleToLongBits(a) >> 32); final int tmp2 = (int) (b * (tmp  1072632447) + 1072632447); return Double.longBitsToDouble(((long) tmp2) << 32); } 
This concludes my little tutorial on microoptimization of the pow() function. If you have come this far, I congratulate your presistence
UPDATE Recently there several other approximative pow calculation methods have been developed, here are some others that I have found through reddit:
 Fast pow() With Adjustable Accuracy — This looks quite a bit more sophisticated and precise than my approximation. Written in C and for float values. A Java port should not be too difficult.
 Fast SSE2 pow: tables or polynomials? — Uses SSE operation and seems to be a bit faster than the table approach from the link above with the potential to scale better when due to less cache usage.
Please post what you think about this!