<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" ><channel><title>Martin Ankerl &#187; science</title> <atom:link href="http://martin.ankerl.com/category/science/feed/" rel="self" type="application/rss+xml" /><link>http://martin.ankerl.com</link> <description>Chunky bacon!!</description> <lastBuildDate>Sat, 04 Feb 2012 10:18:10 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <item><title>Optimized Approximative pow() in C / C++</title><link>http://martin.ankerl.com/2012/01/25/optimized-approximative-pow-in-c-and-cpp/</link> <comments>http://martin.ankerl.com/2012/01/25/optimized-approximative-pow-in-c-and-cpp/#comments</comments> <pubDate>Wed, 25 Jan 2012 19:48:39 +0000</pubDate> <dc:creator>martinus</dc:creator> <category><![CDATA[C++]]></category> <category><![CDATA[programming]]></category> <category><![CDATA[science]]></category><guid isPermaLink="false">http://martin.ankerl.com/?p=894</guid> <description><![CDATA[Mostly thanks to this reddit discussion, I have updated my pow() approximation for C / C++. I have now two different versions: This new code uses the union trick, instead of the weird casting trick I&#8217;ve used before. 
This means &#8230; <a href="http://martin.ankerl.com/2012/01/25/optimized-approximative-pow-in-c-and-cpp/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<p>Mostly thanks to <a href="http://www.reddit.com/r/gamedev/comments/n7na0/fast_approximation_to_mathpow/">this reddit discussion</a>, I have updated my <a href="http://martin.ankerl.com/2007/10/04/optimized-pow-approximation-for-java-and-c-c/">pow() approximation</a> for C / C++. I now have two different versions:</p><pre class="brush: cpp; title: ; notranslate">inline double fastPow(double a, double b) {
  union {
    double d;
    int x[2];
  } u = { a };
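  // u.x[1] is the upper half of the double (sign, exponent, top mantissa bits)
  // on a little-endian machine with 32-bit int
  u.x[1] = (int)(b * (u.x[1] - 1072632447) + 1072632447);
  // clear the lower 32 mantissa bits
  u.x[0] = 0;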
  return u.d;
}</pre><p><span id="more-894"></span><br /> <a href="http://martin.ankerl.com/wp-content/uploads/2012/01/pow.png?9d7bd4"><img src="http://martin.ankerl.com/wp-content/uploads/2012/01/pow.png?9d7bd4" alt="" title="This is how a^b looks like, in case you were wondering..." width="240" height="202" class="alignright size-full wp-image-906" /></a>This new code uses the union trick, instead of the weird casting trick I&#8217;ve used before. This means that <tt>-fno-strict-aliasing</tt> is no longer required when compiling, and it is also a bit faster because one fewer temporary variable is needed. The code as written assumes a little-endian machine; on a big-endian machine, you have to exchange u.x[0] and u.x[1]. On my PC, this version is 4.2 times faster than the much more precise pow().</p><p>Besides that, I now also have a slower approximation that has much less error when the exponent is larger than 1. It makes use of <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Exponentiation_by_squaring">exponentiation by squaring</a>, which is exact for the integer part of the exponent, and uses only the exponent&#8217;s fraction for the approximation:</p><pre class="brush: cpp; title: ; notranslate">// should be much more precise with large b
inline double fastPrecisePow(double a, double b) {
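  // calculate approximation with fraction of the exponent
  // (assumes non-negative b: with a negative e, the squaring loop below
  // does not terminate on compilers that use an arithmetic right shift)
  int e = (int) b;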
  union {
    double d;
    int x[2];
  } u = { a };
  u.x[1] = (int)((b - e) * (u.x[1] - 1072632447) + 1072632447);
  u.x[0] = 0;

  // exponentiation by squaring with the exponent's integer part
  // double r = u.d makes everything much slower, not sure why
  double r = 1.0;
  while (e) {
    if (e &amp; 1) {
      r *= a;
    }
    a *= a;
    e &gt;&gt;= 1;
  }

  return r * u.d;
}</pre><p>This code is 3.3 times faster than pow(). Writing a microbenchmark is not easy, so <a href="http://pastebin.com/DRvPJL2K">I have posted mine here</a>. <a href="http://pastebin.com/ZW95gEyr">Here is also a Java version of the more accurate pow approximation</a>.</p><p>Any ideas how this could be improved? Please post them!</p><div style='clear:both'></div>]]></content:encoded> <wfw:commentRss>http://martin.ankerl.com/2012/01/25/optimized-approximative-pow-in-c-and-cpp/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Optimized pow() approximation for Java, C / C++, and C#</title><link>http://martin.ankerl.com/2007/10/04/optimized-pow-approximation-for-java-and-c-c/</link> <comments>http://martin.ankerl.com/2007/10/04/optimized-pow-approximation-for-java-and-c-c/#comments</comments> <pubDate>Thu, 04 Oct 2007 22:48:08 +0000</pubDate> <dc:creator>martinus</dc:creator> <category><![CDATA[benchmark]]></category> <category><![CDATA[C++]]></category> <category><![CDATA[coding]]></category> <category><![CDATA[java]]></category> <category><![CDATA[linux]]></category> <category><![CDATA[news]]></category> <category><![CDATA[programming]]></category> <category><![CDATA[science]]></category> <category><![CDATA[tricks]]></category> <category><![CDATA[floating point]]></category> <category><![CDATA[optimization]]></category><guid isPermaLink="false">http://martin.ankerl.com/?p=96</guid> <description><![CDATA[I have already written about approximations of e^x, log(x) and pow(a, b) in my post Optimized Exponential Functions for Java. Now I have more In particular, the pow() function is now even faster, simpler, and more accurate. 
Without further ado, &#8230; <a href="http://martin.ankerl.com/2007/10/04/optimized-pow-approximation-for-java-and-c-c/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<p>I have already written about approximations of <tt>e^x</tt>, <tt>log(x)</tt> and <tt>pow(a, b)</tt> in my post <a href="http://martin.ankerl.com/2007/02/11/optimized-exponential-functions-for-java/">Optimized Exponential Functions for Java</a>. Now I have more <img src="http://martin.ankerl.com/wp-includes/images/smilies/icon_smile.gif?9d7bd4" alt=':-)' class='wp-smiley' /> In particular, the <tt>pow()</tt> function is now even faster, simpler, and more accurate. Without further ado, I proudly give you the brand new approximation:</p><h1>Approximation of pow() in Java</h1><pre class="brush: java; title: ; notranslate">public static double pow(final double a, final double b) {
    final int x = (int) (Double.doubleToLongBits(a) &gt;&gt; 32);
    final int y = (int) (b * (x - 1072632447) + 1072632447);
    return Double.longBitsToDouble(((long) y) &lt;&lt; 32);
}</pre><p>This is really very compact. The calculation only requires 2 shifts, 1 mul, 2 add, and 2 register operations. That&#8217;s it! In my tests the error is usually within 5% to 12%, and in extreme cases up to 25%. A careful analysis is left as an exercise for the reader. This is very usable in, e.g., <a href="http://en.wikipedia.org/wiki/Metaheuristic">metaheuristics</a> or <a href="http://en.wikipedia.org/wiki/Artificial_neural_network">neural nets</a>.</p><h2>UPDATE, December 10, 2011</h2><p>I just managed to make the code above about 30% faster on my machine. The error differs by a tiny fraction (neither better nor worse).</p><pre class="brush: java; title: ; notranslate">public static double pow(final double a, final double b) {
    final long tmp = Double.doubleToLongBits(a);
    final long tmp2 = (long)(b * (tmp - 4606921280493453312L)) + 4606921280493453312L;
    return Double.longBitsToDouble(tmp2);
}</pre><p>This new approximation is about <strong>23 times</strong> as fast as Math.pow() on my machine (Intel Core2 Quad, Q9550, Java 1.7.0_01-b08, 64-Bit Server VM). Unfortunately, microbenchmarks are difficult to do in Java, so your mileage may vary. You can download the benchmark <a href="/files/PowBench.java">PowBench.java</a> and have a look; I have tried to prevent overoptimization, and to subtract the overhead introduced by this prevention.</p><h1>Approximation of pow() in C and C++</h1><h2>UPDATE, January 25, 2012</h2><p>The code below has been updated to use a union, so you no longer need <tt>-fno-strict-aliasing</tt> when compiling. Also, here is a <a href="http://martin.ankerl.com/2012/01/25/optimized-approximative-pow-in-c-and-cpp/">more precise version of the approximation</a>.</p><pre class="brush: cpp; title: ; notranslate">double fastPow(double a, double b) {
  union {
    double d;
    int x[2];
  } u = { a };
  u.x[1] = (int)(b * (u.x[1] - 1072632447) + 1072632447);
  u.x[0] = 0;
  return u.d;
}</pre><p>Compiled on my Pentium-M with gcc 4.1.2:</p><pre>gcc -O3 -march=pentium-m -fomit-frame-pointer</pre><p>This version is <b>7.8 times</b> faster than pow() from the standard library.</p><h1>Approximation of pow() in C#</h1><p>Jason Jung has posted a port of this code to C#:</p><pre class="brush: csharp; title: ; notranslate">public static double PowerA(double a, double b) {
  int tmp = (int)(BitConverter.DoubleToInt64Bits(a) &gt;&gt; 32);
  int tmp2 = (int)(b * (tmp - 1072632447) + 1072632447);
  return BitConverter.Int64BitsToDouble(((long)tmp2) &lt;&lt; 32);
}</pre><h1>How the Approximation was Developed</h1><p>It is nearly impossible to see what is going on in this function; it just magically works. To shed a bit more light on it, here is a detailed description of how I developed it.</p><h2>Approximation of e^x</h2><p>As described <a href="http://martin.ankerl.com/2007/02/11/optimized-exponential-functions-for-java/">here</a>, the paper &#8220;<a href="http://citeseer.ist.psu.edu/schraudolph98fast.html">A Fast, Compact Approximation of the Exponential Function</a>&#8221; develops a C macro that does a good job at exploiting the IEEE 754 floating-point representation to calculate <tt>e^x</tt>. This macro can be translated straightforwardly into Java code, which looks like this:</p><pre class="brush: java; title: ; notranslate">public static double exp(double val) {
    final long tmp = (long) (1512775 * val + (1072693248 - 60801));
    return Double.longBitsToDouble(tmp &lt;&lt; 32);
}</pre><h2>Use Exponential Functions for a^b</h2><p>Thanks to the power of math, we know that <tt>a^b</tt> can be transformed like this:</p><ol><li>Take the exponential of the logarithm<pre>a^b = e^(ln(a^b))</pre><li>Extract b<pre>a^b = e^(ln(a)*b)</pre></ol><p>Now we have expressed the pow calculation with <tt>e^x</tt> and <tt>ln(x)</tt>. We already have the <tt>e^x</tt> approximation, but no good <tt>ln(x)</tt>. The <a href="http://martin.ankerl.com/2007/02/11/optimized-exponential-functions-for-java/">old approximation</a> is very bad, so we need a better one. So what now?</p><h2>Approximation of ln(x)</h2><p>Here comes the big trick: Remember that we have the nice <tt>e^x</tt> approximation? Well, <tt>ln(x)</tt> is exactly the inverse function! That means we just need to transform the above approximation so that the output of <tt>e^x</tt> is transformed back into the original input.</p><p>That&#8217;s not too difficult. Have a look at the above code: we now take the output and move backwards to undo the calculation. First reverse the shift:</p><pre>final double tmp = (Double.doubleToLongBits(val) &gt;&gt; 32);</pre><p>Now solve the equation</p><pre>tmp = (1512775 * val + (1072693248 - 60801))</pre><p>for val:</p><ol><li>The original formula<pre>tmp = (1512775 * val + (1072693248 - 60801))</pre><li>Perform subtraction<pre>tmp = 1512775 * val + 1072632447</pre><li>Bring value to other side<pre>tmp - 1072632447 = 1512775 * val</pre><li>Divide by factor<pre>(tmp - 1072632447) / 1512775 = val</pre><li>Finally, val on the left side<pre>val = (tmp - 1072632447) / 1512775</pre></ol><p>Voilà, now we have a nice approximation of <tt>ln(x)</tt>:</p><pre class="brush: java; title: ; notranslate">public double ln(double val) {
    final double x = (Double.doubleToLongBits(val) &gt;&gt; 32);
    return (x - 1072632447) / 1512775;
}</pre><h2>Combine Both Approximations</h2><p>Finally we can combine the two approximations into <tt>e^(ln(a) * b)</tt>:</p><pre class="brush: java; title: ; notranslate">public static double pow1(final double a, final double b) {
    // calculate ln(a)
    final double x = (Double.doubleToLongBits(a) &gt;&gt; 32);
    final double ln_a = (x - 1072632447) / 1512775;

    // ln(a) * b
    final double tmp1 = ln_a * b;

    // e^(ln(a) * b)
    final long tmp2 = (long) (1512775 * tmp1 + (1072693248 - 60801));
    return Double.longBitsToDouble(tmp2 &lt;&lt; 32);
}</pre><p>Between the two shifts, we can simply insert the <tt>tmp1</tt> calculation into the tmp2 calculation to get</p><pre class="brush: java; title: ; notranslate">public static double pow2(final double a, final double b) {
    final double x = (Double.doubleToLongBits(a) &gt;&gt; 32);
    final long tmp2 = (long) (1512775 * (x - 1072632447) / 1512775 * b + (1072693248 - 60801));
    return Double.longBitsToDouble(tmp2 &lt;&lt; 32);
}</pre><p>Now simplify the <tt>tmp2</tt> calculation:</p><ol><li>The original formula<pre>tmp2 = (1512775 * (x - 1072632447) / 1512775 * b + (1072693248 - 60801))</pre><li>We can drop the factor <tt>1512775</tt>, since multiplying and then dividing by it cancels out<pre>tmp2 = (x - 1072632447) * b + (1072693248 - 60801)</pre><li>And finally, calculate the subtraction<pre>tmp2 = b * (x - 1072632447) + 1072632447</pre></ol><h2>The Result</h2><p>That&#8217;s it! Add some casts, and the complete function is the same as above.</p><pre class="brush: java; title: ; notranslate">public static double pow(final double a, final double b) {
    final int tmp = (int) (Double.doubleToLongBits(a) &gt;&gt; 32);
    final int tmp2 = (int) (b * (tmp - 1072632447) + 1072632447);
    return Double.longBitsToDouble(((long) tmp2) &lt;&lt; 32);
}</pre><p>This concludes my little tutorial on microoptimization of the pow() function. If you have come this far, I congratulate you on your persistence <img src="http://martin.ankerl.com/wp-includes/images/smilies/icon_smile.gif?9d7bd4" alt=':-)' class='wp-smiley' /></p><p><strong>UPDATE</strong> Recently, several other approximate <tt>pow</tt> calculation methods have been developed; here are some that I have found through <a href="http://www.reddit.com/r/programming/comments/8kftl/fast_pow_approximation_in_java_and_c/">reddit</a>:</p><ul><li><a href="http://www.hxa.name/articles/content/fast-pow-adjustable_hxa7241_2007.html">Fast pow() With Adjustable Accuracy</a> &#8212; This looks quite a bit more sophisticated and precise than my approximation. Written in C and for float values. A Java port should not be too difficult.</li><li><a href="http://jrfonseca.blogspot.com/2008/09/fast-sse2-pow-tables-or-polynomials.html">Fast SSE2 pow: tables or polynomials?</a> &#8212; Uses <a href="http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions">SSE</a> operations and seems to be a bit faster than the table approach from the link above, with the potential to scale better due to less cache usage.</li></ul><p>Please post what you think about this!</p><div style='clear:both'></div>]]></content:encoded> <wfw:commentRss>http://martin.ankerl.com/2007/10/04/optimized-pow-approximation-for-java-and-c-c/feed/</wfw:commentRss> <slash:comments>41</slash:comments> </item> <item><title>The Best Educational Videos</title><link>http://martin.ankerl.com/2006/12/08/educational-videos/</link> <comments>http://martin.ankerl.com/2006/12/08/educational-videos/#comments</comments> <pubDate>Fri, 08 Dec 2006 21:00:25 +0000</pubDate> <dc:creator>martinus</dc:creator> <category><![CDATA[news]]></category> <category><![CDATA[science]]></category> <category><![CDATA[videos]]></category><guid isPermaLink="false">http://martin.ankerl.com/?p=76</guid> <description><![CDATA[It is cold and 
foggy outside, which is the best time to kill&#8230; time. And what would be better than killing time with something extremely interesting? I have watched a lot of online documentaries, and selected the cream of the &#8230; <a href="http://martin.ankerl.com/2006/12/08/educational-videos/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description> <content:encoded><![CDATA[<p>It is cold and foggy outside, which is the best time to kill&#8230; time. And what would be better than killing time with something extremely interesting? I have watched a lot of online documentaries, and selected the cream of the crop for you. Watch all of them; I assure you that you will get this amazing &#8220;wow&#8221; feeling after each of them. Read on to view the videos.</p><p><span id="more-76"></span></p><h1>Should Google Go Nuclear? Clean, cheap, nuclear power (no, really)</h1><p>Very knowledgeable professor <a href="http://en.wikipedia.org/wiki/Robert_W._Bussard">Robert W. Bussard</a> talks about his findings on nuclear fusion, which were only recently (November 2006) allowed to be made public. His lab really seems to be on to something big. He also points out big problems in the American research system.</p><p><center><br /> <embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=1996321846673788606&#038;hl=en" flashvars=""></embed></center></p><h1>Human Computation</h1><p>Lots of problems are still very hard for computers, while humans can do them easily. Luis von Ahn gives an entertaining presentation on how to utilize this human power. Best of all, he shows how he can get people to solve these problems for free. Or even make them <em>pay</em> for solving your problem. 
<a href="http://images.google.com/imagelabeler/">Google is already using this technique</a>, and there are some other games like <a href="http://www.peekaboom.org/">Peekaboom</a>.</p><p><center><br /> <embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=-8246463980976635143&#038;hl=en" flashvars="&#038;subtitle=on"></embed></center></p><h1>Winning The DARPA Grand Challenge</h1><p>The <a href="http://en.wikipedia.org/wiki/DARPA_Grand_Challenge">DARPA Grand Challenge</a>: what it took to build a robotic vehicle that would race across the desert by itself.</p><p><center><br /> <embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=8594517128412883394&#038;hl=en" flashvars=""></embed></center></p><h1>Why We Fight</h1><p>War means big business. This BBC documentary provides an inside look at the anatomy of the American war machine. A must see. 
It starts with the powerful farewell speech of <a href="http://en.wikipedia.org/wiki/Dwight_D._Eisenhower#Retirement_and_death">Eisenhower</a>.</p><p><center><br /> <embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=-4924034461280278026&#038;hl=en" flashvars=""></embed></center></p><h1>Alkali Metals</h1><p>See what happens when you mix <a href="http://en.wikipedia.org/wiki/Alkali_metal">alkali metals</a> with water.</p><p><center><br /> <embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=-2134266654801392897&#038;hl=en" flashvars=""></embed></center></p><h1>The Paradox of Choice &#8211; Why More Is Less</h1><p>A very enlightening discussion about how our modern way of living influences our choices, and ultimately our happiness.<br /><center><br /> <embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=6127548813950043200&#038;hl=en" flashvars=""></embed></center></p><h1>1. The Elegant Universe</h1><p>The world is a very strange place, so strange that its weirdness seems impossible to understand. This three-part video gives an excellent introduction to <a href="http://en.wikipedia.org/wiki/String_theory">String Theory</a>, which currently looks like it can give the deepest understanding of the universe we have ever had.</p><p>If you like these videos you should have a look at the book <a href="http://www2.wwnorton.com/catalog/fall03/005858.htm">The Elegant Universe</a>. 
I have read halfway through it already; it is a bit more difficult to understand than the videos, but much more in depth and very well written.</p><h2>Part 1: Einstein Universe</h2><p><center><br /> <embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=1794242500551206071&#038;hl=en" flashvars=""></embed></center></p><h2>Part 2: Strings</h2><p><center><br /> <embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=3366440257073785288&#038;hl=en" flashvars=""></embed></center></p><h2>Part 3: Welcome to the 11th Dimension</h2><p><center><br /> <embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=5253371475737693133&#038;hl=en" flashvars=""></embed></center></p><h1>Vivek &#8212; Homeless Prophet</h1><p>Vivek is a homeless man who really has something to say about life, the universe, and everything. Just ignore the pseudo-funny comments of the people who recorded this.</p><p><center><br /> <object width="425" height="350"><param name="movie" value="http://www.youtube.com/v/rB8BdRNVnEI"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/rB8BdRNVnEI" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"></embed></object><br /></center></p><h1>The Great Dictator &#8212; Speech</h1><p>Charlie Chaplin gives a <a href="http://www.americanrhetoric.com/MovieSpeeches/moviespeechthegreatdictator.html">speech</a> to the world in his satire &#8220;<a href="http://en.wikipedia.org/wiki/The_Great_Dictator#Making_of_the_film">The Great Dictator</a>&#8221;. 
This movie was made in 1940 and the speech is still as relevant as if it had been given yesterday.<br /><center><br /> <embed style="width:400px; height:326px;" id="VideoPlayback" type="application/x-shockwave-flash" src="http://video.google.com/googleplayer.swf?docId=4055517603283436476&#038;hl=en" flashvars=""></embed></center></p><div style='clear:both'></div>]]></content:encoded> <wfw:commentRss>http://martin.ankerl.com/2006/12/08/educational-videos/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> </channel> </rss>
