代码笔记：faster pow() than std::pow()"faster pow() than std::pow

起因

最近实现了一个算法，profiling之后发现是性能瓶颈在使用了std::pow()。但是我对于结果的精度要求不是很高。于是找到了以下一段相当魔幻的代码。

代码

inline double fastPow(double a, double b) {
  union {
    double d;
    int x[2];
  } u = { a };
  u.x[1] = (int)(b * (u.x[1] - 1072632447) + 1072632447);
  u.x[0] = 0;
  return u.d;
}

这段代码的摘自Martin Ankerl¹于2007年在个人博客上发布一篇文章"Optimized pow() approximation for Java, C / C++, and C#"²

原理

快速 $e^x$ 运算。

Nicol N. Schraudolph在1999年提出的快速 $e^x$ 算法³是这段代码的一个重要前提，Paper中给出了具体的原理介绍和代码实现⁴。

#define _USE_MATH_DEFINES
#include <math.h>

static union
{
    double d;
    struct
    {
#ifdef LITTLE_ENDIAN
        int j, i;
#else
        int i, j;
#endif
    } n;
} _eco;

#define EXP_A (1048576 / M_LN2) /* use 1512775 for integer version */
#define EXP_C 60801             /* see text for choice of c values */
#define EXP(y) (_eco.n.i = EXP_A * (y) + (1072693248 - EXP_C), _eco.d)

需要注意大小端问题，以及是否是IEEE754编码。（本文的代码均是小端模式，IEEE754编码）

公式化简

因为 $a^b=e^{ln⁡(a^b)}=e^{b∗ln⁡(a)}$ ，而 $ln⁡(x)$ 是 $e^x$ 的逆运算。所以可以反推出

#define LN(x)  (_eco.d = x, double((_eco.n.i - ((1072693248 - EXP_C)))) / EXP_A)

这样，就有了加速版本的pow()

#define POW(x,y)  EXP( (y*(LN(x))) )

将这个宏展开，并对算式合并整理之后，就得到了开头写的代码，以及magic number (1072632447)。Martin Ankerl的文章中有详细的推导过程。

这个计算精度非常低，不过在reddit的讨论之下，Martin Ankerl更新了一个高精度版本。

// should be much more precise with large b
inline double fastPrecisePow(double a, double b) {
  // calculate approximation with fraction of the exponent
  int e = (int) b;
  union {
    double d;
    int x[2];
  } u = { a };
  u.x[1] = (int)((b - e) * (u.x[1] - 1072632447) + 1072632447);
  u.x[0] = 0;

  // exponentiation by squaring with the exponent's integer part
  // double r = u.d makes everything much slower, not sure why
  double r = 1.0;
  while (e) {
    if (e & 1) {
      r *= a;
    }
    a *= a;
    e >>= 1;
  }

  return r * u.d;
}

就算如此，和标准库的计算结果还是有些不同。如果对精度有要求，不建议使用。

博客备份自：代码笔记：faster pow() than std::pow()

代码笔记：faster pow() than std::pow()

起因

代码

原理

Footnotes