youtube分析: www.youtube.com/watch?v=p8u…
wikipedia分析:en.wikipedia.org/wiki/Fast_i…
IEEE 754 的float表示包括三个部分:
- Sign(1bit) 在这里肯定都是0
- Exponent(8bit),后面简写E
- Mantissa(23bit),后面简写M 那么浮点数表示是 (1 + M/(2^23)) * 2^(E-127).
视频里面解释的非常清楚,大致思想包括下面几个:
- 使用log2去将(E-127)部分分离出来
- log2(1+x) ~= (x + u), 其中u涉及到魔数的选择,那么log(1+M/(2^23)) ~= M/(2^23) + u
- 转换成为整数表示 (E + M/(2^23) ) << 23 = (E<<23) + M. 这就是float的整数表示
- 牛顿迭代计算f(x) = 0的话,迭代方法是 x = x0 - f(x0) / f'(x0). 这里f(x) = 1/(x^2) - number = 0
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
总之整来整去得到的就是下面这个式子(牛顿迭代法之前一步)
这里u是log(1+x) - x的估算。如果假设u=0的话,那么计算出来的浮点数如下,M部分误差在6%左右
In [56]: a = (1<< 23) * 1.5 * (127)
In [57]: b = 0x5f3759df
In [60]: a, b, a-b, (a-b) / (1 << 23)
Out[60]: (1598029824.0, 1597463007, 566817.0, 0.06756985187530518)
我们可以做个试验看看不同u下面,M部分的误差是否会有所改善,以及平均下来u大约在什么范围
In [11]: def test():
...: data = []
...: import math
...: for x2 in range(100):
...: x = x2 * 0.01
...: u = math.log2(1 + x) - x
...: a = (1 << 23) * 1.5 * (127- u)
...: b = 0x5f3759df
...: r = abs(a - b) / (1 << 23)
...: data.append((round(x, 4), round(u, 4), round(r, 4)))
...: data.sort(key = lambda x: x[2])
...: avgu = sum((x[1] for x in data)) / len(data)
...: print('====top10=====')
...: for x in data[:10]:
...: print(x)
...: value = 1.5 * (1 << 23) * (127 - avgu)
...: number = hex(int(value))
...: print('avg u = %.4f, number = %s' % (avgu, number))
结果如下,可以看到u大约取值是在0.0439 - 0.046区间内,平均u是0.0573, 对应的number就是 0x5f34ff97. 至于 0x5f3759df 这个魔数,对应的u是 0.0450466.
====top10=====
(0.81, 0.046, 0.0014)
(0.82, 0.0439, 0.0017)
(0.13, 0.0463, 0.0019)
(0.12, 0.0435, 0.0023)
(0.8, 0.048, 0.0044)
(0.83, 0.0418, 0.0048)
(0.14, 0.049, 0.006)
(0.11, 0.0406, 0.0067)
(0.79, 0.05, 0.0074)
(0.84, 0.0397, 0.008)
avg u = 0.0573, number = 0x5f34ff97
按照这个思路,我们也可以写个 sqrt(x)
的实现,只不过在牛顿迭代的时候,有除法计算,而且需要迭代个两次才能得到比较准确的结果,有点不太讲究。
float Q_sqrt( float number )
{
long i;
float y;
// X = int(127-u) * (1 << 22)
// u = 0.0573
#define X 0x1fbc5532
// u = 0.045
// #define X 0x1fbd1df4
y = number;
i = * ( long * ) &y;
i = (i >> 1) + X;
// to avoid negative value.
// i = i & 0x7fffffff;
y = * ( float * ) &i;
y = 0.5f * (y + number / y);
y = 0.5f * (y + number / y);
return y;
}
int main() {
for(int i=10;i<=1200;i+=20) {
float x = Q_sqrt(i);
cout << "i = " << i << ", x=sqrt(i) = " << x << ", x*x= " << x * x << endl;
}
return 0;
}