这边文章是接着 灭霸脚本怎么随机删除服务器的一半文件? 写的,Linux 命令 shuf(一个使输出随机的命令) 的资料好少,我通过 Java 的 Random 类扩展。
Random() 函数
从源码入手,过程中遇到不懂的扩展出去,解决完了再回到源码,直到把核心代码理解完。
/**
* An instance of this class is used to generate a stream of
* pseudorandom numbers. The class uses a 48-bit seed, which is
* modified using a linear congruential formula. (See Donald Knuth,
* <i>The Art of Computer Programming, Volume 2</i>, Section 3.2.1.)
* <p>
* If two instances of {@code Random} are created with the same
* seed, and the same sequence of method calls is made for each, they
* will generate and return identical sequences of numbers. In order to
* guarantee this property, particular algorithms are specified for the
* class {@code Random}. Java implementations must use all the algorithms
* shown here for the class {@code Random}, for the sake of absolute
* portability of Java code. However, subclasses of class {@code Random}
* are permitted to use other algorithms, so long as they adhere to the
* general contracts for all the methods.
* <p>
* The algorithms implemented by class {@code Random} use a
* {@code protected} utility method that on each invocation can supply
* up to 32 pseudorandomly generated bits.
* <p>
* Many applications will find the method {@link Math#random} simpler to use.
*
* <p>Instances of {@code java.util.Random} are threadsafe.
* However, the concurrent use of the same {@code java.util.Random}
* instance across threads may encounter contention and consequent
* poor performance. Consider instead using
* {@link java.util.concurrent.ThreadLocalRandom} in multithreaded
* designs.
*
* <p>Instances of {@code java.util.Random} are not cryptographically
* secure. Consider instead using {@link java.security.SecureRandom} to
* get a cryptographically secure pseudo-random number generator for use
* by security-sensitive applications.
*
* @author Frank Yellin
* @since 1.0
*/
大致翻一下
重点:
此实例用于生成伪随机数,使用48位种子,改自线性同余方程。
特点:
- 如果种子相同,生成的随机数序列相同。
- Random 类是线程安全的,但在同一个实例中调用会因为争夺导致性能差,考虑使用
java.util.concurrent.ThreadLocalRandom。 - Random 类未加密,
java.security.SecureRandom提供安全加密的伪随机数生成器。
线性同余算法
伪随机数算法的核心。至于为毛用这个算法,怎么保证随机性等等更深的问题,我想过,但特么太复杂了。。。先搞懂这个算法再说吧。。。那些更深的问题,我八成是放弃了,毕竟我是个 Java 开发。(手动狗头)
线性同余方程
a*x≡b(mod m)
这个公式的意思是:a*x 和 b 除以 m后余数相同,读作 a*x 与 b 同余,mod 为 c。
Random 类线性同余法
X(n+1) = (a * X(n) + c) % m
其中 X(0) 为种子,X(0) 的值算出 X(1),X(1) 的值算出 X(2) ...
依次类推,一旦种子(X(0))确定,随机数队列即确定的,也是特点 1 的原因。
看哈 Random 类怎么实现的 X(0) 的,有两个构造器,一个传种子,一个不传。
// 不传参的构造器
public Random() {
this(seedUniquifier() ^ System.nanoTime());
}
private static long seedUniquifier() {
// L'Ecuyer, "Tables of Linear Congruential Generators of
// Different Sizes and Good Lattice Structure", 1999
for (;;) {
long current = seedUniquifier.get();
long next = current * 181783497276652981L;
if (seedUniquifier.compareAndSet(current, next))
return next;
}
}
private static final AtomicLong seedUniquifier
= new AtomicLong(8682522807148012L);
// 传参的构造器
public Random(long seed) {
if (getClass() == Random.class)
this.seed = new AtomicLong(initialScramble(seed));
else {
// subclass might have overriden setSeed
this.seed = new AtomicLong();
setSeed(seed);
}
}
无参构造器获取种子的方法是,用 current(8682522807148012L) 乘以 181783497276652981L。
compareAndSet 方法保证线程安全,如果检测到 current 值被其他线程修改了就再乘一次,for (;;) 相当于 while(ture)。
接着再用结果 next 异或 System.nanoTime(),最后得到的结果就是 seed。
System.nanoTime() 有点像 System.currentTimeMillis(),是一个跟时间有关的随机数,但他只能用来计算时间段,比如方法体前后分别调用 System.nanoTime() 再相减可以计算方法执行时间,但不能用于计算当前日期。
线性同余出现在 next 方法中。这是 Random 类的核心,所有计算随机数的方法都调用次方法。
protected int next(int bits) {
long oldseed, nextseed;
AtomicLong seed = this.seed;
do {
oldseed = seed.get();
nextseed = (oldseed * multiplier + addend) & mask;
} while (!seed.compareAndSet(oldseed, nextseed));
return (int)(nextseed >>> (48 - bits));
}
private static final long multiplier = 0x5DEECE66DL;
private static final long addend = 0xBL;
private static final long mask = (1L << 48) - 1;
核心代码 nextseed = (oldseed * multiplier + addend) & mask 跟 X(n+1) = (a * X(n) + c) % m 是等价的。
multiplier、addend、mask 分别对应 a、c、m ,% m 变成了 & mask。
网上这么解释的:
- 所有比2^N位(包括2^N那一位)高的位全都为0
- 所有比2^N低的位保持原样
所以 x & [(1L << 48)–1] 与 x(mod 2^48) 等价。
为什么 Java 用 48 位种子而不是常规的 32 或 64 位?
why 48 bit seed in util Random class?
You need more bits of state than bits of output, because the nature of an LCG is such that the low-order bits of state are not very random at all. So if you want 32-bit outputs, you need more than 32 bits of state.
Why use 48 rather than 64? Because 48 is enough, and you're designing this decades ago, so there are good reasons to want to avoid using any more resources than are strictly necessary.
好了好了,求放过,Java位运算符我知其然不知其所以然,还有为毛线性同余方程可以用来做随机数算法我也不知道,不想假装知道假装高深(手动笑哭),希望懂的大佬们带带我。。。(手动狗头),哈哈哈哈。
每一次成长,都想与你分享。