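Learning genetic algorithms (the genevo library)
NP-hard problems are difficult to solve with exact, constructive algorithms. A common approach is a metaheuristic such as a genetic algorithm; related alternatives include simulated annealing and tabu search.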
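Basic logic of a genetic algorithm:
initialize the population
loop
    evaluate the fitness of the population
    stop if the termination condition is met
    select the next population
    alter the population (crossover, mutation)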
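The loop above can be sketched without any library. The following is a minimal illustration on the OneMax toy problem (maximize the number of true genes), not genevo code: the names Lcg, fitness and run_ga are made up for this sketch, and the small random generator only exists so that no external crate is needed.

```rust
/// Tiny linear congruential generator, used only so the sketch needs no external crate.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0 >> 33
    }

    /// Returns a value in `0..n`; `n` must be greater than zero.
    fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Fitness of a genome: the number of `true` genes (OneMax).
fn fitness(genome: &[bool]) -> usize {
    genome.iter().filter(|&&g| g).count()
}

/// Runs the loop until a perfect genome appears or `max_generations` is reached.
/// Returns the best fitness found and the number of generations used.
/// Assumes `len > 0` and `pop_size >= 2`.
fn run_ga(len: usize, pop_size: usize, max_generations: usize, seed: u64) -> (usize, usize) {
    let mut rng = Lcg(seed);
    // Initialize the population with random genomes.
    let mut population: Vec<Vec<bool>> = (0..pop_size)
        .map(|_| (0..len).map(|_| rng.below(2) == 1).collect())
        .collect();
    let mut generation = 0;
    loop {
        // Evaluate fitness and sort the fittest genomes to the front.
        population.sort_by_key(|g| std::cmp::Reverse(fitness(g)));
        let best = fitness(&population[0]);
        // Termination condition.
        if best == len || generation == max_generations {
            return (best, generation);
        }
        // Selection: keep the better half as parents.
        population.truncate((pop_size / 2).max(2));
        let parents = population.clone();
        // Crossover and mutation refill the population.
        while population.len() < pop_size {
            let a = &parents[rng.below(parents.len())];
            let b = &parents[rng.below(parents.len())];
            let cut = rng.below(len);
            let mut child: Vec<bool> = a[..cut].iter().chain(&b[cut..]).copied().collect();
            let flip = rng.below(len);
            child[flip] = !child[flip];
            population.push(child);
        }
        generation += 1;
    }
}
```

Calling run_ga(16, 20, 200, 42) returns the best fitness reached and the generation at which the loop stopped. genevo splits each of these steps into its own replaceable operator, as the sections below describe.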
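I want to use a genetic algorithm in Rust without implementing one from scratch, so these notes study the design and logic of innoave/genevo on GitHub, which describes itself as "Execute genetic algorithm (GA) simulations in a customizable and extensible way."
1. Initializing the population
genevo initializes the population through the build_population() function. It is a public function in the population module and returns an EmptyPopulationBuilder: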
pub fn build_population() -> EmptyPopulationBuilder {
    EmptyPopulationBuilder {
        _empty: PhantomData,
    }
}
On the EmptyPopulationBuilder, call with_genome_builder() and pass in a GenomeBuilder; this yields a PopulationWithGenomeBuilderBuilder. Then call of_size(), followed by either uniform_at_random() or using_seed(), to finally produce the Population.
1.1 GenomeBuilder
/// A `GenomeBuilder` defines how to build individuals of a population for
/// custom `genetic::Genotype`s.
///
/// Typically the individuals are generated randomly.
pub trait GenomeBuilder<G>: Sync
where
    G: Genotype,
{
    /// Builds a new genome of type `genetic::Genotype` for the given
    /// `index` using the given random number generator `rng`.
    fn build_genome<R>(&self, index: usize, rng: &mut R) -> G
    where
        R: Rng + Sized;
}
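This trait builds a new genome. The index identifies which individual is being built, so an implementation can produce a different genome for each index.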
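To make the roles of index, the random source and the population size concrete, here is a simplified stand-in written without genevo. SimpleGenomeBuilder, BitGenomeBuilder and build_population_of are hypothetical names for this sketch, and the random source is a plain closure rather than the Rng trait used by the real GenomeBuilder.

```rust
/// Simplified stand-in for a genome builder; the random source is a closure here.
trait SimpleGenomeBuilder<G> {
    fn build_genome(&self, index: usize, next_random: &mut dyn FnMut() -> u64) -> G;
}

/// Builds random bit-string genomes of a fixed length.
struct BitGenomeBuilder {
    len: usize,
}

impl SimpleGenomeBuilder<Vec<bool>> for BitGenomeBuilder {
    fn build_genome(&self, _index: usize, next_random: &mut dyn FnMut() -> u64) -> Vec<bool> {
        // This builder ignores the index; every genome is drawn the same way.
        (0..self.len).map(|_| next_random() % 2 == 1).collect()
    }
}

/// Calls the builder once per index to create `size` genomes.
/// The same `seed` always produces the same population.
fn build_population_of<G>(
    builder: &impl SimpleGenomeBuilder<G>,
    size: usize,
    seed: u64,
) -> Vec<G> {
    let mut state = seed;
    // Small linear congruential generator so no external crate is needed.
    let mut next_random = move || {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        state >> 33
    };
    (0..size)
        .map(|index| builder.build_genome(index, &mut next_random))
        .collect()
}
```

The fixed seed plays the role that using_seed() has in the builder chain above: two calls with the same seed give identical populations, which is convenient for reproducible experiments.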
2. Evaluating fitness
The trait responsible for evaluating fitness is FitnessFunction:
/// Defines the evaluation function to calculate the `Fitness` value of a
/// `Genotype` based on its properties.
pub trait FitnessFunction<G, F>: Clone
where
    G: Genotype,
    F: Fitness,
{
    /// Calculates the `Fitness` value of the given `Genotype`.
    fn fitness_of(&self, a: &G) -> F;

    /// Calculates the average `Fitness` value of the given `Fitness` values.
    fn average(&self, a: &[F]) -> F;

    /// Returns the very best of all theoretically possible `Fitness` values.
    fn highest_possible_fitness(&self) -> F;

    /// Returns the worst of all theoretically possible `Fitness` values.
    /// This is usually a value equivalent to zero.
    fn lowest_possible_fitness(&self) -> F;
}
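This is fairly simple. An implementation must be able to:
- compute the Fitness of a single genome,
- compute the average of a set of Fitness values,
- return the highest and the lowest possible Fitness.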
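The four operations listed above can be illustrated with a plain struct. This is only a sketch that mirrors the method names of FitnessFunction; CountOnes is a hypothetical type, it does not implement the genevo trait, and it fixes the genome to a bit string and the fitness to u32.

```rust
/// Hypothetical fitness function: counts the `true` genes of a bit-string genome.
#[derive(Clone)]
struct CountOnes {
    genome_len: u32,
}

impl CountOnes {
    /// Fitness of one genome.
    fn fitness_of(&self, genome: &[bool]) -> u32 {
        genome.iter().filter(|&&g| g).count() as u32
    }

    /// Integer average of a set of fitness values (0 for an empty set).
    fn average(&self, values: &[u32]) -> u32 {
        if values.is_empty() {
            return 0;
        }
        // Sum in u64 so that many large values cannot overflow.
        let sum: u64 = values.iter().map(|&v| u64::from(v)).sum();
        (sum / values.len() as u64) as u32
    }

    /// Best possible value: every gene is `true`.
    fn highest_possible_fitness(&self) -> u32 {
        self.genome_len
    }

    /// Worst possible value: no gene is `true`.
    fn lowest_possible_fitness(&self) -> u32 {
        0
    }
}
```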
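The library implements Fitness for the primitive integer types, both signed and unsigned. Fitness requires Eq and Ord, so floating-point numbers cannot be used as fitness values directly; they have to be converted to integers first.
The implementations are generated by macros: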
macro_rules! implement_fitness_for_signed_integer {
    ( $($t:ty),* ) => {
        $(
            impl Fitness for $t {
                fn zero() -> $t {
                    0
                }

                fn abs_diff(&self, other: &$t) -> $t {
                    let diff = self - other;
                    diff.abs()
                }
            }

            impl AsScalar for $t {
                #[inline]
                fn as_scalar(&self) -> f64 {
                    *self as f64
                }
            }
        )*
    }
}

implement_fitness_for_signed_integer!(i8, i16, i32, i64, isize);
macro_rules! implement_fitness_for_unsigned_integer {
    ( $($t:ty),* ) => {
        $(
            impl Fitness for $t {
                fn zero() -> $t {
                    0
                }

                fn abs_diff(&self, other: &$t) -> $t {
                    if self > other {
                        self - other
                    } else {
                        other - self
                    }
                }
            }

            impl AsScalar for $t {
                #[inline]
                fn as_scalar(&self) -> f64 {
                    *self as f64
                }
            }
        )*
    }
}

implement_fitness_for_unsigned_integer!(u8, u16, u32, u64, usize);
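Since only integers are usable as fitness values, a floating-point score has to be mapped to an integer that preserves its ordering. One simple option is fixed-point scaling; the function name and the treatment of negative or NaN scores below are assumptions of this sketch, not something genevo prescribes.

```rust
/// Converts a floating-point score into an integer fitness by fixed-point scaling.
/// `scale` sets the resolution, e.g. 1000.0 keeps three decimal places.
/// Negative and NaN scores map to 0 (an assumption of this sketch).
fn to_fixed_point_fitness(score: f64, scale: f64) -> u64 {
    if score.is_nan() || score <= 0.0 {
        return 0;
    }
    // A float-to-integer `as` cast saturates at `u64::MAX` for very large values.
    (score * scale).round() as u64
}
```

A larger scale keeps more precision but leaves less headroom before saturation, so it should be chosen from the expected range of scores.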
3. Termination conditions
Termination conditions are defined with the Termination trait and handed to the GA simulation through the until function.
/// A `Termination` defines a condition when the `Simulation` shall stop.
///
/// One implementation of the trait `Termination` should only handle one
/// single termination condition. In the simulation multiple termination
/// conditions can be combined through `combinator`s.
pub trait Termination<A>
where
    A: Algorithm,
{
    /// Evaluates the termination condition and returns a `StopFlag` depending
    /// on the result. The `StopFlag` indicates whether the simulation shall
    /// stop or continue.
    ///
    /// In case the simulation shall be stopped, i.e. a `StopFlag::StopNow` is
    /// returned also a the reason why the simulation shall be stopped is
    /// returned. This reason should explain to the user of the simulation,
    /// why the simulation has been stopped.
    fn evaluate(&mut self, state: &State<A>) -> StopFlag;

    /// Resets the state of this `Termination` condition. This function is
    /// called on each `Termination` instance when the simulation is reset.
    ///
    /// This function only needs to be implemented by an implementation of
    /// `Termination` if it has its own state, e.g. for counting or tracking
    /// of progress.
    ///
    /// The default implementation does nothing.
    fn reset(&mut self) {}
}
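The Termination evaluates the current state of the simulation, which covers the genomes and their Fitness values, and returns whether to stop: StopNow or Continue.
genevo ships with several built-in conditions: FitnessLimit stops once a given Fitness is reached; GenerationLimit stops after a given number of generations; TimeLimit stops after a given amount of time.
In addition, the combinators or and and allow different conditions to be combined.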
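How limits and combinators fit together can be sketched with closures. The types below (StopFlag, SimState) and the functions are simplified stand-ins written for this note; they borrow the vocabulary of the library but are not its actual types or signatures.

```rust
/// Simplified stop flag; `StopNow` carries the reason for stopping.
#[derive(Debug, PartialEq)]
enum StopFlag {
    Continue,
    StopNow(String),
}

/// Simplified simulation state inspected by the conditions.
struct SimState {
    generation: u64,
    best_fitness: u32,
}

/// Stops once the generation counter reaches `max`.
fn generation_limit(max: u64) -> impl Fn(&SimState) -> StopFlag {
    move |s: &SimState| {
        if s.generation >= max {
            StopFlag::StopNow(format!("generation limit {} reached", max))
        } else {
            StopFlag::Continue
        }
    }
}

/// Stops once the best fitness reaches `target`.
fn fitness_limit(target: u32) -> impl Fn(&SimState) -> StopFlag {
    move |s: &SimState| {
        if s.best_fitness >= target {
            StopFlag::StopNow(format!("fitness limit {} reached", target))
        } else {
            StopFlag::Continue
        }
    }
}

/// Stops as soon as either condition asks to stop.
fn or(
    a: impl Fn(&SimState) -> StopFlag,
    b: impl Fn(&SimState) -> StopFlag,
) -> impl Fn(&SimState) -> StopFlag {
    move |s: &SimState| match a(s) {
        StopFlag::StopNow(reason) => StopFlag::StopNow(reason),
        StopFlag::Continue => b(s),
    }
}

/// Stops only when both conditions ask to stop.
fn and(
    a: impl Fn(&SimState) -> StopFlag,
    b: impl Fn(&SimState) -> StopFlag,
) -> impl Fn(&SimState) -> StopFlag {
    move |s: &SimState| match (a(s), b(s)) {
        (StopFlag::StopNow(r1), StopFlag::StopNow(r2)) => {
            StopFlag::StopNow(format!("{} and {}", r1, r2))
        }
        _ => StopFlag::Continue,
    }
}
```

A typical combination is or(generation_limit(100), fitness_limit(50)): the run ends when a good enough solution is found, and the generation limit bounds the run time if that never happens.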
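4. Transformation
Transformation is the most complex part of a genetic algorithm, and also the most complex part of this library.
genevo splits the transformation into several steps: Selection, CrossOver, Mutation, ReInsertion, and so on.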
4.1 Selection
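Translated directly from the official documentation:
Selects a set of parent genomes from a population according to their fitness and the selection strategy.
It is implemented through the SelectionOp trait: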
/// A `SelectionOp` defines the function of how to select solutions for being
/// the parents of the next generation.
pub trait SelectionOp<G, F>: GeneticOperator
where
    G: Genotype,
    F: Fitness,
{
    /// Selects individuals from the given population according to the
    /// implemented selection strategy.
    fn select_from<R>(
        &self,
        population: &EvaluatedPopulation<G, F>,
        rng: &mut R,
    ) -> Vec<Parents<G>>
    where
        R: Rng + Sized;
}
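EvaluatedPopulation holds the population after its fitness has been evaluated. Several groups of parents can be selected from the current population.
genevo ships with several selection methods: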
4.1.1 RouletteWheelSelector
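Fitness-proportionate selection: each individual is picked at random with a probability proportional to its fitness.
selection_ratio: the selection ratio, which determines how many groups of parents are selected; the number is population size * selection_ratio.
num_individuals_per_parents: the number of genomes in each group of parents.
By default, every genome of a group of parents is picked from the population by its own random draw.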
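A single roulette-wheel pick can be written in a few lines. This is the textbook form of fitness-proportionate selection, not genevo's source; roulette_pick is a hypothetical name, and the random draw is passed in as spin so that the sketch needs no random number crate.

```rust
/// Picks one index with probability proportional to its fitness.
/// `spin` is a random draw in [0, 1). Returns `None` if all fitness values are zero.
fn roulette_pick(fitness: &[u32], spin: f64) -> Option<usize> {
    let total: u64 = fitness.iter().map(|&f| u64::from(f)).sum();
    if total == 0 {
        return None;
    }
    // Position of the pointer on the wheel.
    let target = spin * total as f64;
    let mut cumulative = 0.0;
    for (i, &f) in fitness.iter().enumerate() {
        cumulative += f64::from(f);
        if target < cumulative {
            return Some(i);
        }
    }
    // Only reached through rounding when `spin` is very close to 1:
    // fall back to the last individual that owns a slice of the wheel.
    fitness.iter().rposition(|&f| f > 0)
}
```

With fitness values [1, 3, 6] the individuals own 10%, 30% and 60% of the wheel, so the third one is picked six times as often as the first.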
4.1.2 UniversalSamplingSelector
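Stochastic universal sampling, which is also fitness-proportionate. The difference from RouletteWheelSelector is that only the first genome of each group of parents is picked at random; every following genome is picked by jumping ahead at equal intervals.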
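The equally spaced jumps are easiest to see in code. The sketch below is the textbook version of stochastic universal sampling under the assumption that all picks share one wheel; universal_sampling is a hypothetical name and start is the single random draw in [0, 1).

```rust
/// Stochastic universal sampling: one random start, then equally spaced pointers.
/// Returns `count` indices into `fitness`; `start` is a random draw in [0, 1).
fn universal_sampling(fitness: &[u32], count: usize, start: f64) -> Vec<usize> {
    let total: f64 = fitness.iter().map(|&f| f64::from(f)).sum();
    if total == 0.0 || count == 0 {
        return Vec::new();
    }
    // Distance between two neighbouring pointers on the wheel.
    let step = total / count as f64;
    let mut picks = Vec::with_capacity(count);
    let mut i = 0;
    let mut cumulative = f64::from(fitness[0]);
    for k in 0..count {
        let pointer = (start + k as f64) * step;
        // Advance to the individual whose slice contains the pointer.
        while pointer >= cumulative && i + 1 < fitness.len() {
            i += 1;
            cumulative += f64::from(fitness[i]);
        }
        picks.push(i);
    }
    picks
}
```

Because all pointers come from one draw, an individual owning 60% of the wheel receives close to 60% of the picks in every run, which gives less variance than repeated independent roulette spins.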
4.1.3 TournamentSelector
Tournament selection: a few individuals are drawn at random to compete in a tournament, and the best of them is selected.
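The core of one tournament is just a maximum over the drawn contestants. In this sketch the contestants are passed in as indices that a caller would draw at random; tournament_winner is a hypothetical helper, and genevo's selector has further parameters that are not modelled here.

```rust
/// Returns the index of the fittest individual among the drawn `contestants`.
/// Every entry of `contestants` must be a valid index into `fitness`.
fn tournament_winner(fitness: &[u32], contestants: &[usize]) -> Option<usize> {
    contestants.iter().copied().max_by_key(|&i| fitness[i])
}
```

A larger tournament increases selection pressure, because weak individuals are less likely to win against more competitors.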
4.1.4 MaximizeSelector
Selects the best-performing individuals of the population.
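Picking the best performers amounts to sorting by fitness and cutting off the tail. A minimal sketch, with select_best as a hypothetical name and the number of individuals to keep given directly:

```rust
/// Returns the indices of the `count` fittest individuals, best first.
fn select_best(fitness: &[u32], count: usize) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..fitness.len()).collect();
    // Sort by descending fitness; the sort is stable, so ties keep their order.
    indices.sort_by(|&a, &b| fitness[b].cmp(&fitness[a]));
    indices.truncate(count);
    indices
}
```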
4.2 CrossOver
Produces new offspring through crossover.
/// A `CrossoverOp` defines a function of how to crossover two
/// `genetic::Genotype`s, often called parent genotypes, to derive new
/// `genetic::Genotype`s. It is analogous to reproduction and biological
/// crossover. Cross over is a process of taking two parent solutions and
/// producing an offspring solution from them.
pub trait CrossoverOp<G>: GeneticOperator
where
    G: Genotype,
{
    /// Performs the crossover of the `genetic::Parents` and returns the result
    /// as a new vector of `genetic::Genotype` - the `genetic::Children`.
    fn crossover<R>(&self, parents: Parents<G>, rng: &mut R) -> Children<G>
    where
        R: Rng + Sized;
}
It takes a group of Parents and produces a new group of Children.
4.2.1 UniformCrossBreeder
Produces as many children as there are parents. For each child, every gene position is taken at random from one of the parents.
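For two parents, the per-position choice can be sketched as follows. The random decisions are passed in as a mask (true means take the gene from the first parent) so that the example stays deterministic; uniform_crossover is a hypothetical name, not the library's implementation.

```rust
/// Builds one child by choosing, per position, the gene of `a` or of `b`.
/// `take_from_a[i]` is the (random) decision for position `i`.
fn uniform_crossover<T: Clone>(a: &[T], b: &[T], take_from_a: &[bool]) -> Vec<T> {
    a.iter()
        .zip(b)
        .zip(take_from_a)
        .map(|((x, y), &from_a)| if from_a { x.clone() } else { y.clone() })
        .collect()
}
```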
4.2.2 SinglePointCrossBreeder and MultiPointCrossBreeder
Both are implemented through the MultiPointCrossover trait, which the genome type itself has to implement:
pub trait MultiPointCrossover: Genotype {
    type Dna;

    fn crossover<R>(parents: Parents<Self>, num_cut_points: usize, rng: &mut R) -> Children<Self>
    where
        R: Rng + Sized;
}
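genevo implements this trait for Vec. The idea is to split a child genome into num_cut_points+1 segments and take consecutive segments from different parents. SinglePointCrossBreeder is the case num_cut_points=1.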
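The segment idea can be sketched for two parents with the cut points given explicitly; in practice they would be drawn at random. multi_point_crossover is a hypothetical helper that only illustrates the principle and is not the library's implementation.

```rust
/// Builds one child from two parents of equal length, switching the source
/// parent at every cut point. `cut_points` must be sorted and at most `a.len()`.
fn multi_point_crossover<T: Clone>(a: &[T], b: &[T], cut_points: &[usize]) -> Vec<T> {
    assert_eq!(a.len(), b.len(), "parents must have the same length");
    let len = a.len();
    let mut child = Vec::with_capacity(len);
    let mut from_a = true;
    let mut start = 0;
    // The genome end acts as a final cut point, giving num_cut_points + 1 segments.
    for end in cut_points.iter().copied().chain(std::iter::once(len)) {
        let source = if from_a { a } else { b };
        child.extend_from_slice(&source[start..end]);
        from_a = !from_a;
        start = end;
    }
    child
}
```

With cut points [2, 4] the child takes positions 0..2 from the first parent, 2..4 from the second and 4.. from the first again; a single cut point gives the single-point case.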
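4.2.3 OrderOneCrossover and PartiallyMappedCrossover
Both perform the crossover through the function multi_parents_cyclic_crossover, and genevo implements both for genomes made of usize values.
OrderOneCrossover uses the function order_one_crossover to take a segment from each of two sequences and mix them into a new sequence.
PartiallyMappedCrossover uses the function partial_mapped_crossover to rearrange the order.
Both are meant for genomes that encode the order of distinct objects: each of the two parent genomes contributes a part, and the parts are combined into a new genome.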
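For order-based genomes the child has to remain a valid ordering, which is why plain segment copying is not enough. The sketch below follows the textbook description of order one crossover (OX1); it reuses the name order_one_crossover for readability, but the signature with explicit start and end is an assumption of this note and the library's implementation may differ in detail.

```rust
/// Textbook order one crossover: the child keeps `a[start..end]` in place and
/// fills the remaining positions with the missing genes in the order in which
/// they appear in `b`, starting after the copied segment and wrapping around.
/// Both parents must be permutations of the same genes.
fn order_one_crossover(a: &[usize], b: &[usize], start: usize, end: usize) -> Vec<usize> {
    let n = a.len();
    assert!(b.len() == n && start <= end && end <= n);
    let mut child: Vec<Option<usize>> = vec![None; n];
    for i in start..end {
        child[i] = Some(a[i]);
    }
    let segment = &a[start..end];
    // Genes of `b`, read cyclically from `end`, without those already copied.
    let mut fill = (0..n)
        .map(|k| b[(end + k) % n])
        .filter(|g| !segment.contains(g));
    for k in 0..n {
        let pos = (end + k) % n;
        if child[pos].is_none() {
            child[pos] = fill.next();
        }
    }
    child
        .into_iter()
        .map(|g| g.expect("parents must be permutations of the same genes"))
        .collect()
}
```

With a = [0, 1, 2, 3, 4, 5, 6, 7], b = [7, 6, 5, 4, 3, 2, 1, 0] and the segment 2..5, the child keeps 2, 3, 4 from a and receives the remaining genes in the order they appear in b, so every gene still occurs exactly once.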
4.3 Mutation
Produces new offspring through mutation. This transformation is straightforward: it changes a single genome directly to produce a new genome.
4.3.1 InsertOrderMutator
A fairly simple mutation: loop mutation_rate*length times, each time picking a random gene and inserting it at a new position.
4.3.2 SwapOrderMutator
Loop mutation_rate*length times, each time swapping the positions of two randomly chosen genes.
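A single step of each order mutation is tiny. In this sketch the positions are passed in instead of being drawn at random, all three function names are hypothetical, and rounding mutation_rate*length to the nearest integer is an assumption, since these notes do not say how the library rounds.

```rust
/// One insert step: removes the gene at `from` and re-inserts it at `to`.
fn insert_mutation<T>(genome: &mut Vec<T>, from: usize, to: usize) {
    let gene = genome.remove(from);
    genome.insert(to, gene);
}

/// One swap step: exchanges the genes at positions `i` and `j`.
fn swap_mutation<T>(genome: &mut [T], i: usize, j: usize) {
    genome.swap(i, j);
}

/// Number of mutation steps for a genome of `len` genes.
/// Rounding to the nearest integer is an assumption of this sketch.
fn mutation_steps(mutation_rate: f64, len: usize) -> usize {
    (mutation_rate * len as f64).round() as usize
}
```

Both steps only rearrange genes, so a genome that encodes an ordering remains a valid ordering after mutation.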
4.3.3 RandomValueMutator
Mainly intended for genes that hold values: a gene is mutated at random to a value between min_value and max_value. The mutation uses the RandomGenomeMutation trait.
4.4 Reinsertion
Selects genomes from the offspring and combines them with the current population to create the new population.
pub trait ReinsertionOp<G, F>: GeneticOperator
where
    G: Genotype,
    F: Fitness,
{
    /// Combines the given offspring with the current population to create
    /// the population of the next generation.
    ///
    /// The offspring parameter is passed as mutable borrow. It can be
    /// mutated to avoid cloning. The `genetic::Genotype`s that make it up into
    /// the new population should be moved instead of cloned. After this
    /// function finishes the offspring vector should hold only those
    /// `genetic::Genotype`s that have not been included in the resulting
    /// population. If by the end of this function all `genetic::Genotype`s in
    /// offspring have been moved to the resulting population the offspring
    /// vector should be left empty.
    fn combine<R>(
        &self,
        offspring: &mut Offspring<G>,
        population: &EvaluatedPopulation<G, F>,
        rng: &mut R,
    ) -> Vec<G>
    where
        R: Rng + Sized;
}
4.4.1 UniformReinserter
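Picks some individuals at random from the offspring and some at random from the old population; together they form the new population.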
4.4.2 ElitistReinserter
Elitist selection: keeps the best individuals from the old population and the offspring.
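The elitist idea can be sketched as merge, sort and truncate. This is a simplified illustration: elitist_reinsert is a hypothetical function, it assumes that the fitness of every offspring is already known, and it ignores any tuning options the real reinserter may offer.

```rust
/// Merges the old population and the offspring, each given as (genome, fitness)
/// pairs, and keeps the `size` fittest individuals, best first.
fn elitist_reinsert<G>(
    old: Vec<(G, u32)>,
    offspring: Vec<(G, u32)>,
    size: usize,
) -> Vec<(G, u32)> {
    // Genomes are moved, not cloned, into the merged vector.
    let mut all: Vec<(G, u32)> = old.into_iter().chain(offspring).collect();
    // Stable sort by descending fitness: on ties the old individuals stay in front.
    all.sort_by(|a, b| b.1.cmp(&a.1));
    all.truncate(size);
    all
}
```

Keeping the best of both generations means the best fitness never decreases from one generation to the next, at the cost of some diversity.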
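5. Running
genevo offers two ways to run a simulation: run and step.
5.1 run()
run() enters the loop and keeps going until a termination condition is met or an error occurs.
5.2 step()
Each call to step() performs one iteration of the algorithm. The caller keeps calling it until the simulation terminates or an error occurs.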
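The relationship between the two modes can be sketched with a toy simulation in which run() is simply a loop around step(). StepOutcome and ToySimulation are hypothetical types made up for this note; they do not reproduce genevo's result types or error handling.

```rust
/// Outcome of one iteration of the toy simulation.
#[derive(Debug, PartialEq)]
enum StepOutcome {
    /// The simulation can continue; carries the generation just processed.
    Running(u32),
    /// A termination condition was met at this generation.
    Finished(u32),
}

/// Toy simulation that terminates after `limit` generations.
struct ToySimulation {
    generation: u32,
    limit: u32,
}

impl ToySimulation {
    /// Performs exactly one iteration and reports whether the simulation is done.
    fn step(&mut self) -> StepOutcome {
        self.generation += 1;
        if self.generation >= self.limit {
            StepOutcome::Finished(self.generation)
        } else {
            StepOutcome::Running(self.generation)
        }
    }

    /// Repeats `step` until the simulation is finished and returns the final generation.
    fn run(&mut self) -> u32 {
        loop {
            if let StepOutcome::Finished(generation) = self.step() {
                return generation;
            }
        }
    }
}
```

Stepping manually is useful when intermediate results should be logged or drawn after each generation, while run() suits batch execution where only the final result matters.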