Learning Genetic Algorithms (the genevo Library)


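NP-hard problems are difficult to solve with exact, constructive algorithms. The usual approach is a metaheuristic such as a genetic algorithm, or a related method such as simulated annealing or tabu search.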

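The basic logic of a genetic algorithm:

initialize the population
loop
    evaluate the fitness of the population
    stop if the termination condition is met
    select the parents of the next generation
    vary the population (crossover, mutation)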
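The loop above can be sketched in plain Rust without any library. This is a minimal illustration, not genevo's implementation: the toy problem (maximize the number of true bits), the names Lcg, fitness and run_ga, and the choice of truncation selection, single-point crossover and one-bit mutation are all assumptions made for the example.

```rust
/// Tiny linear congruential generator, used only so the sketch needs no external crate.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random index in `0..n` (`n` must be greater than zero).
    fn below(&mut self, n: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) % n as u64) as usize
    }
}

type Genome = Vec<bool>;

/// Toy fitness: the number of `true` bits.
fn fitness(genome: &Genome) -> usize {
    genome.iter().filter(|&&bit| bit).count()
}

/// Returns (best fitness, generations used). Requires `len >= 1` and `pop_size >= 2`.
fn run_ga(len: usize, pop_size: usize, max_generations: usize, seed: u64) -> (usize, usize) {
    let mut rng = Lcg(seed);
    // Initialize the population with random bit strings.
    let mut population: Vec<Genome> = Vec::with_capacity(pop_size);
    for _ in 0..pop_size {
        let genome: Genome = (0..len).map(|_| rng.below(2) == 1).collect();
        population.push(genome);
    }
    let mut generation = 0;
    loop {
        // Evaluate fitness and order the population best-first.
        population.sort_by_key(|g| std::cmp::Reverse(fitness(g)));
        let best = fitness(&population[0]);
        // Stop when the optimum is found or the generation budget is used up.
        if best == len || generation == max_generations {
            return (best, generation);
        }
        // Select the better half as parents; they also survive unchanged.
        let parents: Vec<Genome> = population[..pop_size / 2].to_vec();
        let mut next = parents.clone();
        // Vary: single-point crossover followed by a one-bit mutation.
        while next.len() < pop_size {
            let a = &parents[rng.below(parents.len())];
            let b = &parents[rng.below(parents.len())];
            let cut = rng.below(len);
            let mut child: Genome = a[..cut].iter().chain(&b[cut..]).copied().collect();
            let pos = rng.below(len);
            child[pos] = !child[pos];
            next.push(child);
        }
        population = next;
        generation += 1;
    }
}
```

For example, run_ga(16, 20, 200, 42) returns the best fitness reached and the number of generations it took.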
  

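I wanted to use a genetic algorithm in Rust without implementing one from scratch, so this post studies the design and logic of innoave/genevo on GitHub, which describes itself as "Execute genetic algorithm (GA) simulations in a customizable and extensible way."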

1. Initializing the population

genevo initializes the population through the build_population() function. It is a public function of the population module and returns an EmptyPopulationBuilder:

pub fn build_population() -> EmptyPopulationBuilder {
    EmptyPopulationBuilder {
        _empty: PhantomData,
    }
}

Calling with_genome_builder() on the EmptyPopulationBuilder passes in a GenomeBuilder and yields a PopulationWithGenomeBuilderBuilder. From there, call of_size() and then either uniform_at_random() or using_seed() to obtain the final Population.

1.1 GenomeBuilder

/// A `GenomeBuilder` defines how to build individuals of a population for
/// custom `genetic::Genotype`s.
///
/// Typically the individuals are generated randomly.
pub trait GenomeBuilder<G>: Sync
where
    G: Genotype,
{
    /// Builds a new genome of type `genetic::Genotype` for the given
    /// `index` using the given random number generator `rng`.
    fn build_genome<R>(&self, index: usize, rng: &mut R) -> G
    where
        R: Rng + Sized;
}

This trait generates a new genome. The index argument lets the builder produce a different genome for each individual.
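To make the idea concrete, here is a minimal sketch that does not use genevo at all. GenomeBuilderSketch is a hypothetical stand-in for the trait quoted above (reduced to one concrete random source), and BitStringBuilder, build_population_sketch and Lcg are likewise names invented for this example.

```rust
/// Tiny linear congruential generator, used only so the sketch needs no external crate.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random index in `0..n` (`n` must be greater than zero).
    fn below(&mut self, n: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) % n as u64) as usize
    }
}

/// Hypothetical stand-in for a genome builder: one genome per index.
trait GenomeBuilderSketch<G> {
    fn build_genome(&self, index: usize, rng: &mut Lcg) -> G;
}

/// Builds random bit strings of a fixed length.
struct BitStringBuilder {
    len: usize,
}

impl GenomeBuilderSketch<Vec<bool>> for BitStringBuilder {
    fn build_genome(&self, _index: usize, rng: &mut Lcg) -> Vec<bool> {
        (0..self.len).map(|_| rng.below(2) == 1).collect()
    }
}

/// Builds `size` genomes from a seed, calling the builder once per index.
fn build_population_sketch<G, B>(builder: &B, size: usize, seed: u64) -> Vec<G>
where
    B: GenomeBuilderSketch<G>,
{
    let mut rng = Lcg(seed);
    (0..size)
        .map(|index| builder.build_genome(index, &mut rng))
        .collect()
}
```

Because the random source is seeded, building twice with the same seed gives the same population, which is the role using_seed() appears to play in the builder chain described earlier.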

2. Evaluating fitness

The trait responsible for evaluating fitness is FitnessFunction:

/// Defines the evaluation function to calculate the `Fitness` value of a
/// `Genotype` based on its properties.
pub trait FitnessFunction<G, F>: Clone
where
    G: Genotype,
    F: Fitness,
{
    /// Calculates the `Fitness` value of the given `Genotype`.
    fn fitness_of(&self, a: &G) -> F;

    /// Calculates the average `Fitness` value of the given `Fitness` values.
    fn average(&self, a: &[F]) -> F;

    /// Returns the very best of all theoretically possible `Fitness` values.
    fn highest_possible_fitness(&self) -> F;

    /// Returns the worst of all theoretically possible `Fitness` values.
    /// This is usually a value equivalent to zero.
    fn lowest_possible_fitness(&self) -> F;
}

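It is fairly simple. An implementation must be able to:

  1. compute the Fitness of a genome,
  2. compute the average of a slice of Fitness values,
  3. report the highest and lowest possible Fitness.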
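The three requirements can be illustrated with a small, library-free sketch. FitnessFunctionSketch is a hypothetical trait that only mirrors the four methods quoted above with u32 as the fitness type; CountOnes is an invented example that scores a bit string by its number of true bits.

```rust
/// Hypothetical stand-in mirroring the four methods of the trait quoted above.
trait FitnessFunctionSketch<G> {
    fn fitness_of(&self, genome: &G) -> u32;
    fn average(&self, values: &[u32]) -> u32;
    fn highest_possible_fitness(&self) -> u32;
    fn lowest_possible_fitness(&self) -> u32;
}

/// Scores a bit string by counting its `true` bits.
#[derive(Clone)]
struct CountOnes {
    genome_len: u32,
}

impl FitnessFunctionSketch<Vec<bool>> for CountOnes {
    fn fitness_of(&self, genome: &Vec<bool>) -> u32 {
        genome.iter().filter(|&&bit| bit).count() as u32
    }

    fn average(&self, values: &[u32]) -> u32 {
        if values.is_empty() {
            return 0;
        }
        // Sum in u64 so that many large values cannot overflow.
        let sum: u64 = values.iter().map(|&v| v as u64).sum();
        (sum / values.len() as u64) as u32
    }

    fn highest_possible_fitness(&self) -> u32 {
        self.genome_len
    }

    fn lowest_possible_fitness(&self) -> u32 {
        0
    }
}
```

Note that the integer average truncates: the average of 1, 2 and 4 is reported as 2.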

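The library implements Fitness for the built-in integer types, both signed and unsigned. Fitness requires Eq and Ord, which f32 and f64 do not implement, so a floating-point score cannot be used as a fitness value directly and has to be converted to an integer first.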
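One common way to do that conversion is fixed-point scaling. The sketch below is an assumption for illustration (the helper name to_fixed_point and the choice of three decimal places are not from genevo): it multiplies the score by 1000 and rounds, so the ordering of scores is preserved up to that precision.

```rust
/// Converts a floating-point score to an integer fitness with three decimal
/// places of precision. NaN maps to 0 and out-of-range values saturate,
/// following the rules of Rust's float-to-int `as` cast.
fn to_fixed_point(score: f64) -> i64 {
    (score * 1_000.0).round() as i64
}
```

If the problem is a minimization (lower cost is better), negating the result, for example -to_fixed_point(cost), turns it into a value where larger means fitter.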

The implementations are generated by macros:

macro_rules! implement_fitness_for_signed_integer {
    ( $($t:ty),* ) => {
        $(
            impl Fitness for $t {
                fn zero() -> $t {
                    0
                }

                fn abs_diff(&self, other: &$t) -> $t {
                    let diff = self - other;
                    diff.abs()
                }
            }

            impl AsScalar for $t {
                #[inline]
                fn as_scalar(&self) -> f64 {
                    *self as f64
                }
            }
        )*
    }
}

implement_fitness_for_signed_integer!(i8, i16, i32, i64, isize);

macro_rules! implement_fitness_for_unsigned_integer {
    ( $($t:ty),* ) => {
        $(
            impl Fitness for $t {
                fn zero() -> $t {
                    0
                }

                fn abs_diff(&self, other: &$t) -> $t {
                    if self > other {
                        self - other
                    } else {
                        other - self
                    }
                }
            }

            impl AsScalar for $t {
                #[inline]
                fn as_scalar(&self) -> f64 {
                    *self as f64
                }
            }
        )*
    }
}

implement_fitness_for_unsigned_integer!(u8, u16, u32, u64, usize);
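The unsigned macro branches on the comparison because subtracting a larger unsigned value from a smaller one would underflow. A small standalone sketch of the same idea for u8 (the function name abs_diff_unsigned is made up for this example):

```rust
/// Absolute difference of two unsigned values without underflow:
/// always subtract the smaller value from the larger one.
fn abs_diff_unsigned(a: u8, b: u8) -> u8 {
    if a > b {
        a - b
    } else {
        b - a
    }
}
```

Since Rust 1.60 the standard library offers the same operation as the abs_diff method on integer types.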

3. Termination conditions

Termination conditions are defined through the Termination trait and passed to the GA simulation with the until function.

/// A `Termination` defines a condition when the `Simulation` shall stop.
///
/// One implementation of the trait `Termination` should only handle one
/// single termination condition. In the simulation multiple termination
/// conditions can be combined through `combinator`s.
pub trait Termination<A>
where
    A: Algorithm,
{
    /// Evaluates the termination condition and returns a `StopFlag` depending
    /// on the result. The `StopFlag` indicates whether the simulation shall
    /// stop or continue.
    ///
    /// In case the simulation shall be stopped, i.e. a `StopFlag::StopNow` is
    /// returned also a the reason why the simulation shall be stopped is
    /// returned. This reason should explain to the user of the simulation,
    /// why the simulation has been stopped.
    fn evaluate(&mut self, state: &State<A>) -> StopFlag;

    /// Resets the state of this `Termination` condition. This function is
    /// called on each `Termination` instance when the simulation is reset.
    ///
    /// This function only needs to be implemented by an implementation of
    /// `Termination` if it has its own state, e.g. for counting or tracking
    /// of progress.
    ///
    /// The default implementation does nothing.
    fn reset(&mut self) {}
}

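On each evaluation, a Termination inspects the current simulation state (the population and its Fitness values) and decides whether to stop, returning either StopNow or Continue.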

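genevo ships with several built-in conditions: FitnessLimit stops once a given Fitness is reached; GenerationLimit stops after a given number of generations; TimeLimit stops after a given amount of time.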

In addition, the combinators or and and let you combine several conditions.
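How such conditions and an "or" combinator fit together can be sketched without the library. Everything below is a simplified stand-in written for this post: the names end in Sketch to make clear they are not genevo's types, and the state is reduced to a generation counter and a best fitness.

```rust
/// Simplified stand-in for the stop flag: continue, or stop with a reason.
#[derive(Debug, PartialEq)]
enum StopFlagSketch {
    Continue,
    StopNow(String),
}

/// Simplified stand-in for the simulation state.
struct StateSketch {
    generation: u64,
    best_fitness: u32,
}

trait TerminationSketch {
    fn evaluate(&mut self, state: &StateSketch) -> StopFlagSketch;
}

/// Stops once the given number of generations has been processed.
struct GenerationLimitSketch(u64);

impl TerminationSketch for GenerationLimitSketch {
    fn evaluate(&mut self, state: &StateSketch) -> StopFlagSketch {
        if state.generation >= self.0 {
            StopFlagSketch::StopNow(format!("generation limit {} reached", self.0))
        } else {
            StopFlagSketch::Continue
        }
    }
}

/// Stops once the best fitness reaches the given value.
struct FitnessLimitSketch(u32);

impl TerminationSketch for FitnessLimitSketch {
    fn evaluate(&mut self, state: &StateSketch) -> StopFlagSketch {
        if state.best_fitness >= self.0 {
            StopFlagSketch::StopNow(format!("fitness limit {} reached", self.0))
        } else {
            StopFlagSketch::Continue
        }
    }
}

/// Stops as soon as either of the two wrapped conditions wants to stop.
struct OrSketch<A, B>(A, B);

impl<A: TerminationSketch, B: TerminationSketch> TerminationSketch for OrSketch<A, B> {
    fn evaluate(&mut self, state: &StateSketch) -> StopFlagSketch {
        match self.0.evaluate(state) {
            StopFlagSketch::Continue => self.1.evaluate(state),
            stop => stop,
        }
    }
}
```

An "and" combinator would follow the same shape, stopping only when both wrapped conditions report StopNow.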

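4. Transformation

Transformation is the most complex part of a genetic algorithm, and also the most complex part of this library.

genevo splits the transformation into several steps: Selection, Crossover, Mutation, and Reinsertion.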

4.1 Selection

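Translated directly from the official documentation:

Selects a set of parent genomes from a population according to their fitness and the selection strategy.

It is implemented through the SelectionOp trait: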

/// A `SelectionOp` defines the function of how to select solutions for being
/// the parents of the next generation.
pub trait SelectionOp<G, F>: GeneticOperator
where
    G: Genotype,
    F: Fitness,
{
    /// Selects individuals from the given population according to the
    /// implemented selection strategy.
    fn select_from<R>(
        &self,
        population: &EvaluatedPopulation<G, F>,
        rng: &mut R,
    ) -> Vec<Parents<G>>
    where
        R: Rng + Sized;
}

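EvaluatedPopulation here is the population after its fitness has been evaluated. From the current population, the operator can select multiple groups of parents.

genevo ships with several selection strategies: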

4.1.1 RouletteWheelSelector

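Fitness-proportionate selection: a given number of individuals is drawn, each with a probability that depends on its fitness.

selection_ratio: the selection ratio. It determines how many groups of parents are selected: population size * selection_ratio.

num_individuals_per_parents: the number of individuals in each group of parents.

By default, every individual in a group of parents is drawn from the population by its own random pick.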
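The textbook form of roulette-wheel selection can be sketched as follows. This is not genevo's code: the function names are invented, the random numbers are supplied by the caller as values in [0, 1) so that the sketch needs no external crate, and rounding the number of parent groups to the nearest integer is an assumption.

```rust
/// Picks one index with probability proportional to its fitness.
/// `spin` is a random number in [0, 1); `fitness` must not be empty.
fn roulette_pick(fitness: &[u32], spin: f64) -> usize {
    let total: f64 = fitness.iter().map(|&f| f as f64).sum();
    let target = spin * total;
    let mut cumulative = 0.0;
    for (index, &f) in fitness.iter().enumerate() {
        cumulative += f as f64;
        if target < cumulative {
            return index;
        }
    }
    // Only reached through rounding or when all fitness values are zero.
    fitness.len() - 1
}

/// Selects `population size * selection_ratio` groups of parents, each with
/// `num_individuals_per_parents` members, returned as population indices.
fn select_parents(
    fitness: &[u32],
    selection_ratio: f64,
    num_individuals_per_parents: usize,
    mut spin: impl FnMut() -> f64,
) -> Vec<Vec<usize>> {
    let groups = (fitness.len() as f64 * selection_ratio).round() as usize;
    let mut parents = Vec::with_capacity(groups);
    for _ in 0..groups {
        let mut group = Vec::with_capacity(num_individuals_per_parents);
        for _ in 0..num_individuals_per_parents {
            group.push(roulette_pick(fitness, spin()));
        }
        parents.push(group);
    }
    parents
}
```

With fitness values 1, 3 and 6, the wheel is divided into slices covering 10%, 30% and 60% of its circumference, so a spin of 0.25 lands on the second individual.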

4.1.2 UniversalSamplingSelector

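Stochastic universal sampling. The difference from RouletteWheelSelector is that only the first individual of each group of parents is picked at random; every following individual is obtained by jumping ahead by the same fixed interval.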
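A minimal sketch of textbook stochastic universal sampling, assuming a single random starting offset and evenly spaced pointers over the cumulative fitness. The function name and the caller-supplied start value in [0, 1) are choices made for this example, and genevo's grouping into parents is left out.

```rust
/// Picks `count` indices using one random offset and equally spaced pointers.
/// `start` is a random number in [0, 1); `fitness` must not be empty.
fn universal_sampling(fitness: &[u32], count: usize, start: f64) -> Vec<usize> {
    let total: f64 = fitness.iter().map(|&f| f as f64).sum();
    // Distance between two neighbouring pointers on the wheel.
    let step = total / count as f64;
    let mut picks = Vec::with_capacity(count);
    let mut cumulative = 0.0;
    let mut index = 0;
    for k in 0..count {
        let pointer = (start + k as f64) * step;
        // Advance to the individual whose slice contains the pointer.
        while index + 1 < fitness.len() && cumulative + fitness[index] as f64 <= pointer {
            cumulative += fitness[index] as f64;
            index += 1;
        }
        picks.push(index);
    }
    picks
}
```

Compared with spinning the wheel once per pick, the evenly spaced pointers guarantee that the number of times an individual is chosen stays close to its expected share.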

4.1.3 TournamentSelector

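Tournament selection: a few individuals are drawn at random to compete in a "tournament", and the best individual of each tournament is selected.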
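The core of a tournament is just "best of a random subset". In the sketch below the caller draws the contestants at random and passes their indices in; the function name and this split are assumptions for illustration, and any extra parameters genevo's selector may take are not modelled.

```rust
/// Returns the index of the fittest contestant, or `None` for an empty tournament.
/// `contestants` holds indices into `fitness`, drawn at random by the caller.
fn tournament_winner(fitness: &[u32], contestants: &[usize]) -> Option<usize> {
    contestants.iter().copied().max_by_key(|&i| fitness[i])
}
```

A larger tournament increases selection pressure, because weak individuals are less likely to be the best of a bigger random subset.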

4.1.4 MaximizeSelector

Selects the best-performing individuals of the population.
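As a sketch, selecting the best performers amounts to sorting by fitness and keeping the top share. The function name and the rounding of the kept count are assumptions for this example, and the grouping into parents is left out.

```rust
/// Returns the indices of the best individuals, fittest first.
/// The number kept is `population size * selection_ratio`, rounded.
fn select_best(fitness: &[u32], selection_ratio: f64) -> Vec<usize> {
    let mut order: Vec<usize> = (0..fitness.len()).collect();
    // Sort indices by descending fitness (stable for equal values).
    order.sort_by_key(|&i| std::cmp::Reverse(fitness[i]));
    let keep = (fitness.len() as f64 * selection_ratio).round() as usize;
    order.truncate(keep);
    order
}
```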

4.2 Crossover

New offspring are produced through crossover.

/// A `CrossoverOp` defines a function of how to crossover two
/// `genetic::Genotype`s, often called parent genotypes, to derive new
/// `genetic::Genotype`s. It is analogous to reproduction and biological
/// crossover. Cross over is a process of taking two parent solutions and
/// producing an offspring solution from them.
pub trait CrossoverOp<G>: GeneticOperator
where
    G: Genotype,
{
    /// Performs the crossover of the `genetic::Parents` and returns the result
    /// as a new vector of `genetic::Genotype` - the `genetic::Children`.
    fn crossover<R>(&self, parents: Parents<G>, rng: &mut R) -> Children<G>
    where
        R: Rng + Sized;
}

It takes one group of Parents and produces a new group of Children.

4.2.1 UniformCrossBreeder

Produces as many children as there are parents. For each child, the gene at every position is taken at random from one of the parents.
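A minimal sketch of building one such child. To keep it deterministic and free of external crates, the random choice is passed in as a mask: mask[pos] names the parent that donates the gene at position pos. The function name and the mask parameter are inventions for this example.

```rust
/// Builds one child: position `pos` is copied from parent `mask[pos]`.
/// All parents and the mask must have the same length.
fn uniform_child(parents: &[Vec<u8>], mask: &[usize]) -> Vec<u8> {
    mask.iter()
        .enumerate()
        .map(|(pos, &parent)| parents[parent][pos])
        .collect()
}
```

In a real operator the mask would be drawn at random for every child, once per gene position.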

4.2.2 SinglePointCrossBreeder and MultiPointCrossBreeder

Both rely on the MultiPointCrossover trait, which is implemented on the genome type itself:

pub trait MultiPointCrossover: Genotype {
    type Dna;

    fn crossover<R>(parents: Parents<Self>, num_cut_points: usize, rng: &mut R) -> Children<Self>
    where
        R: Rng + Sized;
}

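genevo implements this trait for Vec. The idea is to split the child genome into num_cut_points + 1 segments, with each segment taken from a different parent. SinglePointCrossBreeder is the special case num_cut_points = 1.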
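For two parents, the segment idea can be sketched like this. The function name is made up, and the cut points are passed in explicitly (sorted ascending, each no larger than the genome length) instead of being drawn at random, so the result is easy to follow.

```rust
/// Crosses two equally long parents at the given cut points and returns two
/// complementary children. Segments alternate between the parents.
fn multi_point_crossover(a: &[u8], b: &[u8], cuts: &[usize]) -> (Vec<u8>, Vec<u8>) {
    let len = a.len();
    let mut child1 = Vec::with_capacity(len);
    let mut child2 = Vec::with_capacity(len);
    let mut take_from_a = true;
    let mut start = 0;
    // The cut points plus the genome end delimit `cuts.len() + 1` segments.
    for &end in cuts.iter().chain(std::iter::once(&len)) {
        let (first, second) = if take_from_a { (a, b) } else { (b, a) };
        child1.extend_from_slice(&first[start..end]);
        child2.extend_from_slice(&second[start..end]);
        take_from_a = !take_from_a;
        start = end;
    }
    (child1, child2)
}
```

With two cut points at 2 and 4, the first child takes positions 0..2 from the first parent, 2..4 from the second, and 4..6 from the first again.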

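4.2.3 OrderOneCrossover and PartiallyMappedCrossover

Both are driven by the function multi_parents_cyclic_crossover, and genevo's implementations target genomes made of usize values.

OrderOneCrossover uses the function order_one_crossover to take a segment from each of two sequences and mix them into a new sequence.

PartiallyMappedCrossover uses the function partial_mapped_crossover to rearrange the order.

Both are meant for genomes that encode the ordering of distinct items: each of the two parents contributes a part, and the result is a new genome that is still a valid ordering.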
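The textbook order-one crossover (OX1) can be sketched as below. This is not genevo's order_one_crossover: the function name and the explicit start and end parameters (normally drawn at random) are assumptions for this example. Both parents must be non-empty permutations of the same items.

```rust
/// Order-one crossover: keep `p1[start..end]` in place, then fill the other
/// positions with the remaining genes in the order they appear in `p2`,
/// starting after the kept slice and wrapping around.
fn order_one(p1: &[usize], p2: &[usize], start: usize, end: usize) -> Vec<usize> {
    let n = p1.len();
    let kept = &p1[start..end];
    // Start from a copy of p1; every position outside the slice is overwritten.
    let mut child = p1.to_vec();
    let mut pos = end % n;
    for k in 0..n {
        let gene = p2[(end + k) % n];
        if !kept.contains(&gene) {
            child[pos] = gene;
            pos = (pos + 1) % n;
        }
    }
    child
}
```

Because every gene outside the kept slice comes from the second parent exactly once, the child never contains duplicates, which is what makes this operator suitable for ordering problems such as route planning.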

4.3 Mutation

New offspring are produced through mutation. This operator is straightforward: it transforms a single genome directly into a new genome.

4.3.1 InsertOrderMutator

A fairly simple mutation: repeat mutation_rate * length times, each time picking a random gene and inserting it at a new position.

4.3.2 SwapOrderMutator

Repeat mutation_rate * length times, each time swapping the positions of two randomly chosen genes.
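The two order-preserving mutations described above differ only in the elementary move. A minimal sketch of both moves, with the positions passed in explicitly rather than drawn at random; the function names and the rounding of mutation_rate * length are assumptions for this example.

```rust
/// Number of elementary moves for a genome of the given length.
fn mutation_rounds(mutation_rate: f64, length: usize) -> usize {
    (mutation_rate * length as f64).round() as usize
}

/// Insert move: take the gene at `from` out and put it back at `to`.
fn insert_gene(genome: &mut Vec<usize>, from: usize, to: usize) {
    let gene = genome.remove(from);
    genome.insert(to, gene);
}

/// Swap move: exchange the genes at positions `a` and `b`.
fn swap_genes(genome: &mut [usize], a: usize, b: usize) {
    genome.swap(a, b);
}
```

Both moves only rearrange genes, so a genome that encodes a permutation stays a valid permutation after mutation.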

4.3.3 RandomValueMutator

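Intended mainly for value-encoded genomes: a gene is randomly mutated to a value between min_value and max_value. The mutation itself is provided by the RandomGenomeMutation trait.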
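A library-free sketch of this kind of mutation on a genome of u8 values. The names are invented, the range is treated as half-open (min_value inclusive, max_value exclusive), and the number of mutated genes is mutation_rate * length rounded; all three are assumptions for illustration rather than statements about genevo's behavior.

```rust
/// Tiny linear congruential generator, used only so the sketch needs no external crate.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random index in `0..n` (`n` must be greater than zero).
    fn below(&mut self, n: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) % n as u64) as usize
    }
}

/// Overwrites randomly chosen genes with random values in `min_value..max_value`.
/// Requires `min_value < max_value`.
fn random_value_mutation(
    genome: &mut [u8],
    mutation_rate: f64,
    min_value: u8,
    max_value: u8,
    rng: &mut Lcg,
) {
    let rounds = (mutation_rate * genome.len() as f64).round() as usize;
    let span = (max_value - min_value) as usize;
    for _ in 0..rounds {
        let pos = rng.below(genome.len());
        genome[pos] = min_value + rng.below(span) as u8;
    }
}
```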

4.4 Reinsertion

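Selects individuals from the offspring and combines them with the current population to create the population of the next generation.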

pub trait ReinsertionOp<G, F>: GeneticOperator
where
    G: Genotype,
    F: Fitness,
{
    /// Combines the given offspring with the current population to create
    /// the population of the next generation.
    ///
    /// The offspring parameter is passed as mutable borrow. It can be
    /// mutated to avoid cloning. The `genetic::Genotype`s that make it up into
    /// the new population should be moved instead of cloned. After this
    /// function finishes the offspring vector should hold only those
    /// `genetic::Genotype`s that have not been included in the resulting
    /// population. If by the end of this function all `genetic::Genotype`s in
    /// offspring have been moved to the resulting population the offspring
    /// vector should be left empty.
    fn combine<R>(
        &self,
        offspring: &mut Offspring<G>,
        population: &EvaluatedPopulation<G, F>,
        rng: &mut R,
    ) -> Vec<G>
    where
        R: Rng + Sized;
}

4.4.1 UniformReinserter

Picks a random part of the offspring and a random part of the old population; together they form the new population.

4.4.2 ElitistReinserter

Elitist selection: keeps the best individuals from the offspring and the old population.
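In its simplest textbook form, elitist reinsertion pools both groups and keeps the fittest. The sketch below assumes every individual already carries its fitness; the Scored type and the function name are invented for this example, and any additional options genevo's reinserter may offer are not modelled.

```rust
/// An individual together with its already computed fitness.
#[derive(Clone, Debug)]
struct Scored {
    genome: Vec<u8>,
    fitness: u32,
}

/// Pools offspring and old population, then keeps the `size` fittest individuals.
fn elitist_reinsert(offspring: Vec<Scored>, population: &[Scored], size: usize) -> Vec<Scored> {
    let mut pool = offspring;
    pool.extend_from_slice(population);
    // Fittest first; the sort is stable, so offspring win ties against the old population.
    pool.sort_by_key(|s| std::cmp::Reverse(s.fitness));
    pool.truncate(size);
    pool
}
```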

5. Running

genevo offers two ways to run a simulation: run and step.

5.1 run()

Calling run puts the algorithm into its loop, which continues until a termination condition is met or an error occurs.

5.2 step()

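Each call to step executes a single iteration of the algorithm; the caller keeps calling it until the simulation terminates or an error occurs.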
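The relationship between the two can be sketched with a toy simulation in which run is simply a loop around step. The types below are stand-ins written for this post (their names end in Sketch), and the only termination condition modelled is a generation limit.

```rust
/// Result of one iteration: still running, or finished with a stop reason.
#[derive(Debug, PartialEq)]
enum StepResultSketch {
    Intermediate(u64),
    Final(u64, String),
}

/// Toy simulation that just counts generations up to a limit.
struct SimulationSketch {
    generation: u64,
    generation_limit: u64,
}

impl SimulationSketch {
    /// Processes exactly one generation.
    fn step(&mut self) -> Result<StepResultSketch, String> {
        if self.generation >= self.generation_limit {
            return Err("simulation has already finished".to_string());
        }
        self.generation += 1;
        if self.generation >= self.generation_limit {
            Ok(StepResultSketch::Final(
                self.generation,
                "generation limit reached".to_string(),
            ))
        } else {
            Ok(StepResultSketch::Intermediate(self.generation))
        }
    }

    /// Keeps stepping until the simulation finishes or an error occurs.
    fn run(&mut self) -> Result<StepResultSketch, String> {
        loop {
            match self.step()? {
                StepResultSketch::Intermediate(_) => continue,
                done => return Ok(done),
            }
        }
    }
}
```

Stepping manually is useful when the caller wants to report progress or inspect the intermediate state after every generation; run is the convenient choice when only the final result matters.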