0. Background
This post walks through a paper that performs feature selection with an improved BPSO algorithm, using a 1-NN classifier to assess the quality of the selected features. The comparisons in the paper indicate that the proposed algorithm has certain advantages for feature selection.
Zhang Y, Gong D, Hu Y, et al. Feature selection algorithm based on bare bones particle swarm optimization[J]. Neurocomputing, 2015, 148: 150-157.
This is a paper from a tier-2 (Q2) journal. Personally, I find the improved BPSO proposed for this practical feature-selection problem not especially novel, but the paper does give a clear picture of the problems involved.
1. Core Idea of the Paper
The paper proposes a new method for finding an optimal feature subset with BPSO, called binary BPSO. The algorithm designs a reinforced memory strategy to update each particle's personal best, avoiding the degradation of prominent genes within a particle, and proposes a uniform combination (UC) operator to balance local exploitation against global exploration. In addition, a 1-NN classifier is used to evaluate each particle's classification accuracy. The algorithm is evaluated on a number of international benchmark datasets.
For the Binary PSO algorithm, see: Binary PSO; for the BPSO (bare bones PSO) algorithm, see: BPSO.
2. The Binary BPSO
The overall flow of Binary BPSO is given in the paper as a numbered procedure; the key steps are walked through below.
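As background, the position update of the underlying (continuous) bare bones PSO, which Binary BPSO builds on, can be sketched as follows. This is a minimal sketch, with names of my own choosing: with probability 0.5 a particle keeps its personal best; otherwise each coordinate is drawn from a Gaussian centred midway between pbest and gbest, with standard deviation |pbest - gbest|.

```python
import numpy as np

def bbpso_sample(pbest, gbest, rng):
    # With probability 0.5 the particle keeps its personal best unchanged.
    if rng.random() > 0.5:
        return pbest.copy()
    # Otherwise sample each coordinate from N((pbest+gbest)/2, |pbest-gbest|).
    mu = (pbest + gbest) / 2.0
    sigma = np.abs(pbest - gbest)
    return rng.normal(mu, sigma)

rng = np.random.default_rng(0)
pbest = np.array([0.2, 0.8, 0.5])
gbest = np.array([0.4, 0.6, 0.5])
x_new = bbpso_sample(pbest, gbest, rng)
# In a coordinate where pbest == gbest, both branches return that value
# exactly (the Gaussian has zero standard deviation there).
```

Note that this update has no velocity term at all, which is the defining feature of bare bones PSO.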
Step 1.1 involves initializing the swarm: N is the number of features (variables), |S| is the number of initial particles, and x_ij denotes the probability that the j-th feature of the i-th particle is selected.
How, then, is the actual selection state of the j-th feature in the i-th particle represented? By a binary variable z_ij: z_ij = 1 means the j-th feature of the i-th particle is selected, and z_ij = 0 means it is not.
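The initialization just described can be sketched as follows (a minimal sketch; the variable names are mine): each particle i carries a probability vector x_i in [0,1]^N, and the binary selection vector z_i is obtained by comparing each x_ij with a uniform random draw, so feature j is selected with probability x_ij.

```python
import numpy as np

rng = np.random.default_rng(42)
num_particles, num_features = 5, 8              # |S| = 5, N = 8
# x_ij: probability that particle i selects feature j
X = rng.random((num_particles, num_features))
# z_ij: 1 if feature j is selected in particle i, 0 otherwise
Z = (rng.random((num_particles, num_features)) < X).astype(int)
```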
How is the fitness in step 1.2 computed? According to which features a particle selects, the corresponding columns and the labels are taken from the original dataset to form a reduced dataset, on which classification accuracy is computed with 1-NN under a leave-one-out protocol: each single sample in turn serves as the test sample, and the remaining samples form the training set. The accuracy is then
    accuracy = (number of correctly classified samples) / (total number of samples).
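The leave-one-out 1-NN fitness described above can be sketched like this (a simplified sketch without the k-NN tie-breaking of the full reproduction; the function name is mine):

```python
import numpy as np

def loo_1nn_accuracy(X, y):
    # Each sample in turn is the test point; its label is predicted by
    # the nearest remaining sample under squared Euclidean distance.
    n = len(y)
    correct = 0
    for i in range(n):
        d = np.sum((X - X[i]) ** 2, axis=1)
        d[i] = np.inf                      # exclude the test point itself
        correct += (y[np.argmin(d)] == y[i])
    return correct / n

# Two well-separated clusters: LOO 1-NN classifies every point correctly.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
y = np.array([0, 0, 1, 1])
acc = loo_1nn_accuracy(X, y)   # → 1.0
```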
This brings us to the paper's two key algorithmic contributions: the reinforced memory strategy and uniform combination (UC).
2.1 The reinforced memory strategy
The conventional way of updating Pbest was covered in the earlier PSO post, but it has a flaw. Suppose x_ij = 0.1 and the random draw happens to fall below 0.1, so the feature is selected; although this apparently weak feature got in by luck, the overall prediction accuracy rises, which suggests the feature's importance is more than 0.1. The conventional update simply copies x_i into Pbest and does nothing to raise the probability that such features are selected again; without that boost, the feature may well miss the cut in the next generation. Hence the reinforced memory strategy, which folds the realised binary selection vector back into Pbest.
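A sketch of this Pbest update, following the averaging used in the reproduction code later in this post: when a particle improves, its new Pbest is the mean of the probability vector x and the realised binary vector z, so a feature that was actually selected (z_ij = 1) has its selection probability pulled upward.

```python
import numpy as np

x = np.array([0.1, 0.9, 0.5])   # current selection probabilities
z = np.array([1, 1, 0])         # realised selection: feature 0 got in by luck
# Averaging rewards features that were selected in the improving draw:
# feature 0 rises from 0.1 to 0.55, so it is far likelier to be kept.
pbest_new = 0.5 * (x + z)       # → [0.55, 0.95, 0.25]
```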
2.2 Uniform combination
Binary BPSO can get stuck in a local optimum, so how do we escape it and restore global search? This is where uniform combination (UC) comes in. The idea: when Gbest stops improving, increase the probability of mutating Pbest genes. Concretely, a counter num is incremented in every generation in which Gbest does not improve, and is reset to 0 whenever Gbest improves; pc, the probability of mutating a particle's Pbest genes, grows with num. The mutation itself overwrites a few randomly chosen Pbest genes of a particle with perturbed genes taken from another randomly chosen particle's Pbest.
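The adaptive mutation probability can be sketched as follows, with the constants (cap 0.2, shift 5) taken from the reproduction code later in this post: pc grows sigmoidally with num, the number of consecutive generations in which Gbest has not improved.

```python
import math

def mutation_prob(num):
    # Sigmoid in num: near zero while the search is progressing,
    # approaching the cap 0.2 under prolonged stagnation.
    return 0.2 / (1 + math.exp(5 - num))

pc0 = mutation_prob(0)     # right after an improvement: pc is near zero
pc10 = mutation_prob(10)   # after long stagnation: pc approaches the cap 0.2
```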
3. Experimental Results in the Paper
The experimental setup uses an initial population of 20, with the per-dataset iteration counts given in the paper.
The paper reports the following results (shown there as tables and figures):
Classification accuracy of each algorithm on each dataset.
How accuracy and the number of selected features change as the algorithm iterates.
The concrete features selected during the iterations.
A comparison of Binary BPSO using both strategies against variants that drop uniform combination or the reinforced memory strategy.
4. My Reproduction and a Simple Experimental Check
I used four datasets and reproduced two algorithms, Binary PSO and Binary BPSO:
function test_y = knn(x, y, test_x, k, num)
% x: n-by-m training data, y: n-by-1 training labels
% test_x: n0-by-m test data, test_y: n0-by-1 predicted labels
% k: number of neighbours, num: number of label classes
n0 = size(test_x, 1);
test_y = zeros(n0, 1);
for i = 1 : n0
    ind = [];
    max_label = 0;
    Dist = sum((x - test_x(i, :)).^2, 2);
    % sort distances in ascending order and keep the k nearest
    [D, I] = sort(Dist);
    D = D(1:k, :);
    D_y = y(I(1:k), :);
    for j = 1 : num
        % count how many of the k neighbours belong to each class and
        % record the class (or tied classes) with the largest count
        label1 = sum(D_y == j);
        if label1 > max_label
            max_label = label1;
            ind = j;
        elseif label1 == max_label && label1 > 0
            ind = [ind; j];
        end
    end
    num_ind = size(ind, 1);
    min_dist = inf;
    % tie-break: among tied classes, choose the one whose neighbours have
    % the smallest total distance to the test sample
    for j1 = 1 : num_ind
        dist = sum(D(D_y == ind(j1)));
        if dist <= min_dist
            min_dist = dist;
            test_y(i, :) = ind(j1);
        end
    end
end
end
function F = evaluate(Z, data_x, data_y, num_lable)
% leave-one-out 1-NN accuracy for every particle's feature subset
% Z: binary selection matrix, one row per particle
num_simple = size(Z, 1);
F = zeros(num_simple, 1);
% 1-NN classifier
k = 1;
for i = 1 : num_simple
    data_x1 = data_x(:, Z(i, :) == 1);
    n = size(data_x1, 1);
    test_y = zeros(n, 1);
    for i1 = 1 : n
        % take one sample as the test set; the rest form the training set
        test_x = data_x1(i1, :);
        ind = [1 : i1 - 1, i1 + 1 : n];
        test_y(i1, :) = knn(data_x1(ind, :), data_y(ind, :), test_x, k, num_lable);
    end
    % number and rate of correct predictions
    acc_num = sum(test_y == data_y);
    acc_rate = acc_num / n;
    F(i, :) = acc_rate;
end
end
function [best_Zi, best_acc] = Bi_PSO(num_simple, num_iter, num_feature, data_x, data_y, num_lable)
% num_simple: swarm size; num_iter: number of iterations;
% num_feature: number of features per particle
% data_x: sample data; data_y: sample labels; num_lable: number of label classes
% best_Zi: feature-selection vector of the most accurate particle
% initialise the swarm: num_simple-by-num_feature
X = rand(num_simple, num_feature);
% convert X into the binary matrix Z
r1 = ones(num_simple, num_feature) * 0.5;
Z = zeros(num_simple, num_feature);
Z(r1 < X) = 1;
Z(r1 >= X) = 0;
pbestz = Z;
% classification accuracy F: num_simple-by-1
F = evaluate(Z, data_x, data_y, num_lable);
pbestf = F;
% gbest: index of the particle with the highest accuracy
[fbest, gbest] = max(pbestf);
fprintf("iter 0 feature: pbestz(%d,:) fbest: %.4f\n", gbest, fbest);
% the current positions start from the swarm that was just evaluated
presentz = Z;
vz = Z;
w = 1;
for i = 1 : num_iter
    r3 = rand(num_simple, num_feature);
    r4 = rand(num_simple, num_feature);
    % PSO velocity update, with independent random factors for the
    % cognitive and social terms
    vz = w .* vz + 2 * r3 .* (pbestz - presentz) + 2 * r4 .* (pbestz(gbest, :) - presentz);
    % clamp the velocity so the sigmoid s does not saturate at 0 or 1
    vz(vz > 10) = 10;
    vz(vz < -10) = -10;
    s = 1 ./ (1 + exp(-vz));
    r5 = rand(num_simple, num_feature);
    presentz(r5 < s) = 1;
    presentz(r5 >= s) = 0;
    presentf = evaluate(presentz, data_x, data_y, num_lable);
    % update each particle's personal best
    improved = presentf > pbestf;
    pbestz(improved, :) = presentz(improved, :);
    pbestf(improved, :) = presentf(improved, :);
    % update the global best
    [fbest, gbest] = max(pbestf);
    fprintf("iter %d feature: pbestz(%d,:) fbest: %.4f\n", i, gbest, fbest);
end
best_Zi = pbestz(gbest, :);
best_acc = fbest;
end
clc; clear;
% Ionosphere dataset
% load('lonosphere_X.mat', 'ionosphere');
% data_x = ionosphere;
% load('lonosphere_Y.mat', 'ionosphere1');
% data_y = ionosphere1;
% Vehicle dataset
% load('Vehicle_X.mat')
% load('Vehicle_Y.mat')
% data_x = xaa;
% data_y = xaa1;
% Sonar dataset (both .mat files store a variable named sonar1)
load('sonar_X.mat')
data_x = sonar1;
load('sonar_Y.mat')
data_y = sonar1;
% WDBC (breast cancer) dataset
% load('WDBC_X.mat')
% data_x = UCIBeastcancerwdbc;
% load('WDBC_Y.mat')
% data_y = UCIBeastcancerwdbc;
num_lable = 2;
% num_lable = 4;   % for the Vehicle dataset
num_simple = 20;
num_iter = 100;
num_feature = size(data_x, 2);
best_Z1 = zeros(1, num_feature);
best_acc1 = 0;
% for i = 1 : 10   % multiple independent runs
for i = 1
    [best_Zi, best_acc] = Bi_PSO(num_simple, num_iter, num_feature, data_x, data_y, num_lable);
    if best_acc > best_acc1
        best_acc1 = best_acc;
        best_Z1 = best_Zi;
    end
end
best_feature = find(best_Z1 == 1);
disp(sum(best_Z1));
disp(best_feature);
function [best_Zi, best_acc] = Bi_BPSO(num_simple, num_iter, num_feature, data_x, data_y, num_lable)
% num_simple: swarm size; num_iter: number of iterations;
% num_feature: number of features per particle
% data_x: sample data; data_y: sample labels; num_lable: number of label classes
% best_Zi: feature-selection vector of the most accurate particle
% initialise the swarm: num_simple-by-num_feature
X = rand(num_simple, num_feature);
% convert X into the binary matrix Z
r1 = ones(num_simple, num_feature) * 0.5;
Z = zeros(num_simple, num_feature);
Z(r1 < X) = 1;
Z(r1 >= X) = 0;
pbestx = X;
% classification accuracy F: num_simple-by-1
F = evaluate(Z, data_x, data_y, num_lable);
pbestf = F;
% gbest: index of the particle with the highest accuracy
[fbest, gbest] = max(pbestf);
fprintf("iter 0 feature: pbestz(%d,:) fbest: %.4f\n", gbest, fbest);
num1 = 0;   % generations since the last Gbest improvement
for i = 1 : num_iter
    s = zeros(num_simple, num_feature);
    % bare bones position update: keep pbestx with probability 0.5,
    % otherwise sample from a Gaussian between pbestx and the global best
    r = rand;
    if r > 0.5
        x = pbestx;
    else
        x = normrnd((pbestx + pbestx(gbest, :)) ./ 2, abs(pbestx - pbestx(gbest, :)));
    end
    r2 = rand(num_simple, num_feature);
    s(r2 < x) = 1;
    s(r2 >= x) = 0;
    presentf = evaluate(s, data_x, data_y, num_lable);
    % the reinforced memory strategy: a particle's personal best is
    % replaced by the average of x and the realised binary vector s,
    % either when its accuracy rises, or when accuracy ties but fewer
    % features are selected; both masks are computed before any update
    improved = presentf > pbestf;
    tied = (presentf == pbestf) & (sum(Z, 2) > sum(s, 2));
    update = improved | tied;
    pbestx(update, :) = 0.5 * (x(update, :) + s(update, :));
    pbestf(update, :) = presentf(update, :);
    Z(update, :) = s(update, :);
    % uniform combination (UC): mutate Pbest genes with probability pc,
    % which grows the longer Gbest stagnates
    pc = 0.2 / (1 + exp(5 - num1));
    for i1 = 1 : num_simple
        r3 = rand;
        if r3 < pc
            k = randi(num_simple, 1, 1);
            u_n = ceil(pc * num_feature);
            for i2 = 1 : u_n
                l = randi(num_feature, 1, 1);
                pbestx(i1, l) = pbestx(k, l) + rand;
            end
        end
    end
    % update the global best and the stagnation counter
    [fbest1, gbest] = max(pbestf);
    if fbest1 > fbest
        num1 = 0;
    else
        num1 = num1 + 1;
    end
    fbest = fbest1;
    fprintf("iter %d feature: pbestz(%d,:) fbest: %.4f\n", i, gbest, fbest);
end
best_Zi = Z(gbest, :);
best_acc = fbest;
end
clc; clear;
% Ionosphere dataset
% load('lonosphere_X.mat', 'ionosphere');
% data_x = ionosphere;
% load('lonosphere_Y.mat', 'ionosphere1');
% data_y = ionosphere1;
% Vehicle dataset
% load('Vehicle_X.mat')
% load('Vehicle_Y.mat')
% data_x = xaa;
% data_y = xaa1;
% Sonar dataset (both .mat files store a variable named sonar1)
load('sonar_X.mat')
data_x = sonar1;
load('sonar_Y.mat')
data_y = sonar1;
% Breast cancer (WDBC) dataset
% load('Beast-cancer_X.mat')
% data_x = UCIBeastcancerwdbc;
% load('Beast-cancer_Y.mat')
% data_y = UCIBeastcancerwdbc;
num_lable = 2;
num_simple = 20;
num_iter = 100;
num_feature = size(data_x, 2);
best_Z1 = zeros(1, num_feature);
best_acc1 = 0;
% for i = 1 : 10   % multiple independent runs
for i = 1
    [best_Zi, best_acc] = Bi_BPSO(num_simple, num_iter, num_feature, data_x, data_y, num_lable);
    if best_acc > best_acc1
        best_acc1 = best_acc;
        best_Z1 = best_Zi;
    end
end
best_feature = find(best_Z1 == 1);
disp(sum(best_Z1));
disp(best_feature);
disp(best_acc1);
For now, only the test results on the sonar and ionosphere datasets are shown:
Binary PSO, sonar dataset, accuracy:
Binary PSO, sonar dataset, number of selected features, settling at 31 (out of 60):
Binary PSO, ionosphere dataset, accuracy:
Binary PSO, ionosphere dataset, number of selected features, settling at 9 (out of 34):
Binary BPSO, sonar dataset, accuracy:
Binary BPSO, sonar dataset, number of selected features, settling at 30 (out of 60):
Binary BPSO, ionosphere dataset, accuracy:
Binary BPSO, ionosphere dataset, number of selected features, settling at 9 (out of 34):
If you spot any mistakes, please get in touch so I can correct them.