Multinomial Distribution Learning for Effective Neural Architecture Search


Contributions:

1. Proposes multinomial distribution learning for highly efficient NAS, which treats the search space as a joint multinomial distribution.

2. Proposes a performance ranking hypothesis, which can be incorporated into existing NAS algorithms to speed up their search.

3. The proposed method achieves remarkable search efficiency, e.g., 2.55% test error on CIFAR-10 in 4 hours with a single GTX 1080Ti, which is attributed to the distribution learning being entirely different from RL-based and differentiable methods.

Search Space

Search Algorithm

In the table on the right (from the paper), the epoch number of operation 1 is 10, meaning this operation has been sampled 10 times over all search epochs so far.
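The per-operation epoch count is just a tally of how often each operation has been sampled. A minimal sketch (the operation indices and sampling history below are made up for illustration):

```python
from collections import Counter

# Hypothetical record of which operation index was sampled at each search epoch.
sampled_ops = [1, 3, 1, 1, 2, 1, 1, 3, 1, 1, 1, 2, 1, 1]

# epoch_count[i] = how many epochs operation i has been trained so far
epoch_count = Counter(sampled_ops)

# Operation 1 was selected 10 times among all epochs, matching the example above.
print(epoch_count[1])  # → 10
```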

The overall search algorithm:

(1) Sample one operation in the search space according to the corresponding multinomial distribution with parameters θ. That is, each operation is resampled with the probability given by θ.

(2) Train the generated network with one forward and backward propagation.

(3) Test the network on the validation set and record the feedback (epoch and accuracy).

(4) Update the distribution parameters according to the proposed distribution learning algorithm.
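Steps (1)–(4) above can be sketched as a simple loop. This is an illustrative skeleton, not the paper's code: the number of candidate operations `K`, the step count, and the `train_and_eval` placeholder are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 8                         # number of candidate operations (assumption)
theta = np.full(K, 1.0 / K)   # multinomial parameters, initialized uniform

# per-operation feedback records, cf. steps (1)-(4)
epochs = np.zeros(K)          # how many times each operation has been sampled
accuracy = np.zeros(K)        # latest validation accuracy seen for each op

def train_and_eval(op):
    """Placeholder for one forward/backward pass plus validation (hypothetical)."""
    return rng.uniform(0.5, 1.0)

for step in range(100):
    op = rng.choice(K, p=theta)   # (1) sample an operation according to theta
    acc = train_and_eval(op)      # (2)+(3) train one step, evaluate on validation
    epochs[op] += 1               # (3) record the feedback
    accuracy[op] = acc
    # (4) update theta with the paper's distribution learning rule
```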

How to update the probability:

1. Define the differential of epoch and the differential of accuracy as:

2. The parameters of the multinomial distribution can be updated through:
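The exact update equations are given in the paper; the sketch below only follows the stated intuition, namely that an operation trained for fewer epochs yet reaching higher accuracy should gain probability. The differential definitions and the step size `alpha` here are illustrative assumptions, not the paper's formulas.

```python
import numpy as np

def update_theta(theta, epochs, accuracy, alpha=0.01):
    """Illustrative distribution update (hypothetical form of the paper's rule)."""
    # Assumed "differentials": deviation of each operation from the mean.
    d_epoch = epochs.mean() - epochs      # positive -> trained less than average
    d_acc = accuracy - accuracy.mean()    # positive -> better than average
    # Boost operations that do better with less training, then renormalize.
    theta = theta + alpha * d_acc * (d_epoch > 0)
    theta = np.clip(theta, 1e-8, None)
    return theta / theta.sum()
```

For example, an operation with 1 training epoch and 0.9 accuracy would gain probability relative to operations with 5 epochs and 0.5 accuracy.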

Highlights

1. The Performance Ranking Hypothesis.

It feels like the introduction of dropout: the kind of idea I sense I almost had myself, yet could not have articulated before seeing it. Novel and familiar at the same time.

2. The definitions of the "partial derivatives" of epoch and accuracy (strictly speaking, they probably should not be called partial derivatives), together with the final probability update rule, are quite interesting. They are similar in spirit to the earlier paper on loss-function design.

Experiment