Problem:
Running the grid-search hyperparameter tuning example from the pyss3 documentation fails in the following environment:
scikit-learn==0.23.2
pyss3==0.6.4
from pyss3 import SS3
from pyss3.util import Evaluation

# x_train / y_train: training documents and their labels, loaded elsewhere
clf = SS3()
best_s, best_l, best_p, _ = Evaluation.grid_search(
    clf, x_train, y_train,
    s=[0.2, 0.32, 0.44, 0.56, 0.68, 0.8],
    l=[0.1, 0.48, 0.86, 1.24, 1.62, 2],
    p=[0.5, 0.8, 1.1, 1.4, 1.7, 2],
    k_fold=10
)
Error:
  File "x.py", line 85, in <module>
    batch()
  File "x.py", line 54, in batch
    best_s, best_l, best_p, _ = Evaluation.grid_search(
  File "x\lib\site-packages\pyss3\util.py", line 1638, in grid_search
    Evaluation.__grid_search_loop__(
  File "x\lib\site-packages\pyss3\util.py", line 809, in __grid_search_loop__
    Evaluation.__evaluation_result__(
  File "x\lib\site-packages\pyss3\util.py", line 661, in __evaluation_result__
    classification_report(
  File "x\lib\site-packages\sklearn\utils\validation.py", line 72, in inner_f
    return f(**kwargs)
  File "x\lib\site-packages\sklearn\metrics\_classification.py", line 1979, in classification_report
    name_width = max(len(cn) for cn in target_names)
ValueError: max() arg is an empty sequence
Process finished with exit code 1
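The last frame of the traceback can be reproduced without pyss3 or scikit-learn installed. The sketch below copies only the failing expression from the classification_report line shown in the traceback. The function name name_width is made up for illustration.

```python
def name_width(target_names):
    """Mirror the failing line from the traceback: width of the longest label."""
    # Same expression as the classification_report line in the traceback.
    return max(len(cn) for cn in target_names)


# With real labels the expression works as intended.
print(name_width(["xxx", "yyy"]))

# With no labels, max() has nothing to compare and raises ValueError.
try:
    name_width([])
except ValueError as err:
    print("ValueError:", err)
```

So the error says nothing about the data itself. It only means that the list of label names reaching classification_report was empty.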
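Cause and solution:
The grid search calls sklearn's classification_report, which needs the label names to be passed in. At this point the __categories__ attribute of the SS3 object is still [], so max() receives an empty sequence and raises the error.
__categories__ is only populated when SS3.fit runs, so the classifier has to be trained once before the grid search, i.e. add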
clf.fit(x_train, y_train)
Or set the labels directly, i.e. add
clf.__categories__ = ['xxx', 'yyy']
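Postscript
The grid search automatically caches the performance of every parameter combination, so a second run reads the results straight from the cache.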