Statistical Learning and Machine Learning

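Brendan O’Connor first drafted his blog post "Statistics vs. Machine Learning, fight!" in 2008. Perhaps because of his machine learning background, that draft mostly disparaged statistics. Its reasoning is somewhat similar to [1]: machine learning adds algorithmic modeling on top of statistics, for example the max-margin principle of SVMs and decision trees, and it is also the more practical of the two. In October 2009, however, he abandoned that view and concluded that statistics is the real deal: "Statistics, not machine learning, is the real deal, but unfortunately suffers from bad marketing."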

Machine learning | Statistics
network, graphs | model
weights | parameters
learning | fitting
generalization | test set performance
supervised learning | regression/classification
unsupervised learning | density estimation, clustering
large grant = $1,000,000 | large grant = $50,000
nice place to have a meeting: Snowbird, Utah, French Alps | nice place to have a meeting: Las Vegas in August
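Several rows of the glossary above describe the same operation under two names. The sketch below is only an illustration of that point and is not from the original post: the synthetic data, the helper names `fit_line` and `mean_squared_error`, and the 150/50 split are all assumptions. It fits a one-variable least-squares line and labels each step with both vocabularies.

```python
import random


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = w * x + b; returns (w, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


def mean_squared_error(xs, ys, w, b):
    """Average squared prediction error of the line on (xs, ys)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical synthetic data: y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.1) for x in xs]

# Hold out the last 50 points as a test set.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

# ML: "learning the weights"; statistics: "fitting the parameters".
w, b = fit_line(train_x, train_y)

# ML: "generalization"; statistics: "test set performance".
test_mse = mean_squared_error(test_x, test_y, w, b)

print(f"w={w:.3f}, b={b:.3f}, test MSE={test_mse:.4f}")
```

The code is identical whichever column of the glossary you read it with; only the comments change.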

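Differences in research methodology

  • Statistics focuses on formalization and derivation.
  • Machine learning is more tolerant of new methods.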

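Differences in dimensionality

  • Statistics emphasizes formal statistical inference for low-dimensional problems (confidence intervals, hypothesis tests, optimal estimators).
  • Machine learning emphasizes high-dimensional prediction problems.

Areas that each field cares about more:

  • Statistics: survival analysis, spatial analysis, multiple testing, minimax theory, deconvolution, semiparametric inference, bootstrapping, time series.
  • Machine learning: online learning, semisupervised learning, manifold learning, active learning, boosting.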
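As a rough illustration of the low-dimensional inference that statistics emphasizes, the sketch below computes a normal-approximation confidence interval for a single population mean. It is an assumption-laden example and not part of the original post: the sample is synthetic, and `mean_confidence_interval` is a hypothetical helper built on Python's standard `statistics` module.

```python
import random
import statistics


def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    center = statistics.mean(sample)
    # Standard error of the mean, using the sample standard deviation.
    std_err = statistics.stdev(sample) / n ** 0.5
    # Two-sided critical value, about 1.96 for a 95% level.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return center - z * std_err, center + z * std_err


# Hypothetical sample: 400 draws from a normal distribution (mean 5, sd 2).
rng = random.Random(42)
sample = [rng.gauss(5.0, 2.0) for _ in range(400)]

low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

Here the quantity of interest is one parameter and the output is a statement about its uncertainty. A high-dimensional prediction problem would typically be judged by held-out error instead.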

Terminology differences between statistics and machine learning:

Statistics | Machine learning
Estimation | Learning
Classifier | Hypothesis
Data point | Example/Instance
Regression | Supervised Learning
Classification | Supervised Learning
Covariate | Feature
Response | Label