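Brendan O'Connor first drafted his blog post "Statistics vs. Machine Learning, fight!" in 2008. Perhaps because of his machine learning background, that draft mostly disparaged statistics. Its thinking is somewhat similar to [1]: he argued that machine learning covers more algorithmic modeling than statistics does, for example SVM's max-margin and decision trees, and that machine learning is more practically oriented. In October 2009, however, he abandoned his original view and concluded that statistics is the real deal: "Statistics, not machine learning, is the real deal, but unfortunately suffers from bad marketing."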
| Machine learning | Statistics |
|---|---|
| network, graphs | model |
| weights | parameters |
| learning | fitting |
| generalization | test set performance |
| supervised learning | regression/classification |
| unsupervised learning | density estimation, clustering |
| large grant = $1,000,000 | large grant = $50,000 |
| nice place to have a meeting: Snowbird, Utah, French Alps | nice place to have a meeting: Las Vegas in August |
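Differences in research approach
- Statistics focuses on formalization and derivation
- Machine learning is more tolerant of new methods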
Differences in dimensionality
- Statistics emphasizes formal statistical inference (confidence intervals, hypothesis tests, optimal estimators) on low-dimensional problems
- Machine learning emphasizes high-dimensional prediction problems
- Areas each field cares more about:
  - Statistics: survival analysis, spatial analysis, multiple testing, minimax theory, deconvolution, semiparametric inference, bootstrapping, time series.
  - Machine learning: online learning, semisupervised learning, manifold learning, active learning, boosting.
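The contrast between inference and prediction in the list above can be made concrete with a small sketch. It fits one straight line and then asks two different questions of it: a statistics-style question (how uncertain is the slope?) and a machine-learning-style question (how well does the model predict held-out data?). The synthetic data, the `fit_line` helper, and the normal approximation used for the interval are all assumptions made for illustration; none of them come from the posts discussed here.

```python
import random
from statistics import NormalDist, fmean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, se_b)."""
    n = len(xs)
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual variance estimated with n - 2 degrees of freedom
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se_b = (rss / (n - 2) / sxx) ** 0.5
    return a, b, se_b


rng = random.Random(0)
# Synthetic data (hypothetical): y = 1 + 2x + Gaussian noise
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

# Hold out the last 50 points as a test set
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

a, b, se_b = fit_line(train_x, train_y)

# Statistics-style question: a 95% confidence interval for the slope
z = NormalDist().inv_cdf(0.975)  # normal approximation to the t quantile
ci_low, ci_high = b - z * se_b, b + z * se_b

# Machine-learning-style question: mean squared error on unseen data
test_mse = fmean([(y - (a + b * x)) ** 2 for x, y in zip(test_x, test_y)])

print(f"slope = {b:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
print(f"held-out MSE = {test_mse:.3f}")
```

Both numbers come from the same fitted model. The difference lies in which quantity is treated as the result: the interval around a parameter, or the error on new data.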
Terminology differences between statistics and machine learning:
| Statistics | Machine learning |
|---|---|
| Estimation | Learning |
| Classifier | Hypothesis |
| Data point | Example/Instance |
| Regression | Supervised Learning |
| Classification | Supervised Learning |
| Covariate | Feature |
| Response | Label |