2 Practical Methodology
General guide for an ML task🤗
2.1 Model Bias
If the model is too simple, training is like looking for a needle in the sea when the needle isn't there: no function in the model's set can reach a small loss
2.2 Optimization Issue
an example of an optimization issue, not overfitting
the fact that a more flexible model can't do better on the training set indicates a problem with optimization
gradient descent has serious limitations; other optimization methods are covered later in this class
how to differentiate an optimization issue from model bias? if a deeper network does not obtain a smaller loss on the training data than a shallower one, then there is an optimization issue
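The comparison above can be sketched in a few lines. This is only my own illustration: the function name `diagnose` and the loss numbers are made up, and it assumes the training losses of several depths have already been measured.

```python
def diagnose(train_loss_by_depth):
    """Compare training losses of models ordered from shallow to deep.

    train_loss_by_depth: list of (depth, training_loss) pairs.
    The input format is an assumption made for this sketch.
    """
    ordered = sorted(train_loss_by_depth)
    # Walk through neighbouring pairs: (shallower model, deeper model).
    for (d_small, loss_small), (d_big, loss_big) in zip(ordered, ordered[1:]):
        # A deeper model can represent everything the shallower one can,
        # so a larger *training* loss points at optimization, not at bias.
        if loss_big > loss_small:
            return (f"optimization issue: depth {d_big} trains worse "
                    f"than depth {d_small}")
    return ("no optimization issue detected; if training loss is still "
            "large, suspect model bias")


# Hypothetical numbers, for illustration only.
print(diagnose([(20, 0.10), (56, 0.15)]))
print(diagnose([(1, 0.50), (2, 0.40), (3, 0.35)]))
```

The check only looks at training loss; testing loss is deliberately left out, because a gap between training and testing loss is the overfitting case, not this one.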
2.3 Overfitting
problem caused by model flexibility
get more training data
data augmentation
give the model some restrictions to limit its flexibility
other methods include early stopping, regularization and dropout
pick a moderate model to avoid overfitting
//I think of those who can really lift weights but lack genuine muscle strength
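Of the remedies listed, early stopping is the easiest to sketch. The snippet below is my own minimal version, not something from the class: the name `early_stopping_epoch`, the `patience` parameter, and the loss values are all assumptions, and it works on a list of validation losses recorded in advance instead of a real training loop.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the best epoch seen before stopping.

    Training is considered stopped once the validation loss has gone
    `patience` epochs without improving on the best value so far.
    """
    best_epoch = 0
    best_loss = float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember where it happened.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop looking.
            break
    return best_epoch


# Hypothetical validation losses: they fall, then rise as the model overfits.
losses = [1.0, 0.8, 0.7, 0.75, 0.9, 0.6]
print(early_stopping_epoch(losses, patience=2))  # 2
```

With `patience=2` the loop stops at epoch 4 and never sees the later value 0.6, which is the usual trade-off: a small patience saves time but can stop too early.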
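2.4 Cross Validation
//if the public testing set is abused, machine learning also has a brute-force solution, but the resulting function has little practical use
cross validation: split the training set into a training set and a validation set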
ideally, pick the model by its validation-set loss and ignore the public testing set
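A rough sketch of that workflow, assuming the data fits in a plain list. The helper names `train_val_split` and `pick_model`, the 80/20 ratio, and the model names and loss values are all made up for illustration.

```python
import random


def train_val_split(samples, val_ratio=0.2, seed=0):
    """Shuffle the samples and hold out a fraction as the validation set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_val = int(len(shuffled) * val_ratio)
    return shuffled[n_val:], shuffled[:n_val]  # (training set, validation set)


def pick_model(val_loss_by_model):
    """Choose the model with the smallest validation loss."""
    return min(val_loss_by_model, key=val_loss_by_model.get)


train_set, val_set = train_val_split(range(10), val_ratio=0.2)
print(len(train_set), len(val_set))  # 8 2

# Hypothetical validation losses of three candidate models.
print(pick_model({"model_1": 0.9, "model_2": 0.4, "model_3": 0.5}))  # model_2
```

The public testing set never appears in this code, which is the point: the choice is made on held-out training data only.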
N-fold Cross Validation
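A minimal sketch of N-fold cross validation on sample indices, using only the standard library. The names `n_fold_indices` and `cross_validate`, and the `evaluate` callback that trains on one index list and returns a validation loss for the other, are assumptions of this sketch.

```python
def n_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) once per fold.

    Each sample lands in the validation part of exactly one fold.
    """
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds get one additional sample.
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def cross_validate(n_samples, n_folds, evaluate):
    """Average the validation loss returned by `evaluate` over all folds."""
    losses = [evaluate(tr, va) for tr, va in n_fold_indices(n_samples, n_folds)]
    return sum(losses) / len(losses)


for train_idx, val_idx in n_fold_indices(6, 3):
    print(train_idx, val_idx)
```

Running `cross_validate` once per candidate model and keeping the one with the lowest average loss gives a more stable choice than a single split, at the cost of training each model N times.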
2.5 Mismatch
mismatch means the training set and the testing set come from different distributions, e.g., training on data from 2020 and testing on data from 2021
at the end of the class, an interesting question came up: how can a model trained on realistic pictures recognize drawings? I think it has to do with abstract features that both kinds of image share, which is kind of philosophical
after-class thoughts
Mr. Lee's classes are bilingual and educational but not too hard. He isn't afraid to immerse students in English, that's cool