NLP basics: www.cnblogs.com/mokoaxx/p/1…
NLP learning roadmap: www.jiqizhixin.com/articles/20…
NLP learning roadmap: blog.csdn.net/flyfor2013/…
String distance and similarity (see the sketch after this list)
- Jaccard
- Levenshtein
- Cosine
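A minimal pure-Python sketch of the three measures above; the token sets and vectors are made-up inputs for illustration:

```python
import math

def jaccard(a, b):
    """Jaccard similarity of two token sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def levenshtein(s, t):
    """Edit distance via the classic two-row dynamic program."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                # delete from s
                            curr[j - 1] + 1,            # insert into s
                            prev[j - 1] + (cs != ct)))  # substitute
        prev = curr
    return prev[-1]

def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

print(jaccard("the cat sat".split(), "the cat ran".split()))  # 0.5
print(levenshtein("kitten", "sitting"))                       # 3
print(cosine([1, 0, 1], [1, 1, 0]))                           # 0.5
```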
Document classification (see the sketch after this list)
- Logistic Regression
- Naive Bayes Classifier
- Support Vector Machine
- Decision Tree
- LASSO Regression
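As one concrete instance of the classifiers listed above, a hedged scikit-learn sketch using Naive Bayes over TF-IDF features; the tiny labeled corpus is invented, and swapping in `LogisticRegression` or `LinearSVC` works the same way:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy labeled corpus, purely for illustration
train_texts = ["good movie", "great acting", "bad plot", "terrible film"]
train_labels = ["pos", "pos", "neg", "neg"]

# Vectorize documents, then fit a Naive Bayes classifier
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(train_texts, train_labels)
print(clf.predict(["great movie"]))  # expected: ['pos']
```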
Document clustering (see the sketch after this list)
- k-means (Lloyd | Euclidean | Voronoi Diagram | Spherical | Cosine)
- GMM
- DBSCAN
- Bayesian GMM
- Hierarchical Clustering
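A scikit-learn sketch of k-means document clustering on an invented corpus. L2-normalizing the TF-IDF rows makes Euclidean k-means behave like the spherical (cosine) variant noted above, since ‖u − v‖² = 2 − 2·cos(u, v) on unit vectors:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import normalize
from sklearn.cluster import KMeans

texts = ["cats purr", "dogs bark", "cats and dogs are pets",
         "stocks rose", "bonds fell", "stocks and bonds are assets"]

X = normalize(TfidfVectorizer().fit_transform(texts))  # unit-length rows
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)  # pet documents and finance documents should separate
```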
Document bag-of-words (Bag of Words Model)
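A minimal sketch of the bag-of-words representation with scikit-learn's `CountVectorizer`; the corpus is invented, and `get_feature_names_out` assumes scikit-learn ≥ 1.0:

```python
from sklearn.feature_extraction.text import CountVectorizer

corpus = ["the cat sat", "the cat sat on the mat"]
vec = CountVectorizer()
X = vec.fit_transform(corpus)
print(vec.get_feature_names_out())  # vocabulary, one column per word
print(X.toarray())                  # term counts: word order is discarded
```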
Document embedding (Embedding; see the Doc2Vec sketch below)
- Doc2Vec
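A hedged gensim sketch of Doc2Vec (API as in gensim 4.x; corpus and hyperparameters are illustrative only):

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = ["the cat sat on the mat", "dogs chase cats", "stocks and bonds"]
docs = [TaggedDocument(words=text.split(), tags=[i])
        for i, text in enumerate(corpus)]

model = Doc2Vec(docs, vector_size=50, window=2, min_count=1, epochs=40)
vec = model.infer_vector("cats and dogs".split())  # embed an unseen document
print(vec.shape)  # (50,)
```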
Activation functions (see the sketch after this list)
- Why activation functions matter (Reference 1, Reference 2)
- Sigmoid, tanh, ReLU (Reference 1) (Reference 2)
- Gradient descent
- Vanishing and exploding gradients (Reference 1)
- Concave and convex functions. Note: the Chinese mathematical convention for concavity/convexity is the opposite of the Western one; "convex" here means convex downward (bowl-shaped), and "concave" means convex upward.
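A NumPy sketch of the three activations and why saturation causes vanishing gradients: sigmoid'(x) = σ(x)(1 − σ(x)) ≤ 0.25, so a chain of sigmoid layers multiplies gradients by factors of at most 0.25, while ReLU passes a gradient of exactly 1 wherever the unit is active:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # range (0, 1); saturates for large |x|

def tanh(x):
    return np.tanh(x)                 # range (-1, 1); zero-centered

def relu(x):
    return np.maximum(0.0, x)         # gradient is exactly 1 for x > 0

x = np.linspace(-5.0, 5.0, 11)
s = sigmoid(x)
print((s * (1.0 - s)).max())  # 0.25: the largest gradient sigmoid can ever pass
```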
Sequential Model (see the sketch after this list)
- LSTM (Cell State)
- RNN (Bi-directional RNN | DeepRNN | Back-Propagation Through Time | Vanishing Gradient)
- GRU
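A minimal PyTorch sketch contrasting the modules above (shapes and sizes are arbitrary). The LSTM carries a separate cell state `c_n`; the GRU folds it into the hidden state; a bidirectional RNN doubles the output width:

```python
import torch
import torch.nn as nn

x = torch.randn(4, 10, 8)  # (batch, sequence length, input features)

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True, bidirectional=True)
out, (h_n, c_n) = lstm(x)  # c_n is the LSTM cell state
print(out.shape)           # (4, 10, 32): two directions * hidden_size

gru = nn.GRU(input_size=8, hidden_size=16, num_layers=2, batch_first=True)
out, h_n = gru(x)          # GRU has no separate cell state
print(out.shape)           # (4, 10, 16)
```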
Seq2seq
Word Embedding
- Character Embedding
- Topic Modeling (see the LDA sketch after this list)
- LSA (LSI) (Probabilistic LSA, Multinomial, SVD)
- Sparse Coding
- NMF
- Matrix Factorization
- Latent Dirichlet Allocation (Dirichlet, Gibbs Sampling, Perplexity)
- One-hot representation
- Curse of Dimensionality
- Dimensionality Reduction
- Distributed Representation
- Co-Occurrences (n-gram) (Word2Vec (CBOW, Skip-gram), GloVe, FastText, Co-Occurrences + Shifted PPMI); see the Word2Vec sketch after this list
- Implicit Matrix Factorization
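For the topic-modeling entries above, a scikit-learn LDA sketch on an invented corpus (note that scikit-learn fits LDA with variational Bayes rather than Gibbs sampling):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = ["cats and dogs are pets", "dogs chase cats",
          "stocks and bonds are assets", "bonds pay interest"]

X = CountVectorizer().fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
print(lda.transform(X).round(2))  # per-document topic mixtures (Dirichlet-distributed)
```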
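And for the co-occurrence-based embeddings, a hedged gensim Word2Vec sketch (gensim 4.x API; `sg=1` selects skip-gram, `sg=0` selects CBOW; the toy corpus is far too small for meaningful vectors):

```python
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat"], ["the", "dog", "ran"],
             ["cats", "and", "dogs", "are", "pets"]]

model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)
print(model.wv["cat"].shape)              # (50,): a dense distributed representation
print(model.wv.most_similar("cat", topn=2))
```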
Linear Regression (see the sketch after this list)
- MSE
- Regularized
- Ridge(L2)
- LASSO(L1)
- Elastic Net(L1+L2)
- Frobenius Norm
- Norm
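A scikit-learn sketch contrasting the penalties above on synthetic data where only one feature matters: the L1 penalty (LASSO) drives irrelevant coefficients exactly to zero, the L2 penalty (Ridge) only shrinks them, and Elastic Net mixes both:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)  # only feature 0 is informative

for model in (Ridge(alpha=1.0), Lasso(alpha=0.1), ElasticNet(alpha=0.1, l1_ratio=0.5)):
    model.fit(X, y)
    nonzero = int(np.sum(np.abs(model.coef_) > 1e-6))
    print(f"{type(model).__name__}: {nonzero} nonzero coefficients out of 20")
```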
Regularization (against overfitting; see the sketch after this list)
- Early Stopping
- Weight Decay
- Dropout
- Normalization
- Layer Normalization (LN)
- Batch Normalization (BN): difference between BN and LN (Reference 1)
- Weight Normalization (WN)
- Comparison of BN, LN, and WN
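A PyTorch sketch combining several of these techniques (layer sizes are arbitrary). Weight decay (the L2 penalty) is applied through the optimizer, and Dropout and BatchNorm behave differently in train vs. eval mode:

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(64, 128),
    nn.BatchNorm1d(128),  # BN: normalize each feature across the batch
                          # (nn.LayerNorm(128) would normalize across features per sample)
    nn.ReLU(),
    nn.Dropout(p=0.5),    # randomly zero half the activations during training
    nn.Linear(128, 10),
)

# Weight decay = L2 penalty, handled by the optimizer rather than the loss
opt = torch.optim.SGD(net.parameters(), lr=0.1, weight_decay=1e-4)

x = torch.randn(32, 64)
net.train(); y_train = net(x)  # dropout and batch statistics active
net.eval();  y_eval = net(x)   # dropout off; BN uses running statistics
```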
Logistic Regression (see the training sketch at the end of this list)
- From Generative to Discriminative Proof
- logits
- Activation Function
- Sigmoid
- tanh
- ReLU
- Leaky ReLU
- PReLU
- ELU
- Maxout
- Representation Learning
- Maximizing Likelihood is Minimizing Cross-Entropy Proof
- Gradient Descent
- Taylor Series Proof
- Convex Function
- Jensen's inequality
- Learning Rate
- Backpropagation Proof
- Gradient Vanishing & Exploding Proof
- Finite Difference Method
- SGD (b = 1: one sample per update)
- Momentum optimization
- NAG optimization
- Adagrad optimization
- RMSProp optimization
- AdaDelta optimization
- Adam optimization
- Nadam optimization
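A NumPy sketch tying this block together: logistic regression trained by plain gradient descent on invented, linearly separable data. Because the sigmoid is paired with cross-entropy (the negative log-likelihood), the gradient with respect to the logit z simplifies to p − y:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # invented linearly separable labels

w, b, lr = np.zeros(2), 0.0, 0.5
for _ in range(500):
    p = sigmoid(X @ w + b)       # predicted P(y = 1 | x)
    grad_z = (p - y) / len(y)    # d(mean cross-entropy)/dz = p - y
    w -= lr * (X.T @ grad_z)     # full-batch step; SGD would use b = 1 samples
    b -= lr * grad_z.sum()

p = np.clip(p, 1e-12, 1 - 1e-12)  # guard log(0) on near-separable data
loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
print(f"cross-entropy {loss:.4f}, accuracy {((p > 0.5) == y).mean():.2%}")
```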