Logistic Regression

What is logistic regression

Logistic regression is a supervised, discriminative, parametric model.

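Core formula

$$\eta(\boldsymbol{x})=\frac{1}{1+e^{-\left(\boldsymbol{w}^T \boldsymbol{x}+b\right)}}$$

For a binary classification problem, $\eta(\boldsymbol{x})$ can be interpreted as the probability that the label is 1, i.e. $P(y=1 \mid \boldsymbol{x})$.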
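As a minimal sketch of the formula above, the snippet below evaluates $\eta(\boldsymbol{x})$ for one input using only the Python standard library. The function names `sigmoid` and `predict_proba`, the example weights, and the 0.5 decision threshold are illustrative assumptions, not something the formula itself prescribes.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0 here, so exp(z) cannot overflow
    return ez / (1.0 + ez)


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Return eta(x) = sigmoid(w^T x + b), the modeled P(y = 1 | x)."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return sigmoid(z)


# Hypothetical weights, bias and input, chosen only for illustration.
p = predict_proba([2.0, -1.0], 0.5, [1.0, 2.5])
label = 1 if p >= 0.5 else 0  # assumed threshold of 0.5
```

Branching on the sign of `z` keeps `math.exp` from overflowing for large negative inputs, which the textbook form `1 / (1 + exp(-z))` would do.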

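Solving

First, treating the label as Bernoulli-distributed (a binomial distribution with a single trial), the probability of a single sample is

$$p(y \mid \boldsymbol{x} ; \boldsymbol{\theta})=\eta(\boldsymbol{x} ; \boldsymbol{\theta})^y(1-\eta(\boldsymbol{x} ; \boldsymbol{\theta}))^{1-y}, \quad y=0,1,$$

where $\boldsymbol{\theta}$ collects the parameters $\boldsymbol{w}$ and $b$.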

The parameters are then estimated by maximum likelihood. Assuming the $n$ samples are independent, the likelihood is

$$L(\boldsymbol{\theta}):=\prod_{i=1}^n p\left(y_i \mid \boldsymbol{x}_i ; \boldsymbol{\theta}\right)$$

Taking the logarithm gives

$$\begin{aligned} \ell(\boldsymbol{\theta}) & :=\log L(\boldsymbol{\theta}) \\ & =\sum_{i=1}^n \log p\left(y_i \mid \boldsymbol{x}_i ; \boldsymbol{\theta}\right) \\ & =\sum_{i=1}^n\left[y_i \log \left(\frac{1}{1+e^{-\boldsymbol{\theta}^T \tilde{\boldsymbol{x}}_i}}\right)+\left(1-y_i\right) \log \left(\frac{e^{-\boldsymbol{\theta}^T \tilde{\boldsymbol{x}}_i}}{1+e^{-\boldsymbol{\theta}^T \tilde{\boldsymbol{x}}_i}}\right)\right] . \end{aligned}$$

Here $\tilde{\boldsymbol{x}}_i=(\boldsymbol{x}_i^T, 1)^T$ is the input augmented with a constant 1 and $\boldsymbol{\theta}=(\boldsymbol{w}^T, b)^T$, so that $\boldsymbol{\theta}^T \tilde{\boldsymbol{x}}_i=\boldsymbol{w}^T \boldsymbol{x}_i+b$.
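The log-likelihood can be evaluated directly. Writing $z_i=\boldsymbol{\theta}^T \tilde{\boldsymbol{x}}_i$, each summand simplifies to $y_i z_i-\log(1+e^{z_i})$, and the sketch below uses that form. The helper names and the toy data are assumptions made for illustration; each row of `X` is taken to be an augmented input with a trailing 1.

```python
import math


def softplus(z: float) -> float:
    """log(1 + exp(z)), computed without overflow."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def log_likelihood(theta, X, y) -> float:
    """ell(theta) = sum_i [ y_i * z_i - log(1 + exp(z_i)) ], z_i = theta^T x~_i."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        z = sum(t * v for t, v in zip(theta, x_i))
        total += y_i * z - softplus(z)
    return total


# Toy data (hypothetical): one feature plus the constant 1 for the bias.
X = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
y = [0, 0, 1, 1]

# With theta = 0 every sample has probability 1/2, so ell = -n * log(2).
ll_zero = log_likelihood([0.0, 0.0], X, y)
```

The simplified summand is algebraically the same as the two-logarithm form in the equation, but it avoids taking the logarithm of a probability that has rounded to 0.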

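The maximum likelihood estimate satisfies the first-order condition

$$\nabla \ell(\boldsymbol{\theta})=0$$

This equation has no closed-form solution, so it is solved numerically, for example with gradient ascent or Newton's method.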
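One simple way to reach a point where the gradient vanishes is plain gradient ascent, using the standard result $\nabla \ell(\boldsymbol{\theta})=\sum_i\left(y_i-\eta_i\right) \tilde{\boldsymbol{x}}_i$ with $\eta_i$ the predicted probability of sample $i$. The sketch below is one possible approach, not the only solver; the step size, iteration count, and toy data are arbitrary assumptions.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def gradient(theta, X, y):
    """Gradient of the log-likelihood: sum_i (y_i - eta_i) * x~_i."""
    g = [0.0] * len(theta)
    for x_i, y_i in zip(X, y):
        eta = sigmoid(sum(t * v for t, v in zip(theta, x_i)))
        for j, v in enumerate(x_i):
            g[j] += (y_i - eta) * v
    return g


def fit(X, y, lr=0.1, steps=20000):
    """Gradient ascent on the mean log-likelihood (lr and steps are assumed values)."""
    n = len(X)
    theta = [0.0] * len(X[0])
    for _ in range(steps):
        g = gradient(theta, X, y)
        theta = [t + lr * gj / n for t, gj in zip(theta, g)]
    return theta


# Toy, non-separable data (hypothetical): rows are augmented inputs (x, 1).
X = [[float(v), 1.0] for v in range(6)]
y = [0, 0, 1, 0, 1, 1]
theta_hat = fit(X, y)
```

The toy labels are deliberately not linearly separable. On separable data the likelihood has no finite maximizer and the weights keep growing, which is one motivation for the regularization discussed next.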

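Regularization

L1 regularization

L1 regularization, as used in Lasso regression, corresponds to assuming that each weight $w_j$ follows a Laplace distribution

$$f(w \mid \mu, b)=\frac{1}{2 b} \exp \left(-\frac{|w-\mu|}{b}\right)$$

Here $b$ is the scale parameter of the Laplace distribution, not the bias term of the model.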

With a zero-mean prior ($\mu=0$), the likelihood multiplied by the prior becomes

$$\begin{aligned} L(w) & =P(y \mid w, x) P(w) \\ & =\prod_{i=1}^N p\left(x_i\right)^{y_i}\left(1-p\left(x_i\right)\right)^{1-y_i} \prod_{j=1}^d \frac{1}{2 b} \exp \left(-\frac{\left|w_j\right|}{b}\right) \end{aligned}$$

where $p(x_i)$ denotes the predicted probability $\eta(x_i)$ and $d$ is the number of weights.

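Taking the logarithm and negating gives

$$-\ln L(w)=-\sum_i\left[y_i \ln p\left(x_i\right)+\left(1-y_i\right) \ln \left(1-p\left(x_i\right)\right)\right]+\frac{1}{b} \sum_j\left|w_j\right|+d \ln (2 b)$$

The last term does not depend on $w$ and can be dropped, so the prior contributes an L1 penalty with strength $1 / b$.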
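To make the mapping from the Laplace scale $b$ to the penalty strength concrete, here is a small sketch that evaluates the L1-penalized objective with the constant term dropped. It assumes $p(x_i)=\operatorname{sigmoid}(w^T x_i)$ with no separate bias; the function names and toy data are hypothetical.

```python
import math


def softplus(z: float) -> float:
    """log(1 + exp(z)), computed without overflow."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def nll(w, X, y) -> float:
    """Negative log-likelihood: sum_i [ log(1 + exp(z_i)) - y_i * z_i ], z_i = w^T x_i."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        z = sum(wj * v for wj, v in zip(w, x_i))
        total += softplus(z) - y_i * z
    return total


def l1_objective(w, X, y, b: float) -> float:
    """NLL plus the Laplace-prior penalty (1 / b) * sum_j |w_j| (constant dropped)."""
    return nll(w, X, y) + sum(abs(wj) for wj in w) / b


# Toy data and weights (hypothetical).
X = [[0.5, 1.0], [1.5, -1.0], [-1.0, 2.0], [2.0, 0.5]]
y = [0, 1, 0, 1]
w = [1.0, -2.0]

loose = l1_objective(w, X, y, b=10.0)  # wide prior, weak penalty
tight = l1_objective(w, X, y, b=0.5)   # narrow prior, strong penalty
```

A smaller $b$ means a prior concentrated more tightly around zero, which shows up as a larger penalty on the same weights.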

L2 regularization

L2 regularization, as used in ridge regression, corresponds to assuming that each weight $w_j$ follows a Gaussian distribution

$$f(w \mid \mu, \sigma)=\frac{1}{\sqrt{2 \pi} \sigma} \exp \left(-\frac{(w-\mu)^2}{2 \sigma^2}\right)$$

With a zero-mean prior ($\mu=0$), the likelihood multiplied by the prior becomes

$$\begin{aligned} L(w) & =P(y \mid w, x) P(w) \\ & =\prod_{i=1}^N p\left(x_i\right)^{y_i}\left(1-p\left(x_i\right)\right)^{1-y_i} \prod_{j=1}^d \frac{1}{\sqrt{2 \pi} \sigma} \exp \left(-\frac{w_j^2}{2 \sigma^2}\right) \\ & =\prod_{i=1}^N p\left(x_i\right)^{y_i}\left(1-p\left(x_i\right)\right)^{1-y_i}\left(\frac{1}{\sqrt{2 \pi} \sigma}\right)^d \exp \left(-\frac{w^T w}{2 \sigma^2}\right) \end{aligned}$$

Taking $\ln$ and negating, then dropping the additive constant $d \ln (\sqrt{2 \pi} \sigma)$ that does not depend on $w$, gives the objective function:

$$-\ln L(w)=-\sum_i\left[y_i \ln p\left(x_i\right)+\left(1-y_i\right) \ln \left(1-p\left(x_i\right)\right)\right]+\frac{1}{2 \sigma^2} w^T w$$
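The L2 objective is smooth, so its gradient is available in closed form: $\sum_i\left(p(x_i)-y_i\right) x_i+w / \sigma^2$. The sketch below evaluates both the objective and this gradient, assuming $p(x_i)=\operatorname{sigmoid}(w^T x_i)$ with no separate bias; names and toy data are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def softplus(z: float) -> float:
    """log(1 + exp(z)), computed without overflow."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def l2_objective(w, X, y, sigma: float) -> float:
    """Negative log-likelihood plus w^T w / (2 * sigma^2)."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        z = sum(wj * v for wj, v in zip(w, x_i))
        total += softplus(z) - y_i * z
    return total + sum(wj * wj for wj in w) / (2.0 * sigma ** 2)


def l2_gradient(w, X, y, sigma: float):
    """Gradient of l2_objective: sum_i (p_i - y_i) * x_i + w / sigma^2."""
    g = [wj / sigma ** 2 for wj in w]
    for x_i, y_i in zip(X, y):
        p = sigmoid(sum(wj * v for wj, v in zip(w, x_i)))
        for j, v in enumerate(x_i):
            g[j] += (p - y_i) * v
    return g


# Toy data (hypothetical).
X = [[0.5, 1.0], [1.5, -1.0], [-1.0, 2.0], [2.0, 0.5]]
y = [0, 1, 0, 1]
obj = l2_objective([0.3, -0.7], X, y, sigma=2.0)
grad = l2_gradient([0.3, -0.7], X, y, sigma=2.0)
```

The penalty adds $w / \sigma^2$ to the gradient, which pulls every weight toward zero in proportion to its size. The L1 penalty is not differentiable at $w_j=0$, so it has no equally simple gradient there.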