Mathematical Principles of the Logistic Regression Algorithm


Logistic Regression

1. What is logistic regression

Let $\theta_0 x_0 = b$ (taking $x_0 = 1$), $\theta = \begin{pmatrix}\theta_0, \theta_1, \dots, \theta_m\end{pmatrix}$, $x = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_m \end{pmatrix}$

Like linear regression, it is built on the linear combination

$\theta^T x = \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_m x_m + b$

We define the prediction function of logistic regression as $h_\theta(x) = \sigma(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$
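The sigmoid $\sigma$ squashes any real input into $(0, 1)$, which is what lets $h_\theta(x)$ be read as the probability $P(y = 1 \mid x)$. A quick sanity check (the `sigmoid` here is an illustrative plain definition, separate from the training code further down):

```python
import numpy as np

def sigmoid(z):
    # Plain textbook definition of the logistic function.
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # midpoint of the curve
print(sigmoid(10.0))   # close to 1
print(sigmoid(-10.0))  # close to 0
```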

The loss function of the logistic regression model is $J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\ln\left(h_\theta(x^{(i)})\right) + (1 - y^{(i)})\ln\left(1 - h_\theta(x^{(i)})\right)\right]$
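To make the loss concrete, here is the formula above evaluated directly with NumPy; the values of `h` and `y` are made up for illustration, not taken from any dataset:

```python
import numpy as np

# Hypothetical predictions h = h_theta(x^(i)) and labels y for m = 4 samples.
h = np.array([0.9, 0.2, 0.8, 0.4])
y = np.array([1, 0, 1, 0])

# J(theta) = -(1/m) * sum( y*ln(h) + (1-y)*ln(1-h) )
J = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
print('loss:', J)
```

Confident predictions on the correct side (like 0.9 for $y=1$) contribute little; a confident wrong prediction would dominate the sum.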

It is derived via maximum likelihood estimation.
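The derivation can be spelled out. Modeling $y \in \{0, 1\}$ as Bernoulli with $P(y = 1 \mid x; \theta) = h_\theta(x)$, the likelihood of $m$ independent samples and its logarithm are:

```latex
P(\vec{y} \mid X; \theta) = \prod_{i=1}^{m} h_\theta(x^{(i)})^{y^{(i)}}
    \left(1 - h_\theta(x^{(i)})\right)^{1 - y^{(i)}}

l(\theta) = \ln P(\vec{y} \mid X; \theta)
          = \sum_{i=1}^{m} \left[ y^{(i)} \ln h_\theta(x^{(i)})
            + (1 - y^{(i)}) \ln\left(1 - h_\theta(x^{(i)})\right) \right]
```

Maximizing $l(\theta)$ is equivalent to minimizing $J(\theta) = -\frac{1}{m} l(\theta)$, which is exactly the loss function above.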

The model parameters can be solved for with gradient descent:

$\frac{\partial J(\theta)}{\partial \theta_j} = -\frac{1}{m}\frac{\partial l(\theta)}{\partial \theta_j}$, where $l(\theta)$ is the log-likelihood.

$\frac{\partial l(\theta)}{\partial \theta_j} = \frac{\partial l(\theta)}{\partial h_\theta(x)} \cdot \frac{\partial h_\theta(x)}{\partial (\theta^T x)} \cdot \frac{\partial (\theta^T x)}{\partial \theta_j}$

$= \sum_{i=1}^{m}\left[y^{(i)} \frac{1}{h_\theta(x^{(i)})} - (1 - y^{(i)}) \frac{1}{1 - h_\theta(x^{(i)})}\right] h_\theta(x^{(i)})\left(1 - h_\theta(x^{(i)})\right) x_j^{(i)}$

$= \sum_{i=1}^{m}\left[y^{(i)} - h_\theta(x^{(i)})\right] x_j^{(i)}$

The parameter update is then

$\theta_j := \theta_j - \alpha \cdot \left(-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)} - h_\theta(x^{(i)})\right] x_j^{(i)}\right)$
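The per-parameter update above can be vectorized over all $m$ samples and all parameters at once. A minimal NumPy sketch, where `gradient_step`, `X`, `y`, and the toy numbers are illustrative assumptions, not part of the original derivation:

```python
import numpy as np

def gradient_step(theta, X, y, lr):
    """One batch gradient-descent update for logistic regression."""
    h = 1.0 / (1.0 + np.exp(-X @ theta))  # h_theta(x) for every sample
    grad = -X.T @ (y - h) / X.shape[0]    # dJ/dtheta = -(1/m) X^T (y - h)
    return theta - lr * grad

# Toy separable problem: column 0 is the constant x_0 = 1 (the bias),
# column 1 is a single feature.
X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]])
y = np.array([1.0, 0.0, 1.0])

theta = np.zeros(2)
for _ in range(200):
    theta = gradient_step(theta, X, y, lr=0.5)

probs = 1.0 / (1.0 + np.exp(-X @ theta))
print('predicted probabilities:', probs)
```

After a few hundred steps the predicted probabilities land on the correct side of 0.5 for every sample.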

In principle this could also be implemented in C++, but I gave up on that given the mental overhead.

To prevent overfitting, there are two common options: add an $L_1$ or an $L_2$ penalty term to the loss function. These are called $L_1$ regularization and $L_2$ regularization, respectively.

$J(\theta)_{L_1} = J(\theta) + \frac{1}{C}\sum_{j=1}^{n} |\theta_j|$

$J(\theta)_{L_2} = J(\theta) + \frac{1}{C}\sum_{j=1}^{n} \theta_j^2$
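For comparison, scikit-learn's `LogisticRegression` follows the same convention: its `C` parameter is the inverse of the regularization strength, i.e. the $\frac{1}{C}$ factor above, and `penalty` selects the $L_1$ or $L_2$ term. A sketch on the same breast-cancer data the next section uses:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X = StandardScaler().fit_transform(data.data)  # scaling helps the solver
x_tr, x_te, y_tr, y_te = train_test_split(
    X, data.target, train_size=0.6, random_state=0)

# Smaller C => larger 1/C => stronger L2 penalty on theta.
clf = LogisticRegression(penalty='l2', C=1.0, max_iter=1000)
clf.fit(x_tr, y_tr)
print('score:', clf.score(x_te, y_te))
```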

2. Python implementation and verification

Classic gradient descent (here with per-sample stochastic updates), without regularization (ahh, I don't want to write any more):


import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split


def sigmoid(x):
    # Numerically stable sigmoid: use 1/(1+e^-x) for x >= 0 and
    # e^x/(1+e^x) for x < 0, so np.exp never overflows.
    out = np.empty_like(x, dtype=np.float64)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    exp_neg = np.exp(x[~pos])
    out[~pos] = exp_neg / (1.0 + exp_neg)
    return out


class Logistic:
    def __init__(self, lr, max_iter):
        self.theta = None
        self.lr = lr
        self.max_iter = max_iter

    def fit(self, in_arr, label_arr):
        self.theta = np.random.rand(in_arr.shape[1], 1)

        # Stochastic gradient descent: one update per sample.
        for _ in range(self.max_iter):
            for i in range(in_arr.shape[0]):
                x = in_arr[i].reshape(1, -1)
                y = label_arr[i]
                h = sigmoid(np.dot(self.theta.T, x.T))
                # Gradient of the cross-entropy loss for one sample:
                # (h_theta(x) - y) * x
                grad = (h - y) * x

                self.theta -= self.lr * grad.T

    def score(self, x, y):
        hit = 0

        for i in range(x.shape[0]):
            # Reshape against x itself (the original indexed self.x,
            # i.e. the training set, by a test-set index).
            ux = x[i].reshape(1, -1)
            h = sigmoid(np.dot(self.theta.T, ux.T))
            # Classify with a 0.5 threshold; a plain int cast would
            # truncate every h < 1 down to class 0.
            pred = 1 if h.item() >= 0.5 else 0

            if pred == y[i]:
                hit += 1

        return hit / y.size


data = load_breast_cancer()
X = data.data
Y = data.target

# Standardize features: the raw breast-cancer columns differ by orders
# of magnitude, which makes plain SGD with lr = 0.2 behave poorly.
X = (X - X.mean(axis=0)) / X.std(axis=0)

x_train, x_test, y_train, y_test = train_test_split(X, Y, train_size=0.6)
LR = Logistic(0.2, 100)

LR.fit(x_train, y_train)
score = LR.score(x_test, y_test)

print('score:', score)