Python Linear Regression Prediction Tutorial (Implementing the Normal-Equation Solution for Linear Regression)

我决定爱你 2023-03-20 07:37:53

Code implementation of the linear regression algorithm

import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets  # sklearn is a machine learning library that ships with many datasets


class LinearRegression():
    def __init__(self):
        self.w = None

    def fit(self, X, y):
        # Insert a column of ones so that w[0] acts as the bias term
        X = np.insert(X, 0, 1, axis=1)
        print(X.shape)
        # Normal equation: w = (X^T X)^(-1) X^T y
        X_ = np.linalg.inv(X.T.dot(X))
        self.w = X_.dot(X.T).dot(y)

    def predict(self, X):
        # Insert constant ones for bias weights
        X = np.insert(X, 0, 1, axis=1)
        y_pred = X.dot(self.w)
        return y_pred

def mean_squared_error(y_true, y_pred):
    mse = np.mean(np.power(y_true - y_pred, 2))
    return mse

def main():
    # Load the diabetes dataset
    diabetes = datasets.load_diabetes()
    # Use only one feature
    X = diabetes.data[:, np.newaxis, 2]
    print(X.shape)
    # Split the data into training/testing sets
    x_train, x_test = X[:-20], X[-20:]
    # Split the targets into training/testing sets
    y_train, y_test = diabetes.target[:-20], diabetes.target[-20:]
    clf = LinearRegression()
    clf.fit(x_train, y_train)
    y_pred = clf.predict(x_test)
    # Print the mean squared error
    print("Mean Squared Error:", mean_squared_error(y_test, y_pred))
    # Plot the results
    plt.scatter(x_test[:, 0], y_test, color='black')
    plt.plot(x_test[:, 0], y_pred, color='blue', linewidth=3)
    plt.show()

main()

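The main function loads the diabetes dataset.

It then keeps a single feature (column 2) and assigns the result to X, so X.shape is:

(442, 1)

In other words, the data has 442 samples and each sample has exactly one feature. The 422 that appears later, when fit prints (422, 2), is the number of training samples left after the last 20 are held out, plus the added column of ones.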
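To see why the indexing expression produces a two-dimensional array with one column, here is a minimal sketch on a small made-up array. The array `data` is a hypothetical stand-in for `diabetes.data`, and `X_alt` is just an illustrative name for an equivalent spelling.

```python
import numpy as np

# Hypothetical stand-in for diabetes.data: 6 samples, 4 features.
data = np.arange(24, dtype=float).reshape(6, 4)

# Select column 2 but keep a (n_samples, 1) shape via np.newaxis.
X = data[:, np.newaxis, 2]
print(X.shape)  # (6, 1)

# An equivalent way to keep the column dimension: slice instead of index.
X_alt = data[:, 2:3]
print(np.array_equal(X, X_alt))  # True
```

Keeping the second dimension matters because np.insert(..., axis=1) in fit and predict expects a 2-D array.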

x_train, x_test = X[:-20], X[-20:]

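This splits the data into a training set and a test set: the last 20 samples form the test set and everything before them is the training set.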
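The slicing can be checked on a dummy array with the same number of rows. This is only a sketch: the values in `X` below are made up, and only the row count of 442 matches the diabetes data.

```python
import numpy as np

# Dummy feature column with 442 rows, values 0..441 (not the real data).
X = np.arange(442, dtype=float).reshape(-1, 1)

# Everything except the last 20 rows is training data; the last 20 are test data.
x_train, x_test = X[:-20], X[-20:]
print(x_train.shape, x_test.shape)  # (422, 1) (20, 1)
```

Note that this split takes the rows in their stored order; no shuffling is involved.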

The code above also defines a class, LinearRegression.

Besides __init__, this class has two methods:

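def fit(self, X, y): Given the data and the labels (the targets y), this method trains the model. It computes theta with the normal equation. Once theta is known, the hypothesis function is fully determined, so from then on plugging in an x gives the predicted value.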

X = np.insert(X, 0, 1, axis=1)

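This line is best understood through the hypothesis function.

It gives theta0 a matching input x0 that is always 1, which is why a column of ones is inserted at position 0 of X:

h(x) = theta0*x0 + theta1*x1 + theta2*x2 (with x0 = 1)

Note that x is a column vector here; (x0, x1, x2) is written horizontally only for convenience.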
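A small sketch of what np.insert does in this call, using three made-up samples (`X_b` is an illustrative name, not something from the code above):

```python
import numpy as np

# Three made-up samples with one feature each.
X = np.array([[0.5], [1.5], [2.5]])

# Insert the value 1 before column 0 of every row (axis=1 means "along columns").
X_b = np.insert(X, 0, 1, axis=1)
print(X_b.shape)     # (3, 2)
print(X_b.tolist())  # [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5]]
```

Each row is now (x0, x1) with x0 = 1, so the first weight plays the role of theta0.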

From the final formula we know that theta = (X^T X)^(-1) X^T y. So how is this implemented in the code?

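X_ = np.linalg.inv(X.T.dot(X))
self.w = X_.dot(X.T).dot(y)

First, dot performs matrix multiplication and X.T is the transpose of X, so X.T.dot(X) is the transpose of X times X. np.linalg.inv() inverts that matrix. X_.dot(X.T).dot(y) then multiplies the inverse by the transpose of X and by y, which gives theta. In other words, the code is a direct transcription of the formula. It implements the closed-form result as it stands, so there is no need to start from the derivation.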
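As a sanity check, the same two lines can be run on synthetic data where the answer is known. The sketch below assumes a noise-free line with intercept 3 and slope 2. The names `rng`, `theta_inv` and `theta_lstsq` are illustrative. The comparison with np.linalg.lstsq is an addition for cross-checking, not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, noise-free data: y = 3 + 2 * x
x = rng.uniform(-1, 1, size=(50, 1))
y = 3.0 + 2.0 * x[:, 0]

# Add the column of ones, as fit() does.
X = np.insert(x, 0, 1, axis=1)

# Normal equation, written the same way as in the article.
theta_inv = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

# Cross-check with NumPy's least-squares solver.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(theta_inv, [3.0, 2.0]))   # True
print(np.allclose(theta_inv, theta_lstsq))  # True
```

Explicitly inverting X^T X works here, but it fails when that matrix is singular (for example with duplicated features), which is one reason a least-squares solver is often preferred in practice.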

def predict(self, X): Once fit has produced theta, this method can be used to make predictions on the test data.

def predict(self, X):
    X = np.insert(X, 0, 1, axis=1)
    y_pred = X.dot(self.w)
    return y_pred

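We now have theta, which is the instance variable w in this program. To predict for a new x, we only need to apply the formula h(x) = theta^T x.

That is, we multiply the transpose of theta by the x we want to predict (with x0 = 1 prepended) to get h(x), which is the prediction we are after. In the code each row of X is one sample, so X.dot(self.w) evaluates this for all samples at once.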
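For a concrete feel of the prediction step, here is a minimal sketch with a hypothetical weight vector (intercept 3, slope 2) instead of one learned from the diabetes data:

```python
import numpy as np

# Hypothetical theta: w[0] is the intercept (theta0), w[1] the slope (theta1).
w = np.array([3.0, 2.0])

# Three new inputs, one feature each.
x_new = np.array([[0.0], [1.0], [2.0]])

# Same steps as predict(): prepend x0 = 1, then take the dot product with w.
X_new = np.insert(x_new, 0, 1, axis=1)
y_pred = X_new.dot(w)
print(y_pred.tolist())  # [3.0, 5.0, 7.0]
```

Each prediction is simply theta0 * 1 + theta1 * x1 for the corresponding row.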

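If we plot the result with matplotlib, we see the test samples as black points with the fitted line drawn through them in blue.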