Yishuo School District (30) | SPSS Statistical Analysis (40): An Example of Multiple Linear Regression Analysis
"Share interests, spread happiness, broaden horizons, and leave good memories!" Hello everyone, this is your editor. Welcome back to Xueyuan; we will do our best to keep bringing you more and better content.
In the last issue, we covered the theory and practice of simple (one-variable) linear regression. In simple linear regression there is only one independent variable, so the dependent variable is related to a single factor. This is uncommon in practice. More often, one independent variable cannot fully explain the variation in the dependent variable, and several independent variables are needed. The problem then becomes a linear regression of one dependent variable on a group of independent variables, which regression analysis calls multiple linear regression.
Multiple linear regression is introduced to make up for the fact that simple linear regression may not fully explain the variation in the dependent variable. It is considered only when the simple linear regression fits poorly.
Let's look at a worked example of multiple linear regression. The table below gives 1992 data for Asian countries and regions: average life expectancy (y), per capita GDP at purchasing power parity (x1), adult literacy rate (x2), and the vaccination rate of one-year-old children (x3). We use multiple regression to analyze how average life expectancy relates to per capita GDP, adult literacy rate, and the vaccination rate of one-year-old children. (Data source: United Nations Development Programme, Human Development Report)
Step 1: Analyze and organize the data. We want to analyze the linear relationship between one variable, average life expectancy, and three other variables, so this is clearly a multiple linear regression problem. Define six variables: "serial number", "country and region", "y" (average life expectancy), "x1" (per capita GDP), "x2" (adult literacy rate), and "x3" (vaccination rate of one-year-old children). Then enter the data and save it.
Step 2: Configure the multiple linear regression analysis, using the settings shown in the figure below.
Step 3: Main results and analysis.
1) The table below is the correlation matrix. It shows the Pearson correlation coefficients and one-tailed significance probabilities for all four variables, that is, the dependent variable and the three independent variables. The correlation coefficients between the dependent variable and the independent variables are 0.725, 0.847, and 0.733 respectively, and the one-tailed significance probabilities are small, which indicates that all three independent variables are closely related to the dependent variable.
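For readers who want to see what lies behind one cell of such a matrix, here is a minimal sketch in plain Python. It computes a Pearson correlation coefficient and the t statistic on which its significance test is based. The function names are our own, and the numbers are made-up toy values, not the article's data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing 'correlation = 0', with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Made-up illustrative values, NOT the data from the article.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(toy_x, toy_y)
t = t_statistic(r, len(toy_x))
print(round(r, 4), round(t, 4))
```

The one-tailed significance probability in the SPSS table is the tail area of a t distribution with n - 2 degrees of freedom beyond this t value. The sketch stops at t because the Python standard library has no t distribution.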
2) Look at the Variables Entered/Removed table. During stepwise regression the system produced three regression models. Using the entry probability criterion confirmed in the "Options" dialog box, Model 1 first brings in x2, the independent variable with the closest linear relationship to average life expectancy. The remaining variables are then introduced step by step.
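Why x2 comes first can be checked with a few lines of Python. With only one predictor in the model, the F-to-enter statistic grows with the absolute value of the correlation, so the first variable entered is the one most strongly correlated with y. The sketch below assumes the three reported correlations belong to x1, x2, and x3 in that order. It covers only the first step, not the full SPSS stepwise procedure with its entry and removal thresholds.

```python
# Correlations of y with each candidate predictor, as reported in the article.
# Assumption: the values are listed in the order x1, x2, x3.
corr_with_y = {"x1": 0.725, "x2": 0.847, "x3": 0.733}

def first_variable_to_enter(correlations):
    """Return the predictor with the largest absolute correlation with y."""
    return max(correlations, key=lambda name: abs(correlations[name]))

first = first_variable_to_enter(corr_with_y)
print(first)
```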
3) The Model Summary table gives the multiple correlation coefficient R, the coefficient of determination R², and the adjusted R² for each of the three regression models. For the third model, R = 0.952 and R² = 0.907. In terms of goodness of fit, the third model is clearly better than the first and second models.
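The relationship between these quantities can be sketched in Python. R² is the square of R, and the adjusted R² penalizes R² for the number of predictors. The adjusted value depends on the sample size n, which this article does not state, so the n used below is purely an assumption for illustration.

```python
def r_squared_from_r(r):
    """Coefficient of determination as the square of the multiple correlation R."""
    return r * r

def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for a model with p predictors fitted to n cases."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Reported for the third model: R = 0.952, R-squared = 0.907.
r2_check = r_squared_from_r(0.952)

# ASSUMPTION: n = 22 cases; the article does not give the sample size.
n_assumed = 22
adj = adjusted_r_squared(0.907, n_assumed, p=3)
print(round(r2_check, 3), round(adj, 4))
```

Squaring the rounded R gives about 0.906, which agrees with the reported 0.907 up to rounding.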
4) The ANOVA table gives the analysis of variance results for the three models. For Model 1, the F value is 50.628 and the significance probability (P value) is reported as 0.000. At the 0.05 significance level, we can conclude that there is a linear relationship between y (average life expectancy) and x2 (adult literacy rate). The second and third models can be analyzed in the same way.
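As a rough consistency check, the F value of a model can be recomputed from its R². The sketch below does this for Model 1, assuming that 0.847 is the correlation between y and x2, so that the R² of Model 1 is 0.847 squared. The sample size is not given in this article, so the code also back-solves the residual degrees of freedom that the reported F would imply. Treat that result as an inference, not as a stated fact.

```python
def f_statistic(r2, n, p):
    """Overall F statistic for a model with p predictors fitted to n cases."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))

def implied_residual_df(f_value, r2, p):
    """Residual degrees of freedom (n - p - 1) implied by a reported F and R-squared."""
    return f_value * p * (1 - r2) / r2

# ASSUMPTION: Model 1 contains only x2, and r(y, x2) = 0.847.
r2_model1 = 0.847 ** 2

df_resid = implied_residual_df(50.628, r2_model1, p=1)
print(round(df_resid, 2))

# F recomputed under the assumption of 20 residual degrees of freedom (n = 22).
f_check = f_statistic(r2_model1, n=22, p=1)
print(round(f_check, 2))
```

The implied residual degrees of freedom come out close to 20, and the recomputed F is close to the reported 50.628. The small gap is what we would expect from working with a correlation rounded to three decimals.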
5) Regression coefficient table. From the unstandardized coefficients B in the table, the three regression models built in turn during stepwise regression are:
Model 1: y = 38.794 + 0.332 x2
Model 2: y = 41.206 + 0.071 x1 + 0.253 x2
Model 3: y = 32.993 + 0.072 x1 + 0.169 x2 + 0.17 x3
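To make the three equations concrete, here is a small Python sketch that evaluates each model for one set of inputs. It reads each model as an intercept plus a sum of coefficient-times-variable terms. The input values are hypothetical and chosen only for illustration, since the data table and the units of x1 are not reproduced in this article.

```python
# Unstandardized coefficients B from the regression coefficient table.
MODELS = {
    "model_1": {"intercept": 38.794, "x2": 0.332},
    "model_2": {"intercept": 41.206, "x1": 0.071, "x2": 0.253},
    "model_3": {"intercept": 32.993, "x1": 0.072, "x2": 0.169, "x3": 0.17},
}

def predict(model, **values):
    """Predicted y: intercept plus the sum of coefficient * variable value."""
    coeffs = MODELS[model]
    return coeffs["intercept"] + sum(
        b * values[name] for name, b in coeffs.items() if name != "intercept"
    )

# HYPOTHETICAL inputs, not taken from the article's data table.
example = {"x1": 100.0, "x2": 90.0, "x3": 90.0}

predictions = {name: predict(name, **example) for name in MODELS}
for name, y_hat in predictions.items():
    print(name, round(y_hat, 3))
```

Each model only uses the variables it contains, so Model 1 ignores x1 and x3 even though they are passed in.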
6) Other charts
Coming next: in this issue, we practiced multiple linear regression analysis. In the next issue, we will look at curvilinear regression analysis.
That's all for today's sharing. If you have your own thoughts on today's article, please leave us a message. See you tomorrow, and have a happy day!
References: Baidu Baike; 《SPSS 23 统计分析实用教程》 (A Practical Tutorial on SPSS 23 Statistical Analysis)
Translation: Baidu Translate
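This article is original content from learningyard (新学苑). Some text and images come from other sources; if there is any infringement, please contact us for removal.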