How to do the test
This example uses Type II sums of squares. R calculates the parameter estimates differently, but the calculated results are identical.
Data = read.table(textConnection(Input), header=TRUE,
                  stringsAsFactors=TRUE)
### Since R 4.0, strings are not converted to factors unless requested
plot(x   = Data$Temp,
     y   = Data$Pulse,
     col = Data$Species,
     pch = 16,
     xlab = "Temperature",
     ylab = "Pulse")
legend('bottomright',
       legend = levels(Data$Species),
       col = 1:2,
       cex = 1,
       pch = 16)
Analysis of covariance
library(car)
model.1 = lm (Pulse ~ Temp + Species + Temp:Species,
              data = Data)
Anova(model.1, type="II")
Anova Table (Type II tests)
             Sum Sq Df  F value    Pr(>F)
Temp         4376.1  1 1388.839 < 2.2e-16 ***
Species       598.0  1  189.789 9.907e-14 ***
Temp:Species    4.3  1    1.357    0.2542
### Interaction is not significant, so the slope across groups
### is not different.
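The interaction test above can also be read as a comparison of two nested models. The following is a minimal sketch in base R, assuming simulated data: the Sim data frame and every value in it are hypothetical and are not the data analyzed in this article. For the highest-order term, this F-test should agree with the Type II test reported by Anova().

```r
# Hypothetical data: two groups that share one slope (not the article's data)
set.seed(1)
Sim = data.frame(
  Temp    = rep(seq(18, 30, length.out = 15), times = 2),
  Species = factor(rep(c("ex", "niv"), each = 15))
)
Sim$Pulse = -7 + 3.6 * Sim$Temp - 10 * (Sim$Species == "niv") +
            rnorm(nrow(Sim), sd = 1.5)

# Models with and without the Temp:Species interaction
full    = lm(Pulse ~ Temp + Species + Temp:Species, data = Sim)
reduced = lm(Pulse ~ Temp + Species, data = Sim)

# F-test for the interaction term; a large p-value is consistent
# with a common slope across groups
comparison    = anova(reduced, full)
p.interaction = comparison[["Pr(>F)"]][2]
print(comparison)
```

If the p-value were small, the slopes would differ among groups and the model without the interaction term would not be appropriate.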
model.2 = lm (Pulse ~ Temp + Species,
              data = Data)
library(car)
Anova(model.2, type="II")
Anova Table (Type II tests)
        Sum Sq Df F value    Pr(>F)
Temp    4376.1  1  1371.4 < 2.2e-16 ***
Species  598.0  1   187.4 6.272e-14 ***
### The category variable (Species) is significant,
### so the intercepts among groups are different
summary(model.2)
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  -7.21091    2.55094  -2.827  0.00858 **
Temp          3.60275    0.09729  37.032  < 2e-16 ***
Speciesniv  -10.06529    0.73526 -13.689 6.27e-14 ***
### Note that R parameterizes these estimates in its own way,
###   but the calculated results will be identical.
### The slope estimate is the same; it is the Temp coefficient.
### The intercept for species 1 (ex) is (intercept).
### The intercept for species 2 (niv) is (intercept) + Speciesniv.
### This is determined from the contrast coding of the Species
###   variable shown below, and the fact that Speciesniv is shown in
###   the coefficient table above.
contrasts(Data$Species)
    niv
ex    0
niv   1
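As a worked example of the comments above, the intercept of each line can be computed from the reported coefficients. This is only a sketch: the numbers are copied by hand from the coefficient table in this article, and the variable names are illustrative.

```r
# Estimates copied from the coefficient table above
b.intercept  =  -7.21091
b.temp       =   3.60275
b.speciesniv = -10.06529

# Treatment coding: ex is the reference level (coded 0), niv is coded 1
intercept.ex  = b.intercept
intercept.niv = b.intercept + b.speciesniv

# Both lines share the Temp slope
fitted.lines = data.frame(Species   = c("ex", "niv"),
                          Intercept = c(intercept.ex, intercept.niv),
                          Slope     = b.temp)
print(fitted.lines)
```

With these values the intercept for niv works out to -7.21091 + (-10.06529) = -17.2762.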
Simple plot with fitted lines
plot(x   = Data$Temp,
     y   = Data$Pulse,
     col = Data$Species,
     pch = 16,
     xlab = "Temperature",
     ylab = "Pulse")
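The plot call above draws only the points. One way to add the fitted lines is to pass each group's intercept and the common slope to abline(). The sketch below assumes simulated data: the Sim data frame and the names I.ex, I.niv and B are hypothetical stand-ins for the article's Data object and its model.

```r
# Hypothetical data standing in for the article's Data object
set.seed(3)
Sim = data.frame(
  Temp    = rep(seq(18, 30, length.out = 15), times = 2),
  Species = factor(rep(c("ex", "niv"), each = 15))
)
Sim$Pulse = -7 + 3.6 * Sim$Temp - 10 * (Sim$Species == "niv") +
            rnorm(nrow(Sim), sd = 1.5)

fit = lm(Pulse ~ Temp + Species, data = Sim)
b   = coef(fit)

# Intercept of each line under treatment coding (ex is the reference level)
I.ex  = b[["(Intercept)"]]
I.niv = b[["(Intercept)"]] + b[["Speciesniv"]]
B     = b[["Temp"]]                      # common slope

# Draw to a null device when run as a script, so no file is written
if (!interactive()) pdf(NULL)

plot(x = Sim$Temp, y = Sim$Pulse, col = Sim$Species, pch = 16,
     xlab = "Temperature", ylab = "Pulse")
abline(a = I.ex,  b = B, col = 1, lwd = 2)   # fitted line for ex
abline(a = I.niv, b = B, col = 2, lwd = 2)   # fitted line for niv
legend("bottomright", legend = levels(Sim$Species),
       col = 1:2, pch = 16, lty = 1)
```

Because the model has no interaction term, the two lines are parallel and differ only in their intercepts.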
p-value and R-squared of the model
Multiple R-squared: 0.9896, Adjusted R-squared: 0.9888
F-statistic: 1331 on 2 and 28 DF, p-value: < 2.2e-16
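The two lines of output above are printed by summary(); the same quantities can be pulled out of the summary object when they are needed in later code. This is a minimal sketch on simulated data (the Sim data frame is hypothetical, not the article's data).

```r
# Hypothetical data: two groups with a common slope
set.seed(5)
Sim = data.frame(
  Temp    = rep(seq(18, 30, length.out = 15), times = 2),
  Species = factor(rep(c("ex", "niv"), each = 15))
)
Sim$Pulse = -7 + 3.6 * Sim$Temp - 10 * (Sim$Species == "niv") +
            rnorm(nrow(Sim), sd = 1.5)

fit = lm(Pulse ~ Temp + Species, data = Sim)
s   = summary(fit)

r.squared     = s$r.squared
adj.r.squared = s$adj.r.squared

# fstatistic is a named vector: value, numdf, dendf
f = s$fstatistic

# Overall model p-value from the upper tail of the F distribution
p.model = pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

cat("R-squared:", r.squared, " Adjusted:", adj.r.squared, "\n")
cat("F:", f[["value"]], "on", f[["numdf"]], "and", f[["dendf"]],
    "DF, p-value:", p.model, "\n")
```

The overall F-test compares the fitted model with an intercept-only model, so it asks whether Temp and Species together explain any variation in Pulse.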
Checking assumptions of the model
Histogram of residuals from the linear model. The distribution of these residuals should be approximately normal.
Plot of residuals versus predicted values. The residuals should be unbiased and homoscedastic.
### additional model checking plots with: plot(model.2)
### alternative: library(FSA); residPlot(model.2)
Analysis of covariance example with three categories and Type II sums of squares
This example uses Type II sums of squares and considers a case with three groups.
### --------------------------------------------------------------
### Analysis of covariance, hypothetical data
### --------------------------------------------------------------
Data = read.table(textConnection(Input), header=TRUE,
                  stringsAsFactors=TRUE)
### Since R 4.0, strings are not converted to factors unless requested
plot(x   = Data$Temp,
     y   = Data$Pulse,
     col = Data$Species,
     pch = 16,
     xlab = "Temperature",
     ylab = "Pulse")
legend('bottomright',
       legend = levels(Data$Species),
       col = 1:3,
       cex = 1,
       pch = 16)
Analysis of covariance
options(contrasts = c("contr.treatment", "contr.poly"))
### These are the default contrasts in R
model.1 = lm (Pulse ~ Temp + Species + Temp:Species,
              data = Data)
library(car)
Anova(model.1, type="II")
             Sum Sq Df   F value Pr(>F)
Temp         7026.0  1 2452.4187 <2e-16 ***
Species      7835.7  2 1367.5377 <2e-16 ***
Temp:Species    5.2  2    0.9126 0.4093
### Interaction is not significant, so the slope among groups
### is not different.
model.2 = lm (Pulse ~ Temp + Species,
              data = Data)
Anova(model.2, type="II")
          Sum Sq Df F value    Pr(>F)
Temp      7026.0  1  2462.2 < 2.2e-16 ***
Species   7835.7  2  1373.0 < 2.2e-16 ***
Residuals  125.6 44
### The category variable (Species) is significant,
### so the intercepts among groups are different
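For a model with main effects only, each Type II test asks what is lost when one term is removed while the other stays in the model. Base R's drop1() performs exactly that comparison, so it offers a way to cross-check the table without the car package. The sketch below assumes simulated three-group data: the Sim data frame and the shift vector are hypothetical.

```r
# Hypothetical data: three groups with a common slope (not the article's data)
set.seed(4)
Sim = data.frame(
  Temp    = rep(seq(18, 30, length.out = 16), times = 3),
  Species = factor(rep(c("ex", "fake", "niv"), each = 16))
)
shift = c(ex = 0, fake = 20, niv = -10)      # group offsets, illustrative only
Sim$Pulse = -6 + 3.6 * Sim$Temp +
            unname(shift[as.character(Sim$Species)]) +
            rnorm(nrow(Sim), sd = 1.7)

fit = lm(Pulse ~ Temp + Species, data = Sim)

# Drop each term in turn, keeping the other, and test the change with F
type2 = drop1(fit, test = "F")
print(type2)

# The Species row is the same F-test as comparing these nested models
without.species = lm(Pulse ~ Temp, data = Sim)
check = anova(without.species, fit)
```

Species has 2 degrees of freedom here because three groups need two dummy variables.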
summary(model.2)
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  -6.35729    1.90713  -3.333  0.00175 **
Temp          3.56961    0.07194  49.621  < 2e-16 ***
Speciesfake  19.81429    0.66333  29.871  < 2e-16 ***
Speciesniv  -10.18571    0.66333 -15.355  < 2e-16 ***
### The slope estimate is the Temp coefficient.
### The intercept for species 1 (ex) is (intercept).
### The intercept for species 2 (fake) is (intercept) + Speciesfake.
### The intercept for species 3 (niv) is (intercept) + Speciesniv.
### This is determined from the contrast coding of the Species
### variable shown below.
contrasts(Data$Species)
     fake niv
ex      0   0
fake    1   0
niv     0   1
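The contrast matrix above is what turns the coefficient table into one intercept per group: each row says which species coefficients are added to (Intercept). The sketch below applies that matrix to the estimates reported in this article. The numbers are copied by hand from the coefficient table, and the names coding and b.species are illustrative.

```r
# Treatment coding for the three levels; ex is the reference level
coding = contr.treatment(c("ex", "fake", "niv"))
print(coding)

# Estimates copied from the coefficient table above
b.intercept = -6.35729
b.species   = c(fake = 19.81429, niv = -10.18571)

# Each row of the coding matrix picks out the coefficients to add
intercepts = b.intercept + drop(coding %*% b.species)
print(intercepts)
```

With these values the intercepts are -6.35729 for ex, 13.457 for fake and -16.543 for niv, and all three lines share the Temp slope of 3.56961.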
Simple plot with fitted lines
[Figure: Pulse versus Temperature with the fitted line for each species]
p-value and R-squared of the combined model
Multiple R-squared: 0.9919, Adjusted R-squared: 0.9913
F-statistic: 1791 on 3 and 44 DF, p-value: < 2.2e-16
Checking assumptions of the model
hist(residuals(model.2),
     col="darkgray")
Histogram of residuals from the linear model. The distribution of these residuals should be approximately normal.
plot(fitted(model.2),
     residuals(model.2))
Plot of residuals versus predicted values. The residuals should be unbiased and homoscedastic.
### additional model checking plots with: plot(model.2)
### alternative: library(FSA); residPlot(model.2)
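About the author
Kaizong Ye is a researcher at 拓端研究室 (TRL), and we sincerely thank him for his contributions to this article. He completed a master's degree in statistics at Shanghai University of Finance and Economics and focuses on artificial intelligence. His expertise includes Python and Matlab simulation, visual processing, neural networks, and data analysis.
This article draws on material the author recently prepared for the course 《R语言数据分析挖掘必知必会》.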