Using GARCH and regression models to analyze stock prices in R

To find the main factors driving price fluctuations, we use stepwise regression to eliminate the independent variables that have little impact on the dependent variable (the price); a sketch of the call is given below. The variables are renamed x1, x2, and so on.
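A minimal sketch of this stepwise elimination (not the original script), assuming the price Y and the renamed predictors x1, x2, … are columns of a data frame called dat:

full.model <- lm(Y ~ ., data = dat)             # full regression on all renamed predictors
lm.sol <- step(full.model, direction = "both")  # drop variables with little impact, by AIC
summary(lm.sol)                                 # the reduced model is reported further below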

Written by Kaizong Ye and Coin Ge

We renamed the independent variables, such as WTI Price and Field Production of Crude Oil (Thousand Barrels), to x1, x2, and so on. The last factor, the emergency event, is categorical and has to be encoded with dummy variables; although a three-level factor strictly needs only two dummies, we use the three parameters X49, X50 and X51, and recode the levels "positive effect", "no effect" and "negative effect" as -1, 0 and 1.
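One way to do the -1/0/1 recoding just described, sketched here with an assumed column name event and assumed level labels, since the original data layout is not shown:

codes <- c("positive effect" = -1, "no effect" = 0, "negative effect" = 1)
dat$x49 <- unname(codes[as.character(dat$event)])   # map the three levels to -1, 0, 1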


rugarch

rugarch builds around the idea of a model specification as a separate object. There are then functions for fitting (that is, estimating parameters), prediction and simulation.

Here is an example of fitting with a Student t distribution:

> gspec.ru <- ugarchspec(mean.model=list(
     armaOrder=c(0,0)), distribution="std")
> gfit.ru <- ugarchfit(gspec.ru, sp5.ret[,1])
> coef(gfit.ru)
         mu        omega       alpha1        beta1        shape
7.300187e-04 2.266325e-06 6.911640e-02 9.272781e-01 4.194994e+00
> # plot in-sample volatility estimates
> plot(sqrt(252) * gfit.ru@fit$sigma, type='l')

The optimization in this package is perhaps the most sophisticated and trustworthy among the packages that I discuss.

fGarch


fGarch is a part of the Rmetrics suite.


We’ll fit the same Student t model as above:

> gfit.fg <- garchFit(data=sp5.ret[,1], cond.dist="std")
> coef(gfit.fg)
         mu        omega       alpha1        beta1        shape
7.263209e-04 2.290834e-06 6.901898e-02 9.271553e-01 4.204087e+00
> # plot in-sample volatility estimates
> plot(sqrt(252) * gfit.fg@sigma.t, type="l")

tseries

I believe that this package was the first to include a publicly available garch function in R.  It is restricted to the normal distribution.

> gfit.ts <- garch(sp5.ret[,1])
> coef(gfit.ts)
         a0           a1           b1
6.727470e-06 5.588495e-02 9.153086e-01
> # plot in-sample volatility estimates
> plot(sqrt(252) * gfit.ts$fitted.values[, 1], type="l")

bayesGARCH

I think Bayes estimation of garch models is a very natural thing to do.  We have fairly specific knowledge about what the parameter values should look like.

The only model this package does is the garch(1,1) with t distributed errors.  So we are happy in that respect.

> gbayes <- bayesGARCH(sp5.ret[,1])

However, this command fails with an error. The default is to use essentially uninformative priors, and presumably this problem demands some prior information. In practice we would probably want to give it informative priors anyway.
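As a hedged sketch only: bayesGARCH() takes its priors through the mu.alpha, Sigma.alpha, mu.beta, Sigma.beta, lambda and delta arguments and its MCMC settings through control. The particular values below are illustrative assumptions, not recommendations.

library(bayesGARCH)
gbayes <- bayesGARCH(sp5.ret[,1],
    mu.alpha = c(0, 0.1), Sigma.alpha = diag(c(0.1, 0.1)),  # priors on (alpha0, alpha1)
    mu.beta = 0.8, Sigma.beta = 0.1,                        # prior on beta1
    lambda = 0.01, delta = 2,                               # prior on the t degrees of freedom
    control = list(n.chain = 2, l.chain = 5000))            # two shorter MCMC chains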

betategarch

This package fits an EGARCH model with t distributed errors.  EGARCH is a clever model that makes some things easier and other things harder.

> gest.te <- tegarch.est(sp5.ret[,1])
> gest.te$par
     delta        phi1      kappa1  kappa1star          df
-0.05879721  0.98714860  0.03798604  0.02487405  4.50367108
> gfit.te <- tegarch.fit(sp5.ret[,1], gest.te$par)
> pp.timeplot(sqrt(252) * gfit.te[, "sigma"])

That the plotting function is pp.timeplot is an indication that the names of the input returns are available on the output, unlike the output in the other packages up to here. Figure 4 compares this estimate with a garch(1,1) estimate (from rugarch, but they all look very similar).



After processing in R, we obtain the model

Y~x1 + x2 + x4 + x5 + x7 + x13 + x14 + x15 + x16 + x17 + x18 + x20 + x21 + x23 + x34 + x25 + x26 + x29 + x30 + x33 + x35 + x36 + x37 + x39 + x40 + x42 + x44 + x46 + x47 + x48 + x49 + x50

The variables with little impact on the price have been eliminated.

Garch model predicts volatility

We use the GARCH model to predict volatility.
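One way to produce the volatility forecast is rugarch's ugarchforecast(); the sketch below reuses the gfit.ru fit from earlier purely for illustration.

gfcst <- ugarchforecast(gfit.ru, n.ahead = 10)   # 10-step-ahead forecast
sigma(gfcst)                                     # forecast conditional volatility
plot(sqrt(252) * as.numeric(sigma(gfcst)), type = "l",
     ylab = "annualized volatility forecast")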



First, check the normality of the data: compute the empirical distribution function, draw a QQ plot, and plot the log-return series as a line chart.

> shapiro.test(rlogdiffdata) 
 
	Shapiro-Wilk normality test
 
data:  rlogdiffdata
W = 0.94315, p-value = 1.458e-05
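The other two checks mentioned above can be drawn with base R, assuming rlogdiffdata holds the log-return series:

qqnorm(rlogdiffdata, main = "Normal Q-Q plot of log returns")
qqline(rlogdiffdata, col = "red")         # reference line for a normal sample
plot(rlogdiffdata, type = "l",
     xlab = "time", ylab = "log return")  # log-return series line chart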

The QQ plot suggests the data are roughly normal, but the small p-value of the Shapiro-Wilk test indicates a significant departure from strict normality.

Finally, we use the VaR curve to warn of severe fluctuations.
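A hedged sketch of one way to build such a VaR curve from the rugarch fit shown earlier (gfit.ru, Student t innovations); qdist() is rugarch's quantile function for its innovation distributions, and the 95% level is an illustrative choice.

cf <- coef(gfit.ru)
q05 <- qdist("std", p = 0.05, mu = 0, sigma = 1, shape = cf["shape"])  # 5% quantile of the standardized t
VaR95 <- cf["mu"] + gfit.ru@fit$sigma * q05      # conditional 5% return quantile (one-day 95% VaR)
plot(VaR95, type = "l", ylab = "95% VaR", main = "VaR warning curve")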

In the influence diagnostics, the points marked with * on the right are strong influence points.

We construct an F test from the studentized residuals and convert it to a t test to detect outliers. Computing

stdres <- rstudent(lm.sol)

gives the studentized residuals r_j. We then use the formula F_j = (n - p - 1) r_j^2 / (n - p - r_j^2) to calculate F_j and convert it to t_j = sqrt(F_j), which follows a t(n - p - 1) distribution. With n = 144 observations and p = 51 parameters this becomes

t <- sqrt((144 - 51 - 1) * stdres^2 / (144 - 51 - stdres^2))

Finally, an observation is flagged as an outlier if |t_j| exceeds the two-sided 5% critical value of the t distribution with 92 degrees of freedom.

Running this in R,

res <- t > abs(qt(.025, df = 92))

we obtain a logical vector indicating which values exceed the critical t value; observations marked TRUE are potential outliers.

Prediction

We used HoltWinters to predict the range of the oil price; a sketch of the call is shown below.
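A sketch of the Holt-Winters call (not the original script), assuming the oil price series is a ts object named price.ts and using a non-seasonal model:

hw <- HoltWinters(price.ts, gamma = FALSE)   # gamma = FALSE: no seasonal component
pred <- predict(hw, n.ahead = 12,
                prediction.interval = TRUE, level = 0.95)
plot(hw, pred)                               # fitted values with the 95% forecast band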

The true values fall largely within the predicted range, but the net profit remains harder to predict.



About the author

Kaizong Ye is a researcher at the 拓端 Research Lab (TRL). We sincerely thank him for his contributions to this article. He completed a master's degree in statistics at Shanghai University of Finance and Economics and focuses on artificial intelligence; he is skilled in Python and Matlab simulation, computer vision, neural networks, and data analysis.

This article draws on the author's recent preparation for the《R语言数据分析挖掘必知必会》course.
