However, some investments carry considerably more risk, and that risk creates opportunities for profit. We can capture this notion of risk by modeling volatility. Throughout, Xₜ denotes the asset's price or return at time t, and we condition on all previous prices/returns. We will build on a previous Medium post, “Introduction to ARMA Models with Financial Data”, to explore the ARCH (Autoregressive Conditional Heteroskedasticity) and GARCH (Generalized Autoregressive Conditional Heteroskedasticity) models.

Let’s take a small detour to discuss modeling returns rather than prices. Returns are a convenient way to model financial assets, so the distinction is important to conceptualize. In the period following the onset of a crisis, however, returns may swing wildly between negative and positive territory.
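
To make the prices-versus-returns distinction concrete, here is a minimal sketch (the function name and price series are illustrative, not from the post) that converts a price series to log returns:

```python
import math

def log_returns(prices):
    """Convert a price series to log returns: r_t = ln(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

prices = [100.0, 101.0, 99.0, 102.0]   # hypothetical daily closes
rets = log_returns(prices)

# A handy property of log returns: they aggregate additively, so the sum
# of the daily returns equals the log return over the whole period.
total = math.log(prices[-1] / prices[0])
```

This additivity is one reason returns, rather than raw prices, are the usual modeling target.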

The efficient market hypothesis illustrates why the random walk model is appropriate and leads us to consider methods for capturing risk, and opportunities for profit, within this economic framework. There is also a conditional coverage (CC) test by Christoffersen (1998): in addition to unconditional coverage, the Christoffersen test measures the likelihood of unusually frequent VaR exceptions, an effect of exceptions clustering. Essentially, where there is heteroskedasticity, observations do not conform to a linear pattern. Because both the ACF and PACF of the squared series taper, the squared series follows an ARMA pattern. We impose the constraints \(\alpha_0 \ge 0\) and \(\alpha_1 \ge 0\) to avoid negative variance.
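
The clustering idea behind Christoffersen's test can be sketched in a few lines. Below is a minimal pure-Python version of the independence component of the statistic (function name and example hit sequences are illustrative); under the null of independent exceptions it is asymptotically chi-square with 1 degree of freedom, so values above roughly 3.84 reject at the 5% level:

```python
import math

def lr_independence(hits):
    """Christoffersen (1998) independence statistic for a 0/1 VaR
    exception sequence. Large values signal clustered exceptions."""
    n = [[0, 0], [0, 0]]                      # transition counts n[i][j]
    for prev, cur in zip(hits, hits[1:]):
        n[prev][cur] += 1

    def xlogy(a, b):                          # convention: 0 * log 0 = 0
        return 0.0 if a == 0 else a * math.log(b)

    n00, n01, n10, n11 = n[0][0], n[0][1], n[1][0], n[1][1]
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)
    pi01 = n01 / (n00 + n01) if n00 + n01 else 0.0
    pi11 = n11 / (n10 + n11) if n10 + n11 else 0.0
    ll0 = xlogy(n00 + n10, 1 - pi) + xlogy(n01 + n11, pi)
    ll1 = (xlogy(n00, 1 - pi01) + xlogy(n01, pi01)
           + xlogy(n10, 1 - pi11) + xlogy(n11, pi11))
    return -2.0 * (ll0 - ll1)                 # ~ chi-square(1) under H0

# exceptions arriving in back-to-back pairs vs. the same count spread out
clustered = ([0] * 20 + [1, 1]) * 3
spread = ([0] * 15 + [1]) * 4 + [0] * 3
```

The clustered sequence produces a statistic above the 3.84 critical value; the spread-out one does not, even though both have a similar number of exceptions.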

- Classical GARCH models generally give good results in financial modeling, where high volatility is common.
- A well-fitted model typically has about the right persistence (see below), with the alpha1 parameter somewhere between 0 and 0.1 and the beta1 parameter between 0.9 and 1.
- Neither model family was able to respond correctly to the COVID-19 financial market crashes, hence the high number of exceptions in the last analyzed period.
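
As a sanity check on those parameter ranges, here is a minimal pure-Python GARCH(1,1) simulator (the function name and omega value are illustrative, not from the text) using alpha1 and beta1 inside the stated intervals:

```python
import math
import random

def simulate_garch11(n, omega=0.00001, alpha1=0.05, beta1=0.93, seed=0):
    """Simulate a GARCH(1,1) with normal innovations. alpha1 in (0, 0.1)
    and beta1 in (0.9, 1) as in the text; their sum (the persistence)
    must stay below 1 for a finite long-run variance."""
    rng = random.Random(seed)
    var = omega / (1 - alpha1 - beta1)       # start at unconditional variance
    returns, variances = [], []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0, 1)
        returns.append(r)
        variances.append(var)
        var = omega + alpha1 * r * r + beta1 * var
    return returns, variances

rets, variances = simulate_garch11(5000)
uncond = 0.00001 / (1 - 0.05 - 0.93)         # long-run (unconditional) variance
```

With persistence 0.98, the simulated conditional variance wanders slowly around the unconditional level, producing the familiar clusters of calm and turbulent stretches.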

We propose GARCHNet, a nonlinear approach to conditional variance that combines LSTM neural networks with maximum likelihood estimators in GARCH. The variance distributions considered in the paper are normal, t and skewed t, but the approach allows extension to other distributions. Our results confirm the validity of the solution, but we provide some directions for its further development. According to Lim et al. (2019), the best approach to using machine learning in the time series domain is not to fully replace statistical and econometric approaches.

But when I tried this command, it created a runtime error which ideally would have terminated R. Instead R hung around doing whatever zombies do — dead and not wanting to go away. Maybe it is the way to do it, but I don’t plan on trying it again for a while. EGARCH is a clever model that makes some things easier and other things harder. The default is to use essentially uninformative priors — presumably this problem demands some information. In practice we would probably want to give it informative priors anyway.

Their results reveal that this approach leads to a reduction in MAPE of about 10%. An important task of modeling conditional volatility is to generate accurate forecasts for both the future value of a financial time series as well as its conditional volatility. Volatility forecasts are used for risk management, option pricing, portfolio allocation, trading strategies and model evaluation. ARCH and GARCH models can generate accurate forecasts of future daily return volatility, especially over short horizons, and these forecasts will eventually converge to the unconditional volatility of daily returns.

## GARCH Model

It should be noted that most of these exceedances occurred in the last two periods. For the other models, the overhead is much smaller and sometimes negative. However, we believe that the predictive power of such a model could be improved with a better neural architecture. Here \(\Gamma(\cdot)\) is the gamma function, and the log likelihood is the logarithm of the density of the t distribution. Each variant of GARCH can be used to accommodate the specific qualities of stock, industry, or economic data.
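
For reference, the density being alluded to is that of a standardized Student-t with \(\nu > 2\) degrees of freedom (a sketch; the paper's exact parameterization may differ):

\[
f(z_t \mid \nu) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)\sqrt{\pi(\nu-2)}} \left(1 + \frac{z_t^2}{\nu - 2}\right)^{-\frac{\nu+1}{2}}, \qquad z_t = \frac{\varepsilon_t}{\sigma_t},
\]

so the log likelihood sums \(\ln f(z_t \mid \nu) - \ln \sigma_t\) over the observations, with \(\sigma_t\) coming from the conditional variance recursion.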

Lots of data, as in it would like tens of thousands of daily observations. This is a very desirable feature of a VaR model, since in the case of an exception the potential loss is not as severe. However, from the company’s point of view, the GARCHNet models do not look as good. In most cases the values of the company’s cost function are the worst; only in a few cases was the value of the cost function for the GARCHNet model lower.

## Forecasting Volatility: Deep Dive into ARCH & GARCH Models

Almost always, the volatility state we want is the state at the end of the data: we want to use the current state of volatility to peek into the future. If the volatility clustering is properly explained by the model, there will be no autocorrelation in the squared standardized residuals; it is common to run a Ljung-Box test for this autocorrelation. We look at volatility clustering and some aspects of modeling it with a univariate GARCH(1,1) model. Since model diagnostics are an often-overlooked step, we will spend some time assessing whether our fitted model is valid and provide direction for next steps.
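
The Ljung-Box diagnostic is simple enough to write out directly. A minimal pure-Python sketch (function name and example series are illustrative); applied to squared standardized residuals, a large Q means leftover volatility clustering:

```python
import random

def ljung_box(x, max_lag=10):
    """Ljung-Box Q statistic. Under the null of no autocorrelation it is
    approximately chi-square with max_lag degrees of freedom."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    q = 0.0
    for k in range(1, max_lag + 1):
        ck = sum((x[i] - mean) * (x[i - k] - mean) for i in range(k, n)) / n
        rk = ck / c0                      # sample autocorrelation at lag k
        q += rk * rk / (n - k)
    return n * (n + 2) * q

rng = random.Random(42)
noise = [rng.gauss(0, 1) for _ in range(500)]          # no autocorrelation
q_noise = ljung_box(noise)
q_alt = ljung_box([i % 2 for i in range(200)], max_lag=5)  # heavy autocorrelation
```

For iid noise, Q stays near its degrees of freedom; for the strongly autocorrelated alternating series, it explodes.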

## 1 GARCH Models

GARCH processes differ from homoskedastic models, which assume constant volatility and are used in basic ordinary least squares (OLS) analysis. OLS minimizes the deviations between data points and a regression line in order to fit those points. With asset returns, volatility varies during certain periods and depends on past variance, making a homoskedastic model suboptimal. Estimating a GARCH model is mostly about estimating how fast the decay is. The decay it sees is very noisy, so it wants to see a lot of data.
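
"How fast the decay is" has a convenient summary number: the half-life of a variance shock, which depends only on the persistence alpha1 + beta1. A small sketch (function name is illustrative):

```python
import math

def volatility_half_life(alpha1, beta1):
    """Days for a variance shock to decay halfway back to the long-run
    level in a GARCH(1,1): ln(1/2) / ln(alpha1 + beta1)."""
    persistence = alpha1 + beta1
    assert 0 < persistence < 1, "stationarity requires alpha1 + beta1 < 1"
    return math.log(0.5) / math.log(persistence)

hl_fast = volatility_half_life(0.05, 0.90)   # persistence 0.95 -> ~13.5 days
hl_slow = volatility_half_life(0.05, 0.94)   # persistence 0.99 -> ~69 days
```

Moving beta1 from 0.90 to 0.94 quintuples the half-life, which is why small changes in the estimated decay matter so much, and why the estimator needs so much data to pin it down.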

The simplicity of VaR has not stopped numerous alternative approaches from being proposed (Engle & Manganelli, 2004; Barone-Adesi et al., 2008; Wang et al., 2010). The best examples of such situations are the financial crisis of 2008 (Degiannakis et al., 2012) and the more recent market crash caused by COVID-19 (Omari et al., 2020). Therefore the financial industry—both regulators and financial institutions—is turning to a better, probabilistic way of estimating risk based on past events that is able to quickly adjust to recent shocks (So & Philip, 2006). As of writing, the official measure of market risk is either Value at Risk or Expected Shortfall (ES), as proposed by the Basel Committee. ES estimates the expected value of the potential loss given that the loss on a given asset exceeds VaR.
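
The relationship between the two measures is easiest to see with historical simulation. A minimal sketch (function name and sample returns are illustrative; real implementations interpolate the quantile):

```python
def var_es(returns, level=0.99):
    """Historical-simulation VaR and Expected Shortfall, with losses
    reported as positive numbers. ES averages the losses beyond VaR."""
    losses = sorted(-r for r in returns)      # largest losses last
    idx = int(level * len(losses))            # simple empirical quantile
    var = losses[idx - 1] if idx > 0 else losses[0]
    tail = [l for l in losses if l > var]     # losses exceeding VaR
    es = sum(tail) / len(tail) if tail else var
    return var, es

# hypothetical sample: 98 small gains plus two large losses
sample = [0.001] * 98 + [-0.05, -0.10]
var99, es99 = var_es(sample)
```

Here the 99% VaR is the 5% loss, while ES, conditioning on the loss exceeding VaR, picks up the 10% loss; ES is always at least as large as VaR.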

## Forecasting Volatility

Meanwhile, risk refers to the probability of losing money when investing in a particular stock. Figure 2 shows the relationship between GARCH and GARCHNet predictions with innovations from a t distribution. The GARCHNet predictions do not deviate from the rate of return; moreover, for some intervals GARCHNet picks up volatility shocks much faster. It can also be noted that the GARCHNet model tends to estimate a higher VaR than GARCH, except for the most recent period, where the relationship is reversed. Statistically, the number of exceptions follows a binomial distribution (assuming the exceptions are IID). The Basel Committee strictly regulates which values constitute a “safe zone” and which require a look at the model.
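
That binomial logic is exactly what the Kupiec proportion-of-failures test formalizes. A minimal pure-Python sketch (function name and example counts are illustrative); the statistic is asymptotically chi-square(1), so values above roughly 3.84 flag the model at the 5% level:

```python
import math

def kupiec_pof(n, exceptions, p=0.01):
    """Kupiec proportion-of-failures statistic: do we observe about n*p
    VaR exceptions, as the binomial model implies?"""
    x = exceptions

    def xlogy(a, b):                      # convention: 0 * log 0 = 0
        return 0.0 if a == 0 else a * math.log(b)

    pi_hat = x / n                        # observed exception rate
    ll_null = xlogy(n - x, 1 - p) + xlogy(x, p)
    ll_alt = xlogy(n - x, 1 - pi_hat) + xlogy(x, pi_hat)
    return -2.0 * (ll_null - ll_alt)      # ~ chi-square(1) under H0

ok = kupiec_pof(250, 3)      # ~2.5 exceptions expected at 99% over 250 days
bad = kupiec_pof(250, 12)    # far too many exceptions
```

Three exceptions in a 250-day year is consistent with a 99% VaR model; twelve is not, which is the kind of evidence that pushes a model out of the safe zone.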

GARCH models assume that the variance of the error term follows an autoregressive moving average process. The estimation of the ARCH-GARCH model parameters is more complicated than the estimation of the CER model parameters. There are no simple plug-in principle estimators for the conditional variance parameters.
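
Because there is no plug-in formula, the parameters are found by numerically maximizing the likelihood. A minimal sketch of the Gaussian GARCH(1,1) negative log likelihood (function name, parameter values, and the simulated series are illustrative):

```python
import math
import random

def garch11_nll(returns, omega, alpha1, beta1):
    """Negative Gaussian log likelihood of a GARCH(1,1). In practice an
    optimizer minimizes this over (omega, alpha1, beta1)."""
    var = sum(r * r for r in returns) / len(returns)   # starting variance
    nll = 0.0
    for r in returns:
        nll += 0.5 * (math.log(2 * math.pi) + math.log(var) + r * r / var)
        var = omega + alpha1 * r * r + beta1 * var     # variance recursion
    return nll

# simulate a series from known parameters, purely for illustration
rng = random.Random(7)
true = (0.00001, 0.05, 0.93)
var = true[0] / (1 - true[1] - true[2])
rets = []
for _ in range(2000):
    r = math.sqrt(var) * rng.gauss(0, 1)
    rets.append(r)
    var = true[0] + true[1] * r * r + true[2] * var

nll_true = garch11_nll(rets, *true)                    # likelihood at truth
nll_wrong = garch11_nll(rets, 0.00001, 0.30, 0.50)     # clearly wrong decay
```

On the simulated data the true parameters attain a lower negative log likelihood than a badly mis-specified decay, which is the signal the optimizer climbs.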

The natural frequency of data to feed a GARCH estimator is daily data. You can use weekly or monthly data, but that smooths some of the garch-iness out of the data. The Jarque-Bera test shows that our standardized residuals do not follow the normal distribution. While beyond the scope of this post, the next step is to specify heavier-tailed distributions for the error terms in the GARCH model, such as the Student’s t-distribution. Another suitable and more pragmatic approach would be to make only one-step-ahead predictions based on information up until time t, updating the model in real time as time t becomes time t+1, t+2, etc. More simply, with each passing day, we update Xₜ, Xₜ₊₁, Xₜ₊₂ with the actual, observed return from that day to model one time step ahead.
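
The Jarque-Bera statistic itself is a short computation from the sample skewness and kurtosis. A minimal pure-Python sketch (function name and example samples are illustrative); for normal data it is approximately chi-square(2), so large values reject normality:

```python
import math
import random

def jarque_bera(x):
    """Jarque-Bera statistic: (n/6) * (S^2 + (K - 3)^2 / 4), where S is
    the sample skewness and K the sample kurtosis."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

rng = random.Random(3)
normal_sample = [rng.gauss(0, 1) for _ in range(3000)]
heavy_sample = [rng.gauss(0, 1) ** 3 for _ in range(3000)]  # heavy-tailed
```

The heavy-tailed sample, much like standardized residuals from financial returns, produces an enormous statistic, which is the cue to move to a t-distributed error term.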

There exists a broad family of models that aim to capture such effects, the most common being the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model proposed by Bollerslev (1986). Since financial markets typically exhibit known stylized facts, a more fitting approach is to use a fat-tailed distribution (Aloui & Mabrouk, 2010). The introduction of distributions such as the t distribution or GED, which allow for modeling skewness and heavy tails, has dispelled doubts about the validity of GARCH models (BenSaïda, 2015; Bonato, 2012). GARCH was developed in 1986 by Tim Bollerslev, a doctoral student at the time, as a way to address the problem of forecasting volatility in asset prices.

If you are used to looking at p-values from goodness-of-fit tests, you might notice something strange: the tests are saying that we have overfit 1547 observations with 4 parameters. Note that volatility from announcements (as opposed to shocks) goes the other way around: volatility builds up as the announcement time approaches, and then goes away once the results of the announcement are known. We know that the kurtosis of the normal distribution is 3, which is useful as a reference: if greater than 3, the sample is heavier-tailed than the normal distribution; if less than 3, lighter-tailed.

Rather, they propose to combine the best of both worlds, hence the idea of this paper is to model conditional variance using NN. Several studies have already been produced at the intersection of GARCH and NN models. For example, Arnerić et al. (2014) proposed modeling time series using the GARCH model, but with an extension to RNNs called Jordan NNs. Similar studies by Kristjanpoller and Minutolo (2015, 2016) propose an ANN-GARCH model, and their results show a 25% reduction in mean absolute percentage error (MAPE). Research by Kim and Won (2018) goes a step further, incorporating an LSTM layer into the neural network and reporting a 37.2% decrease in mean absolute error (MAE). Yet another approach, proposed by Jeong and Lee (2019), considers the RNN model to determine the autoregressive moving average (ARMA) process, which drives not the conditional variance but the conditional mean.
