# Regime Switching Models With Hidden Markov Models: Bull Or Bear Market?

May 12, 2018

Data Science
Finance

Financial markets can be in various states, such as recession/expansion or bearish/bullish/sideways. Different (algorithmic) trading strategies might perform differently depending on which state (hereafter called “regime”) the market is in. Regime switching is therefore a well-researched topic in finance. This article explores some simple models that try to detect when these regimes occur and how probable each regime is given the market data.

## The Data

The S&P 500 is an index of 500 of the largest US companies and serves as a benchmark that individual companies are compared against. It is also commonly used as the reference when comparing (algorithmic) trading strategies. If a strategy performs worse than or roughly equal to the index, we could just as well have invested in an ETF (exchange-traded fund) such as the SPDR® S&P 500 ETF (`SPY`), which replicates the performance of the whole index. That’s why `SPY` will be the reference for this analysis as well.

```
# load quantmod (provides getSymbols, Cl and chartSeries) and download the SPY data
library(quantmod)
getSymbols("SPY")
```

`## [1] "SPY"`

```
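# daily log returns of the closing price; diff() on an xts object keeps the
# first observation as NA, so `returns` has one entry per row of SPY
returns <- as.numeric(diff(log(Cl(SPY))))
z.returns <- zoo(returns, order.by = index(SPY))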
```

`chartSeries(SPY, theme='white')`

The plot above visualizes the price of the `SPY` ETF as well as its volume. One time span that immediately stands out is the year 2009, in which the recession induced by the subprime crisis was well underway. The price drop and the spike in volume show how the recession manifested itself in the market, with a lot of players buying the dip, since what goes down must come back up, at least most of the time.

## Two State HMM

```
# two-state HMM with an intercept-only Gaussian response for the returns
library(depmixS4)
hmm.1 <- depmix(returns ~ 1, family = gaussian(), nstates = 2, data = data.frame(returns = returns))
hmmfit.1 <- fit(hmm.1, verbose = FALSE)
```

`## converged at iteration 44 with logLik: 9165.435`
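For reference, the model specified by `depmix(returns ~ 1, family = gaussian(), nstates = 2)` can be written as a two-state Gaussian HMM. The symbols below ($r_t$ for the return, $S_t$ for the hidden state, $\mu_k$, $\sigma_k$ and $a_{ij}$ for the parameters) are notation introduced here for illustration and do not appear in the code:

```latex
\begin{aligned}
r_t \mid S_t = k &\sim \mathcal{N}(\mu_k, \sigma_k^2), \qquad k \in \{1, 2\},\\
P(S_t = j \mid S_{t-1} = i) &= a_{ij}, \qquad \sum_{j=1}^{2} a_{ij} = 1 .
\end{aligned}
```

Each state has its own mean and standard deviation of returns, and the transition probabilities $a_{ij}$ govern how likely the market is to stay in a state or switch. `fit()` estimates all of these parameters from the data.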

```
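# posterior() returns the most likely state per day (`state`) plus one probability column per state (S1, S2)
post_probs.1 <- posterior(hmmfit.1)
regimes.1 <- post_probs.1$state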
```

We can plot the probabilities of the two states to see how distinct they are:

```
library(ggplot2)
library(ggthemes)  # provides theme_hc()
df.probs <- data.frame(x=index(SPY), probs=post_probs.1)
ggplot(df.probs, aes(color=factor(probs.state))) +
  geom_line(aes(x=x, y=probs.S1)) +
  geom_line(aes(x=x, y=probs.S2)) +
  theme_hc() +
  xlab("Date") +
  ylab("State Probability")
```

Most of the time the two probabilities are clearly separated, with one close to 1 and the other close to 0, so the model assigns most days to one of the two states with high confidence.
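How distinct the states are can also be put into a number. The following is a minimal sketch on made-up probabilities that stand in for the `S1`/`S2` columns; the data frame `probs`, its values and the 0.9 threshold are assumptions chosen for illustration, not results from the fitted model:

```r
# Hypothetical stand-in for the S1/S2 probability columns; values are made up.
probs <- data.frame(
  S1 = c(0.99, 0.97, 0.60, 0.05, 0.01, 0.02),
  S2 = c(0.01, 0.03, 0.40, 0.95, 0.99, 0.98)
)

# Confidence of the assignment on each day: the larger of the two probabilities.
confidence <- pmax(probs$S1, probs$S2)

# Share of days on which one state has a probability of at least 0.9
# (the threshold is an arbitrary choice).
share_confident <- mean(confidence >= 0.9)
share_confident
```

Applied to the fitted model, the same two lines would use `post_probs.1$S1` and `post_probs.1$S2` instead of the toy columns.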

The returns colored by state look like this:

```
lr <- as.numeric(diff(log(SPY$SPY.Close), lag=1)$SPY.Close)
df.1 <- data.frame(date=index(SPY), return=returns, state=regimes.1, log.return=lr)
g1 <- ggplot(df.1, aes(x=date, y=return, color=factor(state), group=1)) +
geom_line() +
xlab("Date") +
ylab("Return") +
theme_bw()
g1
```

The next thing we’ll have a look at is what constitutes the states. What’s the return distribution? Is it high/low volatility, bear/bull market or something entirely different?

```
ggplot(df.1) +
geom_density(aes(x=return, group=state, fill=factor(state)), alpha=.5, adjust=2) +
xlab("Returns") +
ylab("Density") +
theme_bw()
```

The returns in both states are roughly[2] normally distributed around zero. Since both means are near zero, neither state shows a noticeable trend in daily returns; what separates the states is their spread. Using the standard deviation of the normal approximation as a measure, the volatility[3] of the first state is 0.006800447, whereas for the second state it is 0.02139266, roughly three times as high.
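The per-state moments can be computed with a grouped summary. The sketch below runs on simulated toy data so that it is self-contained; the data frame `toy`, the seed and the simulation parameters are assumptions for illustration and are not the SPY data:

```r
set.seed(42)  # arbitrary seed, only so the toy data are reproducible

# Toy data (made-up parameters): 500 calm days followed by 500 turbulent days.
toy <- data.frame(
  return = c(rnorm(500, mean = 0, sd = 0.007), rnorm(500, mean = 0, sd = 0.021)),
  state  = rep(c(1, 2), each = 500)
)

# Mean and volatility (standard deviation of returns) within each state.
state_mean <- tapply(toy$return, toy$state, mean, na.rm = TRUE)
state_vol  <- tapply(toy$return, toy$state, sd, na.rm = TRUE)

round(state_mean, 4)
round(state_vol, 4)
```

On the real data the equivalent call would be `tapply(df.1$return, df.1$state, sd, na.rm = TRUE)`; `na.rm = TRUE` matters there because the first return is `NA`.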

The original time series (closing prices) with the states as colors now looks like this:

```
df.close <- data.frame(date=index(SPY), return=df.1$return, close=as.numeric(SPY$SPY.Close), state=df.1$state)
ggplot(df.close) +
geom_line(aes(x=date, y=close, color=factor(state), group=1)) +
xlab("Date") +
ylab("Closing Price") +
theme_bw()
```

It seems the model picked up some kind of distinction between bearish and bullish markets! Since the two states differ in volatility and not in mean return, this is consistent with the saying that “bearish markets are high volatility”[4].
We found a simple model that we can use to differentiate between a bear and a bull market, which allows us to choose between strategies based on how they perform in each regime. Additionally, we can now easily label which date period corresponds to which market type without having to rely on intuition or external data.
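Turning the daily state labels into date periods can be sketched with run-length encoding. The example uses made-up dates and states so that it runs on its own; `dates`, `states` and `periods` are hypothetical names introduced here:

```r
# Toy inputs (made up): ten consecutive calendar days and one state per day.
dates  <- seq(as.Date("2020-01-01"), by = "day", length.out = 10)
states <- c(1, 1, 1, 2, 2, 1, 1, 1, 1, 2)

# Run-length encoding finds stretches of identical consecutive states.
runs      <- rle(states)
end_idx   <- cumsum(runs$lengths)
start_idx <- end_idx - runs$lengths + 1

# One row per regime period, with its first and last date and its length.
periods <- data.frame(
  state = runs$values,
  start = dates[start_idx],
  end   = dates[end_idx],
  days  = runs$lengths
)
periods
```

With the objects from this article, `dates` would be `index(SPY)` and `states` would be `df.1$state`.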

[1] See this overview.

[2] Except for the long tails; which distribution returns actually follow seems to be an ongoing research topic.

[3] Defined as the standard deviation of log returns.

[4] Also see this article over at Bloomberg.