Bitcoin And The Chinese New Year

May 8, 2018
Data Science Bitcoin Cryptocurrencies

I’ve often read that the correction at the beginning of the year is seasonal and that it always occurs because of the Chinese New Year. I wanted to take a quick look at the data to confirm or dismiss this claim. Maybe there are even more interesting patterns to detect here.

knitr::opts_chunk$set(echo = TRUE)

library(crypto)
library(tidyverse)
library(ggthemes)
btc <- crypto::getCoins("bitcoin")
df.btc <- data.frame(Date=btc$date %>% as.Date, Close=btc$close)
df.btc$Month <- months(df.btc$Date)
df.btc$Year <- format(df.btc$Date, format="%Y")
# day-over-day change of the closing price (in USD, not a percentage);
# NaN pads the first row because diff() returns one element fewer
df.btc$return <- c(NaN, df.btc$Close %>% diff)
df.btc$Month <- ordered(df.btc$Month, levels=c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"))

# these are from wikipedia
# the `to` date always refers to the actual day
# of the new year
ny_dates <- c('2013-01-21','2013-02-10',
  '2014-01-11', '2014-01-31',
  '2015-01-30', '2015-02-19',
  '2016-01-19', '2016-02-08',
  '2017-01-08', '2017-01-28',
  '2018-01-27', '2018-02-16')

# flag the rows that fall exactly on one of the boundary dates above
df.btc$is.cny <- format(df.btc$Date, "%Y-%m-%d") %in% ny_dates
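
The boundary dates in `ny_dates` do not have to be typed by hand. As a minimal sketch, assuming each window covers the 20 days before the holiday plus the holiday itself (which matches the pairs listed above), the start dates and the full day-by-day ranges can be derived from the six New Year dates alone. The names `cny_days`, `window_start` and `cny_windows` are illustrative and not part of the original analysis.

```r
# The six Chinese New Year dates used in this post (the `to` dates above)
cny_days <- as.Date(c("2013-02-10", "2014-01-31", "2015-02-19",
                      "2016-02-08", "2017-01-28", "2018-02-16"))

# Assumption: every window starts 20 days before the holiday
window_start <- cny_days - 20

# One daily sequence per year, concatenated into a single Date vector
cny_windows <- do.call(c, lapply(seq_along(cny_days), function(i) {
  seq(from = window_start[i], to = cny_days[i], by = "day")
}))

print(format(window_start))
print(length(cny_windows))
```

A vector built this way could be matched against the `Date` column directly with `%in%`.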

The OHLCV data is obtained from CoinMarketCap via the crypto package. In all its glory, it looks like this:

ggplot(df.btc, aes(y=Close, x=Date)) +
  geom_line() +
  theme_hc() +
  xlab("Date") + 
  ylab("Closing Price")

There are (arguably) some slight down periods at the start of the year, noticeably in 2014, 2017 and 2018.

The average and cumulative returns per month should show a pattern if the “Chinese New Year hypothesis” turns out to be true.

# average returns
agg.avg.returns <- data.frame(aggregate(return ~ Month + Year, df.btc, mean))
agg.avg.returns$Month <- ordered(agg.avg.returns$Month, levels=c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"))
# print(agg.avg.returns)
  
# the labels are USD price changes, so they carry no percent sign
ggplot(agg.avg.returns, aes(x=Month, y=Year, fill=return, label = sprintf("%.1f", return))) +
  geom_tile() +
  scale_fill_gradient2(high="green", low="red", midpoint=0) +
  geom_text(size=3, colour="black") +
  ggtitle("Average Daily BTC Returns Per Month") + 
  theme_minimal() + theme(panel.grid.major=element_blank(), panel.grid.minor=element_blank())

agg.cum.returns <- data.frame(aggregate(return ~ Month + Year, df.btc, sum))

agg.cum.returns$Month <- ordered(agg.cum.returns$Month, levels=c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"))

# the labels are USD price changes, so they carry no percent sign
ggplot(agg.cum.returns, aes(x=Month, y=Year, fill=return, label = sprintf("%.1f", return))) +
  geom_tile() +
  scale_fill_gradient2(high="green", low="red", midpoint=0) +
  geom_text(size=3, colour="black") + ggtitle("Cumulative Daily BTC Returns Per Month") +
  theme_minimal() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())

Maybe the years show some kind of pattern when visualized one by one. In the plot below, the first circle in each panel marks the start of the three-week period before the new year, and the second circle marks the actual day the Chinese New Year is celebrated. If the initial hypothesis is correct, the line connecting the two should trend downward.

# this is a nasty trick to align the data per year
# on the same x ticks
# "%Y" parses the four-digit 2000; "%y" would only read "20" and yield 2020
df.btc$DateIdx <- format(df.btc$Date, "%m/%d/2000") %>% as.Date("%m/%d/%Y")
df.annotate <- data.frame(x=df.btc$DateIdx[which(df.btc$is.cny)], y=df.btc$Close[which(df.btc$is.cny)], Year=df.btc$Year[which(df.btc$is.cny)])

ggplot(df.btc, aes(x=DateIdx, y=Close, color=Year, fill=is.cny)) +
  geom_line() +
  facet_grid(Year ~ ., scales = "free") +
  theme_hc() +
  scale_x_date(name="Month", date_labels="%b", date_breaks="1 month") +
  geom_point(data=df.annotate, aes(x=x, y=y, color=Year, shape="a"), inherit.aes=FALSE, size=3, show.legend = FALSE) +
  scale_y_continuous(name="Closing Price") +
  theme(panel.spacing = unit(2, "lines"))
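
The alignment trick used for the plot can be tried on a few dates in isolation. This is a minimal sketch, assuming a hypothetical helper `align_to_year` that is not part of the original code: it keeps month and day and swaps the year for a fixed reference year. A leap year such as 2000 is a convenient reference because February 29 stays a valid date.

```r
# Hypothetical helper: keep month and day, replace the year with ref_year
align_to_year <- function(dates, ref_year = 2000) {
  as.Date(format(dates, paste0(ref_year, "-%m-%d")))
}

# Dates from different years, including a leap day
d <- as.Date(c("2014-01-31", "2016-02-29", "2018-02-16"))
aligned <- align_to_year(d)

print(aligned)
```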

2014 and 2018 are the only years displaying a downward trend in the three weeks prior to the Chinese New Year. Another interesting thing to look at is the distribution of returns per year. We have seen the average and cumulative returns per month, but the period of interest may span multiple months.

ny_dates_ranges <- c(seq(from = as.Date('2013-01-21'), to = as.Date('2013-02-10'), "day"),
  seq(from = as.Date('2014-01-11'), to = as.Date('2014-01-31'), "day"),
  seq(from = as.Date('2015-01-30'), to = as.Date('2015-02-19'), "day"),
  seq(from = as.Date('2016-01-19'), to = as.Date('2016-02-08'), "day"),
  seq(from = as.Date('2017-01-08'), to = as.Date('2017-01-28'), "day"),
  seq(from = as.Date('2018-01-27'), to = as.Date('2018-02-16'), "day"))

# both sides are Date vectors, so they can be matched directly
df.btc$cny.range <- df.btc$Date %in% ny_dates_ranges
df.cny <- df.btc[which(df.btc$cny.range),]

print(paste("cumulative return: ", sum(df.cny$return)))
## [1] "cumulative return:  -970.01"
summary(df.cny$return)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -1321.740   -10.030    -0.270    -9.238    15.480   896.320

The cumulative return, the mean and the median are all negative, which hints at bearish periods for most of the data (read: most of the years). We know that’s not true, though. The returns used here are absolute price changes in USD, so 2018, with its much higher price level, is an outlier that dominates the whole distribution unless it is accounted for.
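
To illustrate why the price level matters, here is a small sketch with made-up prices (not BTC data). It compares absolute day-over-day differences, as used in this post, with relative returns. The helper `relative_return` is a hypothetical name introduced for this example.

```r
# Hypothetical helper: day-over-day return as a fraction of the previous close
relative_return <- function(close) diff(close) / head(close, -1)

# Made-up prices: the same -10% / +10% moves at two different price levels
low_price  <- c(100, 90, 99)
high_price <- c(10000, 9000, 9900)

# Absolute differences scale with the price level ...
abs_low  <- diff(low_price)
abs_high <- diff(high_price)

# ... while relative returns are identical for both series
rel_low  <- relative_return(low_price)
rel_high <- relative_return(high_price)

print(rbind(abs_low, abs_high))
print(rbind(rel_low, rel_high))
```

Pooling the absolute differences of both series would be dominated by the high-priced one. Relative returns are one possible way to make years with very different price levels comparable.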

NB: we have to be careful when working with returns. If a period starts with a few strongly negative returns and continues with small but positive ones, the overall trend can still be negative. Hence it comes down to the distribution of returns.
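
A toy example makes the point concrete. The returns below are made up and expressed as fractions, which is an assumption for illustration only, since the analysis in this post uses absolute price changes.

```r
# Made-up daily returns: one sharp drop followed by four small gains
r <- c(-0.30, 0.05, 0.05, 0.05, 0.05)

# Compound the returns to get the price relative to the starting value
growth <- cumprod(1 + r)

# Net change over the whole period
final_change <- tail(growth, 1) - 1

print(median(r))     # typical day is positive
print(final_change)  # period as a whole is negative
```

Four out of five days are positive and the median is positive, yet the period ends roughly 15% below where it started.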

ggplot(df.cny, aes(x=Year, y=return, color=Year, fill=Year)) +
  geom_violin() +
  theme_hc() +
  scale_y_continuous("Return")

A negative period before the Chinese New Year would result in a violin positioned completely below zero. None of the years show this behavior. We’ll have a final look at the structure of each period compared to the others.

ggplot(df.cny, aes(x=DateIdx, y=Close)) +
  geom_line() +
  facet_grid(Year ~ ., scales = "free") +
  geom_smooth(method = "lm") +
  # theme_hc() is a complete theme, so it has to come before the theme() tweak
  theme_hc() +
  theme(panel.spacing = unit(2, "lines")) +
  scale_x_date(name="Time") +
  scale_y_continuous(name="Closing Price")
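
The direction of the fitted lines in the plot can also be checked numerically. The following is a minimal sketch, assuming a data frame with `Date`, `Close` and `Year` columns like `df.cny`. The helper `window_slopes` and the toy prices are made up for illustration and are not results from the real data.

```r
# Hypothetical helper: slope of a linear fit of Close over time, per year
window_slopes <- function(df) {
  sapply(split(df, df$Year), function(d) {
    fit <- lm(Close ~ as.numeric(Date), data = d)
    unname(coef(fit)[2])  # price change per day
  })
}

# Made-up prices: one falling window and one rising window
toy <- data.frame(
  Date  = as.Date(c("2014-01-11", "2014-01-12", "2014-01-13",
                    "2015-01-30", "2015-01-31", "2015-02-01")),
  Close = c(900, 880, 850, 230, 235, 245),
  Year  = c("2014", "2014", "2014", "2015", "2015", "2015")
)

slopes <- window_slopes(toy)
print(slopes)
```

A negative slope corresponds to a downward-sloping line in that year's panel, so calling `window_slopes(df.cny)` would give one sign per year to compare against the hypothesis.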

Unlike the previous plot, where we compared whole years with each other, the data now shows a bearish (downward) trend every second year. Still, I’m not sold on the hypothesis that all the Chinese bitcoin users sell off a small portion of their holdings and thereby drive the market down through the increased volume. If you think otherwise, let me know in the comments.
