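Technical analysis aims to forecast future price movements in the stock market, and its indicators will be the key tools that we use to analyze COVID-19 data. In this section, we first introduce the notion of a trend (“Trends”) and candlestick representations of data (“Candlestick representations”), and then outline some of the main technical indicators: candlestick patterns (“Candlestick patterns”), the Moving Average Convergence Divergence (“Moving average convergence divergence”), and the Relative Strength Index (“Relative strength index”).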
In subsequent sections, we will investigate the predictive power of these technical indicators on COVID-19 data.
Trends
While stock prices and daily COVID cases may fluctuate on shorter time frames, they have an observed tendency to evolve in the same direction for extended periods of time. Long-term trends in the stock market may be linked to macroeconomic factors such as monetary policy or, in the case of individual companies, to particular news or sentiment that results in continual increases or decreases of the stock price. For COVID-19, the growth of new cases is due to the fact that the virus is very infectious and the population was (and remains) highly vulnerable, which led to a significant initial uptrend. As a result, there is typically a well-defined notion of “trends” in time series such as these. A simple way to identify trends in a time series is through a linear regression (as we detail below); Fig. 1 presents an example in terms of the daily change in asset prices.
A period for which the gradient of the regression line has a fixed sign indicates a trend in the time series, which can be either an uptrend or a downtrend. Identifying such trends can provide insight into the likely subsequent behavior of the time series. Technical analysis was originally developed to provide signals for the start and end of trends in stock market data, and here we apply these techniques to alternative time series data sets.
We introduce an integer-valued time variable, focusing in particular on daily and weekly intervals counted from some particular start date. A natural question to ask is whether, for a given date D, the time series exhibits a preexisting trend over the last \(\delta \) days. If such a trend exists we will say that the data exhibits a \(\delta \)-interval trend. Additionally:
-
A trend is called bullish if the gradient of the trend is strictly positive.
-
Conversely, a trend is said to be bearish if its gradient is strictly negative.
In what follows we shall often partition the time series data into daily and weekly intervals. Over the course of an interval the value of the time series will vary. Following the conventions of stock market analyses, we shall track a number of characteristic features of each time interval, in particular:
-
The initial value for a given time interval is called the opening value and is denoted \(O_t\).
-
The final value for a given time interval is called the closing value and is denoted \(C_t\).
The subscript t is the index of the time interval. Since the data is discretized, typically \(C_t\ne O_{t+1}\). It is also useful to define an average value for a given time interval
$$\begin{aligned} M_t := \frac{O_t+C_t}{2}. \end{aligned}$$
(1)
To identify potential trends we apply a linear regression fitted to \(M_t\) over the range of dates \(D-\delta \le t \le D\), with residuals \(\gamma _t\) of the regression defined by
$$\begin{aligned} \gamma _t := \mathrm{abs}[l(t)-M_t] , \end{aligned}$$
(2)
where l(t) is the value of the linear regression at time t. We then define a trend function \(T(\cdot )\) which takes the set \(\{M_t\}\) as input and returns \(+1\) for an uptrend and \(-1\) for a downtrend, as follows
$$\begin{aligned} T(\cdot ):= {\left\{ \begin{array}{ll} ~1, &{} k \ge 0.005\cdot \mu , \qquad \gamma _t<0.02 \cdot \mu \\ -1, &{}k \le -0.005\cdot \mu , \quad \gamma _t<0.02 \cdot \mu \\ ~0, &{} \mathrm{otherwise} \end{array}\right. }, \end{aligned}$$
(3)
where k denotes the slope of the regression and the mean is given by \(\mu = \mathrm{mean}(\{M_t\})\). The requirement on k corresponds to an increase or decrease per interval of at least half a percent of the mean. The restriction on \(\gamma _t\) evaluates the goodness of the linear regression fit, requiring that each \(M_t\) deviate from the trend line by less than \(2\%\) of the mean. In all other cases the function returns zero, indicating that there is no robust trend in the values.
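As an illustration, the trend function of Eq. (3) can be sketched in a few lines of Python. This is not the R implementation used in this work. The names `midpoint` and `trend` are hypothetical. The sketch assumes that the window of \(M_t\) values is indexed by \(t=0,1,\ldots \) in unit steps, that the regression is an ordinary least-squares fit, and that the mean \(\mu \) is positive.

```python
def midpoint(open_value, close_value):
    """Average value M_t of an interval, Eq. (1)."""
    return (open_value + close_value) / 2


def trend(midpoints, slope_tol=0.005, resid_tol=0.02):
    """Return +1 (uptrend), -1 (downtrend) or 0 (no clear trend), following Eq. (3).

    `midpoints` holds the M_t values of the window D - delta <= t <= D.
    """
    n = len(midpoints)
    if n < 2:
        return 0  # a single point cannot define a regression line
    ts = range(n)
    t_mean = sum(ts) / n
    mu = sum(midpoints) / n
    # Ordinary least-squares slope k and intercept b of l(t) = k * t + b.
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (m - mu) for t, m in zip(ts, midpoints))
    k = sxy / sxx
    b = mu - k * t_mean
    # Every residual gamma_t = |l(t) - M_t| must stay below 2% of the mean.
    tight_fit = all(abs(k * t + b - m) < resid_tol * mu
                    for t, m in zip(ts, midpoints))
    if not tight_fit:
        return 0
    if k >= slope_tol * mu:
        return 1
    if k <= -slope_tol * mu:
        return -1
    return 0
```

For example, `trend([100, 101, 102, 103, 104])` satisfies both conditions and is classified as an uptrend, whereas a window with the same slope but large scatter about the line returns zero.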
Candlestick representations
While price movements in the stock market can be represented as a continuous curve that is smoothed by time averaging over some period (be it seconds, minutes, hours, or days), candlesticks were proposed as a tool to better visualize the movements. Candlesticks provide a summary of prices using four numbers (open, close, high, low) in a given period. In addition to the opening \(O_t\) and closing \(C_t\) values defined above, we now introduce the high \(H_t\) and the low \(L_t\), the largest and smallest values attained during the interval.
Typical lengths of the periods that candlesticks describe are one day, one hour, 30 minutes, and 5 minutes. Specifically, given a time series over a certain period, a candlestick \({\mathcal {I}}_t\) for the interval t is defined by the quadruple
$$\begin{aligned} {\mathcal {I}}_t=(O_t, C_t, H_t, L_t). \end{aligned}$$
(4)
Taking the period to be a single day, \(C_t-O_t\) is the change in value over the day: \(O_t>C_t\) indicates a decrease in the value of the time series during the day, while \(C_t>O_t\) implies an increase. Moreover, \(C_{t+6}-O_t\) is the change in value over the seven-day week beginning on day t.
A visualization of how a single candlestick is constructed from the data in the intervening period is shown in Fig. 2. Following common practice, we color the candlesticks red if \(O_t>C_t\) and green if \(C_t>O_t\), so that the color indicates whether the value decreased or increased over the period of the candlestick.
Each candlestick comprises three parts: the real body and its lower and upper shadows. The real body \(r_t\) at time t is the absolute difference between the opening value and the closing value
$$\begin{aligned} \begin{aligned} r_t = \text {abs}(O_t-C_t). \end{aligned} \end{aligned}$$
(5)
This is represented as the central solid rectangle in the visualization of Fig. 2. The lower shadows \(l_t\) and upper shadows \(u_t\) at time t are defined by
$$\begin{aligned} l_t= & {} \min (O_t, C_t)- L_t, \end{aligned}$$
(6)
$$\begin{aligned} u_t= & {} H_t- \max (O_t, C_t). \end{aligned}$$
(7)
These are represented as the thin lines which extend above and below the real body in the visualization of Fig. 2. Note that in some cases the shadows may have vanishing extent; for instance, the lower shadow vanishes for \(O_t=L_t\) with \(C_t> O_t\).
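The construction of a candlestick and its three parts, Eqs. (4) to (7), can be sketched as follows in Python. This is not the R code used in this work. The names `Candle`, `make_candle`, `real_body`, `lower_shadow`, `upper_shadow` and `colour` are hypothetical. The sketch assumes that each interval is supplied as the time-ordered list of values observed within it.

```python
from typing import NamedTuple, Sequence


class Candle(NamedTuple):
    """The quadruple (O_t, C_t, H_t, L_t) of Eq. (4)."""
    open: float
    close: float
    high: float
    low: float


def make_candle(values: Sequence[float]) -> Candle:
    """Summarise the time-ordered observations of one interval."""
    if not values:
        raise ValueError("an interval needs at least one observation")
    return Candle(values[0], values[-1], max(values), min(values))


def real_body(c: Candle) -> float:
    return abs(c.open - c.close)            # Eq. (5)


def lower_shadow(c: Candle) -> float:
    return min(c.open, c.close) - c.low     # Eq. (6)


def upper_shadow(c: Candle) -> float:
    return c.high - max(c.open, c.close)    # Eq. (7)


def colour(c: Candle) -> str:
    """Green for an increase, red for a decrease, neutral if unchanged."""
    if c.close > c.open:
        return "green"
    if c.open > c.close:
        return "red"
    return "neutral"
```

For instance, the observations `[10, 12, 9, 11]` give the candlestick (10, 11, 12, 9), which is green with a real body of 1 and shadows of length 1 on either side.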
In the context of the stock market, asset traders often use these discrete candlesticks to visualize the data, as this representation provides substantially more information than simpler line graphs of stock prices. Traders have developed a number of visual cues based on this candlestick representation, known as candlestick patterns, which are thought to forecast future asset price moves, as we discuss next.
Illustrations of the construction of candlesticks. A red candlestick represents a decrease in value during the intervening period; observe that the open price is higher than the price at close. A green candlestick, conversely, indicates an increase in value. The proportions of the candlesticks are set by the open, high, low, and close values over the period.
Examples of accurate forecasting via candlestick patterns. The x-axis provides an index of time with each candle representing one time period, while the y-axis indicates the value of some positive-valued measurable quantity (traditionally, share price). Axes values have been omitted as they are unimportant for these illustrations. The blue line indicates the 4-day trend lines established via linear regression, confirming either an appropriate uptrend or downtrend. The light colored candle indicates the start of each candlestick pattern; observe that in all cases shown the pattern corresponds to a trend reversal.
Candlestick patterns
Candlestick patterns typically involve the relative magnitude of the high, low, open, and close values of one or two consecutive candlesticks. There is widespread use of these patterns within the trading community, with the belief that specific configurations of candlesticks can be used to forecast future price movements [2]:
-
If a pattern predicts an uptrend will reverse to a downtrend, it is called a bearish reversal pattern.
-
Conversely, a bullish reversal pattern predicts a reversal of a downwards trend to an uptrend.
In this work we focus our analysis on three bearish reversal patterns (Bearish Engulfing, Hanging Man, and Dark Cloud Cover) and two bullish reversal patterns (Bullish Engulfing and Hammer). For the mathematical definition of the candlestick patterns we use the definitions proposed in [7], which place restrictions on the \(O_t, C_t,L_t, H_t\) values and require a pre-existing trend. These patterns are shown graphically in Figs. 3 and 4 and are defined mathematically in Fig. 5.
The indices appearing in the definitions of Fig. 5 denote the time ordering such that the first candle of each pattern occurs with time stamp \(D=0\), with the second candle (if any) for \(D=1\). We require a trend for the preceding \(\delta \) intervals, such that there is an appropriate trend over the period \(D-\delta \le t \le D\) as outlined in “Trends”.
We implemented code in R that takes time-series data, outputs candlestick representations, and then scans the output for specific patterns. Some example candlestick patterns identified by our code when applied to the S&P 500 Index (GSPC) daily data are presented in Fig. 4. We show the signal event which indicates the candlestick pattern in a lighter shade. The regression line through the centers of the four candlesticks preceding each candlestick pattern is shown to confirm the trend (note that we vary the required trend period in later sections).
Moving average convergence divergence
The Moving Average Convergence Divergence (MACD) [3] provides an alternative set of bullish/bearish market signals which can be repurposed for general time series data. The MACD is calculated using two exponential moving averages (EMAs), calculated over two periods of differing length n. Specifically, for a given dataset of length n, usually the closing values \(\{C_1,C_2, \ldots , C_{n}\}\), the EMA \(V_n\) is calculated recursively via
$$\begin{aligned} \begin{aligned} V_i [C_i]:= {\left\{ \begin{array}{ll} C_1 &{}i =1\\ s C_i + (1-s) V_{i-1} &{} i>1 \end{array}\right. }, \end{aligned} \end{aligned}$$
(8)
where \(s = \frac{2}{n+1}\) is the smoothing factor. Thus, \(V_n\) can be seen as the exponential average over n intervals, which by repeated substitution in the recursive formula can be expressed as
$$\begin{aligned} \begin{aligned} V_n = s\left[ C_n + (1-s)C_{n-1} + \cdots + (1-s)^{n-2}C_2\right] + (1-s)^{n-1}C_1. \end{aligned} \end{aligned}$$
(9)
Observe that the coefficient of each term decreases exponentially for earlier values in the time series, thus giving greater weighting to more recent data, hence the name. Given the EMA, the MACD is defined as the difference between the shorter-period average (over \(n_1\) intervals) and the longer-period average (over \(n_2\) intervals, so that by convention \(n_1<n_2\)), as follows
$$\begin{aligned} \begin{aligned} \text {MACD}(n_1,n_2)= V_{n_1}- V_{n_2}. \end{aligned} \end{aligned}$$
(10)
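The recursion of Eq. (8) and the MACD of Eq. (10) can be sketched in Python as below. This is not the R code used in this work. The names `ema_series`, `ema_unrolled` and `macd_series` are hypothetical. As an assumption, the recursion is run along the whole series and seeded with its first value, so that the EMA is available at every time step. The default periods are the common (12, 26) choice. The function `ema_unrolled` writes out the weighted sum obtained by substituting the recursion into itself for a dataset of length n, with the seed \(V_1=C_1\).

```python
def ema_series(values, n):
    """Running EMA following Eq. (8) with smoothing factor s = 2 / (n + 1)."""
    if not values:
        return []
    s = 2.0 / (n + 1)
    out = [values[0]]                       # V_1 = C_1
    for c in values[1:]:
        out.append(s * c + (1 - s) * out[-1])
    return out


def ema_unrolled(values):
    """V_n for a dataset of length n, written as an explicit weighted sum."""
    if not values:
        raise ValueError("need at least one value")
    n = len(values)
    s = 2.0 / (n + 1)
    # C_n, ..., C_2 carry weights s * (1 - s)**j for j = 0, ..., n - 2.
    recent = sum(s * (1 - s) ** j * values[n - 1 - j] for j in range(n - 1))
    # The seed C_1 carries weight (1 - s)**(n - 1) because V_1 = C_1.
    return recent + (1 - s) ** (n - 1) * values[0]


def macd_series(closes, n1=12, n2=26):
    """MACD(n1, n2): shorter-period EMA minus longer-period EMA, Eq. (10)."""
    short = ema_series(closes, n1)
    long_ = ema_series(closes, n2)
    return [a - b for a, b in zip(short, long_)]
```

On a steadily rising series the shorter-period EMA follows the data more closely than the longer-period one, so the MACD is positive; on a steadily falling series it is negative.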
A common choice for \((n_1, n_2)\) is (12, 26), corresponding to the number of trading days in roughly two weeks and a month, respectively. The sign of the MACD is then interpreted as follows (Fig. 6):
-
When the MACD has large positive values, it indicates that the values have risen more in the recent \(n_1\) observations when compared with the last \(n_2\) observations, signifying a strong uptrend.
-
Conversely, when MACD is negative, the price has fallen more in the last \(n_1\) observations, signifying a recent downtrend.
MACD analysis provides signals based on the “momentum” of the time series. To identify buy and sell signals, the MACD is compared to the so-called Signal Line S, defined by
$$\begin{aligned} \begin{aligned} S = V_{n_3}[\text {MACD}(n_1,n_2)]. \end{aligned} \end{aligned}$$
(11)
A common value for \(n_3\) is 9, corresponding to a trading period of roughly a week and a half.
There are many ways to use the signal line. In this paper we will focus on crossovers between the MACD and S, illustrated in Fig. 6, which are described below:
-
When MACD crosses from below to above the signal line, it serves as a bullish signal because the crossing signifies a strong uptrend in MACD, meaning the short-term momentum has risen faster than the long term momentum.
-
Conversely, when MACD crosses from above to below the signal line, it serves as a bearish signal forecasting a downturn in values.
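The crossover rules above can be sketched in Python as follows. This is not the R code used in this work. The names `ema_series` and `macd_crossovers` are hypothetical. As assumptions, the EMA recursion of Eq. (8) is run along the whole series, and time steps at which the MACD exactly equals the signal line are ignored when deciding on which side of the line the MACD lies.

```python
def ema_series(values, n):
    """Running EMA following Eq. (8) with smoothing factor s = 2 / (n + 1)."""
    if not values:
        return []
    s = 2.0 / (n + 1)
    out = [values[0]]
    for c in values[1:]:
        out.append(s * c + (1 - s) * out[-1])
    return out


def macd_crossovers(closes, n1=12, n2=26, n3=9):
    """Return (index, 'bullish' | 'bearish') for each MACD / signal-line crossover."""
    short = ema_series(closes, n1)
    long_ = ema_series(closes, n2)
    macd = [a - b for a, b in zip(short, long_)]      # Eq. (10)
    signal = ema_series(macd, n3)                     # Eq. (11)
    events = []
    last_side = 0   # +1 if MACD was last above the signal line, -1 if below
    for i, (m, s) in enumerate(zip(macd, signal)):
        diff = m - s
        side = (diff > 0) - (diff < 0)
        if side == 0:
            continue                                  # exact tie: no information
        if last_side == -1 and side == 1:
            events.append((i, "bullish"))             # crossed from below to above
        elif last_side == 1 and side == -1:
            events.append((i, "bearish"))             # crossed from above to below
        last_side = side
    return events
```

For a series that falls steadily and then rises steadily, the sketch reports a single bullish crossover shortly after the turning point, since the MACD starts to rise while the signal line still lags behind it.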
Relative strength index
The Relative Strength Index (RSI) quantifies the momentum in the time series data through the average rate of increases and decreases in value (see Fig. 7). The indicator is constructed by dividing the closing values \(\{C_t\}\) over some period into two sets, with \(G_t\) collecting the increases in value and \(D_t\) the decreases.
From the above sets one can compute the averages \({\bar{G}}_t\) and \({\bar{D}}_t \) using the EMA \(V_n\) over n periods with a smoothing factor \(s = \frac{1}{n}\), leading to
$$\begin{aligned} \begin{aligned} {\bar{G}}_t = V_n[G_t]~~~\mathrm{and}~ ~~{\bar{D}}_t = V_n[D_t]. \end{aligned} \end{aligned}$$
(14)
Then, the RSI\(_t\) indicator at time t is defined as follows
$$\begin{aligned} \begin{aligned} \text {RSI}_t := 100- \frac{100}{1+\frac{{\bar{G}}_t}{{\bar{D}}_t}}. \end{aligned} \end{aligned}$$
(15)
In the stock market the RSI is used to signal when an asset has become overbought (meaning it has appreciated more rapidly than is thought to be typically sustainable) or oversold. In particular, a high RSI is thought to indicate that one should anticipate a reversal from an uptrend to a downtrend in the near term. A low RSI is interpreted by market traders as an asset being oversold, and predicts a near-term increase in prices. We set the thresholds for high and low RSI\(_t\) to be 75 and 25, respectively. When the RSI reaches 25 it serves as a bullish signal and, conversely, when the RSI reaches 75 it gives a bearish signal. Figure 7 gives two examples of accurate RSI signals.
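A minimal Python sketch of the RSI of Eq. (15) and the 25/75 thresholds is given below. This is not the R code used in this work. The names `smoothed`, `rsi_series` and `rsi_signal` are hypothetical. The sketch rests on three assumptions. First, \(G_t\) and \(D_t\) are taken to be the positive and negative parts of the change between consecutive closing values, which is the conventional construction. Second, the averaging recursion with \(s=\frac{1}{n}\) is run along the whole series and seeded with its first value. Third, when there are no decreases the RSI is set to its limiting value of 100, or to 50 if there are no changes at all.

```python
def smoothed(values, n):
    """EMA recursion of Eq. (8) with smoothing factor s = 1 / n."""
    s = 1.0 / n
    out = [values[0]]
    for v in values[1:]:
        out.append(s * v + (1 - s) * out[-1])
    return out


def rsi_series(closes, n):
    """RSI of Eq. (15); entry i corresponds to the change closes[i] -> closes[i + 1]."""
    changes = [b - a for a, b in zip(closes, closes[1:])]
    if not changes:
        return []
    gains = [max(ch, 0.0) for ch in changes]       # assumed form of G_t
    declines = [max(-ch, 0.0) for ch in changes]   # assumed form of D_t
    rsi = []
    for g, d in zip(smoothed(gains, n), smoothed(declines, n)):
        if d == 0:
            # Assumption: limiting value 100 with no decreases, 50 if flat.
            rsi.append(100.0 if g > 0 else 50.0)
        else:
            rsi.append(100.0 - 100.0 / (1.0 + g / d))
    return rsi


def rsi_signal(value, low=25.0, high=75.0):
    """Bullish at or below the low threshold, bearish at or above the high one."""
    if value <= low:
        return "bullish"
    if value >= high:
        return "bearish"
    return None
```

For example, `rsi_series([10, 11, 10, 11, 10], n=2)` gives values close to 100, 50, 75 and 37.5, of which the first and third would be flagged as bearish signals by `rsi_signal`. The period `n=2` is chosen here only to keep the arithmetic short.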