Good judgment comes from experience. Experience comes from bad judgment. —Jim Horning


In part 3, we saw that the model error is roughly normally distributed. This signifies that the model is usually not too far off in predicting the market, within a reasonable bound. If at some instant the model error is exceedingly high, one may expect the error to shrink back toward zero in the next iteration, and prices to converge back to the long-term average (or mean) that the model suggests. This behavior is called mean reversion. It turns out that, provided ConstantB lies between -1 and 1, the regression model suggests that the market price mean-reverts approximately to

Mean = ConstantA / (1 – ConstantB)

And the standard deviation of the modeled price is

ModeledPriceStd = ModelErrorStd / (1 – ConstantB^2) ^ (1/2)
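As a rough numerical illustration, the two formulas above can be evaluated with the Dow Jones estimates quoted elsewhere in this series (ConstantA = 634.63, ConstantB = 0.9516, model error standard deviation of about 84.12). The function names are made up for this sketch, and the formulas only make sense when ConstantB is strictly between -1 and 1.

```python
import math


def long_run_mean(constant_a: float, constant_b: float) -> float:
    """Mean = ConstantA / (1 - ConstantB); only valid for |ConstantB| < 1."""
    if abs(constant_b) >= 1:
        raise ValueError("ConstantB must lie strictly between -1 and 1")
    return constant_a / (1.0 - constant_b)


def modeled_price_std(model_error_std: float, constant_b: float) -> float:
    """ModeledPriceStd = ModelErrorStd / sqrt(1 - ConstantB^2)."""
    if abs(constant_b) >= 1:
        raise ValueError("ConstantB must lie strictly between -1 and 1")
    return model_error_std / math.sqrt(1.0 - constant_b ** 2)


# Inputs taken from the Dow Jones example in this series.
mean = long_run_mean(634.63, 0.9516)
std = modeled_price_std(84.12, 0.9516)
print(f"Mean: {mean:.2f}, ModeledPriceStd: {std:.2f}")
```

With these inputs the mean comes out near 13,112 and the modeled standard deviation near 274 index points.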

If the current market price is far below the modeled mean, one could long the asset and gain the advantage of mean reversion, i.e. buy low, sell high. But how does one measure the best time to enter a long position? The S-Score is a simple measure of how far the current price is from the Mean, expressed in units of ModeledPriceStd.

S-Score = (CurrentMarketPrice – Mean) / ModeledPriceStd
S-Score of +1 implies that the current market price is one Modeled Price Std higher than the mean.
S-Score of -1 implies that the current market price is one Modeled Price Std lower than the mean.
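A minimal sketch of the S-Score, read as the distance of the current price from the mean in units of the modeled price standard deviation. The function name and the sample numbers are hypothetical and chosen only to reproduce the +1 and -1 cases described above.

```python
def s_score(current_price: float, mean: float, modeled_price_std: float) -> float:
    """Distance of the current price from the mean, in units of ModeledPriceStd."""
    if modeled_price_std <= 0:
        raise ValueError("modeled_price_std must be positive")
    return (current_price - mean) / modeled_price_std


# Made-up numbers: mean of 100 with a modeled standard deviation of 10.
print(s_score(110.0, 100.0, 10.0))  # 1.0, one std above the mean
print(s_score(90.0, 100.0, 10.0))   # -1.0, one std below the mean
```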

One may plan a set of trading rules using the S-Score and thresholds similar to the following:

Trade Entry
Long when S-Score is < -1
Short when S-Score is > +1

Trade Exit
Close Long Position when S-Score is > +0.5
Close Short Position when S-Score is < -0.5

Stop Loss
Stop Loss Long Position when S-Score < -1.2
Stop Loss Short Position when S-Score > +1.2
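The rules above can be sketched as a small decision function. This is only an illustration: the function name, the position labels ("flat", "long", "short") and the action labels are assumptions of this sketch, and the thresholds are the example values listed above.

```python
ENTRY = 1.0  # enter when the S-Score is beyond +/-1
EXIT = 0.5   # close long above +0.5, close short below -0.5
STOP = 1.2   # stop loss when the S-Score moves beyond +/-1.2 against the position


def next_action(position: str, score: float) -> str:
    """Return the action for the current S-Score given the open position."""
    if position == "flat":
        if score < -ENTRY:
            return "enter_long"
        if score > ENTRY:
            return "enter_short"
    elif position == "long":
        if score < -STOP:
            return "stop_loss"
        if score > EXIT:
            return "close"
    elif position == "short":
        if score > STOP:
            return "stop_loss"
        if score < -EXIT:
            return "close"
    else:
        raise ValueError(f"unknown position: {position!r}")
    return "hold"


print(next_action("flat", -1.1))  # enter_long
print(next_action("long", 0.6))   # close
print(next_action("long", -1.3))  # stop_loss
```

Applied literally, a long entered while the S-Score is already below -1.2 would be stopped out at the next evaluation, so the entry and stop-loss thresholds need to be chosen together.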

One would need to define the threshold levels based on the asset being traded and one’s accepted risk level. In the next post, we will go over some results of a trading strategy that strictly follows an S-Score rule.


In part 2, the linear regression technique was used to calibrate the relationship between Current Price and Last Price, and the model error of that relationship was introduced. It turns out that the model error roughly follows the so-called normal distribution.

To understand the normal distribution, consider men’s height in the U.S. The average man’s height is around 5’10″. The shortest man is around 2’3″, and the tallest man is around 8’11″ (according to Google). Men’s height, more or less, follows a normal distribution, where the majority is around 5’10″, then fewer men at 5’9″, …, and finally only a handful of men are below 3′. The graph below illustrates a normal distribution. Standard deviation can be thought of as a measurement of the dispersion of data. The majority of the population is around the average, and only a handful of data points are far away from the average.

Going back to the Dow Jones example in part 2: the model cannot perfectly represent every data point in the graph. Some data points are perfectly aligned with the regression line, while others are far away. The distance of each data point from the line is called the model error; the average error is around 13.06, and the standard deviation (or dispersion / scatter) of the error is around 84.12.
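A minimal sketch of how the average and standard deviation of the model error could be computed, taking the error as the actual price minus the price implied by the line. The price series and constants below are made up for illustration and are not the Dow Jones data, so the output will not match the 13.06 and 84.12 quoted above.

```python
import statistics


def model_errors(prices, constant_a, constant_b):
    """Error for each day: actual price - (ConstantA + ConstantB * previous price)."""
    return [
        current - (constant_a + constant_b * previous)
        for previous, current in zip(prices, prices[1:])
    ]


# Hypothetical closing prices and constants, for illustration only.
prices = [100.0, 102.0, 101.0, 103.0, 104.0]
errors = model_errors(prices, constant_a=5.0, constant_b=0.95)

print("average error:", statistics.mean(errors))
print("error std (sample):", statistics.stdev(errors))
```

`statistics.stdev` uses the sample (n - 1) formula; `statistics.pstdev` would give the population version instead.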

The majority of the errors are within 0.5 standard deviations of the average. There are rare occasions where the model and the observed price disagree by 1.5 standard deviations or more. In this case, if the model is consistent, a high model error could mean that the market has mispriced those data points. Thus, building a trading strategy to trade at prices with high model error could be profitable. In the next post, we will refine the model error into trading signals using the expected long-term model return and the S-Score.


In part 1, we defined the AR(1) model, whose formula has the form:

CurrentPrice = ConstantA + ConstantB x LastPrice + Model Error

CurrentPrice is today’s closing price, i.e. at time T=0.
LastPrice is the previous closing price, i.e. at time T-1.
ConstantA is the intercept of the linear relationship between CurrentPrice and LastPrice.
ConstantB is the slope of the linear relationship between CurrentPrice and LastPrice.

This equation suggests that today’s closing price and the previous closing price have a linear relationship. The term “linear” may be thought of as meaning that a straight line can be drawn to express the relationship. An example is the best way to illustrate this. Consider the Dow Jones Industrial Average from Jul 19th 2012 to Oct 10th 2012, as shown in the table.

Column “Price T+0” is the closing price on the date indicated in the “Date” column. Column “Price T-1” is the closing price of the corresponding previous trading day. We can plot the price at T+0 against the price at T-1 on a graph, as shown. The graph suggests that the prices at T+0 and T-1 follow a pattern, and a straight line (aka the linear regression line) can be drawn to approximate this relationship. This line is drawn such that, on average, all data points are as close as possible to the line that approximates them. The line can be represented with the linear equation y = 0.9516x + 634.63, where 634.63 is the ConstantA and 0.9516 is the ConstantB of the AR(1) formula. This suggests that today’s close can be approximated as the previous close times 0.9516, plus 634.63:

Today’s Close = 634.63 + 0.9516 x Previous Close

Readers can Google “excel graph add trendline” for details on approximating the linear relationship using Excel. One should note that the above formula is estimated (via the least squares regression technique) using data from Jul 19th 2012 to Oct 10th 2012. Should one be interested in another time interval, the corresponding data set should be used instead.
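For readers who prefer code to a spreadsheet, the same kind of trendline can be sketched in plain Python using the ordinary least squares formulas for slope and intercept. The function name `fit_ar1` and the sample series are made up for illustration; the series is constructed to follow price = 10 + 0.5 x previous price exactly, so the fit should recover those two constants.

```python
def fit_ar1(prices):
    """Least squares fit of price[t] = a + b * price[t-1]; returns (a, b)."""
    x = prices[:-1]  # previous closes (T-1)
    y = prices[1:]   # current closes (T+0)
    n = len(x)
    if n < 2:
        raise ValueError("need at least three prices to fit a line")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("previous prices are all identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # ConstantB
    intercept = mean_y - slope * mean_x  # ConstantA
    return intercept, slope


# Synthetic series that obeys price = 10 + 0.5 * previous price exactly.
constant_a, constant_b = fit_ar1([100.0, 60.0, 40.0, 30.0, 25.0])
print(constant_a, constant_b)
```

Feeding in the actual Dow Jones closes for the chosen date range in place of the synthetic series would give the ConstantA and ConstantB for that period.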

As shown in the graph, the linear model cannot perfectly represent the data set. The difference between the line and the data set is known as the “model error”. The model error measures how inaccurately the linear model represents the data set.
For example, Oct 5th, 2012 has a closing price of 13610.15. Using the linear regression relationship, we can approximate (or forecast) the Oct 8th price:

Oct8th, 2012 = 634.63 + 0.9516 x 13610.15 = 13586.05

However, in reality the Oct 8th price was 13583.65, so the “model error” (actual price minus forecast) is 13583.65 – 13586.05 = -2.40.
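The same arithmetic in a few lines of Python, taking the model error as the actual price minus the forecast, which is what the AR(1) formula implies. The function and variable names are made up for this sketch.

```python
CONSTANT_A = 634.63  # intercept of the fitted line
CONSTANT_B = 0.9516  # slope of the fitted line


def forecast_next_close(previous_close: float) -> float:
    """Forecast the next close from the previous close using the fitted line."""
    return CONSTANT_A + CONSTANT_B * previous_close


forecast = forecast_next_close(13610.15)  # Oct 5th close
actual = 13583.65                         # Oct 8th close
error = actual - forecast

print(f"forecast={forecast:.2f}, error={error:.2f}")  # forecast=13586.05, error=-2.40
```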
The larger the model error (in absolute value), the worse the line approximates the data points. Understanding the overall error of a model enables us to form simple trading strategies. However, it requires a basic understanding of the normal distribution, which we will go over in part 3.


The Autoregressive (AR) model is one of the most vanilla algorithms widely used in the algorithmic trading field. The main application of the AR model is to forecast future prices using current and/or historical prices. In this series of posts, we will discuss how this model is implemented and go over a simple strategy that uses it.

One note readers should keep in mind is that most well-known algorithmic trading strategies are implemented by many market participants (e.g. funds, financial institutions, and individual traders). Thus, the resulting strategies may not be proprietary, and the corresponding profit opportunities recede. However, understanding the basics is essential for more complex strategies.

The founding idea of an AR model is that the current price is somewhat dependent on historical prices. Although this idea goes against the Efficient Market Hypothesis (EMH), the mathematics behind the model is useful for algorithmic trading purposes. Besides, academia and industry have numerous arguments that the EMH is not entirely correct.

When observing historical prices, the AR model can be set to look 1 period, 2 periods, or N periods into history. For our discussion, let’s use daily stock prices as the time interval of our observations. If we look 5 days into history to model the current price, we call it an AR(5) model. Similarly, 4 days would be AR(4), and AR(1) looks 1 day into historical prices. Let’s discuss AR(1) because of its simplicity and popularity. Below is the mathematical formula for an AR(1) model:

CurrentPrice = ConstantA + ConstantB x LastPrice + Model Error

CurrentPrice is today’s stock price, i.e. at time T=0.
LastPrice is yesterday’s stock price, i.e. at time T-1.
ConstantA is the intercept of the linear relationship between CurrentPrice and LastPrice.
ConstantB is the slope of the linear relationship between CurrentPrice and LastPrice.
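To make the AR(N) idea concrete, here is a minimal sketch of a one-step forecast that looks back as many days as there are coefficients. The function name, the coefficient ordering (most recent day first) and the sample numbers are assumptions made for illustration. The model error term is left out because it is unknown at forecast time.

```python
from typing import Sequence


def ar_forecast(intercept: float, coefficients: Sequence[float],
                history: Sequence[float]) -> float:
    """One-step AR(p) forecast, where p = len(coefficients).

    coefficients[0] multiplies the most recent price, coefficients[1]
    the price one day before that, and so on.
    """
    p = len(coefficients)
    if len(history) < p:
        raise ValueError("history is shorter than the number of lags")
    recent_first = list(history)[::-1][:p]  # most recent price first
    return intercept + sum(c * price for c, price in zip(coefficients, recent_first))


history = [100.0, 102.0]  # made-up prices, oldest first

# AR(1): only yesterday's price is used.
print(ar_forecast(5.0, [0.95], history))
# AR(2): yesterday and the day before are both used.
print(ar_forecast(5.0, [0.6, 0.35], history))
```

With a single coefficient this reduces to ConstantA + ConstantB x LastPrice, the AR(1) case discussed above.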

In part 2, we will define the mathematical jargon above and understand how the formula works.


Algorithmic trading is a field in finance that uses computer programs and algorithms to automatically execute investment strategies and/or orders without human intervention. Other names such as Algo, Black Box Trading, and Automated Trading are also commonly used in the field. There are two main categories of algorithmic trading: modeling investment strategies and executing orders.

Investment strategies can be modeled and coded into programs that monitor live market data for investment and/or arbitrage opportunities. Any automated strategy, from one as simple as going long/short based on moving averages to more sophisticated ones using artificial intelligence and complex mathematics, can fall under this category.

Automatic order execution is the other category of algorithmic trading. Suppose an investor wants to long one million shares of Apple. He would want to minimize his market impact, which would otherwise push up the price of the shares he is buying. A possible strategy is to break up the trade into smaller blocks and execute them separately (sometimes anonymously). Strategies such as VWAP or dark pool trading fall under this category. Alternatively, order execution can generate revenue if one can view and execute orders faster than other players in the market. Suppose Tom wants to buy one million shares of Citigroup and his broker sends out bids in the market. John’s company is next door to the exchange and sees the bids before all other players do. John sees that the bids are above market prices, immediately buys all available shares in the market, and then sells them to Tom to profit from the spread. Such a strategy is the foundation of High Frequency Trading and can generate profits from good order execution.
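The idea of breaking a large trade into smaller blocks can be sketched very simply. The function below splits an order into equal-sized child orders; it is a toy illustration with a made-up name, not an actual VWAP algorithm, which would weight the slices by expected trading volume.

```python
from typing import List


def slice_order(total_shares: int, num_slices: int) -> List[int]:
    """Split an order into num_slices child orders of near-equal size."""
    if total_shares <= 0 or num_slices <= 0:
        raise ValueError("total_shares and num_slices must be positive")
    if num_slices > total_shares:
        raise ValueError("cannot have more slices than shares")
    base, remainder = divmod(total_shares, num_slices)
    # The first `remainder` slices carry one extra share so the sizes add up.
    return [base + 1 if i < remainder else base for i in range(num_slices)]


child_orders = slice_order(1_000_000, 7)
print(child_orders, sum(child_orders))
```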

Well-funded players could have both investment strategies and order execution automated. Alternatively, order execution can be outsourced to brokers. Since automatic order execution requires expensive hardware and human capital, individual investors who use algorithms to trade usually do so for investment strategies.

© 2018 權傾天下 Tradeoptions4living