"Trying to model the complex interdependencies
between financial assets with so restrictive concept of correlation
is like trying to surf the internet with an
IBM AT." Carol Alexander
A drunk man leaving a bar follows a random walk. His dog, off on its own, also follows a random walk, and their paths diverge ... Then they come to a park where dogs are not allowed to be untied. The drunk man therefore puts his dog on a leash and both enter the park. Now they share a common direction: their paths are cointegrated ... see Murray [11].
A good introduction is also given by Carol Alexander in [2], "Cointegration and asset allocation: A new active hedge fund strategy".
Definition: Two time series x_{t} and y_{t} are cointegrated if, and only if, each is I(1) and a linear combination x_{t} - a - b y_{t}, where b ≠ 0, is I(0).
In general, linear combinations of I(1) time series are also I(1).
Cointegration is a
particular feature not displayed between arbitrary pairs of time series.
If two time series are cointegrated, then the cointegrating vector is unique (up to normalization).
Granger (1981) introduced the case
y_{t} = a + b x_{t} + u_{t}    (1)
where the individual time series are I(1) but the error term,
u_{t}, is I(0). That is, the error
term might be autocorrelated but, because it is stationary, the relationship will keep
returning to the equilibrium or long-run equation
y_{t} = a + b x_{t}
Granger (1981) and Engle and Granger (1987) demonstrated that, if a vector of time series is cointegrated, the long-run parameters can be estimated directly without specifying the dynamics because, in statistical terms, the estimated long-run parameters converge to their true values more quickly than those operating on stationary variables. That is, they are super-consistent, and a two-step procedure becomes possible: first estimate the long-run relationship, then estimate the dynamics conditional on the long run.
As a result, simple static models came back into vogue in the late 1980s, but it rapidly became apparent that small-sample biases can indeed be large (Banerjee et al., 1986).
Two major problems typically arise in a regression such as (1). First, it is not always clear whether one should regress y_{t} on x_{t} or vice versa. Endogeneity is not an issue asymptotically because the simultaneous-equations bias is of a lower order of importance and, indeed, is dominated by the nonstationarity of the regressor. However, least squares is affected by the chosen normalisation: the estimate from one regression is not the inverse of that from the alternative ordering unless R^{2} = 1.
Secondly, the coefficient b̂ is not asymptotically normal when x_{t} is I(1) without drift, even if u_{t} is iid. Of course, autocorrelation in the residuals produces a bias in the least squares standard errors, even when the regressor is nonstationary, and this effect is in addition to that caused by nonstationarity.
The preceding discussion is based on the assumption that the disturbances are stationary. In practice, it is necessary to pretest this assumption. Engle and Granger suggested a number of alternative tests, but the one that emerged as the popular method is an ADF test on the residuals, without including a constant or a time trend.
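The two-step procedure just described (estimate the long-run relation, then ADF-test the residuals with no constant or trend) can be sketched in numpy. The simulated pair and the critical value quoted in the comment are illustrative assumptions, not part of the text:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500

# Step 1: a synthetic cointegrated pair: x is a random walk and
# y = 2 + 0.5 x + stationary noise, so y - 0.5 x is I(0).
x = np.cumsum(rng.normal(size=T))
y = 2.0 + 0.5 * x + rng.normal(scale=0.5, size=T)

# Step 2: estimate the long-run relation y = a + b x by least squares.
X = np.column_stack([np.ones(T), x])
a_hat, b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
u = y - a_hat - b_hat * x           # residuals from the long-run regression

# Step 3: Dickey-Fuller regression on the residuals, no constant or trend:
# du_t = rho* u_{t-1} + e_t, testing rho* = 0 against rho* < 0.
du, ulag = np.diff(u), u[:-1]
rho = (ulag @ du) / (ulag @ ulag)
resid = du - rho * ulag
se = np.sqrt(resid @ resid / (len(du) - 1) / (ulag @ ulag))
t_stat = rho / se
# Compare t_stat with Engle-Granger critical values (roughly -3.4 at the
# 5% level for two variables), not with a Student's t table.
print(t_stat)
```

Because b̂ is super-consistent here, the first-stage estimate is very close to the true 0.5 even with autocorrelated noise.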
1.1 Stationary and nonstationary variables
Consider:
y_{t} = r y_{t-1} + e_{t}    (2)
If |r| < 1 then the series is stationary (around 0); if r = 1 then it is nonstationary (a random walk in this case).
A stationary series is one for which:
(i) the mean is constant,
(ii) the variance is constant, and
(iii) Cov(y_{t}, y_{t-s}) depends only upon the lag length s.
Strictly, this is "weak" or "second-order" stationarity, but it is good enough for practical purposes.
We can more generally write
y_{t} = a + r y_{t-1} + e_{t}    (3)
which is stationary around a/(1-r). (To see this, set E(y_{t}) = E(y_{t-1}) = E(y); then E(y) = a + r E(y) + 0, hence E(y) = a/(1-r).) If r = 1, we have a random walk with drift.
We can also have a second-order AR process, e.g.
y_{t} = r_{1} y_{t-1} + r_{2} y_{t-2} + e_{t}    (4)
The conditions for stationarity of this series are given later.
We can also incorporate a time trend:
y_{t} = b_{1} + b_{2} t + r y_{t-1} + e_{t}    (5)
This will be stationary around the deterministic trend if |r| < 1. This is called a trend stationary series (TSS).
A trend stationary series can look similar to a nonstationary series. This is unfortunate, since we should detrend the former (take residuals from a regression on time) but difference the latter (the latter are also known as difference stationary series for this reason; such a trend is also called a stochastic trend). Doing the wrong transformation leads to biased estimates in regression, so it is important (but unfortunately difficult) to tell the difference.
Note that, for a nonstationary process y_{t} = a + y_{t-1} + e_{t} we can write:
y_{0} = 0 (say)    (6)
y_{1} = a + 0 + e_{1}    (7)
y_{2} = a + y_{1} + e_{2} = 2a + e_{1} + e_{2}    (8)
...    (9)
y_{t} = a t + Σ_{i=1}^{t} e_{i}    (10)

This implies that
var(y_{t}) = t var(e) = t s^{2}    (11)
which tends to infinity as the sample size increases.
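A quick numerical check of (11): simulate many driftless random walks and compare the cross-sectional variance at time t with t s^{2}. The sample sizes here are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
reps, T, sigma = 20000, 100, 1.0

# Many independent random walks y_t = y_{t-1} + e_t with y_0 = 0.
e = rng.normal(scale=sigma, size=(reps, T))
y = np.cumsum(e, axis=1)

for t in (10, 50, 100):
    # Cross-sectional variance at date t; should be close to t * sigma^2.
    print(t, y[:, t - 1].var())
```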
For a Trend Stationary Series with r = 0:
y_{t} = b_{1} + b_{2} t + e_{t}
y_{0} = 0 (say)    (12)
y_{1} = b_{1} + b_{2} + e_{1}    (13)
y_{2} = b_{1} + b_{2} 2 + e_{2}    (14)
...    (15)
y_{t} = b_{1} + b_{2} t + e_{t}    (16)

Note the similarity of (10) and (16), apart from the nature of the error term: a Difference Stationary Series can be written as a function of t, like a Trend Stationary Series, but with an MA error term (the accumulated shocks). In finite samples a Trend Stationary Series can be approximated arbitrarily well by a Difference Stationary Series, and vice versa.
The finite-sample distributions are very close together and it can be hard to tell them apart. A difference between them is that a shock to the system has a temporary effect upon a Trend Stationary Series but a permanent effect upon a Difference Stationary Series. If we interpret 'shock' as a change in government policy, then we can see the importance of finding whether variables are Difference Stationary or Trend Stationary.
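The temporary-versus-permanent contrast can be made concrete with a noise-free simulation: feed a single unit shock into an AR(1) with |r| < 1 (the stationary case) and into a random walk (r = 1). The parameter values are arbitrary illustrations:

```python
import numpy as np

T, shock_at = 200, 50

def path(rho, shock):
    """AR(1) y_t = rho*y_{t-1} + e_t with a one-off unit shock and no other noise."""
    y = np.zeros(T)
    for t in range(1, T):
        e = shock if t == shock_at else 0.0
        y[t] = rho * y[t - 1] + e
    return y

stationary = path(0.8, 1.0)   # stationary case: the shock dies out geometrically
unit_root  = path(1.0, 1.0)   # unit-root case: the shock is permanent

print(stationary[-1], unit_root[-1])
```

By the end of the sample the stationary path has essentially returned to zero, while the random-walk path is still shifted by the full size of the shock.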
A nonstationary series is said to be integrated, with the order
of integration being the number of times the series needs to be
differenced before it becomes stationary. A stationary series is
integrated of order zero, I(0).
For the random walk model, y_{t} is I(1).
Most economic variables are I(0) or I(1). Interest rates are likely to be I(0): they are not trended. Output, the price level, investment, etc. are likely to be I(1). Some variables may be I(2). Transforming to logs may affect the order of integration.
The investment strategy we aim to implement is a market-neutral long/short strategy. This implies that we will try to find shares with similar betas, where we believe one stock will outperform the other in the short term. By simultaneously taking both a long and a short position, the beta of the pair equals zero and the performance generated equals alpha.
Needless to say, the hard part of this strategy is to find market-neutral positions that will deviate in returns. To do this we can use a statistical tool developed by Schroder Salomon Smith Barney (SSSB).
The starting point of this strategy is that stocks that have historically had the same trading patterns (i.e. a constant price ratio) will continue to do so in the future. If there is a deviation from the historical mean, this creates a trading opportunity, which can be exploited. Gains are earned when the price relationship is restored. The historical calculations of betas and the millions of tests executed are done by SSSB, but it is our job as portfolio managers to interpret the signals and execute the trades.
Summary:
- find two stocks whose prices have historically moved together,
- mean reversion in the ratio of the prices; correlation is not key,
- gains are earned when the historical price relationship is restored,
- free resources are invested at the risk-free interest rate.
2.2 Testing for mean reversion
The challenge in this strategy is identifying stocks that tend to move together and therefore make potential pairs. Our aim is to identify pairs of stocks with mean-reverting relative prices. To find out whether two stocks are mean-reverting, the test conducted is a Dickey-Fuller test for stationarity in the log ratio y_{t} = log A_{t} - log B_{t} of share prices A and B:
D y_{t} = µ + g y_{t-1} + e_{t}    (17)
In other words, we are regressing D y_{t} on lagged values of y_{t}. The null hypothesis is that g = 0, which means that the process is not mean-reverting.
If the null hypothesis can be rejected at the 99% confidence level, the price ratio follows a weakly stationary process and is thereby mean-reverting. Research has shown that if the confidence level is relaxed, the pairs do not mean-revert well enough to generate satisfactory returns. This implies that a very large number of regressions must be run to identify the pairs. If you have 200 stocks, you have to run 19,900 regressions, which makes this quite computer-power and time consuming. Schroder Salomon Smith Barney provide such calculations.
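The pair count is just 200 choose 2, and the screening loop itself is simple to sketch. The helper below runs the Dickey-Fuller regression (17) with a constant; the synthetic log prices and the critical-value cutoff are illustrative assumptions, not SSSB's actual system:

```python
import numpy as np
from itertools import combinations

n_stocks = 200
n_pairs = n_stocks * (n_stocks - 1) // 2
print(n_pairs)                      # 19,900 regressions for 200 stocks

def df_tstat(y):
    """t-statistic of g in  dy_t = mu + g*y_{t-1} + e_t  (equation (17))."""
    dy, ylag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones(len(ylag)), ylag])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

# Toy universe: a few synthetic random-walk log-price series.
rng = np.random.default_rng(2)
logp = np.cumsum(rng.normal(scale=0.01, size=(5, 500)), axis=1)

candidates = []
for i, j in combinations(range(len(logp)), 2):
    ratio = logp[i] - logp[j]       # log ratio of the pair
    if df_tstat(ratio) < -3.43:     # roughly the 1% Dickey-Fuller critical value
        candidates.append((i, j))
print(candidates)
```

Since the toy series are independent random walks, very few (usually no) pairs should pass at the 1% level.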
By conducting this procedure, a large number of pairs will be generated. The problem is that not all of them have the same or similar betas, which makes it difficult for us to stay market neutral. Therefore a trading rule is introduced regarding the spread of betas within a pair: the beta spread must be no larger than 0.2 for a trade to be executed. The betas are measured on a two-year rolling window of daily data.
We now have mean-reverting pairs with a limited beta spread, but to further eliminate risk we also want to stay sector neutral. This implies that we only want to open a position in a pair that is within the same sector. Due to the different volatility across sectors, we expect sectors showing high volatility to produce very few pairs, while sectors with low volatility should generate more pairs. Another factor influencing the number of pairs generated is the homogeneity of the sector. A sector like Commercial Services is expected to generate very few pairs, but Financials on the other hand should give many trading opportunities. The reason is that companies within the Financial sector have more homogeneous operations and earnings.
The screening process described gives us a large set of pairs
that are both market and sector neutral, which can be used to
take positions. This should not be done randomly, since timing is
an important issue. We will therefore introduce several trading
execution rules.
All the calculations described above will be updated on a daily basis. However, we will not have to do this ourselves; we will be provided with updated numbers every day, showing pairs that are likely to mean-revert within the next couple of weeks. In order to execute the strategy we need a couple of trading rules to follow, i.e. to clarify when to open and when to close a trade. Our basic rule will be to open a position when the ratio of two share prices hits the two-standard-deviation band and close it when the ratio returns to the mean. However, we do not want to open a position in a pair with a spread that is wide and getting wider. This can partly be avoided by the following procedure: we actually want to open a position when the price ratio deviates by more than two standard deviations from the 130-day rolling mean. The position is not opened when the ratio breaks the two-standard-deviation limit for the first time, but rather when it crosses the limit again on its way back towards the mean. You could say that we open a position when the pair is on its way back again (see the picture below).
Figure 1: Pairs Trading rules
Summary:
- Open a position when the ratio hits the two-standard-deviation band for the second time (on its way back)
- Close the position when the ratio hits the mean
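The open/close rules can be sketched as a small state machine over a rolling z-score. The 130-day window and two-standard-deviation band come from the text; the function itself and the toy ratio series are illustrative assumptions:

```python
import numpy as np

def signals(ratio, window=130, n_std=2.0):
    """Open when the ratio crosses back inside the +/- n_std band around the
    rolling mean (i.e. it is on its way back); close when it hits the mean.
    Returns a position series: -1 short the ratio, +1 long, 0 flat."""
    pos = np.zeros(len(ratio))
    state = 0
    for t in range(window, len(ratio)):
        w = ratio[t - window:t]                 # trailing window only
        m, s = w.mean(), w.std()
        z_prev = (ratio[t - 1] - m) / s
        z = (ratio[t] - m) / s
        if state == 0:
            if z_prev > n_std and z <= n_std:       # re-entered band from above
                state = -1
            elif z_prev < -n_std and z >= -n_std:   # re-entered band from below
                state = 1
        elif (state == -1 and z <= 0) or (state == 1 and z >= 0):
            state = 0                               # ratio hit the mean: close
        pos[t] = state
    return pos

# Toy example: a flat, gently oscillating ratio with a temporary blow-out.
t = np.arange(300)
ratio = 0.1 * np.sin(t.astype(float))
ratio[200:210] += 1.0                # spread widens sharply, then snaps back
pos = signals(ratio)
```

On this toy series the rule opens a short position exactly when the spike collapses back into the band, and closes it as soon as the ratio crosses the rolling mean.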
Furthermore, there will be some additional rules to prevent us from losing too much money on one single trade. If the ratio develops in an unfavourable way, we will use a stop-loss and close the position once we have lost 20% of the initial size of the position. Finally, we will never keep a position for more than 50 days. On average, the mean reversion will occur in approximately 35 days, and there is no reason to wait for a pair to revert fully if there is very little return to be earned. The potential return to be earned must always be higher than the return earned on the benchmark or in the fixed income market. The maximum holding period of a position is therefore set to 50 days. This should be enough time for the pairs to revert, but also a short enough time not to lose time value.
The rules described are totally based on statistics and predetermined numbers. In addition, there is a possibility for us to make our own decisions. If, for example, we are aware of fundamentals that are not taken into account in the calculations and that indicate that there will be no mean reversion for a specific pair, we can of course avoid investing in it. From the rules it can be concluded that we will open our last position no later than 50 days before the trading game ends. The last 50 days we will spend trying to close the trades at the most favourable points in time.
Summary:
- Stop loss at 20% of position value
- Beta spread < 0.2
- Sector neutrality
- Maximum holding period < 50 trading days
- 10 equally weighted positions
As already mentioned, through this strategy we almost totally avoid systematic market risk. The reason there is still some market risk exposure is that a minor beta spread is allowed. In order to find a sufficient number of pairs we have to accept this beta spread, but the spread is so small that in practice the market risk we are exposed to is negligible. The industry risk is also eliminated, since we are only investing in pairs belonging to the same industry.
The main risk we are exposed to is then the risk of stock-specific events, that is, the risk of fundamental changes implying that the prices may never mean-revert again, or at least not within 50 days. In order to control this risk we use the rules of stop-loss and maximum holding period. The risk is further reduced through diversification, which is obtained by simultaneously investing in several pairs. Initially we plan to open approximately 10 different positions. Finally, we do face the risk that the trading game does not last long enough. It might be the case that our strategy is successful in the long run, but that a few short-run failures ruin our overall excess-return possibilities.
2.7 General discussion on pairs trading
There are generally two types of pairs trading: statistical-arbitrage convergence/divergence trades, and fundamentally driven valuation trades. In the former, the driving force for the trade is an aberration in the long-term spread between the two securities; to realize the mean reversion back to the norm, you short one and go long the other. The trick is creating a program to find the pairs, and for the relationship to hold.
The other form of pairs trading is the more fundamentally driven variation, which is the purview of most market-neutral hedge funds: in essence they short the most overvalued stock(s) and go long the undervalued stock(s). It is normal to "pair" up stocks by having the same number per sector on the long and short side, although the traditional "pairs" aren't used anymore. Pairs trading had originally been the domain of BDs in the late '70s and early '80s, before it dissipated somewhat due to the bull market (who would want to be market-neutral in a rampant bull market?) and the impossibility of assigning "pairs" due to the morphing of traditional sectors and constituents. Most people don't perform traditional "pairs trading" anymore (i.e. the selection of two similar, but mispriced, stocks from the same industry/sector), but perform a variation. Goetzmann et al. wrote a paper on it a few years back, but at the last firm I worked at, the research analyst "pooh-poohed" it because he couldn't get the same results: he thinks Goetzmann [7] either waived commissions or, worse, totally ignored slippage (i.e. always took the best price, not a realistic one). Here's the paper:
source: forum http://www.wilmott.com
Some quotations from this paper:
"take a long-short position when they diverge." A test requires that both of these steps be parameterized in some way. How do you identify "stocks that move together"? Need they be in the same industry? Should they only be liquid stocks? How far do they have to diverge before a position is put on? When is a position unwound? We have made some straightforward choices about each of these questions. We put positions on at a two-standard-deviation spread, which might not always cover transaction costs even when stock prices converge. Although it is tempting to try potentially more profitable schemes, the danger in data-snooping refinements outweighs the potential insights gained about the higher profits that could result from learning through testing. As it stands now, data-snooping is a serious concern in our study. Pairs trading is closely related to a widely studied subject in the academic literature: mean reversion in stock prices. We consider the possibility that we have simply reformulated a test of the previously documented tendency of stocks to revert towards their mean at certain horizons. To address this issue, we develop a bootstrapping test based upon random pair choice. If pairs-trading profits were simply due to mean reversion, then we should find that randomly chosen pairs generate profits, i.e. that buying losers and selling winners in general makes money. This simple contrarian strategy is unprofitable over the period that we study, suggesting that mean reversion is not the whole story.
Although the effect we document is not merely an extension of previously known anomalies, it is still not immune to the data-snooping argument. Indeed we have explicitly "snooped" the data to the extent that we are testing a strategy we know to have been actively exploited by risk arbitrageurs. As a consequence we cannot be sure that past trading profits under our simple strategies will continue in the future. This potential critique has another side, however. The fact that pairs trading is already a well-known risk-arbitrage strategy means that we can simply test current practice rather than develop our filter rule ad hoc.
3 Optimal Convergence Trading
From [10], Vladislav Kargin:
"Consider an investment in a mispriced asset. An investor can expect that the
mispricing will be eliminated in the future and play on the convergence of the
asset price to its true value. This play is risky because the convergence is not
immediate and its exact date is uncertain. Often both the expected benefit and the risk can be increased by leveraging his positions, that is, by borrowing additional investment funds. An important question for the investor is what the optimal leverage policy is.
There are two intuitively appealing strategies in this situation. The first
one is to invest only if the mispricing exceeds a threshold and to keep the
position unchanged until the mispricing falls below another threshold (an (s, S)
strategy). For this strategy the relevant questions are what are the optimal
thresholds and what are the properties of the investment portfolio corresponding
to this strategy. The second type of strategy is to continuously change positions
according to the level of mispricing. In this case, we are interested in the optimal
functional form of the dependence of the position on the mispricing."
See also discussion on optimal growth strategies in
[9].
3.1 Optimal trading in the presence of mean reversion
In [14], Thompson derives closed-form threshold trading strategies in the presence of an Ornstein-Uhlenbeck process and a fixed transaction cost c. If the price follows the OU process
dS_{t} = s dB_{t} - g S_{t} dt
the optimal strategy is a threshold strategy, i.e. to buy if S_{t} ≤ -b/g and to sell if S_{t} ≥ b/g, where b satisfies:
2bg c = 2 e^{b^{2}/(g s^{2})} ∫ e^{-u^{2}/(g s^{2})} du
with c the fixed transaction cost.
3.2 Continuous version of mean reversion: the Ornstein-Uhlenbeck process
Suppose that the dynamics of the mispricing can be modelled by an AR(1) process. The AR(1) process is the discrete-time counterpart of the Ornstein-Uhlenbeck (OU) process in continuous time:
dX_{t} = b(a - X_{t}) dt + s dW_{t}    (18)
where W_{t} is a standard Wiener process, s > 0 and a, b are constants. So the X_{t} process drifts towards a.
The OU process also has a normal transition density function, given by:
f(X_{t}=x, t; X_{t_{0}}=x_{0}, t_{0}) = (2 π v(t))^{-1/2} exp( -(x - m(t))^{2} / (2 v(t)) )    (19)
with mean
m(t) = a + (x_{0} - a) e^{-b(t-t_{0})}    (20)
and variance
v(t) = (s^{2}/(2b)) [1 - e^{-2b(t-t_{0})}]    (21)
If the process displays the property of mean reversion (b > 0), then as t_{0} → -∞ or t - t_{0} → +∞ the marginal density of the process is invariant to time, i.e. the OU process is stationary in the strict sense. Note that there is a time-decay term in the variance.
For long time horizons the variance of this process tends to s^{2}/(2b). So, unlike Brownian motion, the variance is bounded (it does not grow to infinity).
The equation describing dX_{t}, the arithmetic Ornstein-Uhlenbeck equation presented above, is a continuous-time version of the first-order autoregressive process, AR(1), in discrete time. It is the limiting case (D t tends to zero) of the AR(1) process:
x_{t} - x_{t-1} = a(1 - e^{-b}) + (e^{-b} - 1) x_{t-1} + e_{t}    (22)
where e_{t} is normally distributed with mean zero and standard deviation
s_{e} = s [ (1 - e^{-2b}) / (2b) ]^{1/2}    (23)
In order to estimate the parameters of mean reversion, run the regression:
x_{t} - x_{t-1} = â + b̂ x_{t-1} + e_{t}    (24)
Calculate the OU parameters from the regression estimates (hats denote estimates from (24)):
a = -â/b̂    (25)
b = -ln(1 + b̂)
s = s_{e} [ 2 ln(1 + b̂) / ((1 + b̂)^{2} - 1) ]^{1/2}    (26)
The choice of representation may depend on the data. For example, with daily data the discrete AR form is preferable; with high-frequency data it might be preferable to use the continuous-time model.
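The estimation recipe of equations (24)-(26) can be checked numerically: simulate the exact AR(1) discretisation of an OU process, run the regression, and invert the estimates back to (a, b, s). The parameter values and the factor-of-2 form of the volatility inversion (the standard Dixit-Pindyck formulas) are my assumptions for this sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
a_true, b_true, sigma_true = 1.0, 0.5, 0.2
T = 20000

# Exact AR(1) discretisation of dX = b(a - X)dt + s dW at unit time step.
phi = np.exp(-b_true)
sd_eps = sigma_true * np.sqrt((1 - np.exp(-2 * b_true)) / (2 * b_true))
x = np.empty(T)
x[0] = a_true
for t in range(1, T):
    x[t] = a_true * (1 - phi) + phi * x[t - 1] + rng.normal(scale=sd_eps)

# Regression (24):  x_t - x_{t-1} = a_hat + b_hat x_{t-1} + e_t
dx, xlag = np.diff(x), x[:-1]
X = np.column_stack([np.ones(len(xlag)), xlag])
(a_hat, b_hat), *_ = np.linalg.lstsq(X, dx, rcond=None)
eps = dx - X @ np.array([a_hat, b_hat])
s_eps = eps.std(ddof=2)

# Invert to the OU parameters, equations (25)-(26).
a_est = -a_hat / b_hat
b_est = -np.log(1 + b_hat)
sigma_est = s_eps * np.sqrt(2 * np.log(1 + b_hat) / ((1 + b_hat) ** 2 - 1))
print(a_est, b_est, sigma_est)
```

With a long sample the recovered (a, b, s) should be close to the true (1.0, 0.5, 0.2).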
One important distinction between random-walk and stationary AR(1) processes: for the latter all shocks are transitory, whereas for the random walk all shocks are permanent.
Mean Reversion Combined with Exponential Drift
It is possible to combine geometric Brownian motion (exponential drift) with a mean-reverting model:
dX = ( a + h ( x̄ e^{a t} - X ) ) dt + s dW    (27)
where h is the speed of reversion towards the exponentially growing level x̄ e^{a t}.
According to Granger (1981), a time series X_{t} is said to cause another time series Y_{t} if present Y can be predicted better by using past values of X. The first step in the empirical analysis is to examine the stationarity of the price series.
We now look at situations where the validity of the linear regression model is dubious: where variables are trended or, more formally, nonstationary (not quite the same thing). Regressions can be spurious when variables are nonstationary, i.e. you appear to have 'significant' results when in fact you do not.
Nelson and Plosser ran an experiment. They generated two random variables (these are random walks):
x_{t} = x_{t-1} + e_{t}
y_{t} = y_{t-1} + n_{t}    (28)
where both errors have the classical properties and are independent; y and x should therefore be independent. Regressing y on x, they got a 'significant' result (at the 5% level) 75% of the time!
This is worrying!
This is a spurious regression and it occurs because
of the common trends in y and x.
In these circumstances, the t and F statistics do not have
the standard distributions. Unfortunately, the problem generally
gets worse with a larger sample size.
Such problems tend to occur with nonstationary variables and,
unfortunately, many economic variables are like this.
6 Unit Root Hypothesis Testing
Formally, a stochastic process is stationary if all the roots of its characteristic equation are greater than 1 in absolute value. Solving it is the same as solving a difference equation.
Examples. For
y_{t} = r y_{t-1} + e_{t}    (29)
we rewrite it as y_{t} - r y_{t-1} = e_{t}, or (1 - rL) y_{t} = e_{t}. Hence this will be stationary if the root of the characteristic equation 1 - rL = 0 exceeds 1 in absolute value. The root is L = 1/r, which exceeds 1 in absolute value if |r| < 1. This is the condition for stationarity.
Example II: y_{t} = 2.8 y_{t-1} - 1.6 y_{t-2} + e_{t}. The characteristic equation is 1 - 2.8L + 1.6L^{2} = 0, whose roots are L = 1.25 and L = 0.5. Both roots need to be greater than 1 in absolute value for stationarity. That does not apply here, so the process is nonstationary.
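Example II can be checked mechanically with a polynomial-root routine (note that numpy.roots takes coefficients from the highest power down, so the characteristic polynomial 1 - 2.8L + 1.6L^2 is passed as [1.6, -2.8, 1.0]):

```python
import numpy as np

# Roots of the characteristic equation 1 - 2.8 L + 1.6 L^2 = 0.
roots = np.roots([1.6, -2.8, 1.0]).real
print(sorted(roots))                       # 0.5 and 1.25, as in the text

# Both roots must exceed 1 in absolute value for stationarity;
# here one root is 0.5, so the process is nonstationary.
stationary = all(abs(r) > 1 for r in roots)
print(stationary)
```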
However, in practice we do not know the r values; we have to estimate them and then test whether the roots are all greater than 1 in absolute value. We could estimate
y_{t} = r y_{t-1} + e_{t}    (30)
and test
H0: r = 1 (nonstationary)    (31)
versus
H1: r < 1 (stationary)    (32)
using a t-test. Unfortunately, if r = 1 the estimate of r is biased downwards (even in large samples) and the t-distribution is inappropriate, so we cannot use standard methods.
Instead we use the Dickey-Fuller test. Rewrite (30) as
D y_{t} = r^{*} y_{t-1} + e_{t}    (33)
where r^{*} = r - 1. Now we test
H0: r^{*} = 0 (nonstationary)    (34)
versus
H1: r^{*} < 0 (stationary)    (35)
We cannot use critical values from the t-distribution, but Dickey and Fuller provide alternative tables to use.
The DF equation only tests for first-order autocorrelation of y. If the order is higher, the test is invalid and the DF equation suffers from residual correlation. To counter this, add lagged values of D y to the equation before testing. This gives the Augmented Dickey-Fuller test. Sufficient lags should be added to ensure e is white noise. (The 95% critical value for the augmented Dickey-Fuller statistic used here is -3.0199.)
It is important to know the order of integration of nonstationary variables, so that they may be differenced before being included in a regression equation. The ADF test does this, but it should be noted that it tends to have low power (i.e. it fails to reject H0 of nonstationarity even when it is false) against the alternative of a stationary series with r near 1.
6.2 Variants on the Dickey-Fuller Test
The Dickey-Fuller test requires that the u's be uncorrelated. But suppose we have a model like the following, where the first difference of Y is a stationary AR(p) process:
D Y_{t} = Σ_{i=1}^{p} d_{i} D Y_{t-i} + u_{t}    (36)
This yields a model for Y_{t} that is:
Y_{t} = Y_{t-1} + Σ_{i=1}^{p} d_{i} D Y_{t-i} + u_{t}    (37)
If this is really what's going on in our series, and we estimate a standard D.F. test:
Y_{t} = r Y_{t-1} + u_{t}    (38)
the term Σ_{i=1}^{p} d_{i} D Y_{t-i} gets lumped into the errors u_{t}. This induces an AR(p) structure in the u's, and the standard D.F. test statistics will be wrong.
There are two ways of dealing with this problem:
- Change the model (the augmented Dickey-Fuller test), or
- Change the test statistic (the Phillips-Perron test).
6.3 The Augmented Dickey-Fuller Test
Rather than estimating the simple model, we can instead estimate:
D Y_{t} = r Y_{t-1} + Σ_{i=1}^{p} d_{i} D Y_{t-i} + u_{t}    (39)
and test whether or not r = 0. This is the Augmented Dickey-Fuller test. As with the DF test, we can include a constant/trend term to differentiate between a series with a unit root and one with a deterministic trend:
Y_{t} = a + b t + r Y_{t-1} + Σ_{i=1}^{p} d_{i} D Y_{t-i} + u_{t}    (40)
The purpose of the lags of D Y_{t} is to ensure that the u's are white noise. This means that in choosing p (the number of lagged D Y_{t-i} terms to include) we have to consider two things:
- Too few lags will leave autocorrelation in the errors, while
- Too many lags will reduce the power of the test statistic.
This suggests, as a practical matter, a few different ways to go about determining the value of p:
- Start with a large value of p, and reduce it if the values of d_{i} are insignificant at long lags. This is generally a pretty good approach.
- Start with a small value of p, and increase it if values of d_{i} are significant. This is a less good approach.
- Estimate models with a range of values for p, and use an AIC/BIC/F-test to determine which is the best option. This is probably the best option of all.
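The third approach can be sketched directly: fit the ADF regression over a common sample for each candidate p and pick the p that minimizes an information criterion. The simulated series, the maximum lag, and the particular AIC form log(sigma_p^2) + 2k/N are assumptions for this illustration:

```python
import numpy as np

rng = np.random.default_rng(5)

# Stationary AR(2): y_t = 0.9 y_{t-1} + 0.3 (y_{t-1} - y_{t-2}) + e_t,
# so its ADF regression genuinely needs one lagged-difference term.
n = 600
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.9 * y[t - 1] + 0.3 * (y[t - 1] - y[t - 2]) + rng.normal()

def adf_aic(y, p, pmax=5):
    """Fit dY_t = c + rho*Y_{t-1} + sum_{i<=p} d_i dY_{t-i} + u_t over a
    common sample (trimmed at pmax) and return AIC(p) = log(s2) + 2k/N."""
    dy = np.diff(y)
    target = dy[pmax:]
    cols = [np.ones(len(target)), y[pmax:-1]]
    for i in range(1, p + 1):
        cols.append(dy[pmax - i:len(dy) - i])
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma2 = resid @ resid / len(target)
    return np.log(sigma2) + 2 * X.shape[1] / len(target)

aics = {p: adf_aic(y, p) for p in range(6)}
best_p = min(aics, key=aics.get)
print(best_p)
```

Trimming every fit at the same pmax keeps the candidate models comparable; here the criterion should keep at least the one lagged difference the data-generating process actually contains.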
A side note: AIC and BIC tests. The Akaike Information Criterion (AIC) and Bayes Information Criterion (BIC) are general tests for model specification. They can be applied across a range of different areas, and are like F-tests in that they allow for testing the relative power of nested models. Each, however, does so by penalizing models which are overspecified (i.e., those with "too many" parameters). The AIC statistic is:
AIC(p) = log s_{p}^{2} + 2p/N    (41)
where N is the number of observations in the regression, p is the number of parameters in the model (including r and a), and s_{p}^{2} is the estimated s^{2} for the regression including p total parameters. Similarly, the BIC statistic is calculated as:
BIC(p) = ln s_{p}^{2} + p ln(N)/N    (42)
6.3.1 The KPSS test
One potential problem with all the unit root tests so far described is that they take a unit root as the null hypothesis. Kwiatkowski et al. (1992) provide an alternative test (which has come to be known as the KPSS test) for testing the null of stationarity against the alternative of a unit root. This method considers models with constant terms, either with or without a deterministic trend term. Thus, the KPSS test tests the null of a level- or trend-stationary process against the alternative of a unit root.
Formally, the KPSS statistic is
KPSS = (1/N^{2}) Σ_{t=1}^{N} S_{t}^{2} / s^{2}
where S_{t} is the partial sum of the residuals up to time t and s^{2} is the estimated error variance from the regression:
Y_{t} = a + e_{t}    (43)
or:
Y_{t} = a + b t + e_{t}    (44)
for the model with a trend.
The practical advantages of the KPSS test are twofold. First, it provides an alternative to the DF/ADF/PP tests in which the null hypothesis is stationarity. It is thus a good "complement" to the tests we have focused on so far. A common strategy is to present results of both ADF/PP and KPSS tests, and show that the results are consistent (e.g., that the former reject the null while the latter fail to do so, or vice versa). In cases where the two tests diverge (e.g., both fail to reject the null), the possibility of "fractional integration" should be considered (e.g. Baillie 1989; Box-Steffensmeier and Smith 1996, 1998).
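The level-stationarity version of the KPSS statistic is short enough to implement directly. As a simplification, this sketch uses the plain error variance from the constant-only regression rather than a long-run (HAC) variance estimate, which actual implementations use:

```python
import numpy as np

def kpss_level(y):
    """Level-stationarity KPSS: regress Y_t on a constant, take partial sums
    S_t of the residuals, and form sum(S_t^2) / (N^2 * s2).
    Simplified sketch: s2 is the plain error variance, not a HAC estimate."""
    e = y - y.mean()             # residuals from Y_t = a + e_t
    S = np.cumsum(e)
    N = len(y)
    return (S @ S) / (N ** 2 * (e @ e / N))

rng = np.random.default_rng(6)
stationary_series = rng.normal(size=500)
random_walk = np.cumsum(rng.normal(size=500))
print(kpss_level(stationary_series), kpss_level(random_walk))
```

For the stationary series the statistic is small (the null of stationarity is not rejected), while for the random walk it is large, as the test intends.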
General Issues in Unit Root Testing
The Sims (1988) article I assigned points out an issue with unit-root econometrics in general: classicists and Bayesians have very different ideas about the value of knife-edge unit root tests like the ones here. Unlike classical statisticians, Bayesians regard the "true" value of the autocorrelation parameter as a random variable, and the goal is to describe the distribution of this variable, making use of the information contained in the data. One result of this is that, unlike the classical approach (where the distribution is skewed), the Bayesian perspective allows testing using standard t distributions. For more on why this is, see the discussion in Hamilton.
Another issue has to do with lag lengths. As in the case of ARIMA models,
choosing different lag lengths (e.g. in the ADF, PP and KPSS tests) can lead
to different conclusions. This is an element of subjectivity that one needs to
be aware of, and sensitivity testing across numerous different lags is almost
always a good idea.
Finally, the whole reason we do unit root tests will become clearer when we
talk about cointegration in a few weeks.
6.4 Error Correction Model
The Error Correction Model (ECM) is a step forward in determining how variables are linked together.
1. Test the variables for order of integration. They must both (all) be I(d).
2. Estimate the parameters of the long-run relationship. For example,
y_{t} = a + b x_{t} + e_{t}    (45)
When y_{t} and x_{t} are cointegrated, OLS is super-consistent. That is, the rate of convergence is T^{2} rather than just T in Chebyshev's inequality.
3. Denote the residuals from step 2 as ê_{t} and fit the model
D ê_{t} = a ê_{t-1} + n_{t}    (46)
The null and alternative hypotheses are:
H_{0}: a = 0  =>  unit root  =>  no cointegration
H_{1}: a ≠ 0  =>  no unit root  =>  cointegration    (47)

Interpretation: Rejection of the Null implies the residual is
stationary.
If the residual series is stationary then y_{t} and x_{t}
must be cointegrated.
4. If you reject the null in step 3 then estimate the parameters of the
Error Correction Model

Δy_{t} = a_{1} + a_{y} (y_{t-1} − b x_{t-1}) + Σ_{i} a_{11}^{(i)} Δy_{t-i} + Σ_{i} a_{12}^{(i)} Δx_{t-i} + e_{yt}
Δx_{t} = a_{2} + a_{x} (y_{t-1} − b x_{t-1}) + Σ_{i} a_{21}^{(i)} Δy_{t-i} + Σ_{i} a_{22}^{(i)} Δx_{t-i} + e_{xt}
(48)
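The two-step machinery of steps 2 and 3 can be sketched in a few lines. This is an illustrative sketch only (NumPy and simulated data are assumptions, not part of the source): a real test must compare the Dickey-Fuller statistic with the Engle-Granger critical values, not standard t tables.

```python
import numpy as np

def engle_granger_sketch(y, x):
    """Illustrative Engle-Granger two-step machinery (not a full test:
    inference on the residual regression needs the Engle-Granger
    critical values, not standard t tables)."""
    T = len(y)
    X = np.column_stack([np.ones(T), x])
    a, b = np.linalg.lstsq(X, y, rcond=None)[0]   # step 2: long-run OLS
    e = y - a - b * x                             # equilibrium errors
    de, lag = np.diff(e), e[:-1]
    rho = (lag @ de) / (lag @ lag)                # step 3: DF coefficient
    return a, b, rho

# Simulated cointegrated pair: x is I(1), y = 0.5 + 2 x + I(0) noise.
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=2000))
y = 0.5 + 2.0 * x + rng.normal(size=2000)
a, b, rho = engle_granger_sketch(y, x)
# b sits close to 2 (super-consistency); rho is strongly negative,
# i.e. the residuals mean-revert.
```

On cointegrated data like this, b is pinned down very precisely even in moderate samples, which is the practical content of super-consistency.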

The ECM generalizes to vectors:
The components of the vector x_{t} = (x_{1t}, x_{2t}, ..., x_{nt}) are
cointegrated of order (d,b), denoted x_{t} ~ CI(d,b), if
(i) all components of x_{t} are I(d), and
(ii) there exists a vector b = (b_{1}, b_{2}, ..., b_{n}) such that
b'x_{t} is I(d−b), where b > 0.
The vector b is called the cointegrating vector.
Points to remember:
To make b unique we must normalize on one of the coefficients.
All variables must be integrated of the same order; but variables
of the same order I(d) are not necessarily cointegrated.
If x_{t} is n × 1, then there may be as many as n − 1 cointegrating
vectors. The number of cointegrating vectors is called the
cointegrating rank.
An interpretation of cointegrated variables is that they share a
common stochastic trend.
Given our notions of equilibrium in economics, we must conclude
that the time paths of cointegrated variables are determined in
part by how far we are from equilibrium. That is, if the
variables wander from each other, there must be some way for them
to get back together. This is
the notion of error correction.
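As a minimal sketch of how one equation of the ECM (48) can be estimated once the long-run relation is in hand (NumPy and the simulated system below are illustrative assumptions; this is the two-step route, not the Johansen approach):

```python
import numpy as np

def fit_ecm_equation(y, x, p=1):
    """Sketch of the first equation of (48): regress dy_t on a constant,
    the lagged equilibrium error (y_{t-1} - a - b x_{t-1}), and p lags
    of dy and dx. The long-run a, b come from a first-stage OLS."""
    T = len(y)
    X = np.column_stack([np.ones(T), x])
    a, b = np.linalg.lstsq(X, y, rcond=None)[0]
    ect = y - a - b * x                     # equilibrium error e_t
    dy, dx = np.diff(y), np.diff(x)         # dy[s] = y[s+1] - y[s]
    rows, target = [], []
    for s in range(p, len(dy)):
        rows.append([1.0, ect[s]]
                    + [dy[s - i] for i in range(1, p + 1)]
                    + [dx[s - i] for i in range(1, p + 1)])
        target.append(dy[s])
    coef = np.linalg.lstsq(np.array(rows), np.array(target), rcond=None)[0]
    return coef   # [a1, a_y, a11 lags..., a12 lags...]

# Simulated system: x is a random walk, y = 2 x + u with u an AR(1)
# (phi = 0.5), so the speed of adjustment a_y should be near phi - 1 = -0.5.
rng = np.random.default_rng(1)
n, phi = 5000, 0.5
x = np.cumsum(rng.normal(size=n))
u = np.zeros(n)
for t in range(1, n):
    u[t] = phi * u[t - 1] + rng.normal()
y = 2.0 * x + u
coef = fit_ecm_equation(y, x)
```

The coefficient on the error-correction term is the interesting one: a significantly negative value is what "pulls the variables back together" when they wander apart.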
Granger Representation theorem :
"Cointegration implies Error Correction Model (ECM)."
6.5.1 Aaron at wilmott.com
Cointegration is covered in any good econometrics textbook. If
you need more depth, Johansen's LikelihoodBased Inference in
Cointegrated Vector Autoregressive Models, Oxford University
Press, 1995, is good.
I do not recommend the original Engle and Granger paper (1987).
Two series are said to be (linearly) "cointegrated" if a (linear)
combination of them is stationary. The practical effect in
Finance is we deal with asset prices directly instead of asset
returns.
For example, suppose I want to hold a market neutral investment
in stock A (I think stock A will outperform the general market,
but I have no view about the direction of the overall market).
Traditionally, I buy $1,000,000 of stock A and short $1,000,000
times Beta of the Index. Beta is derived from the covariance of
returns between stock A and the Index. Looking at things another
way, I choose the portfolio of A and the Index that would have
had minimum variance of return in the past (over the estimation
interval I used for Beta, and subject to the constraint that the
holding of stock A is $1,000,000).
A linear cointegration approach is to select the portfolio in the
past that would have been most stationary. There are a variety of
ways of defining this (just as there are a variety of ways of
estimating Beta) but the simplest one is to select the portfolio
with the minimum extreme profit or loss over the interval. Note
that the criterion is based on P&L of the portfolio (price) not
return (derivative of price).
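The minimum-extreme-P&L selection described above can be sketched with a simple grid search; the function name, grid bounds and toy data below are assumptions for illustration:

```python
import numpy as np

def min_extreme_pnl_hedge(price_a, price_idx, grid=None):
    """Sketch of the simple criterion above: scan candidate hedge
    ratios h and keep the one whose historical P&L path
    price_a - h * price_idx has the smallest peak-to-trough range.
    Both the grid and the range criterion are illustrative choices."""
    if grid is None:
        grid = np.linspace(0.0, 3.0, 301)
    best_h, best_range = None, np.inf
    for h in grid:
        pnl = price_a - h * price_idx
        spread = pnl.max() - pnl.min()      # extreme profit minus extreme loss
        if spread < best_range:
            best_h, best_range = h, spread
    return best_h

# Toy data: the index is a random walk, stock A tracks 1.5x the index
# plus stationary noise, so the chosen hedge ratio should be near 1.5.
rng = np.random.default_rng(2)
idx = 100 + np.cumsum(rng.normal(size=2000))
a_px = 1.5 * idx + 0.5 * rng.normal(size=2000)
h = min_extreme_pnl_hedge(a_px, idx)
```

Note the objective is a function of the P&L path (price levels), not of returns, which is exactly the distinction the text draws between the cointegration and the Beta approach.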
The key difference is correlation is extremely sensitive to small
time deviations, cointegration is not. Leads or lags in either
price reaction or data measurement make correlations useless. For
example, suppose you shifted the data in the problem above, so
your stock quotes were one day earlier than your Index quotes.
That would completely change the Beta, probably sending it near
zero, but would have little effect on the cointegration analysis.
Economists need cointegration because they deal with bad data,
and their theories incorporate lots of unpredictable leads and
lags. Finance, in theory, deals with perfect data with no leads
or lags. If you have really good data of execution prices,
cointegration throws out your most valuable (in a money-making
sense) information. If you can really execute, you don't care if
there's only a few seconds in which to do so. But if you have bad
data, either in the sense that the time is not welldetermined or
that you may not be able to execute, cointegration is much safer.
In a sense, people have been using cointegration for asset
management as long as they have been computing historical pro
forma strategy returns and looking at the entire chart, not just
the mean and standard deviation (or other assumedstationary
parameters).
My feeling is cointegration is essential for risk management and
hedging, but useless for trading and pricing. Correlation is easy
and well-understood. You can use it for risk management and
hedging, but only if you backtest (which essentially checks the
results against a cointegration approach) to find the
appropriate adjustments and estimation techniques. Correlation is
useful for trading and pricing (sorry Paul) but only if you allow
stochastic covariance matrices.
More formally, if a vector
of time series is I(d) but a linear combination is integrated to a lower order, the time
series are said to be cointegrated.
from : Econometric Forecasting
http://www.lums2.lancs.ac.uk/MANSCI/Staff/EconometricForecasting.pdf
These are the arguments in favor of testing whether a series has a unit root:
(1) It gives information about the nature of the series that
should be helpful in model specification, particularly whether to
express the variable in levels or in differences.
(2) For two or more variables to be cointegrated each must
possess a unit root (or more than one).
These are the arguments against testing:
(1) Unit root tests are fairly blunt tools. They have low power
and often conclude that a unit root is present when in fact it is
not. Therefore, the finding that a variable does not possess a
unit root is a strong result. What is perhaps less well known is
that many unit-root tests suffer from size distortions. The
actual chance of rejecting the null hypothesis of a unit root,
when it is true, is much higher than implied by the nominal
significance level. These findings are based on 15 or more Monte
Carlo studies, of which Schwert (1989) is the most influential
(Stock 1994, p. 2777).
(2) The testing strategy needed is quite complex.
In practice, a nonseasonal economic variable rarely has more than
a single unit root and is made stationary by taking first
differences. Dickey and Fuller (1979) recognized that they could
test for the presence of a unit root by regressing the
first-differenced series on lagged values of the original series.
If a unit root is present, the coefficient on the lagged values
should not differ significantly from zero. They also developed
the special tables of critical values needed for the test.
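The Dickey-Fuller regression just described can be sketched as follows (a hedged illustration: no constant and no augmentation lags, and the resulting t-statistic must be judged against the Dickey-Fuller tables, not the normal distribution):

```python
import numpy as np

def df_tstat(y):
    """Dickey-Fuller regression sketch (no constant, no lags):
    dy_t = rho * y_{t-1} + e_t. Returns the t-statistic on rho,
    to be compared with the Dickey-Fuller critical values."""
    dy, lag = np.diff(y), y[:-1]
    rho = (lag @ dy) / (lag @ lag)
    resid = dy - rho * lag
    s2 = resid @ resid / (len(dy) - 1)
    return rho / np.sqrt(s2 / (lag @ lag))

rng = np.random.default_rng(3)
stationary = rng.normal(size=1000)              # no unit root
random_walk = np.cumsum(rng.normal(size=1000))  # unit root
t_stat_st = df_tstat(stationary)    # far below any critical value
t_rw = df_tstat(random_walk)        # typically small in magnitude
```

Under the null of a unit root the coefficient on the lagged level should not differ significantly from zero, which is why the random-walk series produces the unremarkable statistic.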
Since the publication of the original unit root test there has
been an avalanche of modifications, alternatives, and
comparisons. Banerjee, Dolado, Galbraith, and Hendry (1993,
chapter 4) give details of the more popular methods. The standard
test today is the augmented DickeyFuller test (ADF), in which
lagged dependent variables are added to the regression. This is
intended to improve the properties of the disturbances, which the
test requires to be independent with constant variance, but
adding too many lagged variables weakens an already low-powered
test.
Two problems must be solved to perform an ADF unit-root test: How
many lagged variables should be used? And should the series be
modeled with a constant and deterministic trend which, if
present, distort the test statistics? Taking the second problem
first, the ADF-GLS test proposed by Elliott, Rothenberg, and
Stock (1996) has a straightforward strategy that is easy to
implement and uses the same tables of critical values as the
regular ADF test.
First, estimate the coefficients of an ordinary trend regression
but use generalized least squares rather than ordinary least
squares. Form the detrended series, y^{d}, given by
y_{t}^{d} = y_{t} − b_{0} − b_{1} t, where
b_{0} and b_{1} are the coefficients just estimated.
In the second stage, conduct a unit root test with the standard
ADF approach with no constant and no deterministic trend but use
y^{d} instead of the original series.
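A sketch of the first (GLS detrending) stage, assuming the usual Elliott-Rothenberg-Stock choice c̄ = −13.5 for the linear-trend case (the function name and simulated series are illustrative):

```python
import numpy as np

def gls_detrend(y, cbar=-13.5):
    """Sketch of ERS GLS detrending: quasi-difference y and the trend
    regressors z_t = (1, t) with a_bar = 1 + cbar/T (cbar = -13.5 in
    the linear-trend case), estimate (b0, b1) by OLS on the
    quasi-differenced data, and subtract the fitted trend."""
    T = len(y)
    abar = 1.0 + cbar / T
    z = np.column_stack([np.ones(T), np.arange(1.0, T + 1)])
    yq = np.concatenate([[y[0]], y[1:] - abar * y[:-1]])
    zq = np.vstack([z[0], z[1:] - abar * z[:-1]])
    b = np.linalg.lstsq(zq, yq, rcond=None)[0]
    return y - z @ b   # y^d, passed to an ADF test with no deterministics

# Trend-stationary example: after detrending, the fitted trend is gone.
rng = np.random.default_rng(4)
t = np.arange(1.0, 3001)
y = 5.0 + 0.1 * t + rng.normal(size=3000)
yd = gls_detrend(y)
```

The second stage is then the standard ADF regression on `yd` with no constant and no trend, as the text describes.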
To solve the problem of how many lagged variables to use, start
with a fairly high lag order, for example, eight lags for annual,
16 for quarterly, and 24 for monthly data. Test successively
shorter lags to find the length that gives the best compromise
between keeping the power of the test up and keeping the
desirable properties of the disturbances. Monte Carlo experiments
reported by Stock (1994) and Elliott, Rothenberg and Stock (1996)
favor the Schwarz BIC over a likelihood-ratio criterion but both
increased the power of the unitroot test compared with using an
arbitrarily fixed lag length. We suspect that this difference has
little consequence in practice. Cheung and Chinn (1997) give an
example of using the ADFGLS test on US GNP.
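The start-high-and-shrink lag-selection idea with a BIC criterion might be sketched like this (illustrative only; a common estimation sample is used across candidate lags so the BIC values are comparable):

```python
import numpy as np

def adf_lag_by_bic(y, max_lag=8):
    """Sketch: pick the ADF augmentation lag by BIC. Fits
    dy_t = rho * y_{t-1} + sum_i c_i dy_{t-i} + e_t for each p and
    returns the p with the smallest BIC on a common sample."""
    dy = np.diff(y)
    best = None
    for p in range(0, max_lag + 1):
        s = max_lag                      # align all fits on one sample
        Z = [y[s:-1]] + [dy[s - i:len(dy) - i] for i in range(1, p + 1)]
        Z = np.column_stack(Z)
        target = dy[s:]
        coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
        resid = target - Z @ coef
        n = len(target)
        bic = n * np.log(resid @ resid / n) + (p + 1) * np.log(n)
        if best is None or bic < best[0]:
            best = (bic, p)
    return best[1]

# Unit-root series whose first differences are AR(1) with phi = 0.6,
# so at least one augmentation lag is genuinely needed.
rng = np.random.default_rng(5)
n = 2000
d = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    d[t] = 0.6 * d[t - 1] + eps[t]
y = np.cumsum(d)
p_hat = adf_lag_by_bic(y)
```

BIC's heavy per-parameter penalty is what keeps the selected lag short, which matters here because, as noted above, every extra lag drains power from an already low-powered test.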
Although the ADF-GLS test has so far been little used, it does
seem to have several advantages over competing unit-root tests:
(1) It has a simple strategy that avoids the need for sequential
testing starting with the most general form of ADF equation (as
described by Dolado, Jenkinson, and SosvillaRivero (1990, p.
225)).
(2) It performs as well as or better than other unit-root tests.
Monte Carlo studies show that its size distortion (the difference
between actual and nominal significance levels) is almost as good
as the ADF t-test (Elliott, Rothenberg & Stock 1996; Stock 1994)
and much less than the Phillips-Perron Z test (Schwert 1989).
Also, the power of the ADF-GLS statistic is often much greater
than that of the ADF t-test, particularly in borderline
situations.
Elliott et al. (1996) showed that there is no uniformly most
powerful test for this problem and derived tests that were
approximately most powerful in the sense that they have
asymptotic power close to the envelope of most powerful tests for
this problem.

7 Variance Ratio Test
The test is based on the idea that, if a series is stationary, the variance
of the series is not increasing over time, while a series with a unit root
has increasing variance.
Intuition: compare the variance of a subset of the data "early" in
the series with a similarly-sized subset "later" in the process. In the
limit, for a stationary series, these two values should be the same, while
they will be different for an I(1) series. Thus, the null hypothesis is
stationarity, as for the KPSS test.
There's a good, brief discussion of these tests in Hamilton (pp. 531-32).
Other cites are Cochrane (1988), Lo and MacKinlay (1988), and
Cecchetti and Lam (1991).
The variance ratio methodology tests the hypothesis that the variance of multiperiod
returns increases linearly with time.
Hence if we calculate the variance s^{2} of a series of returns sampled
every Δt periods, the null hypothesis implies that sampling every kΔt
periods leads to a variance k s^{2}:
Variance(r_{kΔt}) = k Variance(r_{Δt})
(49)
The variance ratio
VR(k) = Variance(r_{kΔt}) / (k Variance(r_{Δt}))
(50)
is equal to one under a random walk, significantly below one under mean
reversion, and above one under mean aversion.
More precisely, in the usual fixed-k asymptotic treatment, under the null
hypothesis that the x_{t} follow a random walk with possible drift,
x_{t} = µ + x_{t-1} + e_{t}
(52)
where µ is a real number and e_{t} is a sequence of zero mean independent
random variables, it is possible to show that
n^{1/2} (VR(k) − 1) → N(0, s_{k}^{2})
(53)
where s_{k}^{2} is some simple function of k (it is not the plain variance
because of the overlapping observations, used to keep the sample size
sufficiently large for large k, and because of the correction for bias in
the variance estimators).
Note that this result is quite general and holds under the simple
hypothesis that e_{t} is a sequence of zero mean independent random
variables. Any significant deviation from (53) means that the e_{t} are
not independent random variables.
This result extends to the case where the e_{t}
are a martingale difference series
with conditional heteroscedasticity
though the variance s_{k}^{2} has to be adjusted a little.
The use of the VR statistic
can be advantageous when testing against several interesting
alternatives to the random walk model, most notably those
hypotheses associated with mean reversion.
In fact, a number of authors (e.g., Lo and MacKinlay (1989),
Faust (1992) and Richardson and Smith (1991)) have found that the
VR statistic has optimal power against such alternatives.
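A minimal VR(k) computation following (50), using overlapping k-period sums and no small-sample bias correction (both simplifying assumptions on top of the text):

```python
import numpy as np

def variance_ratio(r, k):
    """VR(k) as in (50): variance of overlapping k-period returns
    divided by k times the variance of one-period returns.
    No small-sample bias correction is applied."""
    r = np.asarray(r, dtype=float)
    rk = np.convolve(r, np.ones(k), mode="valid")   # overlapping k-sums
    return rk.var(ddof=1) / (k * r.var(ddof=1))

rng = np.random.default_rng(6)
iid = rng.normal(size=100_000)       # random-walk increments
vr10 = variance_ratio(iid, 10)       # near 1 under the null

ma = iid[1:] - 0.5 * iid[:-1]        # MA(1) with rho_1 = -0.4: mean reversion
vr2 = variance_ratio(ma, 2)          # near 1 + rho_1 = 0.6
```

The two cases illustrate the classification above: roughly one for random-walk increments, below one for mean-reverting returns.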
Note that VR(k) can be written as:
VR(k) = Variance(r_{kt}) / (k Variance(r_{t}))
      = Variance(r_{t} + r_{t+1} + ... + r_{t+k-1}) / (k Variance(r_{t}))
(54)

This expression can be expanded into:
VR(k) = 1 + 2 Σ_{i=1}^{k-1} (1 − i/k) ρ_{i}
(55)
where ρ_{i} is the i-th term of the autocorrelation function (ACF) of
returns. This expression holds asymptotically. Note that it can be used
to calculate the ACF at various lags.
For example, for k=2,
VR(2) = 1 + ρ_{1}
(56)
so a VR(2) significantly below one is equivalent to a negative
autocorrelation at lag one: ρ_{1} < 0.
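The identity VR(2) = 1 + ρ_1 can be checked numerically; the AR(1) return simulation below is an illustrative assumption:

```python
import numpy as np

def variance_ratio(r, k):
    """VR(k): variance of overlapping k-period returns over
    k times the one-period variance."""
    rk = np.convolve(r, np.ones(k), mode="valid")
    return rk.var(ddof=1) / (k * r.var(ddof=1))

def acf(r, i):
    """Sample autocorrelation of r at lag i."""
    c = r - r.mean()
    return (c[:-i] @ c[i:]) / (c @ c)

# AR(1) returns with phi = -0.3, so rho_1 = -0.3 and VR(2) is about 0.7.
rng = np.random.default_rng(7)
n, phi = 200_000, -0.3
eps = rng.normal(size=n)
r = np.empty(n)
r[0] = eps[0]
for t in range(1, n):
    r[t] = phi * r[t - 1] + eps[t]

lhs = variance_ratio(r, 2)
rhs = 1 + acf(r, 1)        # identity (56), holds asymptotically
```

With a long sample the two sides agree closely, and both sit well below one, consistent with mean reversion.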
Let x_{i}, i = 0, ..., N, be the series of log prices, so that the
increments are the log returns; N is the sample size,
M = k(N−k+1)(1−k/N), and µ̂ = (x_{N} − x_{0})/N is the estimate of the
mean. VR(k) is defined as σ̂_{a}^{2}/σ̂_{c}^{2}.
Testing the null hypothesis amounts to testing whether the resulting
statistic is normally distributed. Classical normality tests can be
applied, such as z-scores or the Kolmogorov-Smirnov test (or rather a
Lilliefors test).
8 Absolute Return Ratio Test
source: Groenendijk et al., A Hybrid Joint Moment Ratio Test for Financial Time Series
see []
If one augments the martingale assumption for financial
asset prices with the condition that the martingale differences have constant
(conditional) variance, it follows that the variance of asset returns is directly
proportional to the holding period. This property has been used to construct
formal testing procedures for the martingale hypothesis, known as variance
ratio tests.
Variance ratio tests are especially good at detecting linear dependence
in the returns.
While the variance ratio statistic describes
one aspect of asset returns, the idea behind this statistic can be generalized
to provide a more complete characterization of asset return data.
We focus on using a combination of the
variance ratio statistic and the first absolute moment ratio statistic.
The first absolute moment ratio statistic by itself is useful as a measure of linear
dependence if no higher order moments than the variance exist.
In combination with the variance ratio statistic it can be used to disentangle linear
dependence from other deviations of the standard assumption in
finance of
unconditionally normally distributed returns. In particular, the absolute
moment ratio statistic provides information concerning the tail of the
distribution and conditional heteroskedasticity.
By using lower order moments of
asset returns in the construction of volatility ratios, e.g., absolute returns,
one relaxes the conditions on the number of moments that need to exist for
standard asymptotic distribution theory to apply.
We formally prove that
our general testing methodology can in principle even be applied for return
distributions that lie in the domain of attraction of a stable law (which
includes the normal distribution as a special case). Stable laws, apart from
the normal distribution, have infinite variance, such that our approach is
applicable outside the finitevariance paradigm.
This robustness matters because, in empirical work, there often exists
considerable controversy about the precise nature of the asset return
distribution.
The first absolute moment has been used before as a measure of volatility.
Müller et al. observe a regularity in the absolute moment estimates which
is not in line with the presumption of i.i.d. normal innovations; this
regularity was labelled the scaling law. In
this paper we consider the ratios of these absolute moment estimates, we
obtain their statistical properties under various distributional assumptions,
and we explain the observed regularity behind the `scaling law'.
In particular, we show why the deviations observed by Muller et al. should not
be carelessly interpreted as evidence against the efficient market hypothesis.
Furthermore, we show that the absolute moment ratio statistics contain much
more information than the scaling law. Especially, when the statistic is used
in combination with the variance ratio statistic, most of the characteristic
features of asset returns come to the fore.
Specifically, we advocate the simultaneous use of volatility statistics based
on first (absolute returns) and second order moments (variances).
In such
a way we construct a test which is not only suited to detect linear
dependence in asset returns, but also fattailedness and nonlinear dependence,
e.g., volatility clustering.
We analytically show why moment ratios based
on absolute returns can be used to detect fattailedness and volatility
clustering, while standard variance ratios convey no information in this respect.
Discriminating between the alternative phenomena is important, since they
have different implications for portfolio selection and risk management.
Throughout the paper, we rely on a convenient graphical representation of
the statistics: the moment ratio curves.
The formal testing procedure we propose in this paper heavily builds
on the bootstrap. By performing a nonparametric bootstrap based on the
empirical returns, we construct uniform confidence intervals for the range
of moment ratios considered.
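A sketch of the first absolute moment ratio statistic described above (the function name and scaling convention are assumptions; the sqrt(k) scaling makes the ratio one for i.i.d. normal returns, the "scaling law" benchmark):

```python
import numpy as np

def abs_moment_ratio(r, k):
    """First absolute moment ratio sketch: mean absolute k-period
    return over sqrt(k) times the mean absolute one-period return.
    Approaches 1 for i.i.d. normal returns; fat tails and volatility
    clustering push it away from 1."""
    r = np.asarray(r, dtype=float)
    rk = np.convolve(r, np.ones(k), mode="valid")   # overlapping k-sums
    return np.abs(rk).mean() / (np.sqrt(k) * np.abs(r).mean())

rng = np.random.default_rng(8)
normal_r = rng.normal(size=200_000)
amr = abs_moment_ratio(normal_r, 10)        # close to 1 for iid normal

fat = rng.standard_t(df=4, size=200_000)    # fat-tailed iid returns
amr_fat = abs_moment_ratio(fat, 10)         # pushed above 1
```

Variance ratios are close to one in both cases here, so it is the absolute moment ratio that separates fat tails from genuine linear dependence, which is the disentangling role the text assigns to it.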
Among the powers of returns, absolute returns exhibit the highest
autocorrelation (Rama Cont [4]).
9 Multivariate Cointegration - Vector Error Correction Modelling
Among the general class of the multivariate ARIMA (AutoRegressive
Integrated Moving Average) model, the Vector Autoregressive (VAR) model turns
out to be particularly convenient for empirical work. Although there are important
reasons to allow also for moving average errors (e.g. L. utkepohl 1991, 1999), the
VAR model has become the dominant work horse in the analysis of multivariate
time series. Furthermore, Engle and Granger (1987) show that VAR model is
an attractive starting point to study the long run relationship between time series
that are stationary in first differences. Since Johansen's (1988) seminal paper, the
cointegrated VAR model has become very popular in empirical macroeconomics.
see resources.

[6], Engle and Granger seminal paper:
"Cointegration and ErrorCorrection: Representation, Estimation and Testing",
 ***** Explaining Cointegration Analysis: David F. Hendry and
Katarina Juselius:
part I, cached
part II, cached
 Carol Alexander is specialized in cointegration trading
and index tracking,
"Cointegration and asset allocation: A new active hedge fund strategy"
[2]
includes a good intro to cointegration,
see also, http://www.bankingmm.com
and related paper of Alexander [1]
"Cointegrationbased trading strategies:
A new approach to enhanced index tracking and statistical arbitrage"
 In
"Intraday Price Formation in US Equity Index Markets" [8],
Joel Hasbrouck is studying relationships between stocks and futures and ETF.
including
implementation source codes, cached
and
presentation slides
*****
 Chambers [3]
This paper analyses the effects of
sampling frequency on the properties of spectral regression
estimators of cointegrating parameters.
"Numerically Stable Cointegration Analysis" [5]
is a practical implementation to estimate common trends:
Cointegration analysis involves the solution of a generalized eigenproblem involving moment matrices
and inverted moment matrices. These formulae are unsuitable for actual computations because the condition
numbers of the resulting matrices are unnecessarily increased. Our note discusses how to use the structure of
the problem to achieve numerically stable computations, based on QR and singular value decompositions.
 [11], a simple illustration of cointegration
with a drunk man and his dog ...
****
 [13] Adrian Trapletti paper on intraday cointegration
for forex.
 Common stochastic trends, cycles and sectoral fluctuations
cached
*****
 johansen test critical values
 web
cached
This paper uses Reuters highfrequency exchange rate data to investigate the
contributions to the price discovery process by individual banks in the foreign ex
change market.
 "Intraday LeadLag Relationships Between the
Futures, Options and Stock Market", F. de Jong and M.W.M. Donders.
Includes an interesting method to estimate cross-correlation with
asynchronous trades.
 Cointegration in Single Equations in lectures from
Ronald Bewley
 Vector Error Correction Modeling from SAS online support.
Variance ratio testing: another test for stationarity,
application to stock market indices, cached
random walk or mean reversion ..., cached
On the Asymptotic Power of the Variance Ratio Test, cached
 a general discussion on
Econometric Forecasting and methods, by P. Geoffrey Allen and
Robert Fildes.
a simple presentation of the Dickey-Fuller test
in French, cached
The R
tseries package includes the Augmented Dickey-Fuller test
How to do a 'Regular' Dickey-Fuller Test Using Excel
cached
bibliography list from Petits Déjeuners de la Finance

Alexander, C. (1994): History Debunked
RISK 7 no.12 (1994) pp5963
 Alexander, C. (1995): Cofeatures in international bond and equity markets
Mimeo
 Alexander, C., Johnson, A. (1992): Are foreign exchange markets really efficient ?
Economics Letters 40 (1992) 449453
 Alexander, C., Johnson, A. (1994): Dynamic Links
RISK 7:2 pp5661
 Alexander, C., Thillainathan, R (1996): the Asian Connections
Emerging Markets Investor 2:6 pp4247
 Beck, S.E.(1994): Cointegration and market inefficiency in commodities futures markets
Applied Economics 26:3 pp 24957
 Bradley, M., Lumpkin, S. (1992): The Treasury yield curve as a cointegrated system
Journal of Financial and Quantitative Analysis 27 pp 44963
 Brenner, R.J., Kroner, K.F. (1995): Arbitrage, cointegration and testing
the unbiasedness hypothesis in financial markets
Journal of Financial and Quantitative Analysis 30:1 pp2342
 Brenner, R.J., Harjes, Kroner, K.F. (1996): Another look at alternative
models of the short term interest rate
Journal of Financial and Quantitative Analysis 31:1 pp85108
 Booth, G., Tse, Y. (1995): Long Memory in Interest Rate Futures Markets:
A Fractional Cointegration Analysis
Journal of Futures Markets 15:5
Campbell, J.Y., Lo, A.W., MacKinlay, A.C. (1997): The Econometrics of
Financial Markets Princeton University Press
Cerchi, M., Havenner, A. (1988): Cointegration and stock prices
Journal of Economic Dynamics and Control 12 pp3334
 Chowdhury, A.R. (1991): Futures market efficiency: evidence from
cointegration tests
The Journal of Futures Markets 11:5 pp57789
Choi, I. (1992): Durbin-Hausman tests for a unit root
Oxford Bulletin of Economics and Statistics 54:3 pp289304
 Clare, A.D., Maras, M., Thomas, S.H. (1995): The integration and
efficiency of international bond markets
Journal of Business Finance and Accounting 22:2 pp31322
 Cochrane, J.H. (1991): A critique of the application of unit root tests
Jour. Econ. Dynamics and Control 15 pp27584
 Dickey, D.A., Fuller, W.A. (1979): Distribution of the estimates for
autoregressive time series with a unit root
Journal of the American Statistical Association 74 pp4279
 Duan, J.C., Pliska, S. (1998): Option valuation with cointegrated asset prices
Mimeo
 Dwyer, G.P., Wallace, M.S. (1992): Cointegration and market efficiency
Journal of international Money and Finance
 Engle, R.F., Granger, C.W.J. (1987): Cointegration and error correction:
representation, estimation and testing
Econometrica 55:2 pp25176
 Engle, R.F., Yoo, B.S. (1987): Forecasting and testing in cointegrated systems
Jour. Econometrics 35 pp14359
 Granger, C.W.J. (1988): Some recent developments on a concept of causality
Jour. Econometrics 39 pp199211
Hall, A.D., Anderson, H.M., Granger C.W.J. (1992): A cointegration
analysis of Treasury bill yields
The Review of Economics and Statistics pp11626
 Hamilton, J.D. (1994): Time Series Analysis
Princeton University Press
 Harris, F.deB., McInish, T.H., Shoesmith, G.L., Wood, R.A. (1995):
Cointegration, Error Correction, And Price Discovery On Informationally Linked
Security Markets
Journal of Financial and Quantitative Analysis 30:4
 Hendry, D.F. (1986): Econometrics modelling with cointegrated variables:
an overview
Oxford Bulletin of Economics and Statistics 48:3 pp20112
 Hendry, D.F. (1995): Dynamic Econometrics
Oxford University Press
 Johansen, S. (1988): Statistical analysis of cointegration vectors
Journal of Economic Dynamics and Control 12 pp23154
 Johansen, S., Juselius, K. (1990): Maximum likelihood estimation and
inference on cointegration  with applications to the demand for money
Oxford Bulletin of Economics and Statistics 52:2 pp169210
 Masih, R. (1997): Cointegration of markets since the '87 crash
Quarterly Review of Economics and Finance 37:4
 Proietti, T. (1997): Shortrun dynamics in cointegrated systems
Oxford Bulletin of Economics and Statistics 59:3
 Schwartz, T.V., Szakmary, A.C. (1994): Price discovery in petroleum
markets: arbitrage, cointegration and the time interval of analysis
Journal of Futures Markets 14:2 pp147167
 Schmidt, P., Phillips, P.C.B. (1992): LM tests for a unit root in the
presence of deterministic trends
Oxford Bulletin of Economics and Statistics 54:3 pp257288
 Wang, G.H.K., Yau, J. (1994): A Time Series Approach To Testing For
Market Linkage: Unit Root And Cointegration Tests
Journal of Futures Markets 14:4
 [1]

Alexander, C and Dimitriu, A.
''Cointegrationbased trading strategies: A new approach to enhanced index
tracking and statistical arbitrage'' , 2002.
...
Discussion Paper 200208, ISMA Centre Discussion Papers in Finance
Series.
 [2]

Alexander, C, Giblin, I, and Weddington, W.
''Cointegration and asset allocation: A new active hedge fund strategy'' ,
2002.
...
Discussion Paper 200308, ISMA Centre Discussion Papers in Finance
Series.
 [3]

Chambers, M. J.
''Cointegration and Sampling Frequency'' , 2002.
...
 [4]

Cont, R.
''Empirical properties of asset returns  stylized facts and statistical
issues'' .
QUANTITATIVE FINANCE, 2000.
...
 [5]

Doornik, J. A and O'Brien, R.
''Numerically Stable Cointegration Analysis'', 2001.
...
 [6]

Engle, R and Granger, C.
''Cointegration and ErrorCorrection:
Representation, Estimation and Testing'' .
Econometrica, 55:251276, 1987.
 [7]

Gatev, E. G., Goetzmann, W. N., and Rouwenhorst, K. G.
''Pairs
Trading: Performance of a Relative Value Arbitrage Rule'' , Nov 1998.
...
 [8]

Hasbrouck, J.
''Intraday Price Formation in US Equity Index Markets'' , 2002.
...
 [9]

Herlemont, D.
''Croissance optimale'' , 2003.
...
Discussion papers.
 [10]

Kargin, V.
''Optimal Convergence Trading'' , 2004.
...
 [11]

Murray, M. P.
''A
drunk and her dog : An illustration of cointegration and error correction''
.
The American Statistician, Vol. 48, No. 1, February 1994, pp. 37-39.
...
 [12]

Stock, J. H and Watson, M.
''Variable Trends in Economic Time Series''.
Journal of Economic Perspectives, 3(3):147174, Summer 1988.
 [13]

Trapletti, A, Geyer, A, and Leisch, F.
''Forecasting exchange rates using cointegration models and intraday data''
.
Journal of Forecasting, 21:151166, 2002.
...
 [14]

Thompson, G. W. P.
''Optimal trading of an asset driven by a hidden Markov process in the
presence of fixed transaction costs'' , 2002.
...
 1

source: from Andrei Simonov, no longer available online
© Copyright 2003 YATS,
Daniel HERLEMONT,
All rights reserved,