# Coding Cross Correlation Function in Python

#### DarthXander

I have two data sets on which I wish to compute the discrete correlation
function (DCF), and then plot the results for many values of the lag t to
see what time lag, if any, exists between the data.
Thus far my code is:

import csv
import pylab
from pylab import *
from numpy import *
from numpy import array

x=[]
a=[]
y=[]
b=[]
g=[]
h=[]
d=[]

for Date, Close in HSBC:
    x.append(Date)
    a.append(float(Close))

for Date, Close in Barclays:
    y.append(Date)
    b.append(float(Close))

for index in range(len(a)):
    g.append(a[index]-mean(a))

for index in range(len(b)):
    h.append(b[index]-mean(b))

r=std(a)
s=std(b)

So I have all the necessary components for the DCF.

However, I'm now faced with the challenge of performing the DCF for t
in the range of potentially 0-700 or so.
Currently I could do it individually for each value of tau, e.g. for a lag of 1:

t1=[]
for index in range(len(g)-1):
    j=(g[index]*h[index+1])/(r*s)
    t1.append(j)

d.append(mean(t1))

However, writing this out 700 times seems ridiculous. How would I get Python
to perform this for me for t in a range of roughly 0-700?

Thanks
Alex

#### sturlamolden

For two 1D ndarrays, the cross-correlation is

from numpy.fft import rfft, irfft

# y[::-1] reverses the 1D array; numpy.fliplr only accepts arrays with 2 or more dimensions
xcorr = lambda x, y: irfft(rfft(x) * rfft(y[::-1]))

Normalize as you wish, and preferably pad with zeros before invoking
xcorr.
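
As a minimal sketch of the zero-padding advice: padding both signals to length len(x) + len(y) - 1 keeps the FFT product from wrapping around, so the result should match the direct (non-circular) cross-correlation. The function name `xcorr_padded` is hypothetical and not from the post above, and `numpy.correlate` appears only as a cross-check.

```python
import numpy as np
from numpy.fft import rfft, irfft

def xcorr_padded(x, y):
    """Linear cross-correlation of two 1-D arrays via zero-padded FFTs."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x) + len(y) - 1      # padded length that avoids wrap-around
    # rfft(..., n) zero-pads each input to length n before transforming
    spectrum = rfft(x, n) * rfft(y[::-1], n)
    return irfft(spectrum, n)    # pass n so odd lengths are recovered correctly

# Cross-check against the direct computation on random data
rng = np.random.default_rng(0)
x = rng.standard_normal(50)
y = rng.standard_normal(50)
assert np.allclose(xcorr_padded(x, y), np.correlate(x, y, mode="full"))
```

For equal-length inputs of length N, the zero-lag term sits at index N - 1 of the output, with the other lags on either side of it. Normalization (subtracting the means and dividing by the standard deviations) is left to the caller, as suggested above.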

#### DarthXander

Thanks, though I'd like to do this in a more longhand way than with the
built-in functions! It's a great way to approximate it, though.
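
For the longhand route, one possible sketch is to wrap the single-lag loop from the question in an outer loop over tau. The function name `dcf` and the `max_lag` parameter are hypothetical, both series are assumed to have the same length, and the synthetic data stands in for the share prices, which are not available here.

```python
import numpy as np

def dcf(a, b, max_lag):
    """Normalised cross-correlation of a and b for lags 0..max_lag."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    g = a - a.mean()             # mean-subtracted series, computed once
    h = b - b.mean()
    norm = a.std() * b.std()
    d = []
    for tau in range(max_lag + 1):
        n = len(g) - tau         # number of overlapping samples at this lag
        total = 0.0
        for i in range(n):
            total += g[i] * h[i + tau]
        d.append(total / (n * norm))
    return np.array(d)

# Synthetic check: b is a copy of a delayed by 5 samples
rng = np.random.default_rng(1)
a = rng.standard_normal(300)
b = np.roll(a, 5)
d = dcf(a, b, max_lag=20)
best_lag = int(np.argmax(d))     # lag at which the correlation peaks
```

With real data, `max_lag` could be set to 700, and `d` could then be plotted against `range(max_lag + 1)` to look for a peak. The inner loop can be replaced by `np.dot(g[:n], h[tau:])` if speed matters.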