1 Correlation vs. Convolution
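Correlation is very similar to convolution, and it is best defined through its equivalent “correlation theorem”:

$$(f \star g)(\tau) \equiv \int f(t)\, g^*(t-\tau)\, dt = \int \hat f(\nu)\, \hat g^*(\nu)\, e^{2\pi i \nu \tau}\, d\nu,$$

where $\hat f(\nu) = \int f(t)\, e^{-2\pi i \nu t}\, dt$ is the Fourier transform of $f(t)$ and $^*$ denotes complex conjugation.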
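The difference between correlation and convolution is that, when correlating two signals, the Fourier transform of the second function ($\hat g(\nu)$ in the correlation theorem) is conjugated before multiplying and integrating. Using that

$$\hat g^*(\nu) = \int g^*(t)\, e^{2\pi i \nu t}\, dt = \int g^*(-t)\, e^{-2\pi i \nu t}\, dt,$$

we can show that correlating f(t) and g(t) is equivalent to convolving f(t) with a conjugated, time-reversed version of g(t):

$$(f \star g)(\tau) = \big[f(t) * g^*(-t)\big](\tau).$$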
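The equivalence can be checked numerically. The following is a minimal NumPy sketch, assuming short, finite, discrete complex sequences (the array names and lengths are arbitrary illustrations). It compares `np.correlate`, which conjugates its second argument, against `np.convolve` applied to the conjugated, time-reversed second signal.

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.standard_normal(16) + 1j * rng.standard_normal(16)
g = rng.standard_normal(16) + 1j * rng.standard_normal(16)

# Correlation: sum over t of f[t] * conj(g[t - tau]), for every lag tau.
corr = np.correlate(f, g, mode="full")

# Convolution of f with the conjugated, time-reversed g.
conv = np.convolve(f, np.conj(g[::-1]), mode="full")

# The two results should agree to floating-point precision.
print(np.allclose(corr, conv))
```

In `mode="full"`, lag 0 sits at index `len(g) - 1` of the output, with negative lags before it and positive lags after it.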
Although this relation between convolution and correlation is often mentioned in the literature, I don’t personally find it very illuminating. I much prefer the “correlation theorem” above, because when it is combined with the expression for a time-shifted signal in the Fourier domain,

$$f(t-\tau_0) \;\longleftrightarrow\; \hat f(\nu)\, e^{-2\pi i \nu \tau_0},$$

it shows that correlating a time-shifted copy of a flat-spectrum signal (one with $|\hat f(\nu)|^2 = P$ at all frequencies) with the original yields a measure of the power of the signal at the delay corresponding to the time shift:

$$f(t-\tau_0) \star f(t) = \int |\hat f(\nu)|^2\, e^{2\pi i \nu (\tau-\tau_0)}\, d\nu = P\, \delta(\tau-\tau_0).$$
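Both the correlation theorem and the delay peak can be illustrated with a short numerical experiment. This is only a sketch under stated assumptions: the signals are discrete and periodic (so the correlation is circular), white Gaussian noise stands in for a flat-spectrum signal, and the length `n` and shift `tau0` are arbitrary choices.

```python
import numpy as np

def correlate_direct(f, g):
    """Circular correlation: sum over t of f[t] * conj(g[t - tau])."""
    return np.array([np.sum(f * np.conj(np.roll(g, tau))) for tau in range(len(f))])

def correlate_fft(f, g):
    """Same quantity via the correlation theorem: inverse FFT of F * conj(G)."""
    return np.fft.ifft(np.fft.fft(f) * np.conj(np.fft.fft(g)))

rng = np.random.default_rng(0)
n, tau0 = 256, 5                                   # hypothetical length and shift
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # white noise: flat on average
shifted = np.roll(x, tau0)                         # shifted[t] = x[t - tau0], circularly

corr = correlate_fft(shifted, x)
print(np.allclose(corr, correlate_direct(shifted, x)))  # compare the two routes
print(np.argmax(np.abs(corr)))                     # lag at which the peak appears
print(corr[tau0].real, np.sum(np.abs(x) ** 2))     # peak height vs. total power
```

Because a single noise realization is only approximately flat in frequency, the delta function becomes a sharp peak at lag `tau0` sitting on low-level sidelobes; the peak height equals the total power of the signal.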
There are a variety of architectures for building correlators, but perhaps the most conceptually straightforward are FX correlators, where input signals are first Fourier-transformed (F) and then cross-multiplied (X), with one signal from each pairing being conjugated. This architecture is essentially a direct implementation of the right-hand side of the “correlation theorem”. Moreover, since scientists are often interested in the power spectrum of the correlated signals, FX correlators usually do not implement the inverse Fourier transform back to the delay domain. Instead, they produce the correlation between inputs i and j as a function of frequency ν:

$$\widehat{(f_i \star f_j)}(\nu) = \hat f_i(\nu)\, \hat f_j^*(\nu).$$
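Suppose we have two signals from two antennas i,j, each consisting of an uncorrelated noise component and a correlated signal that enters each antenna feed at a different time. Taking the common signal x to reach antenna i a time $\tau_{ij}$ after it reaches antenna j (a sign convention chosen to match the delay bin discussed below), we can write

$$f_i(t) = x(t-\tau_{ij}) + n_i(t), \qquad f_j(t) = x(t) + n_j(t).$$

We can then express the time-averaged correlation function as

$$\langle f_i \star f_j \rangle = \langle x(t-\tau_{ij}) \star x(t) \rangle + \langle x(t-\tau_{ij}) \star n_j \rangle + \langle n_i \star x \rangle + \langle n_i \star n_j \rangle = \langle x(t-\tau_{ij}) \star x(t) \rangle.$$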
In the last step above, we use the fact that the signals x, $n_i$, and $n_j$ are mutually uncorrelated, so the cross terms average to zero. In practice, for a finite integration time t, these noise terms do not completely disappear; they merely integrate down as $t^{-1/2}$, as noise is wont to do. This is actually an important point about correlators: everything that enters as a signal either 1) correlates, and is present in the time-averaged correlation, or 2) does not correlate, and is a source of noise to be integrated down.
Through the correlation process, the correlated component of the two antenna signals (namely x) is coherently accumulated with a sinusoidally varying phase that corresponds to the delay bin where $\tau = \tau_{ij}$. In that bin, the correlation that we compute gives us a measurement of the power of the correlated signal at the delay corresponding to the time interval between when that signal enters one antenna and when it enters the other. All of the other, uncorrelated terms in the two signals, when cross-correlated, integrate toward zero.
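A small simulation makes both points concrete: the common signal piles up in a single delay bin, while the uncorrelated terms average down with integration time. The sketch below is illustrative only. It assumes complex white Gaussian noise for every component, a circular delay within each block of samples, and arbitrary values for the block length `N_SAMP`, the delay `DELAY`, and the signal level; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N_SAMP, DELAY = 64, 3    # samples per spectrum and delay in samples (both hypothetical)

def cnoise(shape, sigma=1.0):
    """Complex white Gaussian noise with standard deviation sigma per component."""
    return sigma * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def lag_spectrum(n_spectra):
    """Time-averaged correlation of two simulated antennas, as a function of lag."""
    x = cnoise((n_spectra, N_SAMP), sigma=0.5)          # common signal, weaker than noise
    f_i = np.roll(x, DELAY, axis=1) + cnoise(x.shape)   # x arrives DELAY samples later at i
    f_j = x + cnoise(x.shape)
    # Cross-multiply the spectra (conjugating antenna j), then average over time.
    vis = np.mean(np.fft.fft(f_i, axis=1) * np.conj(np.fft.fft(f_j, axis=1)), axis=0)
    return np.fft.ifft(vis)                             # back to the delay domain

def offpeak_rms(lags):
    """RMS of every delay bin except the one holding the correlated signal."""
    mask = np.arange(N_SAMP) != DELAY
    return np.sqrt(np.mean(np.abs(lags[mask]) ** 2))

short_avg, long_avg = lag_spectrum(100), lag_spectrum(1600)
print(np.argmax(np.abs(long_avg)))                  # delay bin holding the peak
print(offpeak_rms(short_avg) / offpeak_rms(long_avg))  # should scale roughly as sqrt(16)
```

Integrating 16 times longer should lower the off-peak noise by a factor of about 4, up to statistical scatter, while the peak in the `DELAY` bin stays at the same level.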
2 How to Build a Correlator
The job of a correlator is typically to cross-correlate each pair of antennas in an array. For N antennas, there are N(N + 1) / 2 products to compute (N(N − 1) / 2 cross-correlations plus N auto-correlations), meaning that correlators have the nasty property of generating more data out than goes in. Mathematically, correlators are quite simple. The difficulty with building correlators, and the reason they are often one of the most expensive components in a synthesis telescope, lies in managing the flow of large amounts of data in real time and in providing the computing necessary to perform the cross-correlation itself.
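To get a feel for how quickly the number of products grows, the pairs can simply be enumerated. This sketch uses only the Python standard library; the function name `antenna_pairs` and the array sizes are illustrative assumptions. Pairing each antenna with itself is included, which is what gives the N(N + 1) / 2 count.

```python
from itertools import combinations_with_replacement

def antenna_pairs(n_ant):
    """All (i, j) pairs with i <= j, so auto-correlations (i == j) are included."""
    return list(combinations_with_replacement(range(n_ant), 2))

for n_ant in (4, 32, 256):
    n_products = len(antenna_pairs(n_ant))
    # Print the antenna count, the product count, and a check against N(N + 1) / 2.
    print(n_ant, n_products, n_products == n_ant * (n_ant + 1) // 2)
```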
For radio astronomy applications, there are basically two ways that correlators are built. Both ways involve a Fourier transform stage (often called the “F” stage) and a cross-correlation stage (often called the “X” stage). The two architectures differ in the ordering of these stages.
2.1 The XF Architecture, a.k.a. the “Lag” Correlator
An example of an XF correlator architecture, from Fig. 9.5 in GMRT’s documentation (Roshi 1997).
In this correlator architecture, signals from different antennas are first cross-correlated. This is typically done with what is called a “lag” correlation circuit. This circuit looks a lot like an FIR filter, which is to say, it implements a by-the-book correlation. Following the equation for correlating signals f and g,

$$(f \star g)(\tau) = \int f(t)\, g^*(t-\tau)\, dt,$$

one signal (g above) is conjugated and delayed relative to the other by a number of samples, τ, and then the two signals f(t) and g*(t − τ) are cross-multiplied and integrated for some amount of time. Different circuits compute the integral for different time lags between f and g, producing the value of $(f \star g)(\tau)$ for different values of τ. This is the ‘X’ stage.
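In software, the lag circuit described above amounts to one multiply-and-accumulate loop per lag. The following is a minimal sketch of that X stage, followed by a Fourier transform over lag as the F stage. It is not a model of any particular hardware: the function name `lag_correlate`, the number of lags, and the test signal are assumptions made for illustration.

```python
import numpy as np

def lag_correlate(f, g, max_lag):
    """X stage: for each lag tau, accumulate f[t] * conj(g[t - tau]) over t."""
    n = len(f)
    lags = np.arange(-max_lag, max_lag)
    out = np.empty(len(lags), dtype=complex)
    for k, tau in enumerate(lags):
        t = np.arange(max(0, tau), min(n, n + tau))  # keep both t and t - tau in range
        out[k] = np.sum(f[t] * np.conj(g[t - tau]))
    return lags, out

rng = np.random.default_rng(2)
n, delay, max_lag = 512, 4, 8                        # hypothetical sizes
x = rng.standard_normal(n)
g = x
f = np.concatenate([np.zeros(delay), x[:-delay]])    # f[t] = x[t - delay]

lags, corr = lag_correlate(f, g, max_lag)
print(lags[np.argmax(np.abs(corr))])                 # lag at which the peak appears

# F stage: Fourier transform over lag, after moving lag 0 to index 0.
spectrum = np.fft.fft(np.fft.ifftshift(corr))
```

Every lag loops over essentially the whole data stream, so the cost grows with the number of lags times the number of samples for each antenna pair.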
However, most radio astronomers would like to have their observations as a function of frequency, so in an XF correlator, the correlation product is then Fourier transformed into frequency domain. Hence the ‘F’ stage comes after the X.
The problem with XF correlators is that, for N antennas being correlated, roughly $N^2/2$ Fourier transforms (one per antenna pair) must be implemented. Moreover, each “lag” must be computed independently yet still requires all of the data to do so, which is inefficient. However, for smaller numbers of antennas, XF correlators are still sometimes used to good effect.
2.2 The FX Architecture
An example of an FX correlator architecture, from Parsons et al. (2008).
In this correlator architecture, the ‘F’ stage (Fourier transform) is applied first. Hence, for N antennas, only N FFTs are required. Next comes the ‘X’ stage, which in this case is a simple cross-multiplication, frequency by frequency. This is because the FX correlator architecture is essentially implementing a correlation according to the “correlation theorem” above, where, after the Fourier transform, we have that

$$\widehat{(f \star g)}(\nu) = \hat f(\nu)\, \hat g^*(\nu).$$
Notice that, because we wanted our output in frequency domain, we didn’t have to bother with the Fourier transform back to time domain at the end of our correlation theorem equation. So after Fourier transforming each antenna signal to frequency domain, we only need to multiply the signal from an antenna at a given frequency with other antenna samples at that same frequency. This is both efficient, and easier to parallelize!
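The whole FX pipeline fits in a few lines of NumPy. The sketch below is a simplified illustration, not a description of any real instrument: it assumes a plain block FFT with no windowing or polyphase filter bank, and the function name `fx_correlate`, the array shapes, and the channel count are all hypothetical.

```python
import numpy as np

def fx_correlate(voltages, n_chan):
    """Return time-averaged products with shape (n_ant, n_ant, n_chan).

    voltages: array of shape (n_ant, n_time) holding sampled antenna signals.
    """
    n_ant, n_time = voltages.shape
    n_spec = n_time // n_chan
    blocks = voltages[:, : n_spec * n_chan].reshape(n_ant, n_spec, n_chan)
    # F stage: one FFT per antenna per block of n_chan samples.
    spectra = np.fft.fft(blocks, axis=-1)
    # X stage: in each channel, multiply antenna i by the conjugate of antenna j,
    # then average over the n_spec blocks.
    return np.einsum("isc,jsc->ijc", spectra, np.conj(spectra)) / n_spec

rng = np.random.default_rng(3)
volts = rng.standard_normal((4, 64 * 100))   # 4 antennas, 100 blocks of 64 samples
vis = fx_correlate(volts, n_chan=64)
print(vis.shape)
```

Each frequency channel is processed independently of the others in the X stage, which is what makes the work easy to split across many processors.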
Almost all large correlators follow the FX architecture these days. For smaller correlators with narrow bandwidths, CPUs may be used to do both the F and X stages. For larger correlators with higher bandwidths, FPGA processors are typically used to do the F stage, and the X stage is implemented either with FPGAs or GPU processors.