# Least Squares

### From AstroBaki


### Reference Material

- Least Squares (Wikipedia)
- Carl Heiles on Least Squares (The "Quick" Version)
- Carl Heiles on Least Squares (The Full Version)
- Error and Propagation of Uncertainties


Carl Heiles has written the definitive lab document on least squares fitting, and you should definitely read at least the lite version in the reference materials above. That being said, all the examples in the document are in IDL, and so this page will focus on performing those examples with Python. Note that all figures and equations referenced are those within Carl Heiles’ document.

This is also the first lab where you will really be working with errors and the propagation of uncertainty, so it is highly recommended that you look over chapter 3 of Taylor's error analysis text (the basics covered there will be fair game for quizzes).

## 1 A Numerical Example

### 1.1 Generation of the numerical example

Suppose that we make four measurements of the angle y and we want to fit them to a parabolic function in time t. In the notation of Equation 1.1, s would be unity, t the time, and u the time squared, so the number of unknowns is three (N = 3). Because there are four independent measurements (M = 4), the subscripts run from m = 0 → 3. Suppose that the four values of time are 5, 7, 9, and 11.

First, we create the matrix *X* in Python,

X = np.zeros((N, M), dtype=float)

which for this example is np.zeros((3, 4), dtype=float). (Note that np.zeros takes the shape as a single tuple, not as separate arguments.)

and then we populate it with numbers. In your own work, you would normally do this by reading a data file and transferring the numbers to the matrix using Python commands; to work through this example, you might manually type them in. After populating the matrix, in direct correspondence with Equation 2.1a we have *s*_{m} = 1, *t*_{m} = time_{m}, and *u*_{m} = time_{m}^{2}:

Suppose that the four measured values of *y* are 142, 168, 211, and 251 (Equation 2.1c).

Figure 4.1 shows the datapoints, together with the fitted curve. In Python, you can generate a column vector like *Y* by creating an array,

Y = np.array([142, 168, 211, 251]),

and a matrix like *X* by using the np.matrix function (np.array works equally well here and is preferred in modern NumPy, since np.matrix is deprecated), providing each of the columns in a list, as,

X = np.matrix([[1,1,1,1],[5,7,9,11],[25,49,81,121]])
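The same setup can be written with plain arrays; a minimal sketch, assuming the data above, that builds X directly from the measurement times rather than typing in each entry:

```python
import numpy as np

# Sketch of the setup above using np.array (np.matrix is deprecated in
# recent NumPy releases; plain arrays behave identically here).
time = np.array([5.0, 7.0, 9.0, 11.0])       # the four measurement times
Y = np.array([142.0, 168.0, 211.0, 251.0])   # the four measured values of y

# Each row holds one basis function evaluated at every time:
# s = 1, t = time, u = time**2, giving an N x M (3 x 4) matrix.
X = np.array([np.ones_like(time), time, time**2])
print(X.shape)  # (3, 4)
```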

## 2 Solution of the Numerical Example in Python

In Python we calculate the normal equation matrices and denote the [α] in Equation 2.4a by XX:

XX = np.dot(X,np.transpose(X))

and we denote the [β] in Equation 2.4b by XY:

XY = np.dot(Y,np.transpose(X))

We can take the inverse of [α] (XX) in Python by:

XXI = np.linalg.inv(XX)

Finally, we can find the least-squares fitted quantities **a** by multiplying these two matrices,

a = np.dot(XY,XXI)
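Collecting the steps above into one runnable sketch (using np.array rather than np.matrix; the variable names follow the text):

```python
import numpy as np

time = np.array([5.0, 7.0, 9.0, 11.0])
Y = np.array([142.0, 168.0, 211.0, 251.0])
X = np.array([np.ones_like(time), time, time**2])  # equations of condition

XX = np.dot(X, np.transpose(X))   # [alpha], the N x N normal matrix
XY = np.dot(Y, np.transpose(X))   # [beta]
XXI = np.linalg.inv(XX)           # [alpha]^-1
a = np.dot(XY, XXI)               # derived coefficients

print(a)  # approximately [96.625, 4.5, 0.875]
```

As a sanity check, np.linalg.lstsq(np.transpose(X), Y) solves the same problem without forming the normal equations explicitly and returns the same coefficients.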

In Python we denote the matrix of predicted values by YBAR, which is,

YBAR = np.dot(a,X),

and we can also define the residuals in Y as,

DELY = Y - YBAR

In Python we can also denote *s*^{2} in Equations 3.1 and 3.6 by s_sq and write

s_sq = np.dot(DELY/(M-N),np.transpose(DELY)),

or

s_sq = np.sum(DELY**2)/(M-N)
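Continuing the same self-contained sketch, the residuals and s² for this example work out to small round numbers:

```python
import numpy as np

time = np.array([5.0, 7.0, 9.0, 11.0])
Y = np.array([142.0, 168.0, 211.0, 251.0])
X = np.array([np.ones_like(time), time, time**2])
M, N = 4, 3   # number of measurements, number of unknowns

XXI = np.linalg.inv(np.dot(X, np.transpose(X)))
a = np.dot(np.dot(Y, np.transpose(X)), XXI)

YBAR = np.dot(a, X)               # predicted values
DELY = Y - YBAR                   # residuals
s_sq = np.sum(DELY**2) / (M - N)  # sample variance of the fit

print(DELY)  # approximately [1., -3., 3., -1.]
print(s_sq)  # approximately 20.0
```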

It is *always* a good idea to plot all three quantities (the measured values Y, the fitted values YBAR, and the residuals DELY) to make sure your fit looks reasonable and to check for bad datapoints.

To get the error in the derived coefficients, we need a way to select the diagonal elements of a matrix. Obviously, any N by N matrix has N diagonal elements; a convenient way to get them is,

diag_elems = np.diag(NbyNmatrix)

(Note that np.dot(NbyNmatrix, np.identity(N)) would simply return the original matrix, not its diagonal.)

In Python, we define the variances of the N derived coefficients by vardc (think of "**var**iances of **d**erived **c**oefficients"). You can find this as in Equation 3.7 from,

vardc = s_sq*diag_elems
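Finishing the sketch, the variances and the resulting one-sigma errors on the coefficients come out as:

```python
import numpy as np

time = np.array([5.0, 7.0, 9.0, 11.0])
Y = np.array([142.0, 168.0, 211.0, 251.0])
X = np.array([np.ones_like(time), time, time**2])
M, N = 4, 3

XXI = np.linalg.inv(np.dot(X, np.transpose(X)))
a = np.dot(np.dot(Y, np.transpose(X)), XXI)
DELY = Y - np.dot(a, X)
s_sq = np.sum(DELY**2) / (M - N)

vardc = s_sq * np.diag(XXI)   # variances of the derived coefficients
errors = np.sqrt(vardc)       # one-sigma errors

print(errors)  # approximately [34.0, 9.0, 0.56]
```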

## 3 Discussion of the numerical example

For this numerical example, the solution (the array of derived coefficients) is,

a = [96.625, 4.5, 0.875]

and the errors in the derived coefficients (the square roots of the σ^{2}’s of the derived coefficients, i.e. np.sqrt(vardc) in Python) are:

[34.0, 9.0, 0.56]

These results look *horrible*: the uncertainties are large fractions of the derived coefficients. The reason: we have specifically chosen an example with terrible covariance. The great thing is that this can be fixed easily (at least in this case, though certainly not always), without taking more data! To address this issue, look to section 5 of Least-Squares Lite.
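As an illustration of the kind of fix discussed there, one standard remedy is to re-centre the independent variable so the basis functions are closer to orthogonal. A hedged sketch (the helper fit() is hypothetical, introduced only for this comparison): subtracting the mean time from t before building X shrinks the off-diagonal elements of [α]^{-1}, and with them the coefficient uncertainties, with no new data at all.

```python
import numpy as np

t = np.array([5.0, 7.0, 9.0, 11.0])
Y = np.array([142.0, 168.0, 211.0, 251.0])

def fit(tt):
    """Hypothetical helper: parabolic least-squares fit, returns (a, errors)."""
    X = np.array([np.ones_like(tt), tt, tt**2])
    XXI = np.linalg.inv(np.dot(X, np.transpose(X)))
    a = np.dot(np.dot(Y, np.transpose(X)), XXI)
    DELY = Y - np.dot(a, X)
    s_sq = np.sum(DELY**2) / (len(Y) - 3)
    return a, np.sqrt(s_sq * np.diag(XXI))

a_raw, err_raw = fit(t)             # errors ~ [34.0, 9.0, 0.56]
a_ctr, err_ctr = fit(t - t.mean())  # errors ~ [3.6, 1.0, 0.56]: far smaller
```

The fitted parabola itself is identical in both cases; only the parametrisation, and hence the covariance between coefficients, changes.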