We can reformulate this problem into a matrix problem as follows: Let A be the n by 2 matrix like this:
A = [ x1 1
x2 1
x3 1
x4 1
...
xn 1]
and let y be the column vector of all the yi values.
Then if we set c = [m; b] (column vector of the coefficients) then
r = y - Ac, and thus we are trying to find the vector c
which minimizes ||y - Ac||.
As discussed in class and in the text (pg. 333) this is called a least-squares problem and the solution comes from realizing that the residual must be orthogonal to the column space of A or that ATr = 0. This works out to form the normal equations:
ATAc = ATy.
If A has full rank (in this case, 2, which means that we have at least
2 distinct x values), then ATA is invertible, so this
has a unique solution.
x = [2 5 7 8]'; % column of x-values y = [1 2 3 3]'; % column of y-values A = [x, ones(4,1)]; % form matrix A, add column of 4 1s ATA = A'*A; % form ATA for normal equations and b = A'*y; % form ATy c = inv(ATA)*b;then c(1) would be the slope (should be 0.3571) and c(2) the intercept (0.2857).
Once we have A and y, we can do this more efficiently with
c = (A'*A)\(A'*y);If you want to see the data and the results you can plot the points and the curve as follows:
predy = A*c; % compute predicted values plot(x,y,'o',x,predy) % plot data as circles and draw best-fit line
Here's how it works for a quadratic (and, in general, for the others). The residual is still observed - predicted and it is still orthogonal to the column space of a matrix, so we first look at the residual: r = y - Ac where y is the column vector of y-values, c is the column vector of coefficients we're solving for (for the quadratic, c = [a b c]'), and A is the n by 3 matrix of the form
A = [ x12 x1 1
x22 x2 1
x32 x3 1
...
xn2 xn 1]
Note that if you take a row of A and multiply it by the column c you get
axi2 + bxi + c.
ATA c = ATy
How do we do a fit to other types of equations? The key thing to note
is that the columns in A correspond to the functions in the equation
we are trying to fit. For example, for the line the functions are
x and 1 and the first column of A is xi and the 2nd column are
all 1s. Thus if we wanted to fit the crazy function above to some data,
we'd form the matrix A with columns corresponding to the functions
x, 1, sin(x), cos(2x), log(x) and ex.
Once we have the matrix A, we just use the normal equations to solve for
the coefficients.
x = [0 1 2 3 4 5 6 7 8]' y = [1 1 2 2 3 3 4 5 7]'
2. Take the same data and find the best-fit quadratic curve to it.
3. One application of these curves is in approximating other functions. For the data take 11 equally spaced points from to 0 to 2pi and the value of cosine at each point:
x = 2*pi*[0:10]'/10; % 11 equally spaced points y = cos(x);Find the polynomials of degree 2 (quadratic) and 4 (quartic) that best fit this data.
4. Sales models are usually linear functions, but it makes more sense to include some seasonal fluctuations. Suppose over a year, monthly sales amounts (in $1K) are recorded as follows:
s = [20 15 12 8 13 22 28 33 35 45 56 62]and we think the data follows the function s = a t + b + c sin (2*pi*t/12) where t is measured in months. Find the values of a, b and c so that this curve is the best fit to the data.