@menu
* Introduction to lsquares::
* Functions and Variables for lsquares::
@end menu
@c -----------------------------------------------------------------------------
@node Introduction to lsquares, Functions and Variables for lsquares, Package lsquares, Package lsquares
@section Introduction to lsquares
@c -----------------------------------------------------------------------------
@code{lsquares} is a collection of functions to implement the method of least squares
to estimate parameters for a model from numerical data.
@opencatbox{Categories:}
@category{Statistical estimation}
@category{Share packages}
@category{Package lsquares}
@closecatbox
@c -----------------------------------------------------------------------------
@node Functions and Variables for lsquares, , Introduction to lsquares, Package lsquares
@section Functions and Variables for lsquares
@c -----------------------------------------------------------------------------
@anchor{lsquares_estimates}
@deffn {Function} lsquares_estimates @
@fname{lsquares_estimates} (@var{D}, @var{x}, @var{e}, @var{a}) @
@fname{lsquares_estimates} (@var{D}, @var{x}, @var{e}, @var{a}, initial = @var{L}, tol = @var{t})
Estimate parameters @var{a} to best fit the equation @var{e}
in the variables @var{x} and @var{a} to the data @var{D},
as determined by the method of least squares.
@code{lsquares_estimates} first seeks an exact solution,
and if that fails, then seeks an approximate solution.

The return value is a list of lists of equations of the form @code{[a = ..., b = ..., c = ...]}.
Each element of the list is a distinct, equivalent minimum of the mean square error.
The data @var{D} must be a matrix.
Each row is one datum (which may be called a `record' or `case' in some contexts),
and each column contains the values of one variable across all data.
The list of variables @var{x} gives a name for each column of @var{D},
even the columns which do not enter the analysis.
The list of parameters @var{a} gives the names of the parameters for which
estimates are sought.

The equation @var{e} is an expression or equation in the variables @var{x} and @var{a};
if @var{e} is not an equation, it is treated the same as @code{@var{e} = 0}.
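For instance, since an expression @var{e} is treated the same as @code{@var{e} = 0},
fitting the residual expression @code{y - m*x} gives the same result as fitting the
equation @code{y = m*x}. (This tiny data set is only an illustrative sketch, not one
of the examples below; the exact solution is @code{m = 1}.)

@example
(%i1) load ("lsquares")$
(%i2) lsquares_estimates (matrix ([0,0], [1,1], [2,2]),
                          [x,y], y - m*x, [m]);
(%o2)                       [[m = 1]]
@end example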
Additional arguments to @code{lsquares_estimates}
are specified as equations and passed on verbatim to the function @code{lbfgs}
which is called to find estimates by a numerical method
when an exact result is not found.
If some exact solution can be found (via @code{solve}),
the data @var{D} may contain non-numeric values.
However, if no exact solution is found,
each element of @var{D} must have a numeric value.
This includes numeric constants such as @code{%pi} and @code{%e} as well as literal numbers
(integers, rationals, ordinary floats, and bigfloats).
Numerical calculations are carried out with ordinary floating-point arithmetic,
so all other kinds of numbers are converted to ordinary floats for calculations.
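As an illustrative sketch of the first point (not one of the manual's examples),
a data matrix with a symbolic entry @code{u} is acceptable as long as @code{solve}
finds an exact solution:

@example
(%i1) load ("lsquares")$
(%i2) lsquares_estimates (matrix ([u, 2*u]), [x,y], y = a*x, [a]);
(%o2)                       [[a = 2]]
@end example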
If @code{lsquares_estimates} needs excessive amounts of time or runs out of memory,
@mrefcomma{lsquares_estimates_approximate} which skips the attempt to find an exact
solution, may still succeed.

@code{load("lsquares")} loads this function.
See also @mref{lsquares_estimates_exact},
@mref{lsquares_estimates_approximate},@*
@mref{lsquares_mse},
@mref{lsquares_residuals},
and @mref{lsquares_residual_mse}.
Examples:

A problem for which an exact solution is found.
@c ===beg===
@c load ("lsquares")$
@c M : matrix (
@c [1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
@c lsquares_estimates (
@c M, [z,x,y], (z+D)^2 = A*x+B*y+C, [A,B,C,D]);
@c ===end===
@example
(%i1) load ("lsquares")$
(%i2) M : matrix (
[1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
                        [ 1    1  1 ]
                        [           ]
                        [ 3         ]
                        [ -    1  2 ]
                        [ 2         ]
                        [           ]
                        [ 9         ]
(%o2)                   [ -    2  1 ]
                        [ 4         ]
                        [           ]
                        [ 3    2  2 ]
                        [           ]
                        [ 2    2  1 ]
(%i3) lsquares_estimates (
M, [z,x,y], (z+D)^2 = A*x+B*y+C, [A,B,C,D]);
              59        27      10921        107
(%o3) [[A = - --, B = - --, C = -----, D = - ---]]
              16        16      1024          32
@end example
A problem for which no exact solution is found,
so @code{lsquares_estimates} resorts to numerical approximation.
@c ===beg===
@c load ("lsquares")$
@c M : matrix ([1, 1], [2, 7/4], [3, 11/4], [4, 13/4]);
@c lsquares_estimates (
@c M, [x,y], y=a*x^b+c, [a,b,c], initial=[3,3,3], iprint=[-1,0]);
@c ===end===
@example
(%i1) load ("lsquares")$
(%i2) M : matrix ([1, 1], [2, 7/4], [3, 11/4], [4, 13/4]);
                           [ 1  1  ]
                           [       ]
                           [    7  ]
                           [ 2  -  ]
                           [    4  ]
                           [       ]
                           [    11 ]
(%o2)                      [ 3  -- ]
                           [    4  ]
                           [       ]
                           [    13 ]
                           [ 4  -- ]
                           [    4  ]
(%i3) lsquares_estimates (
M, [x,y], y=a*x^b+c, [a,b,c], initial=[3,3,3], iprint=[-1,0]);
(%o3) [[a = 1.375751433061394, b = 0.7148891534417651,
c = - 0.4020908910062951]]
@end example
Exponential functions are badly conditioned for least squares fitting.
If fitting to one fails, it may be possible to eliminate the exponential
by taking the logarithm of both sides of the equation.
@c ===beg===
@c load ("lsquares")$
@c yvalues: [1,3,5,60,200,203,80]$
@c time: [1,2,4,5,6,8,10]$
@c f: y=a*exp(b*t);
@c yvalues_log: log(yvalues)$
@c f_log: log(subst(y=exp(y),f));
@c lsquares_estimates (transpose(matrix(yvalues_log,time)),
@c     [y,t], f_log, [a,b]);
@c ===end===
@example
(%i1) load ("lsquares")$
(%i2) yvalues: [1,3,5,60,200,203,80]$
(%i3) time: [1,2,4,5,6,8,10]$
(%i4) f: y=a*exp(b*t);
                                b t
(%o4)                    y = a %e
(%i5) yvalues_log: log(yvalues)$
(%i6) f_log: log(subst(y=exp(y),f));
                                    b t
(%o6)                    y = log(a %e   )
(%i7) lsquares_estimates (transpose(matrix(yvalues_log,time)),
      [y,t], f_log, [a,b]);
*************************************************
  N=    2   NUMBER OF CORRECTIONS=25
       INITIAL VALUES
 F=  6.802906290754687D+00   GNORM=  2.851243373781393D+01
*************************************************

   I  NFN     FUNC                    GNORM                   STEPLENGTH

   1    3     1.141838765593467D+00  1.067358003667488D-01  1.390943719972406D-02
   2    5     1.141118195694385D+00  1.237977833033414D-01  5.000000000000000D+00
   3    6     1.136945723147959D+00  3.806696991691383D-01  1.000000000000000D+00
   4    7     1.133958243220262D+00  3.865103550379243D-01  1.000000000000000D+00
   5    8     1.131725773805499D+00  2.292258231154026D-02  1.000000000000000D+00
   6    9     1.131625585698168D+00  2.664440547017370D-03  1.000000000000000D+00
   7   10     1.131620564856599D+00  2.519366958715444D-04  1.000000000000000D+00

 THE MINIMIZATION TERMINATED WITHOUT DETECTING ERRORS.
 IFLAG = 0
(%o7) [[a = 1.155904145765554, b = 0.5772666876959847]]
@end example
@opencatbox{Categories:}
@category{Package lsquares}
@category{Numerical methods}
@closecatbox
@end deffn
@anchor{lsquares_estimates_exact}
@deffn {Function} lsquares_estimates_exact (@var{MSE}, @var{a})
Estimate parameters @var{a} to minimize the mean square error @var{MSE},
by constructing a system of equations and attempting to solve them symbolically via @code{solve}.
The mean square error is an expression in the parameters @var{a},
such as that returned by @code{lsquares_mse}.

The return value is a list of lists of equations of the form @code{[a = ..., b = ..., c = ...]}.
The return value may contain zero, one, or two or more elements.
If two or more elements are returned,
each represents a distinct, equivalent minimum of the mean square error.
See also @mref{lsquares_estimates},
@mref{lsquares_estimates_approximate},
@mref{lsquares_mse},@*
@mref{lsquares_residuals},
and @mref{lsquares_residual_mse}.
Example:

@c ===beg===
@c load ("lsquares")$
@c M : matrix (
@c [1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
@c mse : lsquares_mse (M, [z, x, y], (z + D)^2 = A*x + B*y + C);
@c lsquares_estimates_exact (mse, [A, B, C, D]);
@c ===end===
@example
(%i1) load ("lsquares")$
(%i2) M : matrix (
[1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
                        [ 1    1  1 ]
                        [           ]
                        [ 3         ]
                        [ -    1  2 ]
                        [ 2         ]
                        [           ]
                        [ 9         ]
(%o2)                   [ -    2  1 ]
                        [ 4         ]
                        [           ]
                        [ 3    2  2 ]
                        [           ]
                        [ 2    2  1 ]
(%i3) mse : lsquares_mse (M, [z, x, y], (z + D)^2 = A*x + B*y + C);
       5
      ====
      \                                         2     2
       >    ((- B M    ) - A M     + (M     + D)  - C)
      /            i, 3       i, 2     i, 1
      ====
      i = 1
(%o3) -------------------------------------------------
                              5
(%i4) lsquares_estimates_exact (mse, [A, B, C, D]);
              59        27      10921        107
(%o4) [[A = - --, B = - --, C = -----, D = - ---]]
              16        16      1024          32
@end example
@opencatbox{Categories:}
@category{Package lsquares}
@closecatbox
@end deffn
@anchor{lsquares_estimates_approximate}
@deffn {Function} lsquares_estimates_approximate (@var{MSE}, @var{a}, initial = @var{L}, tol = @var{t})
Estimate parameters @var{a} to minimize the mean square error @var{MSE},
via the numerical minimization function @code{lbfgs}.
The mean square error is an expression in the parameters @var{a},
such as that returned by @code{lsquares_mse}.

The solution returned by @code{lsquares_estimates_approximate} is a local (perhaps global) minimum
of the mean square error.
For consistency with @code{lsquares_estimates_exact},
the return value is a nested list which contains one element,
namely a list of equations of the form @code{[a = ..., b = ..., c = ...]}.

Additional arguments to @code{lsquares_estimates_approximate}
are specified as equations and passed on verbatim to the function @code{lbfgs}.

@var{MSE} must evaluate to a number when the parameters are assigned numeric values.
This requires that the data from which @var{MSE} was constructed
comprise only numeric constants such as @code{%pi} and @code{%e} and literal numbers
(integers, rationals, ordinary floats, and bigfloats).
Numerical calculations are carried out with ordinary floating-point arithmetic,
so all other kinds of numbers are converted to ordinary floats for calculations.
@code{load("lsquares")} loads this function.

See also @mref{lsquares_estimates},
@mref{lsquares_estimates_exact},
@mref{lsquares_mse},@*
@mref{lsquares_residuals},
and @mref{lsquares_residual_mse}.
Example:

@c ===beg===
@c load ("lsquares")$
@c M : matrix (
@c [1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
@c mse : lsquares_mse (M, [z, x, y], (z + D)^2 = A*x + B*y + C);
@c lsquares_estimates_approximate (
@c mse, [A, B, C, D], iprint = [-1, 0]);
@c ===end===
@example
(%i1) load ("lsquares")$
(%i2) M : matrix (
[1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
                        [ 1    1  1 ]
                        [           ]
                        [ 3         ]
                        [ -    1  2 ]
                        [ 2         ]
                        [           ]
                        [ 9         ]
(%o2)                   [ -    2  1 ]
                        [ 4         ]
                        [           ]
                        [ 3    2  2 ]
                        [           ]
                        [ 2    2  1 ]
(%i3) mse : lsquares_mse (M, [z, x, y], (z + D)^2 = A*x + B*y + C);
       5
      ====
      \                                         2     2
       >    ((- B M    ) - A M     + (M     + D)  - C)
      /            i, 3       i, 2     i, 1
      ====
      i = 1
(%o3) -------------------------------------------------
                              5
(%i4) lsquares_estimates_approximate (
mse, [A, B, C, D], iprint = [-1, 0]);
(%o4) [[A = - 3.678504947401971, B = - 1.683070351177937,
C = 10.63469950148714, D = - 3.340357993175297]]
@end example
@opencatbox{Categories:}
@category{Package lsquares}
@category{Numerical methods}
@closecatbox
@end deffn
@anchor{lsquares_mse}
@deffn {Function} lsquares_mse (@var{D}, @var{x}, @var{e})
Returns the mean square error (MSE), a summation expression, for the equation @var{e}
in the variables @var{x}, with data @var{D}.

The MSE is defined as:

@tex
$${1 \over n} \, \sum_{i=1}^n \left[{\rm lhs}\left(e_i\right) - {\rm rhs}\left(e_i\right)\right]^2,$$
@end tex
@ifnottex
@example
         n
       ====
    1  \                        2
    -   >    (lhs(e ) - rhs(e ))
    n  /           i         i
       ====
       i = 1
@end example
@end ifnottex

where @var{n} is the number of data and @code{@var{e}[i]} is the equation @var{e}
evaluated with the variables in @var{x} assigned values from the @code{i}-th datum, @code{@var{D}[i]}.
@code{load("lsquares")} loads this function.

Example:

@c ===beg===
@c load ("lsquares")$
@c M : matrix (
@c [1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
@c mse : lsquares_mse (M, [z, x, y], (z + D)^2 = A*x + B*y + C);
@c diff (mse, D);
@c ev (mse, sum);
@c ===end===
@example
(%i1) load ("lsquares")$
(%i2) M : matrix (
[1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
                        [ 1    1  1 ]
                        [           ]
                        [ 3         ]
                        [ -    1  2 ]
                        [ 2         ]
                        [           ]
                        [ 9         ]
(%o2)                   [ -    2  1 ]
                        [ 4         ]
                        [           ]
                        [ 3    2  2 ]
                        [           ]
                        [ 2    2  1 ]
(%i3) mse : lsquares_mse (M, [z, x, y], (z + D)^2 = A*x + B*y + C);
       5
      ====
      \                                         2     2
       >    ((- B M    ) - A M     + (M     + D)  - C)
      /            i, 3       i, 2     i, 1
      ====
      i = 1
(%o3) -------------------------------------------------
                              5
(%i4) diff (mse, D);
          5
         ====
         \                                                     2
       4  >    (M     + D) ((- B M    ) - A M     + (M     + D)  - C)
         /         i, 1           i, 3       i, 2     i, 1
         ====
         i = 1
(%o4) ---------------------------------------------------------------
                                     5
(%i5) ev (mse, sum);
               2                2        9 2              2
(%o5) (((D + 3) - C - 2 B - 2 A) + ((D + -) - C - B - 2 A)
                                         4
             2              2        3 2              2
   + ((D + 2) - C - B - 2 A) + ((D + -) - C - 2 B - A)
                                     2
             2            2
   + ((D + 1) - C - B - A) )/5
@end example
@opencatbox{Categories:}
@category{Package lsquares}
@closecatbox
@end deffn
@anchor{lsquares_residuals}
@deffn {Function} lsquares_residuals (@var{D}, @var{x}, @var{e}, @var{a})
Returns the residuals for the equation @var{e}
with specified parameters @var{a} and data @var{D}.

@var{D} is a matrix, @var{x} is a list of variables,
@var{e} is an equation or general expression;
if not an equation, @var{e} is treated as if it were @code{@var{e} = 0}.
@var{a} is a list of equations which specify values for any free parameters in @var{e} aside from @var{x}.
The residuals are defined as:

@tex
$${\rm lhs}\left(e_i\right) - {\rm rhs}\left(e_i\right),$$
@end tex
@ifnottex
@example
lhs(e ) - rhs(e )
     i         i
@end example
@end ifnottex

where @code{@var{e}[i]} is the equation @var{e}
evaluated with the variables in @var{x} assigned values from the @code{i}-th datum, @code{@var{D}[i]},
and assigning any remaining free variables from @var{a}.
@code{load("lsquares")} loads this function.

Example:

@c ===beg===
@c load ("lsquares")$
@c M : matrix (
@c [1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
@c a : lsquares_estimates (
@c M, [z,x,y], (z+D)^2 = A*x+B*y+C, [A,B,C,D]);
@c lsquares_residuals (
@c M, [z,x,y], (z+D)^2 = A*x+B*y+C, first(a));
@c ===end===
@example
(%i1) load ("lsquares")$
(%i2) M : matrix (
[1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
                        [ 1    1  1 ]
                        [           ]
                        [ 3         ]
                        [ -    1  2 ]
                        [ 2         ]
                        [           ]
                        [ 9         ]
(%o2)                   [ -    2  1 ]
                        [ 4         ]
                        [           ]
                        [ 3    2  2 ]
                        [           ]
                        [ 2    2  1 ]
(%i3) a : lsquares_estimates (
M, [z,x,y], (z+D)^2 = A*x+B*y+C, [A,B,C,D]);
              59        27      10921        107
(%o3) [[A = - --, B = - --, C = -----, D = - ---]]
              16        16      1024          32
(%i4) lsquares_residuals (
M, [z,x,y], (z+D)^2 = A*x+B*y+C, first(a));
       13    13    13  13  13
(%o4) [--, - --, - --, --, --]
       64    64    32  64  64
@end example
@opencatbox{Categories:}
@category{Package lsquares}
@closecatbox
@end deffn
@anchor{lsquares_residual_mse}
@deffn {Function} lsquares_residual_mse (@var{D}, @var{x}, @var{e}, @var{a})
Returns the residual mean square error (MSE) for the equation @var{e}
with specified parameters @var{a} and data @var{D}.

The residual MSE is defined as:

@tex
$${1 \over n} \, \sum_{i=1}^n \left[{\rm lhs}\left(e_i\right) - {\rm rhs}\left(e_i\right)\right]^2,$$
@end tex
@ifnottex
@example
         n
       ====
    1  \                        2
    -   >    (lhs(e ) - rhs(e ))
    n  /           i         i
       ====
       i = 1
@end example
@end ifnottex

where @code{@var{e}[i]} is the equation @var{e}
evaluated with the variables in @var{x} assigned values from the @code{i}-th datum, @code{@var{D}[i]},
and assigning any remaining free variables from @var{a}.
@code{load("lsquares")} loads this function.

Example:

@c ===beg===
@c load ("lsquares")$
@c M : matrix (
@c [1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
@c a : lsquares_estimates (
@c M, [z,x,y], (z+D)^2 = A*x+B*y+C, [A,B,C,D]);
@c lsquares_residual_mse (
@c M, [z,x,y], (z + D)^2 = A*x + B*y + C, first (a));
@c ===end===
@example
(%i1) load ("lsquares")$
(%i2) M : matrix (
[1,1,1], [3/2,1,2], [9/4,2,1], [3,2,2], [2,2,1]);
                        [ 1    1  1 ]
                        [           ]
                        [ 3         ]
                        [ -    1  2 ]
                        [ 2         ]
                        [           ]
                        [ 9         ]
(%o2)                   [ -    2  1 ]
                        [ 4         ]
                        [           ]
                        [ 3    2  2 ]
                        [           ]
                        [ 2    2  1 ]
(%i3) a : lsquares_estimates (
M, [z,x,y], (z+D)^2 = A*x+B*y+C, [A,B,C,D]);
              59        27      10921        107
(%o3) [[A = - --, B = - --, C = -----, D = - ---]]
              16        16      1024          32
(%i4) lsquares_residual_mse (
M, [z,x,y], (z + D)^2 = A*x + B*y + C, first (a));
                               169
(%o4)                          ----
                               2560
@end example
@opencatbox{Categories:}
@category{Package lsquares}
@closecatbox
@end deffn
@deffn {Function} plsquares @
@fname{plsquares} (@var{Mat},@var{VarList},@var{depvars}) @
@fname{plsquares} (@var{Mat},@var{VarList},@var{depvars},@var{maxexpon}) @
@fname{plsquares} (@var{Mat},@var{VarList},@var{depvars},@var{maxexpon},@var{maxdegree})

Multivariable polynomial adjustment of a data table by the least squares method.
@var{Mat} is a matrix containing the data.
@var{VarList} is a list of variable names, one for each column of @var{Mat};
write @code{-} instead of a variable name to ignore a column.
@var{depvars} is the name of a dependent variable, or a list of one or more names
of dependent variables; these names should appear in @var{VarList}.
@var{maxexpon} is the optional maximum exponent for each independent variable
(1 by default), and @var{maxdegree} is the optional maximum polynomial degree
(@var{maxexpon} by default);
note that the sum of exponents in each term must be less than or equal to
@var{maxdegree}, and if @code{maxdegree = 0} then no degree limit is applied.

If @var{depvars} is the name of a dependent variable (not in a list),
@code{plsquares} returns the adjusted polynomial.
If @var{depvars} is a list of one or more dependent variables,
@code{plsquares} returns a list of the adjusted polynomials.
The coefficients of determination are displayed to report the goodness of fit,
which ranges from 0 (no correlation) to 1 (exact correlation).
These values are also stored in the global variable @code{DETCOEF}
(a list if @var{depvars} is a list).
A simple example of multivariable linear adjustment:

@example
(%i1) load("plsquares")$
(%i2) plsquares(matrix([1,2,0],[3,5,4],[4,7,9],[5,8,10]),
                [x,y,z],z);
Determination Coefficient for z = .9897039897039897
          11 y - 9 x - 14
(%o2) z = ---------------
                 3
@end example
The same example without degree restrictions:

@example
(%i3) plsquares(matrix([1,2,0],[3,5,4],[4,7,9],[5,8,10]),
                [x,y,z],z,1,0);
Determination Coefficient for z = 1.0
          x y + 23 y - 29 x - 19
(%o3) z = ----------------------
                    6
@end example
How many diagonals does an N-sided polygon have? What polynomial degree should be used?

@example
(%i4) plsquares(matrix([3,0],[4,2],[5,5],[6,9],[7,14],[8,20]),
                [N,diagonals],diagonals,5);
Determination Coefficient for diagonals = 1.0
                   2
                  N  - 3 N
(%o4) diagonals = --------
                     2
(%i5) ev(%, N=9); /* Testing for a 9-sided polygon */
(%o5)                diagonals = 27
@end example
In how many ways can two queens be placed, without threatening each other, on an n x n chessboard?

@example
(%i6) plsquares(matrix([0,0],[1,0],[2,0],[3,8],[4,44]),
                [n,positions],[positions],4);
Determination Coefficient for [positions] = [1.0]
                         4       3      2
                      3 n  - 10 n  + 9 n  - 2 n
(%o6)    [positions = -------------------------]
                                  6
(%i7) ev(%[1], n=8); /* Testing for a (8 x 8) chessboard */
(%o7) positions = 1288
@end example
An example with six dependent variables:

@example
(%i8) mtrx:matrix([0,0,0,0,0,1,1,1],[0,1,0,1,1,1,0,0],
                  [1,0,0,1,1,1,0,0],[1,1,1,1,0,0,0,1])$
(%i9) plsquares(mtrx,[a,b,_And,_Or,_Xor,_Nand,_Nor,_Nxor],
                [_And,_Or,_Xor,_Nand,_Nor,_Nxor],1,0);
Determination Coefficient for
[_And, _Or, _Xor, _Nand, _Nor, _Nxor] =
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
(%o9) [_And = a b, _Or = - a b + b + a,
_Xor = - 2 a b + b + a, _Nand = 1 - a b,
_Nor = a b - b - a + 1, _Nxor = 2 a b - b - a + 1]
@end example
To use this function, write @code{load("lsquares")} first.
@opencatbox{Categories:}
@category{Package lsquares}
@category{Numerical methods}
@closecatbox
@end deffn