1 @c -----------------------------------------------------------------------------
2 @c File : lbfgs.de.texi
3 @c License : GNU General Public License (GPL)
5 @c Original : lbfgs.texi revision 1.9
8 @c This file is part of Maxima -- GPL CAS based on DOE-MACSYMA
9 @c -----------------------------------------------------------------------------
12 * Introduction to lbfgs::
13 * Functions and Variables for lbfgs::
16 @c -----------------------------------------------------------------------------
17 @node Introduction to lbfgs, Functions and Variables for lbfgs, Top, Top
18 @section Introduction to lbfgs
20 @code{lbfgs} is an implementation of the L-BFGS algorithm [1] to solve
21 unconstrained minimization problems via a limited-memory quasi-Newton (BFGS)
22 algorithm. It is called a limited-memory method because a low-rank
23 approximation of the Hessian matrix inverse is stored instead of the entire
24 Hessian inverse. The program was originally written in Fortran [2] by Jorge
25 Nocedal, incorporating some functions originally written by Jorge J. Mor@'{e}
26 and David J. Thuente, and translated into Lisp automatically via the program
27 @code{f2cl}. The Maxima package @code{lbfgs} comprises the translated code
28 plus an interface function which manages some details.
32 [1] D. Liu and J. Nocedal. "On the limited memory BFGS method for large
33 scale optimization". @i{Mathematical Programming B} 45:503--528 (1989)
35 [2] @url{http://netlib.org/opt/lbfgs_um.shar}
38 @c @category{Numerical methods} @category{Share packages} @category{Package lbfgs}
41 @c -----------------------------------------------------------------------------
42 @node Functions and Variables for lbfgs, , Introduction to lbfgs, Top
43 @section Functions and Variables for lbfgs
45 @c -----------------------------------------------------------------------------
46 @deffn {Function} lbfgs (@var{FOM}, @var{X}, @var{X0}, @var{epsilon}, @var{iprint})
47 @deffnx {Function} lbfgs ([@var{FOM}, @var{grad}] @var{X}, @var{X0}, @var{epsilon}, @var{iprint})
49 Finds an approximate solution of the unconstrained minimization of the figure
50 of merit @var{FOM} over the list of variables @var{X}, starting from initial
51 estimates @var{X0}, such that @math{norm(grad(FOM)) < epsilon*max(1, norm(X))}.
53 @var{grad}, if present, is the gradient of @var{FOM} with respect to the
54 variables @var{X}. @var{grad} is a list, with one element for each element of
55 @var{X}. If not present, the gradient is computed automatically by symbolic
58 The algorithm applied is a limited-memory quasi-Newton (BFGS) algorithm [1].
59 It is called a limited-memory method because a low-rank approximation of the
60 Hessian matrix inverse is stored instead of the entire Hessian inverse.
61 Each iteration of the algorithm is a line search, that is,
62 a search along a ray in the variables @var{X},
63 with the search direction computed from the approximate Hessian inverse.
64 The FOM is always decreased by a successful line search.
65 Usually (but not always) the norm of the gradient of FOM also decreases.
67 @var{iprint} controls progress messages printed by @code{lbfgs}.
71 @code{@var{iprint}[1]} controls the frequency of progress messages.
76 Messages at the first and last iterations.
78 Print a message every @code{@var{iprint}[1]} iterations.
81 @code{@var{iprint}[2]} controls the verbosity of progress messages.
84 Print out iteration count, number of evaluations of @var{FOM}, value of
85 @var{FOM}, norm of the gradient of @var{FOM}, and step length.
87 Same as @code{@var{iprint}[2] = 0}, plus @var{X0} and the gradient of @var{FOM}
88 evaluated at @var{X0}.
90 Same as @code{@var{iprint}[2] = 1}, plus values of @var{X} at each iteration.
92 Same as @code{@var{iprint}[2] = 2}, plus the gradient of @var{FOM} at each
97 The columns printed by @code{lbfgs} are the following.
101 Number of iterations. It is incremented for each line search.
103 Number of evaluations of the figure of merit.
105 Value of the figure of merit at the end of the most recent line search.
107 Norm of the gradient of the figure of merit at the end of the most recent line
110 An internal parameter of the search algorithm.
113 Additional information concerning details of the algorithm are found in the
114 comments of the original Fortran code [2].
116 See also @code{lbfgs_nfeval_max} and @code{lbfgs_ncorrections}.
120 [1] D. Liu and J. Nocedal. "On the limited memory BFGS method for large
121 scale optimization". @i{Mathematical Programming B} 45:503--528 (1989)
123 [2] @url{http://netlib.org/opt/lbfgs_um.shar}
127 The same FOM as computed by FGCOMPUTE in the program sdrive.f in the LBFGS
128 package from Netlib. Note that the variables in question are subscripted
129 variables. The FOM has an exact minimum equal to zero at @math{u[k] = 1} for
130 @math{k = 1, ..., 8}.
134 @c t1[j] := 1 - u[j];
135 @c t2[j] := 10*(u[j + 1] - u[j]^2);
137 @c FOM : sum (t1[2*j - 1]^2 + t2[2*j - 1]^2, j, 1, n/2);
138 @c lbfgs (FOM, '[u[1], u[2], u[3], u[4], u[5], u[6], u[7], u[8]],
139 @c [-1.2, 1, -1.2, 1, -1.2, 1, -1.2, 1], 1e-3, [1, 0]);
143 (%i1) load ("lbfgs");
144 (%o1) /usr/share/maxima/5.10.0cvs/share/lbfgs/lbfgs.mac
145 (%i2) t1[j] := 1 - u[j];
148 (%i3) t2[j] := 10*(u[j + 1] - u[j]^2);
150 (%o3) t2 := 10 (u - u )
154 (%i5) FOM : sum (t1[2*j - 1]^2 + t2[2*j - 1]^2, j, 1, n/2);
156 (%o5) 100 (u - u ) + (1 - u ) + 100 (u - u ) + (1 - u )
159 + 100 (u - u ) + (1 - u ) + 100 (u - u ) + (1 - u )
161 (%i6) lbfgs (FOM, '[u[1],u[2],u[3],u[4],u[5],u[6],u[7],u[8]],
162 [-1.2, 1, -1.2, 1, -1.2, 1, -1.2, 1], 1e-3, [1, 0]);
163 *************************************************
164 N= 8 NUMBER OF CORRECTIONS=25
166 F= 9.680000000000000D+01 GNORM= 4.657353755084532D+02
167 *************************************************
171 I NFN FUNC GNORM STEPLENGTH
173 1 3 1.651479526340304D+01 4.324359291335977D+00 7.926153934390631D-04
174 2 4 1.650209316638371D+01 3.575788161060007D+00 1.000000000000000D+00
175 3 5 1.645461701312851D+01 6.230869903601577D+00 1.000000000000000D+00
176 4 6 1.636867301275588D+01 1.177589920974980D+01 1.000000000000000D+00
177 5 7 1.612153014409201D+01 2.292797147151288D+01 1.000000000000000D+00
178 6 8 1.569118407390628D+01 3.687447158775571D+01 1.000000000000000D+00
179 7 9 1.510361958398942D+01 4.501931728123680D+01 1.000000000000000D+00
180 8 10 1.391077875774294D+01 4.526061463810632D+01 1.000000000000000D+00
181 9 11 1.165625686278198D+01 2.748348965356917D+01 1.000000000000000D+00
182 10 12 9.859422687859137D+00 2.111494974231644D+01 1.000000000000000D+00
183 11 13 7.815442521732281D+00 6.110762325766556D+00 1.000000000000000D+00
184 12 15 7.346380905773160D+00 2.165281166714631D+01 1.285316401779533D-01
185 13 16 6.330460634066370D+00 1.401220851762050D+01 1.000000000000000D+00
186 14 17 5.238763939851439D+00 1.702473787613255D+01 1.000000000000000D+00
187 15 18 3.754016790406701D+00 7.981845727704576D+00 1.000000000000000D+00
188 16 20 3.001238402309352D+00 3.925482944716691D+00 2.333129631296807D-01
189 17 22 2.794390709718290D+00 8.243329982546473D+00 2.503577283782332D-01
190 18 23 2.563783562918759D+00 1.035413426521790D+01 1.000000000000000D+00
191 19 24 2.019429976377856D+00 1.065187312346769D+01 1.000000000000000D+00
192 20 25 1.428003167670903D+00 2.475962450826961D+00 1.000000000000000D+00
193 21 27 1.197874264861340D+00 8.441707983493810D+00 4.303451060808756D-01
194 22 28 9.023848941942773D-01 1.113189216635162D+01 1.000000000000000D+00
195 23 29 5.508226405863770D-01 2.380830600326308D+00 1.000000000000000D+00
196 24 31 3.902893258815567D-01 5.625595816584421D+00 4.834988416524465D-01
197 25 32 3.207542206990315D-01 1.149444645416472D+01 1.000000000000000D+00
198 26 33 1.874468266362791D-01 3.632482152880997D+00 1.000000000000000D+00
199 27 34 9.575763380706598D-02 4.816497446154354D+00 1.000000000000000D+00
200 28 35 4.085145107543406D-02 2.087009350166495D+00 1.000000000000000D+00
201 29 36 1.931106001379290D-02 3.886818608498966D+00 1.000000000000000D+00
202 30 37 6.894000721499670D-03 3.198505796342214D+00 1.000000000000000D+00
203 31 38 1.443296033051864D-03 1.590265471025043D+00 1.000000000000000D+00
204 32 39 1.571766603154336D-04 3.098257063980634D-01 1.000000000000000D+00
205 33 40 1.288011776581970D-05 1.207784183577257D-02 1.000000000000000D+00
206 34 41 1.806140173752971D-06 4.587890233385193D-02 1.000000000000000D+00
207 35 42 1.769004645459358D-07 1.790537375052208D-02 1.000000000000000D+00
208 36 43 3.312164100763217D-10 6.782068426119681D-04 1.000000000000000D+00
212 \halign{\hfil\tt#&\quad\hfil\tt#\quad&\tt#\hfil\quad&\tt#\hfil\quad&\tt#\hfil\cr
213 I& NFN& FUNC& GNORM& STEPLENGTH\cr
215 1& 3& 1.651479526340304D+01& 4.324359291335977D+00& 7.926153934390631D-04\cr
216 2& 4& 1.650209316638371D+01& 3.575788161060007D+00& 1.000000000000000D+00\cr
217 3& 5& 1.645461701312851D+01& 6.230869903601577D+00& 1.000000000000000D+00\cr
218 4& 6& 1.636867301275588D+01& 1.177589920974980D+01& 1.000000000000000D+00\cr
219 5& 7& 1.612153014409201D+01& 2.292797147151288D+01& 1.000000000000000D+00\cr
220 6& 8& 1.569118407390628D+01& 3.687447158775571D+01& 1.000000000000000D+00\cr
221 7& 9& 1.510361958398942D+01& 4.501931728123680D+01& 1.000000000000000D+00\cr
222 8&10& 1.391077875774294D+01& 4.526061463810632D+01& 1.000000000000000D+00\cr
223 9&11& 1.165625686278198D+01& 2.748348965356917D+01& 1.000000000000000D+00\cr
224 10&12& 9.859422687859137D+00& 2.111494974231644D+01& 1.000000000000000D+00\cr
225 11&13& 7.815442521732281D+00& 6.110762325766556D+00& 1.000000000000000D+00\cr
226 12&15& 7.346380905773160D+00& 2.165281166714631D+01& 1.285316401779533D-01\cr
227 13&16& 6.330460634066370D+00& 1.401220851762050D+01& 1.000000000000000D+00\cr
228 14&17& 5.238763939851439D+00& 1.702473787613255D+01& 1.000000000000000D+00\cr
229 15&18& 3.754016790406701D+00& 7.981845727704576D+00& 1.000000000000000D+00\cr
230 16&20& 3.001238402309352D+00& 3.925482944716691D+00& 2.333129631296807D-01\cr
231 17&22& 2.794390709718290D+00& 8.243329982546473D+00& 2.503577283782332D-01\cr
232 18&23& 2.563783562918759D+00& 1.035413426521790D+01& 1.000000000000000D+00\cr
233 19&24& 2.019429976377856D+00& 1.065187312346769D+01& 1.000000000000000D+00\cr
234 20&25& 1.428003167670903D+00& 2.475962450826961D+00& 1.000000000000000D+00\cr
235 21&27& 1.197874264861340D+00& 8.441707983493810D+00& 4.303451060808756D-01\cr
236 22&28& 9.023848941942773D-01& 1.113189216635162D+01& 1.000000000000000D+00\cr
237 23&29& 5.508226405863770D-01& 2.380830600326308D+00& 1.000000000000000D+00\cr
238 24&31& 3.902893258815567D-01& 5.625595816584421D+00& 4.834988416524465D-01\cr
239 25&32& 3.207542206990315D-01& 1.149444645416472D+01& 1.000000000000000D+00\cr
240 26&33& 1.874468266362791D-01& 3.632482152880997D+00& 1.000000000000000D+00\cr
241 27&34& 9.575763380706598D-02& 4.816497446154354D+00& 1.000000000000000D+00\cr
242 28&35& 4.085145107543406D-02& 2.087009350166495D+00& 1.000000000000000D+00\cr
243 29&36& 1.931106001379290D-02& 3.886818608498966D+00& 1.000000000000000D+00\cr
244 30&37& 6.894000721499670D-03& 3.198505796342214D+00& 1.000000000000000D+00\cr
245 31&38& 1.443296033051864D-03& 1.590265471025043D+00& 1.000000000000000D+00\cr
246 32&39& 1.571766603154336D-04& 3.098257063980634D-01& 1.000000000000000D+00\cr
247 33&40& 1.288011776581970D-05& 1.207784183577257D-02& 1.000000000000000D+00\cr
248 34&41& 1.806140173752971D-06& 4.587890233385193D-02& 1.000000000000000D+00\cr
249 35&42& 1.769004645459358D-07& 1.790537375052208D-02& 1.000000000000000D+00\cr
250 36&43& 3.312164100763217D-10& 6.782068426119681D-04& 1.000000000000000D+00\cr
255 THE MINIMIZATION TERMINATED WITHOUT DETECTING ERRORS.
257 (%o6) [u = 1.000005339815974, u = 1.000009942839805,
259 u = 1.000005339815974, u = 1.000009942839805,
261 u = 1.000005339815974, u = 1.000009942839805,
263 u = 1.000005339815974, u = 1.000009942839805]
267 A regression problem. The FOM is the mean square difference between the
268 predicted value @math{F(X[i])} and the observed value @math{Y[i]}. The function
269 @math{F} is a bounded monotone function (a so-called "sigmoidal" function). In
270 this example, @code{lbfgs} computes approximate values for the parameters of
271 @math{F} and @code{plot2d} displays a comparison of @math{F} with the observed
276 @c FOM : '((1/length(X))*sum((F(X[i]) - Y[i])^2, i, 1,
278 @c X : [1, 2, 3, 4, 5];
279 @c Y : [0, 0.5, 1, 1.25, 1.5];
280 @c F(x) := A/(1 + exp(-B*(x - C)));
282 @c estimates : lbfgs (FOM, '[A, B, C], [1, 1, 1], 1e-4, [1, 0]);
283 @c plot2d ([F(x), [discrete, X, Y]], [x, -1, 6]), ''estimates;
287 (%i1) load ("lbfgs");
288 (%o1) /usr/share/maxima/5.10.0cvs/share/lbfgs/lbfgs.mac
289 (%i2) FOM : '((1/length(X))*sum((F(X[i]) - Y[i])^2, i, 1,
292 sum((F(X ) - Y ) , i, 1, length(X))
294 (%o2) -----------------------------------
296 (%i3) X : [1, 2, 3, 4, 5];
297 (%o3) [1, 2, 3, 4, 5]
298 (%i4) Y : [0, 0.5, 1, 1.25, 1.5];
299 (%o4) [0, 0.5, 1, 1.25, 1.5]
300 (%i5) F(x) := A/(1 + exp(-B*(x - C)));
302 (%o5) F(x) := ----------------------
303 1 + exp((- B) (x - C))
306 (%o6) ((----------------- - 1.5) + (----------------- - 1.25)
307 - B (5 - C) - B (4 - C)
310 + (----------------- - 1) + (----------------- - 0.5)
311 - B (3 - C) - B (2 - C)
315 + --------------------)/5
318 (%i7) estimates : lbfgs (FOM, '[A, B, C], [1, 1, 1], 1e-4, [1, 0]);
319 *************************************************
320 N= 3 NUMBER OF CORRECTIONS=25
322 F= 1.348738534246918D-01 GNORM= 2.000215531936760D-01
323 *************************************************
328 I NFN FUNC GNORM STEPLENGTH
329 1 3 1.177820636622582D-01 9.893138394953992D-02 8.554435968992371D-01
330 2 6 2.302653892214013D-02 1.180098521565904D-01 2.100000000000000D+01
331 3 8 1.496348495303005D-02 9.611201567691633D-02 5.257340567840707D-01
332 4 9 7.900460841091139D-03 1.325041647391314D-02 1.000000000000000D+00
333 5 10 7.314495451266917D-03 1.510670810312237D-02 1.000000000000000D+00
334 6 11 6.750147275936680D-03 1.914964958023047D-02 1.000000000000000D+00
335 7 12 5.850716021108205D-03 1.028089194579363D-02 1.000000000000000D+00
336 8 13 5.778664230657791D-03 3.676866074530332D-04 1.000000000000000D+00
337 9 14 5.777818823650782D-03 3.010740179797255D-04 1.000000000000000D+00
341 \halign{\hfil\tt#&\quad\hfil\tt#\quad&\tt#\hfil\quad&\tt#\hfil\quad&\tt#\hfil\cr
342 I& NFN& FUNC& GNORM& STEPLENGTH\cr
344 1& 3&1.177820636622582D-01& 9.893138394953992D-02& 8.554435968992371D-01\cr
345 2& 6&2.302653892214013D-02& 1.180098521565904D-01& 2.100000000000000D+01\cr
346 3& 8&1.496348495303005D-02& 9.611201567691633D-02& 5.257340567840707D-01\cr
347 4& 9&7.900460841091139D-03& 1.325041647391314D-02& 1.000000000000000D+00\cr
348 5& 10&7.314495451266917D-03& 1.510670810312237D-02& 1.000000000000000D+00\cr
349 6& 11&6.750147275936680D-03& 1.914964958023047D-02& 1.000000000000000D+00\cr
350 7& 12&5.850716021108205D-03& 1.028089194579363D-02& 1.000000000000000D+00\cr
351 8& 13&5.778664230657791D-03& 3.676866074530332D-04& 1.000000000000000D+00\cr
352 9& 14&5.777818823650782D-03& 3.010740179797255D-04& 1.000000000000000D+00\cr
357 THE MINIMIZATION TERMINATED WITHOUT DETECTING ERRORS.
359 (%o7) [A = 1.461933911464101, B = 1.601593973254802,
360 C = 2.528933072164854]
361 (%i8) plot2d ([F(x), [discrete, X, Y]], [x, -1, 6]), ''estimates;
365 Gradient of FOM is specified (instead of computing it automatically).
369 @c F(a, b, c) := (a - 5)^2 + (b - 3)^4 + (c - 2)^6;
370 @c F_grad : map (lambda ([x], diff (F(a, b, c), x)), [a, b, c]);
371 @c estimates : lbfgs ([F(a, b, c), F_grad],
372 @c [a, b, c], [0, 0, 0], 1e-4, [1, 0]);
375 (%i1) load ("lbfgs")$
376 (%i2) F(a, b, c) := (a - 5)^2 + (b - 3)^4 + (c - 2)^6;
378 (%o2) F(a, b, c) := (a - 5) + (b - 3) + (c - 2)
379 (%i3) F_grad : map (lambda ([x], diff (F(a, b, c), x)), [a, b, c]);
381 (%o3) [2 (a - 5), 4 (b - 3) , 6 (c - 2) ]
382 (%i4) estimates : lbfgs ([F(a, b, c), F_grad],
383 [a, b, c], [0, 0, 0], 1e-4, [1, 0]);
384 *************************************************
385 N= 3 NUMBER OF CORRECTIONS=25
387 F= 1.700000000000000D+02 GNORM= 2.205175729958953D+02
388 *************************************************
393 I NFN FUNC GNORM STEPLENGTH
395 1 2 6.632967565917638D+01 6.498411132518770D+01 4.534785987412505D-03
396 2 3 4.368890936228036D+01 3.784147651974131D+01 1.000000000000000D+00
397 3 4 2.685298972775190D+01 1.640262125898521D+01 1.000000000000000D+00
398 4 5 1.909064767659852D+01 9.733664001790506D+00 1.000000000000000D+00
399 5 6 1.006493272061515D+01 6.344808151880209D+00 1.000000000000000D+00
400 6 7 1.215263596054294D+00 2.204727876126879D+00 1.000000000000000D+00
401 7 8 1.080252896385334D-02 1.431637116951849D-01 1.000000000000000D+00
402 8 9 8.407195124830908D-03 1.126344579730013D-01 1.000000000000000D+00
403 9 10 5.022091686198527D-03 7.750731829225274D-02 1.000000000000000D+00
404 10 11 2.277152808939775D-03 5.032810859286795D-02 1.000000000000000D+00
405 11 12 6.489384688303218D-04 1.932007150271008D-02 1.000000000000000D+00
406 12 13 2.075791943844548D-04 6.964319310814364D-03 1.000000000000000D+00
407 13 14 7.349472666162257D-05 4.017449067849554D-03 1.000000000000000D+00
408 14 15 2.293617477985237D-05 1.334590390856715D-03 1.000000000000000D+00
409 15 16 7.683645404048675D-06 6.011057038099201D-04 1.000000000000000D+00
413 \halign{\hfil\tt#&\quad\hfil\tt#\quad&\tt#\hfil\quad&\tt#\hfil\quad&\tt#\hfil\cr
414 I& NFN& FUNC& GNORM& STEPLENGTH\cr
416 1& 2& 6.632967565917638D+01& 6.498411132518770D+01& 4.534785987412505D-03\cr
417 2& 3& 4.368890936228036D+01& 3.784147651974131D+01& 1.000000000000000D+00\cr
418 3& 4& 2.685298972775190D+01& 1.640262125898521D+01& 1.000000000000000D+00\cr
419 4& 5& 1.909064767659852D+01& 9.733664001790506D+00& 1.000000000000000D+00\cr
420 5& 6& 1.006493272061515D+01& 6.344808151880209D+00& 1.000000000000000D+00\cr
421 6& 7& 1.215263596054294D+00& 2.204727876126879D+00& 1.000000000000000D+00\cr
422 7& 8& 1.080252896385334D-02& 1.431637116951849D-01& 1.000000000000000D+00\cr
423 8& 9& 8.407195124830908D-03& 1.126344579730013D-01& 1.000000000000000D+00\cr
424 9& 10& 5.022091686198527D-03& 7.750731829225274D-02& 1.000000000000000D+00\cr
425 10& 11& 2.277152808939775D-03& 5.032810859286795D-02& 1.000000000000000D+00\cr
426 11& 12& 6.489384688303218D-04& 1.932007150271008D-02& 1.000000000000000D+00\cr
427 12& 13& 2.075791943844548D-04& 6.964319310814364D-03& 1.000000000000000D+00\cr
428 13& 14& 7.349472666162257D-05& 4.017449067849554D-03& 1.000000000000000D+00\cr
429 14& 15& 2.293617477985237D-05& 1.334590390856715D-03& 1.000000000000000D+00\cr
430 15& 16& 7.683645404048675D-06& 6.011057038099201D-04& 1.000000000000000D+00\cr
435 THE MINIMIZATION TERMINATED WITHOUT DETECTING ERRORS.
437 (%o4) [a = 5.000086823042934, b = 3.05239542970518,
438 c = 1.927980629919583]
442 @c @category{Package lbfgs}
446 @c -----------------------------------------------------------------------------
447 @defvr {Variable} lbfgs_nfeval_max
450 @code{lbfgs_nfeval_max} is the maximum number of evaluations of the figure of
451 merit (FOM) in @code{lbfgs}. When @code{lbfgs_nfeval_max} is reached,
452 @code{lbfgs} returns the result of the last successful line search.
455 @c @category{Package lbfgs}
459 @c -----------------------------------------------------------------------------
460 @defvr {Variable} lbfgs_ncorrections
463 @code{lbfgs_ncorrections} is the number of corrections applied
464 to the approximate inverse Hessian matrix which is maintained by @code{lbfgs}.
467 @c @category{Package lbfgs}
471 @c --- End of file lbfgs.de.texi -----------------------------------------------