2 * Introduction to lbfgs::
3 * Functions and Variables for lbfgs::
6 @node Introduction to lbfgs, Functions and Variables for lbfgs, , Package lbfgs
7 @section Introduction to lbfgs
9 @code{lbfgs} is an implementation of the L-BFGS algorithm [1]
10 to solve unconstrained minimization problems via a limited-memory quasi-Newton (BFGS) algorithm.
11 It is called a limited-memory method because a low-rank approximation of the
12 Hessian matrix inverse is stored instead of the entire Hessian inverse.
13 The program was originally written in Fortran [2] by Jorge Nocedal,
14 incorporating some functions originally written by Jorge J. Mor@'{e} and David J. Thuente,
15 and translated into Lisp automatically via the program @code{f2cl}.
16 The Maxima package @code{lbfgs} comprises the translated code plus
17 an interface function which manages some details.
21 [1] D. Liu and J. Nocedal. "On the limited memory BFGS method for large
22 scale optimization". @i{Mathematical Programming B} 45:503--528 (1989)
24 [2] @url{https://www.netlib.org/opt/lbfgs_um.shar}
26 @opencatbox{Categories:}
27 @category{Numerical methods}
28 @category{Optimization}
29 @category{Share packages}
30 @category{Package lbfgs}
33 @node Functions and Variables for lbfgs, , Introduction to lbfgs, Package lbfgs
34 @section Functions and Variables for lbfgs
37 @deffn {Function} lbfgs @
38 @fname{lbfgs} (@var{FOM}, @var{X}, @var{X0}, @var{epsilon}, @var{iprint}) @
39 @fname{lbfgs} ([@var{FOM}, @var{grad}] @var{X}, @var{X0}, @var{epsilon}, @var{iprint})
41 Finds an approximate solution of the unconstrained minimization of the figure of merit @var{FOM}
42 over the list of variables @var{X},
43 starting from initial estimates @var{X0},
44 such that @math{norm(grad(FOM)) < epsilon*max(1, norm(X))}.
46 @var{grad}, if present, is the gradient of @var{FOM} with respect to the variables @var{X}.
47 @var{grad} may be a list or a function that returns a list, with one element for each element of @var{X}.
48 If not present, the gradient is computed automatically by symbolic differentiation.
49 If @var{FOM} is a function, the gradient @var{grad} must be supplied by the user.
51 The algorithm applied is a limited-memory quasi-Newton (BFGS) algorithm [1].
52 It is called a limited-memory method because a low-rank approximation of the
53 Hessian matrix inverse is stored instead of the entire Hessian inverse.
54 Each iteration of the algorithm is a line search, that is,
55 a search along a ray in the variables @var{X},
56 with the search direction computed from the approximate Hessian inverse.
57 The FOM is always decreased by a successful line search.
58 Usually (but not always) the norm of the gradient of FOM also decreases.
60 @var{iprint} controls progress messages printed by @code{lbfgs}.
64 @code{@var{iprint}[1]} controls the frequency of progress messages.
69 Messages at the first and last iterations.
71 Print a message every @code{@var{iprint}[1]} iterations.
74 @code{@var{iprint}[2]} controls the verbosity of progress messages.
77 Print out iteration count, number of evaluations of @var{FOM}, value of @var{FOM},
78 norm of the gradient of @var{FOM}, and step length.
80 Same as @code{@var{iprint}[2] = 0}, plus @var{X0} and the gradient of @var{FOM} evaluated at @var{X0}.
82 Same as @code{@var{iprint}[2] = 1}, plus values of @var{X} at each iteration.
84 Same as @code{@var{iprint}[2] = 2}, plus the gradient of @var{FOM} at each iteration.
88 The columns printed by @code{lbfgs} are the following.
92 Number of iterations. It is incremented for each line search.
94 Number of evaluations of the figure of merit.
96 Value of the figure of merit at the end of the most recent line search.
98 Norm of the gradient of the figure of merit at the end of the most recent line search.
100 An internal parameter of the search algorithm.
103 Additional information concerning details of the algorithm are found in the
104 comments of the original Fortran code [2].
106 See also @mref{lbfgs_nfeval_max} and @mrefdot{lbfgs_ncorrections}
110 [1] D. Liu and J. Nocedal. "On the limited memory BFGS method for large
111 scale optimization". @i{Mathematical Programming B} 45:503--528 (1989)
113 [2] @url{https://www.netlib.org/opt/lbfgs_um.shar}
117 The same FOM as computed by FGCOMPUTE in the program sdrive.f in the LBFGS package from Netlib.
118 Note that the variables in question are subscripted variables.
119 The FOM has an exact minimum equal to zero at @math{u[k] = 1} for @math{k = 1, ..., 8}.
122 @c t1[j] := 1 - u[j];
123 @c t2[j] := 10*(u[j + 1] - u[j]^2);
125 @c FOM : sum (t1[2*j - 1]^2 + t2[2*j - 1]^2, j, 1, n/2);
126 @c lbfgs (FOM, '[u[1], u[2], u[3], u[4], u[5], u[6], u[7], u[8]],
127 @c [-1.2, 1, -1.2, 1, -1.2, 1, -1.2, 1], 1e-3, [1, 0]);
131 (%i1) load ("lbfgs")$
132 (%i2) t1[j] := 1 - u[j];
135 (%i3) t2[j] := 10*(u[j + 1] - u[j]^2);
137 (%o3) t2 := 10 (u - u )
141 (%i5) FOM : sum (t1[2*j - 1]^2 + t2[2*j - 1]^2, j, 1, n/2);
143 (%o5) 100 (u - u ) + (1 - u ) + 100 (u - u ) + (1 - u )
146 + 100 (u - u ) + (1 - u ) + 100 (u - u ) + (1 - u )
148 (%i6) lbfgs (FOM, '[u[1],u[2],u[3],u[4],u[5],u[6],u[7],u[8]],
149 [-1.2, 1, -1.2, 1, -1.2, 1, -1.2, 1], 1e-3, [1, 0]);
150 *************************************************
151 N= 8 NUMBER OF CORRECTIONS=25
153 F= 9.680000000000000D+01 GNORM= 4.657353755084533D+02
154 *************************************************
158 I NFN FUNC GNORM STEPLENGTH
160 1 3 1.651479526340304D+01 4.324359291335977D+00 7.926153934390631D-04
161 2 4 1.650209316638371D+01 3.575788161060007D+00 1.000000000000000D+00
162 3 5 1.645461701312851D+01 6.230869903601577D+00 1.000000000000000D+00
163 4 6 1.636867301275588D+01 1.177589920974980D+01 1.000000000000000D+00
164 5 7 1.612153014409201D+01 2.292797147151288D+01 1.000000000000000D+00
165 6 8 1.569118407390628D+01 3.687447158775571D+01 1.000000000000000D+00
166 7 9 1.510361958398942D+01 4.501931728123679D+01 1.000000000000000D+00
167 8 10 1.391077875774293D+01 4.526061463810630D+01 1.000000000000000D+00
168 9 11 1.165625686278198D+01 2.748348965356907D+01 1.000000000000000D+00
169 10 12 9.859422687859144D+00 2.111494974231706D+01 1.000000000000000D+00
170 11 13 7.815442521732282D+00 6.110762325764183D+00 1.000000000000000D+00
171 12 15 7.346380905773044D+00 2.165281166715009D+01 1.285316401779678D-01
172 13 16 6.330460634066464D+00 1.401220851761508D+01 1.000000000000000D+00
173 14 17 5.238763939854303D+00 1.702473787619218D+01 1.000000000000000D+00
174 15 18 3.754016790406625D+00 7.981845727632704D+00 1.000000000000000D+00
175 16 20 3.001238402313225D+00 3.925482944745832D+00 2.333129631316462D-01
176 17 22 2.794390709722064D+00 8.243329982586480D+00 2.503577283802312D-01
177 18 23 2.563783562920545D+00 1.035413426522664D+01 1.000000000000000D+00
178 19 24 2.019429976373283D+00 1.065187312340952D+01 1.000000000000000D+00
179 20 25 1.428003167668592D+00 2.475962450735100D+00 1.000000000000000D+00
180 21 27 1.197874264859232D+00 8.441707983339661D+00 4.303451060697367D-01
181 22 28 9.023848942003913D-01 1.113189216665625D+01 1.000000000000000D+00
182 23 29 5.508226405855795D-01 2.380830599637816D+00 1.000000000000000D+00
183 24 31 3.902893258879521D-01 5.625595817143044D+00 4.834988416747262D-01
184 25 32 3.207542206881058D-01 1.149444645298493D+01 1.000000000000000D+00
185 26 33 1.874468266118200D-01 3.632482152347445D+00 1.000000000000000D+00
186 27 34 9.575763380282112D-02 4.816497449000391D+00 1.000000000000000D+00
187 28 35 4.085145106760390D-02 2.087009347116811D+00 1.000000000000000D+00
188 29 36 1.931106005512628D-02 3.886818624052740D+00 1.000000000000000D+00
189 30 37 6.894000636920714D-03 3.198505769992936D+00 1.000000000000000D+00
190 31 38 1.443296008850287D-03 1.590265460381961D+00 1.000000000000000D+00
191 32 39 1.571766574930155D-04 3.098257002223532D-01 1.000000000000000D+00
192 33 40 1.288011779655132D-05 1.207784334505595D-02 1.000000000000000D+00
193 34 41 1.806140190993455D-06 4.587890258846915D-02 1.000000000000000D+00
194 35 42 1.769004612050548D-07 1.790537363138099D-02 1.000000000000000D+00
195 36 43 3.312164244118216D-10 6.782068546986653D-04 1.000000000000000D+00
199 \halign{\hfil\tt#&\quad\hfil\tt#\quad&\tt#\hfil\quad&\tt#\hfil\quad&\tt#\hfil\cr
200 I& NFN& FUNC& GNORM& STEPLENGTH\cr
202 1& 3& 1.651479526340304D+01& 4.324359291335977D+00& 7.926153934390631D-04\cr
203 2& 4& 1.650209316638371D+01& 3.575788161060007D+00& 1.000000000000000D+00\cr
204 3& 5& 1.645461701312851D+01& 6.230869903601577D+00& 1.000000000000000D+00\cr
205 4& 6& 1.636867301275588D+01& 1.177589920974980D+01& 1.000000000000000D+00\cr
206 5& 7& 1.612153014409201D+01& 2.292797147151288D+01& 1.000000000000000D+00\cr
207 6& 8& 1.569118407390628D+01& 3.687447158775571D+01& 1.000000000000000D+00\cr
208 7& 9& 1.510361958398942D+01& 4.501931728123680D+01& 1.000000000000000D+00\cr
209 8&10& 1.391077875774294D+01& 4.526061463810632D+01& 1.000000000000000D+00\cr
210 9&11& 1.165625686278198D+01& 2.748348965356917D+01& 1.000000000000000D+00\cr
211 10&12& 9.859422687859137D+00& 2.111494974231644D+01& 1.000000000000000D+00\cr
212 11&13& 7.815442521732281D+00& 6.110762325766556D+00& 1.000000000000000D+00\cr
213 12&15& 7.346380905773160D+00& 2.165281166714631D+01& 1.285316401779533D-01\cr
214 13&16& 6.330460634066370D+00& 1.401220851762050D+01& 1.000000000000000D+00\cr
215 14&17& 5.238763939851439D+00& 1.702473787613255D+01& 1.000000000000000D+00\cr
216 15&18& 3.754016790406701D+00& 7.981845727704576D+00& 1.000000000000000D+00\cr
217 16&20& 3.001238402309352D+00& 3.925482944716691D+00& 2.333129631296807D-01\cr
218 17&22& 2.794390709718290D+00& 8.243329982546473D+00& 2.503577283782332D-01\cr
219 18&23& 2.563783562918759D+00& 1.035413426521790D+01& 1.000000000000000D+00\cr
220 19&24& 2.019429976377856D+00& 1.065187312346769D+01& 1.000000000000000D+00\cr
221 20&25& 1.428003167670903D+00& 2.475962450826961D+00& 1.000000000000000D+00\cr
222 21&27& 1.197874264861340D+00& 8.441707983493810D+00& 4.303451060808756D-01\cr
223 22&28& 9.023848941942773D-01& 1.113189216635162D+01& 1.000000000000000D+00\cr
224 23&29& 5.508226405863770D-01& 2.380830600326308D+00& 1.000000000000000D+00\cr
225 24&31& 3.902893258815567D-01& 5.625595816584421D+00& 4.834988416524465D-01\cr
226 25&32& 3.207542206990315D-01& 1.149444645416472D+01& 1.000000000000000D+00\cr
227 26&33& 1.874468266362791D-01& 3.632482152880997D+00& 1.000000000000000D+00\cr
228 27&34& 9.575763380706598D-02& 4.816497446154354D+00& 1.000000000000000D+00\cr
229 28&35& 4.085145107543406D-02& 2.087009350166495D+00& 1.000000000000000D+00\cr
230 29&36& 1.931106001379290D-02& 3.886818608498966D+00& 1.000000000000000D+00\cr
231 30&37& 6.894000721499670D-03& 3.198505796342214D+00& 1.000000000000000D+00\cr
232 31&38& 1.443296033051864D-03& 1.590265471025043D+00& 1.000000000000000D+00\cr
233 32&39& 1.571766603154336D-04& 3.098257063980634D-01& 1.000000000000000D+00\cr
234 33&40& 1.288011776581970D-05& 1.207784183577257D-02& 1.000000000000000D+00\cr
235 34&41& 1.806140173752971D-06& 4.587890233385193D-02& 1.000000000000000D+00\cr
236 35&42& 1.769004645459358D-07& 1.790537375052208D-02& 1.000000000000000D+00\cr
237 36&43& 3.312164100763217D-10& 6.782068426119681D-04& 1.000000000000000D+00\cr
242 THE MINIMIZATION TERMINATED WITHOUT DETECTING ERRORS.
244 (%o6) [u = 1.000005339816132, u = 1.000009942840108,
246 u = 1.000005339816132, u = 1.000009942840108,
248 u = 1.000005339816132, u = 1.000009942840108,
250 u = 1.000005339816132, u = 1.000009942840108]
254 A regression problem.
255 The FOM is the mean square difference between the predicted value @math{F(X[i])}
256 and the observed value @math{Y[i]}.
257 The function @math{F} is a bounded monotone function (a so-called "sigmoidal" function).
258 In this example, @code{lbfgs} computes approximate values for the parameters of @math{F}
259 and @code{plot2d} displays a comparison of @math{F} with the observed data.
262 @c FOM : '((1/length(X))*sum((F(X[i]) - Y[i])^2, i, 1,
264 @c X : [1, 2, 3, 4, 5];
265 @c Y : [0, 0.5, 1, 1.25, 1.5];
266 @c F(x) := A/(1 + exp(-B*(x - C)));
268 @c estimates : lbfgs (FOM, '[A, B, C], [1, 1, 1], 1e-4, [1, 0]);
269 @c plot2d ([F(x), [discrete, X, Y]], [x, -1, 6]), ''estimates;
273 (%i1) load ("lbfgs")$
274 (%i2) FOM : '((1/length(X))*sum((F(X[i]) - Y[i])^2, i, 1,
277 sum((F(X ) - Y ) , i, 1, length(X))
279 (%o2) -----------------------------------
281 (%i3) X : [1, 2, 3, 4, 5];
282 (%o3) [1, 2, 3, 4, 5]
283 (%i4) Y : [0, 0.5, 1, 1.25, 1.5];
284 (%o4) [0, 0.5, 1, 1.25, 1.5]
285 (%i5) F(x) := A/(1 + exp(-B*(x - C)));
287 (%o5) F(x) := ----------------------
288 1 + exp((- B) (x - C))
291 (%o6) ((----------------- - 1.5) + (----------------- - 1.25)
292 - B (5 - C) - B (4 - C)
295 + (----------------- - 1) + (----------------- - 0.5)
296 - B (3 - C) - B (2 - C)
300 + --------------------)/5
303 (%i7) estimates : lbfgs (FOM, '[A, B, C], [1, 1, 1], 1e-4, [1, 0]);
304 *************************************************
305 N= 3 NUMBER OF CORRECTIONS=25
307 F= 1.348738534246918D-01 GNORM= 2.000215531936760D-01
308 *************************************************
313 I NFN FUNC GNORM STEPLENGTH
314 1 3 1.177820636622582D-01 9.893138394953992D-02 8.554435968992371D-01
315 2 6 2.302653892214013D-02 1.180098521565904D-01 2.100000000000000D+01
316 3 8 1.496348495303004D-02 9.611201567691624D-02 5.257340567840710D-01
317 4 9 7.900460841091138D-03 1.325041647391314D-02 1.000000000000000D+00
318 5 10 7.314495451266914D-03 1.510670810312226D-02 1.000000000000000D+00
319 6 11 6.750147275936668D-03 1.914964958023037D-02 1.000000000000000D+00
320 7 12 5.850716021108202D-03 1.028089194579382D-02 1.000000000000000D+00
321 8 13 5.778664230657800D-03 3.676866074532179D-04 1.000000000000000D+00
322 9 14 5.777818823650780D-03 3.010740179797108D-04 1.000000000000000D+00
326 \halign{\hfil\tt#&\quad\hfil\tt#\quad&\tt#\hfil\quad&\tt#\hfil\quad&\tt#\hfil\cr
327 I& NFN& FUNC& GNORM& STEPLENGTH\cr
329 1& 3&1.177820636622582D-01& 9.893138394953992D-02& 8.554435968992371D-01\cr
330 2& 6&2.302653892214013D-02& 1.180098521565904D-01& 2.100000000000000D+01\cr
331 3& 8&1.496348495303004D-02& 9.611201567691624D-02& 5.257340567840710D-01\cr
332 4& 9&7.900460841091138D-03& 1.325041647391314D-02& 1.000000000000000D+00\cr
333 5& 10&7.314495451266914D-03& 1.510670810312226D-02& 1.000000000000000D+00\cr
334 6& 11&6.750147275936668D-03& 1.914964958023037D-02& 1.000000000000000D+00\cr
335 7& 12&5.850716021108202D-03& 1.028089194579382D-02& 1.000000000000000D+00\cr
336 8& 13&5.778664230657800D-03& 3.676866074532179D-04& 1.000000000000000D+00\cr
337 9& 14&5.777818823650780D-03& 3.010740179797108D-04& 1.000000000000000D+00\cr
342 THE MINIMIZATION TERMINATED WITHOUT DETECTING ERRORS.
344 (%o7) [A = 1.461933911464101, B = 1.601593973254801,
345 C = 2.528933072164855]
346 (%i8) plot2d ([F(x), [discrete, X, Y]], [x, -1, 6]), ''estimates;
350 Gradient of FOM is specified (instead of computing it automatically).
351 Both the FOM and its gradient are passed as functions to @code{lbfgs}.
355 @c F(a, b, c) := (a - 5)^2 + (b - 3)^4 + (c - 2)^6;
356 @c define(F_grad(a, b, c),
357 @c map (lambda ([x], diff (F(a, b, c), x)), [a, b, c]));
358 @c estimates : lbfgs ([F, F_grad],
359 @c [a, b, c], [0, 0, 0], 1e-4, [1, 0]);
362 (%i1) load ("lbfgs")$
363 (%i2) F(a, b, c) := (a - 5)^2 + (b - 3)^4 + (c - 2)^6$
364 (%i3) define(F_grad(a, b, c),
365 map (lambda ([x], diff (F(a, b, c), x)), [a, b, c]))$
366 (%i4) estimates : lbfgs ([F, F_grad],
367 [a, b, c], [0, 0, 0], 1e-4, [1, 0]);
368 *************************************************
369 N= 3 NUMBER OF CORRECTIONS=25
371 F= 1.700000000000000D+02 GNORM= 2.205175729958953D+02
372 *************************************************
377 I NFN FUNC GNORM STEPLENGTH
379 1 2 6.632967565917637D+01 6.498411132518770D+01 4.534785987412505D-03
380 2 3 4.368890936228036D+01 3.784147651974131D+01 1.000000000000000D+00
381 3 4 2.685298972775191D+01 1.640262125898520D+01 1.000000000000000D+00
382 4 5 1.909064767659852D+01 9.733664001790506D+00 1.000000000000000D+00
383 5 6 1.006493272061515D+01 6.344808151880209D+00 1.000000000000000D+00
384 6 7 1.215263596054292D+00 2.204727876126877D+00 1.000000000000000D+00
385 7 8 1.080252896385329D-02 1.431637116951845D-01 1.000000000000000D+00
386 8 9 8.407195124830860D-03 1.126344579730008D-01 1.000000000000000D+00
387 9 10 5.022091686198525D-03 7.750731829225275D-02 1.000000000000000D+00
388 10 11 2.277152808939775D-03 5.032810859286796D-02 1.000000000000000D+00
389 11 12 6.489384688303218D-04 1.932007150271009D-02 1.000000000000000D+00
390 12 13 2.075791943844547D-04 6.964319310814365D-03 1.000000000000000D+00
391 13 14 7.349472666162258D-05 4.017449067849554D-03 1.000000000000000D+00
392 14 15 2.293617477985238D-05 1.334590390856715D-03 1.000000000000000D+00
393 15 16 7.683645404048675D-06 6.011057038099202D-04 1.000000000000000D+00
397 \halign{\hfil\tt#&\quad\hfil\tt#\quad&\tt#\hfil\quad&\tt#\hfil\quad&\tt#\hfil\cr
398 I& NFN& FUNC& GNORM& STEPLENGTH\cr
400 1& 2& 6.632967565917637D+01& 6.498411132518770D+01& 4.534785987412505D-03\cr
401 2& 3& 4.368890936228036D+01& 3.784147651974131D+01& 1.000000000000000D+00\cr
402 3& 4& 2.685298972775191D+01& 1.640262125898520D+01& 1.000000000000000D+00\cr
403 4& 5& 1.909064767659852D+01& 9.733664001790506D+00& 1.000000000000000D+00\cr
404 5& 6& 1.006493272061515D+01& 6.344808151880209D+00& 1.000000000000000D+00\cr
405 6& 7& 1.215263596054292D+00& 2.204727876126877D+00& 1.000000000000000D+00\cr
406 7& 8& 1.080252896385329D-02& 1.431637116951845D-01& 1.000000000000000D+00\cr
407 8& 9& 8.407195124830860D-03& 1.126344579730008D-01& 1.000000000000000D+00\cr
408 9& 10& 5.022091686198525D-03& 7.750731829225275D-02& 1.000000000000000D+00\cr
409 10& 11& 2.277152808939775D-03& 5.032810859286796D-02& 1.000000000000000D+00\cr
410 11& 12& 6.489384688303218D-04& 1.932007150271009D-02& 1.000000000000000D+00\cr
411 12& 13& 2.075791943844547D-04& 6.964319310814365D-03& 1.000000000000000D+00\cr
412 13& 14& 7.349472666162258D-05& 4.017449067849554D-03& 1.000000000000000D+00\cr
413 14& 15& 2.293617477985238D-05& 1.334590390856715D-03& 1.000000000000000D+00\cr
414 15& 16& 7.683645404048675D-06& 6.011057038099202D-04& 1.000000000000000D+00\cr
419 THE MINIMIZATION TERMINATED WITHOUT DETECTING ERRORS.
421 (%o4) [a = 5.000086823042934, b = 3.052395429705181,
422 c = 1.927980629919583]
425 @opencatbox{Categories:}
426 @category{Package lbfgs}
431 @anchor{lbfgs_nfeval_max}
432 @defvr {Variable} lbfgs_nfeval_max
435 @code{lbfgs_nfeval_max} is the maximum number of evaluations of the figure of merit (FOM) in @code{lbfgs}.
436 When @code{lbfgs_nfeval_max} is reached,
437 @code{lbfgs} returns the result of the last successful line search.
439 @opencatbox{Categories:}
440 @category{Package lbfgs}
445 @anchor{lbfgs_ncorrections}
446 @defvr {Variable} lbfgs_ncorrections
449 @code{lbfgs_ncorrections} is the number of corrections applied
450 to the approximate inverse Hessian matrix which is maintained by @code{lbfgs}.
452 @opencatbox{Categories:}
453 @category{Package lbfgs}