massif/docs/ms-manual.xml

   1 <?xml version="1.0"?> <!-- -*- sgml -*- -->
   2 <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
   3           "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
   4 [ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
   5
   6
   7 <chapter id="ms-manual" xreflabel="Massif: a heap profiler">
   8   <title>Massif: a heap profiler</title>
   9
  10 <para>To use this tool, you must specify
  11 <option>--tool=massif</option> on the Valgrind
  12 command line.</para>
  13
  14 <sect1 id="ms-manual.overview" xreflabel="Overview">
  15 <title>Overview</title>
  16
  17 <para>Massif is a heap profiler.  It measures how much heap memory your
  18 program uses.  This includes both the useful space, and the extra bytes
  19 allocated for book-keeping and alignment purposes.  It can also
  20 measure the size of your program's stack(s), although it does not do so by
  21 default.</para>
  22
  23 <para>Heap profiling can help you reduce the amount of memory your program
  24 uses.  On modern machines with virtual memory, this provides the following
  25 benefits:</para>
  26
  27 <itemizedlist>
  28   <listitem><para>It can speed up your program -- a smaller
  29     program will interact better with your machine's caches and
  30     avoid paging.</para></listitem>
  31
  32   <listitem><para>If your program uses lots of memory, it will
  33     reduce the chance that it exhausts your machine's swap
  34     space.</para></listitem>
  35 </itemizedlist>
  36
  37 <para>Also, there are certain space leaks that aren't detected by
  38 traditional leak-checkers, such as Memcheck's.  That's because
  39 the memory isn't ever actually lost -- a pointer remains to it --
  40 but it's not in use.  Programs that have leaks like this can
  41 unnecessarily increase the amount of memory they are using over
  42 time.  Massif can help identify these leaks.</para>
  43
  44 <para>Importantly, Massif tells you not only how much heap memory your
  45 program is using, but also, in considerable detail, which parts of your
  46 program are responsible for allocating that memory.
  47 </para>
  48
  49 <para>Massif also provides <xref linkend="&vg-xtree-id;"/> memory
  50   profiling using the command line
  51   option <computeroutput>--xtree-memory</computeroutput> and the monitor command
  52    <computeroutput>xtmemory</computeroutput>.</para>
  53
  54 </sect1>
  55
  56
  57 <sect1 id="ms-manual.using-print" xreflabel="Using Massif and ms_print">
  58 <title>Using Massif and ms_print</title>
  59
  60 <para>First off, as for the other Valgrind tools, you should compile with
  61 debugging info (the <option>-g</option> option).  It shouldn't
  62 matter much what optimisation level you compile your program with, as this
  63 is unlikely to affect the heap memory usage.</para>
  64
  65 <para>Then, you need to run Massif itself to gather the profiling
  66 information, and then run ms_print to present it in a readable way.</para>
  67
  68
  69
  70
  71 <sect2 id="ms-manual.anexample" xreflabel="An Example">
  72 <title>An Example Program</title>
  73
  74 <para>An example will make things clear.  Consider the following C program
  75 (annotated with line numbers) which allocates a number of different blocks
  76 on the heap.</para>
  77
  78 <screen><![CDATA[
  79  1      #include <stdlib.h>
  80  2
  81  3      void g(void)
  82  4      {
  83  5         malloc(4000);
  84  6      }
  85  7
  86  8      void f(void)
  87  9      {
  88 10         malloc(2000);
  89 11         g();
  90 12      }
  91 13
  92 14      int main(void)
  93 15      {
  94 16         int i;
  95 17         int* a[10];
  96 18
  97 19         for (i = 0; i < 10; i++) {
  98 20            a[i] = malloc(1000);
  99 21         }
 100 22
 101 23         f();
 102 24
 103 25         g();
 104 26
 105 27         for (i = 0; i < 10; i++) {
 106 28            free(a[i]);
 107 29         }
 108 30
 109 31         return 0;
 110 32      }
 111 ]]></screen>
 112
 113 </sect2>
 114
 115
 116 <sect2 id="ms-manual.running-massif" xreflabel="Running Massif">
 117 <title>Running Massif</title>
 118
 119 <para>To gather heap profiling information about the program
 120 <computeroutput>prog</computeroutput>, type:</para>
 121 <screen><![CDATA[
 122 valgrind --tool=massif prog
 123 ]]></screen>
 124
 125 <para>The program will execute (slowly).  Upon completion, no summary
 126 statistics are printed to Valgrind's commentary;  all of Massif's profiling
 127 data is written to a file.  By default, this file is called
 128 <filename>massif.out.&lt;pid&gt;</filename>, where
 129 <filename>&lt;pid&gt;</filename> is the process ID, although this filename
 130 can be changed with the <option>--massif-out-file</option> option.</para>
 131
 132 </sect2>
 133
 134
 135 <sect2 id="ms-manual.running-ms_print" xreflabel="Running ms_print">
 136 <title>Running ms_print</title>
 137
 138 <para>To see the information gathered by Massif in an easy-to-read form, use
 139 ms_print.  If the output file's name is
 140 <filename>massif.out.12345</filename>, type:</para>
 141 <screen><![CDATA[
 142 ms_print massif.out.12345]]></screen>
 143
 144 <para>ms_print will produce (a) a graph showing the memory consumption over
 145 the program's execution, and (b) detailed information about the responsible
 146 allocation sites at various points in the program, including the point of
 147 peak memory allocation.  The use of a separate script for presenting the
 148 results is deliberate:  it separates the data gathering from its
 149 presentation, and means that new methods of presenting the data can be added in
 150 the future.</para>
 151
 152 </sect2>
 153
 154
 155 <sect2 id="ms-manual.theoutputpreamble" xreflabel="The Output Preamble">
 156 <title>The Output Preamble</title>
 157
 158 <para>After running this program under Massif, the first part of ms_print's
 159 output contains a preamble which just states how the program, Massif and
 160 ms_print were each invoked:</para>
 161
 162 <screen><![CDATA[
 163 --------------------------------------------------------------------------------
 164 Command:            example
 165 Massif arguments:   (none)
 166 ms_print arguments: massif.out.12797
 167 --------------------------------------------------------------------------------
 168 ]]></screen>
 169
 170 </sect2>
 171
 172
 173 <sect2 id="ms-manual.theoutputgraph" xreflabel="The Output Graph">
 174 <title>The Output Graph</title>
 175
 176 <para>The next part is a graph that shows how memory consumption varied
 177 as the program executed:</para>
 178
 179 <screen><![CDATA[
 180     KB
 181 19.63^                                                                       #
 182      |                                                                       #
 183      |                                                                       #
 184      |                                                                       #
 185      |                                                                       #
 186      |                                                                       #
 187      |                                                                       #
 188      |                                                                       #
 189      |                                                                       #
 190      |                                                                       #
 191      |                                                                       #
 192      |                                                                       #
 193      |                                                                       #
 194      |                                                                       #
 195      |                                                                       #
 196      |                                                                       #
 197      |                                                                       #
 198      |                                                                      :#
 199      |                                                                      :#
 200      |                                                                      :#
 201    0 +----------------------------------------------------------------------->ki
           0                                                                   113.4
 202
 203
 204 Number of snapshots: 25
 205  Detailed snapshots: [9, 14 (peak), 24]
 206 ]]></screen>
 207
 208 <para>Why is most of the graph empty, with only a couple of bars at the very
 209 end?  By default, Massif uses "instructions executed" as the unit of time.
 210 For very short-run programs such as the example, most of the executed
 211 instructions involve the loading and dynamic linking of the program.  The
 212 execution of <computeroutput>main</computeroutput> (and thus the heap
 213 allocations) occurs only at the very end.  For a short-running program like
 214 this, we can use the <option>--time-unit=B</option> option
 215 to specify that we want the time unit to instead be the number of bytes
 216 allocated/deallocated on the heap and stack(s).</para>
 217
 218 <para>If we re-run the program under Massif with this option, and then
 219 re-run ms_print, we get this more useful graph:</para>
 220
 221 <screen><![CDATA[
 222 19.63^                                               ###
 223      |                                               #
 224      |                                               #  ::
 225      |                                               #  : :::
 226      |                                      :::::::::#  : :  ::
 227      |                                      :        #  : :  : ::
 228      |                                      :        #  : :  : : :::
 229      |                                      :        #  : :  : : :  ::
 230      |                            :::::::::::        #  : :  : : :  : :::
 231      |                            :         :        #  : :  : : :  : :  ::
 232      |                        :::::         :        #  : :  : : :  : :  : ::
 233      |                     @@@:   :         :        #  : :  : : :  : :  : : @
 234      |                   ::@  :   :         :        #  : :  : : :  : :  : : @
 235      |                :::: @  :   :         :        #  : :  : : :  : :  : : @
 236      |              :::  : @  :   :         :        #  : :  : : :  : :  : : @
 237      |            ::: :  : @  :   :         :        #  : :  : : :  : :  : : @
 238      |         :::: : :  : @  :   :         :        #  : :  : : :  : :  : : @
 239      |       :::  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
 240      |    :::: :  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
 241      |  :::  : :  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
 242    0 +----------------------------------------------------------------------->KB
           0                                                                   29.48
 243
 244 Number of snapshots: 25
 245  Detailed snapshots: [9, 14 (peak), 24]
 246 ]]></screen>
 247
 248 <para>The size of the graph can be changed with ms_print's
 249 <option>--x</option> and <option>--y</option> options.  Each vertical bar
 250 represents a snapshot, i.e. a measurement of the memory usage at a certain
 251 point in time.  If the next snapshot is more than one column away, a
 252 horizontal line of characters is drawn from the top of the snapshot to just
 253 before the next snapshot column.  The text at the bottom shows that 25
 254 snapshots were taken for this program, which is one per heap
 255 allocation/deallocation, plus a couple of extras.  Massif starts by taking
 256 snapshots for every heap allocation/deallocation, but as a program runs for
 257 longer, it takes snapshots less frequently.  It also discards older
 258 snapshots as the program goes on;  when it reaches the maximum number of
 259 snapshots (100 by default, although changeable with the
 260 <option>--max-snapshots</option> option) half of them are
 261 deleted.  This means that a reasonable number of snapshots are always
 262 maintained.</para>
 263
 264 <para>Most snapshots are <emphasis>normal</emphasis>, and only basic
 265 information is recorded for them.  Normal snapshots are represented in the
 266 graph by bars consisting of ':' characters.</para>
 267
 268 <para>Some snapshots are <emphasis>detailed</emphasis>.  Information about
 269 where allocations happened is recorded for these snapshots, as we will see
 270 shortly.  Detailed snapshots are represented in the graph by bars consisting
 271 of '@' characters.  The text at the bottom shows that 3 detailed
 272 snapshots were taken for this program (snapshots 9, 14 and 24).  By default,
 273 every 10th snapshot is detailed, although this can be changed via the
 274 <option>--detailed-freq</option> option.</para>
 275
 276 <para>Finally, there is at most one <emphasis>peak</emphasis> snapshot.  The
 277 peak snapshot is a detailed snapshot, and records the point where memory
 278 consumption was greatest.  The peak snapshot is represented in the graph by
 279 a bar consisting of '#' characters.  The text at the bottom shows
 280 that snapshot 14 was the peak.</para>
 281
 282 <para>Massif's determination of when the peak occurred can be wrong, for
 283 two reasons.</para>
 284
 285 <itemizedlist>
 286   <listitem><para>Peak snapshots are only ever taken after a deallocation
 287   happens.  This avoids lots of unnecessary peak snapshot recordings
 288   (imagine what happens if your program allocates a lot of heap blocks in
 289   succession, hitting a new peak every time).  But it means that if your
 290   program never deallocates any blocks, no peak will be recorded.  It also
 291   means that if your program does deallocate blocks but later allocates to a
 292   higher peak without subsequently deallocating, the reported peak will be
 293   too low.
 294   </para>
 295   </listitem>
 296
 297   <listitem><para>Even with this behaviour, recording the peak accurately
 298   is slow.  So by default Massif records a peak whose size is within 1% of
 299   the size of the true peak.  This inaccuracy in the peak measurement can be
 300   changed with the <option>--peak-inaccuracy</option> option.</para>
 301   </listitem>
 302 </itemizedlist>
 303
 304 <para>The following graph is from an execution of Konqueror, the KDE web
 305 browser.  It shows what graphs for larger programs look like.</para>
 306 <screen><![CDATA[
 307     MB
 308 3.952^                                                                    #
 309      |                                                                   @#:
 310      |                                                                 :@@#:
 311      |                                                            @@::::@@#:
 312      |                                                            @ :: :@@#::
 313      |                                                          @@@ :: :@@#::
 314      |                                                       @@:@@@ :: :@@#::
 315      |                                                    :::@ :@@@ :: :@@#::
 316      |                                                    : :@ :@@@ :: :@@#::
 317      |                                                  :@: :@ :@@@ :: :@@#::
 318      |                                                @@:@: :@ :@@@ :: :@@#:::
 319      |                           :       ::         ::@@:@: :@ :@@@ :: :@@#:::
 320      |                        :@@:    ::::: ::::@@@:::@@:@: :@ :@@@ :: :@@#:::
 321      |                     ::::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
 322      |                    @: ::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
 323      |                    @: ::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
 324      |                    @: ::@@:::::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
 325      |                ::@@@: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
 326      |             :::::@ @: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
 327      |           @@:::::@ @: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
 328    0 +----------------------------------------------------------------------->Mi
 329      0                                                                   626.4
 330
 331 Number of snapshots: 63
 332  Detailed snapshots: [3, 4, 10, 11, 15, 16, 29, 33, 34, 36, 39, 41,
 333                       42, 43, 44, 49, 50, 51, 53, 55, 56, 57 (peak)]
 334 ]]></screen>
 335
 336 <para>Note that the larger size units are KB, MB, GB, etc.  As is typical
 337 for memory measurements, these are based on a multiplier of 1024, rather
 338 than the standard SI multiplier of 1000.  Strictly speaking, they should be
 339 written KiB, MiB, GiB, etc.</para>
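
<para>As a quick check of the 1024-based units, the following Python sketch
converts two byte counts from the example into the figures shown on the graph
axes: the peak total of 20,104 bytes and the final time of 30,184 bytes, both
taken from the snapshot details for the <option>--time-unit=B</option> run.
It is not part of Massif or ms_print, and the helper name
<function>to_kb</function> is made up for illustration.</para>

<screen><![CDATA[
```python
# Illustrative only: convert byte counts to the 1024-based units on the graph.
def to_kb(num_bytes):
    """Return num_bytes in units of 1024 bytes, rounded to two decimals."""
    return round(num_bytes / 1024, 2)


peak_total_bytes = 20104   # total(B) of the peak snapshot in the example
final_time_bytes = 30184   # time(B) of the last snapshot with --time-unit=B

print(to_kb(peak_total_bytes))   # 19.63, the top of the y-axis
print(to_kb(final_time_bytes))   # 29.48, the right end of the x-axis
```
]]></screen>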
 340
 341 </sect2>
 342
 343
 344 <sect2 id="ms-manual.thesnapshotdetails" xreflabel="The Snapshot Details">
 345 <title>The Snapshot Details</title>
 346
 347 <para>Returning to our example, the graph is followed by the detailed
 348 information for each snapshot.  The first nine snapshots are normal, so only
 349 a small amount of information is recorded for each one:</para>
 350 <screen><![CDATA[
 351 --------------------------------------------------------------------------------
 352   n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
 353 --------------------------------------------------------------------------------
 354   0              0                0                0             0            0
 355   1          1,008            1,008            1,000             8            0
 356   2          2,016            2,016            2,000            16            0
 357   3          3,024            3,024            3,000            24            0
 358   4          4,032            4,032            4,000            32            0
 359   5          5,040            5,040            5,000            40            0
 360   6          6,048            6,048            6,000            48            0
 361   7          7,056            7,056            7,000            56            0
 362   8          8,064            8,064            8,000            64            0
 363 ]]></screen>
 364
 365 <para>Each normal snapshot records several things.</para>
 366
 367 <itemizedlist>
 368   <listitem><para>Its number.</para></listitem>
 369
 370   <listitem><para>The time it was taken. In this case, the time unit is
 371   bytes, due to the use of
 372   <option>--time-unit=B</option>.</para></listitem>
 373
 374   <listitem><para>The total memory consumption at that point.</para></listitem>
 375
 376   <listitem><para>The number of useful heap bytes allocated at that point.
 377   This reflects the number of bytes asked for by the
 378   program.</para></listitem>
 379
 380   <listitem><para>The number of extra heap bytes allocated at that point.
 381   This reflects the number of bytes allocated in excess of what the program
 382   asked for.  There are two sources of extra heap bytes.</para>
 383
 384   <para>First, every heap block has administrative bytes associated with it.
 385   The exact number of administrative bytes depends on the details of the
 386   allocator.  By default Massif assumes 8 bytes per block, as can be seen
 387   from the example, but this number can be changed via the
 388   <option>--heap-admin</option> option.</para>
 389
 390   <para>Second, allocators often round up the number of bytes asked for to a
 391   multiple of some value, usually 8 or 16.  This is required to ensure that elements
 392   within the block are suitably aligned.  If N bytes are asked for, Massif
 393   rounds N up to the nearest multiple of the value specified by the
 394   <option><link linkend="opt.alignment">--alignment</link></option> option.
 395   </para></listitem>
 396
 397   <listitem><para>The size of the stack(s).  By default, stack profiling is
 398   off as it slows Massif down greatly.  Therefore, the stack column is zero
 399   in the example.  Stack profiling can be turned on with the
 400   <option>--stacks=yes</option> option.
 401
 402   </para></listitem>
 403 </itemizedlist>
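
<para>The two sources of extra heap bytes can be combined into a small model.
The Python sketch below is an illustration only, not Massif's implementation.
The function name <function>block_cost</function> is hypothetical, and an
alignment of 8 is assumed because it reproduces the numbers in the example
output; the value in effect on your system may differ.</para>

<screen><![CDATA[
```python
def block_cost(requested, heap_admin, alignment):
    """Return (useful, extra) bytes for one heap block.

    extra = administrative bytes + padding needed to round the request
    up to the next multiple of the alignment.
    """
    rounded = -(-requested // alignment) * alignment  # ceiling to a multiple
    return requested, heap_admin + (rounded - requested)


# Allocation sizes of the example program, in execution order:
# ten blocks from main, then f's block, then g's block twice.
sizes = [1000] * 10 + [2000, 4000, 4000]

useful = extra = 0
for size in sizes:
    u, e = block_cost(size, heap_admin=8, alignment=8)  # assumed values
    useful += u
    extra += e

print(useful, extra, useful + extra)   # 20000 104 20104

# With a 16-byte alignment, a 1000-byte request would also carry 8 bytes
# of padding on top of the 8 administrative bytes.
print(block_cost(1000, heap_admin=8, alignment=16))   # (1000, 16)
```
]]></screen>

<para>Under these assumptions the totals match the peak snapshot of the
example: 20,000 useful bytes plus 104 extra bytes.</para>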
 404
 405 <para>The next snapshot is detailed.  As well as the basic counts, it gives
 406 an allocation tree which indicates exactly which pieces of code were
 407 responsible for allocating heap memory:</para>
 408
 409 <screen><![CDATA[
 410   9          9,072            9,072            9,000            72            0
 411 99.21% (9,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
 412 ->99.21% (9,000B) 0x804841A: main (example.c:20)
 413 ]]></screen>
 414
 415 <para>The allocation tree can be read from the top down.  The first line
 416 indicates all heap allocation functions such as <function>malloc</function>
 417 and C++ <function>new</function>.  All heap allocations go through these
 418 functions, and so all 9,000 useful bytes (which is 99.21% of all allocated
 419 bytes) go through them.  But how were <function>malloc</function> and
 420 <function>new</function> called?  At this point, every allocation so far has been
 421 due to line 20 inside <function>main</function>, hence the second line in the tree.
 422 The <option>-></option> indicates that <function>main</function> (line 20) called
 423 <function>malloc</function>.</para>
 424
 425 <para>Let's see what the subsequent output shows happened next:</para>
 426
 427 <screen><![CDATA[
 428 --------------------------------------------------------------------------------
 429   n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
 430 --------------------------------------------------------------------------------
 431  10         10,080           10,080           10,000            80            0
 432  11         12,088           12,088           12,000            88            0
 433  12         16,096           16,096           16,000            96            0
 434  13         20,104           20,104           20,000           104            0
 435  14         20,104           20,104           20,000           104            0
 436 99.48% (20,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
 437 ->49.74% (10,000B) 0x804841A: main (example.c:20)
 438 |
 439 ->39.79% (8,000B) 0x80483C2: g (example.c:5)
 440 | ->19.90% (4,000B) 0x80483E2: f (example.c:11)
 441 | | ->19.90% (4,000B) 0x8048431: main (example.c:23)
 442 | |
 443 | ->19.90% (4,000B) 0x8048436: main (example.c:25)
 444 |
 445 ->09.95% (2,000B) 0x80483DA: f (example.c:10)
 446   ->09.95% (2,000B) 0x8048431: main (example.c:23)
 447 ]]></screen>
 448
 449 <para>The first four snapshots are similar to the previous ones.  But then
 450 the global allocation peak is reached, and a detailed snapshot (number 14)
 451 is taken.  Its allocation tree shows that 20,000B of useful heap memory has
 452 been allocated, and the lines and arrows indicate that this is from three
 453 different code locations: line 20, which is responsible for 10,000B
 454 (49.74%);  line 5, which is responsible for 8,000B (39.79%); and line 10,
 455 which is responsible for 2,000B (9.95%).</para>
 456
 457 <para>We can then drill down further in the allocation tree.  For example,
 458 of the 8,000B asked for by line 5, half of it was due to a call from line
 459 11, and half was due to a call from line 25.</para>
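
<para>The percentages in the allocation tree can be reproduced from the
snapshot's byte counts: each is the entry's useful bytes divided by the
snapshot's total bytes, which includes the extra heap bytes.  The Python sketch
below is a hedged illustration; the helper <function>share</function> is
hypothetical, and its two-decimal, zero-padded formatting merely imitates the
output shown above rather than quoting ms_print's code.</para>

<screen><![CDATA[
```python
def share(entry_bytes, total_bytes):
    """Format entry_bytes as a percentage of total_bytes, e.g. '09.95%'."""
    return f"{100 * entry_bytes / total_bytes:05.2f}%"


total = 20104                      # total(B) of peak snapshot 14
top_level = {
    "main (example.c:20)": 10000,
    "g (example.c:5)": 8000,
    "f (example.c:10)": 2000,
}

print(share(sum(top_level.values()), total))   # 99.48%
for location, size in top_level.items():
    print(share(size, total), location)        # 49.74%, 39.79%, 09.95%

# Drilling down: line 5's 8,000B splits evenly between its two callers.
print(share(4000, total))                      # 19.90%
```
]]></screen>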
 460
 461 <para>In short, Massif collates the stack trace of every single allocation
 462 point in the program into a single tree, which gives a complete picture at
 463 a particular point in time of how and why all heap memory was
 464 allocated.</para>
 465
 466 <para>Note that the tree entries correspond not to functions, but to
 467 individual code locations.  For example, if function <function>A</function>
 468 calls <function>malloc</function>, and function <function>B</function> calls
 469 <function>A</function> twice, once on line 10 and once on line 11, then
 470 the two calls will result in two distinct stack traces in the tree.  In
 471 contrast, if <function>B</function> calls <function>A</function> repeatedly
 472 from line 15 (e.g. due to a loop), then each of those calls will be
 473 represented by the same stack trace in the tree.</para>
 474
 475 <para>Note also that each tree entry with children in the example satisfies an
 476 invariant: the entry's size is equal to the sum of its children's sizes.
 477 For example, the first entry has size 20,000B, and its children have sizes
 478 10,000B, 8,000B, and 2,000B.  In general, this invariant almost always
 479 holds.  However, in rare circumstances stack traces can be malformed, in
 480 which case a stack trace can be a sub-trace of another stack trace.  This
 481 means that some entries in the tree may not satisfy the invariant -- the
 482 entry's size will be greater than the sum of its children's sizes.  This is
 483 not a big problem, but could make the results confusing.  Massif can
 484 sometimes detect when this happens;  if it does, it issues a warning:</para>
 485
 486 <screen><![CDATA[
 487 Warning: Malformed stack trace detected.  In Massif's output,
 488          the size of an entry's child entries may not sum up
 489          to the entry's size as they normally do.
 490 ]]></screen>
 491
 492 <para>However, Massif does not detect and warn about every such occurrence.
 493 Fortunately, malformed stack traces are rare in practice.</para>
 494
 495 <para>Returning now to ms_print's output, the final part is similar:</para>
 496
 497 <screen><![CDATA[
 498 --------------------------------------------------------------------------------
 499   n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
 500 --------------------------------------------------------------------------------
 501  15         21,112           19,096           19,000            96            0
 502  16         22,120           18,088           18,000            88            0
 503  17         23,128           17,080           17,000            80            0
 504  18         24,136           16,072           16,000            72            0
 505  19         25,144           15,064           15,000            64            0
 506  20         26,152           14,056           14,000            56            0
 507  21         27,160           13,048           13,000            48            0
 508  22         28,168           12,040           12,000            40            0
 509  23         29,176           11,032           11,000            32            0
 510  24         30,184           10,024           10,000            24            0
 511 99.76% (10,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
 512 ->79.81% (8,000B) 0x80483C2: g (example.c:5)
 513 | ->39.90% (4,000B) 0x80483E2: f (example.c:11)
 514 | | ->39.90% (4,000B) 0x8048431: main (example.c:23)
 515 | |
 516 | ->39.90% (4,000B) 0x8048436: main (example.c:25)
 517 |
 518 ->19.95% (2,000B) 0x80483DA: f (example.c:10)
 519 | ->19.95% (2,000B) 0x8048431: main (example.c:23)
 520 |
 521 ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
 522 ]]></screen>
 523
 524 <para>The final detailed snapshot shows how the heap looked at termination.
 525 The 00.00% entry represents the code locations for which memory was
 526 allocated and then freed (line 20 in this case, the memory for which was
 527 freed on line 28).  However, no code location details are given for this
 528 entry;  by default, Massif only records the details for code locations
 529 responsible for more than 1% of useful memory bytes, and ms_print likewise
 530 only prints the details for code locations responsible for more than 1%.
 531 The entries that do not meet this threshold are aggregated.  This avoids
 532 filling up the output with large numbers of unimportant entries.  The
 533 thresholds can be changed with the
 534 <option>--threshold</option> option that both Massif and
 535 ms_print support.</para>
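
<para>The effect of the threshold can be sketched as follows.  This Python
fragment is an assumption-laden illustration of the aggregation described
above, not the logic Massif or ms_print actually uses; the function name
<function>apply_threshold</function> and the dictionary layout are invented
for the example.</para>

<screen><![CDATA[
```python
def apply_threshold(entries, total_bytes, threshold_pct=1.0):
    """Split entries into those shown individually and an aggregated rest.

    An entry is shown only if it accounts for more than threshold_pct
    percent of total_bytes; the others are summed into one remainder.
    """
    shown = {}
    hidden_bytes = 0
    hidden_count = 0
    for location, size in entries.items():
        if 100 * size / total_bytes > threshold_pct:
            shown[location] = size
        else:
            hidden_bytes += size
            hidden_count += 1
    return shown, hidden_bytes, hidden_count


# Top-level entries of the final snapshot (total 10,024 bytes).
final_snapshot = {
    "g (example.c:5)": 8000,
    "f (example.c:10)": 2000,
    "main (example.c:20)": 0,    # allocated on line 20, freed on line 28
}

shown, hidden_bytes, hidden_count = apply_threshold(final_snapshot, 10024)
print(shown)                      # the two entries above 1%
print(hidden_bytes, hidden_count) # 0 1, the aggregated 00.00% line
```
]]></screen>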
 536
 537 </sect2>
 538
 539
 540 <sect2 id="ms-manual.forkingprograms" xreflabel="Forking Programs">
 541 <title>Forking Programs</title>
 542 <para>If your program forks, the child will inherit all the profiling data that
 543 has been gathered for the parent.</para>
 544
 545 <para>If the output file format string (controlled by
 546 <option>--massif-out-file</option>) does not contain <option>%p</option>, then
 547 the outputs from the parent and child will be intermingled in a single output
 548 file, which will almost certainly make it unreadable by ms_print.</para>
 549 </sect2>
 550
 551
 552 <sect2 id="ms-manual.not-measured"
 553        xreflabel="Measuring All Memory in a Process">
 554 <title>Measuring All Memory in a Process</title>
 555 <para>
 556 It is worth emphasising that by default Massif measures only heap memory, i.e.
 557 memory allocated with
 558 <function>malloc</function>,
 559 <function>calloc</function>,
 560 <function>realloc</function>,
 561 <function>memalign</function>,
 562 <function>new</function>,
 563 <function>new[]</function>,
 564 and a few other, similar functions.  (And it can optionally measure stack
 565 memory, of course.)  This means it does <emphasis>not</emphasis> directly
 566 measure memory allocated with lower-level system calls such as
 567 <function>mmap</function>,
 568 <function>mremap</function>, and
 569 <function>brk</function>.
 570 </para>
 571
 572 <para>
 573 Heap allocation functions such as <function>malloc</function> are built on
 574 top of these system calls.  For example, when needed, an allocator will
 575 typically call <function>mmap</function> to allocate a large chunk of
 576 memory, and then hand over pieces of that memory chunk to the client program
 577 in response to calls to <function>malloc</function> et al.  Massif directly
 578 measures only these higher-level <function>malloc</function> et al calls,
 579 not the lower-level system calls.
 580 </para>
 581
 582 <para>
 583 Furthermore, a client program may use these lower-level system calls
 584 directly to allocate memory.  By default, Massif does not measure these.  Nor
 585 does it measure the size of code, data and BSS segments.  Therefore, the
 586 numbers reported by Massif may be significantly smaller than those reported by
 587 tools such as <filename>top</filename> that measure a program's total size in
 588 memory.
 589 </para>
 590
 591 <para>
 592 However, if you wish to measure <emphasis>all</emphasis> the memory used by
 593 your program, you can use the <option>--pages-as-heap=yes</option> option.  When this
 594 option is enabled, Massif's normal heap block profiling is replaced by
 595 lower-level page profiling.  Every page allocated via
 596 <function>mmap</function> and similar system calls is treated as a distinct
 597 block.  This means that code, data and BSS segments are all measured, as they
 598 are just memory pages.  Even the stack is measured, since it is ultimately
 599 allocated (and extended when necessary) via <function>mmap</function>;  for
 600 this reason <option>--stacks=yes</option> is not allowed in conjunction with
 601 <option>--pages-as-heap=yes</option>.
 602 </para>
 603
 604 <para>
 605 When <option>--pages-as-heap=yes</option> is used, ms_print's output is
 606 mostly unchanged.  One difference is that the start of each detailed snapshot
 607 says:
 608 </para>
 609
 610 <screen><![CDATA[
 611 (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
 612 ]]></screen>
 613
 614 <para>instead of the usual:</para>
 615
 616 <screen><![CDATA[
 617 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
 618 ]]></screen>
 619
 620 <para>
 621 The stack traces in the output may be more difficult to read, and interpreting
 622 them may require a detailed understanding of the lower levels of a program,
 623 such as its memory allocators.  For some programs, however, having complete
 624 information about memory usage can be very useful.
 625 </para>
 626
 627 </sect2>
 628
 629
 630 <sect2 id="ms-manual.acting" xreflabel="Action on Massif's Information">
 631 <title>Acting on Massif's Information</title>
 632 <para>Massif's information is generally fairly easy to act upon.  The
 633 obvious place to start looking is the peak snapshot.</para>
 634
 635 <para>It can also be useful to look at the overall shape of the graph, to
 636 see if memory usage climbs and falls as you expect;  spikes in the graph
 637 might be worth investigating.</para>
 638
 639 <para>The detailed snapshots can get quite large.  It is worth viewing them
 640 in a very wide window.   It's also a good idea to view them with a text
 641 editor.  That makes it easy to scroll up and down while keeping the cursor
 642 in a particular column, which makes following the allocation chains easier.
 643 </para>
 644
 645 </sect2>
 646
 647 </sect1>
 648
 649
 650 <sect1 id="ms-manual.using-visualizer" xreflabel="Using massif-visualizer">
 651 <title>Using massif-visualizer</title>
 652
 653 <para>
 654 <ulink url="https://github.com/KDE/massif-visualizer">massif-visualizer</ulink>
 655 is a graphical viewer for Massif data that is often easier to use than
 656 ms_print. massif-visualizer is not shipped with Valgrind, but is available in
 657 various places online.
 658 </para>
 659
 660 </sect1>
 661
 662
 663 <sect1 id="ms-manual.options" xreflabel="Massif Command-line Options">
 664 <title>Massif Command-line Options</title>
 665
 666 <para>Massif-specific command-line options are:</para>
 667
 668 <!-- start of xi:include in the manpage -->
 669 <variablelist id="ms.opts.list">
 670
 671   <varlistentry id="opt.heap" xreflabel="--heap">
 672     <term>
 673       <option><![CDATA[--heap=<yes|no> [default: yes] ]]></option>
 674     </term>
 675     <listitem>
 676       <para>Specifies whether heap profiling should be done.</para>
 677     </listitem>
 678   </varlistentry>
 679
 680   <varlistentry id="opt.heap-admin" xreflabel="--heap-admin">
 681     <term>
 682       <option><![CDATA[--heap-admin=<size> [default: 8] ]]></option>
 683     </term>
 684     <listitem>
 685       <para>If heap profiling is enabled, gives the number of administrative
 686       bytes per block to use.  This should be an estimate of the average,
 687       since it may vary.  For example, the allocator used by
 688       glibc on Linux requires somewhere between 4 and
 689       15 bytes per block, depending on various factors.  That allocator also
 690       requires admin space for freed blocks, but Massif cannot
 691       account for this.</para>
 692     </listitem>
 693   </varlistentry>
 694
 695   <varlistentry id="opt.stacks" xreflabel="--stacks">
 696     <term>
 697       <option><![CDATA[--stacks=<yes|no> [default: no] ]]></option>
 698     </term>
 699     <listitem>
 700       <para>Specifies whether stack profiling should be done.  This option
 701       slows Massif down greatly, and so is off by default.  Note that Massif
 702       assumes that the main stack has size zero at start-up.  This is not
 703       true, but doing otherwise accurately is difficult.  Furthermore,
 704       starting at zero better indicates the size of the part of the main
 705       stack that a user program actually has control over.</para>
 706       <para>If you give at least 4 <option>-v</option> verbosity arguments,
 707         then Massif produces a trace for each stack increase and decrease.
 708         The stack increase trace contains the instruction pointer (IP) address at which the stack was increased.
 709         Note that to get a fully precise IP address, you must specify the options
 710         <option>--px-default=unwindregs-at-mem-access
 711           --px-file-backed=unwindregs-at-mem-access</option>.
 712       </para>
 713     </listitem>
 714   </varlistentry>
 715
 716   <varlistentry id="opt.pages-as-heap" xreflabel="--pages-as-heap">
 717     <term>
 718       <option><![CDATA[--pages-as-heap=<yes|no> [default: no] ]]></option>
 719     </term>
 720     <listitem>
 721       <para>Tells Massif to profile memory at the page level rather
 722         than at the malloc'd block level.  See above for details.
 723       </para>
 724     </listitem>
 725   </varlistentry>
 726
 727   <varlistentry id="opt.depth" xreflabel="--depth">
 728     <term>
 729       <option><![CDATA[--depth=<number> [default: 30] ]]></option>
 730     </term>
 731     <listitem>
 732       <para>Maximum depth of the allocation trees recorded for detailed
 733       snapshots.  Increasing it will make Massif run somewhat more slowly,
 734       use more memory, and produce bigger output files.</para>
 735     </listitem>
 736   </varlistentry>
 737
 738   <varlistentry id="opt.alloc-fn" xreflabel="--alloc-fn">
 739     <term>
 740       <option><![CDATA[--alloc-fn=<name> ]]></option>
 741     </term>
 742     <listitem>
 743       <para>Functions specified with this option will be treated as though
 744       they were a heap allocation function such as
 745       <function>malloc</function>.  This is useful for functions that are
 746       wrappers to <function>malloc</function> or <function>new</function>,
 747       which can fill up the allocation trees with uninteresting information.
 748       This option can be specified multiple times on the command line, to
 749       name multiple functions.</para>
 750
 751       <para>Note that the named function will only be treated this way if it is
 752       the top entry in a stack trace, or just below another function treated
 753       this way.  For example, if you have a function
 754       <function>malloc1</function> that wraps <function>malloc</function>,
 755       and <function>malloc2</function> that wraps
 756       <function>malloc1</function>, just specifying
 757       <option>--alloc-fn=malloc2</option> will have no effect.  You need to
 758       specify <option>--alloc-fn=malloc1</option> as well.  This is a little
 759       inconvenient, but the reason is that checking for allocation functions
 760       is slow, and it saves a lot of time if Massif can stop looking through
 761       the stack trace entries as soon as it finds one that doesn't match
 762       rather than having to continue through all the entries.</para>
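
           <para>As a sketch, both functions in the wrapper example above could be
           named as follows, where <computeroutput>./myprog</computeroutput> is a
           placeholder for your own program:
     <screen><![CDATA[
     valgrind --tool=massif --alloc-fn=malloc1 --alloc-fn=malloc2 ./myprog
     ]]></screen>
           </para>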
 763
 764       <para>Note that C++ names are demangled.  Note also that overloaded
 765       C++ names must be written in full.  Single quotes may be necessary to
 766       prevent the shell from breaking them up.  For example:
 767 <screen><![CDATA[
 768 --alloc-fn='operator new(unsigned, std::nothrow_t const&)'
 769 ]]></screen>
 770       Arguments of type <computeroutput>size_t</computeroutput> need to be replaced
 771       with <computeroutput>unsigned long</computeroutput> on 64-bit platforms and <computeroutput>unsigned</computeroutput>
 772       on 32-bit platforms.
 773       </para>
 774
 775       <para><option>--alloc-fn</option> will work with inline functions.
 776       Inline function names are not mangled, which means that you only need
 777       to provide the function name and not the argument list.
 778       </para>
 779
 780       <para><option>--alloc-fn</option> does not support wildcards.
 781       </para>
 782       </listitem>
 783   </varlistentry>
 784
 785   <varlistentry id="opt.ignore-fn" xreflabel="--ignore-fn">
 786     <term>
 787       <option><![CDATA[--ignore-fn=<name> ]]></option>
 788     </term>
 789     <listitem>
 790       <para>Any direct heap allocation (i.e. a call to
 791       <function>malloc</function>, <function>new</function>, etc, or a call
 792       to a function named by an <option>--alloc-fn</option>
 793       option) that occurs in a function specified by this option will be
 794       ignored.  This is mostly useful for testing purposes.  This option can
 795       be specified multiple times on the command line, to name multiple
 796       functions.
 797       </para>
 798
 799       <para>Any <function>realloc</function> of an ignored block will
 800       also be ignored, even if the <function>realloc</function> call does
 801       not occur in an ignored function.  This avoids the possibility of
 802       negative heap sizes if ignored blocks are shrunk with
 803       <function>realloc</function>.
 804       </para>
 805
 806       <para>The rules for writing C++ function names are the same as
 807       for <option>--alloc-fn</option> above.
 808       </para>
 809       </listitem>
 810   </varlistentry>
 811
 812   <varlistentry id="opt.threshold" xreflabel="--threshold">
 813     <term>
 814       <option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option>
 815     </term>
 816     <listitem>
 817       <para>The significance threshold for heap allocations, as a
 818       percentage of total memory size.  Allocation tree entries that account
 819       for less than this will be aggregated.  Note that this should be
 820       specified in tandem with ms_print's option of the same name.</para>
 821     </listitem>
 822   </varlistentry>
 823
 824   <varlistentry id="opt.peak-inaccuracy" xreflabel="--peak-inaccuracy">
 825     <term>
 826       <option><![CDATA[--peak-inaccuracy=<m.n> [default: 1.0] ]]></option>
 827     </term>
 828     <listitem>
 829       <para>Massif does not necessarily record the actual global memory
 830       allocation peak;  by default it records a peak only when the global
 831       memory allocation size exceeds the previous peak by at least 1.0%.
 832       This is because there can be many local allocation peaks along the way,
 833       and doing a detailed snapshot for every one would be expensive and
 834       wasteful, as all but one of them will be later discarded.  This
 835       inaccuracy can be changed (even to 0.0%) via this option, but Massif
 836       will run drastically slower as the number approaches zero.</para>
 837     </listitem>
 838   </varlistentry>
 839
 840   <varlistentry id="opt.time-unit" xreflabel="--time-unit">
 841     <term>
 842       <option><![CDATA[--time-unit=<i|ms|B> [default: i] ]]></option>
 843     </term>
 844     <listitem>
 845       <para>The time unit used for the profiling.  There are three
 846       possibilities: instructions executed (i), which is good for most
 847       cases; real (wallclock) time (ms, i.e. milliseconds), which is
 848       sometimes useful; and bytes allocated/deallocated on the heap and/or
 849       stack (B), which is useful for very short-run programs, and for
 850       testing purposes, because it is the most reproducible across different
 851       machines.</para> </listitem>
 852   </varlistentry>
 853
 854   <varlistentry id="opt.detailed-freq" xreflabel="--detailed-freq">
 855     <term>
 856       <option><![CDATA[--detailed-freq=<n> [default: 10] ]]></option>
 857     </term>
 858     <listitem>
 859       <para>Frequency of detailed snapshots.  With
 860       <option>--detailed-freq=1</option>, every snapshot is
 861       detailed.</para>
 862     </listitem>
 863   </varlistentry>
 864
 865   <varlistentry id="opt.max-snapshots" xreflabel="--max-snapshots">
 866     <term>
 867       <option><![CDATA[--max-snapshots=<n> [default: 100] ]]></option>
 868     </term>
 869     <listitem>
 870       <para>The maximum number of snapshots recorded.  If set to N, for all
 871       programs except very short-running ones, the final number of snapshots
 872       will be between N/2 and N.</para>
 873     </listitem>
 874   </varlistentry>
 875
 876   <varlistentry id="opt.massif-out-file" xreflabel="--massif-out-file">
 877     <term>
 878       <option><![CDATA[--massif-out-file=<file> [default: massif.out.%p] ]]></option>
 879     </term>
 880     <listitem>
 881       <para>Write the profile data to <computeroutput>file</computeroutput>
 882       rather than to the default output file,
 883       <computeroutput>massif.out.&lt;pid&gt;</computeroutput>.  The
 884       <option>%p</option> and <option>%q</option> format specifiers can be
 885       used to embed the process ID and/or the contents of an environment
 886       variable in the name, as is the case for the core option
 887       <option><link linkend="opt.log-file">--log-file</link></option>.
 888       </para>
 889     </listitem>
 890   </varlistentry>
 891
 892 </variablelist>
 893 <!-- end of xi:include in the manpage -->
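
     <para>
     As an illustration, several of these options might be combined as in the
     following sketch.  The program name <computeroutput>./myprog</computeroutput>
     and the particular values are placeholders chosen for the example, not
     recommendations:
     </para>

     <screen><![CDATA[
     valgrind --tool=massif --time-unit=B --detailed-freq=1 --max-snapshots=200 \
              --massif-out-file=massif.out.%q{USER}.%p ./myprog
     ]]></screen>

     <para>
     Here <option>%q{USER}</option> assumes that the
     <computeroutput>USER</computeroutput> environment variable is set;  the
     <option>%q{VAR}</option> syntax is the one documented for the core option
     <option>--log-file</option>.
     </para>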
 894
 895 </sect1>
 896
 897 <sect1 id="ms-manual.monitor-commands" xreflabel="Massif Monitor Commands">
 898 <title>Massif Monitor Commands</title>
 899 <para>The Massif tool provides monitor commands handled by the Valgrind
 900 gdbserver (see <xref linkend="manual-core-adv.gdbserver-commandhandling"/>).
 901 Valgrind's Python code provides GDB front end commands that make the Massif
 902 monitor commands easier to use (see
 903 <xref linkend="manual-core-adv.gdbserver-gdbmonitorfrontend"/>).  To launch a
 904 Massif monitor command via its GDB front end command, instead of prefixing
 905 the command with "monitor", you must use the GDB <varname>massif</varname>
 906 command (or its shorter alias <varname>ms</varname>).  The Massif GDB
 907 front end command is more flexible to use; for example, GDB can auto-complete
 908 the command.  In GDB, you can use <varname>help massif</varname> to get
 909 help about the Massif front end monitor commands, and you can
 910 use <varname>apropos massif</varname> to list all the commands mentioning the
 911 word "massif" in their name or online help.
 912 </para>
 913
 914 <itemizedlist>
 915   <listitem>
 916     <para><varname>snapshot [&lt;filename&gt;]</varname> requests
 917     to take a snapshot and save it in the given &lt;filename&gt;
 918     (default massif.vgdb.out).
 919     </para>
 920   </listitem>
 921   <listitem>
 922     <para><varname>detailed_snapshot [&lt;filename&gt;]</varname>
 923     requests to take a detailed snapshot and save it in the given
 924     &lt;filename&gt; (default massif.vgdb.out).
 925     </para>
 926   </listitem>
 927   <listitem>
 928     <para><varname>all_snapshots [&lt;filename&gt;]</varname>
 929     requests that all snapshots captured so far be saved in the given
 930     &lt;filename&gt; (default massif.vgdb.out).
 931     </para>
 932   </listitem>
 933   <listitem>
 934     <para><varname>xtmemory [&lt;filename&gt; default xtmemory.kcg.%p.%n]</varname>
 935       requests that Massif produce an xtree heap memory report.
 936       See <xref linkend="&vg-xtree-id;"/> for
 937       a detailed explanation about execution trees. </para>
 938   </listitem>
 939 </itemizedlist>
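
     <para>
     As a rough sketch, a session using these commands might look like the
     following.  The program name <computeroutput>./myprog</computeroutput> and
     the snapshot file name are placeholders.  The
     <option>--vgdb-error=0</option> and
     <computeroutput>target remote | vgdb</computeroutput> steps are the usual
     way of attaching GDB described in the core gdbserver documentation, not
     anything Massif-specific.  The last line assumes that the GDB front end
     commands are available in your GDB:
     </para>

     <screen><![CDATA[
     # Terminal 1: start the program under Massif and wait for GDB to connect.
     valgrind --tool=massif --vgdb-error=0 ./myprog

     # Terminal 2: attach GDB, then request snapshots.
     gdb ./myprog
     (gdb) target remote | vgdb
     (gdb) monitor detailed_snapshot my-snapshot.out
     (gdb) massif snapshot
     ]]></screen>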
 940 </sect1>
 941
 942 <sect1 id="ms-manual.clientreqs" xreflabel="Client requests">
 943 <title>Massif Client Requests</title>
 944
 945 <para>Massif does not have a <filename>massif.h</filename> file, but it does
 946 implement two of the core client requests:
 947 <function>VALGRIND_MALLOCLIKE_BLOCK</function> and
 948 <function>VALGRIND_FREELIKE_BLOCK</function>;  they are described in
 949 <xref linkend="manual-core-adv.clientreq"/>.
 950 </para>
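
     <para>
     The following is a minimal sketch of how a custom allocator might report
     its blocks through these two client requests.  The arena allocator and the
     names <function>arena_alloc</function> and
     <function>arena_free</function> are hypothetical and exist only for
     illustration.  The sketch assumes a C11 compiler and falls back to no-op
     macros when <filename>valgrind/valgrind.h</filename> is not installed, so
     that it still builds without Valgrind:
     </para>

     <programlisting><![CDATA[
     #include <assert.h>
     #include <stddef.h>
     #include <stdint.h>

     /* Use the real client-request macros when valgrind.h is installed;
        otherwise fall back to no-ops so that this sketch still builds. */
     #ifdef __has_include
     #  if __has_include(<valgrind/valgrind.h>)
     #    include <valgrind/valgrind.h>
     #  endif
     #endif
     #ifndef VALGRIND_MALLOCLIKE_BLOCK
     #  define VALGRIND_MALLOCLIKE_BLOCK(addr, sizeB, rzB, is_zeroed) ((void)0)
     #  define VALGRIND_FREELIKE_BLOCK(addr, rzB) ((void)0)
     #endif

     /* Hypothetical bump allocator that carves blocks out of a static arena. */
     #define ARENA_SIZE 4096
     static _Alignas(16) unsigned char arena[ARENA_SIZE];
     static size_t arena_used = 0;

     /* Returns a 16-byte-aligned block, or NULL if the request is empty
        or the arena is exhausted. */
     void *arena_alloc(size_t size)
     {
         size_t rounded = (size + 15u) & ~(size_t)15u;
         if (size == 0 || rounded < size || rounded > ARENA_SIZE - arena_used)
             return NULL;
         void *p = arena + arena_used;
         arena_used += rounded;
         assert(((uintptr_t)p & 15u) == 0);
         /* Report the block as if malloc had returned it: 0 red-zone bytes,
            and is_zeroed = 0 (no promise is made about the contents). */
         VALGRIND_MALLOCLIKE_BLOCK(p, size, 0, 0);
         return p;
     }

     /* A bump allocator cannot reuse the space, but the block is still
        reported as freed so that it leaves the profile. */
     void arena_free(void *p)
     {
         if (p != NULL)
             VALGRIND_FREELIKE_BLOCK(p, 0);
     }
     ]]></programlisting>

     <para>
     With the requests in place, blocks handed out by
     <function>arena_alloc</function> should be counted by Massif in the same
     way as blocks returned by <function>malloc</function>, even though the
     arena itself is static data that Massif does not measure by default.
     </para>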
 951
 952 </sect1>
 953
 954
 955 <sect1 id="ms-manual.ms_print-options" xreflabel="ms_print Command-line Options">
 956 <title>ms_print Command-line Options</title>
 957
 958 <para>ms_print's options are:</para>
 959
 960 <!-- start of xi:include in the manpage -->
 961 <variablelist id="ms_print.opts.list">
 962
 963   <varlistentry>
 964     <term>
 965       <option><![CDATA[-h --help ]]></option>
 966     </term>
 967     <listitem>
 968       <para>Show the help message.</para>
 969     </listitem>
 970   </varlistentry>
 971
 972   <varlistentry>
 973     <term>
 974       <option><![CDATA[--version ]]></option>
 975     </term>
 976     <listitem>
 977       <para>Show the version number.</para>
 978     </listitem>
 979   </varlistentry>
 980
 981   <varlistentry>
 982     <term>
 983       <option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option>
 984     </term>
 985     <listitem>
 986       <para>Same as Massif's <option>--threshold</option> option, but
 987       applied after profiling rather than during.</para>
 988     </listitem>
 989   </varlistentry>
 990
 991   <varlistentry>
 992     <term>
 993       <option><![CDATA[--x=<4..1000> [default: 72] ]]></option>
 994     </term>
 995     <listitem>
 996       <para>Width of the graph, in columns.</para>
 997     </listitem>
 998   </varlistentry>
 999
1000   <varlistentry>
1001     <term>
1002       <option><![CDATA[--y=<4..1000> [default: 20] ]]></option>
1003     </term>
1004     <listitem>
1005       <para>Height of the graph, in rows.</para>
1006     </listitem>
1007   </varlistentry>
1008
1009 </variablelist>
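
     <para>
     For example, the following sketch uses the same 0.5% threshold for both
     Massif and ms_print, and prints a larger graph.  The program name, the
     process ID in the file name and the chosen values are placeholders:
     </para>

     <screen><![CDATA[
     valgrind --tool=massif --threshold=0.5 ./myprog
     ms_print --threshold=0.5 --x=100 --y=30 massif.out.12345
     ]]></screen>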
1010
1011 </sect1>
1012
1013 <sect1 id="ms-manual.fileformat" xreflabel="fileformat">
1014 <title>Massif's Output File Format</title>
1015 <para>Massif's file format is plain text (i.e. not binary) and deliberately
1016 easy to read for both humans and machines.  Nonetheless, the exact format
1017 is not described here.  This is because the format is currently very
1018 Massif-specific.  In the future we hope to make the format more general, and
1019 thus suitable for possible use with other tools.  Once this has been done,
1020 the format will be documented here.</para>
1021
1022 </sect1>
1023
1024 </chapter>