1 <?xml version="1.0"?> <!-- -*- sgml -*- -->
2 <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
3 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
4 [ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
6 <chapter id="cl-format" xreflabel="Callgrind Format Specification">
7 <title>Callgrind Format Specification</title>
9 <para>This chapter describes the Callgrind Format, Version 1.</para>
<para>The format description is meant to let users understand the file
contents. More importantly, it enables authors of measurement or
visualization tools to write and read this format.</para>
15 <sect1 id="cl-format.overview" xreflabel="Overview">
16 <title>Overview</title>
<para>The profile data format is ASCII based.
It is written by Callgrind, and it is upward compatible
with the format used by Cachegrind (i.e. Cachegrind uses a subset). It can
be read by callgrind_annotate and KCachegrind.</para>
<para>This section gives an overview of format features and examples.
For detailed syntax, see the format reference.</para>
26 <sect2 id="cl-format.overview.basics" xreflabel="Basic Structure">
27 <title>Basic Structure</title>
<para>To uniquely identify a file as a Callgrind profile, it
should contain "# callgrind format" as its first line. This is optional but
recommended for easy format detection.</para>
<para>Each file has a header part consisting of an arbitrary number of lines
of the format "key: value". After the header, lines specifying profile costs
follow. Comments are allowed everywhere, on their own lines starting with '#'.
The header lines with keys "positions" and "events" define
the meaning of cost lines in the second part of the file: the value of
"positions" is a list of subpositions, and the value of "events" is a list
of event type names. Cost lines consist of subpositions followed by 64-bit
counters for the events, in the order specified by the "positions" and "events"
header lines.</para>
<para>The "events" header line is always required, in contrast to the optional
"positions" line, which defaults to "line", i.e. a line number in some
source file. In addition, the second part of the file contains position
46 specifications of the form "spec=name". "spec" can be e.g. "fn" for a
47 function name or "fl" for a file name. Cost lines are always related to
48 the function/file specifications given directly before.</para>
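<para>The header rules above can be sketched in a few lines of Python. This is
only an illustration, not the reader used by any Valgrind tool: the function
name <computeroutput>parse_header</computeroutput> is hypothetical, and the
rule used to detect the end of the header (the first line that is not
"key: value" with an alphabetic key) is an assumption made for this sketch.</para>

```python
def parse_header(lines):
    """Collect "key: value" header lines and return (header, body_lines).

    Assumption for this sketch: the header ends at the first line that is
    neither empty, a comment, nor "key: value" with an alphabetic key.
    """
    header = {"positions": ["line"]}  # "positions" defaults to "line"
    body_start = 0
    for index, line in enumerate(lines):
        text = line.strip()
        if not text or text.startswith("#"):
            body_start = index + 1  # comments are allowed everywhere
            continue
        key, sep, value = text.partition(":")
        if not sep or not key.isalpha():
            break  # first body line, e.g. "fl=file.f" or a cost line
        if key in ("positions", "events"):
            header[key] = value.split()  # lists of names
        else:
            header[key] = value.strip()
        body_start = index + 1
    if "events" not in header:
        raise ValueError('the "events" header line is required')
    return header, lines[body_start:]


sample = [
    "# callgrind format",
    "events: Cycles Instructions Flops",
    "fl=file.f",
    "fn=main",
    "15 90 14 2",
]
header, body = parse_header(sample)
print(header["positions"], header["events"], body[0])
```

<para>Keeping "positions" and "events" as lists preserves their order, which
is what later gives each number on a cost line its meaning.</para>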
52 <sect2 id="cl-format.overview.example1" xreflabel="Simple Example">
53 <title>Simple Example</title>
<para>The event names in the following example are arbitrary and are not
related to event names used by Callgrind. In particular, cycle counts matching
real processors will probably never be generated by any Valgrind tool, as these
tools are bound to simulations of simple machine models to keep the slowdown
acceptable. However, any profiling tool could use the format described in this
chapter.</para>
<para>
<screen># callgrind format
events: Cycles Instructions Flops
fl=file.f
fn=main
15 90 14 2
16 20 12</screen></para>
69 <para>The above example gives profile information for event types "Cycles",
70 "Instructions", and "Flops". Thus, cost lines give the number of CPU cycles
71 passed by, number of executed instructions, and number of floating point
72 operations executed while running code corresponding to some source
73 position. As there is no line specifying the value of "positions", it defaults
to "line", which means that the first number of a cost line is always a line
number.</para>
77 <para>Thus, the first cost line specifies that in line 15 of source file
78 <filename>file.f</filename> there is code belonging to function
79 <function>main</function>. While running, 90 CPU cycles passed by, and 2 of
80 the 14 instructions executed were floating point operations. Similarly, the
81 next line specifies that there were 12 instructions executed in the context
82 of function <function>main</function> which can be related to line 16 in
file <filename>file.f</filename>, taking 20 CPU cycles. If a cost line
specifies fewer event counts than given in the "events" line, the rest are
assumed to be zero. That is, no floating point instruction was executed
relating to line 16.</para>
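<para>As a minimal sketch of how a reader might split one cost line, the
following Python fragment pads missing event counts with zero. The names
<computeroutput>parse_number</computeroutput> and
<computeroutput>parse_cost_line</computeroutput> are hypothetical, and the
sketch assumes absolute subpositions only.</para>

```python
def parse_number(token):
    """Parse a decimal number or a hexadecimal number starting with 0x."""
    if token.lower().startswith("0x"):
        return int(token, 16)
    return int(token)


def parse_cost_line(line, positions, events):
    """Split a cost line into subpositions and per-event counts.

    Event counts missing at the end of the line default to zero.
    Assumption: subpositions are absolute (no "+", "-" or "*" forms).
    """
    fields = [parse_number(token) for token in line.split()]
    subpositions = dict(zip(positions, fields[:len(positions)]))
    counts = fields[len(positions):]
    counts += [0] * (len(events) - len(counts))
    return subpositions, dict(zip(events, counts))


events = ["Cycles", "Instructions", "Flops"]
print(parse_cost_line("15 90 14 2", ["line"], events))
print(parse_cost_line("16 20 12", ["line"], events))
```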
<para>Note that regular cost lines always give the self (also called exclusive)
cost of code at a given position. If you specify multiple cost lines for the
same position, these will be summed up. On the other hand, the example above
does not specify how many times function
<function>main</function> was actually
called: profile data only contains sums.</para>
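<para>The summing rule can be illustrated with a short Python sketch. It is
deliberately restricted: it assumes the default "positions: line", handles only
"fl=" and "fn=" position specifications plus plain cost lines, and the function
name <computeroutput>sum_self_costs</computeroutput> is hypothetical.</para>

```python
from collections import defaultdict


def sum_self_costs(body_lines, event_count):
    """Sum self costs per (file, function, line) position.

    Assumptions: "positions: line", only "fl="/"fn=" specifications and
    plain cost lines with absolute line numbers appear in body_lines.
    """
    totals = defaultdict(lambda: [0] * event_count)
    current = {"fl": None, "fn": None}
    for line in body_lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue
        spec, sep, name = text.partition("=")
        if sep and spec in current:
            current[spec] = name.strip()  # applies to following cost lines
            continue
        fields = [int(token) for token in text.split()]
        key = (current["fl"], current["fn"], fields[0])
        for index, cost in enumerate(fields[1:]):
            totals[key][index] += cost  # repeated positions are summed up
    return dict(totals)


body = ["fl=file.f", "fn=main", "15 90 14 2", "16 20 12", "15 10 1"]
print(sum_self_costs(body, 3))
```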
98 <sect2 id="cl-format.overview.associations" xreflabel="Associations">
99 <title>Associations</title>
<para>The most important extension to the original Cachegrind format is the
ability to specify call relationships among functions. More generally, you
specify associations among positions. For this, the second part of the
file can also contain association specifications. These look similar to
position specifications, but consist of two lines. For calls, the format
looks like
<screen>
 calls=(Call Count) (Target position)
 (Source position) (Inclusive cost of call)
</screen></para>
<para>The destination only specifies subpositions such as a line number.
Therefore, to be able to specify a call to another function in another source
file, you have to precede the above lines with a "cfn=" specification for the
name of the called function, and optionally a "cfi=" specification if the
function is in another source file ("cfl=" is an alternative spelling of "cfi="
kept for historical reasons, and both should be supported by format readers).
The second line looks like a regular cost line, with the difference
that the inclusive cost spent inside the function call has to be specified.</para>
121 <para>Other associations are for example (conditional) jumps. See the
122 reference below for details.</para>
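<para>A two-line call association could be decoded roughly as follows. This is
a sketch under stated assumptions: the helper name
<computeroutput>parse_call</computeroutput> is hypothetical, subpositions are
absolute decimal numbers, and the sample values (three calls to a target at
line 20, made from line 16, with inclusive cost 400) are chosen for
illustration.</para>

```python
def parse_call(call_line, cost_line, position_count=1):
    """Decode a "calls=" line together with the cost line that follows it.

    Assumption: all subpositions are absolute decimal numbers.
    position_count is the number of entries in the "positions" header.
    """
    prefix = "calls="
    if not call_line.startswith(prefix):
        raise ValueError("not a call line: " + call_line)
    fields = call_line[len(prefix):].split()
    cost_fields = [int(token) for token in cost_line.split()]
    return {
        "count": int(fields[0]),
        "target": [int(token) for token in fields[1:1 + position_count]],
        "source": cost_fields[:position_count],
        # cost spent inside the called function, per event type
        "inclusive": cost_fields[position_count:],
    }


call = parse_call("calls=3 20", "16 400")
print(call)
```

<para>The names given by preceding "cfn=" and "cfi=" lines are not handled
here; a full reader would attach them to the decoded target.</para>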
127 <sect2 id="cl-format.overview.example2" xreflabel="Extended Example">
128 <title>Extended Example</title>
130 <para>The following example shows 3 functions, <function>main</function>,
131 <function>func1</function>, and <function>func2</function>. Function
132 <function>main</function> calls <function>func1</function> once and
133 <function>func2</function> 3 times. <function>func1</function> calls
134 <function>func2</function> 2 times.
136 <screen># callgrind format
159 20 700</screen></para>
<para>In <function>main</function>, only code from line 16 is executed; this
is also where the other functions are called. The inclusive cost of
<function>main</function> is 820, which is the sum of its self cost of 20 and
the costs spent in the calls: 400 for the single call to
<function>func1</function> and 400 in total for the three calls to
<function>func2</function>.</para>
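<para>The arithmetic behind the inclusive cost can be written down as a short
Python check. The function name
<computeroutput>inclusive_cost</computeroutput> is hypothetical; the numbers
are the ones quoted in the paragraph above.</para>

```python
def inclusive_cost(self_cost, call_costs):
    """Inclusive cost = self cost + inclusive cost of every call made."""
    return self_cost + sum(call_costs)


# main: self cost 20, one call to func1 (400), three calls to func2 (400 total)
main_inclusive = inclusive_cost(20, [400, 400])
print(main_inclusive)
```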
167 <para>Function <function>func1</function> is located in
168 <filename>file1.c</filename>, the same as <function>main</function>.
169 Therefore, a "cfi=" specification for the call to <function>func1</function>
170 is not needed. The function <function>func1</function> only consists of code
at line 51 of <filename>file1.c</filename>, where <function>func2</function>
is called.</para>
177 <sect2 id="cl-format.overview.compression1" xreflabel="Name Compression">
178 <title>Name Compression</title>
<para>With the introduction of association specifications such as calls, the
same function or file name has to be specified multiple times. As
absolute file names or C++ symbol names can be quite long, it is advantageous
to be able to specify integer IDs for position specifications.
Here, the term "position" corresponds to a file name (source or object file)
or function name.</para>
187 <para>To support name compression, a position specification can be not only of
188 the format "spec=name", but also "spec=(ID) name" to specify a mapping of an
189 integer ID to a name, and "spec=(ID)" to reference a previously defined ID
190 mapping. There is a separate ID mapping for each position specification,
191 i.e. you can use ID 1 for both a file name and a symbol name.</para>
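<para>A reader could resolve compressed names with a small lookup table, as in
the following Python sketch. The class name
<computeroutput>NameTable</computeroutput> is hypothetical. Grouping the
specifications into three shared tables (files, functions, objects) is an
assumption of this sketch, made so that a "cfn=(ID)" reference can reuse an ID
defined by an earlier "fn=(ID) name" line.</para>

```python
import re

# "(ID) name" defines a mapping, "(ID)" alone references an earlier one.
_COMPRESSED = re.compile(r"\((\d+)\)\s*(.*)")

# Assumption: specifications naming the same kind of thing share one table.
_KIND = {
    "fl": "file", "fi": "file", "fe": "file", "cfi": "file", "cfl": "file",
    "fn": "function", "cfn": "function",
    "ob": "object", "cob": "object",
}


class NameTable:
    """Resolve "spec=name", "spec=(ID) name" and "spec=(ID)" values."""

    def __init__(self):
        self._names = {}  # kind -> {ID: name}

    def resolve(self, spec, value):
        match = _COMPRESSED.fullmatch(value.strip())
        if match is None:
            return value.strip()  # uncompressed name
        ident, name = int(match.group(1)), match.group(2)
        table = self._names.setdefault(_KIND[spec], {})
        if name:
            table[ident] = name  # new mapping
        return table[ident]  # KeyError if the ID was never defined


names = NameTable()
print(names.resolve("fl", "(1) file1.c"))
print(names.resolve("fn", "(1) main"))
print(names.resolve("cfn", "(1)"))
```

<para>Because files and functions live in separate tables, ID 1 can stand for
<filename>file1.c</filename> and <function>main</function> at the same
time.</para>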
<para>With name compression, the example from above looks like this:
194 <screen># callgrind format
217 20 700</screen></para>
<para>As position specifications carry no information themselves, but only change
the meaning of subsequent cost lines or associations, they can appear
anywhere in the file without any negative consequence. In particular, you can
define name compression mappings directly after the header, and before any cost
lines. Thus, the above example can also be written as
224 <screen># callgrind format
227 # define file ID mapping
230 # define function ID mapping
243 <sect2 id="cl-format.overview.compression2" xreflabel="Subposition Compression">
244 <title>Subposition Compression</title>
<para>If a Callgrind data file should hold costs for each assembler instruction
of a program, you specify the subposition "instr" in the "positions:" header
line, and each cost line has to include the address of some instruction.
Addresses are allowed to have a size of 64 bits to support 64-bit
architectures. Repeating similar, long addresses on almost every line can
enlarge the data file significantly, which
motivates subposition compression: instead of every cost line starting with
a 16-character address, one is allowed to specify relative addresses.
This relative specification is allowed not only for instruction addresses, but
also for line numbers; both addresses and line numbers are called "subpositions".</para>
<para>A relative subposition is always based on the corresponding subposition
of the last cost line. It starts with a "+" to specify a positive difference
or a "-" to specify a negative difference, or consists of "*" to specify the
same subposition. Because absolute subpositions are always positive (i.e. never
prefixed by "-"), any relative specification is unambiguous; additionally,
absolute and relative subposition specifications can be mixed freely.
Consider the following example (subpositions can always be specified
as hexadecimal numbers, beginning with "0x"):
265 <screen># callgrind format
266 positions: instr line
272 0x80001238 91 6</screen></para>
274 <para>With subposition compression, this looks like
275 <screen># callgrind format
276 positions: instr line
282 +1 +1 6</screen></para>
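<para>Decoding relative subpositions only needs the subpositions of the
previous cost line. The following Python sketch shows one way to do it; the
function names are hypothetical, and the starting values and the first
relative step are illustrative inputs chosen so that the final result matches
the last cost line shown above.</para>

```python
def parse_number(token):
    """Parse a decimal number or a hexadecimal number starting with 0x."""
    if token.lower().startswith("0x"):
        return int(token, 16)
    return int(token)


def decode_subpositions(tokens, previous):
    """Turn absolute or relative subposition tokens into absolute values.

    previous holds the decoded subpositions of the last cost line.
    """
    decoded = []
    for token, last in zip(tokens, previous):
        if token == "*":
            value = last  # same subposition as before
        elif token[0] == "+":
            value = last + parse_number(token[1:])
        elif token[0] == "-":
            value = last - parse_number(token[1:])
        else:
            value = parse_number(token)  # absolute value
        decoded.append(value)
    return decoded


start = [0x80001234, 90]  # illustrative (instr, line) of an earlier cost line
step1 = decode_subpositions(["+3", "*"], start)
step2 = decode_subpositions(["+1", "+1"], step1)
print([hex(step2[0]), step2[1]])
```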
<para>Remark: For assembler annotation to work, instruction addresses have to
be corrected to correspond to addresses found in the original binary. That is,
for relocatable shared objects, a load offset often has to be subtracted.</para>
291 <sect2 id="cl-format.overview.misc" xreflabel="Miscellaneous">
292 <title>Miscellaneous</title>
294 <sect3 id="cl-format.overview.misc.summary" xreflabel="Cost Summary Information">
295 <title>Cost Summary Information</title>
<para>For the visualization to be able to show cost percentages, the total
cost of the full run has to be known. Usually, it is assumed that this is the
sum of all cost lines in a file, but sometimes this is not correct. Thus, you
can specify a "summary:" line in the header giving the full cost for the
profile run. An import filter may use this to show a progress bar
while loading a large data file.</para>
306 <sect3 id="cl-format.overview.misc.events" xreflabel="Long Names for Event Types and inherited Types">
307 <title>Long Names for Event Types and inherited Types</title>
<para>Event types for cost lines are specified in the "events:" line with an
abbreviated name. For visualization, it makes sense to be able to specify a
longer, more descriptive name. For an event type "Ir" which means "Instruction
Fetches", this can be specified with the header line
313 <screen>event: Ir : Instruction Fetches
314 events: Ir Dr</screen></para>
<para>In this example, "Dr" itself has no long name associated. The order of
"event:" lines and the "events:" line is of no importance. Additionally,
inherited event types can be introduced for which no raw data is available, but
which are calculated from given types. Continuing the last example, you could add
<screen>event: Sum = Ir + Dr</screen>
to specify an additional event type "Sum", which is calculated by adding costs
for "Ir" and "Dr".</para>
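<para>An inherited event type can be evaluated from the raw counts once its
definition is split into terms. The Python sketch below is only an
illustration: the function names are hypothetical, it accepts decimal factors
only, and it assumes the definition carries no long name part.</para>

```python
import re

# One term: an optional decimal factor, an optional "*", then an event name.
_TERM = re.compile(r"(?:(\d+)\s*\*?\s*)?([A-Za-z][A-Za-z0-9]*)")


def parse_inherited(definition):
    """Parse "Name = term + term ..." into (name, [(factor, event), ...]).

    Assumptions: decimal factors only, no long name after the expression.
    """
    name, _, expression = definition.partition("=")
    terms = []
    for part in expression.split("+"):
        match = _TERM.fullmatch(part.strip())
        if match is None:
            raise ValueError("cannot parse term: " + part)
        terms.append((int(match.group(1) or 1), match.group(2)))
    return name.strip(), terms


def evaluate(terms, costs):
    """Compute the inherited cost from raw per-event costs."""
    return sum(factor * costs[event] for factor, event in terms)


name, terms = parse_inherited("Sum = Ir + Dr")
print(name, terms, evaluate(terms, {"Ir": 10, "Dr": 4}))
```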
330 <sect1 id="cl-format.reference" xreflabel="Reference">
331 <title>Reference</title>
333 <sect2 id="cl-format.reference.grammar" xreflabel="Grammar">
334 <title>Grammar</title>
337 <screen>ProfileDataFile := FormatSpec? FormatVersion? Creator? PartData*</screen>
338 <screen>FormatSpec := "# callgrind format\n"</screen>
339 <screen>FormatVersion := "version: 1\n"</screen>
340 <screen>Creator := "creator:" NoNewLineChar* "\n"</screen>
341 <screen>PartData := (HeaderLine "\n")+ (BodyLine "\n")+</screen>
342 <screen>HeaderLine := (empty line)
343 | ('#' NoNewLineChar*)
347 | CostLineDef</screen>
348 <screen>PartDetail := TargetCommand | TargetID</screen>
349 <screen>TargetCommand := "cmd:" Space* NoNewLineChar*</screen>
350 <screen>TargetID := ("pid"|"thread"|"part") ":" Space* Number</screen>
351 <screen>Description := "desc:" Space* Name Space* ":" NoNewLineChar*</screen>
352 <screen>EventSpecification := "event:" Space* Name InheritedDef? LongNameDef?</screen>
353 <screen>InheritedDef := "=" InheritedExpr</screen>
354 <screen>InheritedExpr := Name
355 | Number Space* ("*" Space*)? Name
356 | InheritedExpr Space* "+" Space* InheritedExpr</screen>
357 <screen>LongNameDef := ":" NoNewLineChar*</screen>
358 <screen>CostLineDef := "events:" Space* Name (Space+ Name)*
359 | "positions:" "instr"? (Space+ "line")?</screen>
360 <screen>BodyLine := (empty line)
361 | ('#' NoNewLineChar*)
366 | CondJumpSpec</screen>
367 <screen>CostLine := SubPositionList Costs?</screen>
368 <screen>SubPositionList := (SubPosition+ Space+)+</screen>
369 <screen>SubPosition := Number | "+" Number | "-" Number | "*"</screen>
370 <screen>Costs := (Number Space+)+</screen>
371 <screen>PositionSpec := Position "=" Space* PositionName</screen>
372 <screen>Position := CostPosition | CalledPosition</screen>
373 <screen>CostPosition := "ob" | "fl" | "fi" | "fe" | "fn"</screen>
<screen>CalledPosition := "cob" | "cfi" | "cfl" | "cfn"</screen>
375 <screen>PositionName := ( "(" Number ")" )? (Space* NoNewLineChar* )?</screen>
376 <screen>CallSpec := CallLine "\n" CostLine</screen>
377 <screen>CallLine := "calls=" Space* Number Space+ SubPositionList</screen>
378 <screen>UncondJumpSpec := "jump=" Space* Number Space+ SubPositionList</screen>
379 <screen>CondJumpSpec := "jcnd=" Space* Number Space+ Number Space+ SubPositionList</screen>
380 <screen>Space := " " | "\t"</screen>
381 <screen>Number := HexNumber | (Digit)+</screen>
382 <screen>Digit := "0" | ... | "9"</screen>
383 <screen>HexNumber := "0x" (Digit | HexChar)+</screen>
384 <screen>HexChar := "a" | ... | "f" | "A" | ... | "F"</screen>
<screen>Name := Alpha (Digit | Alpha)*</screen>
<screen>Alpha := "a" | ... | "z" | "A" | ... | "Z"</screen>
387 <screen>NoNewLineChar := all characters without "\n"</screen>
390 <para>A profile data file ("ProfileDataFile") starts with basic information
391 such as a format marker, the version and creator information, and then has a list of parts, where
392 each part has its own header and body. Parts typically are different threads
393 and/or time spans/phases within a profiled application run.</para>
395 <para>Note that callgrind_annotate currently only supports profile data files with
396 one part. Callgrind may produce multiple parts for one profile run, but defaults
397 to one output file for each part.</para>
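<para>As a rough companion to the grammar, the following Python sketch
classifies a single body line. It is not a full parser: the function name
<computeroutput>classify_body_line</computeroutput> and the returned labels
are hypothetical, and a line is only checked for its overall shape.</para>

```python
import re

POSITION_SPECS = {"ob", "fl", "fi", "fe", "fn", "cob", "cfi", "cfl", "cfn"}
ASSOCIATIONS = {"calls", "jump", "jcnd"}

# A subposition or cost: "*", or an optionally signed decimal/hex number.
_TOKEN = re.compile(r"\*|[+-]?(?:0x[0-9a-fA-F]+|[0-9]+)")


def classify_body_line(line):
    """Return a label describing the kind of body line."""
    text = line.strip()
    if not text:
        return "empty"
    if text.startswith("#"):
        return "comment"
    key, sep, _ = text.partition("=")
    if sep and key in POSITION_SPECS:
        return "position"
    if sep and key in ASSOCIATIONS:
        return key
    if all(_TOKEN.fullmatch(token) for token in text.split()):
        return "cost"
    return "unknown"


for sample in ["fn=main", "calls=1 50", "+3 * 5", "# note", "jcnd=10 4 +2"]:
    print(sample, classify_body_line(sample))
```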
401 <sect2 id="cl-format.reference.header" xreflabel="Description of Header Lines">
402 <title>Description of Header Lines</title>
404 <para>Basic information in the first lines of a profile data file:</para>
408 <para><computeroutput># callgrind format</computeroutput> [Callgrind]</para>
<para>This line specifies that the file is a Callgrind profile,
and it has to be the first line. It was added late to the
format (with Valgrind 3.13) and is optional, as all readers should
also work with older Callgrind profiles that do not include this line.
However, generation of this line is recommended to allow desktop
environments and file managers to uniquely detect the format.</para>
418 <para><computeroutput>version: number</computeroutput> [Callgrind]</para>
<para>This is used to distinguish future profile data formats. A
major version of 0 or 1 is supposed to be upward compatible with
Cachegrind's format. It is optional; if it is absent, version 1
is assumed. Otherwise, it has to follow directly after the format
specification (i.e. be the first line if the optional format
specification is skipped).</para>
428 <para><computeroutput>creator: string</computeroutput> [Callgrind]</para>
429 <para>This is an arbitrary string to denote the creator of this file.
435 <para>The header for each part has an arbitrary number of lines of the format
436 "key: value". Possible <emphasis>key</emphasis> values for the header are:</para>
441 <para><computeroutput>pid: process id</computeroutput> [Callgrind]</para>
442 <para>Optional. This specifies the process ID of the supervised application
443 for which this profile was generated.</para>
447 <para><computeroutput>cmd: program name + args</computeroutput> [Cachegrind]</para>
448 <para>Optional. This specifies the full command line of the supervised
449 application for which this profile was generated.</para>
453 <para><computeroutput>part: number</computeroutput> [Callgrind]</para>
454 <para>Optional. This specifies a sequentially incremented number for each dump
455 generated, starting at 1.</para>
459 <para><computeroutput>desc: type: value</computeroutput> [Cachegrind]</para>
<para>This specifies various information for this dump. For some
types, the semantics are defined, but any description type is allowed.
Unknown types should be ignored.</para>
<para>There are the types "I1 cache", "D1 cache", "LL cache", which
specify parameters used for the cache simulator. These are the only
types originally used by Cachegrind. Additionally, Callgrind uses
the following types: "Timerange" gives a rough range of the basic
block counter for which the cost of this dump was collected.
Type "Trigger" states why this trace was generated,
e.g. program termination or a forced interactive dump.</para>
473 <para><computeroutput>positions: [instr] [line]</computeroutput> [Callgrind]</para>
<para>For cost lines, this defines the semantics of the first numbers.
Any combination of "instr", "bb" and "line" is allowed, but has to be
in this order, which corresponds to the position numbers at the start of
the cost lines later in the file.</para>
<para>If "instr" is specified, the position is the address of an
instruction whose execution raised the events given later on the
line. This address is relative to the offset of the binary/shared
library file, so that no relocation info has to be specified. For "line",
the position is the line number of a source file which is
responsible for the events raised. Note that the mapping between "instr"
and "line" positions is given by the debugging line information
produced by the compiler.</para>
486 <para>This header line is optional, defaulting to "positions:
487 line" if not specified.</para>
491 <para><computeroutput>events: event type abbreviations</computeroutput> [Cachegrind]</para>
492 <para>A list of short names of the event types logged in cost
493 lines in this part of the profile data file. Arbitrary short
494 names are allowed. The order given specifies the required order
495 in cost lines. Thus, the first event type is the second or third
496 number in a cost line, depending on the value of "positions".
497 Required to appear for each header part exactly once.</para>
501 <para><computeroutput>summary: costs</computeroutput> [Callgrind]</para>
<para>Optional. This header line specifies a summary cost, which should be
equal to or larger than the total over all self costs. It may be larger, as
the cost lines may not represent all cost of the program run.</para>
508 <para><computeroutput>totals: costs</computeroutput> [Cachegrind]</para>
509 <para>Optional. Should appear at the end of the file (although
510 looking like a header line). Must give the total of all cost lines,
511 to allow for a consistency check.</para>
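<para>The consistency rules for "summary:" and "totals:" can be checked with a
few lines of Python. This is a sketch under assumptions: the function names
are hypothetical, and only regular (self) cost lines, given as lists of event
counts without their subpositions, are passed in.</para>

```python
def column_sums(cost_rows, event_count):
    """Sum event counts column-wise; short rows count as zero-padded."""
    sums = [0] * event_count
    for row in cost_rows:
        for index, value in enumerate(row):
            sums[index] += value
    return sums


def check_costs(cost_rows, totals, summary=None):
    """Return (totals_ok, summary_ok) for the given self-cost rows.

    totals must equal the column sums; summary, if present, must be
    equal to or larger than them for every event type.
    """
    sums = column_sums(cost_rows, len(totals))
    totals_ok = sums == list(totals)
    summary_ok = summary is None or all(
        given >= computed for given, computed in zip(summary, sums))
    return totals_ok, summary_ok


rows = [[90, 14, 2], [20, 12]]
print(check_costs(rows, [110, 26, 2], summary=[120, 26, 2]))
```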
518 <sect2 id="cl-format.reference.body" xreflabel="Description of Body Lines">
519 <title>Description of Body Lines</title>
<para>The regular body line is a cost line consisting of one or two
position numbers (depending on the "positions:" header line, see above)
and an array of cost numbers. A position number is either a
line number in a source file or an instruction address within binary
code, with source/binary file names specified as position names (see
below). The cost numbers get mapped to event types in the same order
as specified in the "events:" header line. If fewer numbers than event
types are given, the costs default to zero for the remaining event
types.</para>
531 <para>Further, there exist lines
532 <computeroutput>spec=position name</computeroutput>. A position name
533 is an arbitrary string. If it starts with "(" and a
534 digit, it's a string in compressed format. Otherwise it's the real
535 position string. This allows for file and symbol names as position
536 strings, as these never start with "(" + <emphasis>digit</emphasis>.
537 The compressed format is either "(" <emphasis>number</emphasis> ")"
538 <emphasis>space</emphasis> <emphasis>position</emphasis> or only
539 "(" <emphasis>number</emphasis> ")". The first relates
540 <emphasis>position</emphasis> to <emphasis>number</emphasis> in the
541 context of the given format specification from this line to the end of
542 the file; it makes the (<emphasis>number</emphasis>) an alias for
543 <emphasis>position</emphasis>. Compressed format is always
546 <para>Position specifications allowed:</para>
550 <para><computeroutput>ob=</computeroutput> [Callgrind]</para>
551 <para>The ELF object where the cost of next cost lines happens.</para>
555 <para><computeroutput>fl=</computeroutput> [Cachegrind]</para>
559 <para><computeroutput>fi=</computeroutput> [Cachegrind]</para>
563 <para><computeroutput>fe=</computeroutput> [Cachegrind]</para>
564 <para>The source file including the code which is responsible for
565 the cost of next cost lines. "fi="/"fe=" is used when the source
566 file changes inside of a function, i.e. for inlined code.</para>
570 <para><computeroutput>fn=</computeroutput> [Cachegrind]</para>
<para>The name of the function where the cost of next cost lines
happens.</para>
576 <para><computeroutput>cob=</computeroutput> [Callgrind]</para>
577 <para>The ELF object of the target of the next call cost lines.</para>
581 <para><computeroutput>cfi=</computeroutput> [Callgrind]</para>
582 <para>The source file including the code of the target of the
583 next call cost lines.</para>
587 <para><computeroutput>cfl=</computeroutput> [Callgrind]</para>
<para>Alternative spelling for the <computeroutput>cfi=</computeroutput>
specification (kept for historical reasons).</para>
593 <para><computeroutput>cfn=</computeroutput> [Callgrind]</para>
<para>The name of the target function of the next call cost
lines.</para>
<para>The last type of body line provides specific costs that, unlike regular
cost lines, are not related to just one position. These lines start with
specific strings similar to position name specifications.</para>
607 <para><computeroutput>calls=count target-position</computeroutput> [Callgrind]</para>
608 <para>Call executed "count" times to "target-position".
609 After a "calls=" line there MUST be a cost line. This provides the source position
610 of the call and the cost spent in the called function in total.</para>
614 <para><computeroutput>jump=count target-position</computeroutput> [Callgrind]</para>
615 <para>Unconditional jump, executed "count" times, to "target-position".</para>
619 <para><computeroutput>jcnd=exe-count jump-count target-position</computeroutput> [Callgrind]</para>
620 <para>Conditional jump, executed "exe-count" times with "jump-count" jumps
621 happening (rest is fall-through) to "target-position".</para>
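<para>Jump lines can be decoded in the same style as call lines. In the
following Python sketch the function name
<computeroutput>parse_jump</computeroutput> and the result keys are
hypothetical; target subpositions are kept as raw tokens because they may be
relative.</para>

```python
def parse_jump(line, position_count=1):
    """Decode a "jump=" or "jcnd=" line into a dictionary.

    Target subpositions are returned as raw tokens (they may be relative).
    """
    kind, sep, rest = line.partition("=")
    fields = rest.split()
    if sep and kind == "jump":
        return {
            "kind": "jump",
            "jumps": int(fields[0]),
            "target": fields[1:1 + position_count],
        }
    if sep and kind == "jcnd":
        executed, jumped = int(fields[0]), int(fields[1])
        return {
            "kind": "jcnd",
            "executed": executed,
            "jumps": jumped,
            # executions that did not jump fell through
            "fall_through": executed - jumped,
            "target": fields[2:2 + position_count],
        }
    raise ValueError("not a jump line: " + line)


print(parse_jump("jcnd=10 4 +2"))
print(parse_jump("jump=3 20"))
```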