llvm-exegesis - LLVM Machine Instruction Benchmark
==================================================

SYNOPSIS
--------

:program:`llvm-exegesis` [*options*]

DESCRIPTION
-----------

:program:`llvm-exegesis` is a benchmarking tool that uses information available
in LLVM to measure host machine instruction characteristics like latency or port
decomposition.

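Given an LLVM opcode name and a benchmarking mode, :program:`llvm-exegesis`
generates a code snippet tailored to that mode. In `latency` mode the snippet
makes execution as serial as possible so that the latency of the instruction
can be measured. In `uops` mode it makes execution as parallel as possible so
that the uop decomposition of the instruction can be measured.
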
The code snippet is jitted and executed on the host subtarget. The time taken
(in `latency` mode) or the resource usage (in `uops` mode) is measured using
hardware performance counters. The result is printed as YAML to the standard
output.

The main goal of this tool is to automatically validate or invalidate LLVM's
TableGen scheduling models. To that end, we also provide analysis of the
results.

EXAMPLES: benchmarking
----------------------

Assume you have an X86-64 machine. To measure the latency of a single
instruction, run:

.. code-block:: bash

    $ llvm-exegesis -mode=latency -opcode-name=ADD64rr

Measuring the uop decomposition of an instruction works similarly:

.. code-block:: bash

    $ llvm-exegesis -mode=uops -opcode-name=ADD64rr

The output is a YAML document (the default is to write to stdout, but you can
redirect the output to a file using `-benchmarks-file`):

.. code-block:: none

    llvm_triple: x86_64-unknown-linux-gnu
    num_repetitions: 10000
    - { key: latency, value: 1.0058, debug_string: '' }
    info: 'explicit self cycles, selecting one aliasing configuration.

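
For quick scripting it can be handy to pull a measured value out of the YAML
without a full YAML parser. The following is a minimal sketch, not part of
:program:`llvm-exegesis`: the helper name `extract_measurement` is hypothetical,
and it assumes that each measurement sits on a single line in the
`{ key: ..., value: ..., ... }` form shown above.

.. code-block:: bash

    # Hypothetical helper (not part of llvm-exegesis): print the value of every
    # measurement with the given key, reading benchmark YAML from stdin.
    # Assumes the one-line "{ key: ..., value: ..., ... }" form.
    extract_measurement() {
      sed -n "s/.*{ *key: *$1, *value: *\([^,]*\),.*/\1/p"
    }

    # Usage sketch (requires llvm-exegesis in PATH):
    #   llvm-exegesis -mode=latency -opcode-name=ADD64rr | extract_measurement latency

A real YAML parser is the more robust choice if the output layout differs from
the assumption above.
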
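To measure the latency of all instructions for the host architecture (X86 in
this example), run:

.. code-block:: bash

    # Highest opcode index, read from the generated X86 instruction table.
    readonly INSTRUCTIONS=$(($(grep INSTRUCTION_LIST_END build/lib/Target/X86/X86GenInstrInfo.inc | cut -f2 -d=) - 1))
    for INSTRUCTION in $(seq 1 ${INSTRUCTIONS}); do
      # Keep only the YAML document, starting at its "---" marker.
      ./build/bin/llvm-exegesis -mode=latency -opcode-index=${INSTRUCTION} | sed -n '/---/,$p'
    done
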
FIXME: Provide an :program:`llvm-exegesis` option to test all instructions.

EXAMPLES: analysis
------------------

Assuming you have a set of benchmarked instructions (either latency or uops) as
YAML in file `/tmp/benchmarks.yaml`, you can analyze the results using the
following command:

.. code-block:: bash

    $ llvm-exegesis -mode=analysis \
      -benchmarks-file=/tmp/benchmarks.yaml \
      -analysis-clusters-output-file=/tmp/clusters.csv \
      -analysis-inconsistencies-output-file=/tmp/inconsistencies.html

This will group the instructions into clusters with the same performance
characteristics. The clusters will be written out to `/tmp/clusters.csv` in the
following format:

.. code-block:: none

    cluster_id,opcode_name,config,sched_class
    2,ADD32ri8_DB,,WriteALU,1.00
    2,ADD32ri_DB,,WriteALU,1.01
    2,ADD32rr,,WriteALU,1.01
    2,ADD32rr_DB,,WriteALU,1.00
    2,ADD32rr_REV,,WriteALU,1.00
    2,ADD64i32,,WriteALU,1.01
    2,ADD64ri32,,WriteALU,1.01
    2,MOVSX64rr32,,BSWAP32r_BSWAP64r_MOVSX64rr32,1.00
    2,VPADDQYrr,,VPADDBYrr_VPADDDYrr_VPADDQYrr_VPADDWYrr_VPSUBBYrr_VPSUBDYrr_VPSUBQYrr_VPSUBWYrr,1.02
    2,VPSUBQYrr,,VPADDBYrr_VPADDDYrr_VPADDQYrr_VPADDWYrr_VPSUBBYrr_VPSUBDYrr_VPSUBQYrr_VPSUBWYrr,1.01
    2,ADD64ri8,,WriteALU,1.00
    2,SETBr,,WriteSETCC,1.01

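
Because the clusters file is plain CSV, standard text tools can post-process
it. The following is a minimal sketch, assuming the column layout shown above
(cluster id in the first field, opcode name in the second, scheduling class in
the fourth) and a single header line. The function names are hypothetical and
not part of :program:`llvm-exegesis`.

.. code-block:: bash

    # Hypothetical helper: list the opcodes assigned to a given cluster.
    # $1 = cluster id, $2 = path to the clusters CSV.
    opcodes_in_cluster() {
      awk -F, -v id="$1" 'NR > 1 && $1 == id { print $2 }' "$2"
    }

    # Hypothetical helper: list the distinct scheduling classes seen in a
    # cluster, a starting point for spotting instructions that measure alike
    # but are assigned different classes.
    sched_classes_in_cluster() {
      awk -F, -v id="$1" 'NR > 1 && $1 == id { print $4 }' "$2" | sort -u
    }

    # Usage sketch:
    #   opcodes_in_cluster 2 /tmp/clusters.csv
    #   sched_classes_in_cluster 2 /tmp/clusters.csv
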
:program:`llvm-exegesis` also analyzes the clusters to point out
inconsistencies in the scheduling information. The output is an HTML file. For
example, `/tmp/inconsistencies.html` will contain messages like the following:

.. image:: llvm-exegesis-analysis.png

Note that the scheduling class names are resolved only when
:program:`llvm-exegesis` is compiled in debug mode; otherwise only the class id
is shown. This does not invalidate any of the analysis results.

OPTIONS
-------

.. option:: -help

 Print a summary of command line options.

.. option:: -opcode-index=<LLVM opcode index>

 Specify the opcode to measure, by index.
 Either `opcode-index` or `opcode-name` must be set.

.. option:: -opcode-name=<LLVM opcode name>

 Specify the opcode to measure, by name.
 Either `opcode-index` or `opcode-name` must be set.

.. option:: -mode=[latency|uops|analysis]

 Specify the run mode.

.. option:: -num-repetitions=<Number of repetitions>

 Specify the number of repetitions of the asm snippet.
 Higher values lead to more accurate measurements but lengthen the benchmark.

.. option:: -benchmarks-file=</path/to/file>

 File to read (`analysis` mode) or write (`latency`/`uops` modes) benchmark
 results. "-" uses stdin/stdout.

.. option:: -analysis-clusters-output-file=</path/to/file>

 If provided, write the analysis clusters as CSV to this file. "-" prints to
 stdout.

.. option:: -analysis-inconsistencies-output-file=</path/to/file>

 If non-empty, write inconsistencies found during analysis to this file. "-"
 prints to stdout.

.. option:: -analysis-numpoints=<dbscan numPoints parameter>

 Specify the numPoints parameter to be used for DBSCAN clustering.

.. option:: -analysis-epsilon=<dbscan epsilon parameter>

 Specify the epsilon parameter to be used for DBSCAN clustering.

EXIT STATUS
-----------

:program:`llvm-exegesis` returns 0 on success. Otherwise, an error message is
printed to standard error, and the tool returns a non-zero value.

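
In a script that benchmarks many opcodes, the exit status makes it possible to
note a failure and keep going. The following is a minimal sketch:
`run_or_report` is a hypothetical helper, not part of :program:`llvm-exegesis`,
and the opcode in the usage comment is only an example.

.. code-block:: bash

    # Hypothetical helper: run a command; if it exits with a non-zero status,
    # report that on standard error and pass the status back to the caller.
    run_or_report() {
      "$@"
      local status=$?
      if [ "$status" -ne 0 ]; then
        echo "command failed with status $status: $*" >&2
      fi
      return "$status"
    }

    # Usage sketch (requires llvm-exegesis in PATH):
    #   run_or_report llvm-exegesis -mode=latency -opcode-name=ADD64rr || echo "skipped"
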