2 ------------------------------
3 ****** perf by examples ******
4 ------------------------------
6 [ From an e-mail by Ingo Molnar, https://lore.kernel.org/lkml/20090804195717.GA5998@elte.hu ]
9 First, discovery/enumeration of available counters can be done via
14 kmem:kmalloc [Tracepoint event]
15 kmem:kmem_cache_alloc [Tracepoint event]
16 kmem:kmalloc_node [Tracepoint event]
17 kmem:kmem_cache_alloc_node [Tracepoint event]
18 kmem:kfree [Tracepoint event]
19 kmem:kmem_cache_free [Tracepoint event]
20 kmem:mm_page_free [Tracepoint event]
21 kmem:mm_page_free_batched [Tracepoint event]
22 kmem:mm_page_alloc [Tracepoint event]
23 kmem:mm_page_alloc_zone_locked [Tracepoint event]
24 kmem:mm_page_pcpu_drain [Tracepoint event]
25 kmem:mm_page_alloc_extfrag [Tracepoint event]
27 Then any (or all) of the above event sources can be activated and
28 measured. For example the page alloc/free properties of a 'hackbench
31 titan:~> perf stat -e kmem:mm_page_pcpu_drain -e kmem:mm_page_alloc
32 -e kmem:mm_page_free_batched -e kmem:mm_page_free ./hackbench 10
35 Performance counter stats for './hackbench 10':
37 13857 kmem:mm_page_pcpu_drain
38 27576 kmem:mm_page_alloc
39 6025 kmem:mm_page_free_batched
40 20934 kmem:mm_page_free
42 0.613972165 seconds time elapsed
44 You can observe the statistical properties as well, by using the
45 'repeat the workload N times' feature of perf stat:
47 titan:~> perf stat --repeat 5 -e kmem:mm_page_pcpu_drain -e
48 kmem:mm_page_alloc -e kmem:mm_page_free_batched -e
49 kmem:mm_page_free ./hackbench 10
56 Performance counter stats for './hackbench 10' (5 runs):
58 12920 kmem:mm_page_pcpu_drain ( +- 3.359% )
59 25035 kmem:mm_page_alloc ( +- 3.783% )
60 6104 kmem:mm_page_free_batched ( +- 0.934% )
61 18376 kmem:mm_page_free ( +- 4.941% )
63 0.643954516 seconds time elapsed ( +- 2.363% )
65 Furthermore, these tracepoints can be used to sample the workload as
66 well. For example the page allocations done by a 'git gc' can be
67 captured the following way:
69 titan:~/git> perf record -e kmem:mm_page_alloc -c 1 ./git gc
70 Counting objects: 1148, done.
71 Delta compression using up to 2 threads.
72 Compressing objects: 100% (450/450), done.
73 Writing objects: 100% (1148/1148), done.
74 Total 1148 (delta 690), reused 1148 (delta 690)
75 [ perf record: Captured and wrote 0.267 MB perf.data (~11679 samples) ]
77 To check which functions generated page allocations:
79 titan:~/git> perf report
82 # Overhead Command Shared Object
83 # ........ ............... ..........................
85 23.57% git-repack /lib64/libc-2.5.so
86 21.81% git /lib64/libc-2.5.so
88 11.79% git-repack ./git
89 7.12% git /lib64/ld-2.5.so
90 3.16% git-repack /lib64/libpthread-2.5.so
91 2.09% git-repack /bin/bash
92 1.97% rm /lib64/libc-2.5.so
93 1.39% mv /lib64/ld-2.5.so
94 1.37% mv /lib64/libc-2.5.so
95 1.12% git-repack /lib64/ld-2.5.so
96 0.95% rm /lib64/ld-2.5.so
97 0.90% git-update-serv /lib64/libc-2.5.so
98 0.73% git-update-serv /lib64/ld-2.5.so
99 0.68% perf /lib64/libpthread-2.5.so
100 0.64% git-repack /usr/lib64/libz.so.1.2.3
102 Or to see it on a more finegrained level:
104 titan:~/git> perf report --sort comm,dso,symbol
107 # Overhead Command Shared Object Symbol
108 # ........ ............... .......................... ......
110 9.35% git-repack ./git [.] insert_obj_hash
111 9.12% git ./git [.] insert_obj_hash
112 7.31% git /lib64/libc-2.5.so [.] memcpy
113 6.34% git-repack /lib64/libc-2.5.so [.] _int_malloc
114 6.24% git-repack /lib64/libc-2.5.so [.] memcpy
115 5.82% git-repack /lib64/libc-2.5.so [.] __GI___fork
116 5.47% git /lib64/libc-2.5.so [.] _int_malloc
117 2.99% git /lib64/libc-2.5.so [.] memset
119 Furthermore, call-graph sampling can be done too, of page
120 allocations - to see precisely what kind of page allocations there
123 titan:~/git> perf record -g -e kmem:mm_page_alloc -c 1 ./git gc
124 Counting objects: 1148, done.
125 Delta compression using up to 2 threads.
126 Compressing objects: 100% (450/450), done.
127 Writing objects: 100% (1148/1148), done.
128 Total 1148 (delta 690), reused 1148 (delta 690)
129 [ perf record: Captured and wrote 0.963 MB perf.data (~42069 samples) ]
131 titan:~/git> perf report -g
134 # Overhead Command Shared Object
135 # ........ ............... ..........................
137 23.25% git-repack /lib64/libc-2.5.so
139 |--50.00%-- _int_free
141 |--37.50%-- __GI___fork
144 |--12.50%-- ptmalloc_unlock_all2
147 --6.25%-- __GI_strcpy
148 21.61% git /lib64/libc-2.5.so
150 |--30.00%-- __GI_read
152 | --83.33%-- git_config_from_file
157 Or you can observe the whole system's page allocations for 10
160 titan:~/git> perf stat -a -e kmem:mm_page_pcpu_drain -e
161 kmem:mm_page_alloc -e kmem:mm_page_free_batched -e
162 kmem:mm_page_free sleep 10
164 Performance counter stats for 'sleep 10':
166 171585 kmem:mm_page_pcpu_drain
167 322114 kmem:mm_page_alloc
168 73623 kmem:mm_page_free_batched
169 254115 kmem:mm_page_free
171 10.000591410 seconds time elapsed
173 Or observe how fluctuating the page allocations are, via statistical
174 analysis done over ten 1-second intervals:
176 titan:~/git> perf stat --repeat 10 -a -e kmem:mm_page_pcpu_drain -e
177 kmem:mm_page_alloc -e kmem:mm_page_free_batched -e
178 kmem:mm_page_free sleep 1
180 Performance counter stats for 'sleep 1' (10 runs):
182 17254 kmem:mm_page_pcpu_drain ( +- 3.709% )
183 34394 kmem:mm_page_alloc ( +- 4.617% )
184 7509 kmem:mm_page_free_batched ( +- 4.820% )
185 25653 kmem:mm_page_free ( +- 3.672% )
187 1.058135029 seconds time elapsed ( +- 3.089% )
189 Or you can annotate the recorded 'git gc' run on a per symbol basis
190 and check which instructions/source-code generated page allocations:
192 titan:~/git> perf annotate __GI___fork
193 ------------------------------------------------
194 Percent | Source code & Disassembly of libc-2.5.so
195 ------------------------------------------------
198 : Disassembly of section .plt:
199 : Disassembly of section .text:
201 : 00000031a2e95560 <__fork>:
203 0.00 : 31a2e95602: b8 38 00 00 00 mov $0x38,%eax
204 0.00 : 31a2e95607: 0f 05 syscall
205 83.42 : 31a2e95609: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
206 0.00 : 31a2e9560f: 0f 87 4d 01 00 00 ja 31a2e95762 <__fork+0x202>
207 0.00 : 31a2e95615: 85 c0 test %eax,%eax
209 ( this shows that 83.42% of __GI___fork's page allocations come from
210 the 0x38 system call it performs. )
212 etc. etc. - a lot more is possible. I could list a dozen of
213 other different usecases straight away - neither of which is
214 possible via /proc/vmstat.
216 /proc/vmstat is not in the same league really, in terms of
217 expressive power of system analysis and performance
220 All that the above results needed were those new tracepoints
221 in include/tracing/events/kmem.h.