<title>pprof and Remote Servers</title>
<h1><code>pprof</code> and Remote Servers</h1>
<p>In mid-2006, we added an experimental facility to
<A HREF="cpu_profiler.html">pprof</A>, the tool that analyzes CPU and
heap profiles. This facility lets you collect profile information from
running applications without having to stop the program first, and
without having to log into the machine where the application is
running. It is meant to be used on webservers, but will work on any
application that can be modified to accept TCP connections on a port
of its choosing, and to respond to HTTP requests on that port.</p>
<p>We do not currently have infrastructure, such as Apache modules,
that you can drop into a webserver or other application to get the
necessary functionality "for free." However, it's easy to generate
the necessary data, which should allow interested developers to add
the necessary support to their applications.</p>
<p>To use <code>pprof</code> in this experimental "server" mode, you
give the script a host and port it should query, replacing the normal
command-line arguments of application + profile file:</p>
<pre>
   % pprof internalweb.mycompany.com:80
</pre>
<p>The host must be listening on that port, and be able to accept HTTP/1.0
requests (sent via <code>wget</code> and <code>curl</code>) for
several urls. The following sections list the urls that
<code>pprof</code> can send, and the responses it expects in return.</p>
<p>Here are examples that <code>pprof</code> will recognize as urls when you
give them on the command line. In general, you
specify the host and a port (the port number is required), and put
the service name at the end of the url:</p>
<pre>
http://myhost:80/pprof/heap            # retrieves a heap profile
http://myhost:8008/pprof/profile       # retrieves a CPU profile
http://myhost:80                       # retrieves a CPU profile (the default)
http://myhost:8080/                    # retrieves a CPU profile (the default)
myhost:8088/pprof/growth               # "http://" is optional, but port is not
http://myhost:80/myservice/pprof/heap  # /pprof/heap just has to come at the end
http://myhost:80/pprof/pmuprofile      # CPU profile using performance counters
</pre>
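<p>The rules illustrated by these examples (optional <code>http://</code>,
mandatory port, CPU profile as the default service) can be sketched in a few
lines of C++. This is only an illustration of the rules as stated here, not the
parsing code <code>pprof</code> itself uses; the <code>Target</code> struct and
<code>ParseTarget</code> function are hypothetical names.</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical holder for the pieces of a server-mode target.
struct Target {
  std::string host;
  int port = 0;
  std::string path;  // for example "/pprof/heap"
};

// Splits a target such as "myhost:8088/pprof/growth" into host, port, and
// path. Returns false when the required port is missing or not a valid number.
bool ParseTarget(std::string target, Target* out) {
  const std::string scheme = "http://";
  if (target.compare(0, scheme.size(), scheme) == 0) {
    target.erase(0, scheme.size());  // "http://" is optional
  }
  const std::size_t slash = target.find('/');
  const std::string authority = target.substr(0, slash);
  std::string path =
      (slash == std::string::npos) ? std::string() : target.substr(slash);

  const std::size_t colon = authority.find(':');
  if (colon == std::string::npos || colon == 0) return false;  // port required
  const std::string digits = authority.substr(colon + 1);
  if (digits.empty() || digits.size() > 5 ||
      digits.find_first_not_of("0123456789") != std::string::npos) {
    return false;
  }
  const int port = std::stoi(digits);
  if (port > 65535) return false;

  // With no service name given, the CPU profile is the default.
  if (path.empty() || path == "/") path = "/pprof/profile";
  out->host = authority.substr(0, colon);
  out->port = port;
  out->path = path;
  return true;
}
```

<p>A path such as <code>/myservice/pprof/heap</code> is passed through
unchanged, since the service name only has to come at the end of the url.</p>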
<h2> <code><b>/pprof/heap</b></code> </h2>
<p><code>pprof</code> asks for the url <code>/pprof/heap</code> to
get heap information. The actual url is controlled via the variable
<code>HEAP_PAGE</code> in the <code>pprof</code> script, so you
can change it if you'd like.</p>
<p>There are two ways to get this data. The first is to call</p>
<pre>
    MallocExtension::instance()->GetHeapSample(&output);
</pre>
<p>and have the server send <code>output</code> back as an HTTP
response to <code>pprof</code>. <code>MallocExtension</code> is
defined in the header file <code>gperftools/malloc_extension.h</code>.</p>
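<p>Sending a payload "back as an HTTP response" can be as small as the sketch
below, which wraps a string in a minimal HTTP/1.0 response. The function name
<code>MakeHttpResponse</code> is hypothetical, and the
<code>Content-Type</code> value is an assumption, since this page does not say
which headers <code>pprof</code> expects.</p>

```cpp
#include <string>

// Wraps a profile payload in a minimal HTTP/1.0 response.
// Assumption: the Content-Type below is a guess; this page does not name one.
std::string MakeHttpResponse(const std::string& body) {
  std::string response = "HTTP/1.0 200 OK\r\n";
  response += "Content-Type: application/octet-stream\r\n";
  response += "Content-Length: " + std::to_string(body.size()) + "\r\n";
  response += "\r\n";  // a blank line ends the headers
  response += body;    // the payload is copied byte for byte
  return response;
}
```

<p>The same helper could serve every url described on this page, because each
one returns a single block of text or binary data.</p>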
<p>Note that this will only work if the binary is being run with
sampling turned on (which is not the default). To do this, set the
environment variable <code>TCMALLOC_SAMPLE_PARAMETER</code> to a
positive value, such as 524288, before running.</p>
<p>The other way is to call <code>HeapProfilerStart(filename)</code>
(from <code>gperftools/heap-profiler.h</code>), continue to do work, and then,
some number of seconds later, call <code>GetHeapProfile()</code>
(followed by <code>HeapProfilerStop()</code>). The server can send
the output of <code>GetHeapProfile</code> back as the HTTP response to
<code>pprof</code>. (Note that you must <code>free()</code> this data after using it.)
This is similar to how <A HREF="#profile">profile requests</A> are
handled, below. This technique does not require the application to
run with sampling turned on.</p>
<p>Here's an example of what the output should look like:</p>
<pre>
heap profile:   1923: 127923432 [  1923: 127923432] @ heap_v2/524288
     1:       312 [     1:       312] @ 0x2aaaabaf5ccc 0x2aaaaba4cd2c 0x2aaaac08c09a
   928: 122586016 [   928: 122586016] @ 0x2aaaabaf682c 0x400680 0x400bdd 0x2aaaab1c368a 0x2aaaab1c8f77 0x2aaaab1c0396 0x2aaaab1c86ed 0x4007ff 0x2aaaaca62afa
     1:        16 [     1:        16] @ 0x2aaaabaf5ccc 0x2aaaabb04bac 0x2aaaabc1b262 0x2aaaabc21496 0x2aaaabc214bb
</pre>
<p>Older code may produce "version 1" heap profiles, which look like this:</p>
<pre>
heap profile:  14933: 791700132 [ 14933: 791700132] @ heap
     1:    848688 [     1:    848688] @ 0xa4b142 0x7f5bfc 0x87065e 0x4056e9 0x4125f8 0x42b4f1 0x45b1ba 0x463248 0x460871 0x45cb7c 0x5f1744 0x607cee 0x5f4a5e 0x40080f 0x2aaaabad7afa
     1:   1048576 [     1:   1048576] @ 0xa4a9b2 0x7fd025 0x4ca6d8 0x4ca814 0x4caa88 0x2aaaab104cf0 0x404e20 0x4125f8 0x42b4f1 0x45b1ba 0x463248 0x460871 0x45cb7c 0x5f1744 0x607cee 0x5f4a5e 0x40080f 0x2aaaabad7afa
  2942: 388629374 [  2942: 388629374] @ 0xa4b142 0x4006a0 0x400bed 0x5f0cfa 0x5f1744 0x607cee 0x5f4a5e 0x40080f 0x2aaaabad7afa
</pre>
<p><code>pprof</code> accepts both old and new heap profiles and automatically
detects which one you are using.</p>
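<p>In the two examples above, the formats differ in the marker that follows
<code>@</code> on the first line: <code>heap_v2/524288</code> versus
<code>heap</code>. A server author who wants to sanity-check output before
sending it could classify a header line roughly as follows. This is a sketch
based only on those examples, not the detection logic in the
<code>pprof</code> script; <code>HeapProfileKind</code> and
<code>ClassifyHeapProfile</code> are hypothetical names.</p>

```cpp
#include <cstddef>
#include <string>

enum class HeapProfileKind { kUnknown, kVersion1, kVersion2 };

// Classifies a heap profile by the marker that follows '@' on its first line.
HeapProfileKind ClassifyHeapProfile(const std::string& header) {
  const std::string prefix = "heap profile:";
  if (header.compare(0, prefix.size(), prefix) != 0) {
    return HeapProfileKind::kUnknown;
  }
  const std::size_t at = header.rfind('@');
  if (at == std::string::npos) return HeapProfileKind::kUnknown;

  // The marker is the first whitespace-delimited token after '@'.
  const std::size_t begin = header.find_first_not_of(" \t", at + 1);
  if (begin == std::string::npos) return HeapProfileKind::kUnknown;
  const std::size_t end = header.find_first_of(" \t\r\n", begin);
  const std::string marker = header.substr(
      begin, end == std::string::npos ? std::string::npos : end - begin);

  if (marker == "heap") return HeapProfileKind::kVersion1;
  if (marker.compare(0, 8, "heap_v2/") == 0) return HeapProfileKind::kVersion2;
  return HeapProfileKind::kUnknown;
}
```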
<h2> <code><b>/pprof/growth</b></code> </h2>
<p><code>pprof</code> asks for the url <code>/pprof/growth</code> to
get heap-profiling delta (growth) information. The actual url is
controlled via the variable <code>GROWTH_PAGE</code> in the
<code>pprof</code> script, so you can change it if you'd like.</p>
<p>The server should respond by calling</p>
<pre>
    MallocExtension::instance()->GetHeapGrowthStacks(&output);
</pre>
<p>and sending <code>output</code> back as an HTTP response to
<code>pprof</code>. <code>MallocExtension</code> is defined in the
header file <code>gperftools/malloc_extension.h</code>.</p>
<p>Here's an example, from an actual Google webserver, of what the
output should look like:</p>
<pre>
heap profile:    741: 812122112 [   741: 812122112] @ growth
     1:   1572864 [     1:   1572864] @ 0x87da564 0x87db8a3 0x84787a4 0x846e851 0x836d12f 0x834cd1c 0x8349ba5 0x10a3177 0x8349961
     1:   1048576 [     1:   1048576] @ 0x87d92e8 0x87d9213 0x87d9178 0x87d94d3 0x87da9da 0x8a364ff 0x8a437e7 0x8ab7d23 0x8ab7da9 0x8ac7454 0x8348465 0x10a3161 0x8349961
</pre>
<h2> <A NAME="profile"><code><b>/pprof/profile</b></code></A> </h2>
<p><code>pprof</code> asks for the url
<code>/pprof/profile?seconds=XX</code> to get cpu-profiling
information. The actual url is controlled via the variable
<code>PROFILE_PAGE</code> in the <code>pprof</code> script, so you can
change it if you'd like.</p>
<p>The server should respond by calling
<code>ProfilerStart(filename)</code>, continuing to do its work, and
then, XX seconds later, calling <code>ProfilerStop()</code>. (These
functions are declared in <code>gperftools/profiler.h</code>.) The
application is responsible for picking a unique filename for
<code>ProfilerStart()</code>. After calling
<code>ProfilerStop()</code>, the server should read the contents of
<code>filename</code> and send them back as an HTTP response to
<code>pprof</code>.</p>
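<p>Before it can start the profiler, the server has to pull the XX value out of
the request. A minimal sketch of that step follows; the function name
<code>ParseSeconds</code> and the idea of a caller-supplied fallback are
assumptions made for illustration, since this page does not say what to do
when <code>seconds</code> is missing or malformed.</p>

```cpp
#include <cstddef>
#include <string>

// Extracts the "seconds" value from a request target such as
// "/pprof/profile?seconds=30". Returns `fallback` (a hypothetical default
// chosen by the caller) when the parameter is absent or not a positive integer.
int ParseSeconds(const std::string& request_target, int fallback) {
  const std::size_t query = request_target.find('?');
  if (query == std::string::npos) return fallback;

  const std::string key = "seconds=";
  std::size_t pos = query + 1;
  while (pos < request_target.size()) {
    std::size_t amp = request_target.find('&', pos);
    if (amp == std::string::npos) amp = request_target.size();
    const std::string pair = request_target.substr(pos, amp - pos);
    if (pair.compare(0, key.size(), key) == 0) {
      const std::string digits = pair.substr(key.size());
      // Accept only short, all-digit values so std::stoi cannot overflow.
      if (!digits.empty() && digits.size() <= 6 &&
          digits.find_first_not_of("0123456789") == std::string::npos) {
        const int value = std::stoi(digits);
        if (value > 0) return value;
      }
      return fallback;
    }
    pos = amp + 1;
  }
  return fallback;
}
```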
<p>To get useful profile information, the application must
continue to run during the XX seconds that the profiler is running. Thus,
the profile start-stop calls should be done in a separate thread, or
be otherwise non-blocking.</p>
<p>The profiler output file is binary, but near the end of it, it
should have lines of text somewhat like this:</p>
<pre>
01016000-01017000 rw-p 00015000 03:01 59314      /lib/ld-2.2.2.so
</pre>
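<p>The sample line has the same shape as an entry in Linux's
<code>/proc/self/maps</code>: an address range, permissions, an offset, a
device, an inode, and a path. When debugging a server's output, a quick check
that such lines are present and well formed can be useful. The sketch below
parses one line; the <code>Mapping</code> struct and
<code>ParseMappingLine</code> are hypothetical helpers, not part of
perftools.</p>

```cpp
#include <cstdio>
#include <string>

// Hypothetical holder for the fields of one mapping line.
struct Mapping {
  unsigned long long start = 0;
  unsigned long long end = 0;
  unsigned long long offset = 0;
  std::string perms;
  std::string path;  // empty when the line names no file
};

// Parses a line shaped like
//   "01016000-01017000 rw-p 00015000 03:01 59314 /lib/ld-2.2.2.so".
// Returns false if the seven leading fields are not all present.
bool ParseMappingLine(const std::string& line, Mapping* out) {
  unsigned long long start = 0, end = 0, offset = 0, inode = 0;
  unsigned int dev_major = 0, dev_minor = 0;
  char perms[5] = {0};
  char path[4096] = {0};
  const int fields = std::sscanf(
      line.c_str(), "%llx-%llx %4s %llx %x:%x %llu %4095s", &start, &end,
      perms, &offset, &dev_major, &dev_minor, &inode, path);
  if (fields < 7) return false;  // the path is optional, the rest is not
  out->start = start;
  out->end = end;
  out->offset = offset;
  out->perms = perms;
  out->path = path;
  return true;
}
```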
<h2> <code><b>/pprof/pmuprofile</b></code> </h2>
<p><code>pprof</code> asks for a url of the form
<code>/pprof/pmuprofile?event=hw_event:unit_mask&period=nnn&seconds=xxx</code>
to get cpu-profiling information. The actual url is controlled via the variable
<code>PMUPROFILE_PAGE</code> in the <code>pprof</code> script, so you can
change it if you'd like.</p>
<p>This is similar to <code>/pprof/profile</code>, but is meant to be used with your CPU's hardware
performance counters. The server could be implemented on top of a library
such as <a href="http://perfmon2.sourceforge.net/"><code>libpfm</code></a>.
It should collect a sample every nnn occurrences
of the event and stop the sampling after xxx seconds. Much of the code
for <code>/pprof/profile</code> can be reused for this purpose.</p>
<p>The server-side routines (the equivalent of
ProfilerStart/ProfilerStop) are not available as part of perftools,
so this URL is unlikely to be very useful.</p>
<h2> <code><b>/pprof/contention</b></code> </h2>
<p>This is intended to profile (thread) lock contention in
addition to CPU and memory use. It's not yet usable.</p>
<h2> <code><b>/pprof/cmdline</b></code> </h2>
<p><code>pprof</code> asks for the url <code>/pprof/cmdline</code> to
figure out what application it's profiling. The actual url is
controlled via the variable <code>PROGRAM_NAME_PAGE</code> in the
<code>pprof</code> script, so you can change it if you'd like.</p>
<p>The server should respond by reading the contents of
<code>/proc/self/cmdline</code>, converting all internal NUL (\0)
characters to newlines, and sending the result back as an HTTP
response to <code>pprof</code>.</p>
<p>Here's an example return value:</p>
<pre>
/root/server/custom_webserver
--configfile=/root/server/ws.config
</pre>
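<p>The conversion itself is small. The sketch below reads a file into a string
and turns internal NULs into newlines; it drops trailing NULs rather than
converting them, which is one reading of "internal" and should be treated as
an assumption. <code>ReadFileBytes</code> and <code>FormatCmdline</code> are
hypothetical helper names.</p>

```cpp
#include <algorithm>
#include <fstream>
#include <sstream>
#include <string>

// Reads a whole file, NUL bytes included, into a string.
std::string ReadFileBytes(const char* path) {
  std::ifstream in(path, std::ios::binary);
  std::ostringstream buffer;
  buffer << in.rdbuf();
  return buffer.str();
}

// Converts the raw bytes of /proc/self/cmdline into newline-separated text.
// Assumption: trailing NULs are dropped; only internal NULs become '\n'.
std::string FormatCmdline(std::string raw) {
  while (!raw.empty() && raw.back() == '\0') raw.pop_back();
  std::replace(raw.begin(), raw.end(), '\0', '\n');
  return raw;
}
```

<p>On Linux, a handler for this url could then return
<code>FormatCmdline(ReadFileBytes("/proc/self/cmdline"))</code> as the
response body.</p>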
<h2> <code><b>/pprof/symbol</b></code> </h2>
<p><code>pprof</code> asks for the url <code>/pprof/symbol</code> to
map from hex addresses to function names. The actual url is
controlled via the variable <code>SYMBOL_PAGE</code> in the
<code>pprof</code> script, so you can change it if you'd like.</p>
<p>When the server receives a GET request for
<code>/pprof/symbol</code>, it should return a line formatted like
<p>where <code>###</code> is the number of symbols found in the
binary. (For now, the only important distinction is whether the value
is 0, which it is for executables that lack debug information, or
nonzero.)</p>
<p>This is perhaps the hardest request to write code for, because in
addition to the GET request for this url, the server must accept POST
requests. This means that after the HTTP headers, <code>pprof</code> will pass in
a list of hex addresses connected by <code>+</code>, like so:</p>
<pre>
   curl -d '0x0824d061+0x0824d1cf' http://remote_host:80/pprof/symbol
</pre>
<p>The server should read the POST data, which will be in one line,
and for each hex value, should write one line of output to the output
stream, in this form:</p>
<pre>
&lt;hex address&gt;&lt;tab&gt;&lt;function name&gt;
</pre>
<p>The other reason this is the most difficult request to implement
is that the application will have to figure out for itself how to map
from address to function name. One possibility is to run <code>nm -C
-n &lt;program name&gt;</code> to get the mappings at
program-compile-time. Another, at least on Linux, is to call out to
<code>addr2line</code> for every <code>/pprof/symbol</code> call, for instance
<code>addr2line -Cfse /proc/&lt;getpid&gt;/exe 0x12345678 0x876543210</code>
(presumably with some caching!).</p>
<p><code>pprof</code> itself does just this for local profiles (not
ones that talk to remote servers); look at the subroutine
<code>GetProcedureBoundaries</code>.</p>
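<p>Setting the address-to-name lookup aside, the POST half of this request
reduces to splitting the body on <code>+</code> and emitting one
tab-separated line per address. The sketch below takes the lookup as a
callback so that any strategy (an <code>nm</code> table, cached
<code>addr2line</code> results) can be plugged in. The names
<code>SymbolResolver</code> and <code>BuildSymbolResponse</code> are
hypothetical, and skipping unparsable tokens is an assumption, since this page
does not say how to handle them.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical resolver type: maps an address to a function name.
using SymbolResolver = std::function<std::string(std::uint64_t)>;

// Builds the response body for a POST to /pprof/symbol: one
// "<hex address><tab><function name>" line per '+'-separated address.
std::string BuildSymbolResponse(const std::string& post_data,
                                const SymbolResolver& resolve) {
  std::string response;
  std::istringstream stream(post_data);
  std::string token;
  while (std::getline(stream, token, '+')) {
    // Trim a trailing newline or spaces left over from the request body.
    const std::size_t last = token.find_last_not_of(" \r\n");
    if (last == std::string::npos) continue;
    token.erase(last + 1);

    std::uint64_t address = 0;
    try {
      address = std::stoull(token, nullptr, 16);  // accepts a "0x" prefix
    } catch (const std::exception&) {
      continue;  // assumption: silently skip tokens that are not hex numbers
    }
    response += token + "\t" + resolve(address) + "\n";
  }
  return response;
}
```

<p>Keeping the resolver separate also makes it easy to add the caching that
the <code>addr2line</code> approach calls for.</p>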
Last modified: Mon Jun 12 21:30:14 PDT 2006