3 Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules
5 [published in SIGPLAN Notices 23,8 (Aug. 1988), 49-62]
10 [now: Siemens AG, AUT E 51]
18 1. Why a Version 2 of Dhrystone?
20 The Dhrystone benchmark program [1] has become a popular benchmark for
21 CPU/compiler performance measurement, in particular in the area of
22 minicomputers, workstations, PCs and microprocessors. It apparently satisfies
23 a need for an easy-to-use integer benchmark; it gives a first performance
24 indication which is more meaningful than MIPS numbers which, in their literal
25 meaning (million instructions per second), cannot be used across different
26 instruction sets (e.g. RISC vs. CISC). With the increasing use of the
27 benchmark, it seems necessary to reconsider the benchmark and to check whether
28 it can still fulfill this function. Version 2 of Dhrystone is the result of
29 such a re-evaluation; it has been made for two reasons:
31 o Dhrystone has been published in Ada [1], and versions in Ada, Pascal and C
32 have been distributed by Reinhold Weicker via floppy disk. However, the
33 version that was used most often for benchmarking has been the version made
34 by Rick Richardson by another translation from the Ada version into the C
35 programming language; this has been the version distributed via the UNIX
38 There is an obvious need for a common C version of Dhrystone, since C is at
39 present the most popular system programming language for the class of
40 systems (microcomputers, minicomputers, workstations) where Dhrystone is
41 used most. There should be, as far as possible, only one C version of
42 Dhrystone such that results can be compared without restrictions. In the
43 past, the C versions distributed by Rick Richardson (Version 1.1) and by
44 Reinhold Weicker had small (though not significant) differences.
46 Together with the new C version, the Ada and Pascal versions have been
49 o As far as it is possible without changes to the Dhrystone statistics,
50 optimizing compilers should be prevented from removing significant
51 statements. It has turned out in the past that optimizing compilers
52 suppressed code generation for too many statements (by "dead code removal"
53 or "dead variable elimination"). This has led to the danger that
54 benchmarking results obtained by a naive application of Dhrystone - without
55 inspection of the code that was generated - could become meaningless.
57 The overall policy for version 2 has been that the distribution of
58 statements, operand types and operand locality described in [1] should remain
59 unchanged as much as possible. (Very few changes were necessary; their impact
60 should be negligible.) Also, the order of statements should remain unchanged.
61 Although I am aware of some critical remarks on the benchmark - I agree with
62 several of them - and know some suggestions for improvement, I didn't want to
63 change the benchmark into something different from what has become known as
64 "Dhrystone"; the confusion generated by such a change would probably outweigh
65 the benefits. If I were to write a new benchmark program, I wouldn't give it
66 the name "Dhrystone" since this denotes the program published in [1].
67 However, I do recognize the need for a larger number of representative
68 programs that can be used as benchmarks; users should always be encouraged to
69 use more than just one benchmark.
71 The new versions (version 2.1 for C, Pascal and Ada) will be distributed as
72 widely as possible. (Version 2.1 differs from version 2.0 distributed via the
73 UNIX Network Usenet in March 1988 only in a few corrections for minor
74 deficiencies found by users of version 2.0.) Readers who want to use the
75 benchmark for their own measurements can obtain a copy in machine-readable
76 form on floppy disk (MS-DOS or XENIX format) from the author.
79 2. Overall Characteristics of Version 2
81 In general, version 2 follows - in the parts that are significant for
82 performance measurement, i.e. within the measurement loop - the published
83 (Ada) version and the C versions previously distributed. Where the versions
84 distributed by Rick Richardson [2] and Reinhold Weicker have been different,
85 it follows the version distributed by Reinhold Weicker. (However, the
86 differences have been so small that their impact on execution time in all
87 likelihood has been negligible.) The initialization and UNIX instrumentation
88 part - which had been omitted in [1] - follows mostly the ideas of Rick
89 Richardson [2]. However, any changes in the initialization part and in the
90 printing of the result have no impact on performance measurement since they
91 are outside the measurement loop. As a concession to older compilers, names
92 have been made unique within the first 8 characters for the C version.
94 The original publication of Dhrystone did not contain any statements for time
95 measurement since they are necessarily system-dependent. However, it turned
96 out that it is not enough just to enclose the main procedure of Dhrystone in a
97 loop and to measure the execution time. If the variables that are computed
98 are not used somehow, there is the danger that the compiler considers them as
99 "dead variables" and suppresses code generation for a part of the statements.
100 Therefore in version 2 all variables of "main" are printed at the end of the
101 program. This also permits some plausibility control for correct execution of
104 At several places in the benchmark, code has been added, but only in branches
105 that are not executed. The intention is that optimizing compilers should be
106 prevented from moving code out of the measurement loop, or from removing code
107 altogether. Statements that are executed have been changed in very few places
108 only. In these cases, only the role of some operands has been changed, and it
109 was made sure that the numbers defining the "Dhrystone distribution"
110 (distribution of statements, operand types and locality) still hold as much as
111 possible. Except for sophisticated optimizing compilers, execution times for
112 version 2.1 should be the same as for previous versions.
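
The two techniques described above (using the computed variables after the loop, and adding assignments in branches that are never taken) can be illustrated by a minimal sketch. It is not taken from the Dhrystone source: the names run_loop, print_results, int_glob and int_loc are hypothetical, and whether a particular compiler is actually deterred depends on how much it can prove about the branch condition.

```c
#include <stdio.h>

/* Hypothetical global variable, standing in for something like Int_Glob. */
int int_glob = 0;

/* Toy measurement loop; returns the final value of the local variable. */
int run_loop(int number_of_runs)
{
    int int_loc = 0;
    int run_index;

    for (run_index = 1; run_index <= number_of_runs; ++run_index) {
        int_loc = 3;            /* loop-invariant if the branch below is absent */
        int_glob = int_loc + 2; /* always 5 */
        if (int_glob != 5) {
            /* Never executed at run time. The assignment makes int_loc
               formally dependent on the loop index, which is meant to
               discourage hoisting "int_loc = 3" out of the loop. A compiler
               that proves the condition false can still remove the branch. */
            int_loc = run_index;
        }
    }
    return int_loc;
}

/* Printing the results uses the variables, so they are not "dead", and it
   gives a plausibility check on correct execution. */
void print_results(int int_loc)
{
    printf("int_loc:  %d (should be 3)\n", int_loc);
    printf("int_glob: %d (should be 5)\n", int_glob);
}
```

A caller would invoke run_loop with the desired number of runs between two timer readings and pass the returned value to print_results after the second reading, so that the printing stays outside the measured interval.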
114 Because of the self-imposed limitation that the order and distribution of the
115 executed statements should not be changed, there are still cases where
116 optimizing compilers may not generate code for some statements. To a certain
117 degree, this is unavoidable for small synthetic benchmarks. Users of the
118 benchmark are advised to check code listings to verify that code is generated for all
119 statements of Dhrystone.
121 Contrary to the suggestion in the published paper and its realization in the
122 versions previously distributed, no attempt has been made to subtract the time
123 for the measurement loop overhead. (This calculation has proven difficult to
124 implement in a correct way, and its omission makes the program simpler.)
125 However, since the loop check is now part of the benchmark, this does have an
126 impact - though a very minor one - on the distribution statistics which have
127 been updated for this version.
130 3. Discussion of Individual Changes
132 In this section, all changes are described that affect the measurement loop
133 and that are not just renamings of variables. All remarks refer to the C
134 version; the other language versions have been updated similarly.
136 In addition to adding the measurement loop and the printout statements,
137 changes have been made at the following places:
139 o In procedure "main", three statements have been added in the non-executed
140 "then" part of the statement
142 if (Enum_Loc == Func_1 (Ch_Index, 'C'))
146 strcpy (Str_2_Loc, "DHRYSTONE PROGRAM, 3'RD STRING");
147 Int_2_Loc = Run_Index;
148 Int_Glob = Run_Index;
150 The string assignment prevents movement of the preceding assignment to
151 Str_2_Loc (5'th statement of "main") out of the measurement loop. (This
152 probably will not happen for the C version, but it did happen with another
153 language and compiler.) The assignment to Int_2_Loc prevents value
154 propagation for Int_2_Loc, and the assignment to Int_Glob makes the value of
155 Int_Glob possibly dependent on the value of Run_Index.
157 o In the three arithmetic computations at the end of the measurement loop in
158 "main", the role of some variables has been exchanged, to prevent the
159 division from just cancelling out the multiplication as it was in [1]. A
160 very smart compiler might have recognized this and suppressed code
161 generation for the division.
163 o For Proc_2, no code has been changed, but the values of the actual parameter
164 have changed due to changes in "main".
166 o In Proc_4, the second assignment has been changed from
168 Bool_Loc = Bool_Loc | Bool_Glob;
170 to
172 Bool_Glob = Bool_Loc | Bool_Glob;
174 It now assigns a value to a global variable instead of a local variable
175 (Bool_Loc); Bool_Loc would be a "dead variable" which is not used
178 o In Func_1, the statement
180 Ch_1_Glob = Ch_1_Loc;
182 was added in the non-executed "else" part of the "if" statement, to prevent
183 the suppression of code generation for the assignment to Ch_1_Loc.
185 o In Func_2, the second character comparison statement has been changed to
189 ('R' instead of 'X') because a comparison with 'X' is implied in the
190 preceding "if" statement.
192 Also in Func_2, the statement
196 has been added in the non-executed part of the last "if" statement, in order
197 to prevent Int_Loc from becoming a dead variable.
199 o In Func_3, a non-executed "else" part has been added to the "if" statement.
200 While the program would not be incorrect without this "else" part, it is
201 considered bad programming practice if a function can be left without a
204 To compensate for this change, the (non-executed) "else" part in the "if"
205 statement of Proc_3 was removed.
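
Two of the changes listed above, the exchanged operand roles in the arithmetic computations and the changed assignment target in Proc_4, can be sketched as follows. This is an illustration under stated assumptions, not the benchmark code: the function names and the lower-case variables are hypothetical, and the arithmetic only mirrors the pattern (a multiplication followed by a division) rather than the actual statements of "main".

```c
/* Hypothetical global flag, modeled on Bool_Glob. */
int bool_glob = 0;

/* Cancelling pattern: multiply by c, then divide by c. The result equals
   the original b (assuming c != 0 and no overflow), so a very smart
   compiler could suppress the division. */
int cancelling(int b, int c)
{
    int a = c * b;
    b = a / c;
    return b;
}

/* Exchanged roles: the divisor is no longer one of the multiplied
   operands, so the division cannot be folded away algebraically. */
int non_cancelling(int a, int b, int c)
{
    a = a * b;
    b = a / c;
    return b;
}

/* Old Proc_4 pattern: the result goes to a local that is never read
   again, so the assignment is a dead store. */
void assign_local(char ch)
{
    int bool_loc = (ch == 'A');
    bool_loc = bool_loc | bool_glob;
    (void) bool_loc; /* silences unused-variable warnings; generates no code */
}

/* New Proc_4 pattern: the result goes to a global, which other code may
   read, so the compiler has to keep the assignment. */
void assign_global(char ch)
{
    int bool_loc = (ch == 'A');
    bool_glob = bool_loc | bool_glob;
}
```

For example, cancelling(5, 7) gives back 5, whereas non_cancelling(3, 5, 7) computes 15 / 7; only assign_global leaves an observable effect in bool_glob.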
207 The distribution statistics have been changed only by the addition of the
208 measurement loop iteration (1 additional statement, 4 additional local integer
209 operands) and by the change in Proc_4 (one operand changed from local to
210 global). The distribution statistics in the comment headers have been updated
216 The string operations (string assignment and string comparison) have not been
217 changed, to keep the program consistent with the original version.
219 There has been some concern that the string operations are over-represented in
220 the program, and that execution time is dominated by these operations. This
221 was true in particular when optimizing compilers removed too much code in the
222 main part of the program; this should have been mitigated in version 2.
224 It should be noted that this is a language-dependent issue: Dhrystone was
225 first published in Ada, and with Ada or Pascal semantics, the time spent in
226 the string operations is, at least in all implementations known to me,
227 considerably smaller. In Ada and Pascal, assignment and comparison of strings
228 are operators defined in the language, and the upper bounds of the strings
229 occurring in Dhrystone are part of the type information known at compilation
230 time. The compilers can therefore generate efficient inline code. In C,
231 string assignment and comparisons are not part of the language, so the string
232 operations must be expressed in terms of the C library functions "strcpy" and
233 "strcmp". (ANSI C allows an implementation to use inline code for these
234 functions.) In addition to the overhead caused by additional function calls,
235 these functions are defined for null-terminated strings where the length of
236 the strings is not known at compilation time; the function has to check every
237 byte for the termination condition (the null byte).
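
The difference between the two styles of comparison can be made concrete with a small sketch. The function names and the STR_LEN constant are hypothetical, and the two strings used in the usage note are assumptions patterned after the string literal quoted in section 3; counting_strcmp is a hand-written stand-in for a simple "strcmp", not the code of any particular C library.

```c
#include <stddef.h>
#include <string.h>

#define STR_LEN 30 /* assumed fixed string length, excluding the null byte */

/* Byte-wise comparison in the style of strcmp: every position is checked
   both for a mismatch and for the terminating null byte. The number of
   character positions examined is stored in *scanned. */
int counting_strcmp(const char *s1, const char *s2, size_t *scanned)
{
    size_t i = 0;

    for (;;) {
        unsigned char c1 = (unsigned char) s1[i];
        unsigned char c2 = (unsigned char) s2[i];
        ++i;
        if (c1 != c2 || c1 == '\0') {
            *scanned = i;
            return (int) c1 - (int) c2;
        }
    }
}

/* Fixed-length comparison, closer to what Ada or Pascal semantics allow:
   the length is a compile-time constant and no terminator check is needed. */
int fixed_compare(const char *s1, const char *s2)
{
    return memcmp(s1, s2, STR_LEN);
}
```

Called with "DHRYSTONE PROGRAM, 1'ST STRING" and "DHRYSTONE PROGRAM, 2'ND STRING", counting_strcmp examines 20 character positions before it finds the first difference, and both functions report the first string as the smaller one.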
239 Obviously, a C library which includes efficiently coded "strcpy" and "strcmp"
240 functions helps to obtain good Dhrystone results. However, I don't think that
241 this is unfair since string functions do occur quite frequently in real
242 programs (editors, command interpreters, etc.). If the string functions are
243 implemented efficiently, this helps real programs as well as benchmark
246 I admit that the string comparison in Dhrystone terminates later (after
247 scanning 20 characters) than most string comparisons in real programs. For
248 consistency with the original benchmark, I didn't change the program despite
252 5. Intended Use of Dhrystone
254 When Dhrystone is used, the following "ground rules" apply:
256 o Separate compilation (Ada and C versions)
258 As mentioned in [1], Dhrystone was written to reflect actual programming
259 practice in systems programming. The division into several compilation
260 units (5 in the Ada version, 2 in the C version) is intended, as is the
261 distribution of inter-module and intra-module subprogram calls. Although on
262 many systems there will be no difference in execution time to a Dhrystone
263 version where all compilation units are merged into one file, the rule is
264 that separate compilation should be used. The intention is that real
265 programming practice, where programs consist of several independently
266 compiled units, should be reflected. This also implies that the
267 compiler, while compiling one unit, has no information about the use of
268 variables, register allocation etc. occurring in other compilation units.
269 Although in real life compilation units will probably be larger, the
270 intention is that these effects of separate compilation are modeled in
273 A few language systems have post-linkage optimization available (e.g., final
274 register allocation is performed after linkage). This is a borderline case:
275 Post-linkage optimization involves additional program preparation time
276 (although not as much as compilation in one unit) which may prevent its
277 general use in practical programming. I think that since it defeats the
278 intentions given above, it should not be used for Dhrystone.
280 Unfortunately, ISO/ANSI Pascal does not contain language features for
281 separate compilation. Although most commercial Pascal compilers provide
282 separate compilation in some way, we cannot use it for Dhrystone since such
283 a version would not be portable. Therefore, no attempt has been made to
284 provide a Pascal version with several compilation units.
286 o No procedure merging
288 Although Dhrystone contains some very short procedures where execution would
289 benefit from procedure merging (inlining, macro expansion of procedures),
290 procedure merging is not to be used. The reason is that the percentage of
291 procedure and function calls is part of the "Dhrystone distribution" of
292 statements contained in [1]. This restriction does not hold for the string
293 functions of the C version since ANSI C allows an implementation to use
294 inline code for these functions.
296 o Other optimizations are allowed, but they should be indicated
298 It is often hard to draw an exact line between "normal code generation" and
299 "optimization" in compilers: Some compilers perform operations by default
300 that are invoked in other compilers only when optimization is explicitly
301 requested. Also, we cannot prevent people who run benchmarks from trying to achieve
302 results that look as good as possible. Therefore, optimizations performed
303 by compilers - other than those listed above - are not forbidden when
304 Dhrystone execution times are measured. Dhrystone is not intended to be
305 non-optimizable but is intended to be optimizable to a similar degree as normal
306 programs. For example, there are several places in Dhrystone where
307 performance benefits from optimizations like common subexpression
308 elimination, value propagation etc., but normal programs usually also
309 benefit from these optimizations. Therefore, no effort was made to
310 artificially prevent such optimizations. However, measurement reports
311 should indicate which compiler optimization levels have been used, and
312 reporting results with different levels of compiler optimization for the
313 same hardware is encouraged.
315 o Default results are those without "register" declarations (C version)
317 When Dhrystone results are quoted without additional qualification, they
318 should be understood as results obtained without use of the "register"
319 attribute. Good compilers should be able to make good use of registers even
320 without explicit register declarations ([3], p. 193).
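
The rule on "register" declarations can be pictured with a minimal sketch in which the attribute is switched on only by an explicit compile-time option. The macro names REG and USE_REGISTER and the function sum_to are assumptions made for this illustration; the point is only that the default build, which corresponds to results quoted without qualification, contains no "register" attribute.

```c
/* Hypothetical switch: defining USE_REGISTER at compile time selects the
   non-default variant with explicit register declarations. */
#ifdef USE_REGISTER
#define REG register
#else
#define REG /* expands to nothing: the default, unqualified variant */
#endif

/* Small example function whose locals carry the optional attribute. */
int sum_to(int n)
{
    REG int i;
    REG int total = 0;

    for (i = 1; i <= n; ++i)
        total += i;
    return total;
}
```

Both variants compute the same values; only the hint given to the compiler differs, so any difference in measured time shows how much the compiler depends on explicit register declarations.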
322 Of course, for experimental purposes, post-linkage optimization, procedure
323 merging and/or compilation in one unit can be done to determine their effects.
324 However, Dhrystone numbers obtained under these conditions should be
325 explicitly marked as such; "normal" Dhrystone results should be understood as
326 results obtained following the ground rules listed above.
328 In any case, for serious performance evaluation, users are advised to ask for
329 code listings and to check them carefully. In this way, when results for
330 different systems are compared, the reader can get a feeling for how much
331 performance difference is due to compiler optimization and how much is due to
337 The C version 2.1 of Dhrystone has been developed in cooperation with Rick
338 Richardson (Tinton Falls, NJ); it incorporates many ideas from the "Version
339 1.1" distributed previously by him over the UNIX network Usenet. Through his
340 activity with Usenet, Rick Richardson has made a very valuable contribution to
341 the dissemination of the benchmark. I also thank Chaim Benedelac (National
342 Semiconductor), David Ditzel (SUN), Earl Killian and John Mashey (MIPS), Alan
343 Smith and Rafael Saavedra-Barrera (UC at Berkeley) for their help with
344 comments on earlier versions of the benchmark.
350 [1] Reinhold P. Weicker: Dhrystone: A Synthetic Systems Programming Benchmark.
351 Communications of the ACM 27, 10 (Oct. 1984), 1013-1030
354 [2] Rick Richardson: Dhrystone 1.1 Benchmark Summary (and Program Text)
355 Informal Distribution via "Usenet", Last Version Known to me: Sept. 21,
359 [3] Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language.
360 Prentice-Hall, Englewood Cliffs (NJ) 1978