* better
[mascara-docs.git] / lang / C / the.ansi.c.programming.language / c.programming.notes.int / sx4ab.html
blob31811f3e022ad6a7089edb6b0f3e29ed223c6686
1 <!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
2 <!-- This collection of hypertext pages is Copyright 1995-7 by Steve Summit. -->
3 <!-- This material may be freely redistributed and used -->
4 <!-- but may not be republished or sold without permission. -->
5 <html>
6 <head>
7 <link rev="owner" href="mailto:scs@eskimo.com">
8 <link rev="made" href="mailto:scs@eskimo.com">
9 <title>18.2.1: Bitwise Operators</title>
10 <link href="sx4b.html" rev=precedes>
11 <link href="sx4bb.html" rel=precedes>
12 <link href="sx4b.html" rev=subdocument>
13 </head>
14 <body>
15 <H3>18.2.1: Bitwise Operators</H3>
17 <p>[This section corresponds to K&amp;R Sec. 2.9]
18 </p><p>The bitwise operators operate on numbers
19 (always integers)
20 as if they were
21 sequences
22 of binary bits
23 (which, of course, internally to the computer they
24 are).
25 These operators will make the most sense,
26 therefore,
27 if we consider integers as represented in binary, octal, or hexadecimal
28 (bases 2, 8, or 16),
29 not decimal (base 10).
30 Remember,
32 you can use octal constants in C
33 by prefixing them with an extra <TT>0</TT> (zero),
34 and you can use hexadecimal constants
35 by prefixing them with <TT>0x</TT>
36 (or <TT>0X</TT>).
37 </p><p>The <TT>&amp;</TT> operator
38 performs a bitwise AND on two integers.
39 Each bit in the result is 1 only if
40 both corresponding bits in the two input operands are 1.
41 For example, <TT>0x56 &amp; 0x32</TT> is <TT>0x12</TT>,
42 because (in binary):
43 <pre>
44 0 1 0 1 0 1 1 0
45 &amp; 0 0 1 1 0 0 1 0
46 ---------------
47 0 0 0 1 0 0 1 0
48 </pre>
49 </p><p>The <TT>|</TT> (vertical bar) operator
50 performs a bitwise OR on two integers.
51 Each bit in the result is 1 if
52 either of the corresponding bits in the two input operands is 1.
53 For example, <TT>0x56 | 0x32</TT> is <TT>0x76</TT>,
54 because:
55 <pre>
56 0 1 0 1 0 1 1 0
57 | 0 0 1 1 0 0 1 0
58 ---------------
59 0 1 1 1 0 1 1 0
60 </pre>
61 </p><p>The <TT>^</TT> (caret) operator
62 performs a bitwise exclusive-OR on two integers.
63 Each bit in the result is 1 if
64 one, but not both,
65 of the corresponding bits in the two input operands is 1.
66 For example, <TT>0x56 ^ 0x32</TT> is <TT>0x64</TT>:
67 <pre>
68 0 1 0 1 0 1 1 0
69 ^ 0 0 1 1 0 0 1 0
70 ---------------
71 0 1 1 0 0 1 0 0
72 </pre>
73 </p><p>The <TT>~</TT> (tilde) operator
74 performs a bitwise complement on its single integer operand.
75 (The <TT>~</TT> operator is therefore a unary operator,
76 like <TT>!</TT>
77 and the unary <TT>-</TT>, <TT>&amp;</TT>, and <TT>*</TT> operators.)
78 Complementing a number means to change all the
79 0 bits to 1
80 and all the 1s to 0s.
81 For example, assuming 16-bit integers, <TT>~0x56</TT> is <TT>0xffa9</TT>:
82 <pre>
83 ~ 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 0
84 -------------------------------
85 1 1 1 1 1 1 1 1 1 0 1 0 1 0 0 1
86 </pre>
87 </p><p>The <TT>&lt;&lt;</TT> operator
88 shifts its first operand left by a number of bits
89 given by its second operand,
90 filling in new 0 bits at the right.
91 Similarly, the <TT>&gt;&gt;</TT> operator
92 shifts its first operand right.
93 If the first operand is <TT>unsigned</TT>,
94 <TT>&gt;&gt;</TT> fills in 0 bits from the left,
95 but if the first operand is signed,
96 <TT>&gt;&gt;</TT> might fill in 1 bits if the high-order bit was already 1.
97 (Uncertainty like this
98 is one reason why it's usually a good idea
99 to use all <TT>unsigned</TT> operands
100 when working with the bitwise operators.)
101 For example, <TT>0x56 &lt;&lt; 2</TT> is <TT>0x158</TT>:
102 <pre>
103 0 1 0 1 0 1 1 0 &lt;&lt; 2
104 -------------------
105 0 1 0 1 0 1 1 0 0 0
106 </pre>
107 And <TT>0x56 &gt;&gt; 1</TT> is <TT>0x2b</TT>:
108 <pre>
109 0 1 0 1 0 1 1 0 &gt;&gt; 1
110 ---------------
111 0 1 0 1 0 1 1
112 </pre>
113 For both of the shift operators,
114 bits that scroll ``off the end'' are discarded;
115 they don't wrap around.
116 (Therefore, <TT>0x56 &gt;&gt; 3</TT> is <TT>0x0a</TT>.)
117 </p><p>The bitwise operators will make more sense
118 if we take a look at some of the ways they're typically used.
119 We can use <TT>&amp;</TT> to test if a certain bit is 1 or not.
120 For example, <TT>0x56 &amp; 0x40</TT> is <TT>0x40</TT>,
121 but <TT>0x32 &amp; 0x40</TT> is <TT>0x00</TT>:
122 <pre>
123 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0
124 &amp; 0 1 0 0 0 0 0 0 &amp; 0 1 0 0 0 0 0 0
125 --------------- ---------------
126 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
127 </pre>
128 Since any nonzero result is considered ``true'' in C,
129 we can use an expression involving <TT>&amp;</TT> directly
130 to test some condition,
131 for example:
132 <pre>
133 if(x &amp; 0x04)
134 <I>do something</I> ;
135 </pre>
136 (If we didn't like testing against the bitwise result,
137 we could equivalently say
138 <TT>if((x &amp; 0x04) != 0)</TT> .
139 The extra parentheses are important, as we'll explain below.)
140 </p><p>Notice that
141 the value <TT>0x40</TT>
142 has exactly one 1 bit in its binary representation,
143 which makes it useful for testing
144 for the presence of a certain bit.
145 Such a value
146 is often called a <dfn>bit mask</dfn>.
147 Often, we'll define a series of bit masks,
148 all targeting different bits,
149 and then treat a single integer value as a set of <dfn>flags</dfn>.
150 A ``flag'' is an on-off, yes-no condition,
151 so we only need one bit to record it, not the 16 or 32 bits
152 (or more) of an <TT>int</TT>.
153 Storing a set of flags in a single <TT>int</TT>
154 does more than just save space,
155 it also makes it convenient to assign a set of flags all at once
156 from one flag variable to another,
157 using the conventional assignment operator <TT>=</TT>.
158 For example, if we made these definitions:
159 <pre>
160 #define DIRTY 0x01
161 #define OPEN 0x02
162 #define VERBOSE 0x04
163 #define RED 0x08
164 #define SEASICK 0x10
165 </pre>
166 we would have set up 5 different bits
167 as keeping track of those 5 different conditions
168 (``dirty,'' ``open,'' etc.).
169 If we had a variable
170 <pre>
171 unsigned int flags;
172 </pre>
173 which contained a set of these flags,
174 we could write tests like
175 <pre>
176 if(flags &amp; DIRTY)
177 { /* code for dirty case */ }
178 </pre>
179 <pre>
180 if(!(flags &amp; OPEN))
181 { /* code for closed case */ }
182 </pre>
183 <pre>
184 if(flags &amp; VERBOSE)
185 { /* code for verbose case */ }
186 else { /* code for quiet case */ }
187 </pre>
188 A condition like <TT>if(flags &amp; DIRTY)</TT>
189 can be read as
190 ``if the <TT>DIRTY</TT> bit is on''.
191 </p><p>These bitmasks would also be useful for <em>setting</em> the flags.
192 To ``turn on the <TT>DIRTY</TT> bit,''
193 we'd say
194 <pre>
195 flags = flags | DIRTY; /* set DIRTY bit */
196 </pre>
197 How would we ``turn off'' a bit?
198 The way to do it is to leave on every bit but the one we're turning off,
199 if they were on already.
200 We do this with the <TT>&amp;</TT> and <TT>~</TT> operators:
201 <pre>
202 flags = flags &amp; ~DIRTY; /* clear DIRTY bit */
203 </pre>
204 This may be easier to see if we look at it in binary.
205 If the <TT>DIRTY</TT>, <TT>RED</TT>, and <TT>SEASICK</TT> bits
206 were already on,
207 <TT>flags</TT> would be <TT>0x19</TT>, and we'd have
208 <pre>
209 0 0 0 1 1 0 0 1
210 &amp; 1 1 1 1 1 1 1 0
211 ---------------
212 0 0 0 1 1 0 0 0
213 </pre>
214 As you can see,
215 both the <TT>|</TT> operator when turning bits on
216 and the <TT>&amp;</TT> (plus <TT>~</TT>) operator when turning bits off
217 have no effect if the targeted bit were already on or off, respectively.
218 </p><p>The definition of the exclusive-OR operator
219 means that you can use it to <dfn>toggle</dfn> a bit,
220 that is,
221 to turn it to 1 if it was 0 and to 0 if it was one:
222 <pre>
223 flags = flags ^ VERBOSE; /* toggle VERBOSE bit */
224 </pre>
225 </p><p>It's common to use the ``<I>op</I><TT>=</TT>'' shorthand forms
226 when doing all of these operations:
227 <pre>
228 flags |= DIRTY; /* set DIRTY bit */
229 flags &amp;= ~OPEN; /* clear OPEN bit */
230 flags ^= VERBOSE; /* toggle VERBOSE bit */
231 </pre>
232 </p><p>We can also use the bitwise operators to extract
233 subsets of bits from the middle of
234 an integer.
235 For example,
236 to extract the second-to-last hexadecimal ``digit,''
237 we could use
238 <pre>
239 (i &amp; 0xf0) &gt;&gt; 4
240 </pre>
241 If <TT>i</TT> was <TT>0x56</TT>, we have:
242 <pre>
243 i 0 1 0 1 0 1 1 0
244 &amp; 0x56 &amp; 1 1 1 1 0 0 0 0
245 ---------------
246 0 1 0 1 0 0 0 0
247 </pre>
248 and shifting this result right by 4 bits
249 gives us <TT>0 1 0 1</TT>,
250 or 5,
251 as we wished.
252 Replacing
253 (or overwriting)
254 a subset of bits
255 is a bit more complicated;
256 we must first use <TT>&amp;</TT> and <TT>~</TT>
257 to clear all of the destination bits,
258 then use <TT>&lt;&lt;</TT> and <TT>|</TT> to ``OR in''
259 the new bits.
260 For example,
261 to replace that second-to-last hexadecimal digit
262 with some new bits,
263 we might use:
264 <pre>
265 (i &amp; ~0xf0) | (newbits &lt;&lt; 4)
266 </pre>
267 If <TT>i</TT> was still <TT>0x56</TT> and <TT>newbits</TT> was 6,
268 this would give us
269 <pre>
270 i 0 1 0 1 0 1 1 0
271 &amp; ~0xf0 &amp; 0 0 0 0 1 1 1 1
272 ---------------
273 0 0 0 0 0 1 1 0
274 | (newbits &lt;&lt; 4) | 0 1 1 0 0 0 0 0
275 ---------------
276 0 1 1 0 0 1 1 0
277 </pre>
278 resulting in <TT>0x66</TT>, as desired.
279 </p><p>We've been using extra parentheses
280 in several of these bitwise expressions
281 because it turns out that
282 (for the usual, hoary sort of ``historical reasons'')
283 the precedence of
284 the bitwise <TT>&amp;</TT>, <TT>|</TT>, and <TT>^</TT>
285 operators is low, usually lower than we'd want.
286 (The reason that they're low is that, once upon a time,
287 C didn't have the logical operators <TT>&amp;&amp;</TT> and <TT>||</TT>,
288 and the bitwise operators <TT>&amp;</TT> and <TT>|</TT> did double duty.)
289 However, since the precedence of <TT>&amp;</TT> and <TT>|</TT>
290 (and <TT>^</TT>)
291 is lower than <TT>==</TT>, <TT>!=</TT>,
292 <TT>&lt;&lt;, and </TT><TT>&gt;&gt;</TT>,
293 expressions like
294 <pre>
295 if(a &amp; 0x04 != 0) /* WRONG */
296 </pre>
298 <pre>
299 i &amp; 0xf0 &gt;&gt; 4 /* WRONG */
300 </pre>
301 would <em>not</em> work as desired;
302 these last two would be equivalent to
303 <pre>
304 if(a &amp; (0x04 != 0))
305 i &amp; (0xf0 &gt;&gt; 4)
306 </pre>
307 and would not do the bit test or subset extraction that we wanted.
308 </p><p>[The rest of this section is somewhat advanced and may be skipped.]
309 </p><p>Because of the nature of base-2 arithmetic,
310 it turns out that shifting left and shifting right
311 are equivalent to multiplying and dividing by two.
312 These operations are equivalent for the same reason that
313 tacking zeroes on to the right of a number in base 10
314 is the same as multiplying by 10,
315 and deleting digits from the right is the same as dividing by 10.
316 You can convince yourself that
317 <TT>0x56 &lt;&lt; 2</TT>
318 is the same as <TT>0x56 * 4</TT>,
319 and that
320 <TT>0x56 &gt;&gt; 1</TT>
321 is the same as <TT>0x56 / 2</TT>.
322 It's also the case that masking off all but the low-order bits
323 is the same as taking a remainder;
324 for example,
325 <TT>0x56 &amp; 0x07</TT>
326 is the same as <TT>0x56 % 8</TT>.
327 Some programmers therefore use
328 <TT>&lt;&lt;</TT>, <TT>&gt;&gt;</TT>, and <TT>&amp;</TT>
329 in preference to
330 <TT>*</TT>, <TT>/</TT>, and <TT>%</TT>
331 when powers of two are involved,
332 on the grounds that the bitwise operators are ``more efficient.''
333 Usually it isn't worth worrying about this, though,
334 because most compilers are smart enough
335 to perform these optimizations anyway
336 (that is, if you write <TT>x * 4</TT>,
337 the compiler might generate a left shift instruction all by itself),
338 they're not always as readable,
339 and they're not always correct for negative numbers.
340 </p><p>The issue of negative numbers, by the way,
341 explains why the right-shift operator <TT>&gt;&gt;</TT>
342 is not precisely defined
343 when the high-order bit of the value being shifted is 1.
344 For signed values,
345 if the high-order bit is a 1,
346 the number is negative.
347 (This is true for 1's complement, 2's complement,
348 and sign-magnitude representations.)
349 If you were using a right shift to implement division,
350 you'd want a negative number to stay negative,
351 so on some computers,
352 under some compilers,
353 when you shift a signed value right
354 and the high-order bit is 1,
356 1 bits are shifted in at the left instead of 0s.
357 However, you can't depend on this,
358 because not all computers and compilers implement right shift this way.
359 In any case,
360 shifting negative numbers to the right
361 (even if the high-order 1 bit propagates)
362 gives you an incorrect answer if there's a remainder involved:
363 in 2's complement, 16-bit arithmetic, -15 is <TT>0xfff1</TT>,
364 so <TT>-15 &gt;&gt; 1</TT> might give you
365 <TT>0xfff8</TT>shifted
366 which is -8.
367 But integer division is supposed to discard the remainder,
368 so <TT>-15 / 2</TT> would have given you -7.
369 (If you're having trouble seeing the way the shift worked,
370 <TT>0xfff1</TT> is 1111111111110001<tt>&lt;sub&gt;</tt>2<tt>&lt;/sub&gt;</tt> and
371 <TT>0xfff8</TT> is 1111111111111000<tt>&lt;sub&gt;</tt>2<tt>&lt;/sub&gt;</tt>.
372 The low-order 1 bit got shifted off to the right,
373 but because the high-order bit was 1,
374 a 1 got shifted in at the left.)
375 </p><hr>
377 Read sequentially:
378 <a href="sx4b.html" rev=precedes>prev</a>
379 <a href="sx4bb.html" rel=precedes>next</a>
380 <a href="sx4b.html" rev=subdocument>up</a>
381 <a href="top.html">top</a>
382 </p>
384 This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
385 // <a href="copyright.html">Copyright</a> 1996-1999
386 // <a href="mailto:scs@eskimo.com">mail feedback</a>
387 </p>
388 </body>
389 </html>