lang/C/the.ansi.c.programming.language/c.programming.notes.int/sx1e.html

   1 <!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 2.0//EN">
   2 <!-- This collection of hypertext pages is Copyright 1995-7 by Steve Summit. -->
   3 <!-- This material may be freely redistributed and used -->
   4 <!-- but may not be republished or sold without permission. -->
   5 <html>
   6 <head>
   7 <link rev="owner" href="mailto:scs@eskimo.com">
   8 <link rev="made" href="mailto:scs@eskimo.com">
   9 <title>15.5: Linked Data Structures</title>
  10 <link href="sx1d.html" rev=precedes>
  11 <link href="sx2.html" rel=precedes>
  12 <link href="sx1.html" rev=subdocument>
  13 </head>
  14 <body>
  15 <H2>15.5: Linked Data Structures</H2>
  16
  17 <p>[This section corresponds to K&amp;R Sec. 6.5]
  18 </p><p>One reason that pointers to structures are useful and common
  19 is that they can be used to build linked data structures,
  20 in which a structure contains
  21 a pointer to another instance of the same structure
  22 (or perhaps a different structure).
  23 The simplest example is a singly-linked list,
  24 which we might
  25 declare like this:
  26 <pre>
  27 struct listnode
  28         {
  29         char *item;
  30         struct listnode *next;
  31         };
  32 </pre>
  33 This structure describes one <dfn>node</dfn> in a list;
  34 a list may consist of many nodes,
  35 one for each item in the list.
  36 Here, each item in the list
  37 (the field we've named <TT>item</TT>)
  38 will be a string,
  39 represented
  40 (i.e. pointed to)
  41 by a <TT>char *</TT>.
  42 Each node in the list is linked to its successor by its
  43 <TT>next</TT> field,
  44 which is a pointer to another <TT>struct listnode</TT>.
  45 (The compiler is perfectly happy to
  46 place a pointer to a structure inside that very same structure;
  47 it would only complain
  48 if you tried to stuff an entire <TT>struct listnode</TT>
  49 into a <TT>struct listnode</TT>,
  50 which would tend to make the <TT>struct listnode</TT> explode.)
  51 We'll use a null pointer as the <TT>next</TT> field
  52 of the last node in the list,
  53 since by definition a null pointer doesn't point anywhere.
  54 </p><p>We could set up a tiny list with these declarations:
  55 <pre>
  56 struct listnode node2 = {"world", NULL};
  57 struct listnode node1 = {"hello", &amp;node2};
  58 struct listnode *head = &amp;node1;
  59 </pre>
  60 A box-and-arrows picture of the resulting list would look like this:
  61 <br>
  62 <center><img src="fig15.3.gif"></center>
  63 <br>
  64 The list has two nodes, allocated by the compiler in response to
  65 the first two declarations we gave.
  66 We've also allocated a pointer variable, <TT>head</TT>,
  67 which points at the ``head'' of the list.
  68 </p><p>Once we've built a list, we'll naturally want to do things with it.
  69 One of the simplest operations is to print the list back out.
  70 The code to do so is very simple.
  71 We declare another list pointer <TT>lp</TT>
  72 and then cause it to step over each node in the list, in turn:
  73 <pre>
  74         struct listnode *lp;
  75         for(lp = head; lp != NULL; lp = lp-&gt;next)
  76                 printf("%s\n", lp-&gt;item);
  77 </pre>
  78 This <TT>for</TT> loop deserves some attention,
  79 especially if you haven't seen one like it before.
  80 Many <TT>for</TT> loops step an <TT>int</TT> variable
  81 (often called <TT>i</TT>)
  82 through a series of integer values.
  83 However,
  84 the three controlling expressions of a <TT>for</TT> loop
  85 are <em>not</em> limited to that pattern;
  86 you may in fact use any expressions at all,
  87 although it's best if they conform
  88 to the expected initialize;test;increment pattern.
  89 The list-printing loop above certainly does:
  90 the expression <TT>lp = head</TT> initializes <TT>lp</TT>
  91 to point to the head of the loop;
  92 the expression <TT>lp != NULL</TT> tests whether
  93 <TT>lp</TT> still points to a real node
  94 (or whether it has reached the null pointer which marks the end of the list);
  95 and the expression <TT>lp = lp-&gt;next</TT> sets
  96 <TT>lp</TT> to point to the next node in the list,
  97 one past where it did before.
  98 </p><p>The two-element list above is pretty useless;
  99 its only worth is as a first example.
 100 The real power of linked lists
 101 (and other linked data structures)
 102 is that they can grow on demand,
 103 in response to the data that your program finds itself working
 104 with.
 105 For a linked list to grow on demand,
 106 however,
 107 we'll have to allocate its nodes dynamically,
 108 because we won't know in advance how many of them we'll need.
 109 (In the first example,
 110 we had two static nodes,
 111 because we knew in advance,
 112 at compile time,
 113 that our list would have two elements.
 114 But that static allocation won't do for a dynamic list.
 115 How would we know how many <TT>struct listnode</TT>
 116 variables to allocate?)
 117 </p><p>The general solution, of course, is to call <TT>malloc</TT>.
 118 Here is a scrap of code which inserts a new word at the
 119 head of a list:
 120 <pre>
 121         #include &lt;stdio.h&gt;                /* for fprintf, stderr */
 122         #include &lt;stdlib.h&gt;               /* for malloc */
 123
 124         char *newword = "test";
 125         struct listnode *newnode = malloc(sizeof(struct listnode));
 126         if(newnode == NULL)
 127                 {
 128                 fprintf(stderr, "out of memory\n");
 129                 exit(1);
 130                 }
 131
 132         newnode-&gt;item = newword;
 133         newnode-&gt;next = head;
 134         head = newnode;
 135 </pre>
 136 The expression <TT>sizeof(struct listnode)</TT>
 137 in the call to <TT>malloc</TT>
 138 asks the compiler to compute the number of bytes required
 139 to store one <TT>struct listnode</TT>,
 140 and that's exactly how many bytes of memory we ask for from <TT>malloc</TT>.
 141 Make sure you see how the last two lines work
 142 to splice the new node in at the head of the list,
 143 by making the new node's <TT>next</TT> pointer
 144 point at the old head of the list,
 145 and then resetting the head of the list to be the new node.
 146 (Another word for a list where we always add new items
 147 at the beginning is a <dfn>stack</dfn>.)
 148
 149 </p><p>Naturally, we'd like to encapsulate this operation of prepending
 150 an item to a list as a function.
 151 Doing so is just a little bit tricky,
 152 because the list's head pointer is modified every time.
 153 There are several ways to achieve this modification;
 154 the way we'll do it is to have our list-prepending function
 155 return a pointer to the new head of the list.
 156 Here is the function:
 157 <pre>
 158 #include &lt;stdio.h&gt;                /* for fprintf, stderr */
 159 #include &lt;stdlib.h&gt;               /* for malloc, exit */
 160 #include &lt;string.h&gt;               /* for strlen, strcpy */
 161
 162 struct listnode *prepend(char *newword, struct listnode *oldhead)
 163 {
 164 struct listnode *newnode = malloc(sizeof(struct listnode));
 165 if(newnode == NULL)
 166         {
 167         fprintf(stderr, "out of memory\n");
 168         exit(1);
 169         }
 170
 171 newnode-&gt;item = malloc(strlen(newword) + 1);
 172 if(newnode-&gt;item == NULL)
 173         {
 174         fprintf(stderr, "out of memory\n");
 175         exit(1);
 176         }
 177 strcpy(newnode-&gt;item, newword);
 178
 179 newnode-&gt;next = oldhead;
 180 return newnode;
 181 }
 182 </pre>
 183 Since we want this to be a general-purpose function,
 184 we also allocate new space for the new string (word, item)
 185 being stored.
 186 Otherwise,
 187 we'd be depending on the caller to arrange that the pointer to
 188 the new string remain valid for as long as the list was in use.
 189 As we'll see, that's not always a safe assumption.
 190 By allocating our own memory,
 191 which ``belongs'' to the list,
 192 we ensure that the list isn't dependent on the caller in this way.
 193 (Notice, too, that the number of bytes we ask for is
 194 <TT>strlen(newword) + 1</TT>.)
 195 </p><p>(As an aside, it's a mild blemish on the above code that it
 196 contains two identical calls to <TT>fprintf</TT>,
 197 complaining about two separate potential failures in two calls
 198 to <TT>malloc</TT>.
 199 It's quite possible to combine these two cases,
 200 and many C programmers prefer to do so,
 201 although the expression may be a bit scary-looking at first:
 202 <pre>
 203 struct listnode *newnode;
 204 if((newnode = malloc(sizeof(struct listnode))) == NULL ||
 205                 (newnode-&gt;item = malloc(strlen(newword) + 1)) == NULL)
 206         {
 207         fprintf(stderr, "out of memory\n");
 208         exit(1);
 209         }
 210 </pre>
 211 How does this work?
 212 First it calls <TT>malloc(sizeof(struct listnode))</TT>,
 213 and assigns the result to <TT>newnode</TT>.
 214 Then it calls <TT>malloc(strlen(newword) + 1)</TT>,
 215 and assigns the result to <TT>newnode-&gt;item</TT>.
 216 This code relies on two special guarantees of C's <TT>||</TT> operator,
 217 namely that it always evaluates its left-hand side first,
 218 and that if the left-hand side evaluates as true,
 219 it doesn't bother to evaluate the right-hand side,
 220 because once the left-hand side is true,
 221 the final result is definitely going to be true.
 222 Therefore,
 223 we're guaranteed that we'll allocate space for <TT>newnode</TT>
 224 to point to before we try to fill in <TT>newnode-&gt;item</TT>,
 225 but if the first call to <TT>malloc</TT> fails,
 226 we won't
 227 call <TT>malloc</TT> a second time or
 228 try to fill in <TT>newnode-&gt;item</TT> at all.
 229 These guarantees--of left-to-right evaluation,
 230 and of skipping the evaluation of the right-hand side if the
 231 left-hand side determines the result--are
 232 unique to the <TT>||</TT> and <TT>&amp;&amp;</TT> operators.
 233
 234
 235 It's perfectly acceptable to rely on these guarantees when using
 236 <TT>||</TT> and <TT>&amp;&amp;</TT>,
 237 but don't assume that other operators will operate
 238 deterministically from left to right,
 239 because most of them don't.)
 240 </p><p>Now that we have our <TT>prepend</TT> function,
 241 we can build a list by calling
 242 it
 243
 244 several times in succession:
 245 <pre>
 246         struct listnode *head = NULL;
 247         head = prepend("world", head);
 248         head = prepend("hello", head);
 249 </pre>
 250 This code builds essentially the same list as our first, static example.
 251 Notice how we initialize the list head pointer with a null pointer,
 252 which is synonymous with an empty list.
 253 (Notice also that the code we wrote up above,
 254 for printing a list,
 255 would also deal correctly with an empty list.)
 256 </p><p>Using static calls to <TT>prepend</TT> is hardly more
 257 interesting than building a static link by hand.
 258 To make things truly interesting,
 259 let's read words or strings typed by the user
 260 (or redirected from a file),
 261 and prepend those to a list.
 262 The code is not hard:
 263 <pre>
 264         #define MAXLINE 200
 265
 266         char line[MAXLINE];
 267         struct listnode *head = NULL;
 268         struct listnode *lp;
 269
 270         while(getline(line, MAXLINE) != EOF)
 271                 head = prepend(line, head);
 272
 273         for(lp = head; lp != NULL; lp = lp-&gt;next)
 274                 printf("%s\n", lp-&gt;item);
 275 </pre>
 276 (<TT>getline</TT> is the line-reading function we've been using.
 277
 278 If you don't have a copy handy,
 279 you can use the <TT>fgets</TT> function from the standard
 280 library.)
 281 </p><p>If you type in this code and run it,
 282 you will find that it prints lines back out in the reverse order
 283 that you typed them.
 284 (In doing so, of course,
 285 it slurps all the lines into memory,
 286 so you might run out of memory if you tried to use this
 287 technique for reversing all the lines in a huge file.)
 288 Notice that when we call <TT>prepend</TT> in this way,
 289 it <em>is</em> important that <TT>prepend</TT> allocate memory for,
 290 and stash away, each string.
 291 Can you see what would happen if <TT>prepend</TT> did not,
 292 that is,
 293 if it simply said <TT>newnode-&gt;item = newword</TT>?
 294 </p><p></p><p>Linked lists are only the simplest example of linked data structures.
 295 There are also queues, doubly-linked lists, trees,
 296 and circular lists.
 297 We'll see
 298 more
 299 concrete examples of linked structures in the adventure game example.
 300 The set of objects sitting in a room
 301 (or held by the player)
 302 will be represented by a linked list of objects,
 303 and the rooms will be linked to each other
 304 to indicate the passages between rooms.
 305 </p><hr>
 306 <p>
 307 Read sequentially:
 308 <a href="sx1d.html" rev=precedes>prev</a>
 309 <a href="sx2.html" rel=precedes>next</a>
 310 <a href="sx1.html" rev=subdocument>up</a>
 311 <a href="top.html">top</a>
 312 </p>
 313 <p>
 314 This page by <a href="http://www.eskimo.com/~scs/">Steve Summit</a>
 315 // <a href="copyright.html">Copyright</a> 1996-1999
 316 // <a href="mailto:scs@eskimo.com">mail feedback</a>
 317 </p>
 318 </body>
 319 </html>