Objects/dictobject.c

   1
   2 /* Dictionary object implementation using a hash table */
   3
   4 #include "Python.h"
   5
   6 typedef PyDictEntry dictentry;
   7 typedef PyDictObject dictobject;
   8
   9 /* Define this out if you don't want conversion statistics on exit. */
  10 #undef SHOW_CONVERSION_COUNTS
  11
  12 /* See large comment block below.  This must be >= 1. */
  13 #define PERTURB_SHIFT 5
  14
  15 /*
  16 Major subtleties ahead:  Most hash schemes depend on having a "good" hash
  17 function, in the sense of simulating randomness.  Python doesn't:  its most
  18 important hash functions (for strings and ints) are very regular in common
  19 cases:
  20
  21 >>> map(hash, (0, 1, 2, 3))
  22 [0, 1, 2, 3]
  23 >>> map(hash, ("namea", "nameb", "namec", "named"))
  24 [-1658398457, -1658398460, -1658398459, -1658398462]
  25 >>>
  26
  27 This isn't necessarily bad!  To the contrary, in a table of size 2**i, taking
  28 the low-order i bits as the initial table index is extremely fast, and there
  29 are no collisions at all for dicts indexed by a contiguous range of ints.
  30 The same is approximately true when keys are "consecutive" strings.  So this
  31 gives better-than-random behavior in common cases, and that's very desirable.
  32
  33 OTOH, when collisions occur, the tendency to fill contiguous slices of the
  34 hash table makes a good collision resolution strategy crucial.  Taking only
  35 the last i bits of the hash code is also vulnerable:  for example, consider
  36 [i << 16 for i in range(20000)] as a set of keys.  Since ints are their own
  37 hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
  38 hash code are all 0:  they *all* map to the same table index.
  39
  40 But catering to unusual cases should not slow the usual ones, so we just take
  41 the last i bits anyway.  It's up to collision resolution to do the rest.  If
  42 we *usually* find the key we're looking for on the first try (and, it turns
  43 out, we usually do -- the table load factor is kept under 2/3, so the odds
  44 are solidly in our favor), then it makes best sense to keep the initial index
  45 computation dirt cheap.
  46
  47 The first half of collision resolution is to visit table indices via this
  48 recurrence:
  49
  50     j = ((5*j) + 1) mod 2**i
  51
  52 For any initial j in range(2**i), repeating that 2**i times generates each
  53 int in range(2**i) exactly once (see any text on random-number generation for
  54 proof).  By itself, this doesn't help much:  like linear probing (setting
  55 j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
  56 order.  This would be bad, except that's not the only thing we do, and it's
  57 actually *good* in the common cases where hash keys are consecutive.  In an
  58 example that's really too small to make this entirely clear, for a table of
  59 size 2**3 the order of indices is:
  60
  61     0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
  62
  63 If two things come in at index 5, the first place we look after is index 2,
  64 not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
  65 Linear probing is deadly in this case because there the fixed probe order
  66 is the *same* as the order consecutive keys are likely to arrive.  But it's
  67 extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
  68 and certain that consecutive hash codes do not.
  69
  70 The other half of the strategy is to get the other bits of the hash code
  71 into play.  This is done by initializing a (unsigned) vrbl "perturb" to the
  72 full hash code, and changing the recurrence to:
  73
  74     j = (5*j) + 1 + perturb;
  75     perturb >>= PERTURB_SHIFT;
  76     use j % 2**i as the next table index;
  77
  78 Now the probe sequence depends (eventually) on every bit in the hash code,
  79 and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
  80 because it quickly magnifies small differences in the bits that didn't affect
  81 the initial index.  Note that because perturb is unsigned, if the recurrence
  82 is executed often enough perturb eventually becomes and remains 0.  At that
  83 point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
  84 that's certain to find an empty slot eventually (since it generates every int
  85 in range(2**i), and we make sure there's always at least one empty slot).
  86
  87 Selecting a good value for PERTURB_SHIFT is a balancing act.  You want it
  88 small so that the high bits of the hash code continue to affect the probe
  89 sequence across iterations; but you want it large so that in really bad cases
  90 the high-order hash bits have an effect on early iterations.  5 was "the
  91 best" in minimizing total collisions across experiments Tim Peters ran (on
  92 both normal and pathological cases), but 4 and 6 weren't significantly worse.
  93
  94 Historical:  Reimer Behrends contributed the idea of using a polynomial-based
  95 approach, using repeated multiplication by x in GF(2**n) where an irreducible
  96 polynomial for each table size was chosen such that x was a primitive root.
  97 Christian Tismer later extended that to use division by x instead, as an
  98 efficient way to get the high bits of the hash code into play.  This scheme
  99 also gave excellent collision statistics, but was more expensive:  two
 100 if-tests were required inside the loop; computing "the next" index took about
 101 the same number of operations but without as much potential parallelism
 102 (e.g., computing 5*j can go on at the same time as computing 1+perturb in the
 103 above, and then shifting perturb can be done while the table index is being
 104 masked); and the dictobject struct required a member to hold the table's
 105 polynomial.  In Tim's experiments the current scheme ran faster, produced
 106 equally good collision statistics, needed less code & used less memory.
 107 */
 108
 109 /* Object used as dummy key to fill deleted entries */
 110 static PyObject *dummy; /* Initialized by first call to newdictobject() */
 111
 112 /* forward declarations */
 113 static dictentry *
 114 lookdict_string(dictobject *mp, PyObject *key, long hash);
 115
 116 #ifdef SHOW_CONVERSION_COUNTS
 117 static long created = 0L;
 118 static long converted = 0L;
 119
 120 static void
 121 show_counts(void)
 122 {
 123         fprintf(stderr, "created %ld string dicts\n", created);
 124         fprintf(stderr, "converted %ld to normal dicts\n", converted);
 125         fprintf(stderr, "%.2f%% conversion rate\n", (100.0*converted)/created);
 126 }
 127 #endif
 128
 129 /* Initialization macros.
 130    There are two ways to create a dict:  PyDict_New() is the main C API
 131    function, and the tp_new slot maps to dict_new().  In the latter case we
 132    can save a little time over what PyDict_New does because it's guaranteed
 133    that the PyDictObject struct is already zeroed out.
 134    Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
 135    an excellent reason not to).
 136 */
 137
 138 #define INIT_NONZERO_DICT_SLOTS(mp) do {                                \
 139         (mp)->ma_table = (mp)->ma_smalltable;                           \
 140         (mp)->ma_mask = PyDict_MINSIZE - 1;                             \
 141     } while(0)
 142
 143 #define EMPTY_TO_MINSIZE(mp) do {                                       \
 144         memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable));    \
 145         (mp)->ma_used = (mp)->ma_fill = 0;                              \
 146         INIT_NONZERO_DICT_SLOTS(mp);                                    \
 147     } while(0)
 148
 149 PyObject *
 150 PyDict_New(void)
 151 {
 152         register dictobject *mp;
 153         if (dummy == NULL) { /* Auto-initialize dummy */
 154                 dummy = PyString_FromString("<dummy key>");
 155                 if (dummy == NULL)
 156                         return NULL;
 157 #ifdef SHOW_CONVERSION_COUNTS
 158                 Py_AtExit(show_counts);
 159 #endif
 160         }
 161         mp = PyObject_GC_New(dictobject, &PyDict_Type);
 162         if (mp == NULL)
 163                 return NULL;
 164         EMPTY_TO_MINSIZE(mp);
 165         mp->ma_lookup = lookdict_string;
 166 #ifdef SHOW_CONVERSION_COUNTS
 167         ++created;
 168 #endif
 169         _PyObject_GC_TRACK(mp);
 170         return (PyObject *)mp;
 171 }
 172
 173 /*
 174 The basic lookup function used by all operations.
 175 This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
 176 Open addressing is preferred over chaining since the link overhead for
 177 chaining would be substantial (100% with typical malloc overhead).
 178
 179 The initial probe index is computed as hash mod the table size. Subsequent
 180 probe indices are computed as explained earlier.
 181
 182 All arithmetic on hash should ignore overflow.
 183
 184 (The details in this version are due to Tim Peters, building on many past
 185 contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
 186 Christian Tismer).
 187
 188 This function must never return NULL; failures are indicated by returning
 189 a dictentry* for which the me_value field is NULL.  Exceptions are never
 190 reported by this function, and outstanding exceptions are maintained.
 191 */
 192
 193 static dictentry *
 194 lookdict(dictobject *mp, PyObject *key, register long hash)
 195 {
 196         register int i;
 197         register unsigned int perturb;
 198         register dictentry *freeslot;
 199         register unsigned int mask = mp->ma_mask;
 200         dictentry *ep0 = mp->ma_table;
 201         register dictentry *ep;
 202         register int restore_error;
 203         register int checked_error;
 204         register int cmp;
 205         PyObject *err_type, *err_value, *err_tb;
 206         PyObject *startkey;
 207
 208         i = hash & mask;
 209         ep = &ep0[i];
 210         if (ep->me_key == NULL || ep->me_key == key)
 211                 return ep;
 212
 213         restore_error = checked_error = 0;
 214         if (ep->me_key == dummy)
 215                 freeslot = ep;
 216         else {
 217                 if (ep->me_hash == hash) {
 218                         /* error can't have been checked yet */
 219                         checked_error = 1;
 220                         if (PyErr_Occurred()) {
 221                                 restore_error = 1;
 222                                 PyErr_Fetch(&err_type, &err_value, &err_tb);
 223                         }
 224                         startkey = ep->me_key;
 225                         cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
 226                         if (cmp < 0)
 227                                 PyErr_Clear();
 228                         if (ep0 == mp->ma_table && ep->me_key == startkey) {
 229                                 if (cmp > 0)
 230                                         goto Done;
 231                         }
 232                         else {
 233                                 /* The compare did major nasty stuff to the
 234                                  * dict:  start over.
 235                                  * XXX A clever adversary could prevent this
 236                                  * XXX from terminating.
 237                                  */
 238                                 ep = lookdict(mp, key, hash);
 239                                 goto Done;
 240                         }
 241                 }
 242                 freeslot = NULL;
 243         }
 244
 245         /* In the loop, me_key == dummy is by far (factor of 100s) the
 246            least likely outcome, so test for that last. */
 247         for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
 248                 i = (i << 2) + i + perturb + 1;
 249                 ep = &ep0[i & mask];
 250                 if (ep->me_key == NULL) {
 251                         if (freeslot != NULL)
 252                                 ep = freeslot;
 253                         break;
 254                 }
 255                 if (ep->me_key == key)
 256                         break;
 257                 if (ep->me_hash == hash && ep->me_key != dummy) {
 258                         if (!checked_error) {
 259                                 checked_error = 1;
 260                                 if (PyErr_Occurred()) {
 261                                         restore_error = 1;
 262                                         PyErr_Fetch(&err_type, &err_value,
 263                                                     &err_tb);
 264                                 }
 265                         }
 266                         startkey = ep->me_key;
 267                         cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
 268                         if (cmp < 0)
 269                                 PyErr_Clear();
 270                         if (ep0 == mp->ma_table && ep->me_key == startkey) {
 271                                 if (cmp > 0)
 272                                         break;
 273                         }
 274                         else {
 275                                 /* The compare did major nasty stuff to the
 276                                  * dict:  start over.
 277                                  * XXX A clever adversary could prevent this
 278                                  * XXX from terminating.
 279                                  */
 280                                 ep = lookdict(mp, key, hash);
 281                                 break;
 282                         }
 283                 }
 284                 else if (ep->me_key == dummy && freeslot == NULL)
 285                         freeslot = ep;
 286         }
 287
 288 Done:
 289         if (restore_error)
 290                 PyErr_Restore(err_type, err_value, err_tb);
 291         return ep;
 292 }
 293
 294 /*
 295  * Hacked up version of lookdict which can assume keys are always strings;
 296  * this assumption allows testing for errors during PyObject_Compare() to
 297  * be dropped; string-string comparisons never raise exceptions.  This also
 298  * means we don't need to go through PyObject_Compare(); we can always use
 299  * _PyString_Eq directly.
 300  *
 301  * This is valuable because the general-case error handling in lookdict() is
 302  * expensive, and dicts with pure-string keys are very common.
 303  */
 304 static dictentry *
 305 lookdict_string(dictobject *mp, PyObject *key, register long hash)
 306 {
 307         register int i;
 308         register unsigned int perturb;
 309         register dictentry *freeslot;
 310         register unsigned int mask = mp->ma_mask;
 311         dictentry *ep0 = mp->ma_table;
 312         register dictentry *ep;
 313
 314         /* Make sure this function doesn't have to handle non-string keys,
 315            including subclasses of str; e.g., one reason to subclass
 316            strings is to override __eq__, and for speed we don't cater to
 317            that here. */
 318         if (!PyString_CheckExact(key)) {
 319 #ifdef SHOW_CONVERSION_COUNTS
 320                 ++converted;
 321 #endif
 322                 mp->ma_lookup = lookdict;
 323                 return lookdict(mp, key, hash);
 324         }
 325         i = hash & mask;
 326         ep = &ep0[i];
 327         if (ep->me_key == NULL || ep->me_key == key)
 328                 return ep;
 329         if (ep->me_key == dummy)
 330                 freeslot = ep;
 331         else {
 332                 if (ep->me_hash == hash
 333                     && _PyString_Eq(ep->me_key, key)) {
 334                         return ep;
 335                 }
 336                 freeslot = NULL;
 337         }
 338
 339         /* In the loop, me_key == dummy is by far (factor of 100s) the
 340            least likely outcome, so test for that last. */
 341         for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
 342                 i = (i << 2) + i + perturb + 1;
 343                 ep = &ep0[i & mask];
 344                 if (ep->me_key == NULL)
 345                         return freeslot == NULL ? ep : freeslot;
 346                 if (ep->me_key == key
 347                     || (ep->me_hash == hash
 348                         && ep->me_key != dummy
 349                         && _PyString_Eq(ep->me_key, key)))
 350                         return ep;
 351                 if (ep->me_key == dummy && freeslot == NULL)
 352                         freeslot = ep;
 353         }
 354 }
 355
 356 /*
 357 Internal routine to insert a new item into the table.
 358 Used both by the internal resize routine and by the public insert routine.
 359 Eats a reference to key and one to value.
 360 */
 361 static void
 362 insertdict(register dictobject *mp, PyObject *key, long hash, PyObject *value)
 363 {
 364         PyObject *old_value;
 365         register dictentry *ep;
 366         typedef PyDictEntry *(*lookupfunc)(PyDictObject *, PyObject *, long);
 367
 368         assert(mp->ma_lookup != NULL);
 369         ep = mp->ma_lookup(mp, key, hash);
 370         if (ep->me_value != NULL) {
 371                 old_value = ep->me_value;
 372                 ep->me_value = value;
 373                 Py_DECREF(old_value); /* which **CAN** re-enter */
 374                 Py_DECREF(key);
 375         }
 376         else {
 377                 if (ep->me_key == NULL)
 378                         mp->ma_fill++;
 379                 else
 380                         Py_DECREF(ep->me_key);
 381                 ep->me_key = key;
 382                 ep->me_hash = hash;
 383                 ep->me_value = value;
 384                 mp->ma_used++;
 385         }
 386 }
 387
 388 /*
 389 Restructure the table by allocating a new table and reinserting all
 390 items again.  When entries have been deleted, the new table may
 391 actually be smaller than the old one.
 392 */
 393 static int
 394 dictresize(dictobject *mp, int minused)
 395 {
 396         int newsize;
 397         dictentry *oldtable, *newtable, *ep;
 398         int i;
 399         int is_oldtable_malloced;
 400         dictentry small_copy[PyDict_MINSIZE];
 401
 402         assert(minused >= 0);
 403
 404         /* Find the smallest table size > minused. */
 405         for (newsize = PyDict_MINSIZE;
 406              newsize <= minused && newsize > 0;
 407              newsize <<= 1)
 408                 ;
 409         if (newsize <= 0) {
 410                 PyErr_NoMemory();
 411                 return -1;
 412         }
 413
 414         /* Get space for a new table. */
 415         oldtable = mp->ma_table;
 416         assert(oldtable != NULL);
 417         is_oldtable_malloced = oldtable != mp->ma_smalltable;
 418
 419         if (newsize == PyDict_MINSIZE) {
 420                 /* A large table is shrinking, or we can't get any smaller. */
 421                 newtable = mp->ma_smalltable;
 422                 if (newtable == oldtable) {
 423                         if (mp->ma_fill == mp->ma_used) {
 424                                 /* No dummies, so no point doing anything. */
 425                                 return 0;
 426                         }
 427                         /* We're not going to resize it, but rebuild the
 428                            table anyway to purge old dummy entries.
 429                            Subtle:  This is *necessary* if fill==size,
 430                            as lookdict needs at least one virgin slot to
 431                            terminate failing searches.  If fill < size, it's
 432                            merely desirable, as dummies slow searches. */
 433                         assert(mp->ma_fill > mp->ma_used);
 434                         memcpy(small_copy, oldtable, sizeof(small_copy));
 435                         oldtable = small_copy;
 436                 }
 437         }
 438         else {
 439                 newtable = PyMem_NEW(dictentry, newsize);
 440                 if (newtable == NULL) {
 441                         PyErr_NoMemory();
 442                         return -1;
 443                 }
 444         }
 445
 446         /* Make the dict empty, using the new table. */
 447         assert(newtable != oldtable);
 448         mp->ma_table = newtable;
 449         mp->ma_mask = newsize - 1;
 450         memset(newtable, 0, sizeof(dictentry) * newsize);
 451         mp->ma_used = 0;
 452         i = mp->ma_fill;
 453         mp->ma_fill = 0;
 454
 455         /* Copy the data over; this is refcount-neutral for active entries;
 456            dummy entries aren't copied over, of course */
 457         for (ep = oldtable; i > 0; ep++) {
 458                 if (ep->me_value != NULL) {     /* active entry */
 459                         --i;
 460                         insertdict(mp, ep->me_key, ep->me_hash, ep->me_value);
 461                 }
 462                 else if (ep->me_key != NULL) {  /* dummy entry */
 463                         --i;
 464                         assert(ep->me_key == dummy);
 465                         Py_DECREF(ep->me_key);
 466                 }
 467                 /* else key == value == NULL:  nothing to do */
 468         }
 469
 470         if (is_oldtable_malloced)
 471                 PyMem_DEL(oldtable);
 472         return 0;
 473 }
 474
 475 PyObject *
 476 PyDict_GetItem(PyObject *op, PyObject *key)
 477 {
 478         long hash;
 479         dictobject *mp = (dictobject *)op;
 480         if (!PyDict_Check(op)) {
 481                 return NULL;
 482         }
 483 #ifdef CACHE_HASH
 484         if (!PyString_CheckExact(key) ||
 485             (hash = ((PyStringObject *) key)->ob_shash) == -1)
 486 #endif
 487         {
 488                 hash = PyObject_Hash(key);
 489                 if (hash == -1) {
 490                         PyErr_Clear();
 491                         return NULL;
 492                 }
 493         }
 494         return (mp->ma_lookup)(mp, key, hash)->me_value;
 495 }
 496
 497 /* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
 498  * dictionary if it is merely replacing the value for an existing key.
 499  * This is means that it's safe to loop over a dictionary with
 500  * PyDict_Next() and occasionally replace a value -- but you can't
 501  * insert new keys or remove them.
 502  */
 503 int
 504 PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
 505 {
 506         register dictobject *mp;
 507         register long hash;
 508         register int n_used;
 509
 510         if (!PyDict_Check(op)) {
 511                 PyErr_BadInternalCall();
 512                 return -1;
 513         }
 514         mp = (dictobject *)op;
 515 #ifdef CACHE_HASH
 516         if (PyString_CheckExact(key)) {
 517 #ifdef INTERN_STRINGS
 518                 if (((PyStringObject *)key)->ob_sinterned != NULL) {
 519                         key = ((PyStringObject *)key)->ob_sinterned;
 520                         hash = ((PyStringObject *)key)->ob_shash;
 521                 }
 522                 else
 523 #endif
 524                 {
 525                         hash = ((PyStringObject *)key)->ob_shash;
 526                         if (hash == -1)
 527                                 hash = PyObject_Hash(key);
 528                 }
 529         }
 530         else
 531 #endif
 532         {
 533                 hash = PyObject_Hash(key);
 534                 if (hash == -1)
 535                         return -1;
 536         }
 537         assert(mp->ma_fill <= mp->ma_mask);  /* at least one empty slot */
 538         n_used = mp->ma_used;
 539         Py_INCREF(value);
 540         Py_INCREF(key);
 541         insertdict(mp, key, hash, value);
 542         /* If we added a key, we can safely resize.  Otherwise skip this!
 543          * If fill >= 2/3 size, adjust size.  Normally, this doubles the
 544          * size, but it's also possible for the dict to shrink (if ma_fill is
 545          * much larger than ma_used, meaning a lot of dict keys have been
 546          * deleted).
 547          */
 548         if (mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2) {
 549                 if (dictresize(mp, mp->ma_used*2) != 0)
 550                         return -1;
 551         }
 552         return 0;
 553 }
 554
 555 int
 556 PyDict_DelItem(PyObject *op, PyObject *key)
 557 {
 558         register dictobject *mp;
 559         register long hash;
 560         register dictentry *ep;
 561         PyObject *old_value, *old_key;
 562
 563         if (!PyDict_Check(op)) {
 564                 PyErr_BadInternalCall();
 565                 return -1;
 566         }
 567 #ifdef CACHE_HASH
 568         if (!PyString_CheckExact(key) ||
 569             (hash = ((PyStringObject *) key)->ob_shash) == -1)
 570 #endif
 571         {
 572                 hash = PyObject_Hash(key);
 573                 if (hash == -1)
 574                         return -1;
 575         }
 576         mp = (dictobject *)op;
 577         ep = (mp->ma_lookup)(mp, key, hash);
 578         if (ep->me_value == NULL) {
 579                 PyErr_SetObject(PyExc_KeyError, key);
 580                 return -1;
 581         }
 582         old_key = ep->me_key;
 583         Py_INCREF(dummy);
 584         ep->me_key = dummy;
 585         old_value = ep->me_value;
 586         ep->me_value = NULL;
 587         mp->ma_used--;
 588         Py_DECREF(old_value);
 589         Py_DECREF(old_key);
 590         return 0;
 591 }
 592
 593 void
 594 PyDict_Clear(PyObject *op)
 595 {
 596         dictobject *mp;
 597         dictentry *ep, *table;
 598         int table_is_malloced;
 599         int fill;
 600         dictentry small_copy[PyDict_MINSIZE];
 601 #ifdef Py_DEBUG
 602         int i, n;
 603 #endif
 604
 605         if (!PyDict_Check(op))
 606                 return;
 607         mp = (dictobject *)op;
 608 #ifdef Py_DEBUG
 609         n = mp->ma_mask + 1;
 610         i = 0;
 611 #endif
 612
 613         table = mp->ma_table;
 614         assert(table != NULL);
 615         table_is_malloced = table != mp->ma_smalltable;
 616
 617         /* This is delicate.  During the process of clearing the dict,
 618          * decrefs can cause the dict to mutate.  To avoid fatal confusion
 619          * (voice of experience), we have to make the dict empty before
 620          * clearing the slots, and never refer to anything via mp->xxx while
 621          * clearing.
 622          */
 623         fill = mp->ma_fill;
 624         if (table_is_malloced)
 625                 EMPTY_TO_MINSIZE(mp);
 626
 627         else if (fill > 0) {
 628                 /* It's a small table with something that needs to be cleared.
 629                  * Afraid the only safe way is to copy the dict entries into
 630                  * another small table first.
 631                  */
 632                 memcpy(small_copy, table, sizeof(small_copy));
 633                 table = small_copy;
 634                 EMPTY_TO_MINSIZE(mp);
 635         }
 636         /* else it's a small table that's already empty */
 637
 638         /* Now we can finally clear things.  If C had refcounts, we could
 639          * assert that the refcount on table is 1 now, i.e. that this function
 640          * has unique access to it, so decref side-effects can't alter it.
 641          */
 642         for (ep = table; fill > 0; ++ep) {
 643 #ifdef Py_DEBUG
 644                 assert(i < n);
 645                 ++i;
 646 #endif
 647                 if (ep->me_key) {
 648                         --fill;
 649                         Py_DECREF(ep->me_key);
 650                         Py_XDECREF(ep->me_value);
 651                 }
 652 #ifdef Py_DEBUG
 653                 else
 654                         assert(ep->me_value == NULL);
 655 #endif
 656         }
 657
 658         if (table_is_malloced)
 659                 PyMem_DEL(table);
 660 }
 661
 662 /* CAUTION:  In general, it isn't safe to use PyDict_Next in a loop that
 663  * mutates the dict.  One exception:  it is safe if the loop merely changes
 664  * the values associated with the keys (but doesn't insert new keys or
 665  * delete keys), via PyDict_SetItem().
 666  */
 667 int
 668 PyDict_Next(PyObject *op, int *ppos, PyObject **pkey, PyObject **pvalue)
 669 {
 670         int i;
 671         register dictobject *mp;
 672         if (!PyDict_Check(op))
 673                 return 0;
 674         mp = (dictobject *)op;
 675         i = *ppos;
 676         if (i < 0)
 677                 return 0;
 678         while (i <= mp->ma_mask && mp->ma_table[i].me_value == NULL)
 679                 i++;
 680         *ppos = i+1;
 681         if (i > mp->ma_mask)
 682                 return 0;
 683         if (pkey)
 684                 *pkey = mp->ma_table[i].me_key;
 685         if (pvalue)
 686                 *pvalue = mp->ma_table[i].me_value;
 687         return 1;
 688 }
 689
 690 /* Methods */
 691
 692 static void
 693 dict_dealloc(register dictobject *mp)
 694 {
 695         register dictentry *ep;
 696         int fill = mp->ma_fill;
 697         Py_TRASHCAN_SAFE_BEGIN(mp)
 698         _PyObject_GC_UNTRACK(mp);
 699         for (ep = mp->ma_table; fill > 0; ep++) {
 700                 if (ep->me_key) {
 701                         --fill;
 702                         Py_DECREF(ep->me_key);
 703                         Py_XDECREF(ep->me_value);
 704                 }
 705         }
 706         if (mp->ma_table != mp->ma_smalltable)
 707                 PyMem_DEL(mp->ma_table);
 708         mp->ob_type->tp_free((PyObject *)mp);
 709         Py_TRASHCAN_SAFE_END(mp)
 710 }
 711
 712 static int
 713 dict_print(register dictobject *mp, register FILE *fp, register int flags)
 714 {
 715         register int i;
 716         register int any;
 717
 718         i = Py_ReprEnter((PyObject*)mp);
 719         if (i != 0) {
 720                 if (i < 0)
 721                         return i;
 722                 fprintf(fp, "{...}");
 723                 return 0;
 724         }
 725
 726         fprintf(fp, "{");
 727         any = 0;
 728         for (i = 0; i <= mp->ma_mask; i++) {
 729                 dictentry *ep = mp->ma_table + i;
 730                 PyObject *pvalue = ep->me_value;
 731                 if (pvalue != NULL) {
 732                         /* Prevent PyObject_Repr from deleting value during
 733                            key format */
 734                         Py_INCREF(pvalue);
 735                         if (any++ > 0)
 736                                 fprintf(fp, ", ");
 737                         if (PyObject_Print((PyObject *)ep->me_key, fp, 0)!=0) {
 738                                 Py_DECREF(pvalue);
 739                                 Py_ReprLeave((PyObject*)mp);
 740                                 return -1;
 741                         }
 742                         fprintf(fp, ": ");
 743                         if (PyObject_Print(pvalue, fp, 0) != 0) {
 744                                 Py_DECREF(pvalue);
 745                                 Py_ReprLeave((PyObject*)mp);
 746                                 return -1;
 747                         }
 748                         Py_DECREF(pvalue);
 749                 }
 750         }
 751         fprintf(fp, "}");
 752         Py_ReprLeave((PyObject*)mp);
 753         return 0;
 754 }
 755
 756 static PyObject *
 757 dict_repr(dictobject *mp)
 758 {
 759         int i;
 760         PyObject *s, *temp, *colon = NULL;
 761         PyObject *pieces = NULL, *result = NULL;
 762         PyObject *key, *value;
 763
 764         i = Py_ReprEnter((PyObject *)mp);
 765         if (i != 0) {
 766                 return i > 0 ? PyString_FromString("{...}") : NULL;
 767         }
 768
 769         if (mp->ma_used == 0) {
 770                 result = PyString_FromString("{}");
 771                 goto Done;
 772         }
 773
 774         pieces = PyList_New(0);
 775         if (pieces == NULL)
 776                 goto Done;
 777
 778         colon = PyString_FromString(": ");
 779         if (colon == NULL)
 780                 goto Done;
 781
 782         /* Do repr() on each key+value pair, and insert ": " between them.
 783            Note that repr may mutate the dict. */
 784         i = 0;
 785         while (PyDict_Next((PyObject *)mp, &i, &key, &value)) {
 786                 int status;
 787                 /* Prevent repr from deleting value during key format. */
 788                 Py_INCREF(value);
 789                 s = PyObject_Repr(key);
 790                 PyString_Concat(&s, colon);
 791                 PyString_ConcatAndDel(&s, PyObject_Repr(value));
 792                 Py_DECREF(value);
 793                 if (s == NULL)
 794                         goto Done;
 795                 status = PyList_Append(pieces, s);
 796                 Py_DECREF(s);  /* append created a new ref */
 797                 if (status < 0)
 798                         goto Done;
 799         }
 800
 801         /* Add "{}" decorations to the first and last items. */
 802         assert(PyList_GET_SIZE(pieces) > 0);
 803         s = PyString_FromString("{");
 804         if (s == NULL)
 805                 goto Done;
 806         temp = PyList_GET_ITEM(pieces, 0);
 807         PyString_ConcatAndDel(&s, temp);
 808         PyList_SET_ITEM(pieces, 0, s);
 809         if (s == NULL)
 810                 goto Done;
 811
 812         s = PyString_FromString("}");
 813         if (s == NULL)
 814                 goto Done;
 815         temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1);
 816         PyString_ConcatAndDel(&temp, s);
 817         PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp);
 818         if (temp == NULL)
 819                 goto Done;
 820
 821         /* Paste them all together with ", " between. */
 822         s = PyString_FromString(", ");
 823         if (s == NULL)
 824                 goto Done;
 825         result = _PyString_Join(s, pieces);
 826         Py_DECREF(s);
 827
 828 Done:
 829         Py_XDECREF(pieces);
 830         Py_XDECREF(colon);
 831         Py_ReprLeave((PyObject *)mp);
 832         return result;
 833 }
 834
 835 static int
 836 dict_length(dictobject *mp)
 837 {
 838         return mp->ma_used;
 839 }
 840
 841 static PyObject *
 842 dict_subscript(dictobject *mp, register PyObject *key)
 843 {
 844         PyObject *v;
 845         long hash;
 846         assert(mp->ma_table != NULL);
 847 #ifdef CACHE_HASH
 848         if (!PyString_CheckExact(key) ||
 849             (hash = ((PyStringObject *) key)->ob_shash) == -1)
 850 #endif
 851         {
 852                 hash = PyObject_Hash(key);
 853                 if (hash == -1)
 854                         return NULL;
 855         }
 856         v = (mp->ma_lookup)(mp, key, hash) -> me_value;
 857         if (v == NULL)
 858                 PyErr_SetObject(PyExc_KeyError, key);
 859         else
 860                 Py_INCREF(v);
 861         return v;
 862 }
 863
 864 static int
 865 dict_ass_sub(dictobject *mp, PyObject *v, PyObject *w)
 866 {
 867         if (w == NULL)
 868                 return PyDict_DelItem((PyObject *)mp, v);
 869         else
 870                 return PyDict_SetItem((PyObject *)mp, v, w);
 871 }
 872
 873 static PyMappingMethods dict_as_mapping = {
 874         (inquiry)dict_length, /*mp_length*/
 875         (binaryfunc)dict_subscript, /*mp_subscript*/
 876         (objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
 877 };
 878
 879 static PyObject *
 880 dict_keys(register dictobject *mp)
 881 {
 882         register PyObject *v;
 883         register int i, j, n;
 884
 885   again:
 886         n = mp->ma_used;
 887         v = PyList_New(n);
 888         if (v == NULL)
 889                 return NULL;
 890         if (n != mp->ma_used) {
 891                 /* Durnit.  The allocations caused the dict to resize.
 892                  * Just start over, this shouldn't normally happen.
 893                  */
 894                 Py_DECREF(v);
 895                 goto again;
 896         }
 897         for (i = 0, j = 0; i <= mp->ma_mask; i++) {
 898                 if (mp->ma_table[i].me_value != NULL) {
 899                         PyObject *key = mp->ma_table[i].me_key;
 900                         Py_INCREF(key);
 901                         PyList_SET_ITEM(v, j, key);
 902                         j++;
 903                 }
 904         }
 905         return v;
 906 }
 907
 908 static PyObject *
 909 dict_values(register dictobject *mp)
 910 {
 911         register PyObject *v;
 912         register int i, j, n;
 913
 914   again:
 915         n = mp->ma_used;
 916         v = PyList_New(n);
 917         if (v == NULL)
 918                 return NULL;
 919         if (n != mp->ma_used) {
 920                 /* Durnit.  The allocations caused the dict to resize.
 921                  * Just start over, this shouldn't normally happen.
 922                  */
 923                 Py_DECREF(v);
 924                 goto again;
 925         }
 926         for (i = 0, j = 0; i <= mp->ma_mask; i++) {
 927                 if (mp->ma_table[i].me_value != NULL) {
 928                         PyObject *value = mp->ma_table[i].me_value;
 929                         Py_INCREF(value);
 930                         PyList_SET_ITEM(v, j, value);
 931                         j++;
 932                 }
 933         }
 934         return v;
 935 }
 936
 937 static PyObject *
 938 dict_items(register dictobject *mp)
 939 {
 940         register PyObject *v;
 941         register int i, j, n;
 942         PyObject *item, *key, *value;
 943
 944         /* Preallocate the list of tuples, to avoid allocations during
 945          * the loop over the items, which could trigger GC, which
 946          * could resize the dict. :-(
 947          */
 948   again:
 949         n = mp->ma_used;
 950         v = PyList_New(n);
 951         if (v == NULL)
 952                 return NULL;
 953         for (i = 0; i < n; i++) {
 954                 item = PyTuple_New(2);
 955                 if (item == NULL) {
 956                         Py_DECREF(v);
 957                         return NULL;
 958                 }
 959                 PyList_SET_ITEM(v, i, item);
 960         }
 961         if (n != mp->ma_used) {
 962                 /* Durnit.  The allocations caused the dict to resize.
 963                  * Just start over, this shouldn't normally happen.
 964                  */
 965                 Py_DECREF(v);
 966                 goto again;
 967         }
 968         /* Nothing we do below makes any function calls. */
 969         for (i = 0, j = 0; i <= mp->ma_mask; i++) {
 970                 if (mp->ma_table[i].me_value != NULL) {
 971                         key = mp->ma_table[i].me_key;
 972                         value = mp->ma_table[i].me_value;
 973                         item = PyList_GET_ITEM(v, j);
 974                         Py_INCREF(key);
 975                         PyTuple_SET_ITEM(item, 0, key);
 976                         Py_INCREF(value);
 977                         PyTuple_SET_ITEM(item, 1, value);
 978                         j++;
 979                 }
 980         }
 981         assert(j == n);
 982         return v;
 983 }
 984
 985 static PyObject *
 986 dict_update(PyObject *mp, PyObject *other)
 987 {
 988         if (PyDict_Update(mp, other) < 0)
 989                 return NULL;
 990         Py_INCREF(Py_None);
 991         return Py_None;
 992 }
 993
 994 /* Update unconditionally replaces existing items.
 995    Merge has a 3rd argument 'override'; if set, it acts like Update,
 996    otherwise it leaves existing items unchanged.
 997
 998    PyDict_{Update,Merge} update/merge from a mapping object.
 999
1000    PyDict_MergeFromSeq2 updates/merges from any iterable object
1001    producing iterable objects of length 2.
1002 */
1003
1004 int
1005 PyDict_MergeFromSeq2(PyObject *d, PyObject *seq2, int override)
1006 {
1007         PyObject *it;   /* iter(seq2) */
1008         int i;          /* index into seq2 of current element */
1009         PyObject *item; /* seq2[i] */
1010         PyObject *fast; /* item as a 2-tuple or 2-list */
1011
1012         assert(d != NULL);
1013         assert(PyDict_Check(d));
1014         assert(seq2 != NULL);
1015
1016         it = PyObject_GetIter(seq2);
1017         if (it == NULL)
1018                 return -1;
1019
1020         for (i = 0; ; ++i) {
1021                 PyObject *key, *value;
1022                 int n;
1023
1024                 fast = NULL;
1025                 item = PyIter_Next(it);
1026                 if (item == NULL) {
1027                         if (PyErr_Occurred())
1028                                 goto Fail;
1029                         break;
1030                 }
1031
1032                 /* Convert item to sequence, and verify length 2. */
1033                 fast = PySequence_Fast(item, "");
1034                 if (fast == NULL) {
1035                         if (PyErr_ExceptionMatches(PyExc_TypeError))
1036                                 PyErr_Format(PyExc_TypeError,
1037                                         "cannot convert dictionary update "
1038                                         "sequence element #%d to a sequence",
1039                                         i);
1040                         goto Fail;
1041                 }
1042                 n = PySequence_Fast_GET_SIZE(fast);
1043                 if (n != 2) {
1044                         PyErr_Format(PyExc_ValueError,
1045                                      "dictionary update sequence element #%d "
1046                                      "has length %d; 2 is required",
1047                                      i, n);
1048                         goto Fail;
1049                 }
1050
1051                 /* Update/merge with this (key, value) pair. */
1052                 key = PySequence_Fast_GET_ITEM(fast, 0);
1053                 value = PySequence_Fast_GET_ITEM(fast, 1);
1054                 if (override || PyDict_GetItem(d, key) == NULL) {
1055                         int status = PyDict_SetItem(d, key, value);
1056                         if (status < 0)
1057                                 goto Fail;
1058                 }
1059                 Py_DECREF(fast);
1060                 Py_DECREF(item);
1061         }
1062
1063         i = 0;
1064         goto Return;
1065 Fail:
1066         Py_XDECREF(item);
1067         Py_XDECREF(fast);
1068         i = -1;
1069 Return:
1070         Py_DECREF(it);
1071         return i;
1072 }
1073
1074 int
1075 PyDict_Update(PyObject *a, PyObject *b)
1076 {
1077         return PyDict_Merge(a, b, 1);
1078 }
1079
1080 int
1081 PyDict_Merge(PyObject *a, PyObject *b, int override)
1082 {
1083         register PyDictObject *mp, *other;
1084         register int i;
1085         dictentry *entry;
1086
1087         /* We accept for the argument either a concrete dictionary object,
1088          * or an abstract "mapping" object.  For the former, we can do
1089          * things quite efficiently.  For the latter, we only require that
1090          * PyMapping_Keys() and PyObject_GetItem() be supported.
1091          */
1092         if (a == NULL || !PyDict_Check(a) || b == NULL) {
1093                 PyErr_BadInternalCall();
1094                 return -1;
1095         }
1096         mp = (dictobject*)a;
1097         if (PyDict_Check(b)) {
1098                 other = (dictobject*)b;
1099                 if (other == mp || other->ma_used == 0)
1100                         /* a.update(a) or a.update({}); nothing to do */
1101                         return 0;
1102                 /* Do one big resize at the start, rather than
1103                  * incrementally resizing as we insert new items.  Expect
1104                  * that there will be no (or few) overlapping keys.
1105                  */
1106                 if ((mp->ma_fill + other->ma_used)*3 >= (mp->ma_mask+1)*2) {
1107                    if (dictresize(mp, (mp->ma_used + other->ma_used)*3/2) != 0)
1108                            return -1;
1109                 }
1110                 for (i = 0; i <= other->ma_mask; i++) {
1111                         entry = &other->ma_table[i];
1112                         if (entry->me_value != NULL &&
1113                             (override ||
1114                              PyDict_GetItem(a, entry->me_key) == NULL)) {
1115                                 Py_INCREF(entry->me_key);
1116                                 Py_INCREF(entry->me_value);
1117                                 insertdict(mp, entry->me_key, entry->me_hash,
1118                                            entry->me_value);
1119                         }
1120                 }
1121         }
1122         else {
1123                 /* Do it the generic, slower way */
1124                 PyObject *keys = PyMapping_Keys(b);
1125                 PyObject *iter;
1126                 PyObject *key, *value;
1127                 int status;
1128
1129                 if (keys == NULL)
1130                         /* Docstring says this is equivalent to E.keys() so
1131                          * if E doesn't have a .keys() method we want
1132                          * AttributeError to percolate up.  Might as well
1133                          * do the same for any other error.
1134                          */
1135                         return -1;
1136
1137                 iter = PyObject_GetIter(keys);
1138                 Py_DECREF(keys);
1139                 if (iter == NULL)
1140                         return -1;
1141
1142                 for (key = PyIter_Next(iter); key; key = PyIter_Next(iter)) {
1143                         if (!override && PyDict_GetItem(a, key) != NULL) {
1144                                 Py_DECREF(key);
1145                                 continue;
1146                         }
1147                         value = PyObject_GetItem(b, key);
1148                         if (value == NULL) {
1149                                 Py_DECREF(iter);
1150                                 Py_DECREF(key);
1151                                 return -1;
1152                         }
1153                         status = PyDict_SetItem(a, key, value);
1154                         Py_DECREF(key);
1155                         Py_DECREF(value);
1156                         if (status < 0) {
1157                                 Py_DECREF(iter);
1158                                 return -1;
1159                         }
1160                 }
1161                 Py_DECREF(iter);
1162                 if (PyErr_Occurred())
1163                         /* Iterator completed, via error */
1164                         return -1;
1165         }
1166         return 0;
1167 }
1168
1169 static PyObject *
1170 dict_copy(register dictobject *mp)
1171 {
1172         return PyDict_Copy((PyObject*)mp);
1173 }
1174
1175 PyObject *
1176 PyDict_Copy(PyObject *o)
1177 {
1178         register dictobject *mp;
1179         register int i;
1180         dictobject *copy;
1181         dictentry *entry;
1182
1183         if (o == NULL || !PyDict_Check(o)) {
1184                 PyErr_BadInternalCall();
1185                 return NULL;
1186         }
1187         mp = (dictobject *)o;
1188         copy = (dictobject *)PyDict_New();
1189         if (copy == NULL)
1190                 return NULL;
1191         if (mp->ma_used > 0) {
1192                 if (dictresize(copy, mp->ma_used*3/2) != 0)
1193                         return NULL;
1194                 for (i = 0; i <= mp->ma_mask; i++) {
1195                         entry = &mp->ma_table[i];
1196                         if (entry->me_value != NULL) {
1197                                 Py_INCREF(entry->me_key);
1198                                 Py_INCREF(entry->me_value);
1199                                 insertdict(copy, entry->me_key, entry->me_hash,
1200                                            entry->me_value);
1201                         }
1202                 }
1203         }
1204         return (PyObject *)copy;
1205 }
1206
1207 int
1208 PyDict_Size(PyObject *mp)
1209 {
1210         if (mp == NULL || !PyDict_Check(mp)) {
1211                 PyErr_BadInternalCall();
1212                 return 0;
1213         }
1214         return ((dictobject *)mp)->ma_used;
1215 }
1216
1217 PyObject *
1218 PyDict_Keys(PyObject *mp)
1219 {
1220         if (mp == NULL || !PyDict_Check(mp)) {
1221                 PyErr_BadInternalCall();
1222                 return NULL;
1223         }
1224         return dict_keys((dictobject *)mp);
1225 }
1226
1227 PyObject *
1228 PyDict_Values(PyObject *mp)
1229 {
1230         if (mp == NULL || !PyDict_Check(mp)) {
1231                 PyErr_BadInternalCall();
1232                 return NULL;
1233         }
1234         return dict_values((dictobject *)mp);
1235 }
1236
1237 PyObject *
1238 PyDict_Items(PyObject *mp)
1239 {
1240         if (mp == NULL || !PyDict_Check(mp)) {
1241                 PyErr_BadInternalCall();
1242                 return NULL;
1243         }
1244         return dict_items((dictobject *)mp);
1245 }
1246
1247 /* Subroutine which returns the smallest key in a for which b's value
1248    is different or absent.  The value is returned too, through the
1249    pval argument.  Both are NULL if no key in a is found for which b's status
1250    differs.  The refcounts on (and only on) non-NULL *pval and function return
1251    values must be decremented by the caller (characterize() increments them
1252    to ensure that mutating comparison and PyDict_GetItem calls can't delete
1253    them before the caller is done looking at them). */
1254
1255 static PyObject *
1256 characterize(dictobject *a, dictobject *b, PyObject **pval)
1257 {
1258         PyObject *akey = NULL; /* smallest key in a s.t. a[akey] != b[akey] */
1259         PyObject *aval = NULL; /* a[akey] */
1260         int i, cmp;
1261
1262         for (i = 0; i <= a->ma_mask; i++) {
1263                 PyObject *thiskey, *thisaval, *thisbval;
1264                 if (a->ma_table[i].me_value == NULL)
1265                         continue;
1266                 thiskey = a->ma_table[i].me_key;
1267                 Py_INCREF(thiskey);  /* keep alive across compares */
1268                 if (akey != NULL) {
1269                         cmp = PyObject_RichCompareBool(akey, thiskey, Py_LT);
1270                         if (cmp < 0) {
1271                                 Py_DECREF(thiskey);
1272                                 goto Fail;
1273                         }
1274                         if (cmp > 0 ||
1275                             i > a->ma_mask ||
1276                             a->ma_table[i].me_value == NULL)
1277                         {
1278                                 /* Not the *smallest* a key; or maybe it is
1279                                  * but the compare shrunk the dict so we can't
1280                                  * find its associated value anymore; or
1281                                  * maybe it is but the compare deleted the
1282                                  * a[thiskey] entry.
1283                                  */
1284                                 Py_DECREF(thiskey);
1285                                 continue;
1286                         }
1287                 }
1288
1289                 /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1290                 thisaval = a->ma_table[i].me_value;
1291                 assert(thisaval);
1292                 Py_INCREF(thisaval);   /* keep alive */
1293                 thisbval = PyDict_GetItem((PyObject *)b, thiskey);
1294                 if (thisbval == NULL)
1295                         cmp = 0;
1296                 else {
1297                         /* both dicts have thiskey:  same values? */
1298                         cmp = PyObject_RichCompareBool(
1299                                                 thisaval, thisbval, Py_EQ);
1300                         if (cmp < 0) {
1301                                 Py_DECREF(thiskey);
1302                                 Py_DECREF(thisaval);
1303                                 goto Fail;
1304                         }
1305                 }
1306                 if (cmp == 0) {
1307                         /* New winner. */
1308                         Py_XDECREF(akey);
1309                         Py_XDECREF(aval);
1310                         akey = thiskey;
1311                         aval = thisaval;
1312                 }
1313                 else {
1314                         Py_DECREF(thiskey);
1315                         Py_DECREF(thisaval);
1316                 }
1317         }
1318         *pval = aval;
1319         return akey;
1320
1321 Fail:
1322         Py_XDECREF(akey);
1323         Py_XDECREF(aval);
1324         *pval = NULL;
1325         return NULL;
1326 }
1327
1328 static int
1329 dict_compare(dictobject *a, dictobject *b)
1330 {
1331         PyObject *adiff, *bdiff, *aval, *bval;
1332         int res;
1333
1334         /* Compare lengths first */
1335         if (a->ma_used < b->ma_used)
1336                 return -1;      /* a is shorter */
1337         else if (a->ma_used > b->ma_used)
1338                 return 1;       /* b is shorter */
1339
1340         /* Same length -- check all keys */
1341         bdiff = bval = NULL;
1342         adiff = characterize(a, b, &aval);
1343         if (adiff == NULL) {
1344                 assert(!aval);
1345                 /* Either an error, or a is a subset with the same length so
1346                  * must be equal.
1347                  */
1348                 res = PyErr_Occurred() ? -1 : 0;
1349                 goto Finished;
1350         }
1351         bdiff = characterize(b, a, &bval);
1352         if (bdiff == NULL && PyErr_Occurred()) {
1353                 assert(!bval);
1354                 res = -1;
1355                 goto Finished;
1356         }
1357         res = 0;
1358         if (bdiff) {
1359                 /* bdiff == NULL "should be" impossible now, but perhaps
1360                  * the last comparison done by the characterize() on a had
1361                  * the side effect of making the dicts equal!
1362                  */
1363                 res = PyObject_Compare(adiff, bdiff);
1364         }
1365         if (res == 0 && bval != NULL)
1366                 res = PyObject_Compare(aval, bval);
1367
1368 Finished:
1369         Py_XDECREF(adiff);
1370         Py_XDECREF(bdiff);
1371         Py_XDECREF(aval);
1372         Py_XDECREF(bval);
1373         return res;
1374 }
1375
1376 /* Return 1 if dicts equal, 0 if not, -1 if error.
1377  * Gets out as soon as any difference is detected.
1378  * Uses only Py_EQ comparison.
1379  */
1380 static int
1381 dict_equal(dictobject *a, dictobject *b)
1382 {
1383         int i;
1384
1385         if (a->ma_used != b->ma_used)
1386                 /* can't be equal if # of entries differ */
1387                 return 0;
1388
1389         /* Same # of entries -- check all of 'em.  Exit early on any diff. */
1390         for (i = 0; i <= a->ma_mask; i++) {
1391                 PyObject *aval = a->ma_table[i].me_value;
1392                 if (aval != NULL) {
1393                         int cmp;
1394                         PyObject *bval;
1395                         PyObject *key = a->ma_table[i].me_key;
1396                         /* temporarily bump aval's refcount to ensure it stays
1397                            alive until we're done with it */
1398                         Py_INCREF(aval);
1399                         bval = PyDict_GetItem((PyObject *)b, key);
1400                         if (bval == NULL) {
1401                                 Py_DECREF(aval);
1402                                 return 0;
1403                         }
1404                         cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
1405                         Py_DECREF(aval);
1406                         if (cmp <= 0)  /* error or not equal */
1407                                 return cmp;
1408                 }
1409         }
1410         return 1;
1411  }
1412
1413 static PyObject *
1414 dict_richcompare(PyObject *v, PyObject *w, int op)
1415 {
1416         int cmp;
1417         PyObject *res;
1418
1419         if (!PyDict_Check(v) || !PyDict_Check(w)) {
1420                 res = Py_NotImplemented;
1421         }
1422         else if (op == Py_EQ || op == Py_NE) {
1423                 cmp = dict_equal((dictobject *)v, (dictobject *)w);
1424                 if (cmp < 0)
1425                         return NULL;
1426                 res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
1427         }
1428         else
1429                 res = Py_NotImplemented;
1430         Py_INCREF(res);
1431         return res;
1432  }
1433
1434 static PyObject *
1435 dict_has_key(register dictobject *mp, PyObject *key)
1436 {
1437         long hash;
1438         register long ok;
1439 #ifdef CACHE_HASH
1440         if (!PyString_CheckExact(key) ||
1441             (hash = ((PyStringObject *) key)->ob_shash) == -1)
1442 #endif
1443         {
1444                 hash = PyObject_Hash(key);
1445                 if (hash == -1)
1446                         return NULL;
1447         }
1448         ok = (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
1449         return PyInt_FromLong(ok);
1450 }
1451
1452 static PyObject *
1453 dict_get(register dictobject *mp, PyObject *args)
1454 {
1455         PyObject *key;
1456         PyObject *failobj = Py_None;
1457         PyObject *val = NULL;
1458         long hash;
1459
1460         if (!PyArg_ParseTuple(args, "O|O:get", &key, &failobj))
1461                 return NULL;
1462
1463 #ifdef CACHE_HASH
1464         if (!PyString_CheckExact(key) ||
1465             (hash = ((PyStringObject *) key)->ob_shash) == -1)
1466 #endif
1467         {
1468                 hash = PyObject_Hash(key);
1469                 if (hash == -1)
1470                         return NULL;
1471         }
1472         val = (mp->ma_lookup)(mp, key, hash)->me_value;
1473
1474         if (val == NULL)
1475                 val = failobj;
1476         Py_INCREF(val);
1477         return val;
1478 }
1479
1480
1481 static PyObject *
1482 dict_setdefault(register dictobject *mp, PyObject *args)
1483 {
1484         PyObject *key;
1485         PyObject *failobj = Py_None;
1486         PyObject *val = NULL;
1487         long hash;
1488
1489         if (!PyArg_ParseTuple(args, "O|O:setdefault", &key, &failobj))
1490                 return NULL;
1491
1492 #ifdef CACHE_HASH
1493         if (!PyString_CheckExact(key) ||
1494             (hash = ((PyStringObject *) key)->ob_shash) == -1)
1495 #endif
1496         {
1497                 hash = PyObject_Hash(key);
1498                 if (hash == -1)
1499                         return NULL;
1500         }
1501         val = (mp->ma_lookup)(mp, key, hash)->me_value;
1502         if (val == NULL) {
1503                 val = failobj;
1504                 if (PyDict_SetItem((PyObject*)mp, key, failobj))
1505                         val = NULL;
1506         }
1507         Py_XINCREF(val);
1508         return val;
1509 }
1510
1511
1512 static PyObject *
1513 dict_clear(register dictobject *mp)
1514 {
1515         PyDict_Clear((PyObject *)mp);
1516         Py_INCREF(Py_None);
1517         return Py_None;
1518 }
1519
1520 static PyObject *
1521 dict_popitem(dictobject *mp)
1522 {
1523         int i = 0;
1524         dictentry *ep;
1525         PyObject *res;
1526
1527         /* Allocate the result tuple before checking the size.  Believe it
1528          * or not, this allocation could trigger a garbage collection which
1529          * could empty the dict, so if we checked the size first and that
1530          * happened, the result would be an infinite loop (searching for an
1531          * entry that no longer exists).  Note that the usual popitem()
1532          * idiom is "while d: k, v = d.popitem()". so needing to throw the
1533          * tuple away  if the dict *is* empty isn't a significant
1534          * inefficiency -- possible, but unlikely in practice.
1535          */
1536         res = PyTuple_New(2);
1537         if (res == NULL)
1538                 return NULL;
1539         if (mp->ma_used == 0) {
1540                 Py_DECREF(res);
1541                 PyErr_SetString(PyExc_KeyError,
1542                                 "popitem(): dictionary is empty");
1543                 return NULL;
1544         }
1545         /* Set ep to "the first" dict entry with a value.  We abuse the hash
1546          * field of slot 0 to hold a search finger:
1547          * If slot 0 has a value, use slot 0.
1548          * Else slot 0 is being used to hold a search finger,
1549          * and we use its hash value as the first index to look.
1550          */
1551         ep = &mp->ma_table[0];
1552         if (ep->me_value == NULL) {
1553                 i = (int)ep->me_hash;
1554                 /* The hash field may be a real hash value, or it may be a
1555                  * legit search finger, or it may be a once-legit search
1556                  * finger that's out of bounds now because it wrapped around
1557                  * or the table shrunk -- simply make sure it's in bounds now.
1558                  */
1559                 if (i > mp->ma_mask || i < 1)
1560                         i = 1;  /* skip slot 0 */
1561                 while ((ep = &mp->ma_table[i])->me_value == NULL) {
1562                         i++;
1563                         if (i > mp->ma_mask)
1564                                 i = 1;
1565                 }
1566         }
1567         PyTuple_SET_ITEM(res, 0, ep->me_key);
1568         PyTuple_SET_ITEM(res, 1, ep->me_value);
1569         Py_INCREF(dummy);
1570         ep->me_key = dummy;
1571         ep->me_value = NULL;
1572         mp->ma_used--;
1573         assert(mp->ma_table[0].me_value == NULL);
1574         mp->ma_table[0].me_hash = i + 1;  /* next place to start */
1575         return res;
1576 }
1577
1578 static int
1579 dict_traverse(PyObject *op, visitproc visit, void *arg)
1580 {
1581         int i = 0, err;
1582         PyObject *pk;
1583         PyObject *pv;
1584
1585         while (PyDict_Next(op, &i, &pk, &pv)) {
1586                 err = visit(pk, arg);
1587                 if (err)
1588                         return err;
1589                 err = visit(pv, arg);
1590                 if (err)
1591                         return err;
1592         }
1593         return 0;
1594 }
1595
1596 static int
1597 dict_tp_clear(PyObject *op)
1598 {
1599         PyDict_Clear(op);
1600         return 0;
1601 }
1602
1603
1604 staticforward PyObject *dictiter_new(dictobject *, binaryfunc);
1605
1606 static PyObject *
1607 select_key(PyObject *key, PyObject *value)
1608 {
1609         Py_INCREF(key);
1610         return key;
1611 }
1612
1613 static PyObject *
1614 select_value(PyObject *key, PyObject *value)
1615 {
1616         Py_INCREF(value);
1617         return value;
1618 }
1619
1620 static PyObject *
1621 select_item(PyObject *key, PyObject *value)
1622 {
1623         PyObject *res = PyTuple_New(2);
1624
1625         if (res != NULL) {
1626                 Py_INCREF(key);
1627                 Py_INCREF(value);
1628                 PyTuple_SET_ITEM(res, 0, key);
1629                 PyTuple_SET_ITEM(res, 1, value);
1630         }
1631         return res;
1632 }
1633
1634 static PyObject *
1635 dict_iterkeys(dictobject *dict)
1636 {
1637         return dictiter_new(dict, select_key);
1638 }
1639
1640 static PyObject *
1641 dict_itervalues(dictobject *dict)
1642 {
1643         return dictiter_new(dict, select_value);
1644 }
1645
1646 static PyObject *
1647 dict_iteritems(dictobject *dict)
1648 {
1649         return dictiter_new(dict, select_item);
1650 }
1651
1652
1653 static char has_key__doc__[] =
1654 "D.has_key(k) -> 1 if D has a key k, else 0";
1655
1656 static char get__doc__[] =
1657 "D.get(k[,d]) -> D[k] if D.has_key(k), else d.  d defaults to None.";
1658
1659 static char setdefault_doc__[] =
1660 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if not D.has_key(k)";
1661
1662 static char popitem__doc__[] =
1663 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
1664 2-tuple; but raise KeyError if D is empty";
1665
1666 static char keys__doc__[] =
1667 "D.keys() -> list of D's keys";
1668
1669 static char items__doc__[] =
1670 "D.items() -> list of D's (key, value) pairs, as 2-tuples";
1671
1672 static char values__doc__[] =
1673 "D.values() -> list of D's values";
1674
1675 static char update__doc__[] =
1676 "D.update(E) -> None.  Update D from E: for k in E.keys(): D[k] = E[k]";
1677
1678 static char clear__doc__[] =
1679 "D.clear() -> None.  Remove all items from D.";
1680
1681 static char copy__doc__[] =
1682 "D.copy() -> a shallow copy of D";
1683
1684 static char iterkeys__doc__[] =
1685 "D.iterkeys() -> an iterator over the keys of D";
1686
1687 static char itervalues__doc__[] =
1688 "D.itervalues() -> an iterator over the values of D";
1689
1690 static char iteritems__doc__[] =
1691 "D.iteritems() -> an iterator over the (key, value) items of D";
1692
1693 static PyMethodDef mapp_methods[] = {
1694         {"has_key",     (PyCFunction)dict_has_key,      METH_O,
1695          has_key__doc__},
1696         {"get",         (PyCFunction)dict_get,          METH_VARARGS,
1697          get__doc__},
1698         {"setdefault",  (PyCFunction)dict_setdefault,   METH_VARARGS,
1699          setdefault_doc__},
1700         {"popitem",     (PyCFunction)dict_popitem,      METH_NOARGS,
1701          popitem__doc__},
1702         {"keys",        (PyCFunction)dict_keys,         METH_NOARGS,
1703         keys__doc__},
1704         {"items",       (PyCFunction)dict_items,        METH_NOARGS,
1705          items__doc__},
1706         {"values",      (PyCFunction)dict_values,       METH_NOARGS,
1707          values__doc__},
1708         {"update",      (PyCFunction)dict_update,       METH_O,
1709          update__doc__},
1710         {"clear",       (PyCFunction)dict_clear,        METH_NOARGS,
1711          clear__doc__},
1712         {"copy",        (PyCFunction)dict_copy,         METH_NOARGS,
1713          copy__doc__},
1714         {"iterkeys",    (PyCFunction)dict_iterkeys,     METH_NOARGS,
1715          iterkeys__doc__},
1716         {"itervalues",  (PyCFunction)dict_itervalues,   METH_NOARGS,
1717          itervalues__doc__},
1718         {"iteritems",   (PyCFunction)dict_iteritems,    METH_NOARGS,
1719          iteritems__doc__},
1720         {NULL,          NULL}   /* sentinel */
1721 };
1722
1723 static int
1724 dict_contains(dictobject *mp, PyObject *key)
1725 {
1726         long hash;
1727
1728 #ifdef CACHE_HASH
1729         if (!PyString_CheckExact(key) ||
1730             (hash = ((PyStringObject *) key)->ob_shash) == -1)
1731 #endif
1732         {
1733                 hash = PyObject_Hash(key);
1734                 if (hash == -1)
1735                         return -1;
1736         }
1737         return (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
1738 }
1739
1740 /* Hack to implement "key in dict" */
1741 static PySequenceMethods dict_as_sequence = {
1742         0,                                      /* sq_length */
1743         0,                                      /* sq_concat */
1744         0,                                      /* sq_repeat */
1745         0,                                      /* sq_item */
1746         0,                                      /* sq_slice */
1747         0,                                      /* sq_ass_item */
1748         0,                                      /* sq_ass_slice */
1749         (objobjproc)dict_contains,              /* sq_contains */
1750         0,                                      /* sq_inplace_concat */
1751         0,                                      /* sq_inplace_repeat */
1752 };
1753
1754 static PyObject *
1755 dict_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
1756 {
1757         PyObject *self;
1758
1759         assert(type != NULL && type->tp_alloc != NULL);
1760         self = type->tp_alloc(type, 0);
1761         if (self != NULL) {
1762                 PyDictObject *d = (PyDictObject *)self;
1763                 /* It's guaranteed that tp->alloc zeroed out the struct. */
1764                 assert(d->ma_table == NULL && d->ma_fill == 0 && d->ma_used == 0);
1765                 INIT_NONZERO_DICT_SLOTS(d);
1766                 d->ma_lookup = lookdict_string;
1767 #ifdef SHOW_CONVERSION_COUNTS
1768                 ++created;
1769 #endif
1770         }
1771         return self;
1772 }
1773
1774 static int
1775 dict_init(PyObject *self, PyObject *args, PyObject *kwds)
1776 {
1777         PyObject *arg = NULL;
1778         static char *kwlist[] = {"items", 0};
1779         int result = 0;
1780
1781         if (!PyArg_ParseTupleAndKeywords(args, kwds, "|O:dict",
1782                                          kwlist, &arg))
1783                 result = -1;
1784
1785         else if (arg != NULL) {
1786                 if (PyObject_HasAttrString(arg, "keys"))
1787                         result = PyDict_Merge(self, arg, 1);
1788                 else
1789                         result = PyDict_MergeFromSeq2(self, arg, 1);
1790         }
1791         return result;
1792 }
1793
1794 static long
1795 dict_nohash(PyObject *self)
1796 {
1797         PyErr_SetString(PyExc_TypeError, "dict objects are unhashable");
1798         return -1;
1799 }
1800
1801 static PyObject *
1802 dict_iter(dictobject *dict)
1803 {
1804         return dictiter_new(dict, select_key);
1805 }
1806
1807 static char dictionary_doc[] =
1808 "dict() -> new empty dictionary.\n"
1809 "dict(mapping) -> new dictionary initialized from a mapping object's\n"
1810 "    (key, value) pairs.\n"
1811 "dict(seq) -> new dictionary initialized as if via:\n"
1812 "    d = {}\n"
1813 "    for k, v in seq:\n"
1814 "        d[k] = v";
1815
1816 PyTypeObject PyDict_Type = {
1817         PyObject_HEAD_INIT(&PyType_Type)
1818         0,
1819         "dict",
1820         sizeof(dictobject),
1821         0,
1822         (destructor)dict_dealloc,               /* tp_dealloc */
1823         (printfunc)dict_print,                  /* tp_print */
1824         0,                                      /* tp_getattr */
1825         0,                                      /* tp_setattr */
1826         (cmpfunc)dict_compare,                  /* tp_compare */
1827         (reprfunc)dict_repr,                    /* tp_repr */
1828         0,                                      /* tp_as_number */
1829         &dict_as_sequence,                      /* tp_as_sequence */
1830         &dict_as_mapping,                       /* tp_as_mapping */
1831         dict_nohash,                            /* tp_hash */
1832         0,                                      /* tp_call */
1833         0,                                      /* tp_str */
1834         PyObject_GenericGetAttr,                /* tp_getattro */
1835         0,                                      /* tp_setattro */
1836         0,                                      /* tp_as_buffer */
1837         Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
1838                 Py_TPFLAGS_BASETYPE,            /* tp_flags */
1839         dictionary_doc,                         /* tp_doc */
1840         (traverseproc)dict_traverse,            /* tp_traverse */
1841         (inquiry)dict_tp_clear,                 /* tp_clear */
1842         dict_richcompare,                       /* tp_richcompare */
1843         0,                                      /* tp_weaklistoffset */
1844         (getiterfunc)dict_iter,                 /* tp_iter */
1845         0,                                      /* tp_iternext */
1846         mapp_methods,                           /* tp_methods */
1847         0,                                      /* tp_members */
1848         0,                                      /* tp_getset */
1849         0,                                      /* tp_base */
1850         0,                                      /* tp_dict */
1851         0,                                      /* tp_descr_get */
1852         0,                                      /* tp_descr_set */
1853         0,                                      /* tp_dictoffset */
1854         (initproc)dict_init,                    /* tp_init */
1855         PyType_GenericAlloc,                    /* tp_alloc */
1856         dict_new,                               /* tp_new */
1857         _PyObject_GC_Del,                       /* tp_free */
1858 };
1859
1860 /* For backward compatibility with old dictionary interface */
1861
1862 PyObject *
1863 PyDict_GetItemString(PyObject *v, char *key)
1864 {
1865         PyObject *kv, *rv;
1866         kv = PyString_FromString(key);
1867         if (kv == NULL)
1868                 return NULL;
1869         rv = PyDict_GetItem(v, kv);
1870         Py_DECREF(kv);
1871         return rv;
1872 }
1873
1874 int
1875 PyDict_SetItemString(PyObject *v, char *key, PyObject *item)
1876 {
1877         PyObject *kv;
1878         int err;
1879         kv = PyString_FromString(key);
1880         if (kv == NULL)
1881                 return -1;
1882         PyString_InternInPlace(&kv); /* XXX Should we really? */
1883         err = PyDict_SetItem(v, kv, item);
1884         Py_DECREF(kv);
1885         return err;
1886 }
1887
1888 int
1889 PyDict_DelItemString(PyObject *v, char *key)
1890 {
1891         PyObject *kv;
1892         int err;
1893         kv = PyString_FromString(key);
1894         if (kv == NULL)
1895                 return -1;
1896         err = PyDict_DelItem(v, kv);
1897         Py_DECREF(kv);
1898         return err;
1899 }
1900
1901 /* Dictionary iterator type */
1902
1903 extern PyTypeObject PyDictIter_Type; /* Forward */
1904
1905 typedef struct {
1906         PyObject_HEAD
1907         dictobject *di_dict;
1908         int di_used;
1909         int di_pos;
1910         binaryfunc di_select;
1911 } dictiterobject;
1912
1913 static PyObject *
1914 dictiter_new(dictobject *dict, binaryfunc select)
1915 {
1916         dictiterobject *di;
1917         di = PyObject_NEW(dictiterobject, &PyDictIter_Type);
1918         if (di == NULL)
1919                 return NULL;
1920         Py_INCREF(dict);
1921         di->di_dict = dict;
1922         di->di_used = dict->ma_used;
1923         di->di_pos = 0;
1924         di->di_select = select;
1925         return (PyObject *)di;
1926 }
1927
1928 static void
1929 dictiter_dealloc(dictiterobject *di)
1930 {
1931         Py_DECREF(di->di_dict);
1932         PyObject_DEL(di);
1933 }
1934
1935 static PyObject *
1936 dictiter_next(dictiterobject *di, PyObject *args)
1937 {
1938         PyObject *key, *value;
1939
1940         if (di->di_used != di->di_dict->ma_used) {
1941                 PyErr_SetString(PyExc_RuntimeError,
1942                                 "dictionary changed size during iteration");
1943                 return NULL;
1944         }
1945         if (PyDict_Next((PyObject *)(di->di_dict), &di->di_pos, &key, &value)) {
1946                 return (*di->di_select)(key, value);
1947         }
1948         PyErr_SetObject(PyExc_StopIteration, Py_None);
1949         return NULL;
1950 }
1951
1952 static PyObject *
1953 dictiter_getiter(PyObject *it)
1954 {
1955         Py_INCREF(it);
1956         return it;
1957 }
1958
1959 static PyMethodDef dictiter_methods[] = {
1960         {"next",        (PyCFunction)dictiter_next,     METH_VARARGS,
1961          "it.next() -- get the next value, or raise StopIteration"},
1962         {NULL,          NULL}           /* sentinel */
1963 };
1964
1965 static PyObject *dictiter_iternext(dictiterobject *di)
1966 {
1967         PyObject *key, *value;
1968
1969         if (di->di_used != di->di_dict->ma_used) {
1970                 PyErr_SetString(PyExc_RuntimeError,
1971                                 "dictionary changed size during iteration");
1972                 return NULL;
1973         }
1974         if (PyDict_Next((PyObject *)(di->di_dict), &di->di_pos, &key, &value)) {
1975                 return (*di->di_select)(key, value);
1976         }
1977         return NULL;
1978 }
1979
1980 PyTypeObject PyDictIter_Type = {
1981         PyObject_HEAD_INIT(&PyType_Type)
1982         0,                                      /* ob_size */
1983         "dictionary-iterator",                  /* tp_name */
1984         sizeof(dictiterobject),                 /* tp_basicsize */
1985         0,                                      /* tp_itemsize */
1986         /* methods */
1987         (destructor)dictiter_dealloc,           /* tp_dealloc */
1988         0,                                      /* tp_print */
1989         0,                                      /* tp_getattr */
1990         0,                                      /* tp_setattr */
1991         0,                                      /* tp_compare */
1992         0,                                      /* tp_repr */
1993         0,                                      /* tp_as_number */
1994         0,                                      /* tp_as_sequence */
1995         0,                                      /* tp_as_mapping */
1996         0,                                      /* tp_hash */
1997         0,                                      /* tp_call */
1998         0,                                      /* tp_str */
1999         PyObject_GenericGetAttr,                /* tp_getattro */
2000         0,                                      /* tp_setattro */
2001         0,                                      /* tp_as_buffer */
2002         Py_TPFLAGS_DEFAULT,                     /* tp_flags */
2003         0,                                      /* tp_doc */
2004         0,                                      /* tp_traverse */
2005         0,                                      /* tp_clear */
2006         0,                                      /* tp_richcompare */
2007         0,                                      /* tp_weaklistoffset */
2008         (getiterfunc)dictiter_getiter,          /* tp_iter */
2009         (iternextfunc)dictiter_iternext,        /* tp_iternext */
2010         dictiter_methods,                       /* tp_methods */
2011         0,                                      /* tp_members */
2012         0,                                      /* tp_getset */
2013         0,                                      /* tp_base */
2014         0,                                      /* tp_dict */
2015         0,                                      /* tp_descr_get */
2016         0,                                      /* tp_descr_set */
2017 };