Objects/dictobject.c


/* Dictionary object implementation using a hash table */

#include "Python.h"

typedef PyDictEntry dictentry;
typedef PyDictObject dictobject;

/* Define this out if you don't want conversion statistics on exit. */
#undef SHOW_CONVERSION_COUNTS

/* See large comment block below.  This must be >= 1. */
#define PERTURB_SHIFT 5

/*
Major subtleties ahead:  Most hash schemes depend on having a "good" hash
function, in the sense of simulating randomness.  Python doesn't:  its most
important hash functions (for strings and ints) are very regular in common
cases:

>>> map(hash, (0, 1, 2, 3))
[0, 1, 2, 3]
>>> map(hash, ("namea", "nameb", "namec", "named"))
[-1658398457, -1658398460, -1658398459, -1658398462]
>>>

This isn't necessarily bad!  To the contrary, in a table of size 2**i, taking
the low-order i bits as the initial table index is extremely fast, and there
are no collisions at all for dicts indexed by a contiguous range of ints.
The same is approximately true when keys are "consecutive" strings.  So this
gives better-than-random behavior in common cases, and that's very desirable.

OTOH, when collisions occur, the tendency to fill contiguous slices of the
hash table makes a good collision resolution strategy crucial.  Taking only
the last i bits of the hash code is also vulnerable:  for example, consider
[i << 16 for i in range(20000)] as a set of keys.  Since ints are their own
hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
hash code are all 0:  they *all* map to the same table index.

But catering to unusual cases should not slow the usual ones, so we just take
the last i bits anyway.  It's up to collision resolution to do the rest.  If
we *usually* find the key we're looking for on the first try (and, it turns
out, we usually do -- the table load factor is kept under 2/3, so the odds
are solidly in our favor), then it makes best sense to keep the initial index
computation dirt cheap.

The first half of collision resolution is to visit table indices via this
recurrence:

    j = ((5*j) + 1) mod 2**i

For any initial j in range(2**i), repeating that 2**i times generates each
int in range(2**i) exactly once (see any text on random-number generation for
proof).  By itself, this doesn't help much:  like linear probing (setting
j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
order.  This would be bad, except that's not the only thing we do, and it's
actually *good* in the common cases where hash keys are consecutive.  In an
example that's really too small to make this entirely clear, for a table of
size 2**3 the order of indices is:

    0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]

If two things come in at index 5, the first place we look after is index 2,
not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
Linear probing is deadly in this case because there the fixed probe order
is the *same* as the order consecutive keys are likely to arrive.  But it's
extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
and certain that consecutive hash codes do not.

The other half of the strategy is to get the other bits of the hash code
into play.  This is done by initializing an unsigned variable "perturb" to
the full hash code, and changing the recurrence to:

    j = (5*j) + 1 + perturb;
    perturb >>= PERTURB_SHIFT;
    use j % 2**i as the next table index;

Now the probe sequence depends (eventually) on every bit in the hash code,
and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
because it quickly magnifies small differences in the bits that didn't affect
the initial index.  Note that because perturb is unsigned, if the recurrence
is executed often enough perturb eventually becomes and remains 0.  At that
point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
that's certain to find an empty slot eventually (since it generates every int
in range(2**i), and we make sure there's always at least one empty slot).

Selecting a good value for PERTURB_SHIFT is a balancing act.  You want it
small so that the high bits of the hash code continue to affect the probe
sequence across iterations; but you want it large so that in really bad cases
the high-order hash bits have an effect on early iterations.  5 was "the
best" in minimizing total collisions across experiments Tim Peters ran (on
both normal and pathological cases), but 4 and 6 weren't significantly worse.

Historical:  Reimer Behrends contributed the idea of using a polynomial-based
approach, using repeated multiplication by x in GF(2**n) where an irreducible
polynomial for each table size was chosen such that x was a primitive root.
Christian Tismer later extended that to use division by x instead, as an
efficient way to get the high bits of the hash code into play.  This scheme
also gave excellent collision statistics, but was more expensive:  two
if-tests were required inside the loop; computing "the next" index took about
the same number of operations but without as much potential parallelism
(e.g., computing 5*j can go on at the same time as computing 1+perturb in the
above, and then shifting perturb can be done while the table index is being
masked); and the dictobject struct required a member to hold the table's
polynomial.  In Tim's experiments the current scheme ran faster, produced
equally good collision statistics, needed less code & used less memory.
*/

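/* Illustrative sketch, not part of the original implementation:  the helper
   below (a hypothetical name, dictdemo_probe_sequence) records the first n
   table indices visited for a given hash code, following the recurrence
   described in the large comment above.  It assumes a table of size mask+1,
   with mask+1 a power of 2, and uses unsigned arithmetic so that overflow
   simply wraps. */

#ifndef PERTURB_SHIFT
#define PERTURB_SHIFT 5         /* same value as defined near the top */
#endif

static void
dictdemo_probe_sequence(unsigned int hash, unsigned int mask,
                        unsigned int *out, int n)
{
        unsigned int i = hash & mask;   /* initial index: low-order bits */
        unsigned int perturb = hash;
        int k;

        for (k = 0; k < n; k++) {
                out[k] = i & mask;
                i = (i << 2) + i + perturb + 1; /* 5*i + 1 + perturb */
                perturb >>= PERTURB_SHIFT;
        }
}

/* With hash 0 and mask 7 this yields 0 1 6 7 4 5 2 3, the order shown in the
   comment above.  For the keys 1<<16 and 2<<16 in a table of size 2**15, the
   first two probes coincide (indices 0 and 1) and the third differs (2054
   vs. 4102), because by then perturb has brought the high bits into play. */
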
/* Object used as dummy key to fill deleted entries */
static PyObject *dummy; /* Initialized by first call to PyDict_New() */

/* forward declarations */
static dictentry *
lookdict_string(dictobject *mp, PyObject *key, long hash);

#ifdef SHOW_CONVERSION_COUNTS
static long created = 0L;
static long converted = 0L;

static void
show_counts(void)
{
        fprintf(stderr, "created %ld string dicts\n", created);
        fprintf(stderr, "converted %ld to normal dicts\n", converted);
        fprintf(stderr, "%.2f%% conversion rate\n", (100.0*converted)/created);
}
#endif

/* Initialization macros.
   There are two ways to create a dict:  PyDict_New() is the main C API
   function, and the tp_new slot maps to dict_new().  In the latter case we
   can save a little time over what PyDict_New does because it's guaranteed
   that the PyDictObject struct is already zeroed out.
   Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
   an excellent reason not to).
*/

#define INIT_NONZERO_DICT_SLOTS(mp) do {                                \
        (mp)->ma_table = (mp)->ma_smalltable;                           \
        (mp)->ma_mask = PyDict_MINSIZE - 1;                             \
    } while(0)

#define EMPTY_TO_MINSIZE(mp) do {                                       \
        memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable));    \
        (mp)->ma_used = (mp)->ma_fill = 0;                              \
        INIT_NONZERO_DICT_SLOTS(mp);                                    \
    } while(0)

PyObject *
PyDict_New(void)
{
        register dictobject *mp;
        if (dummy == NULL) { /* Auto-initialize dummy */
                dummy = PyString_FromString("<dummy key>");
                if (dummy == NULL)
                        return NULL;
#ifdef SHOW_CONVERSION_COUNTS
                Py_AtExit(show_counts);
#endif
        }
        mp = PyObject_NEW(dictobject, &PyDict_Type);
        if (mp == NULL)
                return NULL;
        EMPTY_TO_MINSIZE(mp);
        mp->ma_lookup = lookdict_string;
#ifdef SHOW_CONVERSION_COUNTS
        ++created;
#endif
        PyObject_GC_Init(mp);
        return (PyObject *)mp;
}

/*
The basic lookup function used by all operations.
This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
Open addressing is preferred over chaining since the link overhead for
chaining would be substantial (100% with typical malloc overhead).

The initial probe index is computed as hash mod the table size. Subsequent
probe indices are computed as explained earlier.

All arithmetic on hash should ignore overflow.

(The details in this version are due to Tim Peters, building on many past
contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
Christian Tismer).

This function must never return NULL; failures are indicated by returning
a dictentry* for which the me_value field is NULL.  Exceptions are never
reported by this function, and outstanding exceptions are maintained.
*/

static dictentry *
lookdict(dictobject *mp, PyObject *key, register long hash)
{
        register int i;
        register unsigned int perturb;
        register dictentry *freeslot;
        register unsigned int mask = mp->ma_mask;
        dictentry *ep0 = mp->ma_table;
        register dictentry *ep;
        register int restore_error;
        register int checked_error;
        register int cmp;
        PyObject *err_type, *err_value, *err_tb;
        PyObject *startkey;

        i = hash & mask;
        ep = &ep0[i];
        if (ep->me_key == NULL || ep->me_key == key)
                return ep;

        restore_error = checked_error = 0;
        if (ep->me_key == dummy)
                freeslot = ep;
        else {
                if (ep->me_hash == hash) {
                        /* error can't have been checked yet */
                        checked_error = 1;
                        if (PyErr_Occurred()) {
                                restore_error = 1;
                                PyErr_Fetch(&err_type, &err_value, &err_tb);
                        }
                        startkey = ep->me_key;
                        cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
                        if (cmp < 0)
                                PyErr_Clear();
                        if (ep0 == mp->ma_table && ep->me_key == startkey) {
                                if (cmp > 0)
                                        goto Done;
                        }
                        else {
                                /* The compare did major nasty stuff to the
                                 * dict:  start over.
                                 * XXX A clever adversary could prevent this
                                 * XXX from terminating.
                                 */
                                ep = lookdict(mp, key, hash);
                                goto Done;
                        }
                }
                freeslot = NULL;
        }

        /* In the loop, me_key == dummy is by far (factor of 100s) the
           least likely outcome, so test for that last. */
        for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
                i = (i << 2) + i + perturb + 1;
                ep = &ep0[i & mask];
                if (ep->me_key == NULL) {
                        if (freeslot != NULL)
                                ep = freeslot;
                        break;
                }
                if (ep->me_key == key)
                        break;
                if (ep->me_hash == hash && ep->me_key != dummy) {
                        if (!checked_error) {
                                checked_error = 1;
                                if (PyErr_Occurred()) {
                                        restore_error = 1;
                                        PyErr_Fetch(&err_type, &err_value,
                                                    &err_tb);
                                }
                        }
                        startkey = ep->me_key;
                        cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
                        if (cmp < 0)
                                PyErr_Clear();
                        if (ep0 == mp->ma_table && ep->me_key == startkey) {
                                if (cmp > 0)
                                        break;
                        }
                        else {
                                /* The compare did major nasty stuff to the
                                 * dict:  start over.
                                 * XXX A clever adversary could prevent this
                                 * XXX from terminating.
                                 */
                                ep = lookdict(mp, key, hash);
                                break;
                        }
                }
                else if (ep->me_key == dummy && freeslot == NULL)
                        freeslot = ep;
        }

Done:
        if (restore_error)
                PyErr_Restore(err_type, err_value, err_tb);
        return ep;
}

/*
 * Hacked up version of lookdict which can assume keys are always strings;
 * this assumption allows testing for errors during
 * PyObject_RichCompareBool() to be dropped; string-string comparisons never
 * raise exceptions.  This also means we don't need to go through
 * PyObject_RichCompareBool(); we can always use _PyString_Eq directly.
 *
 * This really only becomes meaningful if proper error handling in lookdict()
 * is too expensive.
 */
static dictentry *
lookdict_string(dictobject *mp, PyObject *key, register long hash)
{
        register int i;
        register unsigned int perturb;
        register dictentry *freeslot;
        register unsigned int mask = mp->ma_mask;
        dictentry *ep0 = mp->ma_table;
        register dictentry *ep;

        /* make sure this function doesn't have to handle non-string keys */
        if (!PyString_Check(key)) {
#ifdef SHOW_CONVERSION_COUNTS
                ++converted;
#endif
                mp->ma_lookup = lookdict;
                return lookdict(mp, key, hash);
        }
        i = hash & mask;
        ep = &ep0[i];
        if (ep->me_key == NULL || ep->me_key == key)
                return ep;
        if (ep->me_key == dummy)
                freeslot = ep;
        else {
                if (ep->me_hash == hash
                    && _PyString_Eq(ep->me_key, key)) {
                        return ep;
                }
                freeslot = NULL;
        }

        /* In the loop, me_key == dummy is by far (factor of 100s) the
           least likely outcome, so test for that last. */
        for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
                i = (i << 2) + i + perturb + 1;
                ep = &ep0[i & mask];
                if (ep->me_key == NULL)
                        return freeslot == NULL ? ep : freeslot;
                if (ep->me_key == key
                    || (ep->me_hash == hash
                        && ep->me_key != dummy
                        && _PyString_Eq(ep->me_key, key)))
                        return ep;
                if (ep->me_key == dummy && freeslot == NULL)
                        freeslot = ep;
        }
}

/*
Internal routine to insert a new item into the table.
Used both by the internal resize routine and by the public insert routine.
Eats a reference to key and one to value.
*/
static void
insertdict(register dictobject *mp, PyObject *key, long hash, PyObject *value)
{
        PyObject *old_value;
        register dictentry *ep;
        typedef PyDictEntry *(*lookupfunc)(PyDictObject *, PyObject *, long);

        assert(mp->ma_lookup != NULL);
        ep = mp->ma_lookup(mp, key, hash);
        if (ep->me_value != NULL) {
                old_value = ep->me_value;
                ep->me_value = value;
                Py_DECREF(old_value); /* which **CAN** re-enter */
                Py_DECREF(key);
        }
        else {
                if (ep->me_key == NULL)
                        mp->ma_fill++;
                else
                        Py_DECREF(ep->me_key);
                ep->me_key = key;
                ep->me_hash = hash;
                ep->me_value = value;
                mp->ma_used++;
        }
}

/*
Restructure the table by allocating a new table and reinserting all
items again.  When entries have been deleted, the new table may
actually be smaller than the old one.
*/
static int
dictresize(dictobject *mp, int minused)
{
        int newsize;
        dictentry *oldtable, *newtable, *ep;
        int i;
        int is_oldtable_malloced;
        dictentry small_copy[PyDict_MINSIZE];

        assert(minused >= 0);

        /* Find the smallest table size > minused. */
        for (newsize = PyDict_MINSIZE;
             newsize <= minused && newsize > 0;
             newsize <<= 1)
                ;
        if (newsize <= 0) {
                PyErr_NoMemory();
                return -1;
        }

        /* Get space for a new table. */
        oldtable = mp->ma_table;
        assert(oldtable != NULL);
        is_oldtable_malloced = oldtable != mp->ma_smalltable;

        if (newsize == PyDict_MINSIZE) {
                /* A large table is shrinking, or we can't get any smaller. */
                newtable = mp->ma_smalltable;
                if (newtable == oldtable) {
                        if (mp->ma_fill == mp->ma_used) {
                                /* No dummies, so no point doing anything. */
                                return 0;
                        }
                        /* We're not going to resize it, but rebuild the
                           table anyway to purge old dummy entries.
                           Subtle:  This is *necessary* if fill==size,
                           as lookdict needs at least one virgin slot to
                           terminate failing searches.  If fill < size, it's
                           merely desirable, as dummies slow searches. */
                        assert(mp->ma_fill > mp->ma_used);
                        memcpy(small_copy, oldtable, sizeof(small_copy));
                        oldtable = small_copy;
                }
        }
        else {
                newtable = PyMem_NEW(dictentry, newsize);
                if (newtable == NULL) {
                        PyErr_NoMemory();
                        return -1;
                }
        }

        /* Make the dict empty, using the new table. */
        assert(newtable != oldtable);
        mp->ma_table = newtable;
        mp->ma_mask = newsize - 1;
        memset(newtable, 0, sizeof(dictentry) * newsize);
        mp->ma_used = 0;
        i = mp->ma_fill;
        mp->ma_fill = 0;

        /* Copy the data over; this is refcount-neutral for active entries;
           dummy entries aren't copied over, of course */
        for (ep = oldtable; i > 0; ep++) {
                if (ep->me_value != NULL) {     /* active entry */
                        --i;
                        insertdict(mp, ep->me_key, ep->me_hash, ep->me_value);
                }
                else if (ep->me_key != NULL) {  /* dummy entry */
                        --i;
                        assert(ep->me_key == dummy);
                        Py_DECREF(ep->me_key);
                }
                /* else key == value == NULL:  nothing to do */
        }

        if (is_oldtable_malloced)
                PyMem_DEL(oldtable);
        return 0;
}

PyObject *
PyDict_GetItem(PyObject *op, PyObject *key)
{
        long hash;
        dictobject *mp = (dictobject *)op;
        if (!PyDict_Check(op)) {
                return NULL;
        }
#ifdef CACHE_HASH
        if (!PyString_Check(key) ||
            (hash = ((PyStringObject *) key)->ob_shash) == -1)
#endif
        {
                hash = PyObject_Hash(key);
                if (hash == -1) {
                        PyErr_Clear();
                        return NULL;
                }
        }
        return (mp->ma_lookup)(mp, key, hash)->me_value;
}

/* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
 * dictionary if it is merely replacing the value for an existing key.
 * This means that it's safe to loop over a dictionary with
 * PyDict_Next() and occasionally replace a value -- but you can't
 * insert new keys or remove them.
 */
int
PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
{
        register dictobject *mp;
        register long hash;
        register int n_used;

        if (!PyDict_Check(op)) {
                PyErr_BadInternalCall();
                return -1;
        }
        mp = (dictobject *)op;
#ifdef CACHE_HASH
        if (PyString_Check(key)) {
#ifdef INTERN_STRINGS
                if (((PyStringObject *)key)->ob_sinterned != NULL) {
                        key = ((PyStringObject *)key)->ob_sinterned;
                        hash = ((PyStringObject *)key)->ob_shash;
                }
                else
#endif
                {
                        hash = ((PyStringObject *)key)->ob_shash;
                        if (hash == -1)
                                hash = PyObject_Hash(key);
                }
        }
        else
#endif
        {
                hash = PyObject_Hash(key);
                if (hash == -1)
                        return -1;
        }
        assert(mp->ma_fill <= mp->ma_mask);  /* at least one empty slot */
        n_used = mp->ma_used;
        Py_INCREF(value);
        Py_INCREF(key);
        insertdict(mp, key, hash, value);
        /* If we added a key, we can safely resize.  Otherwise skip this!
         * If fill >= 2/3 size, adjust size.  Normally, this doubles the
         * size, but it's also possible for the dict to shrink (if ma_fill is
         * much larger than ma_used, meaning a lot of dict keys have been
         * deleted).
         */
        if (mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2) {
                if (dictresize(mp, mp->ma_used*2) != 0)
                        return -1;
        }
        return 0;
}

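/* Illustrative sketch, not part of the original implementation:  the two
   hypothetical helpers below restate the sizing arithmetic used by
   PyDict_SetItem() and dictresize() above.  PyDict_MINSIZE is assumed to
   be 8 when this is compiled without dictobject.h. */

#include <limits.h>

#ifndef PyDict_MINSIZE
#define PyDict_MINSIZE 8        /* assumption; normally from dictobject.h */
#endif

/* Nonzero when fill has reached 2/3 of the table size (mask + 1). */
static int
dictdemo_needs_resize(int fill, int mask)
{
        return fill * 3 >= (mask + 1) * 2;
}

/* Smallest power-of-2 table size > minused, or -1 if it would overflow. */
static int
dictdemo_new_size(int minused)
{
        int newsize = PyDict_MINSIZE;

        while (newsize <= minused) {
                if (newsize > INT_MAX / 2)
                        return -1;
                newsize <<= 1;
        }
        return newsize;
}

/* For example, a table of size 8 with no dummies is left alone at 5 entries
   (5*3 < 8*2) but is resized once the 6th key is added (6*3 >= 8*2); the
   request is then for more than 6*2 == 12 slots, which gives a size of 16. */
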
int
PyDict_DelItem(PyObject *op, PyObject *key)
{
        register dictobject *mp;
        register long hash;
        register dictentry *ep;
        PyObject *old_value, *old_key;

        if (!PyDict_Check(op)) {
                PyErr_BadInternalCall();
                return -1;
        }
#ifdef CACHE_HASH
        if (!PyString_Check(key) ||
            (hash = ((PyStringObject *) key)->ob_shash) == -1)
#endif
        {
                hash = PyObject_Hash(key);
                if (hash == -1)
                        return -1;
        }
        mp = (dictobject *)op;
        ep = (mp->ma_lookup)(mp, key, hash);
        if (ep->me_value == NULL) {
                PyErr_SetObject(PyExc_KeyError, key);
                return -1;
        }
        old_key = ep->me_key;
        Py_INCREF(dummy);
        ep->me_key = dummy;
        old_value = ep->me_value;
        ep->me_value = NULL;
        mp->ma_used--;
        Py_DECREF(old_value);
        Py_DECREF(old_key);
        return 0;
}

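/* Illustrative sketch, not part of the original implementation:  a tiny
   fixed-size table of unsigned int keys showing why a deleted entry is
   marked with a dummy rather than reset to empty.  All dictdemo_ and DEMO_
   names are hypothetical.  The keys act as their own hash codes, and the
   caller is assumed to keep at least one slot empty so that failing
   searches terminate. */

#include <stddef.h>

enum { DEMO_SIZE = 8, DEMO_EMPTY = 0, DEMO_DUMMY = 1, DEMO_ACTIVE = 2 };

typedef struct {
        int state;              /* DEMO_EMPTY, DEMO_DUMMY or DEMO_ACTIVE */
        unsigned int key;
} dictdemo_slot;

typedef struct {
        dictdemo_slot slots[DEMO_SIZE];
        int fill;               /* active + dummy slots, like ma_fill */
        int used;               /* active slots only, like ma_used */
} dictdemo_table;

/* Return the slot holding key, else the slot where key should be stored
   (the first dummy seen on the probe path, else the empty slot reached). */
static dictdemo_slot *
dictdemo_lookup(dictdemo_table *t, unsigned int key)
{
        unsigned int mask = DEMO_SIZE - 1;
        unsigned int i = key & mask;
        unsigned int perturb = key;
        dictdemo_slot *freeslot = NULL;
        dictdemo_slot *ep = &t->slots[i];

        for (;;) {
                if (ep->state == DEMO_EMPTY)
                        return freeslot != NULL ? freeslot : ep;
                if (ep->state == DEMO_ACTIVE && ep->key == key)
                        return ep;
                if (ep->state == DEMO_DUMMY && freeslot == NULL)
                        freeslot = ep;
                i = (i << 2) + i + perturb + 1;
                perturb >>= 5;
                ep = &t->slots[i & mask];
        }
}

static void
dictdemo_insert(dictdemo_table *t, unsigned int key)
{
        dictdemo_slot *ep = dictdemo_lookup(t, key);

        if (ep->state == DEMO_ACTIVE)
                return;                 /* already present */
        if (ep->state == DEMO_EMPTY)
                t->fill++;              /* a virgin slot is consumed */
        ep->state = DEMO_ACTIVE;
        ep->key = key;
        t->used++;
}

static int
dictdemo_delete(dictdemo_table *t, unsigned int key)
{
        dictdemo_slot *ep = dictdemo_lookup(t, key);

        if (ep->state != DEMO_ACTIVE)
                return -1;              /* no such key */
        ep->state = DEMO_DUMMY;         /* keep later probe chains intact */
        t->used--;
        return 0;
}

static int
dictdemo_contains(dictdemo_table *t, unsigned int key)
{
        return dictdemo_lookup(t, key)->state == DEMO_ACTIVE;
}

/* Keys 0 and 8 both start at index 0, so 8 lands one probe later.  After
   deleting 0, slot 0 is a dummy and the search for 8 still walks past it;
   had the slot been reset to empty, that search would stop there and miss
   the key.  A later insert of 16 reuses the dummy slot, so fill does not
   grow. */
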
void
PyDict_Clear(PyObject *op)
{
        dictobject *mp;
        dictentry *ep, *table;
        int table_is_malloced;
        int fill;
        dictentry small_copy[PyDict_MINSIZE];
#ifdef Py_DEBUG
        int i, n;
#endif

        if (!PyDict_Check(op))
                return;
        mp = (dictobject *)op;
#ifdef Py_DEBUG
        n = mp->ma_mask + 1;
        i = 0;
#endif

        table = mp->ma_table;
        assert(table != NULL);
        table_is_malloced = table != mp->ma_smalltable;

        /* This is delicate.  During the process of clearing the dict,
         * decrefs can cause the dict to mutate.  To avoid fatal confusion
         * (voice of experience), we have to make the dict empty before
         * clearing the slots, and never refer to anything via mp->xxx while
         * clearing.
         */
        fill = mp->ma_fill;
        if (table_is_malloced)
                EMPTY_TO_MINSIZE(mp);

        else if (fill > 0) {
                /* It's a small table with something that needs to be cleared.
                 * Afraid the only safe way is to copy the dict entries into
                 * another small table first.
                 */
                memcpy(small_copy, table, sizeof(small_copy));
                table = small_copy;
                EMPTY_TO_MINSIZE(mp);
        }
        /* else it's a small table that's already empty */

        /* Now we can finally clear things.  If C had refcounts, we could
         * assert that the refcount on table is 1 now, i.e. that this function
         * has unique access to it, so decref side-effects can't alter it.
         */
        for (ep = table; fill > 0; ++ep) {
#ifdef Py_DEBUG
                assert(i < n);
                ++i;
#endif
                if (ep->me_key) {
                        --fill;
                        Py_DECREF(ep->me_key);
                        Py_XDECREF(ep->me_value);
                }
#ifdef Py_DEBUG
                else
                        assert(ep->me_value == NULL);
#endif
        }

        if (table_is_malloced)
                PyMem_DEL(table);
}

/* CAUTION:  In general, it isn't safe to use PyDict_Next in a loop that
 * mutates the dict.  One exception:  it is safe if the loop merely changes
 * the values associated with the keys (but doesn't insert new keys or
 * delete keys), via PyDict_SetItem().
 */
int
PyDict_Next(PyObject *op, int *ppos, PyObject **pkey, PyObject **pvalue)
{
        int i;
        register dictobject *mp;
        if (!PyDict_Check(op))
                return 0;
        mp = (dictobject *)op;
        i = *ppos;
        if (i < 0)
                return 0;
        while (i <= mp->ma_mask && mp->ma_table[i].me_value == NULL)
                i++;
        *ppos = i+1;
        if (i > mp->ma_mask)
                return 0;
        if (pkey)
                *pkey = mp->ma_table[i].me_key;
        if (pvalue)
                *pvalue = mp->ma_table[i].me_value;
        return 1;
}

/* Methods */

static void
dict_dealloc(register dictobject *mp)
{
        register dictentry *ep;
        int fill = mp->ma_fill;
        Py_TRASHCAN_SAFE_BEGIN(mp)
        PyObject_GC_Fini(mp);
        for (ep = mp->ma_table; fill > 0; ep++) {
                if (ep->me_key) {
                        --fill;
                        Py_DECREF(ep->me_key);
                        Py_XDECREF(ep->me_value);
                }
        }
        if (mp->ma_table != mp->ma_smalltable)
                PyMem_DEL(mp->ma_table);
        mp = (dictobject *) PyObject_AS_GC(mp);
        PyObject_DEL(mp);
        Py_TRASHCAN_SAFE_END(mp)
}

static int
dict_print(register dictobject *mp, register FILE *fp, register int flags)
{
        register int i;
        register int any;

        i = Py_ReprEnter((PyObject*)mp);
        if (i != 0) {
                if (i < 0)
                        return i;
                fprintf(fp, "{...}");
                return 0;
        }

        fprintf(fp, "{");
        any = 0;
        for (i = 0; i <= mp->ma_mask; i++) {
                dictentry *ep = mp->ma_table + i;
                PyObject *pvalue = ep->me_value;
                if (pvalue != NULL) {
                        /* Prevent PyObject_Repr from deleting value during
                           key format */
                        Py_INCREF(pvalue);
                        if (any++ > 0)
                                fprintf(fp, ", ");
                        if (PyObject_Print((PyObject *)ep->me_key, fp, 0)!=0) {
                                Py_DECREF(pvalue);
                                Py_ReprLeave((PyObject*)mp);
                                return -1;
                        }
                        fprintf(fp, ": ");
                        if (PyObject_Print(pvalue, fp, 0) != 0) {
                                Py_DECREF(pvalue);
                                Py_ReprLeave((PyObject*)mp);
                                return -1;
                        }
                        Py_DECREF(pvalue);
                }
        }
        fprintf(fp, "}");
        Py_ReprLeave((PyObject*)mp);
        return 0;
}

static PyObject *
dict_repr(dictobject *mp)
{
        int i;
        PyObject *s, *temp, *colon = NULL;
        PyObject *pieces = NULL, *result = NULL;
        PyObject *key, *value;

        i = Py_ReprEnter((PyObject *)mp);
        if (i != 0) {
                return i > 0 ? PyString_FromString("{...}") : NULL;
        }

        if (mp->ma_used == 0) {
                result = PyString_FromString("{}");
                goto Done;
        }

        pieces = PyList_New(0);
        if (pieces == NULL)
                goto Done;

        colon = PyString_FromString(": ");
        if (colon == NULL)
                goto Done;

        /* Do repr() on each key+value pair, and insert ": " between them.
           Note that repr may mutate the dict. */
        i = 0;
        while (PyDict_Next((PyObject *)mp, &i, &key, &value)) {
                int status;
                /* Prevent repr from deleting value during key format. */
                Py_INCREF(value);
                s = PyObject_Repr(key);
                PyString_Concat(&s, colon);
                PyString_ConcatAndDel(&s, PyObject_Repr(value));
                Py_DECREF(value);
                if (s == NULL)
                        goto Done;
                status = PyList_Append(pieces, s);
                Py_DECREF(s);  /* append created a new ref */
                if (status < 0)
                        goto Done;
        }

        /* Add "{}" decorations to the first and last items. */
        assert(PyList_GET_SIZE(pieces) > 0);
        s = PyString_FromString("{");
        if (s == NULL)
                goto Done;
        temp = PyList_GET_ITEM(pieces, 0);
        PyString_ConcatAndDel(&s, temp);
        PyList_SET_ITEM(pieces, 0, s);
        if (s == NULL)
                goto Done;

        s = PyString_FromString("}");
        if (s == NULL)
                goto Done;
        temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1);
        PyString_ConcatAndDel(&temp, s);
        PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp);
        if (temp == NULL)
                goto Done;

        /* Paste them all together with ", " between. */
        s = PyString_FromString(", ");
        if (s == NULL)
                goto Done;
        result = _PyString_Join(s, pieces);
 824         Py_DECREF(s);
 825
 826 Done:
 827         Py_XDECREF(pieces);
 828         Py_XDECREF(colon);
 829         Py_ReprLeave((PyObject *)mp);
 830         return result;
 831 }
 832
 833 static int
 834 dict_length(dictobject *mp)
 835 {
 836         return mp->ma_used;
 837 }
 838
 839 static PyObject *
 840 dict_subscript(dictobject *mp, register PyObject *key)
 841 {
 842         PyObject *v;
 843         long hash;
 844         assert(mp->ma_table != NULL);
 845 #ifdef CACHE_HASH
 846         if (!PyString_Check(key) ||
 847             (hash = ((PyStringObject *) key)->ob_shash) == -1)
 848 #endif
 849         {
 850                 hash = PyObject_Hash(key);
 851                 if (hash == -1)
 852                         return NULL;
 853         }
 854         v = (mp->ma_lookup)(mp, key, hash) -> me_value;
 855         if (v == NULL)
 856                 PyErr_SetObject(PyExc_KeyError, key);
 857         else
 858                 Py_INCREF(v);
 859         return v;
 860 }
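
/* Illustration only, not part of dictobject.c: a minimal sketch of the
 * "use the cached hash unless it is -1" test that dict_subscript() and
 * several later functions apply under CACHE_HASH.  toy_key and the toy_*
 * functions are hypothetical, and the hash formula is a stand-in rather
 * than the interpreter's string hash.  Storing the result inside
 * toy_get_hash() is also an assumption made to keep the sketch short.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical string-like key; cached_hash == -1 means "not computed". */
typedef struct {
    const char *text;
    long cached_hash;
} toy_key;

static int toy_hash_calls = 0;      /* counts slow-path computations */

/* Stand-in hash.  The result is masked to 31 bits, so it is never -1. */
static long toy_compute_hash(const char *text)
{
    unsigned long h = 5381;
    assert(text != NULL);
    toy_hash_calls++;
    while (*text)
        h = h * 33 + (unsigned char)*text++;
    return (long)(h & 0x7fffffffUL);
}

/* Fast path: reuse the cached value.  Slow path: compute and remember. */
static long toy_get_hash(toy_key *key)
{
    if (key->cached_hash == -1)
        key->cached_hash = toy_compute_hash(key->text);
    return key->cached_hash;
}
```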
 861
 862 static int
 863 dict_ass_sub(dictobject *mp, PyObject *v, PyObject *w)
 864 {
 865         if (w == NULL)
 866                 return PyDict_DelItem((PyObject *)mp, v);
 867         else
 868                 return PyDict_SetItem((PyObject *)mp, v, w);
 869 }
 870
 871 static PyMappingMethods dict_as_mapping = {
 872         (inquiry)dict_length, /*mp_length*/
 873         (binaryfunc)dict_subscript, /*mp_subscript*/
 874         (objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
 875 };
 876
 877 static PyObject *
 878 dict_keys(register dictobject *mp, PyObject *args)
 879 {
 880         register PyObject *v;
 881         register int i, j, n;
 882
 883         if (!PyArg_NoArgs(args))
 884                 return NULL;
 885   again:
 886         n = mp->ma_used;
 887         v = PyList_New(n);
 888         if (v == NULL)
 889                 return NULL;
 890         if (n != mp->ma_used) {
 891                 /* Durnit.  The allocations caused the dict to resize.
 892                  * Just start over, this shouldn't normally happen.
 893                  */
 894                 Py_DECREF(v);
 895                 goto again;
 896         }
 897         for (i = 0, j = 0; i <= mp->ma_mask; i++) {
 898                 if (mp->ma_table[i].me_value != NULL) {
 899                         PyObject *key = mp->ma_table[i].me_key;
 900                         Py_INCREF(key);
 901                         PyList_SET_ITEM(v, j, key);
 902                         j++;
 903                 }
 904         }
 905         return v;
 906 }
 907
 908 static PyObject *
 909 dict_values(register dictobject *mp, PyObject *args)
 910 {
 911         register PyObject *v;
 912         register int i, j, n;
 913
 914         if (!PyArg_NoArgs(args))
 915                 return NULL;
 916   again:
 917         n = mp->ma_used;
 918         v = PyList_New(n);
 919         if (v == NULL)
 920                 return NULL;
 921         if (n != mp->ma_used) {
 922                 /* Durnit.  The allocations caused the dict to resize.
 923                  * Just start over, this shouldn't normally happen.
 924                  */
 925                 Py_DECREF(v);
 926                 goto again;
 927         }
 928         for (i = 0, j = 0; i <= mp->ma_mask; i++) {
 929                 if (mp->ma_table[i].me_value != NULL) {
 930                         PyObject *value = mp->ma_table[i].me_value;
 931                         Py_INCREF(value);
 932                         PyList_SET_ITEM(v, j, value);
 933                         j++;
 934                 }
 935         }
 936         return v;
 937 }
 938
 939 static PyObject *
 940 dict_items(register dictobject *mp, PyObject *args)
 941 {
 942         register PyObject *v;
 943         register int i, j, n;
 944         PyObject *item, *key, *value;
 945
 946         if (!PyArg_NoArgs(args))
 947                 return NULL;
 948         /* Preallocate the list of tuples, to avoid allocations during
 949          * the loop over the items, which could trigger GC, which
 950          * could resize the dict. :-(
 951          */
 952   again:
 953         n = mp->ma_used;
 954         v = PyList_New(n);
 955         if (v == NULL)
 956                 return NULL;
 957         for (i = 0; i < n; i++) {
 958                 item = PyTuple_New(2);
 959                 if (item == NULL) {
 960                         Py_DECREF(v);
 961                         return NULL;
 962                 }
 963                 PyList_SET_ITEM(v, i, item);
 964         }
 965         if (n != mp->ma_used) {
 966                 /* Durnit.  The allocations caused the dict to resize.
 967                  * Just start over, this shouldn't normally happen.
 968                  */
 969                 Py_DECREF(v);
 970                 goto again;
 971         }
 972         /* Nothing we do below makes any function calls. */
 973         for (i = 0, j = 0; i <= mp->ma_mask; i++) {
 974                 if (mp->ma_table[i].me_value != NULL) {
 975                         key = mp->ma_table[i].me_key;
 976                         value = mp->ma_table[i].me_value;
 977                         item = PyList_GET_ITEM(v, j);
 978                         Py_INCREF(key);
 979                         PyTuple_SET_ITEM(item, 0, key);
 980                         Py_INCREF(value);
 981                         PyTuple_SET_ITEM(item, 1, value);
 982                         j++;
 983                 }
 984         }
 985         assert(j == n);
 986         return v;
 987 }
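
/* Illustration only, not part of dictobject.c: a minimal sketch of the
 * "snapshot the size, allocate, re-check, start over" loop used by
 * dict_keys(), dict_values() and dict_items().  retry_table, the
 * allocation hook and every retry_* name are hypothetical; the hook
 * stands in for an allocation that triggers a collection which mutates
 * the dict.
 */

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container whose size can change during an allocation. */
typedef struct {
    int used;
} retry_table;

/* Hypothetical hook: returns room for n ints and may mutate the table. */
typedef int *(*retry_alloc_fn)(retry_table *t, int n);

/* Allocate a buffer sized for the table, retrying when the size moved
 * while allocating.  Returns NULL on allocation failure; *out_n gets the
 * element count the returned buffer was sized for. */
static int *retry_alloc_snapshot(retry_table *t, retry_alloc_fn alloc,
                                 int *out_n)
{
    for (;;) {
        int n = t->used;
        int *buf = alloc(t, n);
        if (buf == NULL)
            return NULL;
        if (n == t->used) {         /* size is stable: safe to fill */
            *out_n = n;
            return buf;
        }
        free(buf);                  /* stale size: discard and retry */
    }
}

static int retry_disturbances = 1;  /* how often the hook still mutates */

/* Example hook: shrinks the table once, then leaves it alone. */
static int *retry_flaky_alloc(retry_table *t, int n)
{
    assert(n >= 0);
    if (retry_disturbances > 0) {
        retry_disturbances--;
        t->used--;
    }
    return malloc(sizeof(int) * (size_t)(n > 0 ? n : 1));
}
```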
 988
 989 static PyObject *
 990 dict_update(PyObject *mp, PyObject *args)
 991 {
 992         PyObject *other;
 993
 994         if (!PyArg_ParseTuple(args, "O:update", &other))
 995                 return NULL;
 996         if (PyDict_Update(mp, other) < 0)
 997                 return NULL;
 998         Py_INCREF(Py_None);
 999         return Py_None;
1000 }
1001
1002 /* Update unconditionally replaces existing items.
1003    Merge has a 3rd argument 'override'; if set, it acts like Update,
1004    otherwise it leaves existing items unchanged. */
1005
1006 int
1007 PyDict_Update(PyObject *a, PyObject *b)
1008 {
1009         return PyDict_Merge(a, b, 1);
1010 }
1011
1012 int
1013 PyDict_Merge(PyObject *a, PyObject *b, int override)
1014 {
1015         register PyDictObject *mp, *other;
1016         register int i;
1017         dictentry *entry;
1018
1019         /* We accept for the argument either a concrete dictionary object,
1020          * or an abstract "mapping" object.  For the former, we can do
1021          * things quite efficiently.  For the latter, we only require that
1022          * PyMapping_Keys() and PyObject_GetItem() be supported.
1023          */
1024         if (a == NULL || !PyDict_Check(a) || b == NULL) {
1025                 PyErr_BadInternalCall();
1026                 return -1;
1027         }
1028         mp = (dictobject*)a;
1029         if (PyDict_Check(b)) {
1030                 other = (dictobject*)b;
1031                 if (other == mp || other->ma_used == 0)
1032                         /* a.update(a) or a.update({}); nothing to do */
1033                         return 0;
1034                 /* Do one big resize at the start, rather than
1035                  * incrementally resizing as we insert new items.  Expect
1036                  * that there will be no (or few) overlapping keys.
1037                  */
1038                 if ((mp->ma_fill + other->ma_used)*3 >= (mp->ma_mask+1)*2) {
1039                    if (dictresize(mp, (mp->ma_used + other->ma_used)*3/2) != 0)
1040                            return -1;
1041                 }
1042                 for (i = 0; i <= other->ma_mask; i++) {
1043                         entry = &other->ma_table[i];
1044                         if (entry->me_value != NULL &&
1045                             (override ||
1046                              PyDict_GetItem(a, entry->me_key) == NULL)) {
1047                                 Py_INCREF(entry->me_key);
1048                                 Py_INCREF(entry->me_value);
1049                                 insertdict(mp, entry->me_key, entry->me_hash,
1050                                            entry->me_value);
1051                         }
1052                 }
1053         }
1054         else {
1055                 /* Do it the generic, slower way */
1056                 PyObject *keys = PyMapping_Keys(b);
1057                 PyObject *iter;
1058                 PyObject *key, *value;
1059                 int status;
1060
1061                 if (keys == NULL)
1062                         /* Docstring says this is equivalent to E.keys() so
1063                          * if E doesn't have a .keys() method we want
1064                          * AttributeError to percolate up.  Might as well
1065                          * do the same for any other error.
1066                          */
1067                         return -1;
1068
1069                 iter = PyObject_GetIter(keys);
1070                 Py_DECREF(keys);
1071                 if (iter == NULL)
1072                         return -1;
1073
1074                 for (key = PyIter_Next(iter); key; key = PyIter_Next(iter)) {
1075                         if (!override && PyDict_GetItem(a, key) != NULL) {
1076                                 Py_DECREF(key);
1077                                 continue;
1078                         }
1079                         value = PyObject_GetItem(b, key);
1080                         if (value == NULL) {
1081                                 Py_DECREF(iter);
1082                                 Py_DECREF(key);
1083                                 return -1;
1084                         }
1085                         status = PyDict_SetItem(a, key, value);
1086                         Py_DECREF(key);
1087                         Py_DECREF(value);
1088                         if (status < 0) {
1089                                 Py_DECREF(iter);
1090                                 return -1;
1091                         }
1092                 }
1093                 Py_DECREF(iter);
1094                 if (PyErr_Occurred())
1095                         /* Iterator completed, via error */
1096                         return -1;
1097         }
1098         return 0;
1099 }
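
/* Illustration only, not part of dictobject.c: the arithmetic behind the
 * "one big resize at the start" test in PyDict_Merge(), pulled out into
 * two hypothetical helpers so the threshold can be checked by hand.
 * The parameters mirror ma_fill, ma_mask and ma_used; nothing here
 * touches a real dict.
 */

```c
#include <assert.h>

/* Nonzero when adding `incoming` entries to a table with `fill` occupied
 * slots out of `mask + 1` would reach the 2/3 load limit. */
static int merge_needs_presize(int fill, int mask, int incoming)
{
    assert(fill >= 0 && mask >= 0 && incoming >= 0);
    return (fill + incoming) * 3 >= (mask + 1) * 2;
}

/* Minimum size passed to the resize routine: 1.5 times the combined
 * number of live entries, using integer division as the source does. */
static int merge_presize_request(int used, int incoming)
{
    return (used + incoming) * 3 / 2;
}
```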
1100
1101 static PyObject *
1102 dict_copy(register dictobject *mp, PyObject *args)
1103 {
1104         if (!PyArg_Parse(args, ""))
1105                 return NULL;
1106         return PyDict_Copy((PyObject*)mp);
1107 }
1108
1109 PyObject *
1110 PyDict_Copy(PyObject *o)
1111 {
1112         register dictobject *mp;
1113         register int i;
1114         dictobject *copy;
1115         dictentry *entry;
1116
1117         if (o == NULL || !PyDict_Check(o)) {
1118                 PyErr_BadInternalCall();
1119                 return NULL;
1120         }
1121         mp = (dictobject *)o;
1122         copy = (dictobject *)PyDict_New();
1123         if (copy == NULL)
1124                 return NULL;
1125         if (mp->ma_used > 0) {
1126                 if (dictresize(copy, mp->ma_used*3/2) != 0) {
1127                         /* don't leak the new dict on failure */
                             Py_DECREF(copy);
                             return NULL;
                     }
1128                 for (i = 0; i <= mp->ma_mask; i++) {
1129                         entry = &mp->ma_table[i];
1130                         if (entry->me_value != NULL) {
1131                                 Py_INCREF(entry->me_key);
1132                                 Py_INCREF(entry->me_value);
1133                                 insertdict(copy, entry->me_key, entry->me_hash,
1134                                            entry->me_value);
1135                         }
1136                 }
1137         }
1138         return (PyObject *)copy;
1139 }
1140
1141 int
1142 PyDict_Size(PyObject *mp)
1143 {
1144         if (mp == NULL || !PyDict_Check(mp)) {
1145                 PyErr_BadInternalCall();
1146                 return 0;
1147         }
1148         return ((dictobject *)mp)->ma_used;
1149 }
1150
1151 PyObject *
1152 PyDict_Keys(PyObject *mp)
1153 {
1154         if (mp == NULL || !PyDict_Check(mp)) {
1155                 PyErr_BadInternalCall();
1156                 return NULL;
1157         }
1158         return dict_keys((dictobject *)mp, (PyObject *)NULL);
1159 }
1160
1161 PyObject *
1162 PyDict_Values(PyObject *mp)
1163 {
1164         if (mp == NULL || !PyDict_Check(mp)) {
1165                 PyErr_BadInternalCall();
1166                 return NULL;
1167         }
1168         return dict_values((dictobject *)mp, (PyObject *)NULL);
1169 }
1170
1171 PyObject *
1172 PyDict_Items(PyObject *mp)
1173 {
1174         if (mp == NULL || !PyDict_Check(mp)) {
1175                 PyErr_BadInternalCall();
1176                 return NULL;
1177         }
1178         return dict_items((dictobject *)mp, (PyObject *)NULL);
1179 }
1180
1181 /* Subroutine which returns the smallest key in a for which b's value
1182    is different or absent.  The value is returned too, through the
1183    pval argument.  Both are NULL if no key in a is found for which b's status
1184    differs.  The refcounts on (and only on) non-NULL *pval and function return
1185    values must be decremented by the caller (characterize() increments them
1186    to ensure that mutating comparison and PyDict_GetItem calls can't delete
1187    them before the caller is done looking at them). */
1188
1189 static PyObject *
1190 characterize(dictobject *a, dictobject *b, PyObject **pval)
1191 {
1192         PyObject *akey = NULL; /* smallest key in a s.t. a[akey] != b[akey] */
1193         PyObject *aval = NULL; /* a[akey] */
1194         int i, cmp;
1195
1196         for (i = 0; i <= a->ma_mask; i++) {
1197                 PyObject *thiskey, *thisaval, *thisbval;
1198                 if (a->ma_table[i].me_value == NULL)
1199                         continue;
1200                 thiskey = a->ma_table[i].me_key;
1201                 Py_INCREF(thiskey);  /* keep alive across compares */
1202                 if (akey != NULL) {
1203                         cmp = PyObject_RichCompareBool(akey, thiskey, Py_LT);
1204                         if (cmp < 0) {
1205                                 Py_DECREF(thiskey);
1206                                 goto Fail;
1207                         }
1208                         if (cmp > 0 ||
1209                             i > a->ma_mask ||
1210                             a->ma_table[i].me_value == NULL)
1211                         {
1212                                 /* Not the *smallest* a key; or maybe it is
1213                                  * but the compare shrank the dict so we can't
1214                                  * find its associated value anymore; or
1215                                  * maybe it is but the compare deleted the
1216                                  * a[thiskey] entry.
1217                                  */
1218                                 Py_DECREF(thiskey);
1219                                 continue;
1220                         }
1221                 }
1222
1223                 /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1224                 thisaval = a->ma_table[i].me_value;
1225                 assert(thisaval);
1226                 Py_INCREF(thisaval);   /* keep alive */
1227                 thisbval = PyDict_GetItem((PyObject *)b, thiskey);
1228                 if (thisbval == NULL)
1229                         cmp = 0;
1230                 else {
1231                         /* both dicts have thiskey:  same values? */
1232                         cmp = PyObject_RichCompareBool(
1233                                                 thisaval, thisbval, Py_EQ);
1234                         if (cmp < 0) {
1235                                 Py_DECREF(thiskey);
1236                                 Py_DECREF(thisaval);
1237                                 goto Fail;
1238                         }
1239                 }
1240                 if (cmp == 0) {
1241                         /* New winner. */
1242                         Py_XDECREF(akey);
1243                         Py_XDECREF(aval);
1244                         akey = thiskey;
1245                         aval = thisaval;
1246                 }
1247                 else {
1248                         Py_DECREF(thiskey);
1249                         Py_DECREF(thisaval);
1250                 }
1251         }
1252         *pval = aval;
1253         return akey;
1254
1255 Fail:
1256         Py_XDECREF(akey);
1257         Py_XDECREF(aval);
1258         *pval = NULL;
1259         return NULL;
1260 }
1261
1262 static int
1263 dict_compare(dictobject *a, dictobject *b)
1264 {
1265         PyObject *adiff, *bdiff, *aval, *bval;
1266         int res;
1267
1268         /* Compare lengths first */
1269         if (a->ma_used < b->ma_used)
1270                 return -1;      /* a is shorter */
1271         else if (a->ma_used > b->ma_used)
1272                 return 1;       /* b is shorter */
1273
1274         /* Same length -- check all keys */
1275         bdiff = bval = NULL;
1276         adiff = characterize(a, b, &aval);
1277         if (adiff == NULL) {
1278                 assert(!aval);
1279                 /* Either an error, or a is a subset with the same length so
1280                  * must be equal.
1281                  */
1282                 res = PyErr_Occurred() ? -1 : 0;
1283                 goto Finished;
1284         }
1285         bdiff = characterize(b, a, &bval);
1286         if (bdiff == NULL && PyErr_Occurred()) {
1287                 assert(!bval);
1288                 res = -1;
1289                 goto Finished;
1290         }
1291         res = 0;
1292         if (bdiff) {
1293                 /* bdiff == NULL "should be" impossible now, but perhaps
1294                  * the last comparison done by the characterize() on a had
1295                  * the side effect of making the dicts equal!
1296                  */
1297                 res = PyObject_Compare(adiff, bdiff);
1298         }
1299         if (res == 0 && bval != NULL)
1300                 res = PyObject_Compare(aval, bval);
1301
1302 Finished:
1303         Py_XDECREF(adiff);
1304         Py_XDECREF(bdiff);
1305         Py_XDECREF(aval);
1306         Py_XDECREF(bval);
1307         return res;
1308 }
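
/* Illustration only, not part of dictobject.c: a minimal sketch of the
 * ordering that characterize() and dict_compare() implement: sizes
 * first, then the smallest differing key, then that key's value.  The
 * cmp_* types and functions are hypothetical.  Keys and values are plain
 * ints, keys are assumed unique within a map, and comparisons have no
 * side effects, so none of the refcount and mutation care is needed.
 */

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int key; int value; } cmp_pair;
typedef struct { const cmp_pair *items; int n; } cmp_map;

/* Linear lookup; returns NULL when key is absent. */
static const cmp_pair *cmp_find(const cmp_map *m, int key)
{
    int i;
    for (i = 0; i < m->n; i++)
        if (m->items[i].key == key)
            return &m->items[i];
    return NULL;
}

/* Smallest key in a whose value in b is different or absent, or NULL. */
static const cmp_pair *cmp_characterize(const cmp_map *a, const cmp_map *b)
{
    const cmp_pair *best = NULL;
    int i;
    for (i = 0; i < a->n; i++) {
        const cmp_pair *p = &a->items[i];
        const cmp_pair *q;
        if (best != NULL && best->key < p->key)
            continue;               /* cannot beat the current winner */
        q = cmp_find(b, p->key);
        if (q == NULL || q->value != p->value)
            best = p;               /* new winner */
    }
    return best;
}

static int cmp_int(int x, int y) { return (x > y) - (x < y); }

/* Three-way compare returning -1, 0 or 1. */
static int cmp_compare(const cmp_map *a, const cmp_map *b)
{
    const cmp_pair *ad, *bd;
    int res;
    assert(a != NULL && b != NULL);
    if (a->n != b->n)
        return a->n < b->n ? -1 : 1;
    ad = cmp_characterize(a, b);
    if (ad == NULL)
        return 0;                   /* same size and a is a subset */
    bd = cmp_characterize(b, a);
    assert(bd != NULL);             /* holds here: no side effects */
    res = cmp_int(ad->key, bd->key);
    if (res == 0)
        res = cmp_int(ad->value, bd->value);
    return res;
}
```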
1309
1310 /* Return 1 if dicts equal, 0 if not, -1 if error.
1311  * Gets out as soon as any difference is detected.
1312  * Uses only Py_EQ comparison.
1313  */
1314 static int
1315 dict_equal(dictobject *a, dictobject *b)
1316 {
1317         int i;
1318
1319         if (a->ma_used != b->ma_used)
1320                 /* can't be equal if # of entries differ */
1321                 return 0;
1322
1323         /* Same # of entries -- check all of 'em.  Exit early on any diff. */
1324         for (i = 0; i <= a->ma_mask; i++) {
1325                 PyObject *aval = a->ma_table[i].me_value;
1326                 if (aval != NULL) {
1327                         int cmp;
1328                         PyObject *bval;
1329                         PyObject *key = a->ma_table[i].me_key;
1330                         /* temporarily bump aval's refcount to ensure it stays
1331                            alive until we're done with it */
1332                         Py_INCREF(aval);
1333                         bval = PyDict_GetItem((PyObject *)b, key);
1334                         if (bval == NULL) {
1335                                 Py_DECREF(aval);
1336                                 return 0;
1337                         }
1338                         cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
1339                         Py_DECREF(aval);
1340                         if (cmp <= 0)  /* error or not equal */
1341                                 return cmp;
1342                 }
1343         }
1344         return 1;
1345 }
1346
1347 static PyObject *
1348 dict_richcompare(PyObject *v, PyObject *w, int op)
1349 {
1350         int cmp;
1351         PyObject *res;
1352
1353         if (!PyDict_Check(v) || !PyDict_Check(w)) {
1354                 res = Py_NotImplemented;
1355         }
1356         else if (op == Py_EQ || op == Py_NE) {
1357                 cmp = dict_equal((dictobject *)v, (dictobject *)w);
1358                 if (cmp < 0)
1359                         return NULL;
1360                 res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
1361         }
1362         else
1363                 res = Py_NotImplemented;
1364         Py_INCREF(res);
1365         return res;
1366 }
1367
1368 static PyObject *
1369 dict_has_key(register dictobject *mp, PyObject *args)
1370 {
1371         PyObject *key;
1372         long hash;
1373         register long ok;
1374         if (!PyArg_ParseTuple(args, "O:has_key", &key))
1375                 return NULL;
1376 #ifdef CACHE_HASH
1377         if (!PyString_Check(key) ||
1378             (hash = ((PyStringObject *) key)->ob_shash) == -1)
1379 #endif
1380         {
1381                 hash = PyObject_Hash(key);
1382                 if (hash == -1)
1383                         return NULL;
1384         }
1385         ok = (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
1386         return PyInt_FromLong(ok);
1387 }
1388
1389 static PyObject *
1390 dict_get(register dictobject *mp, PyObject *args)
1391 {
1392         PyObject *key;
1393         PyObject *failobj = Py_None;
1394         PyObject *val = NULL;
1395         long hash;
1396
1397         if (!PyArg_ParseTuple(args, "O|O:get", &key, &failobj))
1398                 return NULL;
1399
1400 #ifdef CACHE_HASH
1401         if (!PyString_Check(key) ||
1402             (hash = ((PyStringObject *) key)->ob_shash) == -1)
1403 #endif
1404         {
1405                 hash = PyObject_Hash(key);
1406                 if (hash == -1)
1407                         return NULL;
1408         }
1409         val = (mp->ma_lookup)(mp, key, hash)->me_value;
1410
1411         if (val == NULL)
1412                 val = failobj;
1413         Py_INCREF(val);
1414         return val;
1415 }
1416
1417
1418 static PyObject *
1419 dict_setdefault(register dictobject *mp, PyObject *args)
1420 {
1421         PyObject *key;
1422         PyObject *failobj = Py_None;
1423         PyObject *val = NULL;
1424         long hash;
1425
1426         if (!PyArg_ParseTuple(args, "O|O:setdefault", &key, &failobj))
1427                 return NULL;
1428
1429 #ifdef CACHE_HASH
1430         if (!PyString_Check(key) ||
1431             (hash = ((PyStringObject *) key)->ob_shash) == -1)
1432 #endif
1433         {
1434                 hash = PyObject_Hash(key);
1435                 if (hash == -1)
1436                         return NULL;
1437         }
1438         val = (mp->ma_lookup)(mp, key, hash)->me_value;
1439         if (val == NULL) {
1440                 val = failobj;
1441                 if (PyDict_SetItem((PyObject*)mp, key, failobj))
1442                         val = NULL;
1443         }
1444         Py_XINCREF(val);
1445         return val;
1446 }
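
/* Illustration only, not part of dictobject.c: a minimal sketch of how
 * dict_get() and dict_setdefault() differ.  Both return the fallback for
 * a missing key, but only setdefault stores it.  sd_dict is a
 * hypothetical fixed-capacity map from int to int with linear lookup; it
 * does no hashing and never grows.
 */

```c
#include <assert.h>
#include <stddef.h>

#define SD_CAP 8

typedef struct {
    int keys[SD_CAP];
    int values[SD_CAP];
    int n;
} sd_dict;

/* Returns a pointer to the stored value, or NULL if key is absent. */
static int *sd_lookup(sd_dict *d, int key)
{
    int i;
    for (i = 0; i < d->n; i++)
        if (d->keys[i] == key)
            return &d->values[i];
    return NULL;
}

/* Like D.get(k, d): never modifies the map. */
static int sd_get(sd_dict *d, int key, int fallback)
{
    int *slot = sd_lookup(d, key);
    return slot != NULL ? *slot : fallback;
}

/* Like D.setdefault(k, d): stores the fallback when key is missing. */
static int sd_setdefault(sd_dict *d, int key, int fallback)
{
    int *slot = sd_lookup(d, key);
    if (slot != NULL)
        return *slot;
    assert(d->n < SD_CAP);          /* sketch limitation: no growth */
    d->keys[d->n] = key;
    d->values[d->n] = fallback;
    d->n++;
    return fallback;
}
```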
1447
1448
1449 static PyObject *
1450 dict_clear(register dictobject *mp, PyObject *args)
1451 {
1452         if (!PyArg_NoArgs(args))
1453                 return NULL;
1454         PyDict_Clear((PyObject *)mp);
1455         Py_INCREF(Py_None);
1456         return Py_None;
1457 }
1458
1459 static PyObject *
1460 dict_popitem(dictobject *mp, PyObject *args)
1461 {
1462         int i = 0;
1463         dictentry *ep;
1464         PyObject *res;
1465
1466         if (!PyArg_NoArgs(args))
1467                 return NULL;
1468         /* Allocate the result tuple before checking the size.  Believe it
1469          * or not, this allocation could trigger a garbage collection which
1470          * could empty the dict, so if we checked the size first and that
1471          * happened, the result would be an infinite loop (searching for an
1472          * entry that no longer exists).  Note that the usual popitem()
1473          * idiom is "while d: k, v = d.popitem()", so needing to throw the
1474          * tuple away if the dict *is* empty isn't a significant
1475          * inefficiency -- possible, but unlikely in practice.
1476          */
1477         res = PyTuple_New(2);
1478         if (res == NULL)
1479                 return NULL;
1480         if (mp->ma_used == 0) {
1481                 Py_DECREF(res);
1482                 PyErr_SetString(PyExc_KeyError,
1483                                 "popitem(): dictionary is empty");
1484                 return NULL;
1485         }
1486         /* Set ep to "the first" dict entry with a value.  We abuse the hash
1487          * field of slot 0 to hold a search finger:
1488          * If slot 0 has a value, use slot 0.
1489          * Else slot 0 is being used to hold a search finger,
1490          * and we use its hash value as the first index to look.
1491          */
1492         ep = &mp->ma_table[0];
1493         if (ep->me_value == NULL) {
1494                 i = (int)ep->me_hash;
1495                 /* The hash field may be a real hash value, or it may be a
1496                  * legit search finger, or it may be a once-legit search
1497                  * finger that's out of bounds now because it wrapped around
1498                  * or the table shrank -- simply make sure it's in bounds now.
1499                  */
1500                 if (i > mp->ma_mask || i < 1)
1501                         i = 1;  /* skip slot 0 */
1502                 while ((ep = &mp->ma_table[i])->me_value == NULL) {
1503                         i++;
1504                         if (i > mp->ma_mask)
1505                                 i = 1;
1506                 }
1507         }
1508         PyTuple_SET_ITEM(res, 0, ep->me_key);
1509         PyTuple_SET_ITEM(res, 1, ep->me_value);
1510         Py_INCREF(dummy);
1511         ep->me_key = dummy;
1512         ep->me_value = NULL;
1513         mp->ma_used--;
1514         assert(mp->ma_table[0].me_value == NULL);
1515         mp->ma_table[0].me_hash = i + 1;  /* next place to start */
1516         return res;
1517 }
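
/* Illustration only, not part of dictobject.c: a minimal sketch of the
 * search finger that dict_popitem() keeps in the hash field of slot 0.
 * finger_slot and finger_table are hypothetical, the table size is fixed
 * at 8, and has_value stands in for me_value != NULL.  There are no
 * dummy keys and no refcounts in this sketch.
 */

```c
#include <assert.h>

#define FINGER_SLOTS 8              /* table size; mask is FINGER_SLOTS - 1 */

typedef struct {
    int has_value;
    int value;
    long hash;                      /* doubles as the finger in slot 0 */
} finger_slot;

typedef struct {
    finger_slot table[FINGER_SLOTS];
    int used;
} finger_table;

/* Remove some entry and store its value in *out.
 * Returns 0 on success, -1 if the table is empty. */
static int finger_popitem(finger_table *t, int *out)
{
    int mask = FINGER_SLOTS - 1;
    int i = 0;
    finger_slot *ep;

    if (t->used == 0)
        return -1;
    ep = &t->table[0];
    if (!ep->has_value) {
        i = (int)ep->hash;          /* resume where the last pop stopped */
        if (i > mask || i < 1)
            i = 1;                  /* stale or wrapped finger: clamp */
        while (!(ep = &t->table[i])->has_value) {
            i++;
            if (i > mask)
                i = 1;              /* wrap around, skipping slot 0 */
        }
    }
    *out = ep->value;
    ep->has_value = 0;
    t->used--;
    assert(!t->table[0].has_value);
    t->table[0].hash = i + 1;       /* next place to start */
    return 0;
}
```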
1518
1519 static int
1520 dict_traverse(PyObject *op, visitproc visit, void *arg)
1521 {
1522         int i = 0, err;
1523         PyObject *pk;
1524         PyObject *pv;
1525
1526         while (PyDict_Next(op, &i, &pk, &pv)) {
1527                 err = visit(pk, arg);
1528                 if (err)
1529                         return err;
1530                 err = visit(pv, arg);
1531                 if (err)
1532                         return err;
1533         }
1534         return 0;
1535 }
1536
1537 static int
1538 dict_tp_clear(PyObject *op)
1539 {
1540         PyDict_Clear(op);
1541         return 0;
1542 }
1543
1544
1545 staticforward PyObject *dictiter_new(dictobject *, binaryfunc);
1546
1547 static PyObject *
1548 select_key(PyObject *key, PyObject *value)
1549 {
1550         Py_INCREF(key);
1551         return key;
1552 }
1553
1554 static PyObject *
1555 select_value(PyObject *key, PyObject *value)
1556 {
1557         Py_INCREF(value);
1558         return value;
1559 }
1560
1561 static PyObject *
1562 select_item(PyObject *key, PyObject *value)
1563 {
1564         PyObject *res = PyTuple_New(2);
1565
1566         if (res != NULL) {
1567                 Py_INCREF(key);
1568                 Py_INCREF(value);
1569                 PyTuple_SET_ITEM(res, 0, key);
1570                 PyTuple_SET_ITEM(res, 1, value);
1571         }
1572         return res;
1573 }
1574
1575 static PyObject *
1576 dict_iterkeys(dictobject *dict, PyObject *args)
1577 {
1578         if (!PyArg_ParseTuple(args, ""))
1579                 return NULL;
1580         return dictiter_new(dict, select_key);
1581 }
1582
1583 static PyObject *
1584 dict_itervalues(dictobject *dict, PyObject *args)
1585 {
1586         if (!PyArg_ParseTuple(args, ""))
1587                 return NULL;
1588         return dictiter_new(dict, select_value);
1589 }
1590
1591 static PyObject *
1592 dict_iteritems(dictobject *dict, PyObject *args)
1593 {
1594         if (!PyArg_ParseTuple(args, ""))
1595                 return NULL;
1596         return dictiter_new(dict, select_item);
1597 }
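
/* Illustration only, not part of dictobject.c: a minimal sketch of the
 * selector idea behind select_key(), select_value() and select_item(),
 * where one traversal routine is parameterised by a function pointer.
 * dictiter_new() is only declared in this part of the file, so the
 * sel_iter type below is an assumption about the general shape of such
 * an iterator, not a description of the real one.
 */

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int key; int value; } sel_entry;

/* A selector maps one entry to the int the iterator should yield. */
typedef int (*sel_fn)(const sel_entry *entry);

static int sel_key(const sel_entry *e)   { return e->key; }
static int sel_value(const sel_entry *e) { return e->value; }

/* Hypothetical iterator: position plus the selector chosen up front. */
typedef struct {
    const sel_entry *entries;
    int n;
    int pos;
    sel_fn select;
} sel_iter;

static sel_iter sel_iter_new(const sel_entry *entries, int n, sel_fn select)
{
    sel_iter it;
    assert(entries != NULL && select != NULL);
    it.entries = entries;
    it.n = n;
    it.pos = 0;
    it.select = select;
    return it;
}

/* Store the next selected item in *out; return 0 when exhausted. */
static int sel_iter_next(sel_iter *it, int *out)
{
    if (it->pos >= it->n)
        return 0;
    *out = it->select(&it->entries[it->pos++]);
    return 1;
}
```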
1598
1599
1600 static char has_key__doc__[] =
1601 "D.has_key(k) -> 1 if D has a key k, else 0";
1602
1603 static char get__doc__[] =
1604 "D.get(k[,d]) -> D[k] if D.has_key(k), else d.  d defaults to None.";
1605
1606 static char setdefault_doc__[] =
1607 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if not D.has_key(k)";
1608
1609 static char popitem__doc__[] =
1610 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
1611 2-tuple; but raise KeyError if D is empty";
1612
1613 static char keys__doc__[] =
1614 "D.keys() -> list of D's keys";
1615
1616 static char items__doc__[] =
1617 "D.items() -> list of D's (key, value) pairs, as 2-tuples";
1618
1619 static char values__doc__[] =
1620 "D.values() -> list of D's values";
1621
1622 static char update__doc__[] =
1623 "D.update(E) -> None.  Update D from E: for k in E.keys(): D[k] = E[k]";
1624
1625 static char clear__doc__[] =
1626 "D.clear() -> None.  Remove all items from D.";
1627
1628 static char copy__doc__[] =
1629 "D.copy() -> a shallow copy of D";
1630
1631 static char iterkeys__doc__[] =
1632 "D.iterkeys() -> an iterator over the keys of D";
1633
1634 static char itervalues__doc__[] =
1635 "D.itervalues() -> an iterator over the values of D";
1636
1637 static char iteritems__doc__[] =
1638 "D.iteritems() -> an iterator over the (key, value) items of D";
1639
static PyMethodDef mapp_methods[] = {
        {"has_key",     (PyCFunction)dict_has_key,      METH_VARARGS,
         has_key__doc__},
        {"get",         (PyCFunction)dict_get,          METH_VARARGS,
         get__doc__},
        {"setdefault",  (PyCFunction)dict_setdefault,   METH_VARARGS,
         setdefault_doc__},
        {"popitem",     (PyCFunction)dict_popitem,      METH_OLDARGS,
         popitem__doc__},
        {"keys",        (PyCFunction)dict_keys,         METH_OLDARGS,
         keys__doc__},
        {"items",       (PyCFunction)dict_items,        METH_OLDARGS,
         items__doc__},
        {"values",      (PyCFunction)dict_values,       METH_OLDARGS,
         values__doc__},
        {"update",      (PyCFunction)dict_update,       METH_VARARGS,
         update__doc__},
        {"clear",       (PyCFunction)dict_clear,        METH_OLDARGS,
         clear__doc__},
        {"copy",        (PyCFunction)dict_copy,         METH_OLDARGS,
         copy__doc__},
        {"iterkeys",    (PyCFunction)dict_iterkeys,     METH_VARARGS,
         iterkeys__doc__},
        {"itervalues",  (PyCFunction)dict_itervalues,   METH_VARARGS,
         itervalues__doc__},
        {"iteritems",   (PyCFunction)dict_iteritems,    METH_VARARGS,
         iteritems__doc__},
        {NULL,          NULL}   /* sentinel */
};

/* Return 1 if key is in mp, 0 if it is not, and -1 (with an exception set)
   if the key cannot be hashed. */
static int
dict_contains(dictobject *mp, PyObject *key)
{
        long hash;

#ifdef CACHE_HASH
        if (!PyString_Check(key) ||
            (hash = ((PyStringObject *) key)->ob_shash) == -1)
#endif
        {
                hash = PyObject_Hash(key);
                if (hash == -1)
                        return -1;
        }
        return (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
}

/* Hack to implement "key in dict" */
static PySequenceMethods dict_as_sequence = {
        0,                                      /* sq_length */
        0,                                      /* sq_concat */
        0,                                      /* sq_repeat */
        0,                                      /* sq_item */
        0,                                      /* sq_slice */
        0,                                      /* sq_ass_item */
        0,                                      /* sq_ass_slice */
        (objobjproc)dict_contains,              /* sq_contains */
        0,                                      /* sq_inplace_concat */
        0,                                      /* sq_inplace_repeat */
};

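/* Illustrative sketch only, not part of the original file.  Because
   dict_contains is installed in the sq_contains slot above, C code can
   run the same membership test through the abstract sequence API.  This
   assumes PySequence_Contains dispatches to sq_contains for this type;
   the function name example_dict_has_key is hypothetical. */
#if 0
static int
example_dict_has_key(PyObject *d, PyObject *key)
{
        /* 1 if key is present, 0 if absent, -1 on error
           (for example, an unhashable key). */
        return PySequence_Contains(d, key);
}
#endif
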
static PyObject *
dict_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
        PyObject *self;

        assert(type != NULL && type->tp_alloc != NULL);
        self = type->tp_alloc(type, 0);
        if (self != NULL) {
                PyDictObject *d = (PyDictObject *)self;
                /* It's guaranteed that tp_alloc zeroed out the struct. */
                assert(d->ma_table == NULL && d->ma_fill == 0 && d->ma_used == 0);
                INIT_NONZERO_DICT_SLOTS(d);
                d->ma_lookup = lookdict_string;
#ifdef SHOW_CONVERSION_COUNTS
                ++created;
#endif
        }
        return self;
}

static PyObject *
dict_iter(dictobject *dict)
{
        return dictiter_new(dict, select_key);
}

PyTypeObject PyDict_Type = {
        PyObject_HEAD_INIT(&PyType_Type)
        0,                                      /* ob_size */
        "dictionary",                           /* tp_name */
        sizeof(dictobject) + PyGC_HEAD_SIZE,    /* tp_basicsize */
        0,                                      /* tp_itemsize */
        (destructor)dict_dealloc,               /* tp_dealloc */
        (printfunc)dict_print,                  /* tp_print */
        0,                                      /* tp_getattr */
        0,                                      /* tp_setattr */
        (cmpfunc)dict_compare,                  /* tp_compare */
        (reprfunc)dict_repr,                    /* tp_repr */
        0,                                      /* tp_as_number */
        &dict_as_sequence,                      /* tp_as_sequence */
        &dict_as_mapping,                       /* tp_as_mapping */
        0,                                      /* tp_hash */
        0,                                      /* tp_call */
        0,                                      /* tp_str */
        PyObject_GenericGetAttr,                /* tp_getattro */
        0,                                      /* tp_setattro */
        0,                                      /* tp_as_buffer */
        Py_TPFLAGS_DEFAULT | Py_TPFLAGS_GC |
                Py_TPFLAGS_BASETYPE,            /* tp_flags */
        "dictionary type",                      /* tp_doc */
        (traverseproc)dict_traverse,            /* tp_traverse */
        (inquiry)dict_tp_clear,                 /* tp_clear */
        dict_richcompare,                       /* tp_richcompare */
        0,                                      /* tp_weaklistoffset */
        (getiterfunc)dict_iter,                 /* tp_iter */
        0,                                      /* tp_iternext */
        mapp_methods,                           /* tp_methods */
        0,                                      /* tp_members */
        0,                                      /* tp_getset */
        0,                                      /* tp_base */
        0,                                      /* tp_dict */
        0,                                      /* tp_descr_get */
        0,                                      /* tp_descr_set */
        0,                                      /* tp_dictoffset */
        0,                                      /* tp_init */
        PyType_GenericAlloc,                    /* tp_alloc */
        dict_new,                               /* tp_new */
};

/* For backward compatibility with old dictionary interface */

/* Returns a borrowed reference (whatever PyDict_GetItem returned), or NULL
   if the key is absent or the temporary key string cannot be created. */
PyObject *
PyDict_GetItemString(PyObject *v, char *key)
{
        PyObject *kv, *rv;
        kv = PyString_FromString(key);
        if (kv == NULL)
                return NULL;
        rv = PyDict_GetItem(v, kv);
        Py_DECREF(kv);
        return rv;
}

/* Returns 0 on success and -1 on failure.  The key string is interned
   before it is stored. */
int
PyDict_SetItemString(PyObject *v, char *key, PyObject *item)
{
        PyObject *kv;
        int err;
        kv = PyString_FromString(key);
        if (kv == NULL)
                return -1;
        PyString_InternInPlace(&kv); /* XXX Should we really? */
        err = PyDict_SetItem(v, kv, item);
        Py_DECREF(kv);
        return err;
}

/* Returns 0 on success and -1 on failure. */
int
PyDict_DelItemString(PyObject *v, char *key)
{
        PyObject *kv;
        int err;
        kv = PyString_FromString(key);
        if (kv == NULL)
                return -1;
        err = PyDict_DelItem(v, kv);
        Py_DECREF(kv);
        return err;
}

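/* Illustrative sketch only, not part of the original file.  It shows one
   way the three *String helpers above might be combined, with the
   reference handling each one implies: PyDict_SetItemString does not
   steal the reference to item, and PyDict_GetItemString hands back a
   borrowed reference.  The function name example_string_keys is
   hypothetical. */
#if 0
static int
example_string_keys(void)
{
        PyObject *d, *one, *got;
        int err;

        d = PyDict_New();
        if (d == NULL)
                return -1;
        one = PyInt_FromLong(1L);
        if (one == NULL) {
                Py_DECREF(d);
                return -1;
        }
        err = PyDict_SetItemString(d, "spam", one);
        Py_DECREF(one);         /* the dict holds its own reference now */
        if (err == 0) {
                got = PyDict_GetItemString(d, "spam");
                if (got == NULL)        /* borrowed: no Py_DECREF on got */
                        err = -1;
        }
        if (err == 0)
                err = PyDict_DelItemString(d, "spam");
        Py_DECREF(d);
        return err;
}
#endif
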
/* Dictionary iterator type */

extern PyTypeObject PyDictIter_Type; /* Forward */

typedef struct {
        PyObject_HEAD
        dictobject *di_dict;    /* dict being iterated; owned reference */
        int di_used;            /* dict->ma_used when the iterator was created */
        int di_pos;             /* position cookie passed to PyDict_Next */
        binaryfunc di_select;   /* builds the result from (key, value) */
} dictiterobject;

static PyObject *
dictiter_new(dictobject *dict, binaryfunc select)
{
        dictiterobject *di;
        di = PyObject_NEW(dictiterobject, &PyDictIter_Type);
        if (di == NULL)
                return NULL;
        Py_INCREF(dict);
        di->di_dict = dict;
        di->di_used = dict->ma_used;
        di->di_pos = 0;
        di->di_select = select;
        return (PyObject *)di;
}

static void
dictiter_dealloc(dictiterobject *di)
{
        Py_DECREF(di->di_dict);
        PyObject_DEL(di);
}

/* The next() method: same as dictiter_iternext, except that exhaustion is
   reported by raising StopIteration explicitly. */
static PyObject *
dictiter_next(dictiterobject *di, PyObject *args)
{
        PyObject *key, *value;

        if (di->di_used != di->di_dict->ma_used) {
                PyErr_SetString(PyExc_RuntimeError,
                                "dictionary changed size during iteration");
                return NULL;
        }
        if (PyDict_Next((PyObject *)(di->di_dict), &di->di_pos, &key, &value)) {
                return (*di->di_select)(key, value);
        }
        PyErr_SetObject(PyExc_StopIteration, Py_None);
        return NULL;
}

static PyObject *
dictiter_getiter(PyObject *it)
{
        Py_INCREF(it);
        return it;
}

static PyMethodDef dictiter_methods[] = {
        {"next",        (PyCFunction)dictiter_next,     METH_VARARGS,
         "it.next() -- get the next value, or raise StopIteration"},
        {NULL,          NULL}           /* sentinel */
};

/* The tp_iternext slot: returns NULL with no exception set when the
   dictionary is exhausted. */
static PyObject *
dictiter_iternext(dictiterobject *di)
{
        PyObject *key, *value;

        if (di->di_used != di->di_dict->ma_used) {
                PyErr_SetString(PyExc_RuntimeError,
                                "dictionary changed size during iteration");
                return NULL;
        }
        if (PyDict_Next((PyObject *)(di->di_dict), &di->di_pos, &key, &value)) {
                return (*di->di_select)(key, value);
        }
        return NULL;
}

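/* Illustrative sketch only, not part of the original file.  Both iterator
   entry points above are thin wrappers around PyDict_Next, so the same
   loop can be written directly in C.  The position variable is an int
   here to match di_pos; key and value are borrowed references, and the
   dict should not be resized while the loop runs.  The function name
   example_sum_int_values is hypothetical. */
#if 0
static long
example_sum_int_values(PyObject *d)
{
        int pos = 0;            /* must start at 0, like di_pos */
        PyObject *key, *value;
        long total = 0;

        while (PyDict_Next(d, &pos, &key, &value)) {
                if (PyInt_Check(value))
                        total += PyInt_AS_LONG(value);
        }
        return total;
}
#endif
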
PyTypeObject PyDictIter_Type = {
        PyObject_HEAD_INIT(&PyType_Type)
        0,                                      /* ob_size */
        "dictionary-iterator",                  /* tp_name */
        sizeof(dictiterobject),                 /* tp_basicsize */
        0,                                      /* tp_itemsize */
        /* methods */
        (destructor)dictiter_dealloc,           /* tp_dealloc */
        0,                                      /* tp_print */
        0,                                      /* tp_getattr */
        0,                                      /* tp_setattr */
        0,                                      /* tp_compare */
        0,                                      /* tp_repr */
        0,                                      /* tp_as_number */
        0,                                      /* tp_as_sequence */
        0,                                      /* tp_as_mapping */
        0,                                      /* tp_hash */
        0,                                      /* tp_call */
        0,                                      /* tp_str */
        PyObject_GenericGetAttr,                /* tp_getattro */
        0,                                      /* tp_setattro */
        0,                                      /* tp_as_buffer */
        Py_TPFLAGS_DEFAULT,                     /* tp_flags */
        0,                                      /* tp_doc */
        0,                                      /* tp_traverse */
        0,                                      /* tp_clear */
        0,                                      /* tp_richcompare */
        0,                                      /* tp_weaklistoffset */
        (getiterfunc)dictiter_getiter,          /* tp_iter */
        (iternextfunc)dictiter_iternext,        /* tp_iternext */
        dictiter_methods,                       /* tp_methods */
        0,                                      /* tp_members */
        0,                                      /* tp_getset */
        0,                                      /* tp_base */
        0,                                      /* tp_dict */
        0,                                      /* tp_descr_get */
        0,                                      /* tp_descr_set */
};