This commit was manufactured by cvs2svn to create tag
[python/dscho.git] / Objects / dictobject.c
blobe843e760cb90dc5dbb7672967f9699a286ce95ee
2 /* Dictionary object implementation using a hash table */
4 #include "Python.h"
6 typedef PyDictEntry dictentry;
7 typedef PyDictObject dictobject;
9 /* Define this out if you don't want conversion statistics on exit. */
10 #undef SHOW_CONVERSION_COUNTS
12 /* See large comment block below. This must be >= 1. */
13 #define PERTURB_SHIFT 5
16 Major subtleties ahead: Most hash schemes depend on having a "good" hash
17 function, in the sense of simulating randomness. Python doesn't: its most
18 important hash functions (for strings and ints) are very regular in common
19 cases:
21 >>> map(hash, (0, 1, 2, 3))
22 [0, 1, 2, 3]
23 >>> map(hash, ("namea", "nameb", "namec", "named"))
24 [-1658398457, -1658398460, -1658398459, -1658398462]
25 >>>
27 This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
28 the low-order i bits as the initial table index is extremely fast, and there
29 are no collisions at all for dicts indexed by a contiguous range of ints.
30 The same is approximately true when keys are "consecutive" strings. So this
31 gives better-than-random behavior in common cases, and that's very desirable.
33 OTOH, when collisions occur, the tendency to fill contiguous slices of the
34 hash table makes a good collision resolution strategy crucial. Taking only
35 the last i bits of the hash code is also vulnerable: for example, consider
36 [i << 16 for i in range(20000)] as a set of keys. Since ints are their own
37 hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
38 hash code are all 0: they *all* map to the same table index.
40 But catering to unusual cases should not slow the usual ones, so we just take
41 the last i bits anyway. It's up to collision resolution to do the rest. If
42 we *usually* find the key we're looking for on the first try (and, it turns
43 out, we usually do -- the table load factor is kept under 2/3, so the odds
44 are solidly in our favor), then it makes best sense to keep the initial index
45 computation dirt cheap.
47 The first half of collision resolution is to visit table indices via this
48 recurrence:
50 j = ((5*j) + 1) mod 2**i
52 For any initial j in range(2**i), repeating that 2**i times generates each
53 int in range(2**i) exactly once (see any text on random-number generation for
54 proof). By itself, this doesn't help much: like linear probing (setting
55 j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
56 order. This would be bad, except that's not the only thing we do, and it's
57 actually *good* in the common cases where hash keys are consecutive. In an
58 example that's really too small to make this entirely clear, for a table of
59 size 2**3 the order of indices is:
61 0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
63 If two things come in at index 5, the first place we look after is index 2,
64 not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
65 Linear probing is deadly in this case because there the fixed probe order
66 is the *same* as the order consecutive keys are likely to arrive. But it's
67 extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
68 and certain that consecutive hash codes do not.
70 The other half of the strategy is to get the other bits of the hash code
71 into play. This is done by initializing a (unsigned) vrbl "perturb" to the
72 full hash code, and changing the recurrence to:
74 j = (5*j) + 1 + perturb;
75 perturb >>= PERTURB_SHIFT;
76 use j % 2**i as the next table index;
78 Now the probe sequence depends (eventually) on every bit in the hash code,
79 and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
80 because it quickly magnifies small differences in the bits that didn't affect
81 the initial index. Note that because perturb is unsigned, if the recurrence
82 is executed often enough perturb eventually becomes and remains 0. At that
83 point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
84 that's certain to find an empty slot eventually (since it generates every int
85 in range(2**i), and we make sure there's always at least one empty slot).
87 Selecting a good value for PERTURB_SHIFT is a balancing act. You want it
88 small so that the high bits of the hash code continue to affect the probe
89 sequence across iterations; but you want it large so that in really bad cases
90 the high-order hash bits have an effect on early iterations. 5 was "the
91 best" in minimizing total collisions across experiments Tim Peters ran (on
92 both normal and pathological cases), but 4 and 6 weren't significantly worse.
94 Historical: Reimer Behrends contributed the idea of using a polynomial-based
95 approach, using repeated multiplication by x in GF(2**n) where an irreducible
96 polynomial for each table size was chosen such that x was a primitive root.
97 Christian Tismer later extended that to use division by x instead, as an
98 efficient way to get the high bits of the hash code into play. This scheme
99 also gave excellent collision statistics, but was more expensive: two
100 if-tests were required inside the loop; computing "the next" index took about
101 the same number of operations but without as much potential parallelism
102 (e.g., computing 5*j can go on at the same time as computing 1+perturb in the
103 above, and then shifting perturb can be done while the table index is being
104 masked); and the dictobject struct required a member to hold the table's
105 polynomial. In Tim's experiments the current scheme ran faster, produced
106 equally good collision statistics, needed less code & used less memory.
109 /* Object used as dummy key to fill deleted entries */
110 static PyObject *dummy; /* Initialized by first call to newdictobject() */
112 /* forward declarations */
113 static dictentry *
114 lookdict_string(dictobject *mp, PyObject *key, long hash);
116 #ifdef SHOW_CONVERSION_COUNTS
117 static long created = 0L;
118 static long converted = 0L;
120 static void
121 show_counts(void)
123 fprintf(stderr, "created %ld string dicts\n", created);
124 fprintf(stderr, "converted %ld to normal dicts\n", converted);
125 fprintf(stderr, "%.2f%% conversion rate\n", (100.0*converted)/created);
127 #endif
129 /* Initialization macros.
130 There are two ways to create a dict: PyDict_New() is the main C API
131 function, and the tp_new slot maps to dict_new(). In the latter case we
132 can save a little time over what PyDict_New does because it's guaranteed
133 that the PyDictObject struct is already zeroed out.
134 Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
135 an excellent reason not to).
138 #define INIT_NONZERO_DICT_SLOTS(mp) do { \
139 (mp)->ma_table = (mp)->ma_smalltable; \
140 (mp)->ma_mask = PyDict_MINSIZE - 1; \
141 } while(0)
143 #define EMPTY_TO_MINSIZE(mp) do { \
144 memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable)); \
145 (mp)->ma_used = (mp)->ma_fill = 0; \
146 INIT_NONZERO_DICT_SLOTS(mp); \
147 } while(0)
149 PyObject *
150 PyDict_New(void)
152 register dictobject *mp;
153 if (dummy == NULL) { /* Auto-initialize dummy */
154 dummy = PyString_FromString("<dummy key>");
155 if (dummy == NULL)
156 return NULL;
157 #ifdef SHOW_CONVERSION_COUNTS
158 Py_AtExit(show_counts);
159 #endif
161 mp = PyObject_GC_New(dictobject, &PyDict_Type);
162 if (mp == NULL)
163 return NULL;
164 EMPTY_TO_MINSIZE(mp);
165 mp->ma_lookup = lookdict_string;
166 #ifdef SHOW_CONVERSION_COUNTS
167 ++created;
168 #endif
169 _PyObject_GC_TRACK(mp);
170 return (PyObject *)mp;
174 The basic lookup function used by all operations.
175 This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
176 Open addressing is preferred over chaining since the link overhead for
177 chaining would be substantial (100% with typical malloc overhead).
179 The initial probe index is computed as hash mod the table size. Subsequent
180 probe indices are computed as explained earlier.
182 All arithmetic on hash should ignore overflow.
184 (The details in this version are due to Tim Peters, building on many past
185 contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
186 Christian Tismer).
188 This function must never return NULL; failures are indicated by returning
189 a dictentry* for which the me_value field is NULL. Exceptions are never
190 reported by this function, and outstanding exceptions are maintained.
193 static dictentry *
194 lookdict(dictobject *mp, PyObject *key, register long hash)
196 register int i;
197 register unsigned int perturb;
198 register dictentry *freeslot;
199 register unsigned int mask = mp->ma_mask;
200 dictentry *ep0 = mp->ma_table;
201 register dictentry *ep;
202 register int restore_error;
203 register int checked_error;
204 register int cmp;
205 PyObject *err_type, *err_value, *err_tb;
206 PyObject *startkey;
208 i = hash & mask;
209 ep = &ep0[i];
210 if (ep->me_key == NULL || ep->me_key == key)
211 return ep;
213 restore_error = checked_error = 0;
214 if (ep->me_key == dummy)
215 freeslot = ep;
216 else {
217 if (ep->me_hash == hash) {
218 /* error can't have been checked yet */
219 checked_error = 1;
220 if (PyErr_Occurred()) {
221 restore_error = 1;
222 PyErr_Fetch(&err_type, &err_value, &err_tb);
224 startkey = ep->me_key;
225 cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
226 if (cmp < 0)
227 PyErr_Clear();
228 if (ep0 == mp->ma_table && ep->me_key == startkey) {
229 if (cmp > 0)
230 goto Done;
232 else {
233 /* The compare did major nasty stuff to the
234 * dict: start over.
235 * XXX A clever adversary could prevent this
236 * XXX from terminating.
238 ep = lookdict(mp, key, hash);
239 goto Done;
242 freeslot = NULL;
245 /* In the loop, me_key == dummy is by far (factor of 100s) the
246 least likely outcome, so test for that last. */
247 for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
248 i = (i << 2) + i + perturb + 1;
249 ep = &ep0[i & mask];
250 if (ep->me_key == NULL) {
251 if (freeslot != NULL)
252 ep = freeslot;
253 break;
255 if (ep->me_key == key)
256 break;
257 if (ep->me_hash == hash && ep->me_key != dummy) {
258 if (!checked_error) {
259 checked_error = 1;
260 if (PyErr_Occurred()) {
261 restore_error = 1;
262 PyErr_Fetch(&err_type, &err_value,
263 &err_tb);
266 startkey = ep->me_key;
267 cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
268 if (cmp < 0)
269 PyErr_Clear();
270 if (ep0 == mp->ma_table && ep->me_key == startkey) {
271 if (cmp > 0)
272 break;
274 else {
275 /* The compare did major nasty stuff to the
276 * dict: start over.
277 * XXX A clever adversary could prevent this
278 * XXX from terminating.
280 ep = lookdict(mp, key, hash);
281 break;
284 else if (ep->me_key == dummy && freeslot == NULL)
285 freeslot = ep;
288 Done:
289 if (restore_error)
290 PyErr_Restore(err_type, err_value, err_tb);
291 return ep;
295 * Hacked up version of lookdict which can assume keys are always strings;
296 * this assumption allows testing for errors during PyObject_Compare() to
297 * be dropped; string-string comparisons never raise exceptions. This also
298 * means we don't need to go through PyObject_Compare(); we can always use
299 * _PyString_Eq directly.
301 * This is valuable because the general-case error handling in lookdict() is
302 * expensive, and dicts with pure-string keys are very common.
304 static dictentry *
305 lookdict_string(dictobject *mp, PyObject *key, register long hash)
307 register int i;
308 register unsigned int perturb;
309 register dictentry *freeslot;
310 register unsigned int mask = mp->ma_mask;
311 dictentry *ep0 = mp->ma_table;
312 register dictentry *ep;
314 /* Make sure this function doesn't have to handle non-string keys,
315 including subclasses of str; e.g., one reason to subclass
316 strings is to override __eq__, and for speed we don't cater to
317 that here. */
318 if (!PyString_CheckExact(key)) {
319 #ifdef SHOW_CONVERSION_COUNTS
320 ++converted;
321 #endif
322 mp->ma_lookup = lookdict;
323 return lookdict(mp, key, hash);
325 i = hash & mask;
326 ep = &ep0[i];
327 if (ep->me_key == NULL || ep->me_key == key)
328 return ep;
329 if (ep->me_key == dummy)
330 freeslot = ep;
331 else {
332 if (ep->me_hash == hash
333 && _PyString_Eq(ep->me_key, key)) {
334 return ep;
336 freeslot = NULL;
339 /* In the loop, me_key == dummy is by far (factor of 100s) the
340 least likely outcome, so test for that last. */
341 for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
342 i = (i << 2) + i + perturb + 1;
343 ep = &ep0[i & mask];
344 if (ep->me_key == NULL)
345 return freeslot == NULL ? ep : freeslot;
346 if (ep->me_key == key
347 || (ep->me_hash == hash
348 && ep->me_key != dummy
349 && _PyString_Eq(ep->me_key, key)))
350 return ep;
351 if (ep->me_key == dummy && freeslot == NULL)
352 freeslot = ep;
357 Internal routine to insert a new item into the table.
358 Used both by the internal resize routine and by the public insert routine.
359 Eats a reference to key and one to value.
361 static void
362 insertdict(register dictobject *mp, PyObject *key, long hash, PyObject *value)
364 PyObject *old_value;
365 register dictentry *ep;
366 typedef PyDictEntry *(*lookupfunc)(PyDictObject *, PyObject *, long);
368 assert(mp->ma_lookup != NULL);
369 ep = mp->ma_lookup(mp, key, hash);
370 if (ep->me_value != NULL) {
371 old_value = ep->me_value;
372 ep->me_value = value;
373 Py_DECREF(old_value); /* which **CAN** re-enter */
374 Py_DECREF(key);
376 else {
377 if (ep->me_key == NULL)
378 mp->ma_fill++;
379 else
380 Py_DECREF(ep->me_key);
381 ep->me_key = key;
382 ep->me_hash = hash;
383 ep->me_value = value;
384 mp->ma_used++;
389 Restructure the table by allocating a new table and reinserting all
390 items again. When entries have been deleted, the new table may
391 actually be smaller than the old one.
393 static int
394 dictresize(dictobject *mp, int minused)
396 int newsize;
397 dictentry *oldtable, *newtable, *ep;
398 int i;
399 int is_oldtable_malloced;
400 dictentry small_copy[PyDict_MINSIZE];
402 assert(minused >= 0);
404 /* Find the smallest table size > minused. */
405 for (newsize = PyDict_MINSIZE;
406 newsize <= minused && newsize > 0;
407 newsize <<= 1)
409 if (newsize <= 0) {
410 PyErr_NoMemory();
411 return -1;
414 /* Get space for a new table. */
415 oldtable = mp->ma_table;
416 assert(oldtable != NULL);
417 is_oldtable_malloced = oldtable != mp->ma_smalltable;
419 if (newsize == PyDict_MINSIZE) {
420 /* A large table is shrinking, or we can't get any smaller. */
421 newtable = mp->ma_smalltable;
422 if (newtable == oldtable) {
423 if (mp->ma_fill == mp->ma_used) {
424 /* No dummies, so no point doing anything. */
425 return 0;
427 /* We're not going to resize it, but rebuild the
428 table anyway to purge old dummy entries.
429 Subtle: This is *necessary* if fill==size,
430 as lookdict needs at least one virgin slot to
431 terminate failing searches. If fill < size, it's
432 merely desirable, as dummies slow searches. */
433 assert(mp->ma_fill > mp->ma_used);
434 memcpy(small_copy, oldtable, sizeof(small_copy));
435 oldtable = small_copy;
438 else {
439 newtable = PyMem_NEW(dictentry, newsize);
440 if (newtable == NULL) {
441 PyErr_NoMemory();
442 return -1;
446 /* Make the dict empty, using the new table. */
447 assert(newtable != oldtable);
448 mp->ma_table = newtable;
449 mp->ma_mask = newsize - 1;
450 memset(newtable, 0, sizeof(dictentry) * newsize);
451 mp->ma_used = 0;
452 i = mp->ma_fill;
453 mp->ma_fill = 0;
455 /* Copy the data over; this is refcount-neutral for active entries;
456 dummy entries aren't copied over, of course */
457 for (ep = oldtable; i > 0; ep++) {
458 if (ep->me_value != NULL) { /* active entry */
459 --i;
460 insertdict(mp, ep->me_key, ep->me_hash, ep->me_value);
462 else if (ep->me_key != NULL) { /* dummy entry */
463 --i;
464 assert(ep->me_key == dummy);
465 Py_DECREF(ep->me_key);
467 /* else key == value == NULL: nothing to do */
470 if (is_oldtable_malloced)
471 PyMem_DEL(oldtable);
472 return 0;
475 PyObject *
476 PyDict_GetItem(PyObject *op, PyObject *key)
478 long hash;
479 dictobject *mp = (dictobject *)op;
480 if (!PyDict_Check(op)) {
481 return NULL;
483 #ifdef CACHE_HASH
484 if (!PyString_CheckExact(key) ||
485 (hash = ((PyStringObject *) key)->ob_shash) == -1)
486 #endif
488 hash = PyObject_Hash(key);
489 if (hash == -1) {
490 PyErr_Clear();
491 return NULL;
494 return (mp->ma_lookup)(mp, key, hash)->me_value;
497 /* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
498 * dictionary if it is merely replacing the value for an existing key.
499 * This is means that it's safe to loop over a dictionary with
500 * PyDict_Next() and occasionally replace a value -- but you can't
501 * insert new keys or remove them.
504 PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
506 register dictobject *mp;
507 register long hash;
508 register int n_used;
510 if (!PyDict_Check(op)) {
511 PyErr_BadInternalCall();
512 return -1;
514 mp = (dictobject *)op;
515 #ifdef CACHE_HASH
516 if (PyString_CheckExact(key)) {
517 #ifdef INTERN_STRINGS
518 if (((PyStringObject *)key)->ob_sinterned != NULL) {
519 key = ((PyStringObject *)key)->ob_sinterned;
520 hash = ((PyStringObject *)key)->ob_shash;
522 else
523 #endif
525 hash = ((PyStringObject *)key)->ob_shash;
526 if (hash == -1)
527 hash = PyObject_Hash(key);
530 else
531 #endif
533 hash = PyObject_Hash(key);
534 if (hash == -1)
535 return -1;
537 assert(mp->ma_fill <= mp->ma_mask); /* at least one empty slot */
538 n_used = mp->ma_used;
539 Py_INCREF(value);
540 Py_INCREF(key);
541 insertdict(mp, key, hash, value);
542 /* If we added a key, we can safely resize. Otherwise skip this!
543 * If fill >= 2/3 size, adjust size. Normally, this doubles the
544 * size, but it's also possible for the dict to shrink (if ma_fill is
545 * much larger than ma_used, meaning a lot of dict keys have been
546 * deleted).
548 if (mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2) {
549 if (dictresize(mp, mp->ma_used*2) != 0)
550 return -1;
552 return 0;
556 PyDict_DelItem(PyObject *op, PyObject *key)
558 register dictobject *mp;
559 register long hash;
560 register dictentry *ep;
561 PyObject *old_value, *old_key;
563 if (!PyDict_Check(op)) {
564 PyErr_BadInternalCall();
565 return -1;
567 #ifdef CACHE_HASH
568 if (!PyString_CheckExact(key) ||
569 (hash = ((PyStringObject *) key)->ob_shash) == -1)
570 #endif
572 hash = PyObject_Hash(key);
573 if (hash == -1)
574 return -1;
576 mp = (dictobject *)op;
577 ep = (mp->ma_lookup)(mp, key, hash);
578 if (ep->me_value == NULL) {
579 PyErr_SetObject(PyExc_KeyError, key);
580 return -1;
582 old_key = ep->me_key;
583 Py_INCREF(dummy);
584 ep->me_key = dummy;
585 old_value = ep->me_value;
586 ep->me_value = NULL;
587 mp->ma_used--;
588 Py_DECREF(old_value);
589 Py_DECREF(old_key);
590 return 0;
593 void
594 PyDict_Clear(PyObject *op)
596 dictobject *mp;
597 dictentry *ep, *table;
598 int table_is_malloced;
599 int fill;
600 dictentry small_copy[PyDict_MINSIZE];
601 #ifdef Py_DEBUG
602 int i, n;
603 #endif
605 if (!PyDict_Check(op))
606 return;
607 mp = (dictobject *)op;
608 #ifdef Py_DEBUG
609 n = mp->ma_mask + 1;
610 i = 0;
611 #endif
613 table = mp->ma_table;
614 assert(table != NULL);
615 table_is_malloced = table != mp->ma_smalltable;
617 /* This is delicate. During the process of clearing the dict,
618 * decrefs can cause the dict to mutate. To avoid fatal confusion
619 * (voice of experience), we have to make the dict empty before
620 * clearing the slots, and never refer to anything via mp->xxx while
621 * clearing.
623 fill = mp->ma_fill;
624 if (table_is_malloced)
625 EMPTY_TO_MINSIZE(mp);
627 else if (fill > 0) {
628 /* It's a small table with something that needs to be cleared.
629 * Afraid the only safe way is to copy the dict entries into
630 * another small table first.
632 memcpy(small_copy, table, sizeof(small_copy));
633 table = small_copy;
634 EMPTY_TO_MINSIZE(mp);
636 /* else it's a small table that's already empty */
638 /* Now we can finally clear things. If C had refcounts, we could
639 * assert that the refcount on table is 1 now, i.e. that this function
640 * has unique access to it, so decref side-effects can't alter it.
642 for (ep = table; fill > 0; ++ep) {
643 #ifdef Py_DEBUG
644 assert(i < n);
645 ++i;
646 #endif
647 if (ep->me_key) {
648 --fill;
649 Py_DECREF(ep->me_key);
650 Py_XDECREF(ep->me_value);
652 #ifdef Py_DEBUG
653 else
654 assert(ep->me_value == NULL);
655 #endif
658 if (table_is_malloced)
659 PyMem_DEL(table);
662 /* CAUTION: In general, it isn't safe to use PyDict_Next in a loop that
663 * mutates the dict. One exception: it is safe if the loop merely changes
664 * the values associated with the keys (but doesn't insert new keys or
665 * delete keys), via PyDict_SetItem().
668 PyDict_Next(PyObject *op, int *ppos, PyObject **pkey, PyObject **pvalue)
670 int i;
671 register dictobject *mp;
672 if (!PyDict_Check(op))
673 return 0;
674 mp = (dictobject *)op;
675 i = *ppos;
676 if (i < 0)
677 return 0;
678 while (i <= mp->ma_mask && mp->ma_table[i].me_value == NULL)
679 i++;
680 *ppos = i+1;
681 if (i > mp->ma_mask)
682 return 0;
683 if (pkey)
684 *pkey = mp->ma_table[i].me_key;
685 if (pvalue)
686 *pvalue = mp->ma_table[i].me_value;
687 return 1;
690 /* Methods */
692 static void
693 dict_dealloc(register dictobject *mp)
695 register dictentry *ep;
696 int fill = mp->ma_fill;
697 Py_TRASHCAN_SAFE_BEGIN(mp)
698 _PyObject_GC_UNTRACK(mp);
699 for (ep = mp->ma_table; fill > 0; ep++) {
700 if (ep->me_key) {
701 --fill;
702 Py_DECREF(ep->me_key);
703 Py_XDECREF(ep->me_value);
706 if (mp->ma_table != mp->ma_smalltable)
707 PyMem_DEL(mp->ma_table);
708 mp->ob_type->tp_free((PyObject *)mp);
709 Py_TRASHCAN_SAFE_END(mp)
712 static int
713 dict_print(register dictobject *mp, register FILE *fp, register int flags)
715 register int i;
716 register int any;
718 i = Py_ReprEnter((PyObject*)mp);
719 if (i != 0) {
720 if (i < 0)
721 return i;
722 fprintf(fp, "{...}");
723 return 0;
726 fprintf(fp, "{");
727 any = 0;
728 for (i = 0; i <= mp->ma_mask; i++) {
729 dictentry *ep = mp->ma_table + i;
730 PyObject *pvalue = ep->me_value;
731 if (pvalue != NULL) {
732 /* Prevent PyObject_Repr from deleting value during
733 key format */
734 Py_INCREF(pvalue);
735 if (any++ > 0)
736 fprintf(fp, ", ");
737 if (PyObject_Print((PyObject *)ep->me_key, fp, 0)!=0) {
738 Py_DECREF(pvalue);
739 Py_ReprLeave((PyObject*)mp);
740 return -1;
742 fprintf(fp, ": ");
743 if (PyObject_Print(pvalue, fp, 0) != 0) {
744 Py_DECREF(pvalue);
745 Py_ReprLeave((PyObject*)mp);
746 return -1;
748 Py_DECREF(pvalue);
751 fprintf(fp, "}");
752 Py_ReprLeave((PyObject*)mp);
753 return 0;
756 static PyObject *
757 dict_repr(dictobject *mp)
759 int i;
760 PyObject *s, *temp, *colon = NULL;
761 PyObject *pieces = NULL, *result = NULL;
762 PyObject *key, *value;
764 i = Py_ReprEnter((PyObject *)mp);
765 if (i != 0) {
766 return i > 0 ? PyString_FromString("{...}") : NULL;
769 if (mp->ma_used == 0) {
770 result = PyString_FromString("{}");
771 goto Done;
774 pieces = PyList_New(0);
775 if (pieces == NULL)
776 goto Done;
778 colon = PyString_FromString(": ");
779 if (colon == NULL)
780 goto Done;
782 /* Do repr() on each key+value pair, and insert ": " between them.
783 Note that repr may mutate the dict. */
784 i = 0;
785 while (PyDict_Next((PyObject *)mp, &i, &key, &value)) {
786 int status;
787 /* Prevent repr from deleting value during key format. */
788 Py_INCREF(value);
789 s = PyObject_Repr(key);
790 PyString_Concat(&s, colon);
791 PyString_ConcatAndDel(&s, PyObject_Repr(value));
792 Py_DECREF(value);
793 if (s == NULL)
794 goto Done;
795 status = PyList_Append(pieces, s);
796 Py_DECREF(s); /* append created a new ref */
797 if (status < 0)
798 goto Done;
801 /* Add "{}" decorations to the first and last items. */
802 assert(PyList_GET_SIZE(pieces) > 0);
803 s = PyString_FromString("{");
804 if (s == NULL)
805 goto Done;
806 temp = PyList_GET_ITEM(pieces, 0);
807 PyString_ConcatAndDel(&s, temp);
808 PyList_SET_ITEM(pieces, 0, s);
809 if (s == NULL)
810 goto Done;
812 s = PyString_FromString("}");
813 if (s == NULL)
814 goto Done;
815 temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1);
816 PyString_ConcatAndDel(&temp, s);
817 PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp);
818 if (temp == NULL)
819 goto Done;
821 /* Paste them all together with ", " between. */
822 s = PyString_FromString(", ");
823 if (s == NULL)
824 goto Done;
825 result = _PyString_Join(s, pieces);
826 Py_DECREF(s);
828 Done:
829 Py_XDECREF(pieces);
830 Py_XDECREF(colon);
831 Py_ReprLeave((PyObject *)mp);
832 return result;
835 static int
836 dict_length(dictobject *mp)
838 return mp->ma_used;
841 static PyObject *
842 dict_subscript(dictobject *mp, register PyObject *key)
844 PyObject *v;
845 long hash;
846 assert(mp->ma_table != NULL);
847 #ifdef CACHE_HASH
848 if (!PyString_CheckExact(key) ||
849 (hash = ((PyStringObject *) key)->ob_shash) == -1)
850 #endif
852 hash = PyObject_Hash(key);
853 if (hash == -1)
854 return NULL;
856 v = (mp->ma_lookup)(mp, key, hash) -> me_value;
857 if (v == NULL)
858 PyErr_SetObject(PyExc_KeyError, key);
859 else
860 Py_INCREF(v);
861 return v;
864 static int
865 dict_ass_sub(dictobject *mp, PyObject *v, PyObject *w)
867 if (w == NULL)
868 return PyDict_DelItem((PyObject *)mp, v);
869 else
870 return PyDict_SetItem((PyObject *)mp, v, w);
873 static PyMappingMethods dict_as_mapping = {
874 (inquiry)dict_length, /*mp_length*/
875 (binaryfunc)dict_subscript, /*mp_subscript*/
876 (objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
879 static PyObject *
880 dict_keys(register dictobject *mp)
882 register PyObject *v;
883 register int i, j, n;
885 again:
886 n = mp->ma_used;
887 v = PyList_New(n);
888 if (v == NULL)
889 return NULL;
890 if (n != mp->ma_used) {
891 /* Durnit. The allocations caused the dict to resize.
892 * Just start over, this shouldn't normally happen.
894 Py_DECREF(v);
895 goto again;
897 for (i = 0, j = 0; i <= mp->ma_mask; i++) {
898 if (mp->ma_table[i].me_value != NULL) {
899 PyObject *key = mp->ma_table[i].me_key;
900 Py_INCREF(key);
901 PyList_SET_ITEM(v, j, key);
902 j++;
905 return v;
908 static PyObject *
909 dict_values(register dictobject *mp)
911 register PyObject *v;
912 register int i, j, n;
914 again:
915 n = mp->ma_used;
916 v = PyList_New(n);
917 if (v == NULL)
918 return NULL;
919 if (n != mp->ma_used) {
920 /* Durnit. The allocations caused the dict to resize.
921 * Just start over, this shouldn't normally happen.
923 Py_DECREF(v);
924 goto again;
926 for (i = 0, j = 0; i <= mp->ma_mask; i++) {
927 if (mp->ma_table[i].me_value != NULL) {
928 PyObject *value = mp->ma_table[i].me_value;
929 Py_INCREF(value);
930 PyList_SET_ITEM(v, j, value);
931 j++;
934 return v;
937 static PyObject *
938 dict_items(register dictobject *mp)
940 register PyObject *v;
941 register int i, j, n;
942 PyObject *item, *key, *value;
944 /* Preallocate the list of tuples, to avoid allocations during
945 * the loop over the items, which could trigger GC, which
946 * could resize the dict. :-(
948 again:
949 n = mp->ma_used;
950 v = PyList_New(n);
951 if (v == NULL)
952 return NULL;
953 for (i = 0; i < n; i++) {
954 item = PyTuple_New(2);
955 if (item == NULL) {
956 Py_DECREF(v);
957 return NULL;
959 PyList_SET_ITEM(v, i, item);
961 if (n != mp->ma_used) {
962 /* Durnit. The allocations caused the dict to resize.
963 * Just start over, this shouldn't normally happen.
965 Py_DECREF(v);
966 goto again;
968 /* Nothing we do below makes any function calls. */
969 for (i = 0, j = 0; i <= mp->ma_mask; i++) {
970 if (mp->ma_table[i].me_value != NULL) {
971 key = mp->ma_table[i].me_key;
972 value = mp->ma_table[i].me_value;
973 item = PyList_GET_ITEM(v, j);
974 Py_INCREF(key);
975 PyTuple_SET_ITEM(item, 0, key);
976 Py_INCREF(value);
977 PyTuple_SET_ITEM(item, 1, value);
978 j++;
981 assert(j == n);
982 return v;
985 static PyObject *
986 dict_update(PyObject *mp, PyObject *other)
988 if (PyDict_Update(mp, other) < 0)
989 return NULL;
990 Py_INCREF(Py_None);
991 return Py_None;
994 /* Update unconditionally replaces existing items.
995 Merge has a 3rd argument 'override'; if set, it acts like Update,
996 otherwise it leaves existing items unchanged.
998 PyDict_{Update,Merge} update/merge from a mapping object.
1000 PyDict_MergeFromSeq2 updates/merges from any iterable object
1001 producing iterable objects of length 2.
1005 PyDict_MergeFromSeq2(PyObject *d, PyObject *seq2, int override)
1007 PyObject *it; /* iter(seq2) */
1008 int i; /* index into seq2 of current element */
1009 PyObject *item; /* seq2[i] */
1010 PyObject *fast; /* item as a 2-tuple or 2-list */
1012 assert(d != NULL);
1013 assert(PyDict_Check(d));
1014 assert(seq2 != NULL);
1016 it = PyObject_GetIter(seq2);
1017 if (it == NULL)
1018 return -1;
1020 for (i = 0; ; ++i) {
1021 PyObject *key, *value;
1022 int n;
1024 fast = NULL;
1025 item = PyIter_Next(it);
1026 if (item == NULL) {
1027 if (PyErr_Occurred())
1028 goto Fail;
1029 break;
1032 /* Convert item to sequence, and verify length 2. */
1033 fast = PySequence_Fast(item, "");
1034 if (fast == NULL) {
1035 if (PyErr_ExceptionMatches(PyExc_TypeError))
1036 PyErr_Format(PyExc_TypeError,
1037 "cannot convert dictionary update "
1038 "sequence element #%d to a sequence",
1040 goto Fail;
1042 n = PySequence_Fast_GET_SIZE(fast);
1043 if (n != 2) {
1044 PyErr_Format(PyExc_ValueError,
1045 "dictionary update sequence element #%d "
1046 "has length %d; 2 is required",
1047 i, n);
1048 goto Fail;
1051 /* Update/merge with this (key, value) pair. */
1052 key = PySequence_Fast_GET_ITEM(fast, 0);
1053 value = PySequence_Fast_GET_ITEM(fast, 1);
1054 if (override || PyDict_GetItem(d, key) == NULL) {
1055 int status = PyDict_SetItem(d, key, value);
1056 if (status < 0)
1057 goto Fail;
1059 Py_DECREF(fast);
1060 Py_DECREF(item);
1063 i = 0;
1064 goto Return;
1065 Fail:
1066 Py_XDECREF(item);
1067 Py_XDECREF(fast);
1068 i = -1;
1069 Return:
1070 Py_DECREF(it);
1071 return i;
1075 PyDict_Update(PyObject *a, PyObject *b)
1077 return PyDict_Merge(a, b, 1);
1081 PyDict_Merge(PyObject *a, PyObject *b, int override)
1083 register PyDictObject *mp, *other;
1084 register int i;
1085 dictentry *entry;
1087 /* We accept for the argument either a concrete dictionary object,
1088 * or an abstract "mapping" object. For the former, we can do
1089 * things quite efficiently. For the latter, we only require that
1090 * PyMapping_Keys() and PyObject_GetItem() be supported.
1092 if (a == NULL || !PyDict_Check(a) || b == NULL) {
1093 PyErr_BadInternalCall();
1094 return -1;
1096 mp = (dictobject*)a;
1097 if (PyDict_Check(b)) {
1098 other = (dictobject*)b;
1099 if (other == mp || other->ma_used == 0)
1100 /* a.update(a) or a.update({}); nothing to do */
1101 return 0;
1102 /* Do one big resize at the start, rather than
1103 * incrementally resizing as we insert new items. Expect
1104 * that there will be no (or few) overlapping keys.
1106 if ((mp->ma_fill + other->ma_used)*3 >= (mp->ma_mask+1)*2) {
1107 if (dictresize(mp, (mp->ma_used + other->ma_used)*3/2) != 0)
1108 return -1;
1110 for (i = 0; i <= other->ma_mask; i++) {
1111 entry = &other->ma_table[i];
1112 if (entry->me_value != NULL &&
1113 (override ||
1114 PyDict_GetItem(a, entry->me_key) == NULL)) {
1115 Py_INCREF(entry->me_key);
1116 Py_INCREF(entry->me_value);
1117 insertdict(mp, entry->me_key, entry->me_hash,
1118 entry->me_value);
1122 else {
1123 /* Do it the generic, slower way */
1124 PyObject *keys = PyMapping_Keys(b);
1125 PyObject *iter;
1126 PyObject *key, *value;
1127 int status;
1129 if (keys == NULL)
1130 /* Docstring says this is equivalent to E.keys() so
1131 * if E doesn't have a .keys() method we want
1132 * AttributeError to percolate up. Might as well
1133 * do the same for any other error.
1135 return -1;
1137 iter = PyObject_GetIter(keys);
1138 Py_DECREF(keys);
1139 if (iter == NULL)
1140 return -1;
1142 for (key = PyIter_Next(iter); key; key = PyIter_Next(iter)) {
1143 if (!override && PyDict_GetItem(a, key) != NULL) {
1144 Py_DECREF(key);
1145 continue;
1147 value = PyObject_GetItem(b, key);
1148 if (value == NULL) {
1149 Py_DECREF(iter);
1150 Py_DECREF(key);
1151 return -1;
1153 status = PyDict_SetItem(a, key, value);
1154 Py_DECREF(key);
1155 Py_DECREF(value);
1156 if (status < 0) {
1157 Py_DECREF(iter);
1158 return -1;
1161 Py_DECREF(iter);
1162 if (PyErr_Occurred())
1163 /* Iterator completed, via error */
1164 return -1;
1166 return 0;
1169 static PyObject *
1170 dict_copy(register dictobject *mp)
1172 return PyDict_Copy((PyObject*)mp);
1175 PyObject *
1176 PyDict_Copy(PyObject *o)
1178 register dictobject *mp;
1179 register int i;
1180 dictobject *copy;
1181 dictentry *entry;
1183 if (o == NULL || !PyDict_Check(o)) {
1184 PyErr_BadInternalCall();
1185 return NULL;
1187 mp = (dictobject *)o;
1188 copy = (dictobject *)PyDict_New();
1189 if (copy == NULL)
1190 return NULL;
1191 if (mp->ma_used > 0) {
1192 if (dictresize(copy, mp->ma_used*3/2) != 0)
1193 return NULL;
1194 for (i = 0; i <= mp->ma_mask; i++) {
1195 entry = &mp->ma_table[i];
1196 if (entry->me_value != NULL) {
1197 Py_INCREF(entry->me_key);
1198 Py_INCREF(entry->me_value);
1199 insertdict(copy, entry->me_key, entry->me_hash,
1200 entry->me_value);
1204 return (PyObject *)copy;
1208 PyDict_Size(PyObject *mp)
1210 if (mp == NULL || !PyDict_Check(mp)) {
1211 PyErr_BadInternalCall();
1212 return 0;
1214 return ((dictobject *)mp)->ma_used;
1217 PyObject *
1218 PyDict_Keys(PyObject *mp)
1220 if (mp == NULL || !PyDict_Check(mp)) {
1221 PyErr_BadInternalCall();
1222 return NULL;
1224 return dict_keys((dictobject *)mp);
1227 PyObject *
1228 PyDict_Values(PyObject *mp)
1230 if (mp == NULL || !PyDict_Check(mp)) {
1231 PyErr_BadInternalCall();
1232 return NULL;
1234 return dict_values((dictobject *)mp);
1237 PyObject *
1238 PyDict_Items(PyObject *mp)
1240 if (mp == NULL || !PyDict_Check(mp)) {
1241 PyErr_BadInternalCall();
1242 return NULL;
1244 return dict_items((dictobject *)mp);
1247 /* Subroutine which returns the smallest key in a for which b's value
1248 is different or absent. The value is returned too, through the
1249 pval argument. Both are NULL if no key in a is found for which b's status
1250 differs. The refcounts on (and only on) non-NULL *pval and function return
1251 values must be decremented by the caller (characterize() increments them
1252 to ensure that mutating comparison and PyDict_GetItem calls can't delete
1253 them before the caller is done looking at them). */
1255 static PyObject *
1256 characterize(dictobject *a, dictobject *b, PyObject **pval)
1258 PyObject *akey = NULL; /* smallest key in a s.t. a[akey] != b[akey] */
1259 PyObject *aval = NULL; /* a[akey] */
1260 int i, cmp;
1262 for (i = 0; i <= a->ma_mask; i++) {
1263 PyObject *thiskey, *thisaval, *thisbval;
1264 if (a->ma_table[i].me_value == NULL)
1265 continue;
1266 thiskey = a->ma_table[i].me_key;
1267 Py_INCREF(thiskey); /* keep alive across compares */
1268 if (akey != NULL) {
1269 cmp = PyObject_RichCompareBool(akey, thiskey, Py_LT);
1270 if (cmp < 0) {
1271 Py_DECREF(thiskey);
1272 goto Fail;
1274 if (cmp > 0 ||
1275 i > a->ma_mask ||
1276 a->ma_table[i].me_value == NULL)
1278 /* Not the *smallest* a key; or maybe it is
1279 * but the compare shrunk the dict so we can't
1280 * find its associated value anymore; or
1281 * maybe it is but the compare deleted the
1282 * a[thiskey] entry.
1284 Py_DECREF(thiskey);
1285 continue;
1289 /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1290 thisaval = a->ma_table[i].me_value;
1291 assert(thisaval);
1292 Py_INCREF(thisaval); /* keep alive */
1293 thisbval = PyDict_GetItem((PyObject *)b, thiskey);
1294 if (thisbval == NULL)
1295 cmp = 0;
1296 else {
1297 /* both dicts have thiskey: same values? */
1298 cmp = PyObject_RichCompareBool(
1299 thisaval, thisbval, Py_EQ);
1300 if (cmp < 0) {
1301 Py_DECREF(thiskey);
1302 Py_DECREF(thisaval);
1303 goto Fail;
1306 if (cmp == 0) {
1307 /* New winner. */
1308 Py_XDECREF(akey);
1309 Py_XDECREF(aval);
1310 akey = thiskey;
1311 aval = thisaval;
1313 else {
1314 Py_DECREF(thiskey);
1315 Py_DECREF(thisaval);
1318 *pval = aval;
1319 return akey;
1321 Fail:
1322 Py_XDECREF(akey);
1323 Py_XDECREF(aval);
1324 *pval = NULL;
1325 return NULL;
1328 static int
1329 dict_compare(dictobject *a, dictobject *b)
1331 PyObject *adiff, *bdiff, *aval, *bval;
1332 int res;
1334 /* Compare lengths first */
1335 if (a->ma_used < b->ma_used)
1336 return -1; /* a is shorter */
1337 else if (a->ma_used > b->ma_used)
1338 return 1; /* b is shorter */
1340 /* Same length -- check all keys */
1341 bdiff = bval = NULL;
1342 adiff = characterize(a, b, &aval);
1343 if (adiff == NULL) {
1344 assert(!aval);
1345 /* Either an error, or a is a subset with the same length so
1346 * must be equal.
1348 res = PyErr_Occurred() ? -1 : 0;
1349 goto Finished;
1351 bdiff = characterize(b, a, &bval);
1352 if (bdiff == NULL && PyErr_Occurred()) {
1353 assert(!bval);
1354 res = -1;
1355 goto Finished;
1357 res = 0;
1358 if (bdiff) {
1359 /* bdiff == NULL "should be" impossible now, but perhaps
1360 * the last comparison done by the characterize() on a had
1361 * the side effect of making the dicts equal!
1363 res = PyObject_Compare(adiff, bdiff);
1365 if (res == 0 && bval != NULL)
1366 res = PyObject_Compare(aval, bval);
1368 Finished:
1369 Py_XDECREF(adiff);
1370 Py_XDECREF(bdiff);
1371 Py_XDECREF(aval);
1372 Py_XDECREF(bval);
1373 return res;
1376 /* Return 1 if dicts equal, 0 if not, -1 if error.
1377 * Gets out as soon as any difference is detected.
1378 * Uses only Py_EQ comparison.
1380 static int
1381 dict_equal(dictobject *a, dictobject *b)
1383 int i;
1385 if (a->ma_used != b->ma_used)
1386 /* can't be equal if # of entries differ */
1387 return 0;
1389 /* Same # of entries -- check all of 'em. Exit early on any diff. */
1390 for (i = 0; i <= a->ma_mask; i++) {
1391 PyObject *aval = a->ma_table[i].me_value;
1392 if (aval != NULL) {
1393 int cmp;
1394 PyObject *bval;
1395 PyObject *key = a->ma_table[i].me_key;
1396 /* temporarily bump aval's refcount to ensure it stays
1397 alive until we're done with it */
1398 Py_INCREF(aval);
1399 bval = PyDict_GetItem((PyObject *)b, key);
1400 if (bval == NULL) {
1401 Py_DECREF(aval);
1402 return 0;
1404 cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
1405 Py_DECREF(aval);
1406 if (cmp <= 0) /* error or not equal */
1407 return cmp;
1410 return 1;
1413 static PyObject *
1414 dict_richcompare(PyObject *v, PyObject *w, int op)
1416 int cmp;
1417 PyObject *res;
1419 if (!PyDict_Check(v) || !PyDict_Check(w)) {
1420 res = Py_NotImplemented;
1422 else if (op == Py_EQ || op == Py_NE) {
1423 cmp = dict_equal((dictobject *)v, (dictobject *)w);
1424 if (cmp < 0)
1425 return NULL;
1426 res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
1428 else
1429 res = Py_NotImplemented;
1430 Py_INCREF(res);
1431 return res;
1434 static PyObject *
1435 dict_has_key(register dictobject *mp, PyObject *key)
1437 long hash;
1438 register long ok;
1439 #ifdef CACHE_HASH
1440 if (!PyString_CheckExact(key) ||
1441 (hash = ((PyStringObject *) key)->ob_shash) == -1)
1442 #endif
1444 hash = PyObject_Hash(key);
1445 if (hash == -1)
1446 return NULL;
1448 ok = (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
1449 return PyInt_FromLong(ok);
1452 static PyObject *
1453 dict_get(register dictobject *mp, PyObject *args)
1455 PyObject *key;
1456 PyObject *failobj = Py_None;
1457 PyObject *val = NULL;
1458 long hash;
1460 if (!PyArg_ParseTuple(args, "O|O:get", &key, &failobj))
1461 return NULL;
1463 #ifdef CACHE_HASH
1464 if (!PyString_CheckExact(key) ||
1465 (hash = ((PyStringObject *) key)->ob_shash) == -1)
1466 #endif
1468 hash = PyObject_Hash(key);
1469 if (hash == -1)
1470 return NULL;
1472 val = (mp->ma_lookup)(mp, key, hash)->me_value;
1474 if (val == NULL)
1475 val = failobj;
1476 Py_INCREF(val);
1477 return val;
1481 static PyObject *
1482 dict_setdefault(register dictobject *mp, PyObject *args)
1484 PyObject *key;
1485 PyObject *failobj = Py_None;
1486 PyObject *val = NULL;
1487 long hash;
1489 if (!PyArg_ParseTuple(args, "O|O:setdefault", &key, &failobj))
1490 return NULL;
1492 #ifdef CACHE_HASH
1493 if (!PyString_CheckExact(key) ||
1494 (hash = ((PyStringObject *) key)->ob_shash) == -1)
1495 #endif
1497 hash = PyObject_Hash(key);
1498 if (hash == -1)
1499 return NULL;
1501 val = (mp->ma_lookup)(mp, key, hash)->me_value;
1502 if (val == NULL) {
1503 val = failobj;
1504 if (PyDict_SetItem((PyObject*)mp, key, failobj))
1505 val = NULL;
1507 Py_XINCREF(val);
1508 return val;
1512 static PyObject *
1513 dict_clear(register dictobject *mp)
1515 PyDict_Clear((PyObject *)mp);
1516 Py_INCREF(Py_None);
1517 return Py_None;
1520 static PyObject *
1521 dict_popitem(dictobject *mp)
1523 int i = 0;
1524 dictentry *ep;
1525 PyObject *res;
1527 /* Allocate the result tuple before checking the size. Believe it
1528 * or not, this allocation could trigger a garbage collection which
1529 * could empty the dict, so if we checked the size first and that
1530 * happened, the result would be an infinite loop (searching for an
1531 * entry that no longer exists). Note that the usual popitem()
1532 * idiom is "while d: k, v = d.popitem()". so needing to throw the
1533 * tuple away if the dict *is* empty isn't a significant
1534 * inefficiency -- possible, but unlikely in practice.
1536 res = PyTuple_New(2);
1537 if (res == NULL)
1538 return NULL;
1539 if (mp->ma_used == 0) {
1540 Py_DECREF(res);
1541 PyErr_SetString(PyExc_KeyError,
1542 "popitem(): dictionary is empty");
1543 return NULL;
1545 /* Set ep to "the first" dict entry with a value. We abuse the hash
1546 * field of slot 0 to hold a search finger:
1547 * If slot 0 has a value, use slot 0.
1548 * Else slot 0 is being used to hold a search finger,
1549 * and we use its hash value as the first index to look.
1551 ep = &mp->ma_table[0];
1552 if (ep->me_value == NULL) {
1553 i = (int)ep->me_hash;
1554 /* The hash field may be a real hash value, or it may be a
1555 * legit search finger, or it may be a once-legit search
1556 * finger that's out of bounds now because it wrapped around
1557 * or the table shrunk -- simply make sure it's in bounds now.
1559 if (i > mp->ma_mask || i < 1)
1560 i = 1; /* skip slot 0 */
1561 while ((ep = &mp->ma_table[i])->me_value == NULL) {
1562 i++;
1563 if (i > mp->ma_mask)
1564 i = 1;
1567 PyTuple_SET_ITEM(res, 0, ep->me_key);
1568 PyTuple_SET_ITEM(res, 1, ep->me_value);
1569 Py_INCREF(dummy);
1570 ep->me_key = dummy;
1571 ep->me_value = NULL;
1572 mp->ma_used--;
1573 assert(mp->ma_table[0].me_value == NULL);
1574 mp->ma_table[0].me_hash = i + 1; /* next place to start */
1575 return res;
1578 static int
1579 dict_traverse(PyObject *op, visitproc visit, void *arg)
1581 int i = 0, err;
1582 PyObject *pk;
1583 PyObject *pv;
1585 while (PyDict_Next(op, &i, &pk, &pv)) {
1586 err = visit(pk, arg);
1587 if (err)
1588 return err;
1589 err = visit(pv, arg);
1590 if (err)
1591 return err;
1593 return 0;
1596 static int
1597 dict_tp_clear(PyObject *op)
1599 PyDict_Clear(op);
1600 return 0;
1604 staticforward PyObject *dictiter_new(dictobject *, binaryfunc);
1606 static PyObject *
1607 select_key(PyObject *key, PyObject *value)
1609 Py_INCREF(key);
1610 return key;
1613 static PyObject *
1614 select_value(PyObject *key, PyObject *value)
1616 Py_INCREF(value);
1617 return value;
1620 static PyObject *
1621 select_item(PyObject *key, PyObject *value)
1623 PyObject *res = PyTuple_New(2);
1625 if (res != NULL) {
1626 Py_INCREF(key);
1627 Py_INCREF(value);
1628 PyTuple_SET_ITEM(res, 0, key);
1629 PyTuple_SET_ITEM(res, 1, value);
1631 return res;
1634 static PyObject *
1635 dict_iterkeys(dictobject *dict)
1637 return dictiter_new(dict, select_key);
1640 static PyObject *
1641 dict_itervalues(dictobject *dict)
1643 return dictiter_new(dict, select_value);
1646 static PyObject *
1647 dict_iteritems(dictobject *dict)
1649 return dictiter_new(dict, select_item);
1653 static char has_key__doc__[] =
1654 "D.has_key(k) -> 1 if D has a key k, else 0";
1656 static char get__doc__[] =
1657 "D.get(k[,d]) -> D[k] if D.has_key(k), else d. d defaults to None.";
1659 static char setdefault_doc__[] =
1660 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if not D.has_key(k)";
1662 static char popitem__doc__[] =
1663 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
1664 2-tuple; but raise KeyError if D is empty";
1666 static char keys__doc__[] =
1667 "D.keys() -> list of D's keys";
1669 static char items__doc__[] =
1670 "D.items() -> list of D's (key, value) pairs, as 2-tuples";
1672 static char values__doc__[] =
1673 "D.values() -> list of D's values";
1675 static char update__doc__[] =
1676 "D.update(E) -> None. Update D from E: for k in E.keys(): D[k] = E[k]";
1678 static char clear__doc__[] =
1679 "D.clear() -> None. Remove all items from D.";
1681 static char copy__doc__[] =
1682 "D.copy() -> a shallow copy of D";
1684 static char iterkeys__doc__[] =
1685 "D.iterkeys() -> an iterator over the keys of D";
1687 static char itervalues__doc__[] =
1688 "D.itervalues() -> an iterator over the values of D";
1690 static char iteritems__doc__[] =
1691 "D.iteritems() -> an iterator over the (key, value) items of D";
1693 static PyMethodDef mapp_methods[] = {
1694 {"has_key", (PyCFunction)dict_has_key, METH_O,
1695 has_key__doc__},
1696 {"get", (PyCFunction)dict_get, METH_VARARGS,
1697 get__doc__},
1698 {"setdefault", (PyCFunction)dict_setdefault, METH_VARARGS,
1699 setdefault_doc__},
1700 {"popitem", (PyCFunction)dict_popitem, METH_NOARGS,
1701 popitem__doc__},
1702 {"keys", (PyCFunction)dict_keys, METH_NOARGS,
1703 keys__doc__},
1704 {"items", (PyCFunction)dict_items, METH_NOARGS,
1705 items__doc__},
1706 {"values", (PyCFunction)dict_values, METH_NOARGS,
1707 values__doc__},
1708 {"update", (PyCFunction)dict_update, METH_O,
1709 update__doc__},
1710 {"clear", (PyCFunction)dict_clear, METH_NOARGS,
1711 clear__doc__},
1712 {"copy", (PyCFunction)dict_copy, METH_NOARGS,
1713 copy__doc__},
1714 {"iterkeys", (PyCFunction)dict_iterkeys, METH_NOARGS,
1715 iterkeys__doc__},
1716 {"itervalues", (PyCFunction)dict_itervalues, METH_NOARGS,
1717 itervalues__doc__},
1718 {"iteritems", (PyCFunction)dict_iteritems, METH_NOARGS,
1719 iteritems__doc__},
1720 {NULL, NULL} /* sentinel */
1723 static int
1724 dict_contains(dictobject *mp, PyObject *key)
1726 long hash;
1728 #ifdef CACHE_HASH
1729 if (!PyString_CheckExact(key) ||
1730 (hash = ((PyStringObject *) key)->ob_shash) == -1)
1731 #endif
1733 hash = PyObject_Hash(key);
1734 if (hash == -1)
1735 return -1;
1737 return (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
1740 /* Hack to implement "key in dict" */
1741 static PySequenceMethods dict_as_sequence = {
1742 0, /* sq_length */
1743 0, /* sq_concat */
1744 0, /* sq_repeat */
1745 0, /* sq_item */
1746 0, /* sq_slice */
1747 0, /* sq_ass_item */
1748 0, /* sq_ass_slice */
1749 (objobjproc)dict_contains, /* sq_contains */
1750 0, /* sq_inplace_concat */
1751 0, /* sq_inplace_repeat */
1754 static PyObject *
1755 dict_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
1757 PyObject *self;
1759 assert(type != NULL && type->tp_alloc != NULL);
1760 self = type->tp_alloc(type, 0);
1761 if (self != NULL) {
1762 PyDictObject *d = (PyDictObject *)self;
1763 /* It's guaranteed that tp->alloc zeroed out the struct. */
1764 assert(d->ma_table == NULL && d->ma_fill == 0 && d->ma_used == 0);
1765 INIT_NONZERO_DICT_SLOTS(d);
1766 d->ma_lookup = lookdict_string;
1767 #ifdef SHOW_CONVERSION_COUNTS
1768 ++created;
1769 #endif
1771 return self;
1774 static int
1775 dict_init(PyObject *self, PyObject *args, PyObject *kwds)
1777 PyObject *arg = NULL;
1778 static char *kwlist[] = {"items", 0};
1779 int result = 0;
1781 if (!PyArg_ParseTupleAndKeywords(args, kwds, "|O:dict",
1782 kwlist, &arg))
1783 result = -1;
1785 else if (arg != NULL) {
1786 if (PyObject_HasAttrString(arg, "keys"))
1787 result = PyDict_Merge(self, arg, 1);
1788 else
1789 result = PyDict_MergeFromSeq2(self, arg, 1);
1791 return result;
1794 static long
1795 dict_nohash(PyObject *self)
1797 PyErr_SetString(PyExc_TypeError, "dict objects are unhashable");
1798 return -1;
1801 static PyObject *
1802 dict_iter(dictobject *dict)
1804 return dictiter_new(dict, select_key);
1807 static char dictionary_doc[] =
1808 "dict() -> new empty dictionary.\n"
1809 "dict(mapping) -> new dictionary initialized from a mapping object's\n"
1810 " (key, value) pairs.\n"
1811 "dict(seq) -> new dictionary initialized as if via:\n"
1812 " d = {}\n"
1813 " for k, v in seq:\n"
1814 " d[k] = v";
1816 PyTypeObject PyDict_Type = {
1817 PyObject_HEAD_INIT(&PyType_Type)
1819 "dict",
1820 sizeof(dictobject),
1822 (destructor)dict_dealloc, /* tp_dealloc */
1823 (printfunc)dict_print, /* tp_print */
1824 0, /* tp_getattr */
1825 0, /* tp_setattr */
1826 (cmpfunc)dict_compare, /* tp_compare */
1827 (reprfunc)dict_repr, /* tp_repr */
1828 0, /* tp_as_number */
1829 &dict_as_sequence, /* tp_as_sequence */
1830 &dict_as_mapping, /* tp_as_mapping */
1831 dict_nohash, /* tp_hash */
1832 0, /* tp_call */
1833 0, /* tp_str */
1834 PyObject_GenericGetAttr, /* tp_getattro */
1835 0, /* tp_setattro */
1836 0, /* tp_as_buffer */
1837 Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
1838 Py_TPFLAGS_BASETYPE, /* tp_flags */
1839 dictionary_doc, /* tp_doc */
1840 (traverseproc)dict_traverse, /* tp_traverse */
1841 (inquiry)dict_tp_clear, /* tp_clear */
1842 dict_richcompare, /* tp_richcompare */
1843 0, /* tp_weaklistoffset */
1844 (getiterfunc)dict_iter, /* tp_iter */
1845 0, /* tp_iternext */
1846 mapp_methods, /* tp_methods */
1847 0, /* tp_members */
1848 0, /* tp_getset */
1849 0, /* tp_base */
1850 0, /* tp_dict */
1851 0, /* tp_descr_get */
1852 0, /* tp_descr_set */
1853 0, /* tp_dictoffset */
1854 (initproc)dict_init, /* tp_init */
1855 PyType_GenericAlloc, /* tp_alloc */
1856 dict_new, /* tp_new */
1857 _PyObject_GC_Del, /* tp_free */
1860 /* For backward compatibility with old dictionary interface */
1862 PyObject *
1863 PyDict_GetItemString(PyObject *v, char *key)
1865 PyObject *kv, *rv;
1866 kv = PyString_FromString(key);
1867 if (kv == NULL)
1868 return NULL;
1869 rv = PyDict_GetItem(v, kv);
1870 Py_DECREF(kv);
1871 return rv;
1875 PyDict_SetItemString(PyObject *v, char *key, PyObject *item)
1877 PyObject *kv;
1878 int err;
1879 kv = PyString_FromString(key);
1880 if (kv == NULL)
1881 return -1;
1882 PyString_InternInPlace(&kv); /* XXX Should we really? */
1883 err = PyDict_SetItem(v, kv, item);
1884 Py_DECREF(kv);
1885 return err;
1889 PyDict_DelItemString(PyObject *v, char *key)
1891 PyObject *kv;
1892 int err;
1893 kv = PyString_FromString(key);
1894 if (kv == NULL)
1895 return -1;
1896 err = PyDict_DelItem(v, kv);
1897 Py_DECREF(kv);
1898 return err;
1901 /* Dictionary iterator type */
1903 extern PyTypeObject PyDictIter_Type; /* Forward */
1905 typedef struct {
1906 PyObject_HEAD
1907 dictobject *di_dict;
1908 int di_used;
1909 int di_pos;
1910 binaryfunc di_select;
1911 } dictiterobject;
1913 static PyObject *
1914 dictiter_new(dictobject *dict, binaryfunc select)
1916 dictiterobject *di;
1917 di = PyObject_NEW(dictiterobject, &PyDictIter_Type);
1918 if (di == NULL)
1919 return NULL;
1920 Py_INCREF(dict);
1921 di->di_dict = dict;
1922 di->di_used = dict->ma_used;
1923 di->di_pos = 0;
1924 di->di_select = select;
1925 return (PyObject *)di;
1928 static void
1929 dictiter_dealloc(dictiterobject *di)
1931 Py_DECREF(di->di_dict);
1932 PyObject_DEL(di);
1935 static PyObject *
1936 dictiter_next(dictiterobject *di, PyObject *args)
1938 PyObject *key, *value;
1940 if (di->di_used != di->di_dict->ma_used) {
1941 PyErr_SetString(PyExc_RuntimeError,
1942 "dictionary changed size during iteration");
1943 return NULL;
1945 if (PyDict_Next((PyObject *)(di->di_dict), &di->di_pos, &key, &value)) {
1946 return (*di->di_select)(key, value);
1948 PyErr_SetObject(PyExc_StopIteration, Py_None);
1949 return NULL;
1952 static PyObject *
1953 dictiter_getiter(PyObject *it)
1955 Py_INCREF(it);
1956 return it;
1959 static PyMethodDef dictiter_methods[] = {
1960 {"next", (PyCFunction)dictiter_next, METH_VARARGS,
1961 "it.next() -- get the next value, or raise StopIteration"},
1962 {NULL, NULL} /* sentinel */
1965 static PyObject *dictiter_iternext(dictiterobject *di)
1967 PyObject *key, *value;
1969 if (di->di_used != di->di_dict->ma_used) {
1970 PyErr_SetString(PyExc_RuntimeError,
1971 "dictionary changed size during iteration");
1972 return NULL;
1974 if (PyDict_Next((PyObject *)(di->di_dict), &di->di_pos, &key, &value)) {
1975 return (*di->di_select)(key, value);
1977 return NULL;
1980 PyTypeObject PyDictIter_Type = {
1981 PyObject_HEAD_INIT(&PyType_Type)
1982 0, /* ob_size */
1983 "dictionary-iterator", /* tp_name */
1984 sizeof(dictiterobject), /* tp_basicsize */
1985 0, /* tp_itemsize */
1986 /* methods */
1987 (destructor)dictiter_dealloc, /* tp_dealloc */
1988 0, /* tp_print */
1989 0, /* tp_getattr */
1990 0, /* tp_setattr */
1991 0, /* tp_compare */
1992 0, /* tp_repr */
1993 0, /* tp_as_number */
1994 0, /* tp_as_sequence */
1995 0, /* tp_as_mapping */
1996 0, /* tp_hash */
1997 0, /* tp_call */
1998 0, /* tp_str */
1999 PyObject_GenericGetAttr, /* tp_getattro */
2000 0, /* tp_setattro */
2001 0, /* tp_as_buffer */
2002 Py_TPFLAGS_DEFAULT, /* tp_flags */
2003 0, /* tp_doc */
2004 0, /* tp_traverse */
2005 0, /* tp_clear */
2006 0, /* tp_richcompare */
2007 0, /* tp_weaklistoffset */
2008 (getiterfunc)dictiter_getiter, /* tp_iter */
2009 (iternextfunc)dictiter_iternext, /* tp_iternext */
2010 dictiter_methods, /* tp_methods */
2011 0, /* tp_members */
2012 0, /* tp_getset */
2013 0, /* tp_base */
2014 0, /* tp_dict */
2015 0, /* tp_descr_get */
2016 0, /* tp_descr_set */