[python/dscho.git] / Objects / dictobject.c

/* Dictionary object implementation using a hash table */

#include "Python.h"

typedef PyDictEntry dictentry;
typedef PyDictObject dictobject;

/* Define this out if you don't want conversion statistics on exit. */
#undef SHOW_CONVERSION_COUNTS

/* See large comment block below. This must be >= 1. */
#define PERTURB_SHIFT 5

/*
Major subtleties ahead: Most hash schemes depend on having a "good" hash
function, in the sense of simulating randomness. Python doesn't: its most
important hash functions (for strings and ints) are very regular in common
cases:

  >>> map(hash, (0, 1, 2, 3))
  [0, 1, 2, 3]
  >>> map(hash, ("namea", "nameb", "namec", "named"))
  [-1658398457, -1658398460, -1658398459, -1658398462]
  >>>

This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
the low-order i bits as the initial table index is extremely fast, and there
are no collisions at all for dicts indexed by a contiguous range of ints.
The same is approximately true when keys are "consecutive" strings. So this
gives better-than-random behavior in common cases, and that's very desirable.

OTOH, when collisions occur, the tendency to fill contiguous slices of the
hash table makes a good collision resolution strategy crucial. Taking only
the last i bits of the hash code is also vulnerable: for example, consider
[i << 16 for i in range(20000)] as a set of keys. Since ints are their own
hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
hash code are all 0: they *all* map to the same table index.

But catering to unusual cases should not slow the usual ones, so we just take
the last i bits anyway. It's up to collision resolution to do the rest. If
we *usually* find the key we're looking for on the first try (and, it turns
out, we usually do -- the table load factor is kept under 2/3, so the odds
are solidly in our favor), then it makes best sense to keep the initial index
computation dirt cheap.

The first half of collision resolution is to visit table indices via this
recurrence:

    j = ((5*j) + 1) mod 2**i

For any initial j in range(2**i), repeating that 2**i times generates each
int in range(2**i) exactly once (see any text on random-number generation for
proof). By itself, this doesn't help much: like linear probing (setting
j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
order. This would be bad, except that's not the only thing we do, and it's
actually *good* in the common cases where hash keys are consecutive. In an
example that's really too small to make this entirely clear, for a table of
size 2**3 the order of indices is:

    0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]

If two things come in at index 5, the first place we look after is index 2,
not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
Linear probing is deadly in this case because there the fixed probe order
is the *same* as the order consecutive keys are likely to arrive. But it's
extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
and certain that consecutive hash codes do not.

The other half of the strategy is to get the other bits of the hash code
into play. This is done by initializing an unsigned variable "perturb" to the
full hash code, and changing the recurrence to:

    j = (5*j) + 1 + perturb;
    perturb >>= PERTURB_SHIFT;
    use j % 2**i as the next table index;

Now the probe sequence depends (eventually) on every bit in the hash code,
and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
because it quickly magnifies small differences in the bits that didn't affect
the initial index. Note that because perturb is unsigned, if the recurrence
is executed often enough perturb eventually becomes and remains 0. At that
point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
that's certain to find an empty slot eventually (since it generates every int
in range(2**i), and we make sure there's always at least one empty slot).

Selecting a good value for PERTURB_SHIFT is a balancing act. You want it
small so that the high bits of the hash code continue to affect the probe
sequence across iterations; but you want it large so that in really bad cases
the high-order hash bits have an effect on early iterations. 5 was "the
best" in minimizing total collisions across experiments Tim Peters ran (on
both normal and pathological cases), but 4 and 6 weren't significantly worse.

Historical: Reimer Behrends contributed the idea of using a polynomial-based
approach, using repeated multiplication by x in GF(2**n) where an irreducible
polynomial for each table size was chosen such that x was a primitive root.
Christian Tismer later extended that to use division by x instead, as an
efficient way to get the high bits of the hash code into play. This scheme
also gave excellent collision statistics, but was more expensive: two
if-tests were required inside the loop; computing "the next" index took about
the same number of operations but without as much potential parallelism
(e.g., computing 5*j can go on at the same time as computing 1+perturb in the
above, and then shifting perturb can be done while the table index is being
masked); and the dictobject struct required a member to hold the table's
polynomial. In Tim's experiments the current scheme ran faster, produced
equally good collision statistics, needed less code & used less memory.
*/

/* Object used as dummy key to fill deleted entries */
static PyObject *dummy; /* Initialized by first call to PyDict_New() */

/* forward declarations */
static dictentry *
lookdict_string(dictobject *mp, PyObject *key, long hash);

#ifdef SHOW_CONVERSION_COUNTS
static long created = 0L;
static long converted = 0L;

static void
show_counts(void)
{
    fprintf(stderr, "created %ld string dicts\n", created);
    fprintf(stderr, "converted %ld to normal dicts\n", converted);
    fprintf(stderr, "%.2f%% conversion rate\n", (100.0*converted)/created);
}
#endif

/* Initialization macros.
   There are two ways to create a dict: PyDict_New() is the main C API
   function, and the tp_new slot maps to dict_new(). In the latter case we
   can save a little time over what PyDict_New does because it's guaranteed
   that the PyDictObject struct is already zeroed out.
   Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
   an excellent reason not to).
*/

#define INIT_NONZERO_DICT_SLOTS(mp) do { \
    (mp)->ma_table = (mp)->ma_smalltable; \
    (mp)->ma_mask = PyDict_MINSIZE - 1; \
    } while(0)

#define EMPTY_TO_MINSIZE(mp) do { \
    memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable)); \
    (mp)->ma_used = (mp)->ma_fill = 0; \
    INIT_NONZERO_DICT_SLOTS(mp); \
    } while(0)

PyObject *
PyDict_New(void)
{
    register dictobject *mp;
    if (dummy == NULL) { /* Auto-initialize dummy */
        dummy = PyString_FromString("<dummy key>");
        if (dummy == NULL)
            return NULL;
#ifdef SHOW_CONVERSION_COUNTS
        Py_AtExit(show_counts);
#endif
    }
    mp = PyObject_NEW(dictobject, &PyDict_Type);
    if (mp == NULL)
        return NULL;
    EMPTY_TO_MINSIZE(mp);
    mp->ma_lookup = lookdict_string;
#ifdef SHOW_CONVERSION_COUNTS
    ++created;
#endif
    PyObject_GC_Init(mp);
    return (PyObject *)mp;
}

/*
The basic lookup function used by all operations.
This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
Open addressing is preferred over chaining since the link overhead for
chaining would be substantial (100% with typical malloc overhead).

The initial probe index is computed as hash mod the table size. Subsequent
probe indices are computed as explained earlier.

All arithmetic on hash should ignore overflow.

(The details in this version are due to Tim Peters, building on many past
contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
Christian Tismer).

This function must never return NULL; failures are indicated by returning
a dictentry* for which the me_value field is NULL. Exceptions are never
reported by this function, and outstanding exceptions are maintained.
*/

static dictentry *
lookdict(dictobject *mp, PyObject *key, register long hash)
{
    register int i;
    register unsigned int perturb;
    register dictentry *freeslot;
    register unsigned int mask = mp->ma_mask;
    dictentry *ep0 = mp->ma_table;
    register dictentry *ep;
    register int restore_error;
    register int checked_error;
    register int cmp;
    PyObject *err_type, *err_value, *err_tb;
    PyObject *startkey;

    i = hash & mask;
    ep = &ep0[i];
    if (ep->me_key == NULL || ep->me_key == key)
        return ep;

    restore_error = checked_error = 0;
    if (ep->me_key == dummy)
        freeslot = ep;
    else {
        if (ep->me_hash == hash) {
            /* error can't have been checked yet */
            checked_error = 1;
            if (PyErr_Occurred()) {
                restore_error = 1;
                PyErr_Fetch(&err_type, &err_value, &err_tb);
            }
            startkey = ep->me_key;
            cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
            if (cmp < 0)
                PyErr_Clear();
            if (ep0 == mp->ma_table && ep->me_key == startkey) {
                if (cmp > 0)
                    goto Done;
            }
            else {
                /* The compare did major nasty stuff to the
                 * dict: start over.
                 * XXX A clever adversary could prevent this
                 * XXX from terminating.
                 */
                ep = lookdict(mp, key, hash);
                goto Done;
            }
        }
        freeslot = NULL;
    }

    /* In the loop, me_key == dummy is by far (factor of 100s) the
       least likely outcome, so test for that last. */
    for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
        i = (i << 2) + i + perturb + 1;
        ep = &ep0[i & mask];
        if (ep->me_key == NULL) {
            if (freeslot != NULL)
                ep = freeslot;
            break;
        }
        if (ep->me_key == key)
            break;
        if (ep->me_hash == hash && ep->me_key != dummy) {
            if (!checked_error) {
                checked_error = 1;
                if (PyErr_Occurred()) {
                    restore_error = 1;
                    PyErr_Fetch(&err_type, &err_value,
                                &err_tb);
                }
            }
            startkey = ep->me_key;
            cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
            if (cmp < 0)
                PyErr_Clear();
            if (ep0 == mp->ma_table && ep->me_key == startkey) {
                if (cmp > 0)
                    break;
            }
            else {
                /* The compare did major nasty stuff to the
                 * dict: start over.
                 * XXX A clever adversary could prevent this
                 * XXX from terminating.
                 */
                ep = lookdict(mp, key, hash);
                break;
            }
        }
        else if (ep->me_key == dummy && freeslot == NULL)
            freeslot = ep;
    }

Done:
    if (restore_error)
        PyErr_Restore(err_type, err_value, err_tb);
    return ep;
}

/*
 * Hacked up version of lookdict which can assume keys are always strings;
 * this assumption allows testing for errors during
 * PyObject_RichCompareBool() to be dropped; string-string comparisons never
 * raise exceptions. This also means we don't need to go through
 * PyObject_RichCompareBool(); we can always use _PyString_Eq directly.
 *
 * This really only becomes meaningful if proper error handling in lookdict()
 * is too expensive.
 */
static dictentry *
lookdict_string(dictobject *mp, PyObject *key, register long hash)
{
    register int i;
    register unsigned int perturb;
    register dictentry *freeslot;
    register unsigned int mask = mp->ma_mask;
    dictentry *ep0 = mp->ma_table;
    register dictentry *ep;

    /* make sure this function doesn't have to handle non-string keys */
    if (!PyString_Check(key)) {
#ifdef SHOW_CONVERSION_COUNTS
        ++converted;
#endif
        mp->ma_lookup = lookdict;
        return lookdict(mp, key, hash);
    }
    i = hash & mask;
    ep = &ep0[i];
    if (ep->me_key == NULL || ep->me_key == key)
        return ep;
    if (ep->me_key == dummy)
        freeslot = ep;
    else {
        if (ep->me_hash == hash
            && _PyString_Eq(ep->me_key, key)) {
            return ep;
        }
        freeslot = NULL;
    }

    /* In the loop, me_key == dummy is by far (factor of 100s) the
       least likely outcome, so test for that last. */
    for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
        i = (i << 2) + i + perturb + 1;
        ep = &ep0[i & mask];
        if (ep->me_key == NULL)
            return freeslot == NULL ? ep : freeslot;
        if (ep->me_key == key
            || (ep->me_hash == hash
                && ep->me_key != dummy
                && _PyString_Eq(ep->me_key, key)))
            return ep;
        if (ep->me_key == dummy && freeslot == NULL)
            freeslot = ep;
    }
}

/*
Internal routine to insert a new item into the table.
Used both by the internal resize routine and by the public insert routine.
Eats a reference to key and one to value.
*/
static void
insertdict(register dictobject *mp, PyObject *key, long hash, PyObject *value)
{
    PyObject *old_value;
    register dictentry *ep;
    typedef PyDictEntry *(*lookupfunc)(PyDictObject *, PyObject *, long);

    assert(mp->ma_lookup != NULL);
    ep = mp->ma_lookup(mp, key, hash);
    if (ep->me_value != NULL) {
        old_value = ep->me_value;
        ep->me_value = value;
        Py_DECREF(old_value); /* which **CAN** re-enter */
        Py_DECREF(key);
    }
    else {
        if (ep->me_key == NULL)
            mp->ma_fill++;
        else
            Py_DECREF(ep->me_key);
        ep->me_key = key;
        ep->me_hash = hash;
        ep->me_value = value;
        mp->ma_used++;
    }
}

/*
Restructure the table by allocating a new table and reinserting all
items again. When entries have been deleted, the new table may
actually be smaller than the old one.
*/
static int
dictresize(dictobject *mp, int minused)
{
    int newsize;
    dictentry *oldtable, *newtable, *ep;
    int i;
    int is_oldtable_malloced;
    dictentry small_copy[PyDict_MINSIZE];

    assert(minused >= 0);

    /* Find the smallest table size > minused. */
    for (newsize = PyDict_MINSIZE;
         newsize <= minused && newsize > 0;
         newsize <<= 1)
        ;
    if (newsize <= 0) {
        PyErr_NoMemory();
        return -1;
    }

    /* Get space for a new table. */
    oldtable = mp->ma_table;
    assert(oldtable != NULL);
    is_oldtable_malloced = oldtable != mp->ma_smalltable;

    if (newsize == PyDict_MINSIZE) {
        /* A large table is shrinking, or we can't get any smaller. */
        newtable = mp->ma_smalltable;
        if (newtable == oldtable) {
            if (mp->ma_fill == mp->ma_used) {
                /* No dummies, so no point doing anything. */
                return 0;
            }
            /* We're not going to resize it, but rebuild the
               table anyway to purge old dummy entries.
               Subtle: This is *necessary* if fill==size,
               as lookdict needs at least one virgin slot to
               terminate failing searches. If fill < size, it's
               merely desirable, as dummies slow searches. */
            assert(mp->ma_fill > mp->ma_used);
            memcpy(small_copy, oldtable, sizeof(small_copy));
            oldtable = small_copy;
        }
    }
    else {
        newtable = PyMem_NEW(dictentry, newsize);
        if (newtable == NULL) {
            PyErr_NoMemory();
            return -1;
        }
    }

    /* Make the dict empty, using the new table. */
    assert(newtable != oldtable);
    mp->ma_table = newtable;
    mp->ma_mask = newsize - 1;
    memset(newtable, 0, sizeof(dictentry) * newsize);
    mp->ma_used = 0;
    i = mp->ma_fill;
    mp->ma_fill = 0;

    /* Copy the data over; this is refcount-neutral for active entries;
       dummy entries aren't copied over, of course */
    for (ep = oldtable; i > 0; ep++) {
        if (ep->me_value != NULL) { /* active entry */
            --i;
            insertdict(mp, ep->me_key, ep->me_hash, ep->me_value);
        }
        else if (ep->me_key != NULL) { /* dummy entry */
            --i;
            assert(ep->me_key == dummy);
            Py_DECREF(ep->me_key);
        }
        /* else key == value == NULL: nothing to do */
    }

    if (is_oldtable_malloced)
        PyMem_DEL(oldtable);
    return 0;
}

PyObject *
PyDict_GetItem(PyObject *op, PyObject *key)
{
    long hash;
    dictobject *mp = (dictobject *)op;
    if (!PyDict_Check(op)) {
        return NULL;
    }
#ifdef CACHE_HASH
    if (!PyString_Check(key) ||
        (hash = ((PyStringObject *) key)->ob_shash) == -1)
#endif
    {
        hash = PyObject_Hash(key);
        if (hash == -1) {
            PyErr_Clear();
            return NULL;
        }
    }
    return (mp->ma_lookup)(mp, key, hash)->me_value;
}

/* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
 * dictionary if it is merely replacing the value for an existing key.
 * This means that it's safe to loop over a dictionary with
 * PyDict_Next() and occasionally replace a value -- but you can't
 * insert new keys or remove them.
 */
int
PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
{
    register dictobject *mp;
    register long hash;
    register int n_used;

    if (!PyDict_Check(op)) {
        PyErr_BadInternalCall();
        return -1;
    }
    mp = (dictobject *)op;
#ifdef CACHE_HASH
    if (PyString_Check(key)) {
#ifdef INTERN_STRINGS
        if (((PyStringObject *)key)->ob_sinterned != NULL) {
            key = ((PyStringObject *)key)->ob_sinterned;
            hash = ((PyStringObject *)key)->ob_shash;
        }
        else
#endif
        {
            hash = ((PyStringObject *)key)->ob_shash;
            if (hash == -1)
                hash = PyObject_Hash(key);
        }
    }
    else
#endif
    {
        hash = PyObject_Hash(key);
        if (hash == -1)
            return -1;
    }
    assert(mp->ma_fill <= mp->ma_mask); /* at least one empty slot */
    n_used = mp->ma_used;
    Py_INCREF(value);
    Py_INCREF(key);
    insertdict(mp, key, hash, value);
    /* If we added a key, we can safely resize. Otherwise skip this!
     * If fill >= 2/3 size, adjust size. Normally, this doubles the
     * size, but it's also possible for the dict to shrink (if ma_fill is
     * much larger than ma_used, meaning a lot of dict keys have been
     * deleted).
     */
    if (mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2) {
        if (dictresize(mp, mp->ma_used*2) != 0)
            return -1;
    }
    return 0;
}

int
PyDict_DelItem(PyObject *op, PyObject *key)
{
    register dictobject *mp;
    register long hash;
    register dictentry *ep;
    PyObject *old_value, *old_key;

    if (!PyDict_Check(op)) {
        PyErr_BadInternalCall();
        return -1;
    }
#ifdef CACHE_HASH
    if (!PyString_Check(key) ||
        (hash = ((PyStringObject *) key)->ob_shash) == -1)
#endif
    {
        hash = PyObject_Hash(key);
        if (hash == -1)
            return -1;
    }
    mp = (dictobject *)op;
    ep = (mp->ma_lookup)(mp, key, hash);
    if (ep->me_value == NULL) {
        PyErr_SetObject(PyExc_KeyError, key);
        return -1;
    }
    old_key = ep->me_key;
    Py_INCREF(dummy);
    ep->me_key = dummy;
    old_value = ep->me_value;
    ep->me_value = NULL;
    mp->ma_used--;
    Py_DECREF(old_value);
    Py_DECREF(old_key);
    return 0;
}

void
PyDict_Clear(PyObject *op)
{
    dictobject *mp;
    dictentry *ep, *table;
    int table_is_malloced;
    int fill;
    dictentry small_copy[PyDict_MINSIZE];
#ifdef Py_DEBUG
    int i, n;
#endif

    if (!PyDict_Check(op))
        return;
    mp = (dictobject *)op;
#ifdef Py_DEBUG
    n = mp->ma_mask + 1;
    i = 0;
#endif

    table = mp->ma_table;
    assert(table != NULL);
    table_is_malloced = table != mp->ma_smalltable;

    /* This is delicate. During the process of clearing the dict,
     * decrefs can cause the dict to mutate. To avoid fatal confusion
     * (voice of experience), we have to make the dict empty before
     * clearing the slots, and never refer to anything via mp->xxx while
     * clearing.
     */
    fill = mp->ma_fill;
    if (table_is_malloced)
        EMPTY_TO_MINSIZE(mp);

    else if (fill > 0) {
        /* It's a small table with something that needs to be cleared.
         * Afraid the only safe way is to copy the dict entries into
         * another small table first.
         */
        memcpy(small_copy, table, sizeof(small_copy));
        table = small_copy;
        EMPTY_TO_MINSIZE(mp);
    }
    /* else it's a small table that's already empty */

    /* Now we can finally clear things. If C had refcounts, we could
     * assert that the refcount on table is 1 now, i.e. that this function
     * has unique access to it, so decref side-effects can't alter it.
     */
    for (ep = table; fill > 0; ++ep) {
#ifdef Py_DEBUG
        assert(i < n);
        ++i;
#endif
        if (ep->me_key) {
            --fill;
            Py_DECREF(ep->me_key);
            Py_XDECREF(ep->me_value);
        }
#ifdef Py_DEBUG
        else
            assert(ep->me_value == NULL);
#endif
    }

    if (table_is_malloced)
        PyMem_DEL(table);
}

/* CAUTION: In general, it isn't safe to use PyDict_Next in a loop that
 * mutates the dict. One exception: it is safe if the loop merely changes
 * the values associated with the keys (but doesn't insert new keys or
 * delete keys), via PyDict_SetItem().
 */
int
PyDict_Next(PyObject *op, int *ppos, PyObject **pkey, PyObject **pvalue)
{
    int i;
    register dictobject *mp;
    if (!PyDict_Check(op))
        return 0;
    mp = (dictobject *)op;
    i = *ppos;
    if (i < 0)
        return 0;
    while (i <= mp->ma_mask && mp->ma_table[i].me_value == NULL)
        i++;
    *ppos = i+1;
    if (i > mp->ma_mask)
        return 0;
    if (pkey)
        *pkey = mp->ma_table[i].me_key;
    if (pvalue)
        *pvalue = mp->ma_table[i].me_value;
    return 1;
}

/* Methods */

static void
dict_dealloc(register dictobject *mp)
{
    register dictentry *ep;
    int fill = mp->ma_fill;
    Py_TRASHCAN_SAFE_BEGIN(mp)
    PyObject_GC_Fini(mp);
    for (ep = mp->ma_table; fill > 0; ep++) {
        if (ep->me_key) {
            --fill;
            Py_DECREF(ep->me_key);
            Py_XDECREF(ep->me_value);
        }
    }
    if (mp->ma_table != mp->ma_smalltable)
        PyMem_DEL(mp->ma_table);
    mp = (dictobject *) PyObject_AS_GC(mp);
    PyObject_DEL(mp);
    Py_TRASHCAN_SAFE_END(mp)
}

static int
dict_print(register dictobject *mp, register FILE *fp, register int flags)
{
    register int i;
    register int any;

    i = Py_ReprEnter((PyObject*)mp);
    if (i != 0) {
        if (i < 0)
            return i;
        fprintf(fp, "{...}");
        return 0;
    }

    fprintf(fp, "{");
    any = 0;
    for (i = 0; i <= mp->ma_mask; i++) {
        dictentry *ep = mp->ma_table + i;
        PyObject *pvalue = ep->me_value;
        if (pvalue != NULL) {
            /* Prevent PyObject_Repr from deleting value during
               key format */
            Py_INCREF(pvalue);
            if (any++ > 0)
                fprintf(fp, ", ");
            if (PyObject_Print((PyObject *)ep->me_key, fp, 0)!=0) {
                Py_DECREF(pvalue);
                Py_ReprLeave((PyObject*)mp);
                return -1;
            }
            fprintf(fp, ": ");
            if (PyObject_Print(pvalue, fp, 0) != 0) {
                Py_DECREF(pvalue);
                Py_ReprLeave((PyObject*)mp);
                return -1;
            }
            Py_DECREF(pvalue);
        }
    }
    fprintf(fp, "}");
    Py_ReprLeave((PyObject*)mp);
    return 0;
}

static PyObject *
dict_repr(dictobject *mp)
{
    int i;
    PyObject *s, *temp, *colon = NULL;
    PyObject *pieces = NULL, *result = NULL;
    PyObject *key, *value;

    i = Py_ReprEnter((PyObject *)mp);
    if (i != 0) {
        return i > 0 ? PyString_FromString("{...}") : NULL;
    }

    if (mp->ma_used == 0) {
        result = PyString_FromString("{}");
        goto Done;
    }

    pieces = PyList_New(0);
    if (pieces == NULL)
        goto Done;

    colon = PyString_FromString(": ");
    if (colon == NULL)
        goto Done;

    /* Do repr() on each key+value pair, and insert ": " between them.
       Note that repr may mutate the dict. */
    i = 0;
    while (PyDict_Next((PyObject *)mp, &i, &key, &value)) {
        int status;
        /* Prevent repr from deleting value during key format. */
        Py_INCREF(value);
        s = PyObject_Repr(key);
        PyString_Concat(&s, colon);
        PyString_ConcatAndDel(&s, PyObject_Repr(value));
        Py_DECREF(value);
        if (s == NULL)
            goto Done;
        status = PyList_Append(pieces, s);
        Py_DECREF(s); /* append created a new ref */
        if (status < 0)
            goto Done;
    }

    /* Add "{}" decorations to the first and last items. */
    assert(PyList_GET_SIZE(pieces) > 0);
    s = PyString_FromString("{");
    if (s == NULL)
        goto Done;
    temp = PyList_GET_ITEM(pieces, 0);
    PyString_ConcatAndDel(&s, temp);
    PyList_SET_ITEM(pieces, 0, s);
    if (s == NULL)
        goto Done;

    s = PyString_FromString("}");
    if (s == NULL)
        goto Done;
    temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1);
    PyString_ConcatAndDel(&temp, s);
    PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp);
    if (temp == NULL)
        goto Done;

    /* Paste them all together with ", " between. */
    s = PyString_FromString(", ");
    if (s == NULL)
        goto Done;
    result = _PyString_Join(s, pieces);
    Py_DECREF(s);

Done:
    Py_XDECREF(pieces);
    Py_XDECREF(colon);
    Py_ReprLeave((PyObject *)mp);
    return result;
}

static int
dict_length(dictobject *mp)
{
    return mp->ma_used;
}

static PyObject *
dict_subscript(dictobject *mp, register PyObject *key)
{
    PyObject *v;
    long hash;
    assert(mp->ma_table != NULL);
#ifdef CACHE_HASH
    if (!PyString_Check(key) ||
        (hash = ((PyStringObject *) key)->ob_shash) == -1)
#endif
    {
        hash = PyObject_Hash(key);
        if (hash == -1)
            return NULL;
    }
    v = (mp->ma_lookup)(mp, key, hash) -> me_value;
    if (v == NULL)
        PyErr_SetObject(PyExc_KeyError, key);
    else
        Py_INCREF(v);
    return v;
}

static int
dict_ass_sub(dictobject *mp, PyObject *v, PyObject *w)
{
    if (w == NULL)
        return PyDict_DelItem((PyObject *)mp, v);
    else
        return PyDict_SetItem((PyObject *)mp, v, w);
}

static PyMappingMethods dict_as_mapping = {
    (inquiry)dict_length, /*mp_length*/
    (binaryfunc)dict_subscript, /*mp_subscript*/
    (objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
};

static PyObject *
dict_keys(register dictobject *mp, PyObject *args)
{
    register PyObject *v;
    register int i, j, n;

    if (!PyArg_NoArgs(args))
        return NULL;
  again:
    n = mp->ma_used;
    v = PyList_New(n);
    if (v == NULL)
        return NULL;
    if (n != mp->ma_used) {
        /* Durnit. The allocations caused the dict to resize.
         * Just start over, this shouldn't normally happen.
         */
        Py_DECREF(v);
        goto again;
    }
    for (i = 0, j = 0; i <= mp->ma_mask; i++) {
        if (mp->ma_table[i].me_value != NULL) {
            PyObject *key = mp->ma_table[i].me_key;
            Py_INCREF(key);
            PyList_SET_ITEM(v, j, key);
            j++;
        }
    }
    return v;
}

static PyObject *
dict_values(register dictobject *mp, PyObject *args)
{
    register PyObject *v;
    register int i, j, n;

    if (!PyArg_NoArgs(args))
        return NULL;
  again:
    n = mp->ma_used;
    v = PyList_New(n);
    if (v == NULL)
        return NULL;
    if (n != mp->ma_used) {
        /* Durnit. The allocations caused the dict to resize.
         * Just start over, this shouldn't normally happen.
         */
        Py_DECREF(v);
        goto again;
    }
    for (i = 0, j = 0; i <= mp->ma_mask; i++) {
        if (mp->ma_table[i].me_value != NULL) {
            PyObject *value = mp->ma_table[i].me_value;
            Py_INCREF(value);
            PyList_SET_ITEM(v, j, value);
            j++;
        }
    }
    return v;
}

static PyObject *
dict_items(register dictobject *mp, PyObject *args)
{
    register PyObject *v;
    register int i, j, n;
    PyObject *item, *key, *value;

    if (!PyArg_NoArgs(args))
        return NULL;
    /* Preallocate the list of tuples, to avoid allocations during
     * the loop over the items, which could trigger GC, which
     * could resize the dict. :-(
     */
  again:
    n = mp->ma_used;
    v = PyList_New(n);
    if (v == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        item = PyTuple_New(2);
        if (item == NULL) {
            Py_DECREF(v);
            return NULL;
        }
        PyList_SET_ITEM(v, i, item);
    }
    if (n != mp->ma_used) {
        /* Durnit. The allocations caused the dict to resize.
         * Just start over, this shouldn't normally happen.
         */
        Py_DECREF(v);
        goto again;
    }
    /* Nothing we do below makes any function calls. */
    for (i = 0, j = 0; i <= mp->ma_mask; i++) {
        if (mp->ma_table[i].me_value != NULL) {
            key = mp->ma_table[i].me_key;
            value = mp->ma_table[i].me_value;
            item = PyList_GET_ITEM(v, j);
            Py_INCREF(key);
            PyTuple_SET_ITEM(item, 0, key);
            Py_INCREF(value);
            PyTuple_SET_ITEM(item, 1, value);
            j++;
        }
    }
    assert(j == n);
    return v;
}

static PyObject *
dict_update(PyObject *mp, PyObject *args)
{
    PyObject *other;

    if (!PyArg_ParseTuple(args, "O:update", &other))
        return NULL;
    if (PyDict_Update(mp, other) < 0)
        return NULL;
    Py_INCREF(Py_None);
    return Py_None;
}

/* Update unconditionally replaces existing items.
   Merge has a 3rd argument 'override'; if set, it acts like Update,
   otherwise it leaves existing items unchanged. */

int
PyDict_Update(PyObject *a, PyObject *b)
{
    return PyDict_Merge(a, b, 1);
}

int
PyDict_Merge(PyObject *a, PyObject *b, int override)
{
    register PyDictObject *mp, *other;
    register int i;
    dictentry *entry;

    /* We accept for the argument either a concrete dictionary object,
     * or an abstract "mapping" object. For the former, we can do
     * things quite efficiently. For the latter, we only require that
     * PyMapping_Keys() and PyObject_GetItem() be supported.
     */
    if (a == NULL || !PyDict_Check(a) || b == NULL) {
        PyErr_BadInternalCall();
        return -1;
    }
    mp = (dictobject*)a;
    if (PyDict_Check(b)) {
        other = (dictobject*)b;
        if (other == mp || other->ma_used == 0)
            /* a.update(a) or a.update({}); nothing to do */
            return 0;
        /* Do one big resize at the start, rather than
         * incrementally resizing as we insert new items. Expect
         * that there will be no (or few) overlapping keys.
         */
        if ((mp->ma_fill + other->ma_used)*3 >= (mp->ma_mask+1)*2) {
            if (dictresize(mp, (mp->ma_used + other->ma_used)*3/2) != 0)
                return -1;
        }
        for (i = 0; i <= other->ma_mask; i++) {
            entry = &other->ma_table[i];
            if (entry->me_value != NULL &&
                (override ||
                 PyDict_GetItem(a, entry->me_key) == NULL)) {
                Py_INCREF(entry->me_key);
                Py_INCREF(entry->me_value);
                insertdict(mp, entry->me_key, entry->me_hash,
                           entry->me_value);
            }
        }
    }
    else {
        /* Do it the generic, slower way */
        PyObject *keys = PyMapping_Keys(b);
        PyObject *iter;
        PyObject *key, *value;
        int status;

        if (keys == NULL)
            /* Docstring says this is equivalent to E.keys() so
             * if E doesn't have a .keys() method we want
             * AttributeError to percolate up. Might as well
             * do the same for any other error.
             */
            return -1;

        iter = PyObject_GetIter(keys);
        Py_DECREF(keys);
        if (iter == NULL)
            return -1;

        for (key = PyIter_Next(iter); key; key = PyIter_Next(iter)) {
            if (!override && PyDict_GetItem(a, key) != NULL) {
                Py_DECREF(key);
                continue;
            }
            value = PyObject_GetItem(b, key);
            if (value == NULL) {
                Py_DECREF(iter);
                Py_DECREF(key);
                return -1;
            }
            status = PyDict_SetItem(a, key, value);
            Py_DECREF(key);
            Py_DECREF(value);
            if (status < 0) {
                Py_DECREF(iter);
                return -1;
            }
        }
        Py_DECREF(iter);
        if (PyErr_Occurred())
            /* Iterator completed, via error */
            return -1;
    }
    return 0;
}

static PyObject *
dict_copy(register dictobject *mp, PyObject *args)
{
    if (!PyArg_Parse(args, ""))
        return NULL;
    return PyDict_Copy((PyObject*)mp);
}

PyObject *
PyDict_Copy(PyObject *o)
{
    register dictobject *mp;
    register int i;
    dictobject *copy;
    dictentry *entry;

    if (o == NULL || !PyDict_Check(o)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    mp = (dictobject *)o;
    copy = (dictobject *)PyDict_New();
    if (copy == NULL)
        return NULL;
    if (mp->ma_used > 0) {
        if (dictresize(copy, mp->ma_used*3/2) != 0) {
            Py_DECREF(copy); /* don't leak the new dict on failure */
            return NULL;
        }
        for (i = 0; i <= mp->ma_mask; i++) {
            entry = &mp->ma_table[i];
            if (entry->me_value != NULL) {
                Py_INCREF(entry->me_key);
                Py_INCREF(entry->me_value);
                insertdict(copy, entry->me_key, entry->me_hash,
                           entry->me_value);
            }
        }
    }
    return (PyObject *)copy;
}

1142 PyDict_Size(PyObject *mp)
1144 if (mp == NULL || !PyDict_Check(mp)) {
1145 PyErr_BadInternalCall();
1146 return 0;
1148 return ((dictobject *)mp)->ma_used;

PyObject *
PyDict_Keys(PyObject *mp)
{
    if (mp == NULL || !PyDict_Check(mp)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    return dict_keys((dictobject *)mp, (PyObject *)NULL);
}

PyObject *
PyDict_Values(PyObject *mp)
{
    if (mp == NULL || !PyDict_Check(mp)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    return dict_values((dictobject *)mp, (PyObject *)NULL);
}

PyObject *
PyDict_Items(PyObject *mp)
{
    if (mp == NULL || !PyDict_Check(mp)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    return dict_items((dictobject *)mp, (PyObject *)NULL);
}

/* Subroutine which returns the smallest key in a for which b's value
   is different or absent.  The value is returned too, through the
   pval argument.  Both are NULL if no key in a is found for which b's status
   differs.  The refcounts on (and only on) non-NULL *pval and function return
   values must be decremented by the caller (characterize() increments them
   to ensure that mutating comparison and PyDict_GetItem calls can't delete
   them before the caller is done looking at them). */

static PyObject *
characterize(dictobject *a, dictobject *b, PyObject **pval)
{
    PyObject *akey = NULL; /* smallest key in a s.t. a[akey] != b[akey] */
    PyObject *aval = NULL; /* a[akey] */
    int i, cmp;

    for (i = 0; i <= a->ma_mask; i++) {
        PyObject *thiskey, *thisaval, *thisbval;
        if (a->ma_table[i].me_value == NULL)
            continue;
        thiskey = a->ma_table[i].me_key;
        Py_INCREF(thiskey);  /* keep alive across compares */
        if (akey != NULL) {
            cmp = PyObject_RichCompareBool(akey, thiskey, Py_LT);
            if (cmp < 0) {
                Py_DECREF(thiskey);
                goto Fail;
            }
            if (cmp > 0 ||
                i > a->ma_mask ||
                a->ma_table[i].me_value == NULL)
            {
                /* Not the *smallest* a key; or maybe it is
                 * but the compare shrank the dict so we can't
                 * find its associated value anymore; or
                 * maybe it is but the compare deleted the
                 * a[thiskey] entry.
                 */
                Py_DECREF(thiskey);
                continue;
            }
        }

        /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
        thisaval = a->ma_table[i].me_value;
        assert(thisaval);
        Py_INCREF(thisaval);  /* keep alive */
        thisbval = PyDict_GetItem((PyObject *)b, thiskey);
        if (thisbval == NULL)
            cmp = 0;
        else {
            /* both dicts have thiskey:  same values? */
            cmp = PyObject_RichCompareBool(
                        thisaval, thisbval, Py_EQ);
            if (cmp < 0) {
                Py_DECREF(thiskey);
                Py_DECREF(thisaval);
                goto Fail;
            }
        }
        if (cmp == 0) {
            /* New winner. */
            Py_XDECREF(akey);
            Py_XDECREF(aval);
            akey = thiskey;
            aval = thisaval;
        }
        else {
            Py_DECREF(thiskey);
            Py_DECREF(thisaval);
        }
    }
    *pval = aval;
    return akey;

Fail:
    Py_XDECREF(akey);
    Py_XDECREF(aval);
    *pval = NULL;
    return NULL;
}

static int
dict_compare(dictobject *a, dictobject *b)
{
    PyObject *adiff, *bdiff, *aval, *bval;
    int res;

    /* Compare lengths first */
    if (a->ma_used < b->ma_used)
        return -1;  /* a is shorter */
    else if (a->ma_used > b->ma_used)
        return 1;   /* b is shorter */

    /* Same length -- check all keys */
    bdiff = bval = NULL;
    adiff = characterize(a, b, &aval);
    if (adiff == NULL) {
        assert(!aval);
        /* Either an error, or a is a subset with the same length so
         * must be equal.
         */
        res = PyErr_Occurred() ? -1 : 0;
        goto Finished;
    }
    bdiff = characterize(b, a, &bval);
    if (bdiff == NULL && PyErr_Occurred()) {
        assert(!bval);
        res = -1;
        goto Finished;
    }
    res = 0;
    if (bdiff) {
        /* bdiff == NULL "should be" impossible now, but perhaps
         * the last comparison done by the characterize() on a had
         * the side effect of making the dicts equal!
         */
        res = PyObject_Compare(adiff, bdiff);
    }
    if (res == 0 && bval != NULL)
        res = PyObject_Compare(aval, bval);

Finished:
    Py_XDECREF(adiff);
    Py_XDECREF(bdiff);
    Py_XDECREF(aval);
    Py_XDECREF(bval);
    return res;
}

/* Return 1 if dicts equal, 0 if not, -1 if error.
 * Gets out as soon as any difference is detected.
 * Uses only Py_EQ comparison.
 */
static int
dict_equal(dictobject *a, dictobject *b)
{
    int i;

    if (a->ma_used != b->ma_used)
        /* can't be equal if # of entries differ */
        return 0;

    /* Same # of entries -- check all of 'em.  Exit early on any diff. */
    for (i = 0; i <= a->ma_mask; i++) {
        PyObject *aval = a->ma_table[i].me_value;
        if (aval != NULL) {
            int cmp;
            PyObject *bval;
            PyObject *key = a->ma_table[i].me_key;
            /* temporarily bump aval's refcount to ensure it stays
               alive until we're done with it */
            Py_INCREF(aval);
            bval = PyDict_GetItem((PyObject *)b, key);
            if (bval == NULL) {
                Py_DECREF(aval);
                return 0;
            }
            cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
            Py_DECREF(aval);
            if (cmp <= 0)  /* error or not equal */
                return cmp;
        }
    }
    return 1;
}

static PyObject *
dict_richcompare(PyObject *v, PyObject *w, int op)
{
    int cmp;
    PyObject *res;

    if (!PyDict_Check(v) || !PyDict_Check(w)) {
        res = Py_NotImplemented;
    }
    else if (op == Py_EQ || op == Py_NE) {
        cmp = dict_equal((dictobject *)v, (dictobject *)w);
        if (cmp < 0)
            return NULL;
        res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
    }
    else
        res = Py_NotImplemented;
    Py_INCREF(res);
    return res;
}

static PyObject *
dict_has_key(register dictobject *mp, PyObject *args)
{
    PyObject *key;
    long hash;
    register long ok;
    if (!PyArg_ParseTuple(args, "O:has_key", &key))
        return NULL;
#ifdef CACHE_HASH
    if (!PyString_Check(key) ||
        (hash = ((PyStringObject *) key)->ob_shash) == -1)
#endif
    {
        hash = PyObject_Hash(key);
        if (hash == -1)
            return NULL;
    }
    ok = (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
    return PyInt_FromLong(ok);
}

static PyObject *
dict_get(register dictobject *mp, PyObject *args)
{
    PyObject *key;
    PyObject *failobj = Py_None;
    PyObject *val = NULL;
    long hash;

    if (!PyArg_ParseTuple(args, "O|O:get", &key, &failobj))
        return NULL;

#ifdef CACHE_HASH
    if (!PyString_Check(key) ||
        (hash = ((PyStringObject *) key)->ob_shash) == -1)
#endif
    {
        hash = PyObject_Hash(key);
        if (hash == -1)
            return NULL;
    }
    val = (mp->ma_lookup)(mp, key, hash)->me_value;

    if (val == NULL)
        val = failobj;
    Py_INCREF(val);
    return val;
}

static PyObject *
dict_setdefault(register dictobject *mp, PyObject *args)
{
    PyObject *key;
    PyObject *failobj = Py_None;
    PyObject *val = NULL;
    long hash;

    if (!PyArg_ParseTuple(args, "O|O:setdefault", &key, &failobj))
        return NULL;

#ifdef CACHE_HASH
    if (!PyString_Check(key) ||
        (hash = ((PyStringObject *) key)->ob_shash) == -1)
#endif
    {
        hash = PyObject_Hash(key);
        if (hash == -1)
            return NULL;
    }
    val = (mp->ma_lookup)(mp, key, hash)->me_value;
    if (val == NULL) {
        val = failobj;
        if (PyDict_SetItem((PyObject*)mp, key, failobj))
            val = NULL;
    }
    Py_XINCREF(val);
    return val;
}

static PyObject *
dict_clear(register dictobject *mp, PyObject *args)
{
    if (!PyArg_NoArgs(args))
        return NULL;
    PyDict_Clear((PyObject *)mp);
    Py_INCREF(Py_None);
    return Py_None;
}

static PyObject *
dict_popitem(dictobject *mp, PyObject *args)
{
    int i = 0;
    dictentry *ep;
    PyObject *res;

    if (!PyArg_NoArgs(args))
        return NULL;
    /* Allocate the result tuple before checking the size.  Believe it
     * or not, this allocation could trigger a garbage collection which
     * could empty the dict, so if we checked the size first and that
     * happened, the result would be an infinite loop (searching for an
     * entry that no longer exists).  Note that the usual popitem()
     * idiom is "while d: k, v = d.popitem()", so needing to throw the
     * tuple away if the dict *is* empty isn't a significant
     * inefficiency -- possible, but unlikely in practice.
     */
    res = PyTuple_New(2);
    if (res == NULL)
        return NULL;
    if (mp->ma_used == 0) {
        Py_DECREF(res);
        PyErr_SetString(PyExc_KeyError,
                        "popitem(): dictionary is empty");
        return NULL;
    }
    /* Set ep to "the first" dict entry with a value.  We abuse the hash
     * field of slot 0 to hold a search finger:
     * If slot 0 has a value, use slot 0.
     * Else slot 0 is being used to hold a search finger,
     * and we use its hash value as the first index to look.
     */
    ep = &mp->ma_table[0];
    if (ep->me_value == NULL) {
        i = (int)ep->me_hash;
        /* The hash field may be a real hash value, or it may be a
         * legit search finger, or it may be a once-legit search
         * finger that's out of bounds now because it wrapped around
         * or the table shrank -- simply make sure it's in bounds now.
         */
        if (i > mp->ma_mask || i < 1)
            i = 1;  /* skip slot 0 */
        while ((ep = &mp->ma_table[i])->me_value == NULL) {
            i++;
            if (i > mp->ma_mask)
                i = 1;
        }
    }
    PyTuple_SET_ITEM(res, 0, ep->me_key);
    PyTuple_SET_ITEM(res, 1, ep->me_value);
    Py_INCREF(dummy);
    ep->me_key = dummy;
    ep->me_value = NULL;
    mp->ma_used--;
    assert(mp->ma_table[0].me_value == NULL);
    mp->ma_table[0].me_hash = i + 1;  /* next place to start */
    return res;
}

static int
dict_traverse(PyObject *op, visitproc visit, void *arg)
{
    int i = 0, err;
    PyObject *pk;
    PyObject *pv;

    while (PyDict_Next(op, &i, &pk, &pv)) {
        err = visit(pk, arg);
        if (err)
            return err;
        err = visit(pv, arg);
        if (err)
            return err;
    }
    return 0;
}

static int
dict_tp_clear(PyObject *op)
{
    PyDict_Clear(op);
    return 0;
}

staticforward PyObject *dictiter_new(dictobject *, binaryfunc);

static PyObject *
select_key(PyObject *key, PyObject *value)
{
    Py_INCREF(key);
    return key;
}

static PyObject *
select_value(PyObject *key, PyObject *value)
{
    Py_INCREF(value);
    return value;
}

static PyObject *
select_item(PyObject *key, PyObject *value)
{
    PyObject *res = PyTuple_New(2);

    if (res != NULL) {
        Py_INCREF(key);
        Py_INCREF(value);
        PyTuple_SET_ITEM(res, 0, key);
        PyTuple_SET_ITEM(res, 1, value);
    }
    return res;
}

static PyObject *
dict_iterkeys(dictobject *dict, PyObject *args)
{
    if (!PyArg_ParseTuple(args, ""))
        return NULL;
    return dictiter_new(dict, select_key);
}

static PyObject *
dict_itervalues(dictobject *dict, PyObject *args)
{
    if (!PyArg_ParseTuple(args, ""))
        return NULL;
    return dictiter_new(dict, select_value);
}

static PyObject *
dict_iteritems(dictobject *dict, PyObject *args)
{
    if (!PyArg_ParseTuple(args, ""))
        return NULL;
    return dictiter_new(dict, select_item);
}


static char has_key__doc__[] =
"D.has_key(k) -> 1 if D has a key k, else 0";

static char get__doc__[] =
"D.get(k[,d]) -> D[k] if D.has_key(k), else d.  d defaults to None.";

static char setdefault_doc__[] =
"D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if not D.has_key(k)";

static char popitem__doc__[] =
"D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
2-tuple; but raise KeyError if D is empty";

static char keys__doc__[] =
"D.keys() -> list of D's keys";

static char items__doc__[] =
"D.items() -> list of D's (key, value) pairs, as 2-tuples";

static char values__doc__[] =
"D.values() -> list of D's values";

static char update__doc__[] =
"D.update(E) -> None.  Update D from E: for k in E.keys(): D[k] = E[k]";

static char clear__doc__[] =
"D.clear() -> None.  Remove all items from D.";

static char copy__doc__[] =
"D.copy() -> a shallow copy of D";

static char iterkeys__doc__[] =
"D.iterkeys() -> an iterator over the keys of D";

static char itervalues__doc__[] =
"D.itervalues() -> an iterator over the values of D";

static char iteritems__doc__[] =
"D.iteritems() -> an iterator over the (key, value) items of D";

static PyMethodDef mapp_methods[] = {
    {"has_key",    (PyCFunction)dict_has_key,    METH_VARARGS,
     has_key__doc__},
    {"get",        (PyCFunction)dict_get,        METH_VARARGS,
     get__doc__},
    {"setdefault", (PyCFunction)dict_setdefault, METH_VARARGS,
     setdefault_doc__},
    {"popitem",    (PyCFunction)dict_popitem,    METH_OLDARGS,
     popitem__doc__},
    {"keys",       (PyCFunction)dict_keys,       METH_OLDARGS,
     keys__doc__},
    {"items",      (PyCFunction)dict_items,      METH_OLDARGS,
     items__doc__},
    {"values",     (PyCFunction)dict_values,     METH_OLDARGS,
     values__doc__},
    {"update",     (PyCFunction)dict_update,     METH_VARARGS,
     update__doc__},
    {"clear",      (PyCFunction)dict_clear,      METH_OLDARGS,
     clear__doc__},
    {"copy",       (PyCFunction)dict_copy,       METH_OLDARGS,
     copy__doc__},
    {"iterkeys",   (PyCFunction)dict_iterkeys,   METH_VARARGS,
     iterkeys__doc__},
    {"itervalues", (PyCFunction)dict_itervalues, METH_VARARGS,
     itervalues__doc__},
    {"iteritems",  (PyCFunction)dict_iteritems,  METH_VARARGS,
     iteritems__doc__},
    {NULL,         NULL}  /* sentinel */
};

static int
dict_contains(dictobject *mp, PyObject *key)
{
    long hash;

#ifdef CACHE_HASH
    if (!PyString_Check(key) ||
        (hash = ((PyStringObject *) key)->ob_shash) == -1)
#endif
    {
        hash = PyObject_Hash(key);
        if (hash == -1)
            return -1;
    }
    return (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
}

/* Hack to implement "key in dict" */
static PySequenceMethods dict_as_sequence = {
    0,                          /* sq_length */
    0,                          /* sq_concat */
    0,                          /* sq_repeat */
    0,                          /* sq_item */
    0,                          /* sq_slice */
    0,                          /* sq_ass_item */
    0,                          /* sq_ass_slice */
    (objobjproc)dict_contains,  /* sq_contains */
    0,                          /* sq_inplace_concat */
    0,                          /* sq_inplace_repeat */
};

static PyObject *
dict_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    PyObject *self;

    assert(type != NULL && type->tp_alloc != NULL);
    self = type->tp_alloc(type, 0);
    if (self != NULL) {
        PyDictObject *d = (PyDictObject *)self;
        /* It's guaranteed that type->tp_alloc zeroed out the struct. */
        assert(d->ma_table == NULL && d->ma_fill == 0 && d->ma_used == 0);
        INIT_NONZERO_DICT_SLOTS(d);
        d->ma_lookup = lookdict_string;
#ifdef SHOW_CONVERSION_COUNTS
        ++created;
#endif
    }
    return self;
}

static PyObject *
dict_iter(dictobject *dict)
{
    return dictiter_new(dict, select_key);
}

PyTypeObject PyDict_Type = {
    PyObject_HEAD_INIT(&PyType_Type)
    0,                                    /* ob_size */
    "dictionary",                         /* tp_name */
    sizeof(dictobject) + PyGC_HEAD_SIZE,  /* tp_basicsize */
    0,                                    /* tp_itemsize */
    (destructor)dict_dealloc,             /* tp_dealloc */
    (printfunc)dict_print,                /* tp_print */
    0,                                    /* tp_getattr */
    0,                                    /* tp_setattr */
    (cmpfunc)dict_compare,                /* tp_compare */
    (reprfunc)dict_repr,                  /* tp_repr */
    0,                                    /* tp_as_number */
    &dict_as_sequence,                    /* tp_as_sequence */
    &dict_as_mapping,                     /* tp_as_mapping */
    0,                                    /* tp_hash */
    0,                                    /* tp_call */
    0,                                    /* tp_str */
    PyObject_GenericGetAttr,              /* tp_getattro */
    0,                                    /* tp_setattro */
    0,                                    /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_GC |
        Py_TPFLAGS_BASETYPE,              /* tp_flags */
    "dictionary type",                    /* tp_doc */
    (traverseproc)dict_traverse,          /* tp_traverse */
    (inquiry)dict_tp_clear,               /* tp_clear */
    dict_richcompare,                     /* tp_richcompare */
    0,                                    /* tp_weaklistoffset */
    (getiterfunc)dict_iter,               /* tp_iter */
    0,                                    /* tp_iternext */
    mapp_methods,                         /* tp_methods */
    0,                                    /* tp_members */
    0,                                    /* tp_getset */
    0,                                    /* tp_base */
    0,                                    /* tp_dict */
    0,                                    /* tp_descr_get */
    0,                                    /* tp_descr_set */
    0,                                    /* tp_dictoffset */
    0,                                    /* tp_init */
    PyType_GenericAlloc,                  /* tp_alloc */
    dict_new,                             /* tp_new */
};

/* For backward compatibility with old dictionary interface */

PyObject *
PyDict_GetItemString(PyObject *v, char *key)
{
    PyObject *kv, *rv;
    kv = PyString_FromString(key);
    if (kv == NULL)
        return NULL;
    rv = PyDict_GetItem(v, kv);
    Py_DECREF(kv);
    return rv;
}

int
PyDict_SetItemString(PyObject *v, char *key, PyObject *item)
{
    PyObject *kv;
    int err;
    kv = PyString_FromString(key);
    if (kv == NULL)
        return -1;
    PyString_InternInPlace(&kv);  /* XXX Should we really? */
    err = PyDict_SetItem(v, kv, item);
    Py_DECREF(kv);
    return err;
}

int
PyDict_DelItemString(PyObject *v, char *key)
{
    PyObject *kv;
    int err;
    kv = PyString_FromString(key);
    if (kv == NULL)
        return -1;
    err = PyDict_DelItem(v, kv);
    Py_DECREF(kv);
    return err;
}

/* Dictionary iterator type */

extern PyTypeObject PyDictIter_Type; /* Forward */

typedef struct {
    PyObject_HEAD
    dictobject *di_dict;
    int di_used;
    int di_pos;
    binaryfunc di_select;
} dictiterobject;

static PyObject *
dictiter_new(dictobject *dict, binaryfunc select)
{
    dictiterobject *di;
    di = PyObject_NEW(dictiterobject, &PyDictIter_Type);
    if (di == NULL)
        return NULL;
    Py_INCREF(dict);
    di->di_dict = dict;
    di->di_used = dict->ma_used;
    di->di_pos = 0;
    di->di_select = select;
    return (PyObject *)di;
}

static void
dictiter_dealloc(dictiterobject *di)
{
    Py_DECREF(di->di_dict);
    PyObject_DEL(di);
}

static PyObject *
dictiter_next(dictiterobject *di, PyObject *args)
{
    PyObject *key, *value;

    if (di->di_used != di->di_dict->ma_used) {
        PyErr_SetString(PyExc_RuntimeError,
                        "dictionary changed size during iteration");
        return NULL;
    }
    if (PyDict_Next((PyObject *)(di->di_dict), &di->di_pos, &key, &value)) {
        return (*di->di_select)(key, value);
    }
    PyErr_SetObject(PyExc_StopIteration, Py_None);
    return NULL;
}

static PyObject *
dictiter_getiter(PyObject *it)
{
    Py_INCREF(it);
    return it;
}

static PyMethodDef dictiter_methods[] = {
    {"next", (PyCFunction)dictiter_next, METH_VARARGS,
     "it.next() -- get the next value, or raise StopIteration"},
    {NULL,   NULL}  /* sentinel */
};

static PyObject *dictiter_iternext(dictiterobject *di)
{
    PyObject *key, *value;

    if (di->di_used != di->di_dict->ma_used) {
        PyErr_SetString(PyExc_RuntimeError,
                        "dictionary changed size during iteration");
        return NULL;
    }
    if (PyDict_Next((PyObject *)(di->di_dict), &di->di_pos, &key, &value)) {
        return (*di->di_select)(key, value);
    }
    return NULL;
}

PyTypeObject PyDictIter_Type = {
    PyObject_HEAD_INIT(&PyType_Type)
    0,                                /* ob_size */
    "dictionary-iterator",            /* tp_name */
    sizeof(dictiterobject),           /* tp_basicsize */
    0,                                /* tp_itemsize */
    /* methods */
    (destructor)dictiter_dealloc,     /* tp_dealloc */
    0,                                /* tp_print */
    0,                                /* tp_getattr */
    0,                                /* tp_setattr */
    0,                                /* tp_compare */
    0,                                /* tp_repr */
    0,                                /* tp_as_number */
    0,                                /* tp_as_sequence */
    0,                                /* tp_as_mapping */
    0,                                /* tp_hash */
    0,                                /* tp_call */
    0,                                /* tp_str */
    PyObject_GenericGetAttr,          /* tp_getattro */
    0,                                /* tp_setattro */
    0,                                /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT,               /* tp_flags */
    0,                                /* tp_doc */
    0,                                /* tp_traverse */
    0,                                /* tp_clear */
    0,                                /* tp_richcompare */
    0,                                /* tp_weaklistoffset */
    (getiterfunc)dictiter_getiter,    /* tp_iter */
    (iternextfunc)dictiter_iternext,  /* tp_iternext */
    dictiter_methods,                 /* tp_methods */
    0,                                /* tp_members */
    0,                                /* tp_getset */
    0,                                /* tp_base */
    0,                                /* tp_dict */
    0,                                /* tp_descr_get */
    0,                                /* tp_descr_set */