_make_boundary(): Fix for SF bug #745478, broken boundary calculation
[python/dscho.git] / Objects / dictobject.c
blobc4959ff8e78ab8aa80004300999ee7538e4e0639
2 /* Dictionary object implementation using a hash table */
4 /* The distribution includes a separate file, Objects/dictnotes.txt,
5 describing explorations into dictionary design and optimization.
6 It covers typical dictionary use patterns, the parameters for
7 tuning dictionaries, and several ideas for possible optimizations.
8 */
10 #include "Python.h"
12 typedef PyDictEntry dictentry;
13 typedef PyDictObject dictobject;
15 /* Define this out if you don't want conversion statistics on exit. */
16 #undef SHOW_CONVERSION_COUNTS
18 /* See large comment block below. This must be >= 1. */
19 #define PERTURB_SHIFT 5
22 Major subtleties ahead: Most hash schemes depend on having a "good" hash
23 function, in the sense of simulating randomness. Python doesn't: its most
24 important hash functions (for strings and ints) are very regular in common
25 cases:
27 >>> map(hash, (0, 1, 2, 3))
28 [0, 1, 2, 3]
29 >>> map(hash, ("namea", "nameb", "namec", "named"))
30 [-1658398457, -1658398460, -1658398459, -1658398462]
31 >>>
33 This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
34 the low-order i bits as the initial table index is extremely fast, and there
35 are no collisions at all for dicts indexed by a contiguous range of ints.
36 The same is approximately true when keys are "consecutive" strings. So this
37 gives better-than-random behavior in common cases, and that's very desirable.
39 OTOH, when collisions occur, the tendency to fill contiguous slices of the
40 hash table makes a good collision resolution strategy crucial. Taking only
41 the last i bits of the hash code is also vulnerable: for example, consider
42 [i << 16 for i in range(20000)] as a set of keys. Since ints are their own
43 hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
44 hash code are all 0: they *all* map to the same table index.
46 But catering to unusual cases should not slow the usual ones, so we just take
47 the last i bits anyway. It's up to collision resolution to do the rest. If
48 we *usually* find the key we're looking for on the first try (and, it turns
49 out, we usually do -- the table load factor is kept under 2/3, so the odds
50 are solidly in our favor), then it makes best sense to keep the initial index
51 computation dirt cheap.
53 The first half of collision resolution is to visit table indices via this
54 recurrence:
56 j = ((5*j) + 1) mod 2**i
58 For any initial j in range(2**i), repeating that 2**i times generates each
59 int in range(2**i) exactly once (see any text on random-number generation for
60 proof). By itself, this doesn't help much: like linear probing (setting
61 j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
62 order. This would be bad, except that's not the only thing we do, and it's
63 actually *good* in the common cases where hash keys are consecutive. In an
64 example that's really too small to make this entirely clear, for a table of
65 size 2**3 the order of indices is:
67 0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
69 If two things come in at index 5, the first place we look after is index 2,
70 not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
71 Linear probing is deadly in this case because there the fixed probe order
72 is the *same* as the order consecutive keys are likely to arrive. But it's
73 extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
74 and certain that consecutive hash codes do not.
76 The other half of the strategy is to get the other bits of the hash code
77 into play. This is done by initializing a (unsigned) vrbl "perturb" to the
78 full hash code, and changing the recurrence to:
80 j = (5*j) + 1 + perturb;
81 perturb >>= PERTURB_SHIFT;
82 use j % 2**i as the next table index;
84 Now the probe sequence depends (eventually) on every bit in the hash code,
85 and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
86 because it quickly magnifies small differences in the bits that didn't affect
87 the initial index. Note that because perturb is unsigned, if the recurrence
88 is executed often enough perturb eventually becomes and remains 0. At that
89 point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
90 that's certain to find an empty slot eventually (since it generates every int
91 in range(2**i), and we make sure there's always at least one empty slot).
93 Selecting a good value for PERTURB_SHIFT is a balancing act. You want it
94 small so that the high bits of the hash code continue to affect the probe
95 sequence across iterations; but you want it large so that in really bad cases
96 the high-order hash bits have an effect on early iterations. 5 was "the
97 best" in minimizing total collisions across experiments Tim Peters ran (on
98 both normal and pathological cases), but 4 and 6 weren't significantly worse.
100 Historical: Reimer Behrends contributed the idea of using a polynomial-based
101 approach, using repeated multiplication by x in GF(2**n) where an irreducible
102 polynomial for each table size was chosen such that x was a primitive root.
103 Christian Tismer later extended that to use division by x instead, as an
104 efficient way to get the high bits of the hash code into play. This scheme
105 also gave excellent collision statistics, but was more expensive: two
106 if-tests were required inside the loop; computing "the next" index took about
107 the same number of operations but without as much potential parallelism
108 (e.g., computing 5*j can go on at the same time as computing 1+perturb in the
109 above, and then shifting perturb can be done while the table index is being
110 masked); and the dictobject struct required a member to hold the table's
111 polynomial. In Tim's experiments the current scheme ran faster, produced
112 equally good collision statistics, needed less code & used less memory.
115 /* Object used as dummy key to fill deleted entries */
116 static PyObject *dummy; /* Initialized by first call to newdictobject() */
118 /* forward declarations */
119 static dictentry *
120 lookdict_string(dictobject *mp, PyObject *key, long hash);
122 #ifdef SHOW_CONVERSION_COUNTS
123 static long created = 0L;
124 static long converted = 0L;
126 static void
127 show_counts(void)
129 fprintf(stderr, "created %ld string dicts\n", created);
130 fprintf(stderr, "converted %ld to normal dicts\n", converted);
131 fprintf(stderr, "%.2f%% conversion rate\n", (100.0*converted)/created);
133 #endif
135 /* Initialization macros.
136 There are two ways to create a dict: PyDict_New() is the main C API
137 function, and the tp_new slot maps to dict_new(). In the latter case we
138 can save a little time over what PyDict_New does because it's guaranteed
139 that the PyDictObject struct is already zeroed out.
140 Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
141 an excellent reason not to).
144 #define INIT_NONZERO_DICT_SLOTS(mp) do { \
145 (mp)->ma_table = (mp)->ma_smalltable; \
146 (mp)->ma_mask = PyDict_MINSIZE - 1; \
147 } while(0)
149 #define EMPTY_TO_MINSIZE(mp) do { \
150 memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable)); \
151 (mp)->ma_used = (mp)->ma_fill = 0; \
152 INIT_NONZERO_DICT_SLOTS(mp); \
153 } while(0)
155 PyObject *
156 PyDict_New(void)
158 register dictobject *mp;
159 if (dummy == NULL) { /* Auto-initialize dummy */
160 dummy = PyString_FromString("<dummy key>");
161 if (dummy == NULL)
162 return NULL;
163 #ifdef SHOW_CONVERSION_COUNTS
164 Py_AtExit(show_counts);
165 #endif
167 mp = PyObject_GC_New(dictobject, &PyDict_Type);
168 if (mp == NULL)
169 return NULL;
170 EMPTY_TO_MINSIZE(mp);
171 mp->ma_lookup = lookdict_string;
172 #ifdef SHOW_CONVERSION_COUNTS
173 ++created;
174 #endif
175 _PyObject_GC_TRACK(mp);
176 return (PyObject *)mp;
180 The basic lookup function used by all operations.
181 This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
182 Open addressing is preferred over chaining since the link overhead for
183 chaining would be substantial (100% with typical malloc overhead).
185 The initial probe index is computed as hash mod the table size. Subsequent
186 probe indices are computed as explained earlier.
188 All arithmetic on hash should ignore overflow.
190 (The details in this version are due to Tim Peters, building on many past
191 contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
192 Christian Tismer).
194 This function must never return NULL; failures are indicated by returning
195 a dictentry* for which the me_value field is NULL. Exceptions are never
196 reported by this function, and outstanding exceptions are maintained.
199 static dictentry *
200 lookdict(dictobject *mp, PyObject *key, register long hash)
202 register int i;
203 register unsigned int perturb;
204 register dictentry *freeslot;
205 register unsigned int mask = mp->ma_mask;
206 dictentry *ep0 = mp->ma_table;
207 register dictentry *ep;
208 register int restore_error;
209 register int checked_error;
210 register int cmp;
211 PyObject *err_type, *err_value, *err_tb;
212 PyObject *startkey;
214 i = hash & mask;
215 ep = &ep0[i];
216 if (ep->me_key == NULL || ep->me_key == key)
217 return ep;
219 restore_error = checked_error = 0;
220 if (ep->me_key == dummy)
221 freeslot = ep;
222 else {
223 if (ep->me_hash == hash) {
224 /* error can't have been checked yet */
225 checked_error = 1;
226 if (PyErr_Occurred()) {
227 restore_error = 1;
228 PyErr_Fetch(&err_type, &err_value, &err_tb);
230 startkey = ep->me_key;
231 cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
232 if (cmp < 0)
233 PyErr_Clear();
234 if (ep0 == mp->ma_table && ep->me_key == startkey) {
235 if (cmp > 0)
236 goto Done;
238 else {
239 /* The compare did major nasty stuff to the
240 * dict: start over.
241 * XXX A clever adversary could prevent this
242 * XXX from terminating.
244 ep = lookdict(mp, key, hash);
245 goto Done;
248 freeslot = NULL;
251 /* In the loop, me_key == dummy is by far (factor of 100s) the
252 least likely outcome, so test for that last. */
253 for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
254 i = (i << 2) + i + perturb + 1;
255 ep = &ep0[i & mask];
256 if (ep->me_key == NULL) {
257 if (freeslot != NULL)
258 ep = freeslot;
259 break;
261 if (ep->me_key == key)
262 break;
263 if (ep->me_hash == hash && ep->me_key != dummy) {
264 if (!checked_error) {
265 checked_error = 1;
266 if (PyErr_Occurred()) {
267 restore_error = 1;
268 PyErr_Fetch(&err_type, &err_value,
269 &err_tb);
272 startkey = ep->me_key;
273 cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
274 if (cmp < 0)
275 PyErr_Clear();
276 if (ep0 == mp->ma_table && ep->me_key == startkey) {
277 if (cmp > 0)
278 break;
280 else {
281 /* The compare did major nasty stuff to the
282 * dict: start over.
283 * XXX A clever adversary could prevent this
284 * XXX from terminating.
286 ep = lookdict(mp, key, hash);
287 break;
290 else if (ep->me_key == dummy && freeslot == NULL)
291 freeslot = ep;
294 Done:
295 if (restore_error)
296 PyErr_Restore(err_type, err_value, err_tb);
297 return ep;
301 * Hacked up version of lookdict which can assume keys are always strings;
302 * this assumption allows testing for errors during PyObject_Compare() to
303 * be dropped; string-string comparisons never raise exceptions. This also
304 * means we don't need to go through PyObject_Compare(); we can always use
305 * _PyString_Eq directly.
307 * This is valuable because the general-case error handling in lookdict() is
308 * expensive, and dicts with pure-string keys are very common.
310 static dictentry *
311 lookdict_string(dictobject *mp, PyObject *key, register long hash)
313 register int i;
314 register unsigned int perturb;
315 register dictentry *freeslot;
316 register unsigned int mask = mp->ma_mask;
317 dictentry *ep0 = mp->ma_table;
318 register dictentry *ep;
320 /* Make sure this function doesn't have to handle non-string keys,
321 including subclasses of str; e.g., one reason to subclass
322 strings is to override __eq__, and for speed we don't cater to
323 that here. */
324 if (!PyString_CheckExact(key)) {
325 #ifdef SHOW_CONVERSION_COUNTS
326 ++converted;
327 #endif
328 mp->ma_lookup = lookdict;
329 return lookdict(mp, key, hash);
331 i = hash & mask;
332 ep = &ep0[i];
333 if (ep->me_key == NULL || ep->me_key == key)
334 return ep;
335 if (ep->me_key == dummy)
336 freeslot = ep;
337 else {
338 if (ep->me_hash == hash
339 && _PyString_Eq(ep->me_key, key)) {
340 return ep;
342 freeslot = NULL;
345 /* In the loop, me_key == dummy is by far (factor of 100s) the
346 least likely outcome, so test for that last. */
347 for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
348 i = (i << 2) + i + perturb + 1;
349 ep = &ep0[i & mask];
350 if (ep->me_key == NULL)
351 return freeslot == NULL ? ep : freeslot;
352 if (ep->me_key == key
353 || (ep->me_hash == hash
354 && ep->me_key != dummy
355 && _PyString_Eq(ep->me_key, key)))
356 return ep;
357 if (ep->me_key == dummy && freeslot == NULL)
358 freeslot = ep;
363 Internal routine to insert a new item into the table.
364 Used both by the internal resize routine and by the public insert routine.
365 Eats a reference to key and one to value.
367 static void
368 insertdict(register dictobject *mp, PyObject *key, long hash, PyObject *value)
370 PyObject *old_value;
371 register dictentry *ep;
372 typedef PyDictEntry *(*lookupfunc)(PyDictObject *, PyObject *, long);
374 assert(mp->ma_lookup != NULL);
375 ep = mp->ma_lookup(mp, key, hash);
376 if (ep->me_value != NULL) {
377 old_value = ep->me_value;
378 ep->me_value = value;
379 Py_DECREF(old_value); /* which **CAN** re-enter */
380 Py_DECREF(key);
382 else {
383 if (ep->me_key == NULL)
384 mp->ma_fill++;
385 else
386 Py_DECREF(ep->me_key);
387 ep->me_key = key;
388 ep->me_hash = hash;
389 ep->me_value = value;
390 mp->ma_used++;
395 Restructure the table by allocating a new table and reinserting all
396 items again. When entries have been deleted, the new table may
397 actually be smaller than the old one.
399 static int
400 dictresize(dictobject *mp, int minused)
402 int newsize;
403 dictentry *oldtable, *newtable, *ep;
404 int i;
405 int is_oldtable_malloced;
406 dictentry small_copy[PyDict_MINSIZE];
408 assert(minused >= 0);
410 /* Find the smallest table size > minused. */
411 for (newsize = PyDict_MINSIZE;
412 newsize <= minused && newsize > 0;
413 newsize <<= 1)
415 if (newsize <= 0) {
416 PyErr_NoMemory();
417 return -1;
420 /* Get space for a new table. */
421 oldtable = mp->ma_table;
422 assert(oldtable != NULL);
423 is_oldtable_malloced = oldtable != mp->ma_smalltable;
425 if (newsize == PyDict_MINSIZE) {
426 /* A large table is shrinking, or we can't get any smaller. */
427 newtable = mp->ma_smalltable;
428 if (newtable == oldtable) {
429 if (mp->ma_fill == mp->ma_used) {
430 /* No dummies, so no point doing anything. */
431 return 0;
433 /* We're not going to resize it, but rebuild the
434 table anyway to purge old dummy entries.
435 Subtle: This is *necessary* if fill==size,
436 as lookdict needs at least one virgin slot to
437 terminate failing searches. If fill < size, it's
438 merely desirable, as dummies slow searches. */
439 assert(mp->ma_fill > mp->ma_used);
440 memcpy(small_copy, oldtable, sizeof(small_copy));
441 oldtable = small_copy;
444 else {
445 newtable = PyMem_NEW(dictentry, newsize);
446 if (newtable == NULL) {
447 PyErr_NoMemory();
448 return -1;
452 /* Make the dict empty, using the new table. */
453 assert(newtable != oldtable);
454 mp->ma_table = newtable;
455 mp->ma_mask = newsize - 1;
456 memset(newtable, 0, sizeof(dictentry) * newsize);
457 mp->ma_used = 0;
458 i = mp->ma_fill;
459 mp->ma_fill = 0;
461 /* Copy the data over; this is refcount-neutral for active entries;
462 dummy entries aren't copied over, of course */
463 for (ep = oldtable; i > 0; ep++) {
464 if (ep->me_value != NULL) { /* active entry */
465 --i;
466 insertdict(mp, ep->me_key, ep->me_hash, ep->me_value);
468 else if (ep->me_key != NULL) { /* dummy entry */
469 --i;
470 assert(ep->me_key == dummy);
471 Py_DECREF(ep->me_key);
473 /* else key == value == NULL: nothing to do */
476 if (is_oldtable_malloced)
477 PyMem_DEL(oldtable);
478 return 0;
481 PyObject *
482 PyDict_GetItem(PyObject *op, PyObject *key)
484 long hash;
485 dictobject *mp = (dictobject *)op;
486 if (!PyDict_Check(op)) {
487 return NULL;
489 if (!PyString_CheckExact(key) ||
490 (hash = ((PyStringObject *) key)->ob_shash) == -1)
492 hash = PyObject_Hash(key);
493 if (hash == -1) {
494 PyErr_Clear();
495 return NULL;
498 return (mp->ma_lookup)(mp, key, hash)->me_value;
501 /* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
502 * dictionary if it is merely replacing the value for an existing key.
503 * This is means that it's safe to loop over a dictionary with
504 * PyDict_Next() and occasionally replace a value -- but you can't
505 * insert new keys or remove them.
508 PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
510 register dictobject *mp;
511 register long hash;
512 register int n_used;
514 if (!PyDict_Check(op)) {
515 PyErr_BadInternalCall();
516 return -1;
518 mp = (dictobject *)op;
519 if (PyString_CheckExact(key)) {
520 hash = ((PyStringObject *)key)->ob_shash;
521 if (hash == -1)
522 hash = PyObject_Hash(key);
524 else {
525 hash = PyObject_Hash(key);
526 if (hash == -1)
527 return -1;
529 assert(mp->ma_fill <= mp->ma_mask); /* at least one empty slot */
530 n_used = mp->ma_used;
531 Py_INCREF(value);
532 Py_INCREF(key);
533 insertdict(mp, key, hash, value);
534 /* If we added a key, we can safely resize. Otherwise just return!
535 * If fill >= 2/3 size, adjust size. Normally, this doubles or
536 * quaduples the size, but it's also possible for the dict to shrink
537 * (if ma_fill is much larger than ma_used, meaning a lot of dict
538 * keys have been * deleted).
540 * Quadrupling the size improves average dictionary sparseness
541 * (reducing collisions) at the cost of some memory and iteration
542 * speed (which loops over every possible entry). It also halves
543 * the number of expensive resize operations in a growing dictionary.
545 * Very large dictionaries (over 50K items) use doubling instead.
546 * This may help applications with severe memory constraints.
548 if (!(mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2))
549 return 0;
550 return dictresize(mp, mp->ma_used*(mp->ma_used>50000 ? 2 : 4));
554 PyDict_DelItem(PyObject *op, PyObject *key)
556 register dictobject *mp;
557 register long hash;
558 register dictentry *ep;
559 PyObject *old_value, *old_key;
561 if (!PyDict_Check(op)) {
562 PyErr_BadInternalCall();
563 return -1;
565 if (!PyString_CheckExact(key) ||
566 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
567 hash = PyObject_Hash(key);
568 if (hash == -1)
569 return -1;
571 mp = (dictobject *)op;
572 ep = (mp->ma_lookup)(mp, key, hash);
573 if (ep->me_value == NULL) {
574 PyErr_SetObject(PyExc_KeyError, key);
575 return -1;
577 old_key = ep->me_key;
578 Py_INCREF(dummy);
579 ep->me_key = dummy;
580 old_value = ep->me_value;
581 ep->me_value = NULL;
582 mp->ma_used--;
583 Py_DECREF(old_value);
584 Py_DECREF(old_key);
585 return 0;
588 void
589 PyDict_Clear(PyObject *op)
591 dictobject *mp;
592 dictentry *ep, *table;
593 int table_is_malloced;
594 int fill;
595 dictentry small_copy[PyDict_MINSIZE];
596 #ifdef Py_DEBUG
597 int i, n;
598 #endif
600 if (!PyDict_Check(op))
601 return;
602 mp = (dictobject *)op;
603 #ifdef Py_DEBUG
604 n = mp->ma_mask + 1;
605 i = 0;
606 #endif
608 table = mp->ma_table;
609 assert(table != NULL);
610 table_is_malloced = table != mp->ma_smalltable;
612 /* This is delicate. During the process of clearing the dict,
613 * decrefs can cause the dict to mutate. To avoid fatal confusion
614 * (voice of experience), we have to make the dict empty before
615 * clearing the slots, and never refer to anything via mp->xxx while
616 * clearing.
618 fill = mp->ma_fill;
619 if (table_is_malloced)
620 EMPTY_TO_MINSIZE(mp);
622 else if (fill > 0) {
623 /* It's a small table with something that needs to be cleared.
624 * Afraid the only safe way is to copy the dict entries into
625 * another small table first.
627 memcpy(small_copy, table, sizeof(small_copy));
628 table = small_copy;
629 EMPTY_TO_MINSIZE(mp);
631 /* else it's a small table that's already empty */
633 /* Now we can finally clear things. If C had refcounts, we could
634 * assert that the refcount on table is 1 now, i.e. that this function
635 * has unique access to it, so decref side-effects can't alter it.
637 for (ep = table; fill > 0; ++ep) {
638 #ifdef Py_DEBUG
639 assert(i < n);
640 ++i;
641 #endif
642 if (ep->me_key) {
643 --fill;
644 Py_DECREF(ep->me_key);
645 Py_XDECREF(ep->me_value);
647 #ifdef Py_DEBUG
648 else
649 assert(ep->me_value == NULL);
650 #endif
653 if (table_is_malloced)
654 PyMem_DEL(table);
658 * Iterate over a dict. Use like so:
660 * int i;
661 * PyObject *key, *value;
662 * i = 0; # important! i should not otherwise be changed by you
663 * while (PyDict_Next(yourdict, &i, &key, &value)) {
664 * Refer to borrowed references in key and value.
667 * CAUTION: In general, it isn't safe to use PyDict_Next in a loop that
668 * mutates the dict. One exception: it is safe if the loop merely changes
669 * the values associated with the keys (but doesn't insert new keys or
670 * delete keys), via PyDict_SetItem().
673 PyDict_Next(PyObject *op, int *ppos, PyObject **pkey, PyObject **pvalue)
675 int i;
676 register dictobject *mp;
677 if (!PyDict_Check(op))
678 return 0;
679 mp = (dictobject *)op;
680 i = *ppos;
681 if (i < 0)
682 return 0;
683 while (i <= mp->ma_mask && mp->ma_table[i].me_value == NULL)
684 i++;
685 *ppos = i+1;
686 if (i > mp->ma_mask)
687 return 0;
688 if (pkey)
689 *pkey = mp->ma_table[i].me_key;
690 if (pvalue)
691 *pvalue = mp->ma_table[i].me_value;
692 return 1;
695 /* Methods */
697 static void
698 dict_dealloc(register dictobject *mp)
700 register dictentry *ep;
701 int fill = mp->ma_fill;
702 PyObject_GC_UnTrack(mp);
703 Py_TRASHCAN_SAFE_BEGIN(mp)
704 for (ep = mp->ma_table; fill > 0; ep++) {
705 if (ep->me_key) {
706 --fill;
707 Py_DECREF(ep->me_key);
708 Py_XDECREF(ep->me_value);
711 if (mp->ma_table != mp->ma_smalltable)
712 PyMem_DEL(mp->ma_table);
713 mp->ob_type->tp_free((PyObject *)mp);
714 Py_TRASHCAN_SAFE_END(mp)
717 static int
718 dict_print(register dictobject *mp, register FILE *fp, register int flags)
720 register int i;
721 register int any;
723 i = Py_ReprEnter((PyObject*)mp);
724 if (i != 0) {
725 if (i < 0)
726 return i;
727 fprintf(fp, "{...}");
728 return 0;
731 fprintf(fp, "{");
732 any = 0;
733 for (i = 0; i <= mp->ma_mask; i++) {
734 dictentry *ep = mp->ma_table + i;
735 PyObject *pvalue = ep->me_value;
736 if (pvalue != NULL) {
737 /* Prevent PyObject_Repr from deleting value during
738 key format */
739 Py_INCREF(pvalue);
740 if (any++ > 0)
741 fprintf(fp, ", ");
742 if (PyObject_Print((PyObject *)ep->me_key, fp, 0)!=0) {
743 Py_DECREF(pvalue);
744 Py_ReprLeave((PyObject*)mp);
745 return -1;
747 fprintf(fp, ": ");
748 if (PyObject_Print(pvalue, fp, 0) != 0) {
749 Py_DECREF(pvalue);
750 Py_ReprLeave((PyObject*)mp);
751 return -1;
753 Py_DECREF(pvalue);
756 fprintf(fp, "}");
757 Py_ReprLeave((PyObject*)mp);
758 return 0;
761 static PyObject *
762 dict_repr(dictobject *mp)
764 int i;
765 PyObject *s, *temp, *colon = NULL;
766 PyObject *pieces = NULL, *result = NULL;
767 PyObject *key, *value;
769 i = Py_ReprEnter((PyObject *)mp);
770 if (i != 0) {
771 return i > 0 ? PyString_FromString("{...}") : NULL;
774 if (mp->ma_used == 0) {
775 result = PyString_FromString("{}");
776 goto Done;
779 pieces = PyList_New(0);
780 if (pieces == NULL)
781 goto Done;
783 colon = PyString_FromString(": ");
784 if (colon == NULL)
785 goto Done;
787 /* Do repr() on each key+value pair, and insert ": " between them.
788 Note that repr may mutate the dict. */
789 i = 0;
790 while (PyDict_Next((PyObject *)mp, &i, &key, &value)) {
791 int status;
792 /* Prevent repr from deleting value during key format. */
793 Py_INCREF(value);
794 s = PyObject_Repr(key);
795 PyString_Concat(&s, colon);
796 PyString_ConcatAndDel(&s, PyObject_Repr(value));
797 Py_DECREF(value);
798 if (s == NULL)
799 goto Done;
800 status = PyList_Append(pieces, s);
801 Py_DECREF(s); /* append created a new ref */
802 if (status < 0)
803 goto Done;
806 /* Add "{}" decorations to the first and last items. */
807 assert(PyList_GET_SIZE(pieces) > 0);
808 s = PyString_FromString("{");
809 if (s == NULL)
810 goto Done;
811 temp = PyList_GET_ITEM(pieces, 0);
812 PyString_ConcatAndDel(&s, temp);
813 PyList_SET_ITEM(pieces, 0, s);
814 if (s == NULL)
815 goto Done;
817 s = PyString_FromString("}");
818 if (s == NULL)
819 goto Done;
820 temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1);
821 PyString_ConcatAndDel(&temp, s);
822 PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp);
823 if (temp == NULL)
824 goto Done;
826 /* Paste them all together with ", " between. */
827 s = PyString_FromString(", ");
828 if (s == NULL)
829 goto Done;
830 result = _PyString_Join(s, pieces);
831 Py_DECREF(s);
833 Done:
834 Py_XDECREF(pieces);
835 Py_XDECREF(colon);
836 Py_ReprLeave((PyObject *)mp);
837 return result;
840 static int
841 dict_length(dictobject *mp)
843 return mp->ma_used;
846 static PyObject *
847 dict_subscript(dictobject *mp, register PyObject *key)
849 PyObject *v;
850 long hash;
851 assert(mp->ma_table != NULL);
852 if (!PyString_CheckExact(key) ||
853 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
854 hash = PyObject_Hash(key);
855 if (hash == -1)
856 return NULL;
858 v = (mp->ma_lookup)(mp, key, hash) -> me_value;
859 if (v == NULL)
860 PyErr_SetObject(PyExc_KeyError, key);
861 else
862 Py_INCREF(v);
863 return v;
866 static int
867 dict_ass_sub(dictobject *mp, PyObject *v, PyObject *w)
869 if (w == NULL)
870 return PyDict_DelItem((PyObject *)mp, v);
871 else
872 return PyDict_SetItem((PyObject *)mp, v, w);
875 static PyMappingMethods dict_as_mapping = {
876 (inquiry)dict_length, /*mp_length*/
877 (binaryfunc)dict_subscript, /*mp_subscript*/
878 (objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
881 static PyObject *
882 dict_keys(register dictobject *mp)
884 register PyObject *v;
885 register int i, j, n;
887 again:
888 n = mp->ma_used;
889 v = PyList_New(n);
890 if (v == NULL)
891 return NULL;
892 if (n != mp->ma_used) {
893 /* Durnit. The allocations caused the dict to resize.
894 * Just start over, this shouldn't normally happen.
896 Py_DECREF(v);
897 goto again;
899 for (i = 0, j = 0; i <= mp->ma_mask; i++) {
900 if (mp->ma_table[i].me_value != NULL) {
901 PyObject *key = mp->ma_table[i].me_key;
902 Py_INCREF(key);
903 PyList_SET_ITEM(v, j, key);
904 j++;
907 return v;
910 static PyObject *
911 dict_values(register dictobject *mp)
913 register PyObject *v;
914 register int i, j, n;
916 again:
917 n = mp->ma_used;
918 v = PyList_New(n);
919 if (v == NULL)
920 return NULL;
921 if (n != mp->ma_used) {
922 /* Durnit. The allocations caused the dict to resize.
923 * Just start over, this shouldn't normally happen.
925 Py_DECREF(v);
926 goto again;
928 for (i = 0, j = 0; i <= mp->ma_mask; i++) {
929 if (mp->ma_table[i].me_value != NULL) {
930 PyObject *value = mp->ma_table[i].me_value;
931 Py_INCREF(value);
932 PyList_SET_ITEM(v, j, value);
933 j++;
936 return v;
939 static PyObject *
940 dict_items(register dictobject *mp)
942 register PyObject *v;
943 register int i, j, n;
944 PyObject *item, *key, *value;
946 /* Preallocate the list of tuples, to avoid allocations during
947 * the loop over the items, which could trigger GC, which
948 * could resize the dict. :-(
950 again:
951 n = mp->ma_used;
952 v = PyList_New(n);
953 if (v == NULL)
954 return NULL;
955 for (i = 0; i < n; i++) {
956 item = PyTuple_New(2);
957 if (item == NULL) {
958 Py_DECREF(v);
959 return NULL;
961 PyList_SET_ITEM(v, i, item);
963 if (n != mp->ma_used) {
964 /* Durnit. The allocations caused the dict to resize.
965 * Just start over, this shouldn't normally happen.
967 Py_DECREF(v);
968 goto again;
970 /* Nothing we do below makes any function calls. */
971 for (i = 0, j = 0; i <= mp->ma_mask; i++) {
972 if (mp->ma_table[i].me_value != NULL) {
973 key = mp->ma_table[i].me_key;
974 value = mp->ma_table[i].me_value;
975 item = PyList_GET_ITEM(v, j);
976 Py_INCREF(key);
977 PyTuple_SET_ITEM(item, 0, key);
978 Py_INCREF(value);
979 PyTuple_SET_ITEM(item, 1, value);
980 j++;
983 assert(j == n);
984 return v;
987 static PyObject *
988 dict_fromkeys(PyObject *cls, PyObject *args)
990 PyObject *seq;
991 PyObject *value = Py_None;
992 PyObject *it; /* iter(seq) */
993 PyObject *key;
994 PyObject *d;
995 int status;
997 if (!PyArg_UnpackTuple(args, "fromkeys", 1, 2, &seq, &value))
998 return NULL;
1000 d = PyObject_CallObject(cls, NULL);
1001 if (d == NULL)
1002 return NULL;
1004 it = PyObject_GetIter(seq);
1005 if (it == NULL){
1006 Py_DECREF(d);
1007 return NULL;
1010 for (;;) {
1011 key = PyIter_Next(it);
1012 if (key == NULL) {
1013 if (PyErr_Occurred())
1014 goto Fail;
1015 break;
1017 status = PyObject_SetItem(d, key, value);
1018 Py_DECREF(key);
1019 if (status < 0)
1020 goto Fail;
1023 Py_DECREF(it);
1024 return d;
1026 Fail:
1027 Py_DECREF(it);
1028 Py_DECREF(d);
1029 return NULL;
1032 static PyObject *
1033 dict_update(PyObject *mp, PyObject *other)
1035 if (PyDict_Update(mp, other) < 0)
1036 return NULL;
1037 Py_INCREF(Py_None);
1038 return Py_None;
1041 /* Update unconditionally replaces existing items.
1042 Merge has a 3rd argument 'override'; if set, it acts like Update,
1043 otherwise it leaves existing items unchanged.
1045 PyDict_{Update,Merge} update/merge from a mapping object.
1047 PyDict_MergeFromSeq2 updates/merges from any iterable object
1048 producing iterable objects of length 2.
1052 PyDict_MergeFromSeq2(PyObject *d, PyObject *seq2, int override)
1054 PyObject *it; /* iter(seq2) */
1055 int i; /* index into seq2 of current element */
1056 PyObject *item; /* seq2[i] */
1057 PyObject *fast; /* item as a 2-tuple or 2-list */
1059 assert(d != NULL);
1060 assert(PyDict_Check(d));
1061 assert(seq2 != NULL);
1063 it = PyObject_GetIter(seq2);
1064 if (it == NULL)
1065 return -1;
1067 for (i = 0; ; ++i) {
1068 PyObject *key, *value;
1069 int n;
1071 fast = NULL;
1072 item = PyIter_Next(it);
1073 if (item == NULL) {
1074 if (PyErr_Occurred())
1075 goto Fail;
1076 break;
1079 /* Convert item to sequence, and verify length 2. */
1080 fast = PySequence_Fast(item, "");
1081 if (fast == NULL) {
1082 if (PyErr_ExceptionMatches(PyExc_TypeError))
1083 PyErr_Format(PyExc_TypeError,
1084 "cannot convert dictionary update "
1085 "sequence element #%d to a sequence",
1087 goto Fail;
1089 n = PySequence_Fast_GET_SIZE(fast);
1090 if (n != 2) {
1091 PyErr_Format(PyExc_ValueError,
1092 "dictionary update sequence element #%d "
1093 "has length %d; 2 is required",
1094 i, n);
1095 goto Fail;
1098 /* Update/merge with this (key, value) pair. */
1099 key = PySequence_Fast_GET_ITEM(fast, 0);
1100 value = PySequence_Fast_GET_ITEM(fast, 1);
1101 if (override || PyDict_GetItem(d, key) == NULL) {
1102 int status = PyDict_SetItem(d, key, value);
1103 if (status < 0)
1104 goto Fail;
1106 Py_DECREF(fast);
1107 Py_DECREF(item);
1110 i = 0;
1111 goto Return;
1112 Fail:
1113 Py_XDECREF(item);
1114 Py_XDECREF(fast);
1115 i = -1;
1116 Return:
1117 Py_DECREF(it);
1118 return i;
1122 PyDict_Update(PyObject *a, PyObject *b)
1124 return PyDict_Merge(a, b, 1);
1128 PyDict_Merge(PyObject *a, PyObject *b, int override)
1130 register PyDictObject *mp, *other;
1131 register int i;
1132 dictentry *entry;
1134 /* We accept for the argument either a concrete dictionary object,
1135 * or an abstract "mapping" object. For the former, we can do
1136 * things quite efficiently. For the latter, we only require that
1137 * PyMapping_Keys() and PyObject_GetItem() be supported.
1139 if (a == NULL || !PyDict_Check(a) || b == NULL) {
1140 PyErr_BadInternalCall();
1141 return -1;
1143 mp = (dictobject*)a;
1144 if (PyDict_Check(b)) {
1145 other = (dictobject*)b;
1146 if (other == mp || other->ma_used == 0)
1147 /* a.update(a) or a.update({}); nothing to do */
1148 return 0;
1149 /* Do one big resize at the start, rather than
1150 * incrementally resizing as we insert new items. Expect
1151 * that there will be no (or few) overlapping keys.
1153 if ((mp->ma_fill + other->ma_used)*3 >= (mp->ma_mask+1)*2) {
1154 if (dictresize(mp, (mp->ma_used + other->ma_used)*2) != 0)
1155 return -1;
1157 for (i = 0; i <= other->ma_mask; i++) {
1158 entry = &other->ma_table[i];
1159 if (entry->me_value != NULL &&
1160 (override ||
1161 PyDict_GetItem(a, entry->me_key) == NULL)) {
1162 Py_INCREF(entry->me_key);
1163 Py_INCREF(entry->me_value);
1164 insertdict(mp, entry->me_key, entry->me_hash,
1165 entry->me_value);
1169 else {
1170 /* Do it the generic, slower way */
1171 PyObject *keys = PyMapping_Keys(b);
1172 PyObject *iter;
1173 PyObject *key, *value;
1174 int status;
1176 if (keys == NULL)
1177 /* Docstring says this is equivalent to E.keys() so
1178 * if E doesn't have a .keys() method we want
1179 * AttributeError to percolate up. Might as well
1180 * do the same for any other error.
1182 return -1;
1184 iter = PyObject_GetIter(keys);
1185 Py_DECREF(keys);
1186 if (iter == NULL)
1187 return -1;
1189 for (key = PyIter_Next(iter); key; key = PyIter_Next(iter)) {
1190 if (!override && PyDict_GetItem(a, key) != NULL) {
1191 Py_DECREF(key);
1192 continue;
1194 value = PyObject_GetItem(b, key);
1195 if (value == NULL) {
1196 Py_DECREF(iter);
1197 Py_DECREF(key);
1198 return -1;
1200 status = PyDict_SetItem(a, key, value);
1201 Py_DECREF(key);
1202 Py_DECREF(value);
1203 if (status < 0) {
1204 Py_DECREF(iter);
1205 return -1;
1208 Py_DECREF(iter);
1209 if (PyErr_Occurred())
1210 /* Iterator completed, via error */
1211 return -1;
1213 return 0;
1216 static PyObject *
1217 dict_copy(register dictobject *mp)
1219 return PyDict_Copy((PyObject*)mp);
1222 PyObject *
1223 PyDict_Copy(PyObject *o)
1225 register dictobject *mp;
1226 register int i;
1227 dictobject *copy;
1228 dictentry *entry;
1230 if (o == NULL || !PyDict_Check(o)) {
1231 PyErr_BadInternalCall();
1232 return NULL;
1234 mp = (dictobject *)o;
1235 copy = (dictobject *)PyDict_New();
1236 if (copy == NULL)
1237 return NULL;
1238 if (mp->ma_used > 0) {
1239 if (dictresize(copy, mp->ma_used*2) != 0)
1240 return NULL;
1241 for (i = 0; i <= mp->ma_mask; i++) {
1242 entry = &mp->ma_table[i];
1243 if (entry->me_value != NULL) {
1244 Py_INCREF(entry->me_key);
1245 Py_INCREF(entry->me_value);
1246 insertdict(copy, entry->me_key, entry->me_hash,
1247 entry->me_value);
1251 return (PyObject *)copy;
1255 PyDict_Size(PyObject *mp)
1257 if (mp == NULL || !PyDict_Check(mp)) {
1258 PyErr_BadInternalCall();
1259 return 0;
1261 return ((dictobject *)mp)->ma_used;
1264 PyObject *
1265 PyDict_Keys(PyObject *mp)
1267 if (mp == NULL || !PyDict_Check(mp)) {
1268 PyErr_BadInternalCall();
1269 return NULL;
1271 return dict_keys((dictobject *)mp);
1274 PyObject *
1275 PyDict_Values(PyObject *mp)
1277 if (mp == NULL || !PyDict_Check(mp)) {
1278 PyErr_BadInternalCall();
1279 return NULL;
1281 return dict_values((dictobject *)mp);
1284 PyObject *
1285 PyDict_Items(PyObject *mp)
1287 if (mp == NULL || !PyDict_Check(mp)) {
1288 PyErr_BadInternalCall();
1289 return NULL;
1291 return dict_items((dictobject *)mp);
1294 /* Subroutine which returns the smallest key in a for which b's value
1295 is different or absent. The value is returned too, through the
1296 pval argument. Both are NULL if no key in a is found for which b's status
1297 differs. The refcounts on (and only on) non-NULL *pval and function return
1298 values must be decremented by the caller (characterize() increments them
1299 to ensure that mutating comparison and PyDict_GetItem calls can't delete
1300 them before the caller is done looking at them). */
1302 static PyObject *
1303 characterize(dictobject *a, dictobject *b, PyObject **pval)
1305 PyObject *akey = NULL; /* smallest key in a s.t. a[akey] != b[akey] */
1306 PyObject *aval = NULL; /* a[akey] */
1307 int i, cmp;
1309 for (i = 0; i <= a->ma_mask; i++) {
1310 PyObject *thiskey, *thisaval, *thisbval;
1311 if (a->ma_table[i].me_value == NULL)
1312 continue;
1313 thiskey = a->ma_table[i].me_key;
1314 Py_INCREF(thiskey); /* keep alive across compares */
1315 if (akey != NULL) {
1316 cmp = PyObject_RichCompareBool(akey, thiskey, Py_LT);
1317 if (cmp < 0) {
1318 Py_DECREF(thiskey);
1319 goto Fail;
1321 if (cmp > 0 ||
1322 i > a->ma_mask ||
1323 a->ma_table[i].me_value == NULL)
1325 /* Not the *smallest* a key; or maybe it is
1326 * but the compare shrunk the dict so we can't
1327 * find its associated value anymore; or
1328 * maybe it is but the compare deleted the
1329 * a[thiskey] entry.
1331 Py_DECREF(thiskey);
1332 continue;
1336 /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1337 thisaval = a->ma_table[i].me_value;
1338 assert(thisaval);
1339 Py_INCREF(thisaval); /* keep alive */
1340 thisbval = PyDict_GetItem((PyObject *)b, thiskey);
1341 if (thisbval == NULL)
1342 cmp = 0;
1343 else {
1344 /* both dicts have thiskey: same values? */
1345 cmp = PyObject_RichCompareBool(
1346 thisaval, thisbval, Py_EQ);
1347 if (cmp < 0) {
1348 Py_DECREF(thiskey);
1349 Py_DECREF(thisaval);
1350 goto Fail;
1353 if (cmp == 0) {
1354 /* New winner. */
1355 Py_XDECREF(akey);
1356 Py_XDECREF(aval);
1357 akey = thiskey;
1358 aval = thisaval;
1360 else {
1361 Py_DECREF(thiskey);
1362 Py_DECREF(thisaval);
1365 *pval = aval;
1366 return akey;
1368 Fail:
1369 Py_XDECREF(akey);
1370 Py_XDECREF(aval);
1371 *pval = NULL;
1372 return NULL;
1375 static int
1376 dict_compare(dictobject *a, dictobject *b)
1378 PyObject *adiff, *bdiff, *aval, *bval;
1379 int res;
1381 /* Compare lengths first */
1382 if (a->ma_used < b->ma_used)
1383 return -1; /* a is shorter */
1384 else if (a->ma_used > b->ma_used)
1385 return 1; /* b is shorter */
1387 /* Same length -- check all keys */
1388 bdiff = bval = NULL;
1389 adiff = characterize(a, b, &aval);
1390 if (adiff == NULL) {
1391 assert(!aval);
1392 /* Either an error, or a is a subset with the same length so
1393 * must be equal.
1395 res = PyErr_Occurred() ? -1 : 0;
1396 goto Finished;
1398 bdiff = characterize(b, a, &bval);
1399 if (bdiff == NULL && PyErr_Occurred()) {
1400 assert(!bval);
1401 res = -1;
1402 goto Finished;
1404 res = 0;
1405 if (bdiff) {
1406 /* bdiff == NULL "should be" impossible now, but perhaps
1407 * the last comparison done by the characterize() on a had
1408 * the side effect of making the dicts equal!
1410 res = PyObject_Compare(adiff, bdiff);
1412 if (res == 0 && bval != NULL)
1413 res = PyObject_Compare(aval, bval);
1415 Finished:
1416 Py_XDECREF(adiff);
1417 Py_XDECREF(bdiff);
1418 Py_XDECREF(aval);
1419 Py_XDECREF(bval);
1420 return res;
1423 /* Return 1 if dicts equal, 0 if not, -1 if error.
1424 * Gets out as soon as any difference is detected.
1425 * Uses only Py_EQ comparison.
1427 static int
1428 dict_equal(dictobject *a, dictobject *b)
1430 int i;
1432 if (a->ma_used != b->ma_used)
1433 /* can't be equal if # of entries differ */
1434 return 0;
1436 /* Same # of entries -- check all of 'em. Exit early on any diff. */
1437 for (i = 0; i <= a->ma_mask; i++) {
1438 PyObject *aval = a->ma_table[i].me_value;
1439 if (aval != NULL) {
1440 int cmp;
1441 PyObject *bval;
1442 PyObject *key = a->ma_table[i].me_key;
1443 /* temporarily bump aval's refcount to ensure it stays
1444 alive until we're done with it */
1445 Py_INCREF(aval);
1446 bval = PyDict_GetItem((PyObject *)b, key);
1447 if (bval == NULL) {
1448 Py_DECREF(aval);
1449 return 0;
1451 cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
1452 Py_DECREF(aval);
1453 if (cmp <= 0) /* error or not equal */
1454 return cmp;
1457 return 1;
1460 static PyObject *
1461 dict_richcompare(PyObject *v, PyObject *w, int op)
1463 int cmp;
1464 PyObject *res;
1466 if (!PyDict_Check(v) || !PyDict_Check(w)) {
1467 res = Py_NotImplemented;
1469 else if (op == Py_EQ || op == Py_NE) {
1470 cmp = dict_equal((dictobject *)v, (dictobject *)w);
1471 if (cmp < 0)
1472 return NULL;
1473 res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
1475 else
1476 res = Py_NotImplemented;
1477 Py_INCREF(res);
1478 return res;
1481 static PyObject *
1482 dict_has_key(register dictobject *mp, PyObject *key)
1484 long hash;
1485 register long ok;
1486 if (!PyString_CheckExact(key) ||
1487 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1488 hash = PyObject_Hash(key);
1489 if (hash == -1)
1490 return NULL;
1492 ok = (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
1493 return PyBool_FromLong(ok);
1496 static PyObject *
1497 dict_get(register dictobject *mp, PyObject *args)
1499 PyObject *key;
1500 PyObject *failobj = Py_None;
1501 PyObject *val = NULL;
1502 long hash;
1504 if (!PyArg_UnpackTuple(args, "get", 1, 2, &key, &failobj))
1505 return NULL;
1507 if (!PyString_CheckExact(key) ||
1508 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1509 hash = PyObject_Hash(key);
1510 if (hash == -1)
1511 return NULL;
1513 val = (mp->ma_lookup)(mp, key, hash)->me_value;
1515 if (val == NULL)
1516 val = failobj;
1517 Py_INCREF(val);
1518 return val;
1522 static PyObject *
1523 dict_setdefault(register dictobject *mp, PyObject *args)
1525 PyObject *key;
1526 PyObject *failobj = Py_None;
1527 PyObject *val = NULL;
1528 long hash;
1530 if (!PyArg_UnpackTuple(args, "setdefault", 1, 2, &key, &failobj))
1531 return NULL;
1533 if (!PyString_CheckExact(key) ||
1534 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1535 hash = PyObject_Hash(key);
1536 if (hash == -1)
1537 return NULL;
1539 val = (mp->ma_lookup)(mp, key, hash)->me_value;
1540 if (val == NULL) {
1541 val = failobj;
1542 if (PyDict_SetItem((PyObject*)mp, key, failobj))
1543 val = NULL;
1545 Py_XINCREF(val);
1546 return val;
1550 static PyObject *
1551 dict_clear(register dictobject *mp)
1553 PyDict_Clear((PyObject *)mp);
1554 Py_INCREF(Py_None);
1555 return Py_None;
1558 static PyObject *
1559 dict_pop(dictobject *mp, PyObject *args)
1561 long hash;
1562 dictentry *ep;
1563 PyObject *old_value, *old_key;
1564 PyObject *key, *deflt = NULL;
1566 if(!PyArg_UnpackTuple(args, "pop", 1, 2, &key, &deflt))
1567 return NULL;
1568 if (mp->ma_used == 0) {
1569 if (deflt) {
1570 Py_INCREF(deflt);
1571 return deflt;
1573 PyErr_SetString(PyExc_KeyError,
1574 "pop(): dictionary is empty");
1575 return NULL;
1577 if (!PyString_CheckExact(key) ||
1578 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1579 hash = PyObject_Hash(key);
1580 if (hash == -1)
1581 return NULL;
1583 ep = (mp->ma_lookup)(mp, key, hash);
1584 if (ep->me_value == NULL) {
1585 if (deflt) {
1586 Py_INCREF(deflt);
1587 return deflt;
1589 PyErr_SetObject(PyExc_KeyError, key);
1590 return NULL;
1592 old_key = ep->me_key;
1593 Py_INCREF(dummy);
1594 ep->me_key = dummy;
1595 old_value = ep->me_value;
1596 ep->me_value = NULL;
1597 mp->ma_used--;
1598 Py_DECREF(old_key);
1599 return old_value;
1602 static PyObject *
1603 dict_popitem(dictobject *mp)
1605 int i = 0;
1606 dictentry *ep;
1607 PyObject *res;
1609 /* Allocate the result tuple before checking the size. Believe it
1610 * or not, this allocation could trigger a garbage collection which
1611 * could empty the dict, so if we checked the size first and that
1612 * happened, the result would be an infinite loop (searching for an
1613 * entry that no longer exists). Note that the usual popitem()
1614 * idiom is "while d: k, v = d.popitem()". so needing to throw the
1615 * tuple away if the dict *is* empty isn't a significant
1616 * inefficiency -- possible, but unlikely in practice.
1618 res = PyTuple_New(2);
1619 if (res == NULL)
1620 return NULL;
1621 if (mp->ma_used == 0) {
1622 Py_DECREF(res);
1623 PyErr_SetString(PyExc_KeyError,
1624 "popitem(): dictionary is empty");
1625 return NULL;
1627 /* Set ep to "the first" dict entry with a value. We abuse the hash
1628 * field of slot 0 to hold a search finger:
1629 * If slot 0 has a value, use slot 0.
1630 * Else slot 0 is being used to hold a search finger,
1631 * and we use its hash value as the first index to look.
1633 ep = &mp->ma_table[0];
1634 if (ep->me_value == NULL) {
1635 i = (int)ep->me_hash;
1636 /* The hash field may be a real hash value, or it may be a
1637 * legit search finger, or it may be a once-legit search
1638 * finger that's out of bounds now because it wrapped around
1639 * or the table shrunk -- simply make sure it's in bounds now.
1641 if (i > mp->ma_mask || i < 1)
1642 i = 1; /* skip slot 0 */
1643 while ((ep = &mp->ma_table[i])->me_value == NULL) {
1644 i++;
1645 if (i > mp->ma_mask)
1646 i = 1;
1649 PyTuple_SET_ITEM(res, 0, ep->me_key);
1650 PyTuple_SET_ITEM(res, 1, ep->me_value);
1651 Py_INCREF(dummy);
1652 ep->me_key = dummy;
1653 ep->me_value = NULL;
1654 mp->ma_used--;
1655 assert(mp->ma_table[0].me_value == NULL);
1656 mp->ma_table[0].me_hash = i + 1; /* next place to start */
1657 return res;
1660 static int
1661 dict_traverse(PyObject *op, visitproc visit, void *arg)
1663 int i = 0, err;
1664 PyObject *pk;
1665 PyObject *pv;
1667 while (PyDict_Next(op, &i, &pk, &pv)) {
1668 err = visit(pk, arg);
1669 if (err)
1670 return err;
1671 err = visit(pv, arg);
1672 if (err)
1673 return err;
1675 return 0;
1678 static int
1679 dict_tp_clear(PyObject *op)
1681 PyDict_Clear(op);
1682 return 0;
1686 static PyObject *dictiter_new(dictobject *, binaryfunc);
1688 static PyObject *
1689 select_key(PyObject *key, PyObject *value)
1691 Py_INCREF(key);
1692 return key;
1695 static PyObject *
1696 select_value(PyObject *key, PyObject *value)
1698 Py_INCREF(value);
1699 return value;
1702 static PyObject *
1703 select_item(PyObject *key, PyObject *value)
1705 PyObject *res = PyTuple_New(2);
1707 if (res != NULL) {
1708 Py_INCREF(key);
1709 Py_INCREF(value);
1710 PyTuple_SET_ITEM(res, 0, key);
1711 PyTuple_SET_ITEM(res, 1, value);
1713 return res;
1716 static PyObject *
1717 dict_iterkeys(dictobject *dict)
1719 return dictiter_new(dict, select_key);
1722 static PyObject *
1723 dict_itervalues(dictobject *dict)
1725 return dictiter_new(dict, select_value);
1728 static PyObject *
1729 dict_iteritems(dictobject *dict)
1731 return dictiter_new(dict, select_item);
1735 PyDoc_STRVAR(has_key__doc__,
1736 "D.has_key(k) -> 1 if D has a key k, else 0");
1738 PyDoc_STRVAR(get__doc__,
1739 "D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.");
1741 PyDoc_STRVAR(setdefault_doc__,
1742 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D");
1744 PyDoc_STRVAR(pop__doc__,
1745 "D.pop(k[,d]) -> v, remove specified key and return the corresponding value\n\
1746 If key is not found, d is returned if given, otherwise KeyError is raised");
1748 PyDoc_STRVAR(popitem__doc__,
1749 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
1750 2-tuple; but raise KeyError if D is empty");
1752 PyDoc_STRVAR(keys__doc__,
1753 "D.keys() -> list of D's keys");
1755 PyDoc_STRVAR(items__doc__,
1756 "D.items() -> list of D's (key, value) pairs, as 2-tuples");
1758 PyDoc_STRVAR(values__doc__,
1759 "D.values() -> list of D's values");
1761 PyDoc_STRVAR(update__doc__,
1762 "D.update(E) -> None. Update D from E: for k in E.keys(): D[k] = E[k]");
1764 PyDoc_STRVAR(fromkeys__doc__,
1765 "dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v.\n\
1766 v defaults to None.");
1768 PyDoc_STRVAR(clear__doc__,
1769 "D.clear() -> None. Remove all items from D.");
1771 PyDoc_STRVAR(copy__doc__,
1772 "D.copy() -> a shallow copy of D");
1774 PyDoc_STRVAR(iterkeys__doc__,
1775 "D.iterkeys() -> an iterator over the keys of D");
1777 PyDoc_STRVAR(itervalues__doc__,
1778 "D.itervalues() -> an iterator over the values of D");
1780 PyDoc_STRVAR(iteritems__doc__,
1781 "D.iteritems() -> an iterator over the (key, value) items of D");
1783 static PyMethodDef mapp_methods[] = {
1784 {"has_key", (PyCFunction)dict_has_key, METH_O,
1785 has_key__doc__},
1786 {"get", (PyCFunction)dict_get, METH_VARARGS,
1787 get__doc__},
1788 {"setdefault", (PyCFunction)dict_setdefault, METH_VARARGS,
1789 setdefault_doc__},
1790 {"pop", (PyCFunction)dict_pop, METH_VARARGS,
1791 pop__doc__},
1792 {"popitem", (PyCFunction)dict_popitem, METH_NOARGS,
1793 popitem__doc__},
1794 {"keys", (PyCFunction)dict_keys, METH_NOARGS,
1795 keys__doc__},
1796 {"items", (PyCFunction)dict_items, METH_NOARGS,
1797 items__doc__},
1798 {"values", (PyCFunction)dict_values, METH_NOARGS,
1799 values__doc__},
1800 {"update", (PyCFunction)dict_update, METH_O,
1801 update__doc__},
1802 {"fromkeys", (PyCFunction)dict_fromkeys, METH_VARARGS | METH_CLASS,
1803 fromkeys__doc__},
1804 {"clear", (PyCFunction)dict_clear, METH_NOARGS,
1805 clear__doc__},
1806 {"copy", (PyCFunction)dict_copy, METH_NOARGS,
1807 copy__doc__},
1808 {"iterkeys", (PyCFunction)dict_iterkeys, METH_NOARGS,
1809 iterkeys__doc__},
1810 {"itervalues", (PyCFunction)dict_itervalues, METH_NOARGS,
1811 itervalues__doc__},
1812 {"iteritems", (PyCFunction)dict_iteritems, METH_NOARGS,
1813 iteritems__doc__},
1814 {NULL, NULL} /* sentinel */
1817 static int
1818 dict_contains(dictobject *mp, PyObject *key)
1820 long hash;
1822 if (!PyString_CheckExact(key) ||
1823 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1824 hash = PyObject_Hash(key);
1825 if (hash == -1)
1826 return -1;
1828 return (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
1831 /* Hack to implement "key in dict" */
1832 static PySequenceMethods dict_as_sequence = {
1833 0, /* sq_length */
1834 0, /* sq_concat */
1835 0, /* sq_repeat */
1836 0, /* sq_item */
1837 0, /* sq_slice */
1838 0, /* sq_ass_item */
1839 0, /* sq_ass_slice */
1840 (objobjproc)dict_contains, /* sq_contains */
1841 0, /* sq_inplace_concat */
1842 0, /* sq_inplace_repeat */
1845 static PyObject *
1846 dict_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
1848 PyObject *self;
1850 assert(type != NULL && type->tp_alloc != NULL);
1851 self = type->tp_alloc(type, 0);
1852 if (self != NULL) {
1853 PyDictObject *d = (PyDictObject *)self;
1854 /* It's guaranteed that tp->alloc zeroed out the struct. */
1855 assert(d->ma_table == NULL && d->ma_fill == 0 && d->ma_used == 0);
1856 INIT_NONZERO_DICT_SLOTS(d);
1857 d->ma_lookup = lookdict_string;
1858 #ifdef SHOW_CONVERSION_COUNTS
1859 ++created;
1860 #endif
1862 return self;
1865 static int
1866 dict_init(PyObject *self, PyObject *args, PyObject *kwds)
1868 PyObject *arg = NULL;
1869 int result = 0;
1871 if (!PyArg_UnpackTuple(args, "dict", 0, 1, &arg))
1872 result = -1;
1874 else if (arg != NULL) {
1875 if (PyObject_HasAttrString(arg, "keys"))
1876 result = PyDict_Merge(self, arg, 1);
1877 else
1878 result = PyDict_MergeFromSeq2(self, arg, 1);
1880 if (result == 0 && kwds != NULL)
1881 result = PyDict_Merge(self, kwds, 1);
1882 return result;
1885 static long
1886 dict_nohash(PyObject *self)
1888 PyErr_SetString(PyExc_TypeError, "dict objects are unhashable");
1889 return -1;
1892 static PyObject *
1893 dict_iter(dictobject *dict)
1895 return dictiter_new(dict, select_key);
1898 PyDoc_STRVAR(dictionary_doc,
1899 "dict() -> new empty dictionary.\n"
1900 "dict(mapping) -> new dictionary initialized from a mapping object's\n"
1901 " (key, value) pairs.\n"
1902 "dict(seq) -> new dictionary initialized as if via:\n"
1903 " d = {}\n"
1904 " for k, v in seq:\n"
1905 " d[k] = v\n"
1906 "dict(**kwargs) -> new dictionary initialized with the name=value pairs\n"
1907 " in the keyword argument list. For example: dict(one=1, two=2)");
1909 PyTypeObject PyDict_Type = {
1910 PyObject_HEAD_INIT(&PyType_Type)
1912 "dict",
1913 sizeof(dictobject),
1915 (destructor)dict_dealloc, /* tp_dealloc */
1916 (printfunc)dict_print, /* tp_print */
1917 0, /* tp_getattr */
1918 0, /* tp_setattr */
1919 (cmpfunc)dict_compare, /* tp_compare */
1920 (reprfunc)dict_repr, /* tp_repr */
1921 0, /* tp_as_number */
1922 &dict_as_sequence, /* tp_as_sequence */
1923 &dict_as_mapping, /* tp_as_mapping */
1924 dict_nohash, /* tp_hash */
1925 0, /* tp_call */
1926 0, /* tp_str */
1927 PyObject_GenericGetAttr, /* tp_getattro */
1928 0, /* tp_setattro */
1929 0, /* tp_as_buffer */
1930 Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
1931 Py_TPFLAGS_BASETYPE, /* tp_flags */
1932 dictionary_doc, /* tp_doc */
1933 (traverseproc)dict_traverse, /* tp_traverse */
1934 (inquiry)dict_tp_clear, /* tp_clear */
1935 dict_richcompare, /* tp_richcompare */
1936 0, /* tp_weaklistoffset */
1937 (getiterfunc)dict_iter, /* tp_iter */
1938 0, /* tp_iternext */
1939 mapp_methods, /* tp_methods */
1940 0, /* tp_members */
1941 0, /* tp_getset */
1942 0, /* tp_base */
1943 0, /* tp_dict */
1944 0, /* tp_descr_get */
1945 0, /* tp_descr_set */
1946 0, /* tp_dictoffset */
1947 (initproc)dict_init, /* tp_init */
1948 PyType_GenericAlloc, /* tp_alloc */
1949 dict_new, /* tp_new */
1950 PyObject_GC_Del, /* tp_free */
1953 /* For backward compatibility with old dictionary interface */
1955 PyObject *
1956 PyDict_GetItemString(PyObject *v, const char *key)
1958 PyObject *kv, *rv;
1959 kv = PyString_FromString(key);
1960 if (kv == NULL)
1961 return NULL;
1962 rv = PyDict_GetItem(v, kv);
1963 Py_DECREF(kv);
1964 return rv;
1968 PyDict_SetItemString(PyObject *v, const char *key, PyObject *item)
1970 PyObject *kv;
1971 int err;
1972 kv = PyString_FromString(key);
1973 if (kv == NULL)
1974 return -1;
1975 PyString_InternInPlace(&kv); /* XXX Should we really? */
1976 err = PyDict_SetItem(v, kv, item);
1977 Py_DECREF(kv);
1978 return err;
1982 PyDict_DelItemString(PyObject *v, const char *key)
1984 PyObject *kv;
1985 int err;
1986 kv = PyString_FromString(key);
1987 if (kv == NULL)
1988 return -1;
1989 err = PyDict_DelItem(v, kv);
1990 Py_DECREF(kv);
1991 return err;
1994 /* Dictionary iterator type */
1996 extern PyTypeObject PyDictIter_Type; /* Forward */
1998 typedef struct {
1999 PyObject_HEAD
2000 dictobject *di_dict; /* Set to NULL when iterator is exhausted */
2001 int di_used;
2002 int di_pos;
2003 binaryfunc di_select;
2004 } dictiterobject;
2006 static PyObject *
2007 dictiter_new(dictobject *dict, binaryfunc select)
2009 dictiterobject *di;
2010 di = PyObject_New(dictiterobject, &PyDictIter_Type);
2011 if (di == NULL)
2012 return NULL;
2013 Py_INCREF(dict);
2014 di->di_dict = dict;
2015 di->di_used = dict->ma_used;
2016 di->di_pos = 0;
2017 di->di_select = select;
2018 return (PyObject *)di;
2021 static void
2022 dictiter_dealloc(dictiterobject *di)
2024 Py_XDECREF(di->di_dict);
2025 PyObject_Del(di);
2028 static PyObject *dictiter_iternext(dictiterobject *di)
2030 PyObject *key, *value;
2032 if (di->di_dict == NULL)
2033 return NULL;
2035 if (di->di_used != di->di_dict->ma_used) {
2036 PyErr_SetString(PyExc_RuntimeError,
2037 "dictionary changed size during iteration");
2038 di->di_used = -1; /* Make this state sticky */
2039 return NULL;
2041 if (PyDict_Next((PyObject *)(di->di_dict), &di->di_pos, &key, &value))
2042 return (*di->di_select)(key, value);
2044 Py_DECREF(di->di_dict);
2045 di->di_dict = NULL;
2046 return NULL;
2049 PyTypeObject PyDictIter_Type = {
2050 PyObject_HEAD_INIT(&PyType_Type)
2051 0, /* ob_size */
2052 "dictionary-iterator", /* tp_name */
2053 sizeof(dictiterobject), /* tp_basicsize */
2054 0, /* tp_itemsize */
2055 /* methods */
2056 (destructor)dictiter_dealloc, /* tp_dealloc */
2057 0, /* tp_print */
2058 0, /* tp_getattr */
2059 0, /* tp_setattr */
2060 0, /* tp_compare */
2061 0, /* tp_repr */
2062 0, /* tp_as_number */
2063 0, /* tp_as_sequence */
2064 0, /* tp_as_mapping */
2065 0, /* tp_hash */
2066 0, /* tp_call */
2067 0, /* tp_str */
2068 PyObject_GenericGetAttr, /* tp_getattro */
2069 0, /* tp_setattro */
2070 0, /* tp_as_buffer */
2071 Py_TPFLAGS_DEFAULT, /* tp_flags */
2072 0, /* tp_doc */
2073 0, /* tp_traverse */
2074 0, /* tp_clear */
2075 0, /* tp_richcompare */
2076 0, /* tp_weaklistoffset */
2077 PyObject_SelfIter, /* tp_iter */
2078 (iternextfunc)dictiter_iternext, /* tp_iternext */
2079 0, /* tp_methods */
2080 0, /* tp_members */
2081 0, /* tp_getset */
2082 0, /* tp_base */
2083 0, /* tp_dict */
2084 0, /* tp_descr_get */
2085 0, /* tp_descr_set */