This commit was manufactured by cvs2svn to create tag 'r234c1'.
[python/dscho.git] / Doc / lib / libsets.tex
blob6d49b16e9d13a8c4ba4185eac419e527193f5ef7
1 \section{\module{sets} ---
2 Unordered collections of unique elements}
4 \declaremodule{standard}{sets}
5 \modulesynopsis{Implementation of sets of unique elements.}
6 \moduleauthor{Greg V. Wilson}{gvwilson@nevex.com}
7 \moduleauthor{Alex Martelli}{aleax@aleax.it}
8 \moduleauthor{Guido van Rossum}{guido@python.org}
9 \sectionauthor{Raymond D. Hettinger}{python@rcn.com}
11 \versionadded{2.3}
13 The \module{sets} module provides classes for constructing and manipulating
14 unordered collections of unique elements. Common uses include membership
15 testing, removing duplicates from a sequence, and computing standard math
16 operations on sets such as intersection, union, difference, and symmetric
17 difference.
19 Like other collections, sets support \code{\var{x} in \var{set}},
20 \code{len(\var{set})}, and \code{for \var{x} in \var{set}}. Being an
21 unordered collection, sets do not record element position or order of
22 insertion. Accordingly, sets do not support indexing, slicing, or
23 other sequence-like behavior.
25 Most set applications use the \class{Set} class which provides every set
26 method except for \method{__hash__()}. For advanced applications requiring
27 a hash method, the \class{ImmutableSet} class adds a \method{__hash__()}
28 method but omits methods which alter the contents of the set. Both
29 \class{Set} and \class{ImmutableSet} derive from \class{BaseSet}, an
30 abstract class useful for determining whether something is a set:
31 \code{isinstance(\var{obj}, BaseSet)}.
33 The set classes are implemented using dictionaries. As a result, sets
34 cannot contain mutable elements such as lists or dictionaries.
35 However, they can contain immutable collections such as tuples or
36 instances of \class{ImmutableSet}. For convenience in implementing
37 sets of sets, inner sets are automatically converted to immutable
38 form, for example, \code{Set([Set(['dog'])])} is transformed to
39 \code{Set([ImmutableSet(['dog'])])}.
41 \begin{classdesc}{Set}{\optional{iterable}}
42 Constructs a new empty \class{Set} object. If the optional \var{iterable}
43 parameter is supplied, updates the set with elements obtained from iteration.
44 All of the elements in \var{iterable} should be immutable or be transformable
45 to an immutable using the protocol described in
46 section~\ref{immutable-transforms}.
47 \end{classdesc}
49 \begin{classdesc}{ImmutableSet}{\optional{iterable}}
50 Constructs a new empty \class{ImmutableSet} object. If the optional
51 \var{iterable} parameter is supplied, updates the set with elements obtained
52 from iteration. All of the elements in \var{iterable} should be immutable or
53 be transformable to an immutable using the protocol described in
54 section~\ref{immutable-transforms}.
56 Because \class{ImmutableSet} objects provide a \method{__hash__()} method,
57 they can be used as set elements or as dictionary keys. \class{ImmutableSet}
58 objects do not have methods for adding or removing elements, so all of the
59 elements must be known when the constructor is called.
60 \end{classdesc}
63 \subsection{Set Objects \label{set-objects}}
65 Instances of \class{Set} and \class{ImmutableSet} both provide
66 the following operations:
68 \begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
69 \lineiii{len(\var{s})}{}{cardinality of set \var{s}}
71 \hline
72 \lineiii{\var{x} in \var{s}}{}
73 {test \var{x} for membership in \var{s}}
74 \lineiii{\var{x} not in \var{s}}{}
75 {test \var{x} for non-membership in \var{s}}
76 \lineiii{\var{s}.issubset(\var{t})}{\code{\var{s} <= \var{t}}}
77 {test whether every element in \var{s} is in \var{t}}
78 \lineiii{\var{s}.issuperset(\var{t})}{\code{\var{s} >= \var{t}}}
79 {test whether every element in \var{t} is in \var{s}}
81 \hline
82 \lineiii{\var{s}.union(\var{t})}{\var{s} | \var{t}}
83 {new set with elements from both \var{s} and \var{t}}
84 \lineiii{\var{s}.intersection(\var{t})}{\var{s} \&\ \var{t}}
85 {new set with elements common to \var{s} and \var{t}}
86 \lineiii{\var{s}.difference(\var{t})}{\var{s} - \var{t}}
87 {new set with elements in \var{s} but not in \var{t}}
88 \lineiii{\var{s}.symmetric_difference(\var{t})}{\var{s} \^\ \var{t}}
89 {new set with elements in either \var{s} or \var{t} but not both}
90 \lineiii{\var{s}.copy()}{}
91 {new set with a shallow copy of \var{s}}
92 \end{tableiii}
94 Note, the non-operator versions of \method{union()},
95 \method{intersection()}, \method{difference()}, and
96 \method{symmetric_difference()} will accept any iterable as an argument.
97 In contrast, their operator based counterparts require their arguments to
98 be sets. This precludes error-prone constructions like
99 \code{Set('abc') \&\ 'cbs'} in favor of the more readable
100 \code{Set('abc').intersection('cbs')}.
101 \versionchanged[Formerly all arguments were required to be sets]{2.3.1}
103 In addition, both \class{Set} and \class{ImmutableSet}
104 support set to set comparisons. Two sets are equal if and only if
105 every element of each set is contained in the other (each is a subset
106 of the other).
107 A set is less than another set if and only if the first set is a proper
108 subset of the second set (is a subset, but is not equal).
109 A set is greater than another set if and only if the first set is a proper
110 superset of the second set (is a superset, but is not equal).
112 The subset and equality comparisons do not generalize to a complete
113 ordering function. For example, any two disjoint sets are not equal and
114 are not subsets of each other, so \emph{all} of the following return
115 \code{False}: \code{\var{a}<\var{b}}, \code{\var{a}==\var{b}}, or
116 \code{\var{a}>\var{b}}.
117 Accordingly, sets do not implement the \method{__cmp__} method.
119 Since sets only define partial ordering (subset relationships), the output
120 of the \method{list.sort()} method is undefined for lists of sets.
122 The following table lists operations available in \class{ImmutableSet}
123 but not found in \class{Set}:
125 \begin{tableii}{c|l}{code}{Operation}{Result}
126 \lineii{hash(\var{s})}{returns a hash value for \var{s}}
127 \end{tableii}
129 The following table lists operations available in \class{Set}
130 but not found in \class{ImmutableSet}:
132 \begin{tableiii}{c|c|l}{code}{Operation}{Equivalent}{Result}
133 \lineiii{\var{s}.union_update(\var{t})}
134 {\var{s} |= \var{t}}
135 {return set \var{s} with elements added from \var{t}}
136 \lineiii{\var{s}.intersection_update(\var{t})}
137 {\var{s} \&= \var{t}}
138 {return set \var{s} keeping only elements also found in \var{t}}
139 \lineiii{\var{s}.difference_update(\var{t})}
140 {\var{s} -= \var{t}}
141 {return set \var{s} after removing elements found in \var{t}}
142 \lineiii{\var{s}.symmetric_difference_update(\var{t})}
143 {\var{s} \textasciicircum= \var{t}}
144 {return set \var{s} with elements from \var{s} or \var{t}
145 but not both}
147 \hline
148 \lineiii{\var{s}.add(\var{x})}{}
149 {add element \var{x} to set \var{s}}
150 \lineiii{\var{s}.remove(\var{x})}{}
151 {remove \var{x} from set \var{s}; raises KeyError if not present}
152 \lineiii{\var{s}.discard(\var{x})}{}
153 {removes \var{x} from set \var{s} if present}
154 \lineiii{\var{s}.pop()}{}
155 {remove and return an arbitrary element from \var{s}; raises
156 KeyError if empty}
157 \lineiii{\var{s}.clear()}{}
158 {remove all elements from set \var{s}}
159 \end{tableiii}
161 Note, the non-operator versions of \method{union_update()},
162 \method{intersection_update()}, \method{difference_update()}, and
163 \method{symmetric_difference_update()} will accept any iterable as
164 an argument.
165 \versionchanged[Formerly all arguments were required to be sets]{2.3.1}
168 \subsection{Example \label{set-example}}
170 \begin{verbatim}
171 >>> from sets import Set
172 >>> engineers = Set(['John', 'Jane', 'Jack', 'Janice'])
173 >>> programmers = Set(['Jack', 'Sam', 'Susan', 'Janice'])
174 >>> managers = Set(['Jane', 'Jack', 'Susan', 'Zack'])
175 >>> employees = engineers | programmers | managers # union
176 >>> engineering_management = engineers & managers # intersection
177 >>> fulltime_management = managers - engineers - programmers # difference
178 >>> engineers.add('Marvin') # add element
179 >>> print engineers
180 Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
181 >>> employees.issuperset(engineers) # superset test
182 False
183 >>> employees.union_update(engineers) # update from another set
184 >>> employees.issuperset(engineers)
185 True
186 >>> for group in [engineers, programmers, managers, employees]:
187 ... group.discard('Susan') # unconditionally remove element
188 ... print group
190 Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
191 Set(['Janice', 'Jack', 'Sam'])
192 Set(['Jane', 'Zack', 'Jack'])
193 Set(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack'])
194 \end{verbatim}
197 \subsection{Protocol for automatic conversion to immutable
198 \label{immutable-transforms}}
200 Sets can only contain immutable elements. For convenience, mutable
201 \class{Set} objects are automatically copied to an \class{ImmutableSet}
202 before being added as a set element.
204 The mechanism is to always add a hashable element, or if it is not
205 hashable, the element is checked to see if it has an
206 \method{__as_immutable__()} method which returns an immutable equivalent.
208 Since \class{Set} objects have a \method{__as_immutable__()} method
209 returning an instance of \class{ImmutableSet}, it is possible to
210 construct sets of sets.
212 A similar mechanism is needed by the \method{__contains__()} and
213 \method{remove()} methods which need to hash an element to check
214 for membership in a set. Those methods check an element for hashability
215 and, if not, check for a \method{__as_temporarily_immutable__()} method
216 which returns the element wrapped by a class that provides temporary
217 methods for \method{__hash__()}, \method{__eq__()}, and \method{__ne__()}.
219 The alternate mechanism spares the need to build a separate copy of
220 the original mutable object.
222 \class{Set} objects implement the \method{__as_temporarily_immutable__()}
223 method which returns the \class{Set} object wrapped by a new class
224 \class{_TemporarilyImmutableSet}.
226 The two mechanisms for adding hashability are normally invisible to the
227 user; however, a conflict can arise in a multi-threaded environment
228 where one thread is updating a set while another has temporarily wrapped it
229 in \class{_TemporarilyImmutableSet}. In other words, sets of mutable sets
230 are not thread-safe.