This document discusses difficult traps and pitfalls in PHP, and how to avoid,
work around, or at least understand them.
= `array_merge()` is Incredibly Slow When Merging a List of Arrays =

If you merge a list of arrays like this:

  COUNTEREXAMPLE, lang=php
  $result = array();
  foreach ($list_of_lists as $one_list) {
    $result = array_merge($result, $one_list);
  }

...your program now has a huge runtime because it generates a large number of
intermediate arrays and copies every element it has previously seen each time
through the loop.

In a libphutil environment, you can use @{function@libphutil:array_mergev}
instead.
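Outside libphutil, one way to avoid the repeated copying is to hand every array to a single `array_merge()` call. This is a minimal sketch: `merge_list_of_arrays()` is a hypothetical name, and it is not necessarily how `array_mergev()` is implemented.

```lang=php
function merge_list_of_arrays(array $list_of_lists) {
  // array_merge() requires at least one argument before PHP 7.4, so handle
  // the empty list explicitly.
  if (!$list_of_lists) {
    return array();
  }

  // array_values() drops the outer keys so they are never treated as
  // argument names; each inner array becomes one argument to array_merge().
  return call_user_func_array('array_merge', array_values($list_of_lists));
}

$merged = merge_list_of_arrays(array(array(1, 2), array(3), array(4, 5)));
// $merged is array(1, 2, 3, 4, 5)
```

Every element is copied once, rather than once per remaining loop iteration.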
= `var_export()` Hates Baby Animals =

If you try to `var_export()` an object that contains recursive references, your
program will terminate. You have no chance to intercept or react to this or
otherwise stop it from happening. Avoid `var_export()` unless you are certain
you have only simple data. You can use `print_r()` or `var_dump()` to display
complex variables safely.
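As a small illustration of the safe alternative, this sketch builds a `stdClass` object that refers to itself and dumps it with `print_r()`, which marks the cycle instead of following it. The variable names are illustrative only.

```lang=php
$node = new stdClass();
$node->self = $node; // The object now contains a reference to itself.

// Passing true as the second argument returns the output as a string
// instead of printing it.
$dump = print_r($node, true);

// The dump contains a "*RECURSION*" marker where the cycle was detected.
echo $dump;
```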
= `isset()`, `empty()` and Truthiness =

A value is "truthy" if it evaluates to true in an `if` clause:

  lang=php
  if ($value) {
    // $value is truthy.
  }

If a value is not truthy, it is "falsey". These values are falsey in PHP:

  lang=php
  null      // null
  0         // integer zero
  0.0       // float zero
  "0"       // string containing only "0"
  ""        // empty string
  false     // boolean false
  array()   // empty array

Disregarding some bizarre edge cases, all other values are truthy. Note that
because "0" is falsey, this sort of thing (intended to prevent users from making
empty comments) is wrong in PHP:

  COUNTEREXAMPLE, lang=php
  if ($comment_text) {
    make_comment($comment_text);
  }

This is wrong because it prevents users from making the comment "0". //THIS
COMMENT IS TOTALLY AWESOME AND I MAKE IT ALL THE TIME SO YOU HAD BETTER NOT
BREAK IT!!!// A better test is probably `strlen()`.
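A minimal sketch of the `strlen()` test, assuming the comment text is always a string. `has_comment_text()` is a hypothetical helper name.

```lang=php
function has_comment_text($comment_text) {
  // strlen() is 0 only for the empty string, so "0" still counts as a comment.
  return strlen($comment_text) > 0;
}

var_dump(has_comment_text(''));  // bool(false)
var_dump(has_comment_text('0')); // bool(true)
```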
In addition to truth tests with `if`, PHP has two special truthiness operators
which look like functions but aren't: `empty()` and `isset()`. These operators
help deal with undeclared variables.

In PHP, there are two major cases where you get undeclared variables -- either
you directly use a variable without declaring it:

  COUNTEREXAMPLE, lang=php
  function f() {
    if ($not_declared) {
      // ...
    }
  }

...or you index into an array with an index which may not exist:

  COUNTEREXAMPLE, lang=php
  function f(array $mystery) {
    if ($mystery['stuff']) {
      // ...
    }
  }

When you do either of these, PHP issues a warning. Avoid these warnings by
using `empty()` and `isset()` to do tests that are safe to apply to undeclared
variables.

`empty()` evaluates truthiness exactly opposite of `if()`. `isset()` returns
`true` for everything except `null`. This is the truth table:
| Value | `if()` | `empty()` | `isset()` |
|-------|--------|-----------|-----------|
| `null` | `false` | `true` | `false` |
| `0` | `false` | `true` | `true` |
| `0.0` | `false` | `true` | `true` |
| `"0"` | `false` | `true` | `true` |
| `""` | `false` | `true` | `true` |
| `false` | `false` | `true` | `true` |
| `array()` | `false` | `true` | `true` |
| Everything else | `true` | `false` | `true` |
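The table can be checked with a short script. This is only a sketch: `truth_row()` is a hypothetical helper, and `'zebra'` stands in for "everything else".

```lang=php
function truth_row($value) {
  return array(
    'if' => (bool)$value,
    'empty' => empty($value),
    'isset' => isset($value),
  );
}

$samples = array(null, 0, 0.0, '0', '', false, array(), 'zebra');
foreach ($samples as $sample) {
  $row = truth_row($sample);
  // Print one line per value, in the same column order as the table.
  printf(
    "%s: if=%s empty=%s isset=%s\n",
    json_encode($sample),
    var_export($row['if'], true),
    var_export($row['empty'], true),
    var_export($row['isset'], true));
}
```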
The value of these operators is that they accept undeclared variables and do
not issue a warning. Specifically, if you try to do this you get a warning:

```lang=php, COUNTEREXAMPLE
if ($not_previously_declared) {         // PHP Notice: Undefined variable!
  // ...
}
```

But these are fine:

```lang=php
if (empty($not_previously_declared)) {  // No notice, returns true.
  // ...
}
if (isset($not_previously_declared)) {  // No notice, returns false.
  // ...
}
```

So, `isset()` really means
`is_declared_and_is_set_to_something_other_than_null()`. `empty()` really means
`is_falsey_or_is_not_declared()`. Thus:
  - If a variable is known to exist, test falsiness with `if (!$v)`, not
    `empty()`. In particular, test for empty arrays with `if (!$array)`. There
    is no reason to ever use `empty()` on a declared variable.
  - When you use `isset()` on an array key, like `isset($array['key'])`, it
    will evaluate to "false" if the key exists but has the value `null`! Test
    for index existence with `array_key_exists()`.

Put another way, use `isset()` if you want to type `if ($value !== null)` but
are testing something that may not be declared. Use `empty()` if you want to
type `if (!$value)` but you are testing something that may not be declared.
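The array-key case is the one that most often surprises people, so here is a short sketch. The `$options` array and its keys are made up for illustration.

```lang=php
$options = array('timeout' => null);

var_dump(isset($options['timeout']));            // bool(false): value is null
var_dump(array_key_exists('timeout', $options)); // bool(true): key exists
var_dump(empty($options['missing']));            // bool(true), and no warning
```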
= `usort()`, `uksort()`, and `uasort()` are Slow =

This family of functions is often extremely slow for large datasets. You should
avoid them if at all possible. Instead, build an array which contains surrogate
keys that are naturally sortable with a function that uses native comparison
(e.g., `sort()`, `asort()`, `ksort()`, or `natcasesort()`). Sort this array
instead, and use it to reorder the original array.

In a libphutil environment, you can often do this easily with
@{function@libphutil:isort} or @{function@libphutil:msort}.
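Outside libphutil, the surrogate-key approach can be sketched as follows. `sort_by_surrogate_key()` and the `$people` data are hypothetical, and this is not necessarily how `isort()` or `msort()` are implemented.

```lang=php
function sort_by_surrogate_key(array $items, $get_sort_key) {
  // Build a parallel array of naturally sortable scalars, keyed like $items.
  $surrogates = array();
  foreach ($items as $key => $item) {
    $surrogates[$key] = call_user_func($get_sort_key, $item);
  }

  // asort() uses native comparison and preserves keys.
  asort($surrogates);

  // Rebuild the original items in the sorted key order.
  $sorted = array();
  foreach ($surrogates as $key => $ignored) {
    $sorted[$key] = $items[$key];
  }
  return $sorted;
}

$people = array(
  'alice' => array('age' => 34),
  'bob' => array('age' => 27),
  'carol' => array('age' => 41),
);
$by_age = sort_by_surrogate_key($people, function ($person) {
  return $person['age'];
});
// array_keys($by_age) is array('bob', 'alice', 'carol')
```

The callback runs once per element instead of once per comparison, which is where the savings come from.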
= `array_intersect()` and `array_diff()` are Also Slow =

These functions are much slower for even moderately large inputs than
`array_intersect_key()` and `array_diff_key()`, because they cannot make the
assumption that their inputs are unique scalars as the `key` varieties can.
Strongly prefer the `key` varieties.
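When the values are unique strings or integers, one way to use the `key` varieties is to turn the values into keys first. This is a sketch with made-up data:

```lang=php
$old = array('apple', 'banana', 'cherry');
$new = array('banana', 'cherry', 'durian');

// Turn the values into keys. This assumes every value is a unique string or
// integer, which is the assumption the `key` varieties rely on.
$old_map = array_fill_keys($old, true);
$new_map = array_fill_keys($new, true);

$kept = array_keys(array_intersect_key($old_map, $new_map));
$removed = array_keys(array_diff_key($old_map, $new_map));
// $kept is array('banana', 'cherry'); $removed is array('apple')
```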
= `array_uintersect()` and `array_udiff()` are Definitely Slow Too =

These functions have the problems of both the `usort()` family and the
`array_diff()` family. Avoid them.
= `foreach()` Does Not Create Scope =

Variables survive outside of the scope of `foreach()`. More problematically,
references survive outside of the scope of `foreach()`. This code mutates
`$array` because the reference leaks from the first loop to the second:

```lang=php, COUNTEREXAMPLE
$array = range(1, 3);
echo implode(',', $array); // Outputs '1,2,3'
foreach ($array as &$value) {}
echo implode(',', $array); // Outputs '1,2,3'
foreach ($array as $value) {}
echo implode(',', $array); // Outputs '1,2,2'
```
The easiest way to avoid this is to avoid using foreach-by-reference. If you do
use it, unset the reference after the loop:

```lang=php
foreach ($array as &$value) {
  // ...
}
unset($value);
```
= `unserialize()` is Incredibly Slow on Large Datasets =

The performance of `unserialize()` is nonlinear in the number of zvals you
unserialize, roughly `O(N^2)`.

| zvals | Approximate time |
|-------|------------------|
| 1000000 | 8,000ms |
| 10000000 | 72 billion years |
= `call_user_func()` Breaks References =

If you use `call_user_func()` to invoke a function which takes parameters by
reference, the variables you pass in will have their references broken and will
emerge unmodified. That is, if you have a function that takes references:

```lang=php
function add_one(&$v) {
  $v++;
}
```

...and you call it with `call_user_func()`:

```lang=php, COUNTEREXAMPLE
$x = 41;
call_user_func('add_one', $x);
```

...`$x` will not be modified. The solution is to use `call_user_func_array()`
and wrap the reference in an array:

```lang=php
$x = 41;
call_user_func_array(
  'add_one',
  array(&$x)); // Note '&$x'!
```

This will work as expected.
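Put together as one self-contained script, the working form looks like this. The second half shows an alternative that the text above does not mention: calling through a variable that holds the function name, which also passes the argument by reference.

```lang=php
function add_one(&$v) {
  $v++;
}

$x = 41;
call_user_func_array('add_one', array(&$x));
echo $x; // Outputs '42'.

// Calling through a variable holding the function name also keeps the
// reference intact.
$function = 'add_one';
$y = 41;
$function($y);
echo $y; // Outputs '42'.
```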
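= You Can't Throw From `__toString()` =

If you throw from `__toString()`, your program will terminate uselessly and you
won't get the exception. PHP 7.4 and later allow `__toString()` to throw, but
code that must also run on older versions cannot rely on this.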
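One defensive pattern is to catch inside `__toString()` and return a fallback string, so nothing ever escapes. This is a sketch only; `SafeLabel` and its renderer callback are hypothetical.

```lang=php
final class SafeLabel {
  private $renderer;

  public function __construct($renderer) {
    $this->renderer = $renderer;
  }

  public function __toString() {
    try {
      return (string)call_user_func($this->renderer);
    } catch (Exception $ex) {
      // Never let the exception escape; report it in the string instead.
      return '[render failed: '.$ex->getMessage().']';
    }
  }
}

$label = new SafeLabel(function () {
  throw new Exception('no title');
});
echo $label; // Outputs '[render failed: no title]'.
```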
= An Object Can Have Any Scalar as a Property =

Object properties are not limited to legal variable names:

```lang=php
$obj = new stdClass();
$property = '!@#$%^&*()';
$obj->$property = 'zebra';
echo $obj->$property; // Outputs 'zebra'.
```

So, don't make assumptions about property names.
= There is an `(object)` Cast =

You can cast a dictionary into an object.

```lang=php
$obj = (object)array('flavor' => 'coconut');
echo $obj->flavor;      // Outputs 'coconut'.
echo get_class($obj);   // Outputs 'stdClass'.
```

This is occasionally useful, mostly to force an object to become a JavaScript
dictionary (vs a list) when passed to `json_encode()`.
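The difference shows up most clearly with empty and list-like arrays, as in this short sketch:

```lang=php
echo json_encode(array());         // Outputs '[]'.
echo json_encode((object)array()); // Outputs '{}'.

$list = array('a', 'b');
echo json_encode($list);           // Outputs '["a","b"]'.
echo json_encode((object)$list);   // Outputs '{"0":"a","1":"b"}'.
```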
= Invoking `new` With an Argument Vector is Really Hard =

If you have some `$class_name` and some `$argv` of constructor arguments
and you want to do this:

```lang=php
new $class_name($argv[0], $argv[1], ...);
```

...you'll probably invent a very interesting, very novel solution that is very
wrong. In a libphutil environment, solve this problem with
@{function@libphutil:newv}. Elsewhere, copy `newv()`'s implementation.
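If copying is not an option, a reflection-based sketch is one possibility. `new_with_argv()` is a hypothetical name, this is not claimed to be how `newv()` works, and it has not been checked against every edge case that `newv()` handles.

```lang=php
function new_with_argv($class_name, array $argv) {
  $reflector = new ReflectionClass($class_name);
  if (!$argv) {
    // newInstance() with no arguments also works for classes that do not
    // declare a constructor.
    return $reflector->newInstance();
  }
  // array_values() keeps string keys from being treated as argument names.
  return $reflector->newInstanceArgs(array_values($argv));
}

$object = new_with_argv('ArrayObject', array(array(1, 2, 3)));
echo get_class($object); // Outputs 'ArrayObject'.
echo count($object);     // Outputs '3'.
```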
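= Equality is not Transitive =

This isn't terribly surprising since equality isn't transitive in a lot of
languages, but the `==` operator is not transitive:

```lang=php
$a = ''; $b = 0; $c = '0a';
$a == $b; // true
$b == $c; // true
$c == $a; // false!
```

In PHP 7 and earlier, when either operand is an integer, the other operand is
cast to an integer before comparison. PHP 8 compares an integer with a
non-numeric string as strings instead, so both `true` results above become
`false` there. Avoid this and similar pitfalls by using the `===` operator,
which never casts its operands.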
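For comparison, here is a short sketch of the same three values under strict comparison, which gives the same results on every PHP version. The `in_array()` line is an extra illustration of a built-in that uses `==` unless told otherwise.

```lang=php
$a = ''; $b = 0; $c = '0a';

var_dump($a === $b); // bool(false): string vs. integer
var_dump($b === $c); // bool(false): integer vs. string
var_dump($a === $c); // bool(false): different strings

// in_array() compares with == unless the third argument is true.
var_dump(in_array('0a', array(0, 1), true)); // bool(false)
```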
= All 676 Letters in the Alphabet =

This doesn't do what you'd expect it to do in C:

```lang=php, COUNTEREXAMPLE
for ($c = 'a'; $c <= 'z'; $c++) {
  // ...
}
```

This is because the successor to `z` is `aa`, which is "less than" `z`.
The loop body runs 676 times, for `a` through `z` and then `aa` through `yz`,
and terminates when `$c` reaches `za`. That is, `$c` will take on these values:

  a
  b
  ...
  y
  z
  aa // loop continues because 'aa' <= 'z'
  ab
  ...
  yy
  yz
  za // loop now terminates because 'za' > 'z'

Instead, use this loop:

```lang=php
foreach (range('a', 'z') as $c) {
  // ...
}
```