PHP optimisation tips

Some tips for developers who want to make their code faster.

Benchmarking

Confirm any proposed optimisation by benchmarking it. The steps are documented at wikitech:Performance/Runbook/Measure backend performance#Benchmarking.
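
For quick local comparisons, a rough micro-benchmark can be sketched with hrtime() (PHP 7.3+). The helper below is illustrative only, with an arbitrary iteration count, and is not a substitute for the runbook procedure above.

function benchmark( callable $fn, int $iterations = 100000 ): float {
    $start = hrtime( true );
    for ( $i = 0; $i < $iterations; $i++ ) {
        $fn();
    }
    // Average time per call, in milliseconds
    return ( hrtime( true ) - $start ) / $iterations / 1e6;
}

// Example: time a single candidate operation
printf( "%.4f ms per call\n", benchmark( static function () {
    array_merge( [ 1, 2, 3 ], [ 4, 5, 6 ] );
} ) );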

Array separation

Arrays in PHP have copy-on-write semantics. When you modify an array, if the reference count is one, the modification is done in place and is fast. If the reference count is more than one, the array needs to be copied before the modification can take place. This is termed "separation" in the PHP source, since you start with one value and separate it so that there are two values.

Array separation means that innocuous-looking code can be surprisingly slow.

function getFirstElement( $array ) {
    // reset() modifies the internal iteration pointer, which forces the
    // shared million-element array to be copied (separated) first.
    return reset( $array );
}
$array = range( 0, 1000000 );
getFirstElement( $array );

Modification of the iteration pointer by reset() requires separation, so the getFirstElement() call takes 14ms.
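
One way to avoid the copy, assuming PHP 7.3+ for array_key_first() (see the list of observations below), is to read the first element without touching the iteration pointer. A minimal sketch:

function getFirstElement( $array ) {
    // array_key_first() does not move the iteration pointer, so the
    // shared array is never separated.
    return $array === [] ? null : $array[ array_key_first( $array ) ];
}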

Anything that takes an array as input and returns a modified version of it will require O(N) time, since copy-on-write means the whole array must be duplicated before it can be changed.

$a = [];
for ( $i = 0; $i < 1000; $i++ ) {
    // Each iteration copies the entire accumulated array before appending
    $a = array_merge( $a, [ 1 ] );
}

The above code snippet requires n(n+1)/2 = 500500 single-element operations for a running time of 1ms. The alternative using $a[] = 1 is 50 times faster.
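
For comparison, this is the append-in-place version mentioned above; $a[] = 1 never copies the accumulated array, since $a keeps a reference count of one.

$a = [];
for ( $i = 0; $i < 1000; $i++ ) {
    // Appends in place; no separation is needed
    $a[] = 1;
}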

Some observations about built-in functions:

  • reset() and end() should be avoided. In PHP 7.3+, array_key_first() and array_key_last() are available as alternatives.
  • array_merge() should be replaced by a loop when the result replaces the first argument, especially if the first argument is large.
  • array_pop() and array_key_last() are fast as long as there are not too many holes at the end of the array.
  • array_push() is O(1), although $a[] = ... is faster unless there is a very large number of arguments.
  • array_splice() is slow despite its apparent in-place semantics. It always copies its input arguments (see the sketch after this list).
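
As a small illustration of the array_pop() and array_splice() observations above, removing the final element with array_splice() copies the whole array, while array_pop() does not:

$a = range( 0, 1000000 );

// Looks like an in-place edit, but array_splice() copies its input: O(N)
array_splice( $a, count( $a ) - 1 );

// Removes the last element without copying the array: O(1)
array_pop( $a );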

Constant factors

PHP code is compiled to an array of operations (an oparray). The PHP VM traverses the oparray, executing each opcode as it finds it. Some ops are faster than others.

  • Local variable access is heavily optimised and is generally fast. The only slow thing you can do with local variables is accessing them by name, e.g. $$varname = 1, which builds a hashtable of local variables the first time such code is encountered in a function.
  • Function calls are relatively slow due to the need to initialise a new stack frame. Userspace function calls are slightly slower than built-in function calls.
  • Some things that look like functions are actually special opcodes, which makes them faster. For example, count(), strlen(), isset() and empty() are fast. Starting with PHP 7.4, array_key_exists() has been optimised to be as fast as, or faster than, isset(). Call it as \array_key_exists(), or with a function import, to benefit.
  • Object construction is comparable to a function call.
  • Access to declared properties of an object is pretty fast, since this has been heavily optimised. It is faster to access a declared property than an undeclared property or an element of an associative array. But in a tight loop it can still be worthwhile to copy an object property to a local variable (see the sketch after this list).
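
A minimal sketch of that last point; the class and method here are hypothetical, and whether the copy actually helps should be confirmed by benchmarking.

class WeightedSum {
    /** @var int Declared property: cheap to access, but still one fetch opcode per access */
    public $step = 1;

    public function total( array $values ): int {
        // Copy the property into a local variable once, instead of
        // fetching $this->step on every iteration of the loop.
        $step = $this->step;
        $sum = 0;
        foreach ( $values as $value ) {
            $sum += $value * $step;
        }
        return $sum;
    }
}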

The PHP compiler is not as smart as a C compiler because it operates under strict time constraints. Do not assume the PHP compiler is going to help you out by optimising away your slow code. With a few exceptions, what you write is what it executes.

Caching and memoization

Memoization means caching the result of a function call in a way that is transparent to the caller. It is often an easy and effective way to improve performance.
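
A minimal memoization sketch; the class, method, and computation below are hypothetical placeholders rather than an existing API.

class SlugFormatter {
    /** @var string[] Cached results, keyed by input title */
    private $cache = [];

    public function format( string $title ): string {
        // Transparent to the caller: the first call computes the value,
        // subsequent calls with the same input return the cached result.
        if ( !isset( $this->cache[ $title ] ) ) {
            $this->cache[ $title ] = $this->computeSlug( $title );
        }
        return $this->cache[ $title ];
    }

    private function computeSlug( string $title ): string {
        // Stand-in for an expensive computation
        return strtolower( str_replace( ' ', '_', $title ) );
    }
}

Note that isset() treats a cached null as a miss; use array_key_exists() instead if null is a valid result.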

The optimisation operator

Most languages have an optimisation operator. The optimisation operator makes any code faster when the operator is placed in front of the code in question.

 // $this->slowFunction();
 # $this->slowFunction();

In other words, it is faster to not do a thing than to do it. Engineers need to push back against expensive requirements. Product managers do not necessarily understand the cost of a requirement in terms of user-observed latency or hardware cost.

Abstraction

Wirth's law states that software becomes slower as hardware becomes faster. Growing layers of abstraction and added features keep pace with hardware improvements, so the benefit of faster hardware is never seen by users. In fact, latency tends to increase over time.

Wirth's law is inherent in the way engineers think and operate: they cannot resist the dopamine hit that comes from introducing a neat abstraction. This tendency must be consciously and continuously fought to preserve a reasonable user experience.