I love ❤️ generators in PHP. They are like supercharged arrays that can preserve memory when used correctly. I've been
type-hinting `iterable` instead of `array` ever since I learned about them.
Generators are callback iterators
Generators are simple functions. But where a regular function returns a single value (or nothing at all, with `void`),
a generator can produce multiple results. All you have to do to turn a function into a generator is replace `return`
with `yield`, and then call it.
A generator is an `iterable`, meaning you have to loop over it in order to retrieve the results. You can simply
`foreach` over a generator, and it will hand you every value it yields.
function exampleGenerator()
{
    yield 1;
    yield 2;
    yield 3;
}

$generator = exampleGenerator();

foreach ($generator as $value) {
    echo $value;
}
// will echo out: 123
Notice that we actually call the function to return the generator. In this example it's pretty obvious we need to do
that, but consider an anonymous function that is stored in the `$generator` variable. You might accidentally try to
iterate over the function itself instead of calling it.
$generator = function () {
    yield 1;
    yield 2;
    yield 3;
};

// Incorrect: $generator is still an uncalled function.
foreach ($generator as $value) {
    // ...
}

// Correct: $generator() is now a `Generator` object.
foreach ($generator() as $value) {
    // ...
}
Advantages of generators over arrays
While creating a function that yields 1, 2, 3 is very impressive, it's not really practical. So let's look at some reasons why you might consider using generators.
Generators are only executed when you start iterating
This might not seem like a big deal, but it actually is. Say you have a `ChoiceField` object that has
`array $options`, and you have to retrieve these options from a database. When the field is rendered, it obviously
needs to show those options. But even when the options are never rendered during a request, the database call will
still be performed just to instantiate the field.
However, if you change `array $options` into `iterable $options` and provide the options via a generator, the database
call will only be executed when you `foreach` over those options.
$options = function () {
    foreach (DB::query('retrieve the options') as $option) {
        yield $option;
    }
};

$field = new ChoiceField($options());
So calling the function will return the generator, but it will not execute until you start iterating.
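To make that lazy behaviour concrete, here is a minimal sketch. The body of `ChoiceField`, the `fetchOptions()` function and the `$log` object are hypothetical stand-ins (the real field and the `DB::query()` call aren't shown in this post); the log only records when the "database" is actually hit.

```php
<?php

// Hypothetical stand-in for the database call; it records when it really runs.
function fetchOptions(ArrayObject $log): Generator
{
    $log->append('query executed');

    yield from ['red', 'green', 'blue'];
}

// Hypothetical minimal field: it stores the iterable without touching it.
final class ChoiceField
{
    /** @var iterable */
    private $options;

    public function __construct(iterable $options)
    {
        $this->options = $options;
    }

    public function render(): string
    {
        $html = '';
        foreach ($this->options as $option) {
            $html .= "<option>{$option}</option>";
        }

        return $html;
    }
}

$log = new ArrayObject();

$field = new ChoiceField(fetchOptions($log));
$queriesAfterConstruct = count($log); // 0: the generator body has not started

$html = $field->render();
$queriesAfterRender = count($log); // 1: the body ran once we iterated
```

If `render()` is never called, the log stays empty, which is exactly the request where you wanted to skip the query.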
Tip: If you already have an iterable result set, like an `array` or any other `iterable`, you can use
`yield from $results`. This will in essence `foreach` over all the results and `yield` every one of them.
// Use `yield from` instead of looping the results.
$options = (function () {
    yield from DB::query('retrieve the options');
})(); // Notice we called the function directly to return the generator.

// Or shorthand
$options = (fn() => yield from DB::query('retrieve the options'))();
Generators preserve memory
Besides not performing any work until you iterate, a generator only yields one result at a time, so only the current result has to be held in memory at any moment.
$options = (function () {
    $results = DB::query('retrieve the options');

    foreach ($results as $result) {
        // This way there is only one `Option` in memory at all times.
        yield Option::createFromResult($result);
    }

    unset($results);
})();
In this example we retrieve a simple result set from a database query. Only when we `yield` a result do we build the
`Option` model that represents it. Compared to building every `Option` up front, this saves a lot of memory.
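One rough way to see the difference is to count how many objects are alive at the same time. The sketch below is an assumption-heavy illustration: `TrackedOption`, `lazyOptions()` and `eagerOptions()` are hypothetical names, and the class only exists to count live instances through its constructor and destructor.

```php
<?php

// Hypothetical model that tracks how many instances exist at once.
final class TrackedOption
{
    public static int $alive = 0;
    public static int $peak = 0;

    public function __construct()
    {
        self::$alive++;
        self::$peak = max(self::$peak, self::$alive);
    }

    public function __destruct()
    {
        self::$alive--;
    }
}

// Builds one object per iteration.
function lazyOptions(int $count): Generator
{
    for ($i = 0; $i < $count; $i++) {
        yield new TrackedOption();
    }
}

// Builds every object before returning anything.
function eagerOptions(int $count): array
{
    $options = [];
    for ($i = 0; $i < $count; $i++) {
        $options[] = new TrackedOption();
    }

    return $options;
}

foreach (lazyOptions(1000) as $option) {
    // Work with a single option here.
}
unset($option);
$lazyPeak = TrackedOption::$peak; // stays tiny, no matter how large $count is

TrackedOption::$peak = 0;
foreach (eagerOptions(1000) as $option) {
    // Same loop, but all 1000 objects already exist.
}
unset($option);
$eagerPeak = TrackedOption::$peak; // one live object per result
```

The lazy peak can be slightly above one, because the loop variable still holds the previous object while the generator builds the next one. The point is that it does not grow with the size of the result set.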
Code can be executed after returning the results
You might have noticed that we casually called `unset($results)` after we yielded the results. This works because
the generator keeps running until it no longer yields any results, unlike a `return` statement, where the function
ends immediately. That's pretty awesome. This way you can even clean up some leftover memory consumption after your
generator finishes. Keep in mind that this code only runs when the consumer iterates all the way to the end.
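Here is a small sketch of when that trailing code runs. The `letters()` generator and the log objects are made up for this illustration; the log just records the order of events.

```php
<?php

// Hypothetical generator that logs when its trailing cleanup code runs.
function letters(ArrayObject $log): Generator
{
    yield 'a';
    yield 'b';

    // Only reached once the consumer asks for another value after 'b'.
    $log->append('cleanup');
}

$fullLog = new ArrayObject();
foreach (letters($fullLog) as $letter) {
    $fullLog->append("consumed {$letter}");
}
// $fullLog now holds: 'consumed a', 'consumed b', 'cleanup'

$partialLog = new ArrayObject();
foreach (letters($partialLog) as $letter) {
    $partialLog->append("consumed {$letter}");
    break; // stop early: the code after the last yield never runs
}
// $partialLog now holds only: 'consumed a'
```

So the cleanup happens after the last value has been consumed, and it is skipped entirely when the loop stops early.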
Keys can be reused
When you `yield` a result, it gets an implicit 0-based numeric key. You can however yield both a key and a value by
using the `=>` arrow.
// Without keys.
function fruits()
{
    yield 'apple';
    yield 'banana';
    yield 'peach';
}

foreach (fruits() as $key => $fruit) {
    // Here $key will be 0, 1, 2
}

// With keys.
function keyedFruits()
{
    yield 'zero' => 'apple';
    yield 'one' => 'banana';
    yield 'two' => 'peach';
    yield 'two' => 'lime';
}

foreach (keyedFruits() as $key => $fruit) {
    // Here $key will be 'zero', 'one', 'two', 'two'
}
Notice how we yielded the same key twice? Unlike with an array, this is no problem during iteration. However, if you
were to turn the generator back into an array by using `iterator_to_array()`, the key would be there only once,
holding the last result for that key.
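A quick sketch of that behaviour, using a copy of the keyed fruit generator (named `keyedFruits()` here just for this example). Passing `false` as the second argument of `iterator_to_array()` tells it not to preserve keys, which keeps every value.

```php
<?php

function keyedFruits(): Generator
{
    yield 'zero' => 'apple';
    yield 'one' => 'banana';
    yield 'two' => 'peach';
    yield 'two' => 'lime';
}

// Keys are preserved by default, so the second 'two' overwrites the first.
$withKeys = iterator_to_array(keyedFruits());
// ['zero' => 'apple', 'one' => 'banana', 'two' => 'lime']

// Without preserving keys, the values are reindexed and nothing is lost.
$withoutKeys = iterator_to_array(keyedFruits(), false);
// ['apple', 'banana', 'peach', 'lime']
```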
Things to consider when using generators over arrays
While generators behave very similarly to arrays, they are not of the `array` type. This means you can run into the
following caveats.
Array functions will not work with generators
PHP's `array_` functions all require an actual array. So you cannot, for example, simply call `array_map()` with your
generator. To remedy this, you can use `iterator_to_array()` to turn your generator into an array. This will however
reintroduce the memory usage of arrays.
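If you'd rather stay lazy, one alternative is to write the mapping as a generator yourself. The `lazyMap()` helper below is not a PHP built-in; it is a hypothetical sketch of what a lazy counterpart to `array_map()` could look like, and `numbers()` is just sample input.

```php
<?php

// Hypothetical helper: applies the callback to each item only when it is requested.
function lazyMap(callable $callback, iterable $items): Generator
{
    foreach ($items as $key => $item) {
        yield $key => $callback($item);
    }
}

function numbers(): Generator
{
    yield 1;
    yield 2;
    yield 3;
}

// Nothing has been mapped yet; $doubled is just another generator.
$doubled = lazyMap(fn (int $n): int => $n * 2, numbers());

$result = [];
foreach ($doubled as $value) {
    $result[] = $value;
}
// $result is [2, 4, 6]
```

Because it accepts any `iterable`, the same helper works for arrays and generators alike.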
Tip: You might use `iterator_apply()` to perform a callback on the yielded results, but this is not recommended, as this function does not return an iterator itself or any of the results. It only invokes a callback for every iteration, and the callback doesn't receive the result. You have to provide the iterator as an argument yourself, and you can then retrieve the `current()` iteration. It's not worth it.
The count of a generator is not predefined
Since we can `yield` as many results as we want, and the generator only holds one result in memory at a time, it's not
possible to count the results without traversing them. To ease this process you can use `iterator_count()`. This will
loop over every result and return the actual count.
A Generator instance can only be traversed once
When a generator finishes, it closes itself. Once this happens, you can't traverse it again. When you try to do so, you
will run into this exception: `Cannot traverse an already closed generator`.
A solution to this could be to call the generator function again. However, you should probably refactor your code to prevent this.
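As a small illustration (the `numbersOnce()` function is made up for this example), the second loop over the same `Generator` instance throws, while calling the function again gives you a fresh generator:

```php
<?php

function numbersOnce(): Generator
{
    yield 1;
    yield 2;
    yield 3;
}

$generator = numbersOnce();

$first = [];
foreach ($generator as $value) {
    $first[] = $value;
}

// Looping over the same, now finished, instance throws an exception.
$error = null;
try {
    foreach ($generator as $value) {
        // Never reached.
    }
} catch (Exception $exception) {
    $error = $exception->getMessage();
}

// Calling the function again creates a brand new generator.
$second = [];
foreach (numbersOnce() as $value) {
    $second[] = $value;
}
```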
Note: `iterator_count()` also closes the generator, so you can't do a count and then loop. You should probably just
keep a record of the count while iterating.
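Keeping that record can be as simple as a counter inside the loop. A minimal sketch, with a made-up `countedFruits()` generator:

```php
<?php

function countedFruits(): Generator
{
    yield 'apple';
    yield 'banana';
    yield 'peach';
}

$count = 0;
$seen = [];

foreach (countedFruits() as $fruit) {
    $seen[] = $fruit; // do the actual work here
    $count++;         // and keep track of how many results passed by
}

// For comparison: iterator_count() on a fresh generator gives the same number,
// but that generator instance is used up afterwards.
$countedUpFront = iterator_count(countedFruits());
```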
In conclusion
Obviously arrays have their time and place. I'd never use a generator to create a simple list. But whenever I'm working with objects or entity models, I like to use generators to limit the memory usage.
Learned anything new? Don't keep it to yourself, but share it on social media! And if you have any questions or remarks, let me know via Twitter.