# (Singly) Linked Lists explained in PHP
*Quick & efficient insertion and deletion*

June 22, 2022 — by Doeke Norg

---

As one of the most common data structures, the linked list has to be one of the simplest in concept; yet still very
powerful. In this post we will be looking at what linked lists are; what types of linked lists there are; when to use a
linked list; and whether or why a linked list might be better than an array. We'll even build a simple linked list.
Let's go!

## What are Linked lists?

A linked list is a data structure; a way of organizing data that can be used in a particular way. In its most basic form
a linked list is a single `node` that holds a value, as well as an (optional) reference to another single node. Through
this reference (or pointer) these nodes are linked to each other creating a list. The first node of a list is often
called the **head**, and it represents the entry point of a linked list. The **tail** can either refer to the last node,
or the rest of the list after the head.

## Types of lists

There are a few different types of linked lists; all with their own abilities.

- A **Singly Linked List** is the most basic linked list. With this type every node has a *single* pointer to the next
  node. This makes it uni-directional, as you can only traverse from the **head** to the last node.
- A **Circular Linked List** is just like a singly linked list, but the last node points to the **head**; making the
  list circular. With this list you can start at any node and still traverse the entire list.
- A **Doubly Linked List** has two pointers per node; one to the **next** node, and one to the **previous** node. This
  list is bidirectional, as you can traverse it both ways.
- In a **Circular Doubly Linked List** The first node also points to the last node, making it doubly linked and
  circular.

> In this post, we will primarily be focussing on the **singly linked lists**.

## Strengths of (singly) linked lists

Linked lists are often compared to array's, which in itself is of course a list of some sorts. But the way an array
works has some limitations / drawbacks. For example; it uses adjoining memory locations to store its values. This makes
it super efficient in returning values, as it can look it up very fast. But when it comes to adding, inserting or
removing items it isn't as fast. For deletion, it will mark the memory slot as `vacant`, so it wil skip these slots when
iterating; but the memory isn't freed up. When inserting a value, it needs to allocate more memory, and move every value
that comes after the insertion. When inserting a value in the front of an array, this means it has to move *every*
value.

All-in-all, not very efficient. But this is where linked lists shine. For every point in the post, we'll write a bit of
PHP code to visualize the actions. So let's set up the world real quick and dive in.

> **Note:** When talking about arrays, I mostly mean *real* arrays; like the [`SplFixedArray`](https://www.php.net/manual/en/class.splfixedarray.php) implementation. The `array` in PHP is a mix of all sorts of different data structures, which makes it efficient for lots of use cases. So it will probably out-perform our implementation.

We'll need a simple `Node` class that holds the value and the reference to the next node.

```php
class Node
{
    public function __construct(public $value, public ?Node $next = null) { }
}
```

And to make life a bit easier, let's also create a `LinkedList` class that holds the reference to the **head**. We'll
add any functions to this class for easy use. Add let's also add a `print` function that will show the current list, so
we can see how we're doing.

```php
class LinkedList
{
    public function __construct(public ?Node $head = null) { }

    public function print()
    {
        $node = $this->head;

        while ($node) {
            echo $node->value;

            if ($node = $node->next) {
                echo ' -> ';
            }
        }

        echo "\n";
    }
}
```

This `print` function is a nice example on how to traverse a linked list. First we store the current **head** in
the `$node` variable. Then we will keep a `while`-loop going as long as `$node` has a value (in our case always
a `Node`). Inside the loop we `echo` out the value of the node, and update `$node` to be the **next** node. If it has a
value we will append a little arrow, because another value will follow. Otherwise `$node` is `null` and the loops ends.

### Adding a value to the front of a linked list (prepending)

Because all the nodes of a linked list only point to another value, it doesn't care about adjacent memory locations.
Nodes can be stored anywhere in memory. This makes it very easy to add another node to the front of the list. By
creating a new node, and making it point to the current **head**, you essentially create a new linked list with a *new*
head. This operation is very fast, and doesn't touch any of the other nodes at all.

Let's add this `prepend` function to the `LinkedList` class:

```php
public function prepend($value): Node {
    $this->head = $node = new Node($value, $this->head);

    return $node;
}
```

Here we create a new node that holds the new value, and points to the current **head**. This can be either a `Node`
or `null`. Then we update the **head** to be our new value, and return our new node.

Let's see how this works:

```php
$list = new LinkedList();

$a = $list->prepend('A');
$list->print(); // A

$b = $list->prepend('B');
$list->print(); // B -> A
```

### Inserting a node after another

Inserting a node after another node is a little less efficient than at the front, but still pretty fast. By updating the
previous node to point to the new node, and letting the new node point to the node the previous node used to point to;
the list is already updated.

If you don't already have a reference to the node however, it does require you to iterate over the nodes until you hit
the node you want to follow. This obviously takes time; and in the worst case scenario it would need to iterate ALL
items. But it doesn't create any extra memory overhead, as it will only add the single memory location.

We'll create an `insertAfter($value, Node $after)` method to insert a value after another node.

```php
public function insertAfter($value, Node $after): self
{
    $after->next = new Node($value, $after->next);

    return $after->next;
}
```

Here we create our new node, and point it to whatever `$after` is pointing to. Then we update `$after` to point to our
new node. Let's try it out using our earlier example:

```php
$c = $list->insertAfter('C', $b);
$list->print(); // B -> C -> A
```

### Adding a node to the end of a linked list (appending)

As we just saw, adding a node to the end of the list isn't that efficient, because you would need to iterate all items
first, until there isn't a next one. Then you make that last node point to the new node, which makes that the last node.

```php
public function append($value): Node
{
    // If the head isn't set yet, this is the same as adding a head.
    if (!$node = $this->head) {
        return $this->prepend($value);
    }

    // Find the last node.
    while ($node->next ?? null) {
        $node = $node->next;
    }

    // Insert after the last node.
    return $this->insertAfter($value, $node);
}
```

This process can however be optimized if we keep a reference to the last node, just like the **head**. This only adds a
tiny bit of extra memory, but will make adding a node to the end of the list as efficient as adding it to the front.
Simply inserting the node after the last one, and updating the reference is enough to add it.

```php
// Add `tail` parameter.
public function __construct(public ?Node $head = null, public ?Node $tail = null) { }

public function prepend($value): Node
{
    $this->head = $node = new Node($value, $this->head);
    $this->tail ??= $node; // Store node as tail if tail is empty.

    return $node;
}

public function insertAfter($value, Node $after): Node
{
    $after->next = $node = new Node($value, $after->next);

    // Update tail if we append a new node after it.
    if ($after === $this->tail) {
        $this->tail = $node;
    }

    return $node;
}

// No need to traverse all nodes to find the last node.
public function append($value): Node
{
    if (!$this->head) {
        return $this->prepend($value);
    }

    return $this->insertAfter($value, $this->tail);
}
```

So just by implementing a `$tail` parameter, the `append` function becomes way more performant because it no
longer needs to traverse al nodes to find the last node.

### Retrieving the first (or last) node

Because a linked list has a reference to the **head**, this value is always instantly available. And when a reference to
the last node is kept too, the last node is also instantly accessible. This reading of the first and last node, in
combination with quickly adding values to the front and the end of the list, make the linked list a great solution for
either a **stack** or **queue** (both data structures that provide either a First-In-First-out or Last-In-First-Out way
of iterating).

```php
$first = $list->head;
$last = $list->tail;
```

### Removing a node from the front of a linked list

You can remove a node from the front by changing the **head** of the list to the next node, and removing the pointer of
the old head. This old head node then becomes what's known as an **orphan node**. It can still be used, or even removed;
but it is no longer part of the linked list.

This type of action is usually called `shift`, so let's implement that method on our `LinkedList`.

```php
public function shift(): ?Node
{
    if (!$node = $this->head) {
        return null;
    }

    if (!$this->head = $node->next) {
        $this->tail = null;
    }

    $node->next = null;

    return $node;
}
```

First we make sure there is a **head** to be removed. Then we'll update the head to its **next** pointer. Because we
also keep track of the **tail** we must update that as well if the **head** is empty. Because at that stage we shifted
the last remaining value, so both the **head** and the **tail** should be `null`.

```php
$node = $list->shift(); // $node->value = B
$list->print(); // C -> A -> D
```

Cool, that works. And apart from the `print` function, all our methods are very fast. As long as we have a reference to
a node, we are golden. So now let's look at the downsides of a linked list.

## Weaknesses of (singly) linked lists

Just like any other data structure, a linked list isn't always the best choice for any situation. Here is where a linked
list doesn't perform as efficient.

### Reading a random node is slow

Unlike an array, where you can instantly retrieve a value by its index, you need to iterate over the previous nodes,
until you hit the one you need. In the worst case scenario, this would mean you iterate all nodes. This makes reading a
random node very time-consuming.

Let's pretend our `LinkedList` has a zero-based index, just like an array. And we want the value of `2`. We'll implement
a `get(int $index)` method that will return the value of that node.

```php
public function get(int $index)
{
    $node = $this->head;

    for ($i = 0; $i < $index; $i++) {
        if (!$node) {
            break;
        }

        $node = $node->next;
    }

    if (!$node) {
        throw new \RuntimeException(sprintf('A value with index "%d" does not exist in the list.', $index));
    }

    return $node->value;
}
```

We start a simple `for`-loop that counts until we reach our index, while moving to the **next** node *if possible*. If
we have a node, we'll return the value. Otherwise, we throw an exception that the key does not exist.

So while `$list->get(0)` would be just as fast as calling `$list->head`, calling `$list->get(2)` is slower, because it
needs to traverse all the previous nodes.

### Removing a node from anywhere inside a linked list is slow

Just like with inserting a node or reading a random node, you need to traverse all the nodes until you hit the one that
points to the node you want to remove. You can then "cut the link" by making the previous node point to the next one
after the node you want to remove. Although the deletion is very efficient, and can actually free up memory; it can
still take a lot of time to perform the action.

```php
public function remove(Node $remove): Node
{
    $node = $this->head;
    while ($node && $node->next !== $remove) {
        $node = $node->next;
    }

    $node->next = $remove->next;
    $remove->next = null;

    return $remove;
}
```

Here you can see the inefficient looping until we hit the node before the one to remove. One way to make removal more
efficient is by implementing a **doubly linked list**. Because such a list also contains a reference to the previous
node, we can simply reference that. In that case our code would look something like this:

```php
// `Remove` in a doubly linked list
public function remove(Node $remove): Node
{
    $remove->previous->next = $remove->next;
    $remove->next->previous = $remove->previous;

    $remove->next = $remove->previous = null;

    return $remove;
}
```

### Pointers require extra memory

When using linked lists, the pointers to the next element require extra memory. If you use objects, the pointers can
live on those objects. But when referencing scalar values, these values need to be wrapped by nodes that hold the value
as well as the reference (like we have been doing in our examples). These elements use more memory.

When implementing a **doubly linked list**, there is even more memory consumption, as it also requires a reference to
the previous node.

### Linked lists are traversed slower

Because the memory allocation for the nodes are not adjacent, it can take more time to find the next value. Unlike an
array, where the memory allocation is adjacent; and so the next object is easy and faster to find.

## Circular linked lists

In a **circular linked list** the last node points to the **head**. This makes it possible to traverse the list starting
from any node.

By implementing such a list, we can also remove the reference of the **head**, and only keep the reference to the
**tail**. From that point on, we can use *that* reference to find the **tail**; and use its pointer to find the
**head**. This way we can still append and prepend a node to the list, while maintaining the efficiency, and without the
need for an extra pointer.

## When to use linked lists

To be honest, from a PHP perspective, your best bet is almost certainly a regular PHP `array`. Because this is an
ordered hash map, it is very efficient at insertion, deletion and retrieval of values. One drawback of an `array`
however, is that you are usually working with a copy of the array. This means you cannot change the array while
traversing it, without specifically using the array as a reference (e.g. `&$array`).

PHP itself however has a [`SplDoublyLinkedList`](https://www.php.net/manual/en/class.spldoublylinkedlist) class which,
as the name implies, provides a **doubly linked list** implementation. This class also is the base for `SplStack`
and `SplQueue`. Using these classes you get the added benefit of changing the list during iteration. Other than that it
provides a nice OOP approach working with the list, as well as some syntactical sugar can make the code more readable.

## Conclusion

The goal of this post is to introduce the **linked list** as a data structure. I think it is useful for any programmer
to know and recognize these basic data structures. Because by recognizing it in code, it can help you optimize your
code; making it faster, and more readable.

The code in this post is strictly illustrative. You're obviously more than welcome to test the code; but you should
probably not reinvent the wheel, and use the `SplDoublyLinkedList` or its subclasses as a base for your needs. These
implementations are written in C, instead of uncompiled PHP; and are therefore probably faster than anything we can come
up with. I've added the code from this post as a
[repo to GitHub](https://github.com/doekenorg/linked-lists-explained-in-php).

If you enjoy this type of content, please let me know. And in case you might have missed it, there is currently a
new [`Deque` rfc](https://wiki.php.net/rfc/deque) which I'm quite looking forward too.

*I hope you enjoyed reading this article! If so, please leave a 👍 reaction or a 💬 comment and consider subscribing to
my newsletter! I write posts on PHP almost every other week. You can also follow me on
🐦 [twitter](https://twitter.com/intent/follow?screen_name=doekenorg) for more content and the occasional tip.*

## 🔗 Useful links

- [`SplDoublyLinkedList` documentation](https://www.php.net/manual/en/class.spldoublylinkedlist.php)
- [`Deque` rfc](https://wiki.php.net/rfc/deque)
