Managing Hierarchical Data with PHP and MySQL

Kevin Waterson

PHPRO.ORG

What is Hierarchical Data?

Hierachical Models

Buzzword Alert

MPTT

Large trees with frequent reads and infrequent writes are always going to be much more efficient with nested sets. If you need to do frequent writes, nested sets are not appropriate and this is especially true with large trees as a single change can require the entire tree to be updated. In these circumstances, adjacency list models are much more appropriate.

MySQL Setup

CREATE TABLE categories (
category_id int(11) NOT NULL auto_increment,
name varchar(20) NOT NULL,
left_node int(11) NOT NULL,
right_node int(11) NOT NULL,
PRIMARY KEY (category_id)
);

CREATE TABLE products (
product_id int(11) NOT NULL auto_increment,
name varchar(40) default NULL,
category_id int(11) NOT NULL,
PRIMARY KEY (product_id)
);
For users of Oracle, you may be familiar with Oracle’s CONNECT BY PRIOR syntax. It’s good stuff for sure but it really is only an Oracle proprietary extension for handling adjacency lists directly in the database.

Tree Display

Linear Display

Retrieving Data

class hierachy{

 public function fullTree($parent){
 $stmt = conn::getInstance()->prepare("
 SELECT node.name
 FROM categories AS node,
 categories AS parent
 WHERE node.left_node
 BETWEEN parent.left_node AND parent.right_node
 AND parent.name = :parent
 ORDER BY node.left_node");

 $stmt->bindParam('parent', $parent);
 $stmt->execute();
 $res = $stmt->fetchALL(PDO::FETCH_ASSOC);
 return $res;
 }
}
Here we have a class method that makes use of a singleton PDO class to get a full listing of the database tree.

Iterating over the Tree

/*** a new hierachy instance ***/
$hierachy = new hierachy;
$iterator = new RecursiveIteratorIterator(
    new recursiveArrayIterator($hierachy->fullTree('electronics')));
 try 
 {
    foreach($iterator as $key=>$value)
    {
        echo $value.'
'; } } catch(Exception $e) { echo $e->getMessage(); }

Iterating over the tree

electronics
televisions
tube
lcd
plasma
portable electronics
mp3 players
flash
cd players
2 way radios

Iterating over the tree

You could change the parent to televisions, and get a result like this.

televisions
tube
lcd
plasma

Finding all the Leaf Nodes

public function leafNodes(){
$stmt = conn::getInstance()->prepare("
 SELECT name
 FROM categories
 WHERE right_node = left_node + 1
 ");
$stmt->execute();
return $stmt->fetchALL(PDO::FETCH_ASSOC);
}
Because the left and right leaf nodes are consicutive numbers we can simply look for nodes where right_node = left_node + 1. The class method will look like this:

Access the leaf nodes

$iterator = new RecursiveIteratorIterator(
    new recursiveArrayIterator($hierachy->leafNodes()));
try 
{
    foreach($iterator as $key=>$value)
    {
        echo $value.'
'; } } catch(Exception $e) { echo $e->getMessage(); }
To access the leaf nodes is a simple task as shown here. Once again, the class method returns an array of arrays and we can use SPL goodness to simplify the iteration.

Access the leaf nodes

tube
lcd
plasma
flash
cd players
2 way radios

Retieving a single path

public function singlePath($node_name){
$stmt = conn::getInstance()->prepare("
 SELECT parent.name
 FROM categories AS node,
 categories AS parent
 WHERE node.left_node
 BETWEEN parent.left_node AND parent.right_node
 AND node.name = '{$node_name}'
 ORDER BY node.left_node
 ");
$stmt->execute();
return $stmt->fetchALL(PDO::FETCH_ASSOC);
Using the nested set model allows us to retrieve a single path without messy multiple self-joins. Here we show how to fetch the flash node path.

Retrieving a single path

electronics
portable electronics
mp3 players
flash

Finding the Depth of nodes

public function getNodeDepth(){
$stmt = conn::getInstance()->prepare("
 SELECT node.name, 
 COUNT(parent.name) - 1)
 AS depth
 FROM categories AS node,
 categories AS parent
 WHERE node.left_node BETWEEN parent.left_node AND parent.right_node
 GROUP BY node.name
 ORDER BY node.left_node");

 $stmt->execute();
 return $stmt->fetchALL(PDO::FETCH_ASSOC);
}
The full tree method can be altered a little to produce a listing of each node with its depth. This may be used as a menu hierachy and the depth could be used for placing node names within the menu.

Finding the depth of nodes

$iterator = new RecursiveIteratorIterator(
    new recursiveArrayIterator($hierachy->getNodeDepth()));
try 
{
    foreach($iterator as $key=>$value)
    {
       echo $key.' -- '.$value.'
'; } } catch(Exception $e) { echo $e->getMessage(); }

Depth Display

name -- electronics
product_count -- 10
name -- televisions
product_count -- 5
name -- tube
product_count -- 2
name -- lcd
product_count -- 1
name -- plasma
product_count -- 2
name -- mp3 players
product_count -- 2
name -- portable electronics
product_count -- 5
name -- flash
product_count -- 1
name -- cd players
product_count -- 2
name -- 2 way radios
product_count -- 1

Get Local Sub-Tree Depth

public function getLocalSubNodes($node_name){
$stmt = conn::getInstance()->prepare("SELECT node.name,
 (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
 FROM categories AS node,
 categories AS parent,
 categories AS sub_parent,
        (
        SELECT node.name, (COUNT(parent.name) - 1) AS depth
        FROM categories AS node,
        categories AS parent
        WHERE node.left_node BETWEEN parent.left_node AND parent.right_node
        AND node.name = :node_name
        GROUP BY node.name
        ORDER BY node.left_node
        )AS sub_tree
WHERE node.left_node BETWEEN parent.left_node AND parent.right_node
AND node.left_node BETWEEN sub_parent.left_node AND sub_parent.right_node
AND sub_parent.name = sub_tree.name
GROUP BY node.name
HAVING depth <= 1
ORDER BY node.left_node");
$stmt-&ft;bindParam(':node_name', $node_name, PDO::PARAM_STR);
$stmt->execute();
return $stmt->fetchALL(PDO::FETCH_ASSOC);
}
Suppose now you wish to get all the local, or subordinate, nodes of a category. These are the nodes, or categories, immediately below the specified category, but no deeper within the tree. So if we were to show the Portable Electronics category, we would want to see Mp3 Players, CD Players and 2 Way Radio's, but not the Flash category. So we need to limit our query to those categories HAVING depth less than or equal to one. (HAVING depth <= 1). Our class method will look like this:

Show Local SubNodes

$iterator = new RecursiveIteratorIterator(
   new recursiveArrayIterator($hierachy->getLocalSubNodes('portable electronics')));
try {
    foreach($iterator as $key=>$value)
        {
        echo $key.' -- '.$value.'
'; } } catch(Exception $e) { echo $e->getMessage(); }

Get SubNodes Results

name -- portable electronics
depth -- 0
name -- mp3 players
depth -- 1
name -- cd players
depth -- 1
name -- 2 way radios
depth -- 1

Product Count

public function productCount(){
$stmt = conn::getInstance()->prepare("
 SELECT parent.name,
 COUNT(products.name) AS product_count
 FROM categories AS node ,
 categories AS parent, 
 products  
 WHERE node.left_node BETWEEN parent.left_node AND parent.right_node
 AND node.category_id = products.category_id
 GROUP BY parent.name
 ORDER BY node.left_node ");
$stmt->execute();
return $stmt->fetchALL(PDO::FETCH_ASSOC);
}
So far we have dealt with the categories and sub categories within our hierachy. Now lets introduce the products from the products table. To begin with a simple category tree with a count for each category. Basically what we need is the same as our earlier full tree view, with a count of each product, related to the category.

Show Product Count

$iterator = new RecursiveIteratorIterator(
    new recursiveArrayIterator($hierachy->productCount()));
try
 {
    foreach($iterator as $key=>$value)
    {
        echo $key.' -- '.$value.'
'; } } catch(Exception $e) { echo $e->getMessage(); }

Show Product Count

name -- electronics
product_count -- 10
name -- televisions
product_count -- 5
name -- tube
product_count -- 2
name -- lcd
product_count -- 1
name -- plasma
product_count -- 2
name -- mp3 players
product_count -- 2
name -- portable electronics
product_count -- 5
name -- flash
product_count -- 1
name -- cd players
product_count -- 2
name -- 2 way radios
product_count -- 1

Adding New Nodes

Now that we can query the tree and get sub-trees and product counts, lets see about adding a new Node. Lets look back at our original container diagram. To add a new node between Televisions and Portable Electronics then the new node would have left_node = 10 and right_node = 11. All nodes to the right would have to have thier left_node and right_node values incremented by two to adjust for the extra node. To expediate this process we can use a transaction.

Add New Node Code

public function addNode($left_node, $new_node){
try {
    conn::getInstance()->beginTransaction();
    $stmt = conn::getInstance()->prepare("SELECT @myRight := right_node FROM categories WHERE name = :left_node");
    $stmt->bindParam(':left_node', $left_node);
    $stmt->execute();
    /*** increment the nodes by two ***/
    conn::getInstance()->exec("UPDATE categories SET right_node = right_node + 2 WHERE right_node > @myRight");
    conn::getInstance()->exec("UPDATE categories SET left_node = left_node + 2 WHERE left_node > @myRight");
    /*** insert the new node ***/
    $stmt = conn::getInstance()->prepare("INSERT INTO categories(name, left_node, right_node) VALUES(:new_node, @myRight + 1, @myRight + 2)");
    $stmt->bindParam(':new_node', $new_node);
    $stmt->execute();
    /*** commit the transaction ***/
    conn::getInstance()->commit();
    }
catch(Exception $e)
    {
    conn::getInstance()->rollBack();
    throw new Exception($e);
    }
}
Here we see the advantages of PDO transactions by bundling up SQL statement and commiting them in a single call. Should an error occur, then an exception is thrown and the code jumps to the catch block where the transaction is rolled back. This avoids any dangling references should there be an interuption during the process.

Adding a Child Node

public function addChildNode($node_name, $new_node){
try {
    conn::getInstance()->beginTransaction();
    $stmt = conn::getInstance()->prepare("SELECT @myLeft := left_node FROM categories WHERE name=:node_name");
    $stmt->bindParam(':node_name', $node_name);
    $stmt->execute();
    conn::getInstance()->exec("
    UPDATE categories SET right_node = right_node + 2
    WHERE right_node > @myLeft");
    conn::getInstance()->exec("
    UPDATE categories SET left_node = left_node + 2 
    WHERE left_node > @myLeft");
    $stmt = conn::getInstance()->prepare("
    INSERT INTO categories(name, left_node, right_node) 
    VALUES(:new_node, @myLeft + 1, @myLeft + 2)");
    $stmt->bindParam(':new_node', $new_node);
    $stmt->execute();
    conn::getInstance()->commit();
    }
catch(Exception $e)
    {
    conn::getInstance()->rollBack();
    throw new Exception($e);
    }
}
Adding a node as above gives us a parent, but what if we wish to add a child of a node that has no children? Here we will add "uhf" as a child node of "2 way radios". This entails expending everything to the right of the left_node of the new parent and placing the node to the right of the left_node value. Here is the class method to do it.

Deleting Nodes

public function deleteLeafNode($node_name){
try {
    conn::getInstance()->beginTransaction();
    $stmt = conn::getInstance()->prepare("SELECT @myLeft := left_node, @myRight := right_node, @myWidth := right_node - left_node + 1 FROM categories WHERE name = :node_name");
    $stmt->bindParam(':node_name', $node_name);
    $stmt->execute();
    conn::getInstance()->exec("DELETE FROM categories WHERE left_node BETWEEN @myLeft AND @myRight;");
    conn::getInstance()->exec("UPDATE categories SET right_node = right_node - @myWidth WHERE right_node > @myRight");
    conn::getInstance()->exec("UPDATE categories SET left_node = left_node - @myWidth WHERE left_node > @myRight");
    conn::getInstance()->commit();
    }
catch(Exception $e)
    {
    conn::getInstance()->rollBack();
    throw new Exception($e);
    }
}
Now that we can add nodes we need to be able to delete them. Deleting leaf nodes is easier than deleting parent nodes with children as the orphaned nodes need to be handled. To delete a leaf node is the opposite of adding a new node. We delete the node and its width from every node to its right:

Delete Node Recursive

public function deleteNodeRecursive($node_name){
try 
 {
    conn::getInstance()->beginTransaction();
    $stmt = conn::getInstance()->prepare("
    SELECT @myLeft := left_node,
    @myRight := right_node,
    @myWidth := right_node - left_node + 1
    FROM categories WHERE name = :node_name");
    $stmt->bindParam(':node_name', $node_name);
    $stmt->execute();
    conn::getInstance()->exec("DELETE FROM categories WHERE left_node BETWEEN @myLeft AND @myRight");
    conn::getInstance()->exec("UPDATE categories SET right_node = right_node - @myWidth WHERE right_node > @myRight");
    conn::getInstance()->exec("UPDATE categories SET left_node = left_node - @myWidth WHERE left_node > @myRight");
    conn::getInstance()->commit();
    }
catch(Exception $e)
    {
    conn::getInstance()->rollBack();
    throw new Exception($e);
    }
}

With a little revision, we can us this same approach as above to delete a whole parent node and all its children.

In Summary (Pros)

Pros

  • Avoids Recursion
  • Read operations on the tree are quick and efficient
  • In Summary (Cons>

    Cons

  • Difficult to set up
  • Less efficient than using the adjacency list method for making changes to the tree
  • Summary

    Suited For

    • Carts
    • User Management
    • Breadcrumbs
    • Products
    • Categories
    • Infinite Depth
    • items in categories

    Not Suited For

    • Forums
    • Blog posts/comments/replies to comments
    • Anywhere that requires many tree alterations

    Beer and Pizza

    • Questions?
    • Thank You