BETTER WAY TO HANDLE TREES
2013 May 29Yesterday at Berlin Hack and Tell I tried to improvise a demo of a - IMO - better way to handle trees of data in the database. Unfortunately, my understanding (or lack thereof) of Apples desktop environment made my presentation an excercise in futility. So I’m writing it down here now.
Let’s look at an example that a tree might look like:
-
Root node
-
Branch 1
-
Element 1.1
-
Element 1.2
-
Subbranch 1.3
-
Element 1.3.1
-
Element 1.3.2
-
-
-
Branch 2
-
Element 2.1
-
Element 2.2
-
-
Element 3
-
The ‘traditional’ way you probably know already
In this case, the database structure traditionally looks like this:
CREATE TABLE `tree` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`parent_id` INT NOT NULL ,
`order` INT NOT NULL ,
`name` VARCHAR( 45 ) NOT NULL
);
So you have a unique id, you have the parent_id that this element is sorted into (the branch or the Root node) and a field order which is used to sort the elements in the UI and usually contains an ascending integer. The content of the table would look like this:
+----+-----------+-------+---------------+
| id | parent_id | order | name |
+----+-----------+-------+---------------+
| 1 | 1 | 1 | Root node |
| 2 | 1 | 1 | Branch 1 |
| 3 | 1 | 2 | Branch 2 |
| 4 | 1 | 3 | Element 3 |
| 5 | 3 | 1 | Element 2.1 |
| 6 | 3 | 2 | Element 2.2 |
| 7 | 2 | 1 | Element 1.1 |
| 8 | 2 | 2 | Element 1.2 |
| 9 | 2 | 3 | Subbranch 1.3 |
| 11 | 9 | 1 | Element 1.3.1 |
| 12 | 9 | 2 | Element 1.3.2 |
+----+-----------+-------+---------------+
This layout has several disadvantages:
- No easy way to display the tree in a human readable way. No matter if you sort by id, parent_id, order or any combination thereof, you will never get an easy human parsable representation.
- You need different functions for re-arranging elements within their branch and moving elements between branches.
To tackle this issue I present to you:
The MB-tree
Okay, I admit this has probably a really fancy name already which I simply don’t know and I guess lots of people will laugh at me for not knowing this. Anyway, here goes.
Let’s take the same tree from above, but this time let’s use this database layout:
CREATE TABLE `blubb`.`mbtree` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`tree_start` INT NOT NULL ,
`tree_end` INT NOT NULL ,
`name` VARCHAR( 45 ) NOT NULL,
`locked` INT(1) NOT NULL DEFAULT 0
);
So you have a unique id, you have a tree_start value and a tree_end value. The locked field will be used for manipulation later. The tree_start and tree_end values are doing all the magic here. They define which other elements are contained within and in which order. Let’s start with just the Root node:
+----+------------+----------+-----------+--------+
| id | tree_start | tree_end | name | locked |
+----+------------+----------+-----------+--------+
| 1 | 1 | 2 | Root node | 0 |
+----+------------+----------+-----------+--------+
The fields tree_start and tree_end define a range larger than 1, so the initial Root node ranges from 1 to 2. Let’s add the Root nodes elements:
+----+------------+----------+-----------+--------+
| id | tree_start | tree_end | name | locked |
+----+------------+----------+-----------+--------+
| 1 | 1 | 8 | Root node | 0 |
| 2 | 2 | 3 | Branch 1 | 0 |
| 3 | 4 | 5 | Branch 2 | 0 |
| 4 | 6 | 7 | Element 3 | 0 |
+----+------------+----------+-----------+--------+
As you can see, all elements again have a range larger than 1 (2-3, 4-5 and 6-7 respectively), but the Root node range has increased from 1-2 to 1-8 to include all its elements. Let’s add all remaining elements:
+----+------------+----------+---------------+--------+
| id | tree_start | tree_end | name | locked |
+----+------------+----------+---------------+--------+
| 1 | 1 | 22 | Root node | 0 |
| 2 | 2 | 13 | Branch 1 | 0 |
| 5 | 3 | 4 | Element 1.1 | 0 |
| 6 | 5 | 6 | Element 1.2 | 0 |
| 7 | 7 | 12 | Subbranch 1.3 | 0 |
| 8 | 8 | 9 | Element 1.3.1 | 0 |
| 9 | 10 | 11 | Element 1.3.2 | 0 |
| 3 | 14 | 19 | Branch 2 | 0 |
| 10 | 15 | 16 | Element 2.1 | 0 |
| 11 | 17 | 18 | Element 2.2 | 0 |
| 4 | 20 | 21 | Element 3 | 0 |
+----+------------+----------+---------------+--------+
The advantages of this layout:
- Simply by adding ‘‘ORDER BY tree_start ASC’’ you sort the entire tree in a human-readable way.
- You only need one function to rearrange elements, no matter if they are branches or elements and no matter if you move them around within their branch or into another branch. I have some PHP code here. Let’s say we want to move Subbranch 1.3 to between Element 2.1 and Element 2.2:
<?php
$q = mysql_query("SELECT * FROM mbtree WHERE ID = 7");
$toChange = mysql_fetch_object($q);
$gap = ($toChange->tree_end - $toChange->tree_start) + 1;
/* Take the element(s) we want to move OUT of the structure */
mysql_query("UPDATE mbtree SET tree_start = tree_start - {$toChange->tree_start}, tree_end = tree_end - {$toChange->tree_start}, locked = 1 WHERE tree_start >= {$toChange->tree_start} AND tree_end <= {$toChange->tree_end}");
/* Move everything AFTER it UPWARDS to close the gap */
mysql_query("UPDATE mbtree SET tree_start = tree_start - $gap WHERE tree_start > {$toChange->tree_end} AND locked = 0");
mysql_query("UPDATE mbtree SET tree_end = tree_end - $gap WHERE tree_end > {$toChange->tree_end} AND locked = 0");
$q = mysql_query("SELECT * FROM mbtree WHERE `id` = 10");
$moveAfter = mysql_fetch_object($q);
/* Make a gap for the feed we want to move */
mysql_query("UPDATE mbtree SET tree_end=tree_end+$gap WHERE tree_end > {$moveAfter->tree_end} AND locked = 0");
mysql_query("UPDATE mbtree SET tree_start=tree_start+$gap WHERE tree_start >= {$moveAfter->tree_end} AND locked = 0"); # tree_end is correct!
/* Move the feed to the newly created room */
mysql_query("UPDATE mbtree SET tree_start = tree_start + {$moveAfter->tree_end} + 1, tree_end = tree_end + {$moveAfter->tree_end} + 1, locked = 0 WHERE locked = 1");
?>
As you see, there’s four steps to moving something:
- Take it out of the structure
- Close the gap it left behind
- Open a gap at the destination
- Move it into the gap
You can use this procedure to move anything anywhere you want. Granted, it’s a bit difficult to wrap your head around the concept (it was for me), but once I understood it, I never wanted to use something else anymore.
EOF
Category: blog