Sunday, March 15, 2009

Modified Preorder Tree Traversal

If you are a programmer, hopefully you've heard of this term.  I learned about it 7 years ago while a big city programmer from New York looked down at a country programmer from Sussex County Delaware.  He had a lot to say but of little value.  When we were discussing an e-commerce platform that I had built, he asked which algorithm I was using for parent/child relationships.  I explained how I had a parentCategoryID field that held the value of the parent category in the child category row.  One weakness in this approach is that it is tough to find indirect relationships.  Say I have top level category of animals and plants, then the next level has birds, fish, reptiles, etc.  Then, the next level has the species of animals and plants.  How can you create a query that says display all animals?  Without multiple queries or complex queries, you can't.  

lifeFormIDnameparentLifeFormID
1AnimalNULL
2PlantNULL
3Bird1
4Fish1
5Reptiles1
6Trees2
7Shrubs2
8Cactus2
9Duck3
10Goose3
11Snake4
12Oak Tree6

Using the data above, I want a query that returns duck, goose and snake but not oak tree.  Using this method, there is no clean way.  Either multiple queries 
SELECT lifeFormID, name, parentLifeFormID FROM lifeForm  WHERE parentLifeFormID = 1
then...
SELECT lifeFormID, name, parentLifeFormID FROM lifeForm  WHERE parentLifeFormID IN (list of ID's from the above query)
Yes, if you know that there are only three levels then you could build the query below:
SELECT lf.lifeformID, lf.name, lf.parentLifeformID FROM lifeForm lf WHERE lf.parentLifeFormID IN ( SELECT subLF.lifeFormID FROM lifeForm subLF WHERE subLF.parentLifeFormID = 1 )
But this creates code that is not very modular or efficient.  A better way to perform this is using the Modified Preorder Tree Traversal algorithm.

Lets try setting up the table this way.

lifeFormIDnameparentLifeFormIDtreeLefttreeRighttreeLevel
1AnimalNULL1141
2PlantNULL15241
3Bird1272
4Fish1892
5Reptiles110132
6Trees216192
7Shrubs220212
8Cactus222232
9Duck3343
10Goose3563
11Snake411123
12Oak Tree617183

Now lets say we want to create a SQL query that retrieves all types of animals.
SELECT lifeFormID, name, parentLifeFormID, treeLeft, treeRight, treeLevel FROM lifeForm WHERE 1 < treeLeft AND treeRight < 14
Note: 1 is the treeLeft value of the animal row and 14 is the treeRight value of the animal row 
This method allows for SQL queries to be created that allow for indirect relationships to be found simpler.  The one downside of this algorithm is that it is more complicated to add/edit/delete from the table.  Since most applications use the view data much more often (generally around 90%), it is better to simplify and speed up the view process while losing speed on the admin side.  As for the actual functions and SQL to maintain this, SitePoint has a good article with some useful code that you should be able to migrate into your application.

1 comments:

  1. amazing and interesting post...
    Thank you for the post...Iphone App
    ReplyDelete