Finding the path in a hierarchical tree in MSSQL 2005
I'm working on an table that uses the adjacency list model to store its hierarchical relationship. In a nutshell, it means my table has a parent-to-child relationship using the a foreign key to point to the parent primary key. While this is efficient for storage (and easy to understand,) this model was always extremely inefficient for retrieving the tree structure in MSSQL—that is until SQL Server 2005.
In SQL Server 2005, Microsoft implemented a new featured called Common Table Expressions (CTE,) which among other things allow you to transverse an adjacency list model table by using recursion.
A simple example looks like this:
-- get root of the tree
select
employeeId, employee, managerId
from
Employee
where
managerId is null
union all
-- do a recursive lookup
select
child.employeeId, child.employee, child.managerId
from
Employee as child
inner join
EmployeeTree
on
EmployeeTree.employeeId = child.employeeId
)
-- now grab the data from the CTE table
select
employeeId, employee, managerId
from
EmployeeTree
order by
managerId, employeeId
Today I was trying to figure out a good way to determine the path through the tree. I started thinking that the MSSQL 2005's ROW_NUMBER() function would be a good way to do that. A quick Google search brought up an excellent article by Adam Machanic titled Swinging From Tree to Tree Using CTEs, Part 2: Adjacency to Nested Intervals.
It gives pretty thorough examples and even shows how using a CTE you can convert an adjacency list model table into a nested set model table.
Comments
It is--I should have proofread better. I just had the word "Tree" on the brain. I've corrected it.
Is this kind of a thing even possible in SQL?
You can do sorting based on the child nodes. The trick is to use "row_number() over ()" to specify the sort order. So, what you end up doing is adding a column like the following to the recordset:
right('00000' + convert(varchar(max), row_number() over (order by employee)), 5) as NodeSort
What we're doing is creating a row number based on order the query by the "Employee" column. I then make sure that the value is padding to a 5 digit value. We do this to ensure that string sorting on child nodes will be kept in an expected order.
So, we can modify the above query to look like this:
with EmployeeTree as (
-- get root of the tree
select
employeeId, employee, managerId
, right('00000' + convert(varchar(max), row_number() over (order by employee)), 5) as NodeSort
from
Employee
where
managerId is null
union all
-- do a recursive lookup
select
child.employeeId, child.employee, child.managerId
, EmployeeTree.NodeSort + '.' + right('00000' + convert(varchar(max), row_number() over (order by child.employee)), 5) as NodeSort
from
Employee as child
inner join
EmployeeTree
on
EmployeeTree.employeeId = child.employeeId
)
-- now grab the data from the CTE table
select
employeeId, employee, managerId
from
EmployeeTree
order by
NodeSort
So, what we end up with is a NodeSort column that might look like this:
00001
00001.00001
00001.00002
00002
00003
00003.00001
00003.00001.00001
00003.00001.00002
00003.00002
00004
00005
If you want to change how the nodes are sorted, just change the "(order by employee)" statement to the order you want. Just make sure to make the change in both the root and recursive queries.
Thank you!
