Finding the path in a hierarchical tree in MSSQL 2005

Posted by Dan on Nov 30, 2007 @ 12:39 PM

I'm working on an table that uses the adjacency list model to store its hierarchical relationship. In a nutshell, it means my table has a parent-to-child relationship using the a foreign key to point to the parent primary key. While this is efficient for storage (and easy to understand,) this model was always extremely inefficient for retrieving the tree structure in MSSQL—that is until SQL Server 2005.

In SQL Server 2005, Microsoft implemented a new featured called Common Table Expressions (CTE,) which among other things allow you to transverse an adjacency list model table by using recursion.

A simple example looks like this:

with EmployeeTree as (
    -- get root of the tree
    select
        employeeId, employee, managerId
    from
        Employee
    where
        managerId is null
    union all
    -- do a recursive lookup
    select
        child.employeeId, child.employee, child.managerId
    from
        Employee as child
            inner join
        EmployeeTree
            on
        EmployeeTree.employeeId = child.managerId
)
-- now grab the data from the CTE table
select
    employeeId, employee, managerId
from
    EmployeeTree
order by
    managerId, employeeId

Today I was trying to figure out a good way to determine the path through the tree. I started thinking that the MSSQL 2005's ROW_NUMBER() function would be a good way to do that. A quick Google search brought up an excellent article by Adam Machanic titled Swinging From Tree to Tree Using CTEs, Part 2: Adjacency to Nested Intervals.

It gives pretty thorough examples and even shows how using a CTE you can convert an adjacency list model table into a nested set model table.

Categories: SQL, HTML/ColdFusion, Source Code

7 Comments

  • I thought a CTE is a Common Table Expression, instead of Common Tree Expression
  • @Eric:

    It is--I should have proofread better. I just had the word "Tree" on the brain. I've corrected it.
  • This is a very neat solution of common problems vith tree structures. Still I think that it is missing a feature of sorting child nodes of same parent by some field.

    Is this kind of a thing even possible in SQL?
  • @Uģis:

    You can do sorting based on the child nodes. The trick is to use "row_number() over ()" to specify the sort order. So, what you end up doing is adding a column like the following to the recordset:

    right('00000' + convert(varchar(max), row_number() over (order by employee)), 5) as NodeSort

    What we're doing is creating a row number based on order the query by the "Employee" column. I then make sure that the value is padding to a 5 digit value. We do this to ensure that string sorting on child nodes will be kept in an expected order.

    So, we can modify the above query to look like this:

    with EmployeeTree as (
      -- get root of the tree
      select
        employeeId, employee, managerId
        , right('00000' + convert(varchar(max), row_number() over (order by employee)), 5) as NodeSort

      from
        Employee
      where
        managerId is null
      union all
      -- do a recursive lookup
      select
        child.employeeId, child.employee, child.managerId
        , EmployeeTree.NodeSort + '.' + right('00000' + convert(varchar(max), row_number() over (order by child.employee)), 5) as NodeSort
      from
        Employee as child
          inner join
        EmployeeTree
          on
        EmployeeTree.employeeId = child.employeeId
    )

    -- now grab the data from the CTE table
    select
      employeeId, employee, managerId
    from
      EmployeeTree
    order by
      NodeSort

    So, what we end up with is a NodeSort column that might look like this:

    00001
    00001.00001
    00001.00002
    00002
    00003
    00003.00001
    00003.00001.00001
    00003.00001.00002
    00003.00002
    00004
    00005

    If you want to change how the nodes are sorted, just change the "(order by employee)" statement to the order you want. Just make sure to make the change in both the root and recursive queries.
  • Didn't know about row_number() over till today; seems like a nice solution.
    Thank you!
  • FYI,

    from
        Employee as child
          inner join
        EmployeeTree
          on
        EmployeeTree.employeeId = child.employeeId

    Should read

    from
        Employee as child
          inner join
        EmployeeTree
          on
        EmployeeTree.employeeId = child.managerId

    Otherwise you just get a recursive overflow error.
  • @Nick:

    Thanks for the correction. I've updated the blog entry w/the correction.

Add Comment

Leave this field empty