Pagination in MSSQL 2005 with one-to-many joins

Posted by Dan on May 12, 2009 @ 1:37 PM

I was working on restructuring some old code that needed some pagination. The query in question used a one-to-many join the required information together. Imagine a search engine where you're wanting to search over orders, but want to group the results by customer. The output might look something like:

Gary Dell'Abate
  Order #: 12098
  Order #: 13232
  Order #: 14551
  Order #: 16770
Fred Norris
  Order #: 11021
  Order #: 11029
Robin Quivers
  Order #: 10010
  Order #: 11001
  Order #: 12001
Howard Stern
  Order #: 13001

So, in my situation I want to also paginate by the customers. When SQL Server 2005 was introduced, it added a new feature called Common Table Expressions (CTEs.) One of the most useful features of CTEs is to paginate recordsets. Typically when paginating results, you will use the row_number() function—which creates a new unique row number for each row in your recordset:

-- create the Common Table Expression, which is a table called "pagination"
with pagination as
(
    -- your normal query goes here, but you put your ORDER BY clause in the rowNo declaration
    select
        row_number() over (order by Name) as rowNo,
        -- a list of the column you want to retrieve
        CustomerId, Name, Phone
    from
        Customer
    where
        State = 'NY'
)
-- we now query the CTE table
select
    -- add an additional column which contains the total number of records in the query
    *, (select count(*) from pagination) as TotalRows
from
    pagination
where
    RowNo between 11 and 20
order by
    rowNo

However, when joining a one-to-many you don't want a unique number for each row, you want a unique number for each unique primary-to-foreign key match. So, let's adjust the above query to have our one-to-many join:

-- create the Common Table Expression, which is a table called "pagination"
with pagination as
(
    -- your normal query goes here, but you put your ORDER BY clause in the rowNo declaration
    select
        row_number() over (order by c.Name) as rowNo,
        -- a list of the column you want to retrieve
        c.CustomerId, c.Name, c.Phone, o.OrderNo
    from
        Customer as c
          left join
        Order as o
          on
        c.CustomerId = o.CustomerId
    where
        c.State = 'NY'
)
-- we now query the CTE table
select
    -- add an additional column which contains the total number of records in the query
    *, (select count(*) from pagination) as TotalRows
from
    pagination
where
    RowNo between 11 and 20
order by
    rowNo

When we run that query, our results look something like:

CustomerId Name Phone OrderNo Row TotalRows
945 Gary Dell'Abate (202) 555-1212 12098 1 10
945 Gary Dell'Abate (202) 555-1212 13232 2 10
945 Gary Dell'Abate (202) 555-1212 14551 3 10
945 Gary Dell'Abate (202) 555-1212 16770 4 10
639 Fred Norris (202) 555-1213 11021 5 10
639 Fred Norris (202) 555-1213 11029 6 10
898 Robin Quivers (202) 555-1214 10010 7 10
898 Robin Quivers (202) 555-1214 11001 8 10
898 Robin Quivers (202) 555-1214 12001 9 10
314 Howard Stern (202) 555-1215 13001 10 10

As you can see, our each record has a unqiue row—which won't help us paginate our records because it's not grouping the customer's orders together. What we need is a unique number for each customer. Fortunately, Microsoft has an answer for this—the dense_rank() function.

Along with the row_number() function, Microsoft also has offers the rank() and dense_rank() functions. 4 Guys From Rolla has a good article Returning Ranked Results with Microsoft SQL Server 2005 which shows off the how the different functions affect the recordset.

The dense_rank() function creates a unique number for each matching condition in the ORDER BY clause, which is exactly what we need to paginate our one-to-many join. So let's modify our query to use the dense_rank() function:

-- create the Common Table Expression, which is a table called "pagination"
with pagination as
(
    -- your normal query goes here, but you put your ORDER BY clause in the rowNo declaration
    select
        dense_rank() over (order by c.Name) as rowNo,
        -- a list of the column you want to retrieve
        c.CustomerId, c.Name, c.Phone, o.OrderNo
    from
        Customer as c
          left join
        Order as o
          on
        c.CustomerId = o.CustomerId
    where
        c.State = 'NY'
)
-- we now query the CTE table
select
    -- add an additional column which contains the total number of records in the query
    *, (select count(*) from pagination) as TotalRows
from
    pagination
where
    RowNo between 11 and 20
order by
    rowNo

So let's see what our recordset looks like now:

CustomerId Name Phone OrderNo Row TotalRows
945 Gary Dell'Abate (202) 555-1212 12098 1 10
945 Gary Dell'Abate (202) 555-1212 13232 1 10
945 Gary Dell'Abate (202) 555-1212 14551 1 10
945 Gary Dell'Abate (202) 555-1212 16770 1 10
639 Fred Norris (202) 555-1213 11021 2 10
639 Fred Norris (202) 555-1213 11029 2 10
898 Robin Quivers (202) 555-1214 10010 3 10
898 Robin Quivers (202) 555-1214 11001 3 10
898 Robin Quivers (202) 555-1214 12001 3 10
314 Howard Stern (202) 555-1215 13001 4 10

As you can see, we're now getting back a unique number for each customer.

There really is a ton of flexibility in the row_number(), rank() and dense_rank() function. I definitely recommend getting familiar with the functions—especially with the PARTITION clause. There's another good article on these function over at ROW_NUMBER(), RANK(), and DENSE_RANK() – Flexibility at a Price.

Categories: SQL, HTML/ColdFusion

Comments for this entry have been disabled.