I was working on restructuring some old code that needed some pagination. The query in question used a one-to-many join the required information together. Imagine a search engine where you're wanting to search over orders, but want to group the results by customer. The output might look something like:
Gary Dell'Abate | |
Order #: 12098 | |
Order #: 13232 | |
Order #: 14551 | |
Order #: 16770 | |
Fred Norris | |
Order #: 11021 | |
Order #: 11029 | |
Robin Quivers | |
Order #: 10010 | |
Order #: 11001 | |
Order #: 12001 | |
Howard Stern | |
Order #: 13001 |
So, in my situation I want to also paginate by the customers. When SQL Server 2005 was introduced, it added a new feature called Common Table Expressions (CTEs.) One of the most useful features of CTEs is to paginate recordsets. Typically when paginating results, you will use the row_number() function—which creates a new unique row number for each row in your recordset:
-- create the Common Table Expression, which is a table called "pagination" with pagination as ( -- your normal query goes here, but you put your ORDER BY clause in the rowNo declaration select row_number() over (order by Name) as rowNo, -- a list of the column you want to retrieve CustomerId, Name, Phone from Customer where State = 'NY' ) -- we now query the CTE table select -- add an additional column which contains the total number of records in the query *, (select count(*) from pagination) as TotalRows from pagination where RowNo between 11 and 20 order by rowNo
However, when joining a one-to-many you don't want a unique number for each row, you want a unique number for each unique primary-to-foreign key match. So, let's adjust the above query to have our one-to-many join:
-- create the Common Table Expression, which is a table called "pagination" with pagination as ( -- your normal query goes here, but you put your ORDER BY clause in the rowNo declaration select row_number() over (order by c.Name) as rowNo, -- a list of the column you want to retrieve c.CustomerId, c.Name, c.Phone, o.OrderNo from Customer as c left join Order as o on c.CustomerId = o.CustomerId where c.State = 'NY' ) -- we now query the CTE table select -- add an additional column which contains the total number of records in the query *, (select count(*) from pagination) as TotalRows from pagination where RowNo between 11 and 20 order by rowNo
When we run that query, our results look something like:
CustomerId | Name | Phone | OrderNo | Row | TotalRows |
945 | Gary Dell'Abate | (202) 555-1212 | 12098 | 1 | 10 |
945 | Gary Dell'Abate | (202) 555-1212 | 13232 | 2 | 10 |
945 | Gary Dell'Abate | (202) 555-1212 | 14551 | 3 | 10 |
945 | Gary Dell'Abate | (202) 555-1212 | 16770 | 4 | 10 |
639 | Fred Norris | (202) 555-1213 | 11021 | 5 | 10 |
639 | Fred Norris | (202) 555-1213 | 11029 | 6 | 10 |
898 | Robin Quivers | (202) 555-1214 | 10010 | 7 | 10 |
898 | Robin Quivers | (202) 555-1214 | 11001 | 8 | 10 |
898 | Robin Quivers | (202) 555-1214 | 12001 | 9 | 10 |
314 | Howard Stern | (202) 555-1215 | 13001 | 10 | 10 |
As you can see, our each record has a unqiue row—which won't help us paginate our records because it's not grouping the customer's orders together. What we need is a unique number for each customer. Fortunately, Microsoft has an answer for this—the dense_rank() function.
Along with the row_number() function, Microsoft also has offers the rank() and dense_rank() functions. 4 Guys From Rolla has a good article Returning Ranked Results with Microsoft SQL Server 2005 which shows off the how the different functions affect the recordset.
The dense_rank() function creates a unique number for each matching condition in the ORDER BY clause, which is exactly what we need to paginate our one-to-many join. So let's modify our query to use the dense_rank() function:
-- create the Common Table Expression, which is a table called "pagination" with pagination as ( -- your normal query goes here, but you put your ORDER BY clause in the rowNo declaration select dense_rank() over (order by c.Name) as rowNo, -- a list of the column you want to retrieve c.CustomerId, c.Name, c.Phone, o.OrderNo from Customer as c left join Order as o on c.CustomerId = o.CustomerId where c.State = 'NY' ) -- we now query the CTE table select -- add an additional column which contains the total number of records in the query *, (select count(*) from pagination) as TotalRows from pagination where RowNo between 11 and 20 order by rowNo
So let's see what our recordset looks like now:
CustomerId | Name | Phone | OrderNo | Row | TotalRows |
945 | Gary Dell'Abate | (202) 555-1212 | 12098 | 1 | 10 |
945 | Gary Dell'Abate | (202) 555-1212 | 13232 | 1 | 10 |
945 | Gary Dell'Abate | (202) 555-1212 | 14551 | 1 | 10 |
945 | Gary Dell'Abate | (202) 555-1212 | 16770 | 1 | 10 |
639 | Fred Norris | (202) 555-1213 | 11021 | 2 | 10 |
639 | Fred Norris | (202) 555-1213 | 11029 | 2 | 10 |
898 | Robin Quivers | (202) 555-1214 | 10010 | 3 | 10 |
898 | Robin Quivers | (202) 555-1214 | 11001 | 3 | 10 |
898 | Robin Quivers | (202) 555-1214 | 12001 | 3 | 10 |
314 | Howard Stern | (202) 555-1215 | 13001 | 4 | 10 |
As you can see, we're now getting back a unique number for each customer.
There really is a ton of flexibility in the row_number(), rank() and dense_rank() function. I definitely recommend getting familiar with the functions—especially with the PARTITION clause. There's another good article on these function over at ROW_NUMBER(), RANK(), and DENSE_RANK() – Flexibility at a Price.
Comments for this entry have been disabled.