I was talking to a buddy this morning and he'd never seen the syntax for a derived table in MSSQL before. He couldn't quite grasp why they might be useful and I was having difficulty explaining via instant messenger, so I thought I'd blog a real world example of how derived tables can be useful.
In very basic terms, a derived table is a virtual table that's calculated on the fly from a select statement. They can be tremendously useful in certain situations. For my example below, I'm going to use the Northwind database that included with MSSQL 2000.
Let's say you've been asked to generate a report that shows off the total number of orders each customer placed in 1996. "Not a problem." you think to yourself. "This is just an easy aggregated join query." So, you sit down and generate your query and come up with:
Looks good, eh? But there's something missing. Customers that didn't place an order in 1996 aren't showing up. You could leave off customers that didn't place an order, but in many cases the customers that didn't buy place orders are the ones that are of the most interest.
If you think you can include those customers with a "is null" check, you'd be wrong. Run the following code and see if you notice what is wrong:
At first glance this may look right, but if you examine the results carefully you'll see that the customers with the zero count have never placed an order. If a customer has placed an order, but just not in the year 1996 they won't show up. This is because the "is null" check finds customers that have never placed an order—it still doesn't do anything to add customers who've placed an order, but just not in 1996.
This is where a derived table can come in handy. Instead of using the "Orders" table, we'll reduce the Orders table to a snapshot of itself—including only data from the year 1996.
Run the above query and examine the results. You should now see a row returned for each Customer and the total number or orders placed in the year 1996—including the customers that didn't place an order.
The reason this works is because the LEFT JOIN will include all matches or null values. If the LEFT JOIN has matches (like in our first query,) but non that match the criteria those rows are excluded. In our derived table, since we're only running against the orders from the year 1996 we're ensuring that we return results for each customer.
Hopefully this helps explain why derived tables can be useful.