dans.blog: Dynamically generating a table of dates in Microsoft SQL (Part Deux!)

Yesterday I wrote a blog entry talking about dynamically generating a table of dates in MSSQL 2005. My original solution involved common table expressions (CTE) and involved recursion. While there's nothing wrong with using recursion this way, I mainly was using recursion to simplify the UDF code into a single SQL expression.

However, the more I started thinking about the problem last night, I decided I really needed to revise things a bit further. I decided I really needed to have the query return both a start and end date for each row in the recordset—as this would make it much easier to things such as find all the orders placed during each hour of the day.

So, this morning I came in and re-wrote my query and optimized a few things. The first thing I did was remove the recursion, so this UDF should now work in SQL Server 2000 as well as SQL Server 2005. It's important to note that you could do the same thing with a physical table, but for my uses my date range table generally would only include at most a few dozen rows and there's not enough of a performance difference to warrant the pain of maintaining a table of every date and possible combination you may want to return aggregate results over.

The new code looks like this:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:        Dan G. Switzer, II
-- Create date: Jan 8, 2009
-- Description:    Creates a table with a set of date ranges
-- =============================================
create function generateDateTable
(
    -- Add the parameters for the function here
    @start_date datetime
    , @end_date datetime
    , @datepart varchar(20) = 'day'
    , @step int = 1
)
returns @dates table
(
    -- Add the column definitions for the TABLE variable here
    start_date datetime,
    end_date datetime
)
as
begin
    -- if we're doing calculations based on the days (not time) then strip the time out
    if( @datepart in ('year', 'yy', 'yyyy', 'quarter', 'qq', 'q', 'month', 'mm', 'm', 'dayofyear', 'dy', 'y', 'day', 'dd', 'd', 'week', 'wk', 'ww') )
    begin
        set @start_date = cast(floor(cast(@start_date as float)) as datetime)
        set @end_date = cast(floor(cast(@end_date as float)) as datetime)
    end

    declare @new_start datetime

    while @start_date <= @end_date
    begin
        -- get the new starting row
        set @new_start = (case
            when @datepart in ('year', 'yy', 'yyyy') then dateadd(yy, @step, @start_date)
            when @datepart in ('quarter', 'qq', 'q') then dateadd(qq, @step, @start_date)
            when @datepart in ('month', 'mm', 'm') then dateadd(mm, @step, @start_date)
            when @datepart in ('dayofyear', 'dy', 'y') then dateadd(dy, @step, @start_date)
            when @datepart in ('day', 'dd', 'd') then dateadd(dd, @step, @start_date)
            when @datepart in ('week', 'wk', 'ww') then dateadd(ww, @step, @start_date)
            when @datepart in ('hour', 'hh') then dateadd(hh, @step, @start_date)
            when @datepart in ('minute', 'mi', 'n') then dateadd(n, @step, @start_date)
            when @datepart in ('second', 'ss', 's') then dateadd(s, @step, @start_date)
            when @datepart in ('millisecond', 'ms') then dateadd(ms, @step, @start_date)
            else dateadd(dd, @step, @start_date)
        end)

        -- insert a new row
        insert
            @dates
        (
            start_date
            , end_date
        ) values (
            @start_date
            -- since MSSQL is only accurate to 3.33ms, we need to subtract 3 ms to get the upper range in time
            , dateadd(ms, -3, @new_start)
        )

        -- update the starting row
        set @start_date = @new_start
    end

    return
end
GO

The key changes are the query now returns both a start_date and end_date for each row, there's no no recursion and for dateparts that don't involve time intervals, the UDF will strip out the time automatically. This makes it very easy to use with the getdate() or getutcdate() functions in SQL Server.

Now I can easily generate a table of date ranges for the past 30 days:

select
    *
from
    dbo.generateDateTable(dateadd(d, -30, getutcdate()), getutcdate(), 'day', 1)

This will generate a table that looks like this:

Resultset for last 30 days

Or, I can get a resultset broken down by hours in the day (from 8am until 5pm:)

select
    *
from
    dbo.generateDateTable('01-01-2009 8:00:00', '01-01-2009 17:00:00', 'hour', 1)

This would look like this:

Resultset of Jan 1, 2009 from 8am until 5pm

The introduction of the start_date and end_date columns also changes the syntax of the queries a bit. To take use of the start_date and end_date columns, you'll want to join rows based on whether the time falls between the start_date and end_date columns.

So, if we wanted to find out the total number of orders placed in each our of our business day (from 8am until 6pm) we'd use a query like this:

select
    /*
        we need to add all the order counts together, since there will be multiple rows
        for each match found within our start_date and end_date range
    */
    d.start_date as OrderDate, sum(isnull(o.OrderCount, 0)) as OrderCount
from
    -- create a table with hour intervals between 8am and 5pm
    dbo.generateDateTable('1-6-09 8:00:00', '1-6-09 17:00:00', 'hour', 1) as d
        left outer join
    -- generate an aggregate count of orders within our timeframe
    (
        select
            OrderDate, count(OrderId) as OrderCount
        from
            Order
        group by
            OrderDate
    ) as o
        on
    -- join each match based if they fit within our timeframe
    o.OrderDate between d.start_date and d.end_date
-- we need to by our start_date for our aggregate count
group by
    d.start_date
order by
    d.start_date

This would produce a resultset that might look like this:

Resultset of orders placed between 8am and 5pm

Here's what our recordset would look like for orders placed on Jan 6, 2009:

Resultset for Jan 6, 2009

As you can see, our aggregate query is showing the correct results for hours broken down by hour for this day.

If you have any other tips or suggestions, I'd love to hear them.

This approach might not be the best approach for every problem, but it'll work well for when you have date/time ranges that are very dynamic and are limited to a pretty small resultset (under a few hundred rows.) This works perfectly for creating records sets of the last few years broken into quarters, or the past year broken down by month, etc.

Categories: SQL, HTML/ColdFusion

8 Comments

Michael Imhoff
Jan 8, 2009 @ 4:07 PM

This is very handy Dan. I need to dig up my old business day routines and see if I could make this apply.

Thanks for posting.
Kubilay
Jan 13, 2009 @ 11:35 AM

Hi Dan

I find your function very useful!

Cool runtime "Time Dimension" on demand in a data warehouse environment, I would say. Just enough to display those 0 values.

One question, although when I pass in string dates like '2009-01-01 00:00:00' it works but when I pass in dynamic MS SQL date functions like the following it breaks.

When I run this in MS SQL 2000
select
*
from
dbo.generateDateTable(dateadd(d, -30, getutcdate()), getutcdate(), 'day', 1)

I get a :

SQL Error: Line 4: Incorrect syntax near '('.

Is the function accepting only string value dates?

Thank you

Kubilay
Dan G. Switzer, II
Jan 13, 2009 @ 1:32 PM

@Kubilay:

Are you using the example you posted exactly? I don't have quick access to a MSSQL 2000 at the moment, but it should work fine with dynamic datetime functions. I'm using the exact syntax above in a query I've been running. The function just requires a datetime object.

From your error message, it looks like maybe you don't have the correct number of closing parenthesis in your source code (but the example you posted above is good.)
Laurie
Mar 13, 2009 @ 1:20 PM

this is a seriously useful function - thanks for taking the time to post it online
Abraham Lloyd
May 12, 2009 @ 2:55 PM

You are a rockstar, daddio. Your function performs well, is crazy flexible, and your *.sql code is neat / clean / easy to follow. +1, and well done!
John Schroeder
Nov 18, 2009 @ 9:21 PM

@Kubilay, @Dan,

First, thanks for the great function, Dan!

I get the same error Kubilay does with dynamic datetime functions IF I run the generateDateTable() function from another database. This happens even if I prefix the function with the name of the database in which I've stored it. This does NOT happen when static values are used. I'm using the examples from the body of Dan's post:

-- ex 1: throws Kubilay's error when db is not prr_adjunct, works fine in prr_adjunct
select
  *
from
  prr_adjunct.dbo.generateDateTable(dateadd(d, -30, getutcdate()), getutcdate(), 'day', 1)

-- ex 2: works fine from another database
select
  *
from
  prr_adjunct.dbo.generateDateTable('01-01-2009 8:00:00', '01-01-2009 17:00:00', 'hour', 1)

Hmmm...

John
Dan G. Switzer, II
Nov 19, 2009 @ 8:43 AM

@John:

While the error is extremely unhelpful, I wonder if it's permissions problem.

I just tried running the function in Management Studio from different databases and had no problems. This syntax worked fine for me in various databases, including the AdventureWorks:

select
*
from
prr_adjunct.dbo.generateDateTable(dateadd(d, -30, getutcdate()), getutcdate(), 'day', 1)

It is strange that it appears to fail on the internal SQL functions, but still feels like this might be permission based.

However, I just found this page:
http://www.megasolutions.net/Sqlserver/Cross-datab...

This person was having the same problem and their solution was to use OPENQUERY to solve the issue.
das
Mar 22, 2014 @ 9:39 PM

Thanks very much for this procedure Dan. It's helped with a number of reports! Recently I needed it in MySQL, converted it and thought I'd share.

drop procedure if exists generateDateTable;

DELIMITER //

CREATE procedure generateDateTable
(
  start_date datetime
  , end_date datetime
  , datepart varchar(20)
)

begin

    DECLARE step INT DEFAULT 1;
    DECLARE new_start datetime DEFAULT 0;

    if datepart = null OR datepart = '' then
        set datepart = 'day';
    end if;

  if datepart in ('year', 'yy', 'yyyy', 'quarter', 'qq', 'q', 'month', 'mm', 'm', 'dayofyear', 'dy', 'y', 'day', 'dd', 'd', 'week', 'wk', 'ww') then
         set start_date = cast(floor(cast(start_date as datetime)) as datetime);
    set end_date = cast(floor(cast(end_date as datetime)) as datetime);
  end if;
DROP TEMPORARY TABLE IF EXISTS dates;
CREATE TEMPORARY TABLE IF NOT EXISTS dates
(                 start_date datetime        ,         end_date datetime);

  while start_date <= end_date do
    set new_start = (case
      when datepart in ('year', 'yy', 'yyyy') then date_add(start_date,INTERVAL step YEAR)
      when datepart in ('month', 'mm', 'm') then date_add(start_date,INTERVAL step MONTH)
      when datepart in ('day', 'dd', 'd') then date_add(start_date,INTERVAL step DAY)
      when datepart in ('week', 'wk', 'ww') then date_add(start_date,INTERVAL step WEEK)
      when datepart in ('hour', 'hh') then date_add(start_date,INTERVAL step HOUR)
      when datepart in ('minute', 'mi', 'n') then date_add(start_date,INTERVAL step MINUTE)
      when datepart in ('second', 'ss', 's') then date_add(start_date,INTERVAL step SECOND)
      else ''
    end);
    insert into
      dates
    (
      start_date
      , end_date
    ) values (
      start_date
      , date_add(new_start,INTERVAL -3 MICROSECOND)
    );
    set start_date = new_start;
  end while;
            select * from dates;
end//

Comments for this entry have been disabled.