Dynamically generating a table of dates in Microsoft SQL (Part Deux!)

Posted by Dan on Jan 8, 2009 @ 11:06 AM

Yesterday I wrote a blog entry talking about dynamically generating a table of dates in MSSQL 2005. My original solution involved common table expressions (CTE) and involved recursion. While there's nothing wrong with using recursion this way, I mainly was using recursion to simplify the UDF code into a single SQL expression.

However, the more I started thinking about the problem last night, I decided I really needed to revise things a bit further. I decided I really needed to have the query return both a start and end date for each row in the recordset—as this would make it much easier to things such as find all the orders placed during each hour of the day.

So, this morning I came in and re-wrote my query and optimized a few things. The first thing I did was remove the recursion, so this UDF should now work in SQL Server 2000 as well as SQL Server 2005. It's important to note that you could do the same thing with a physical table, but for my uses my date range table generally would only include at most a few dozen rows and there's not enough of a performance difference to warrant the pain of maintaining a table of every date and possible combination you may want to return aggregate results over.

The new code looks like this:

SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO -- ============================================= -- Author: Dan G. Switzer, II -- Create date: Jan 8, 2009 -- Description: Creates a table with a set of date ranges -- ============================================= create function generateDateTable ( -- Add the parameters for the function here @start_date datetime , @end_date datetime , @datepart varchar(20) = 'day' , @step int = 1 ) returns @dates table ( -- Add the column definitions for the TABLE variable here start_date datetime, end_date datetime ) as begin -- if we're doing calculations based on the days (not time) then strip the time out if( @datepart in ('year', 'yy', 'yyyy', 'quarter', 'qq', 'q', 'month', 'mm', 'm', 'dayofyear', 'dy', 'y', 'day', 'dd', 'd', 'week', 'wk', 'ww') ) begin set @start_date = cast(floor(cast(@start_date as float)) as datetime) set @end_date = cast(floor(cast(@end_date as float)) as datetime) end declare @new_start datetime while @start_date <= @end_date begin -- get the new starting row set @new_start = (case when @datepart in ('year', 'yy', 'yyyy') then dateadd(yy, @step, @start_date) when @datepart in ('quarter', 'qq', 'q') then dateadd(qq, @step, @start_date) when @datepart in ('month', 'mm', 'm') then dateadd(mm, @step, @start_date) when @datepart in ('dayofyear', 'dy', 'y') then dateadd(dy, @step, @start_date) when @datepart in ('day', 'dd', 'd') then dateadd(dd, @step, @start_date) when @datepart in ('week', 'wk', 'ww') then dateadd(ww, @step, @start_date) when @datepart in ('hour', 'hh') then dateadd(hh, @step, @start_date) when @datepart in ('minute', 'mi', 'n') then dateadd(n, @step, @start_date) when @datepart in ('second', 'ss', 's') then dateadd(s, @step, @start_date) when @datepart in ('millisecond', 'ms') then dateadd(ms, @step, @start_date) else dateadd(dd, @step, @start_date) end) -- insert a new row insert @dates ( start_date , end_date ) values ( @start_date -- since MSSQL is only accurate to 3.33ms, we need to subtract 3 ms to get the upper range in time , dateadd(ms, -3, @new_start) ) -- update the starting row set @start_date = @new_start end return end GO

The key changes are the query now returns both a start_date and end_date for each row, there's no no recursion and for dateparts that don't involve time intervals, the UDF will strip out the time automatically. This makes it very easy to use with the getdate() or getutcdate() functions in SQL Server.

Now I can easily generate a table of date ranges for the past 30 days:

select * from dbo.generateDateTable(dateadd(d, -30, getutcdate()), getutcdate(), 'day', 1)

This will generate a table that looks like this:

Resultset for last 30 days

Or, I can get a resultset broken down by hours in the day (from 8am until 5pm:)

select * from dbo.generateDateTable('01-01-2009 8:00:00', '01-01-2009 17:00:00', 'hour', 1)

This would look like this:

Resultset of Jan 1, 2009 from 8am until 5pm

The introduction of the start_date and end_date columns also changes the syntax of the queries a bit. To take use of the start_date and end_date columns, you'll want to join rows based on whether the time falls between the start_date and end_date columns.

So, if we wanted to find out the total number of orders placed in each our of our business day (from 8am until 6pm) we'd use a query like this:

select /* we need to add all the order counts together, since there will be multiple rows for each match found within our start_date and end_date range */ d.start_date as OrderDate, sum(isnull(o.OrderCount, 0)) as OrderCount from -- create a table with hour intervals between 8am and 5pm dbo.generateDateTable('1-6-09 8:00:00', '1-6-09 17:00:00', 'hour', 1) as d left outer join -- generate an aggregate count of orders within our timeframe ( select OrderDate, count(OrderId) as OrderCount from Order group by OrderDate ) as o on -- join each match based if they fit within our timeframe o.OrderDate between d.start_date and d.end_date -- we need to by our start_date for our aggregate count group by d.start_date order by d.start_date

This would produce a resultset that might look like this:

Resultset of orders placed between 8am and 5pm

Here's what our recordset would look like for orders placed on Jan 6, 2009:

Resultset for Jan 6, 2009

As you can see, our aggregate query is showing the correct results for hours broken down by hour for this day.

If you have any other tips or suggestions, I'd love to hear them.

This approach might not be the best approach for every problem, but it'll work well for when you have date/time ranges that are very dynamic and are limited to a pretty small resultset (under a few hundred rows.) This works perfectly for creating records sets of the last few years broken into quarters, or the past year broken down by month, etc.

Categories: SQL, HTML/ColdFusion

8 Comments

  • This is very handy Dan. I need to dig up my old business day routines and see if I could make this apply.

    Thanks for posting.
  • Hi Dan

    I find your function very useful!

    Cool runtime "Time Dimension" on demand in a data warehouse environment, I would say. Just enough to display those 0 values.

    One question, although when I pass in string dates like '2009-01-01 00:00:00' it works but when I pass in dynamic MS SQL date functions like the following it breaks.

    When I run this in MS SQL 2000
    select
      *
    from
      dbo.generateDateTable(dateadd(d, -30, getutcdate()), getutcdate(), 'day', 1)

    I get a :

    SQL Error: Line 4: Incorrect syntax near '('.

    Is the function accepting only string value dates?

    Thank you

    Kubilay
  • @Kubilay:

    Are you using the example you posted exactly? I don't have quick access to a MSSQL 2000 at the moment, but it should work fine with dynamic datetime functions. I'm using the exact syntax above in a query I've been running. The function just requires a datetime object.

    From your error message, it looks like maybe you don't have the correct number of closing parenthesis in your source code (but the example you posted above is good.)
  • this is a seriously useful function - thanks for taking the time to post it online
  • You are a rockstar, daddio. Your function performs well, is crazy flexible, and your *.sql code is neat / clean / easy to follow. +1, and well done!
  • @Kubilay, @Dan,

    First, thanks for the great function, Dan!

    I get the same error Kubilay does with dynamic datetime functions IF I run the generateDateTable() function from another database. This happens even if I prefix the function with the name of the database in which I've stored it. This does NOT happen when static values are used. I'm using the examples from the body of Dan's post:

    -- ex 1: throws Kubilay's error when db is not prr_adjunct, works fine in prr_adjunct
    select
      *
    from
      prr_adjunct.dbo.generateDateTable(dateadd(d, -30, getutcdate()), getutcdate(), 'day', 1)

    -- ex 2: works fine from another database
    select
      *
    from
      prr_adjunct.dbo.generateDateTable('01-01-2009 8:00:00', '01-01-2009 17:00:00', 'hour', 1)

    Hmmm...

    John
  • @John:

    While the error is extremely unhelpful, I wonder if it's permissions problem.

    I just tried running the function in Management Studio from different databases and had no problems. This syntax worked fine for me in various databases, including the AdventureWorks:

    select
    *
    from
    prr_adjunct.dbo.generateDateTable(dateadd(d, -30, getutcdate()), getutcdate(), 'day', 1)

    It is strange that it appears to fail on the internal SQL functions, but still feels like this might be permission based.

    However, I just found this page:
    http://www.megasolutions.net/Sqlserver/Cross-datab...

    This person was having the same problem and their solution was to use OPENQUERY to solve the issue.
  • Thanks very much for this procedure Dan. It's helped with a number of reports! Recently I needed it in MySQL, converted it and thought I'd share.

    drop procedure if exists generateDateTable;

    DELIMITER //

    CREATE procedure generateDateTable
    (
      start_date datetime
      , end_date datetime
      , datepart varchar(20)
    )

    begin

        DECLARE step INT DEFAULT 1;
        DECLARE new_start datetime DEFAULT 0;

        if datepart = null OR datepart = '' then
            set datepart = 'day';
        end if;

      if datepart in ('year', 'yy', 'yyyy', 'quarter', 'qq', 'q', 'month', 'mm', 'm', 'dayofyear', 'dy', 'y', 'day', 'dd', 'd', 'week', 'wk', 'ww') then
             set start_date = cast(floor(cast(start_date as datetime)) as datetime);
          set end_date = cast(floor(cast(end_date as datetime)) as datetime);
      end if;
    DROP TEMPORARY TABLE IF EXISTS dates;
    CREATE TEMPORARY TABLE IF NOT EXISTS dates
    (                 start_date datetime        ,         end_date datetime);

      while start_date <= end_date do
        set new_start = (case
          when datepart in ('year', 'yy', 'yyyy') then date_add(start_date,INTERVAL step YEAR)
          when datepart in ('month', 'mm', 'm') then date_add(start_date,INTERVAL step MONTH)
          when datepart in ('day', 'dd', 'd') then date_add(start_date,INTERVAL step DAY)
          when datepart in ('week', 'wk', 'ww') then date_add(start_date,INTERVAL step WEEK)
          when datepart in ('hour', 'hh') then date_add(start_date,INTERVAL step HOUR)
          when datepart in ('minute', 'mi', 'n') then date_add(start_date,INTERVAL step MINUTE)
          when datepart in ('second', 'ss', 's') then date_add(start_date,INTERVAL step SECOND)
          else ''
        end);
        insert into
          dates
        (
          start_date
          , end_date
        ) values (
          start_date
          , date_add(new_start,INTERVAL -3 MICROSECOND)
        );
        set start_date = new_start;
      end while;
                select * from dates;
    end//

Comments for this entry have been disabled.