David Aldridge challenged me to write about functions that return lists of dates. So here goes
In blogs past I wrote about the joys of calculating year-to-date, YTD, on sparse data sets. To my mind sparsity is a good reason to not include ytd values in slightly aggregated summaries.
But that still leaves us a few methods to calculate YTD on the fly.
- David already mentioned transformation tables
- but dismissed the use of BETWEEN
- in lists of dates can also be used
- as can functions that return tables
I did a few tests on a 600,000,000 row partitioned (816 partitions) table running on Oracle 9.2.0.7 with a degree of parallel set to 2 (see Doug Burns) and bitmap indexes on all of the dimension keys. And to make things slightly more different the year starts on 1-APR-2006. So it’s apples and oranges time if you want to compare things!
The tests
My denormalised time dimension table PS_DN_DT is fairly normal as things go except that it contains various additional columns to contain the date of the beginning of the year (it changes each year) and offsets for same-date-last-year. I also have defined a few data types including DAY_LIST
create or replace TYPE “DAY_LIST” AS
TABLE OF DATE;
And my own package of time series functions. One of these returns a table of date.
function ydt_day (p_day in date) return day_list
as
l_data day_list;
l_first_day date;
begin
select first_day into l_first_day from ps_dn_dt where actual_dt= p_day;
select cast (multiset (select actual_dt from ps_dn_dt
where actual_dt between l_first_day and p_day) as day_list)
into l_data from dual;
return l_data;
end;
Now I will look at three queries based on the same range of dates, 1-APR-2006 to 16-AUG-2006, one branch ‘xxxx’ and one product ‘M111111′ and compare three of the ways to calculate YDT – I’ll leave out David’s table – even though we have something very similar in place in this database. See attached file. Note I ran each query twice
The results
On the face of it there is little real difference in the clock time for any of these queries. There are differences in the query plans but probably the most significant effect is the bitmap indexes allowing the query to zero in on the rows that match.
To tell you the truth – I use the transformation table method in real-life; the table function was a “just for fun to prove I still could code” thing


An alternative solution the YTD problem is by altering the date dimension (assuming one exists).
Add a Y/N flag YTD_COMPARE that gets updated daily based on the “current” date.
So if the year started on 1-Apr then on 15-April then for all years in my dimension the YTD_COMPARE flag would be Y for dates between 1 and 15 Apr inclusive and N otherwise.
This means reports can just join to the date dimension and include YTD_COMPARE = ‘Y’ in the where clause. Simple.
Any thoughts?
By: Paul James on August 17, 2006
at 8:08 am
Paul – sounds a nice idea but I see a few problems – true they can be worked around. Do you clear that flag at the start of a new year?
What happens if I wanted to run a YDT to the 31-Jul some time in Agust?
and if I wanted to look at YTD for a date last year.
By: Pete_S on August 17, 2006
at 12:09 pm
I like these functions for stuff that is very tricky to calculate for some reason, but for simple data sets like this they have the major failing (IMHO) of obscuring the cardinality of the result set from the optimizer.
Which suddenly gives me an idea … back in a moment …
By: David Aldridge on August 17, 2006
at 5:53 pm
I like these functions for stuff that is very tricky to calculate for some reason, but for simple data sets like this they have the major failing (IMHO) of obscuring the cardinality of the result set from the optimizer.
Which is of course what it does – see my next post
By: Pete_S on August 17, 2006
at 6:00 pm
“What happens if I wanted to run a YDT to the 31-Jul some time in Agust?
and if I wanted to look at YTD for a date last year.”
All good points, and shows that my users don’t really use data warehousing to it’s potential.
Mainly because I wouldn’t know how to present YTD at “some point in time” via Business Objects.
We do have a (month*100)+day column so this could be used for these types of queries.
And yes the flag it does get reset at the start of every year.
By: Paul James on August 18, 2006
at 7:44 am
Instead of the table of date function, how about a with clause that generates the table of dates via a connect-by query and then join that to your main query?
By: Michael Smith on November 22, 2006
at 1:33 pm
Michael,
the tricky parts would probably be that the optimizer would be unsure of the cardinality of the subquery, and that it might interfere with effective partition pruning.
By: David Aldridge on November 23, 2006
at 1:56 am
How do I calculate a specific date of the year? Is there a formula or equation to calculate it?
By: rahman on February 9, 2007
at 7:57 am
Rahman
Can you give me an example of what you mean by calculate a specific date of the year?
By: Peter Scott on February 9, 2007
at 1:47 pm
Calculate a specific date of a year from a given date of any year and precisely find the data for daylight savings
eg. 03/11/07 is the daylight savings date for 2007 so what is the date for next year daylight savings date which is 03/09/07. Thank you
By: rahman on March 14, 2007
at 5:33 am
Sorry, that is not the sort of thing I would do.
I would think about using inbuilt timezone functionality within Oracle or the operating system to work this sort of thing out. You have to trust the vendor to get it right though.
By: Peter Scott on March 16, 2007
at 10:48 pm