I love the little challenges Ayende Rahien puts on his blog. The last SQL challenge was to convert a table like this:
| FromDate |
ToDate |
| 01/01/2000 |
01/31/2000 |
| 02/01/2000 |
02/28/2000 |
| 03/05/2000 |
03/31/2000 |
Into this:
| FromDate |
ToDate |
| 01/01/2000 |
02/28/2000 |
| 03/05/2000 |
03/31/2000 |
Where adjacent date ranges collapse to a single record. The first step was to setup some experimental tables.
CREATE TABLE #InDates
(
FromDate datetime,
ToDate datetime
)
INSERT INTO #InDates VALUES('1/1/2000', '1/31/2000')
INSERT INTO #InDates VALUES('2/1/2000', '2/28/2000')
INSERT INTO #InDates VALUES('3/5/2000', '3/31/2000')
CREATE TABLE #OutDates
(
FromDate datetime,
ToDate datetime
)
I have to think about SQL in small chunks, so I first thought about how to get a resultset with a FromDate minus one day. That would give me a column to compare against another record's ToDate:
SELECT DATEADD(dd, -1, FromDate) AS FromDateMinusOne, ToDate FROM #InDates
Typically these kind of solutions involve a self join, so I added the above into the following query:
SELECT *
FROM #InDates I
LEFT JOIN
(SELECT
DATEADD(dd, -1, FromDate) AS FromDateMinusOne,
ToDate
FROM #InDates) M
ON I.ToDate = M.FromDateMinusOne
Which generates the following resultset:
| FromDate |
ToDate |
FromDate |
ToDate |
| 2000-01-01 |
2000-01-31 |
2000-01-31 |
2000-02-28 |
| 2000-02-01 |
2000-02-28 |
NULL |
NULL |
| 2000-03-05 |
2000-03-31 |
NULL |
NULL |
Looking at that resultset it was easy to see that coalescing the ToDate into the NULL value left behind by a LEFT JOIN and using a GROUP BY would process the sample data.
INSERT INTO #OutDates
SELECT
MIN(I.FromDate),
COALESCE(M.ToDate, I.ToDate) AS ToDate
FROM #InDates I
LEFT JOIN
(SELECT DATEADD(dd, -1, FromDate) AS FromDateMinusOne, ToDate
FROM #InDates) M ON I.ToDate = M.FromDateMinusOne
GROUP BY COALESCE(M.ToDate, I.ToDate)
Then Ayende had to up the ante by updating the post with a harder challenge. This one has me scratching my head. I ended up with the above SELECT query inside a SQL 2005 common table expression. I’m sure someone will come along and tell me the following query is horribly inefficient, and terribly wrong:
WITH CollapsedDates(FromDate, ToDate)
AS
(
SELECT
MIN(T1.FromDate),
COALESCE(T2.ToDate, T1.ToDate)
FROM #InDates T1
LEFT JOIN
(SELECT DATEADD(dd, -1, FromDate) AS FromDateMinusOne, ToDate
FROM #InDates) T2 ON T1.ToDate = T2.FromDateMinusOne
GROUP BY COALESCE(T2.ToDate, T1.ToDate)
)
SELECT
MIN(FromDate),
ToDate
FROM
(SELECT
CD1.FromDate AS FromDate,
(SELECT MAX(ToDate) FROM CollapsedDates CD2
WHERE CD1.ToDate BETWEEN CD2.FromDate AND CD2.ToDate) AS ToDate
FROM CollapsedDates CD1) UGLY
GROUP BY ToDate
The query plan is an ugly beast (but not the ugliest). I think I need to step away from this problem. There has to be a more elegant solution.