OdeToCode IC Logo

GROUP BY and Column Aliases

Monday, April 18, 2005

Here is a subtle bug you can introduce into a query. Let’s pretend we are tracking coffee sales.

CREATE TABLE CoffeeSales

(

    TransactionID int IDENTITY PRIMARY KEY,

    Customer varchar(10) NOT NULL,

    CoffeeType varchar(10) NOT NULL,

    TransactionDate datetime NOT NULL,

    Cost money NOT NULL

)

 

INSERT INTO CoffeeSales

   VALUES('Joe', 'Decaf', '2004-04-17 07:30:00', 2.25)

INSERT INTO CoffeeSales

   VALUES('Joe', 'Decaf', '2004-04-17 09:00:00', 3.25)

INSERT INTO CoffeeSales

   VALUES('Joe', 'Mocha', '2004-04-17 17:00:00', 5.00)

 

INSERT INTO CoffeeSales

   VALUES('Amy', 'Maple', '2004-04-17 15:00:00', 1.25)

INSERT INTO CoffeeSales

   VALUES('Amy', 'Mocha', '2004-04-17 16:00:00', 4.75)

INSERT INTO CoffeeSales

   VALUES('Amy', 'Mocha', '2004-04-17 17:00:00', 3.50)

Here is a query attempting to total up the sales for each customer by coffee type and date.

SELECT

    Customer,

    CoffeeType,

    CONVERT(varchar(10), TransactionDate, 101) AS TransactionDate,

    SUM(Cost) AS TotalCost,

    COUNT(*) AS TotalTransactions

 

FROM

    CoffeeSales

 

GROUP BY

    Customer, CoffeeType, TransactionDate

 

ORDER BY Customer

 

Customer   CoffeeType TransactionDate TotalCost            TotalTransactions

---------- ---------- --------------- --------------------- -----------------

Amy        Maple      04/17/2004      1.2500                1

Amy        Mocha      04/17/2004      4.7500                1

Amy        Mocha      04/17/2004      3.5000                1

Joe        Decaf      04/17/2004      2.2500                1

Joe        Decaf      04/17/2004      3.2500                1

Joe        Mocha      04/17/2004      5.0000                1

We have 6 records, and when eyeballing the resultset we don’t see why Amy’s mocha purchases and Joe’s decaf purchases are not grouping up, after all, the name, coffee type, and transaction date are identical.

To realize what went wrong we have to think about how the database engine executes the query. Records are grouped before the CONVERT expression in the SELECT list executes. We think we are grouping by the converted TransactionDate field used as a column alias, but what SQL is doing is grouping on the raw TransactionDate field from the table, then executing the CONVERT and showing the field as just a date.

A working version follows.

SELECT

    Customer,

    CoffeeType,

    CONVERT(varchar(10), TransactionDate, 101) AS TransactionDate,

    SUM(Cost) AS TotalCost,

    COUNT(*) AS TotalTransactions

 

FROM

    CoffeeSales

 

GROUP BY

    Customer, CoffeeType, CONVERT(varchar(10), TransactionDate, 101)

 

ORDER BY Customer