Joe Developer is working on a new application for a book publisher. Authors can publish zero or more books. Books can have zero or more authors.
Joe wrote a query to get a total count of all authors, and a total count of all books. Joe read on the Internet that the DISTINCT keyword is good to use in these scenarios.
The problem is - the numbers seem too high. What's wrong? Is it easy to fix?
Comments
SELECT
COUNT(DISTINCT Authors.Id) AS TotalAuthors,
COUNT(DISTINCT Books.Id) As TotalBooks
FROM
Authors
FULL JOIN AuthorsBooks AB ON Authors.Id = AB.AuthorID
FULL JOIN Books ON AB.BookID = Books.Id
SELECT
COUNT(DISTINCT Authors.Id) AS TotalAuthors,
COUNT(DISTINCT Books.Id) As TotalBooks
FROM
Authors
FULL JOIN AuthorsBooks AB ON Authors.Id = AB.AuthorID
FULL JOIN Books ON AB.BookID = Books.Id
I also updated the post to fix a small typo (not in the code).
-- use count(1) not count(*) to speed up
select
(select count(1) from authors) as [AuthorsCount]
, (select count(1) from books) as [BooksCount]
@ yaip: This is the database diagram tool in SQL Server 2005. It is similar to the same tool in SQL 2000, just a slightly different look.
just trying to squash that myth