Unit Tests and LINQ Queries

Wednesday, July 7, 2010

It's easy to test LINQ queries when they always execute entirely in memory, as with LINQ to Objects or LINQ to XML. All you need to do is put some in-memory data together and pass it to the code executing the query.
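
For example, given a simple Movie class with ID and Title properties (a minimal sketch, and the class, data, and test framework assertion here are only illustrative), a test can run the query directly against a List<Movie>:

var movies = new List<Movie>
{
    new Movie { ID=1, Title="Star Wars" },
    new Movie { ID=2, Title="Memento" }
};

// the query executes entirely in memory with LINQ to Objects
var result = movies.Where(movie => movie.Title.Contains("Star"))
                   .Take(1);

Assert.IsTrue(result.Any());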

It’s an entirely different scenario when you work with a remote LINQ provider (like NHibernate, Entity Framework, WCF Data Services, LINQ to Twitter, and all those types of technologies). Take the following simple query as an example.

var movies =
    _ctx.Movies
        .Where(movie => movie.Title.Contains("star"))
        .Take(1);

Let's imagine this query exists inside of a component (like an MVC controller). You want to unit test the query by giving the controller an in-memory fake for the database, while in production the code will execute against SQL Server.
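
The setup inside the controller might look something like the following sketch (the IDbContext abstraction and the constructor injection are assumptions here, just one way to make the database swappable):

public class HomeController : Controller
{
    private readonly IDbContext _ctx;

    // the context arrives as a dependency, so a test can pass in a fake
    // while production code passes the real, SQL Server backed context
    public HomeController(IDbContext ctx)
    {
        _ctx = ctx;
    }

    public ActionResult Index()
    {
        var movies =
            _ctx.Movies
                .Where(movie => movie.Title.Contains("star"))
                .Take(1);

        return View(movies);
    }
}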

What could possibly go wrong?

Operators Behave Differently

If you run the query against a SQL Server database full of movies you'll find a number of results, including "Star Wars" and "Starship Dave". In a unit test, let's imagine you've faked up some Movie data:

new Movie { ID=1, Title="Star Wars" },
// ...

Now hand the data off to a controller:

var controller = new HomeController(new FakeDbContext());
var result = controller.Index() as ViewResult;
var model = result.ViewData.Model as IQueryable<Movie>;

Assert.IsTrue(model.Any());
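
The FakeDbContext is assumed to be nothing fancy, just the in-memory test data exposed as an IQueryable, along these lines:

public class FakeDbContext : IDbContext
{
    private readonly List<Movie> _movies = new List<Movie>
    {
        new Movie { ID=1, Title="Star Wars" },
        // ...
    };

    // AsQueryable wraps the list so the controller's query runs
    // with LINQ to Objects instead of against a database
    public IQueryable<Movie> Movies
    {
        get { return _movies.AsQueryable(); }
    }
}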

The unit test will never find Star Wars because String.Contains performs a case-sensitive (ordinal) comparison. Most LINQ providers for SQL Server map Contains to the LIKE operator (WHERE [t].[title] LIKE N'%star%'), and LIKE uses the collation of the SQL Server database, which is case-insensitive by default.

Not a huge problem, but if you are using "real" test data in memory, you'll see a difference in how the query behaves.

Another example is a nullable Title column in the database. You might want to see how the query behaves when the test data includes nulls:

new Movie { ID=4, Title=null }

Now the query still works against SQL Server, but the unit test fails with a NullReferenceException because LINQ to Objects tries to invoke the Contains method on a null string reference. The LIKE operator, on the other hand, quietly excludes rows where Title is NULL.
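
If you need the query itself to behave consistently in both places, one option (a sketch and a tradeoff, not a rule) is to make the null check and the case normalization explicit, since both the null comparison and ToLower typically translate to SQL:

var movies =
    _ctx.Movies
        .Where(movie => movie.Title != null &&
                        movie.Title.ToLower().Contains("star"))
        .Take(1);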

Not Supported Exceptions

Not every LINQ provider supports every LINQ operator. For example, the following query will happily work in memory, or with EF and LINQ to SQL.

var movies = _ctx.Movies
                 .GroupBy(movie => movie.ReleaseDate.Year);

But, if you throw this query at WCF Data Services you'll get back a System.NotSupportedException {"The method 'GroupBy' is not supported."}.
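
If you are stuck with such a provider, one mitigation (at the price of pulling all the movies to the client first) is to switch to LINQ to Objects before the unsupported operator:

var movies = _ctx.Movies
                 .AsEnumerable()   // everything after this point runs in memory
                 .GroupBy(movie => movie.ReleaseDate.Year);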

It's also easy to break most LINQ providers by asking for something that only works in-memory.

var movies = _ctx.Movies
                 .Where(movie => movie.Title.IndexOf('s') == 0);

With EF this generates a NotSupportedException: {"LINQ to Entities does not recognize the method 'Int32 IndexOf(Char)' method, and this method cannot be translated into a store expression."}
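
Expressing the same intent with an operator the provider does understand keeps the query translatable; StartsWith, for example, typically maps to a LIKE 's%' comparison:

var movies = _ctx.Movies
                 .Where(movie => movie.Title.StartsWith("s"));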

Moral of the Story

The goal of using test doubles and fakes is not to test your LINQ queries.

The goal is to test the objects and business logic consuming the results of the queries, and to do so without setting up and waiting for a real database.

As always, only integration tests will give you a total and complete assurance that your software will work 100% correctly in the real world*.

* Your mileage may vary.


Comments
Emad Thursday, July 8, 2010
This is not the first post that shows a flaw in "using" LINQ; I read another post talking about how using LINQ against IQueryable or IEnumerable gives different results (it was really tricky).

If all of these issues surround LINQ, then why use it? Why not just continue coding like we have for the past 5 years: use Criteria with NHibernate, use XPath with XML, etc.?

Should we wait for something better from LINQ in the future to limit such scenarios?
Daniel Auger Thursday, July 8, 2010
Interesting post, and I have to admit I never really considered this before.

LINQ syntax is just that - a syntax. The providers are the implementations, and they are a leaky abstraction due to behavioral differences. This is not unlike SQL behaving differently on different database implementations.
Scott Allen Thursday, July 8, 2010
@Daniel - Agreed.

@Emad - I think it's a reasonable problem to solve, but one has to be aware of the issues. For the most part things just work, and you can take some approaches to mitigate risks against LINQ providers that are not fully implemented (like returning IEnumerable from a layer that uses only the "safe" operators inside - that forces the rest of the operations to happen in memory).
Ryan Thursday, July 8, 2010
@Emad We use it because in many scenarios it makes our job easier; you just have to be aware of the limitations of the abstraction. The reason I prefer LINQ over ICriteria is that it is very easy to refactor because I don't have magic strings in my queries.
Doeke Friday, July 9, 2010
We prefer automated integration testing over unit testing. The cost of making code unit testable is often too high.
zihotki Monday, July 12, 2010
@Doeke, unit tests force you to write much less coupled code. And from my experience, maintaining legacy code without unit tests is a very large pain. And I hope I'll never work with your code :-p
by K. Scott Allen