
There Is Always Risk In Portability

Wednesday, May 7, 2008 by scott

After my last post, someone asked me if the "portable" repository pattern was really a good idea. He was referring to the fact that the LINQ queries in the MVC Storefront and Background Motion applications would sometimes execute against in-memory collections (for unit testing), while the rest of the time the queries would execute against a relational database. Isn't there a huge risk in developers not knowing if the software really works with the database?

I don't think of the repository as a "portability" layer, although since it is an abstraction layered on top of the data access code it can provide some nice indirections, like the ability to switch the persistence store. Is this risky? Sure, there is always some element of risk in portability. Just ask anyone who has written code with a portable UI toolkit, or in HTML for that matter. You don't know what is going to happen until the 1s and 0s hit the silicon.

But …

That's not the job for unit tests. Ideally, you'll have some other tests to verify what happens when the "production" code runs.

Before continuing, I must say that in the last post I neglected to tell you that the brainy Mindscape team and Andrew Peters are responsible for the Background Motion web site, and the code that powers the site. Make sure to visit the site and marvel at the beauty of New Zealand, then drop into the Mindscape blogs. Everyone - let's hear it for New Zealand!

What Is This Risk You Speak Of?

You can write a LINQ query that works fine against in-memory collections, but that can fail spectacularly when you swap in a remote LINQ provider. Here is an obvious example:

var result =
    from a in dc.Addresses
    where !String.IsNullOrEmpty(a.PostalCode)
    select a;

This query is happy to execute using LINQ to Objects, but it fails with an exception if LINQ to SQL is sitting behind the sequence (NotSupportedException: Method 'Boolean IsNullOrEmpty(System.String)' has no supported translation to SQL).
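One way to reduce that risk is to express the check with operators the remote provider knows how to translate. Here is a sketch using LINQ to Objects against a hypothetical Address class (the class and data are illustrative, not the actual entity model):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Address   // hypothetical entity for illustration
{
    public string PostalCode { get; set; }
}

public class Program
{
    public static void Main()
    {
        var addresses = new List<Address>
        {
            new Address { PostalCode = "21201" },
            new Address { PostalCode = null },
            new Address { PostalCode = "" }
        };

        // The null and empty-string comparisons translate to
        // IS NOT NULL and <> '' in SQL, so the same query should
        // behave against both providers.
        var result =
            from a in addresses
            where a.PostalCode != null && a.PostalCode != ""
            select a;

        Console.WriteLine(result.Count());  // 1
    }
}
```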

Those types of problems are easy to spot in automated integration testing because exceptions are relatively easy to track down. The real risk is in the queries that don't flame out in spectacular fashion, but execute successfully with slight variations. Here is one example:

var distinctPostalCodes =
    (from a in addresses
     orderby a.State ascending
     select new { a.PostalCode, a.State }
    ).Distinct();

This query wants to get a distinct list of zip codes and states for all our customers, and order the list by state. It works perfectly with LINQ to Objects, and executes successfully in LINQ to SQL. There's just one tiny problem you might observe in the generated SQL:

SELECT DISTINCT [t0].[PostalCode], [t0].[State] FROM [dbo].[Address] AS [t0]

Notice the distinct (pardon the pun) lack of an ORDER BY clause. If the upper layers were expecting the results sorted by State then we have problems.

It turns out that LINQ to SQL throws out an inner OrderBy operator when the Distinct operator comes into play. This could be for several reasons, but the most likely reason is DISTINCT and ORDER BY have an uneasy relationship in ANSI SQL (it's not just MS SQL). You can read more about this on Jeff Smith's blog: SELECT DISTINCT and ORDER BY, and there is another good explanation here: Some Common Mis-conceptions about DISTINCT.

One also has to wonder whether Distinct might reorder the results in its quest to remove duplicates – it's not explicitly documented that it doesn't. In this case, it's better to forgo the query comprehension syntax and make the pipeline of operators more explicit:

var distinctPostalCodes =
    addresses.Select(a => new { a.PostalCode, a.State })
             .Distinct()
             .OrderBy(a => a.State);

This forces LINQ to SQL to generate a safe query with the expected results.

Is there risk? Sure – and it's not just in LINQ to SQL. Any multi-target technology runs the same risk. You just need an awareness and safety net (in the form of tests) to mitigate the risk.

Contrasting Two MVC / LINQ to SQL Applications for the Web

Tuesday, May 6, 2008 by scott

There are two applications on CodePlex that are interesting to compare and contrast. The MVC Storefront and Background Motion.

MVC Storefront

MVC Storefront is Rob Conery's work. You can watch Rob lift up the grass skirt as he builds this application in a series of webcasts (currently up to Part 8). Rob is using the ASP.NET MVC framework and LINQ to SQL. The Storefront is a work in progress and Rob is actively soliciting feedback on what you want to see.

At its lowest level, the Storefront uses a repository type pattern.

public interface ICatalogRepository {
    IQueryable<Category> GetCategories();
    IQueryable<Product> GetProducts();
    IQueryable<ProductReview> GetReviews();
    IQueryable<ProductImage> GetProductImages();
}

The repository interface is implemented by a TestCatalogRepository (for testing), and a SqlCatalogRepository (when the application needs to get real work done). Rob uses the repositories to map the LINQ to SQL generated classes into his own model, like the following code that maps a ProductImage (the LINQ generated class with a [Table] attribute) into a ProductImage (the Storefront domain class).

public IQueryable<ProductImage> GetProductImages() {
    return from i in ReadOnlyContext.ProductImages
           select new ProductImage
           {
               ID = i.ProductImageID,
               ProductID = i.ProductID,
               ThumbnailPhoto = i.ThumbUrl,
               FullSizePhoto = i.FullImageUrl
           };
}

Notice the repository also allows IQueryable to "float up", which defers the query execution. The repositories are consumed by a service layer that the application uses to pull data. Here is an excerpt of the CatalogService.

public Category GetCategory(int id) {
    Category result = _repository.GetCategories()
        .WithCategoryID(id)
        .SingleOrDefault();

    return result;
}

Controllers in the web application then consume the CatalogService.

public ActionResult Show(string productcode) {
    CatalogData data = new CatalogData();
    CatalogService svc = new CatalogService(_repository);

    data.Categories = svc.GetCategories();
    data.Product = svc.GetProduct(productcode);

    return RenderView("Show", data);
}

Another interesting abstraction in Rob's project is LazyList<T> - an implementation of IList<T> that wraps an IQueryable<T> to provide lazy loading of a collection. LINQ to SQL provides this behavior with the EntitySet<T>, but Rob, who is isolating his upper layers from LINQ to SQL, needs a different strategy. I'm not a fan of the GetCategories method in CatalogService – that looks like a join the repository should put together for the service, and the service layer itself doesn't appear to add a tremendous amount of value, but overall the code is easy to follow and tests are provided. Keep it up, Rob!
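For the curious, the gist of a lazy list can be sketched in a few lines. This is not Rob's implementation, just the idea: delegate every IList<T> member to an inner list that isn't materialized until first use.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class LazyList<T> : IList<T>
{
    private readonly IQueryable<T> _query;
    private IList<T> _inner;

    public LazyList(IQueryable<T> query)
    {
        _query = query;
    }

    // The query executes here, on first access - not at construction.
    private IList<T> Inner
    {
        get { return _inner ?? (_inner = _query.ToList()); }
    }

    public T this[int index]
    {
        get { return Inner[index]; }
        set { Inner[index] = value; }
    }

    public int Count { get { return Inner.Count; } }
    public bool IsReadOnly { get { return Inner.IsReadOnly; } }
    public void Add(T item) { Inner.Add(item); }
    public void Clear() { Inner.Clear(); }
    public bool Contains(T item) { return Inner.Contains(item); }
    public void CopyTo(T[] array, int index) { Inner.CopyTo(array, index); }
    public int IndexOf(T item) { return Inner.IndexOf(item); }
    public void Insert(int index, T item) { Inner.Insert(index, item); }
    public bool Remove(T item) { return Inner.Remove(item); }
    public void RemoveAt(int index) { Inner.RemoveAt(index); }
    public IEnumerator<T> GetEnumerator() { return Inner.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```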

Background Motion

The Background Motion (BM) project carries significantly more architectural weight. Not saying this is better or worse, but you know any project using the Web Client Software Factory is not going to be short on abstractions and indirections.

Unlike the Storefront app, the BM app uses a model that is decorated with LINQ to SQL attributes like [Table] and [Column]. BM has a more traditional repository pattern and leverages both generics and expression trees to give the repository more functionality and flexibility.

public interface IRepository<T> where T : IIdentifiable
{
  int Count();
  int Count(Expression<Func<T, bool>> expression);
  void Add(T entity);
  void Remove(T entity);
  void Save(T entity);
  T FindOne(int id);
  T FindOne(Expression<Func<T, bool>> expression);
  bool TryFindOne(Expression<Func<T, bool>> expression, out T entity);
  IList<T> FindAll();
  IList<T> FindAll(Expression<Func<T, bool>> expression);
  IQueryable<T> Find();
  IQueryable<T> Find(int id);
  IQueryable<T> Find(Expression<Func<T, bool>> expression);
}

Notice the BM repositories will "float up" a deferred query into higher layers by returning an IQueryable, and allow higher layers to "push down" a specification in the form of an expression tree. Combining this technique with generics means you get a single repository implementation for all entities and minimal code. Here is the DLinqRepository implementation of IRepository<T>'s Find method.

public override IQueryable<T> Find(Expression<Func<T, bool>> expression)
{
  return DataContext.GetTable<T>().Where(expression);
}

Where FindOne can be used like so:

Member member = Repository<Member>.FindOne(m => m.HashCookie == cookieValue);

BM combines the repositories with a unit of work pattern and consumes both directly in the website controllers.

public IList<Contribution> GetMostRecentByContentType(int contentTypeId, int skip)
{
  using (UnitOfWork.Begin())
  {
    return ModeratedContributions
      .Where(c => c.ContentTypeId == contentTypeId)
      .OrderByDescending(c => c.AddedOn)
      .Skip(skip)
      .Take(Constants.PageSize)
      .ToList();
  }
}

The Background Motion project provides stubbed implementations of all the required repositories and an in-memory unit of work class for unit testing, although the test names leave something to be desired. One of the interesting classes in the BM project is LinqContainsPredicateBuilder – a class whose Build method takes a collection of objects and a target property name. The Build method returns an expression tree that checks to see if the target property equals any of the values in the collection (think of the IN clause in SQL).
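The idea behind such a builder can be sketched with the Expression API. This is not the BM implementation - just a minimal illustration that ORs together an equality check for each value, which a remote provider can translate into an IN clause or OR chain (the City class is a hypothetical example type):

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public class City   // hypothetical type for demonstration
{
    public string State { get; set; }
}

public static class ContainsPredicate
{
    public static Expression<Func<T, bool>> Build<T>(
        string propertyName, object[] values)
    {
        ParameterExpression parameter = Expression.Parameter(typeof(T), "x");
        MemberExpression property = Expression.Property(parameter, propertyName);

        // Start from false and OR in one equality test per value:
        // x => false || x.Prop == v1 || x.Prop == v2 ...
        Expression body = Expression.Constant(false);
        foreach (object value in values)
        {
            body = Expression.OrElse(body,
                Expression.Equal(property,
                                 Expression.Constant(value, property.Type)));
        }
        return Expression.Lambda<Func<T, bool>>(body, parameter);
    }
}
```

Against LINQ to Objects you can Compile the result and use it as an ordinary delegate; against LINQ to SQL the same tree would be handed to Where for translation.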

If you want to see Background Motion in action, check out backgroundmotion.com!

The XML Namespace Tax

Monday, May 5, 2008 by scott

While XML literal features in Visual Basic get all the love, the new XElement API for the CLR makes working with XML in C# a bit more fun, too. It's a prime cut of functional programming spiced with syntactic sugar.

One example is how the API works with XML namespaces. When namespaces are present, they demand attention in almost every XML operation you can perform. It's like a tax you need to pay that doesn't pay back any benefits. An old poll on xml-dev once asked people to list their "favorite five bad problems" with XML, to which Peter Hunsberger replied:

  1. Namespaces
  2. Namespaces
  3. Namespaces
  4. Namespaces
  5. Namespaces

And in a different message, Joe English hit the nail on the head:

I'd rather treat element type names and attribute names as simple, atomic strings. This is possible with a sane API, but most XML APIs aren't sane.

The API we had pre .NET 3.5 was a fill-out-this-form-in-triplicate-and-wait-quietly-in-line bureaucracy living inside System.Xml. The new API tries to be a bit saner:

XNamespace xmlns = "http://schemas.foo.com/widgets";

XDocument doc = new XDocument(
    new XElement(xmlns + "Widgets",
        new XElement(xmlns + "Widget", new XAttribute("ID", 1)),
        new XElement(xmlns + "Widget", new XAttribute("ID", 2))));

Which produces:

<Widgets xmlns="http://schemas.foo.com/widgets">
  <Widget ID="1" />
  <Widget ID="2" />
</Widgets>

It's the little things you don't notice at first that make the API easier. Like XNamespace having an implicit conversion from string, and redefining the + operator to combine itself with a string to form a full XName (which also has an implicit conversion operator from string).
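The same + trick keeps queries terse, too. A quick sketch against a document like the one above:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

XNamespace xmlns = "http://schemas.foo.com/widgets";   // implicit conversion from string

XDocument doc = XDocument.Parse(
    @"<Widgets xmlns='http://schemas.foo.com/widgets'>
        <Widget ID='1' />
        <Widget ID='2' />
      </Widgets>");

// xmlns + "Widget" builds the fully qualified XName once,
// instead of making you spell out the namespace in every comparison.
var ids = doc.Descendants(xmlns + "Widget")
             .Select(w => (int)w.Attribute("ID"))
             .ToList();

Console.WriteLine(string.Join(", ", ids));   // 1, 2
```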

Someone spent some time designing this API for users instead of for a standards body, and it's much appreciated.

Mocks - It's A Question Of When

Friday, May 2, 2008 by scott

Ross Neilson reminded me about a question I left hanging - "when should I use a mock object framework?"

If you have to ask "when", the answer is probably "not now". I feel that mock object frameworks are something you have to evolve into.

First, we can talk about mocks in general. Some people have a misconception that mock objects are only useful if you need to simulate interaction with a resource that is difficult to use in unit tests - like an object that communicates with an SMTP server. This isn't true. Colin Mackay has an article on mocks that outlines other common scenarios:

  • The real object has nondeterministic behavior
  • The real object is difficult to set up
  • The real object has behavior that is hard to trigger
  • The real object is slow
  • The real object is a user interface
  • The real object uses a call back
  • The real object does not yet exist

If we were to step back and generalize this list, we'd say test doubles are useful when you want to isolate code under test. Isolation is good. Let's say we are writing tests for a business component. We wouldn't want the tests to fail when someone checked in bad code for an auditing service the business component uses. We only want the test to fail when something is wrong with the business component itself. Providing a mock auditing service allows us to isolate the business component and control any stimuli the component may pick up from its auditing service. When you start feeling the pain of writing numerous test doubles by hand, you'll know you need a mock object framework.
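A hand-rolled test double takes only a few lines - the pain shows up when you need dozens of them, each drifting out of date behind its interface. A sketch (the interface here is illustrative):

```csharp
using System.Collections.Generic;

public interface IAuditService
{
    void WriteMessage(string message);
}

// A hand-written test double: records calls so a test can inspect them.
// Easy to write once; tedious to write for every dependency, which is
// when a mock object framework starts to earn its keep.
public class FakeAuditService : IAuditService
{
    public readonly List<string> Messages = new List<string>();

    public void WriteMessage(string message)
    {
        Messages.Add(message);
    }
}
```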

Mocks aren't just about isolation, however. Mocks also play a role in test driven development, which is what Colin's last bullet point alludes to. The authors of "Mock Roles, Not Objects" say that mocks are:

"… a technique for identifying types in a system based on the roles that objects play … In particular, we now understand that the most important benefit of Mock Objects is what we originally called interface discovery".

Using a mock object framework alongside TDD allows a continuous, top-down design of software. Many TDD fans know they need a mock object framework from day one.

But Aren't Mock Object Frameworks Complex?

This is another question I've been asked recently. Mock object frameworks are actually rather simple and expose a small API. There is complexity, though, just not in the framework itself. As I said earlier, I think there is a path you can follow where you evolve into using mock object frameworks. The typical project team using a mock object framework is experienced with TDD and inversion of control containers. Trying to get up to speed on all these topics at once can be overwhelming.

There is also some complexity in using mocks effectively. In Mocks and the Dangers of Overspecified Software, Ian Cooper says:

"When you change the implementation of a method under test, mocks can break because you now make additional or different calls to the dependent component that is being mocked. … The mocks began to make our software more resistant to change, more sluggish, and this increased the cost to refactoring. As change becomes more expensive, we risked becoming resistant to making it, and we risk starting to build technical debt. A couple of times the tests broke, as developers changed the domain, or changed how we were doing persistence, without changing the test first, because they were frustrated at how it slowed their development. The mocks became an impedance to progress."

Mock object frameworks make interaction based testing easy, but can also lead to the problems Ian outlines. Here are a couple more reads on this topic:

In summary – mock object frameworks aren't for everyone. You'll know when you need one!

Microsoft versus Open Source Software - ALT.NET notes

Tuesday, April 22, 2008 by scott

At the Seattle alt.net conference, I co-sponsored a session with Justin Angel. The topic was "Choosing Microsoft versus Mature Open Source Alternatives". We wanted to hear the rationale people were using when making choices, like:

LINQ to SQL or Castle Active Record

Entity Framework or NHibernate

Subversion and assorted tools or Team Foundation Server

Not once do I remember price being a factor. Most of the fishbowl conversation revolved around risk. There are risks that technical people don't like, and risks that business people don't like. I tried to take all the major topics mentioned and fit them into the following table.


                      Choose Microsoft                 Choose OSS

  Business Risks                                       License issues
                                                       Lack of formal support
                                                       Hard to hire experts

  Technical Risks     V1 and V2 won't always work      Small communities
                      Waiting on bug fixes             Lack of training material
                      Friction

Quick summary - Microsoft is a safe choice from the business perspective, but MSFT products can create an uphill struggle for developers. Brad Abrams and ScottGu both popped into the fishbowl to talk about Microsoft's change of direction in building closed source frameworks with "big bang" releases. ScottGu also reminded us that patent trolls create problems for everyone in the ecosystem.

ALT.NET Trivia

How much ALT.NET can you fit in a Hyundai?

According to Hertz, the Hyundai Elantra will accommodate 5 people and 3 pieces of luggage.

The Elantra I drove into Redmond accommodated 5 people (me, Jeremy Miller, Udi Dahan, Steven "I Love The Back Middle Seat" Harman, and Ayende), 6 pieces of luggage, and 3, maybe 4 laptop bags. It was tight. 

Who said developers can't optimize for space anymore?

A Gentle Introduction to Mocking

Thursday, April 17, 2008 by scott

At the last CMAP Code Camp I did a "code-only" presentation entitled "A Gentle Introduction to Mocking". We wrote down some requirements, opened Visual Studio, and started writing unit tests. Matt Podwysocki provided color commentary. Code download is here.

I started "accepting" mock objects as one tool in my unit testing toolbox about three years ago (see "The 5 Stages Of Mocking"). Times have changed quite a bit since then, and the tools have improved dramatically. During the presentation we used the following:

Rhino Mocks – the first mocking framework used in the presentation. Years ago, Oren and Rhino Mocks saved us from "string based" mock objects. Rhino Mocks can easily conjure up a strongly typed mock object. The strong typing results in fewer errors, and greatly enhances the refactoring experience.

moq – the latest mocking framework in the .NET space, authored by kzu and friends. moq uses lambda expressions and expression trees to define mock object behavior, and also provides strongly typed mocks. The recent addition of factories and mock verification means you can do traditional interaction style testing with moq, if that is the path you choose. The primary differentiator between the two frameworks is that moq does not use a record / playback paradigm.

Here is a test we wrote with Rhino Mocks:

[Fact]
public void Does_Not_Make_Deposit_When_Verification_Fails()
{
    MockRepository mocks = new MockRepository();
    IAuditService auditService = mocks.DynamicMock<IAuditService>();
    IVerificationService verificationService = mocks.CreateMock<IVerificationService>();

    decimal amount = 1000;
    BankAccount account = new BankAccount(auditService,
                                          verificationService);

    using (mocks.Record())
    {
        Expect.Call(verificationService.VerifyDeposit(account, amount))
              .Return(false);
    }

    using (mocks.Playback())
    {
        account.Deposit(amount);
    }

    account.Balance.ShouldEqual(0);
}

The same test using moq:

[Fact]
public void Does_Not_Make_Deposit_When_Verification_Fails()
{
    Mock<IAuditService> _auditMock = new Mock<IAuditService>();
    Mock<IVerificationService> _verificationMock = new Mock<IVerificationService>();

    decimal amount = 1000;
    BankAccount account = new BankAccount(_auditMock.Object, _verificationMock.Object);

    _auditMock.Expect(a => a.WriteMessage(It.IsAny<string>()));
    _verificationMock.Expect(v => v.VerifyDeposit(account, amount))
                     .Returns(false);

    account.Deposit(amount);
    account.Balance.ShouldEqual(0);
}

xUnit.net – although not featured in the presentation, xUnit.net drove all the unit tests. xUnit is a new framework authored by Jim Newkirk and Brad Wilson. The framework codifies some unit testing best practices and takes advantage of new features in the C# language and .NET framework. I like it.

One question that came up a few times was "when should I use a mock object framework"? Turns out I've been asked a lot of questions starting with when lately, so I'll answer that question in the next post.

Following Principles

Wednesday, April 9, 2008 by scott

A dictionary definition of principle often uses the word "law", but principles in software development still require judgment. Sometimes the judgment requires some technical knowledge, like knowing the strengths and weaknesses of a particular technology. Other times the judgment requires some business knowledge, like the ability to anticipate where change is likely to occur.

Asking someone to make a sensible judgment about a principle is difficult when all you see is a snippet of code in a blog. The code is outside of its context. Take Leroy's BankAccount class. We don't really know what sort of business Leroy works for, or even what type of software Leroy is building. Nevertheless, let's apply a few principles to see what's bothering Leroy.

Remaining Single

Does Leroy's original BankAccount class violate the Single Responsibility Principle? I think so. The class is opening text files for logging, calculating interest, and oh, by the way, it needs to provide all the state and behavior for a financial account, too. Even without knowing the context, it seems reasonable to remove the auditing cruft into a separate class. After writing some tests, and implementing a concrete auditing class, Leroy's BankAccount might look like the following.

public class BankAccount
{
    public void Deposit(decimal amount)
    {
        Balance += amount;
        _log.Write("Deposited {0} on {1}", amount, DateTime.Now);
    }

    // ...

    AuditLog _log = new AuditLog();
}

Leroy has an almost infinite number of choices to make before coming up with the above implementation, though. Leroy could have derived BankAccount from an Auditable base class, or forced BankAccount to implement an IAuditable interface. But what guides Leroy to this particular solution in the universe of a million possibilities are other principles - like the Interface Segregation Principle, and Composition Over Inheritance.

An Addiction to Auditing

Leroy might still frown at his class, feeling he has violated the Dependency Inversion Principle. Without any additional information, we have to trust Leroy's judgment when he decides to make some additional changes.

public class BankAccount
{
    public BankAccount(IAuditLog auditLog)
    {
        _log = auditLog;
    }

    public void Deposit(decimal amount)
    {
        Balance += amount;
        _log.Write("Deposited {0} on {1}", amount, DateTime.Now);
    }

    // ...

    IAuditLog _log;
}

Perhaps Leroy already knew about some future changes in his auditing implementation, or perhaps Leroy just wanted to make his class more testable. Some of us view software as a massive heap of dependencies, and we fight to reduce the brittleness created by dependencies using inversion of control and dependency injection techniques. In some environments, this isn't needed. The principles to apply depend on the language you use, the tools you use, and ultimately depend on the problem the software is trying to solve.

Is There Still Something Wrong With This Code?

WWWTC has had a run of 19 episodes. I have some material for at least another 20. Problem is, most of the material deals with API trivia and edge cases you might never see. Interesting? To me, at least, but I'm thinking of introducing more squishy design type entries. I know a lot of people struggle to apply the latest frameworks and libraries, but design questions are always enlightening and produce the most spirited debate, giving us all something we can learn from.