Identity Maps

Monday, December 8, 2008

There are a couple of important patterns in play when you use a persistence framework. These patterns have been around for quite some time but are relatively new to .NET developers who are jumping into LINQ to SQL and the ADO.NET Entity Framework.

The first of these patterns is the Identity Map pattern. It goes by several aliases: some frameworks call it the “identity cache” while others have an “entity uniquing” implementation. We can describe its behavior with a passing test. The code is written with the Entity Framework, but you could substitute a LINQ to SQL DataContext, an LLBLGen Context, or an NHibernate Session and observe the same behavior. The point is that identity maps can be found in almost every framework for persisting objects.

public void Context_Implements_An_Identity_Map()
{
    var context = new MovieReviewsContext();

    // Two identical queries for the same movie.
    var movie1 = context.Movies.Where(m => m.ID == 100).First();
    var movie2 = context.Movies.Where(m => m.ID == 100).First();

    // A different query whose results also include movie 100.
    var movie3 = context.Movies.Where(m => m.ID > 98 && m.ID < 105)
                               .ToList()  // bring them all into memory
                               .Where(m => m.ID == 100)
                               .First();

    // All three variables reference the same object instance.
    Assert.IsTrue(Object.ReferenceEquals(movie1, movie2));
    Assert.IsTrue(Object.ReferenceEquals(movie1, movie3));
}

The test will execute three queries against the database. All three queries will retrieve the movie record with an ID of 100, with the last query also pulling some additional records. No matter how many times we query the database, even using different queries, exactly ONE object represents the movie with an ID of 100. The context uses an identity map to track the entities it creates. The mapping information we provide to the framework allows the context to know which property uniquely identifies each entity (the integer ID property in the case of a Movie).

You can think of the identity map as a Dictionary<int, Movie>. Each time the context sees a resultset from the database, it checks the dictionary to see if it already created an entity for the data. If so, it just returns a reference to the entity it already created. This is different from the behavior of ADO.NET DataSets, where you can issue three queries for Movie.ID == 100 and get back three distinct DataRows. Also, it's important to know that each DataContext (in LINQ to SQL) or ObjectContext (in the Entity Framework) maintains its own identity map.

public void Each_Context_Has_Distinct_Identity_Map()
{
    var context1 = new MovieReviewsContext();
    var context2 = new MovieReviewsContext();

    // Same query, but issued through two different contexts.
    var movie1 = context1.Movies.Where(m => m.ID == 100).First();
    var movie2 = context2.Movies.Where(m => m.ID == 100).First();

    // Each context created its own object for movie 100.
    Assert.IsFalse(Object.ReferenceEquals(movie1, movie2));
}
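
The dictionary analogy can be made concrete with a small sketch. This is not how the Entity Framework or LINQ to SQL implement their maps internally. The names Movie, IdentityMap, and Materialize are hypothetical, the title is placeholder data, and a "row" is reduced to an ID and a title purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity, standing in for a mapped Movie class.
public class Movie
{
    public int ID { get; set; }
    public string Title { get; set; }
}

// Hypothetical, simplified identity map: at most one entity instance per key.
public class IdentityMap
{
    private readonly Dictionary<int, Movie> _entities = new Dictionary<int, Movie>();

    // Called once for each row in a result set. If an entity with this key
    // was already created, return it and ignore the newly read values;
    // otherwise create a new entity and remember it.
    public Movie Materialize(int id, string title)
    {
        Movie existing;
        if (_entities.TryGetValue(id, out existing))
        {
            return existing;
        }

        var movie = new Movie { ID = id, Title = title };
        _entities.Add(id, movie);
        return movie;
    }
}

public static class IdentityMapDemo
{
    public static void Main()
    {
        // One map per "context", mirroring the two tests above.
        var map1 = new IdentityMap();
        var map2 = new IdentityMap();

        var a = map1.Materialize(100, "Placeholder title");
        var b = map1.Materialize(100, "Placeholder title");
        var c = map2.Materialize(100, "Placeholder title");

        Console.WriteLine(Object.ReferenceEquals(a, b)); // True: same map, same instance
        Console.WriteLine(Object.ReferenceEquals(a, c)); // False: each map tracks its own entities
    }
}
```

A real framework keys the map on whatever the mapping declares as the entity key, which is not necessarily a single integer, so treat the Dictionary<int, Movie> shape as a convenience of the sketch.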

Why Is This Important?

Some people I’ve talked to don’t like the behavior of the identity map, or are surprised by it. But imagine if we could create two objects from the same context, with both objects representing the movie with an ID of 100, and then we start making changes to those objects. Which object should “win” when we save changes to the database? This is the scenario Fowler describes by analogy in his definition of the identity map pattern:

“An old proverb says that a man with two watches never knows what time it is. If two watches are confusing, you can get in an even bigger mess with loading objects from a database. If you aren't careful you can load the data from the same database record into two different objects. Then, when you update them both you'll have an interesting time writing the changes out to the database correctly.”

Identity maps enforce object identity in memory in much the same way that a database enforces row identity in a database table using a primary key value.
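
To see why two in-memory copies of one record cause trouble, here is a rough simulation of loading without an identity map. Everything in it is hypothetical: the TwoWatchesDemo class, the Load and Save helpers, and the in-memory dictionary standing in for a database table. It also assumes that a save writes every column back, which is a simplification and not a claim about how any particular framework persists changes.

```csharp
using System;
using System.Collections.Generic;

public static class TwoWatchesDemo
{
    // Hypothetical entity with two editable columns.
    public class Movie
    {
        public int ID { get; set; }
        public string Title { get; set; }
        public int Rating { get; set; }
    }

    // Stand-in for a database table, keyed by primary key.
    private static readonly Dictionary<int, Movie> Table = new Dictionary<int, Movie>
    {
        { 100, new Movie { ID = 100, Title = "Original title", Rating = 3 } }
    };

    // Without an identity map, every load creates a fresh copy of the row.
    public static Movie Load(int id)
    {
        var row = Table[id];
        return new Movie { ID = row.ID, Title = row.Title, Rating = row.Rating };
    }

    // Assumption: a save writes all columns of the object back to the row.
    public static void Save(Movie movie)
    {
        Table[movie.ID] = new Movie { ID = movie.ID, Title = movie.Title, Rating = movie.Rating };
    }

    public static void Main()
    {
        var copy1 = Load(100);
        var copy2 = Load(100);

        copy1.Title = "Edited title"; // change made through the first object
        copy2.Rating = 5;             // change made through the second object

        Save(copy1);
        Save(copy2); // overwrites the row with copy2's stale Title

        var stored = Load(100);
        Console.WriteLine(stored.Title);  // Original title (the edit was lost)
        Console.WriteLine(stored.Rating); // 5
    }
}
```

With an identity map in front of Load, copy1 and copy2 would be the same object, so both edits would land on one instance and there would be no second, stale copy to overwrite the first.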

Not every application needs the safety net afforded by an identity map implementation, but when one is in place you need to be aware of its ramifications. We’ll cover these in a future post on another important pattern: the Unit of Work.