And Equality for All ... Anonymous Types

Wednesday, March 26, 2008

Given this simple Employee class:

public class Employee
{
    public int ID { get; set; }
    public string Name { get; set; }
}

How many employees do you expect to see from the following query with a Distinct operator?

var employees = new List<Employee>
{
    new Employee { ID=1, Name="Barack" },
    new Employee { ID=2, Name="Hillary" },
    new Employee { ID=2, Name="Hillary" },
    new Employee { ID=3, Name="Mac" }
};

var query =
    (from employee in employees
     select employee).Distinct();

foreach (var employee in query)
{
    Console.WriteLine(employee.Name);
}

The answer is 4: we'll see both Hillary objects. The docs for Distinct are clear: the method uses the default equality comparer to test for equality. Employee doesn't override Equals or GetHashCode, so the default comparer falls back to reference equality and sees 4 distinct object references. One way to get around this would be to use the overloaded version of Distinct that accepts a custom IEqualityComparer<Employee>.
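Here is a minimal sketch of that workaround. The EmployeeComparer name and the decision to compare on both ID and Name are assumptions made for illustration; the Employee class is repeated only so the sample compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Hypothetical comparer: two employees are equal when ID and Name both match.
public class EmployeeComparer : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ID == y.ID && x.Name == y.Name;
    }

    public int GetHashCode(Employee employee)
    {
        if (employee == null) return 0;
        // Employees that compare equal must return the same hash code,
        // so the hash is built from the same two properties Equals uses.
        int nameHash = employee.Name == null ? 0 : employee.Name.GetHashCode();
        return employee.ID ^ nameHash;
    }
}

public static class Program
{
    public static void Main()
    {
        var employees = new List<Employee>
        {
            new Employee { ID = 1, Name = "Barack" },
            new Employee { ID = 2, Name = "Hillary" },
            new Employee { ID = 2, Name = "Hillary" },
            new Employee { ID = 3, Name = "Mac" }
        };

        // Pass the comparer to the Distinct overload.
        var query = employees.Distinct(new EmployeeComparer());

        // Hillary now appears only once, so three names are written.
        foreach (var employee in query)
        {
            Console.WriteLine(employee.Name);
        }
    }
}
```

A comparer that looked only at Name (or only at ID) would follow the same shape, as long as GetHashCode uses the same properties as Equals.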

Let's try the query again and project a new, anonymous type with the same properties as Employee.

var query =
    (from employee in employees
     select new { employee.ID, employee.Name }).Distinct();

That query only yields three objects – Distinct removes the duplicate Hillary! How'd it suddenly get so smart?

Turns out the C# compiler overrides Equals and GetHashCode for anonymous types. The implementation of the two overridden methods uses all the public properties on the type to compute an object's hash code and test for equality. If two objects of the same anonymous type (same property names and types, declared in the same order) have all the same values for their properties, the objects are equal. This is a safe strategy since anonymously typed objects are essentially immutable (all the properties are read-only). Fiddling with the hash code of a mutable type gets a bit dicey.
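A quick way to see the compiler-generated behavior, sketched as a small console program. The AnonymousEqualityDemo class name is just a placeholder for this illustration.

```csharp
using System;

public static class AnonymousEqualityDemo
{
    public static void Main()
    {
        // Same property names, types, and order, so all three share one anonymous type.
        var first = new { ID = 2, Name = "Hillary" };
        var second = new { ID = 2, Name = "Hillary" };
        var third = new { ID = 3, Name = "Hillary" };

        // Two separate objects...
        Console.WriteLine(ReferenceEquals(first, second));               // False

        // ...that are equal by value and hash to the same code.
        Console.WriteLine(first.Equals(second));                         // True
        Console.WriteLine(first.GetHashCode() == second.GetHashCode());  // True

        // Every property takes part, so a different ID breaks equality.
        Console.WriteLine(first.Equals(third));                          // False
    }
}
```

The last line is why Distinct on the projection only collapses rows whose ID and Name both match.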

Interestingly, I stumbled on the Visual Basic version of anonymous types as I was writing this post, and I see that VB allows you to define "Key" properties. In VB, only the values of Key properties are compared during an equality test. Key properties are read-only, while non-key properties on an anonymous type are mutable. That's a very C sharpish thing to do, VB team.


Comments
soundararajan Tuesday, November 3, 2009
Exactly what I am looking for. Thanks a lot.
ashar Tuesday, March 9, 2010
Thank you very much. My code was working as I thought it should but translated to VB.NET it wasn't doing the distinct correctly. Reading up on the MSDN took care of that!
Sultan Thursday, September 30, 2010
What if my collection is:
var employees = new List<Employee>
{
new Employee { ID=1, Name="Barack" },
new Employee { ID=2, Name="Hillary" },
new Employee { ID=3, Name="Hillary" },
new Employee { ID=4, Name="Mac" }
};

And I perform this query again in C#:

var query =
(from employee in employees
select new { employee.ID, employee.Name }).Distinct();


Will this remove Hillary now? Notice the IDs are different.

My issue is that I want to remove items based on a key. The other values for that key will be different, but the items should still be considered duplicates. Will this help in my case? If so, please tell me how this is possible in C#. Thanks in advance.
Gabriel Tuesday, February 8, 2011
It's not working! It keeps bringing back all the records. What do I do?
by K. Scott Allen