Abstractions, Patterns, and Interfaces

Thursday, November 15, 2012

Someone recently asked me how to go about building an application that processes customer information. The customer information might live in a database, but it also might live in a .csv file.

The interesting thing is I’m in the middle of building a tool for DBAs that one day will be a fancy WPF desktop application integrated with source control repositories and relational databases, but for the moment is a simple console application using only the file system for storage. Inside I’ve faced many scenarios similar to the question being asked, and these are scenarios I’ve faced numerous times over the years.

There are 101 ways to solve the problem, but let’s work through one solution together.

Getting Started

We might start by writing some tests, or we might start by jumping in and trying to display a list of all customers in a data source, but either way we’ll eventually find ourselves with the following code, which contains the essence of the question:

var customers = // how do I get customers?

 

To make things more concrete, let’s take that line of code and put it in a class that does something useful: a class that writes every customer name to the output of a console program.

public class CustomerDump
{
    public void Render()
    {
        var customers = // how ?
        foreach (var customer in customers)
        {
            Console.WriteLine(customer.Name);
        }
    }
}


Although we might not know how to retrieve the customer data, we probably do know what data we need about each customer. We’ll go ahead and define a Customer class for objects to hold customer data.

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Location { get; set; }
}

Now we can work on the main question. The business has told us we need to be flexible with the customer data, so how will we go about retrieving customers?

Defining an Interface

Interfaces are wonderful for a language like C#. Interfaces give us everything we need to work with an object in a strongly-typed manner, but place the fewest constraints on the object implementing the interface. Interfaces make the C# compiler happy without forcing us to pay an inheritance tax for working with a class hierarchy. We’ll define an interface that describes exactly how we want to fetch customers and how we want the customers packaged for us to consume.

public interface ICustomerDataSource
{
    IList<Customer> FetchAllCustomers();
}

There are many subtleties to interface design. Even the simple interface here required us to make a number of decisions.

First, what is the name of the operation? Do we want to FetchAllCustomers? SelectAllCustomers? GetCustomers? I believe names are important at this level, but you don’t want to give too much away. A name like SelectAllCustomers is biased towards working with a relational database, and we know we’ll be working with more than just a SQL database.

Often the name is influenced by what we know about the project and the business. Fortunately, refactoring tools make names easy to change.

Another design decision is the return type. When you are trying to abstract away some operation, you have to decide if you’ll go for the lowest common denominator (anything can return IEnumerable), or something that might only be achieved by an advanced data source (like IQueryable). In this example we are forcing all implementations to return a list, which has some tradeoffs, but at least we know we’ll be getting a specific type of data structure. IEnumerable would target the lowest common denominator and make the interface easier to implement, but we might not have all the convenience features we need, such as a Count property and access by index.

Once again, knowing a bit about the direction of the project and being in tune with the business needs will help in determining when to add flexibility and when to enforce constraints. 
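To make the return type tradeoff concrete, here is a minimal sketch of the lowest-common-denominator alternative. The names ICustomerSequenceSource and CountingSource are hypothetical and exist only for this illustration; the source counts how many times its data is read so the cost of repeated enumeration becomes visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Location { get; set; }
}

// Hypothetical alternative contract: promises only a sequence, nothing more.
public interface ICustomerSequenceSource
{
    IEnumerable<Customer> FetchAllCustomers();
}

// Illustrative source that records how often the underlying data is read.
public class CountingSource : ICustomerSequenceSource
{
    public int Reads { get; private set; }

    public IEnumerable<Customer> FetchAllCustomers()
    {
        // Iterator block: nothing here runs until a caller starts enumerating,
        // and it runs again for every new enumeration.
        Reads++;
        yield return new Customer { Id = 0, Name = "Ada", Location = "London" };
        yield return new Customer { Id = 1, Name = "Grace", Location = "New York" };
    }
}
```

With this contract, calling Count() and then First() on the result reads the data twice, and there is no indexer until the caller materializes the sequence with ToList(). Returning IList pushes that materialization into the data source, which is the constraint the article’s interface chooses to enforce.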

Implementing the Interface

One question we might have had in the back of our mind is how to provide an implementation of the data loading interface when some implementations might need parameters like a database connection string, while other implementations might need file system details, like the path to the .csv file with customers inside.

When designing an interface we need to put those thoughts in the back of our mind and focus entirely on the client’s needs first. Just watch how this unfolds as we build a class to read customer data from a CSV file.

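using System.Collections.Generic;
using System.IO;
using System.Linq;

class CustomerCsvDataSource : ICustomerDataSource
{
    // The path is supplied once, at construction time.
    public CustomerCsvDataSource(string path)
    {
        _path = path;
    }

    // Each line is expected to hold "name,location"; the line index becomes the Id.
    public IList<Customer> FetchAllCustomers()
    {
        return File.ReadAllLines(_path)
                   .Select(line => line.Split(','))
                   .Select((values, index) =>
                       new Customer
                       {
                           Id = index,
                           Name = values[0],
                           Location = values[1]
                       }).ToList();
    }

    readonly string _path;
}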


This isn’t the most robust CSV parser in the world (it won’t deal with embedded commas, so we might want to get some help), but it does demonstrate a pattern I’ve been using over and over again recently: the class implements an interface, stores constructor parameters in read-only fields, exposes methods to implement the interface, and above all keeps things simple, small, and focused.
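One way to “get some help” with embedded commas is the TextFieldParser class that ships with .NET in the Microsoft.VisualBasic.FileIO namespace (on the .NET Framework this needs a reference to the Microsoft.VisualBasic assembly). The sketch below is an assumption about how such a variant could look, not the article’s implementation; the class name QuotedCsvCustomerDataSource is hypothetical, and it still assumes each row holds a name followed by a location.

```csharp
using System.Collections.Generic;
using Microsoft.VisualBasic.FileIO;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Location { get; set; }
}

public interface ICustomerDataSource
{
    IList<Customer> FetchAllCustomers();
}

// Hypothetical variant of the CSV data source that understands quoted fields.
public class QuotedCsvCustomerDataSource : ICustomerDataSource
{
    public QuotedCsvCustomerDataSource(string path)
    {
        _path = path;
    }

    public IList<Customer> FetchAllCustomers()
    {
        var customers = new List<Customer>();
        using (var parser = new TextFieldParser(_path))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Treat "Smith, John" as one field instead of splitting on the comma.
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                string[] values = parser.ReadFields();
                customers.Add(new Customer
                {
                    Id = customers.Count, // position among the rows read so far
                    Name = values[0],
                    Location = values[1]
                });
            }
        }
        return customers;
    }

    readonly string _path;
}
```

The shape is the same as before: the class implements the interface, keeps its constructor parameter in a read-only field, and the client never learns that the parsing strategy changed.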

Here is the pattern again, this time in a class that uses Mark Rendle’s Simple.Data to access SQL Server, but we could do the same thing with raw ADO.NET, the Entity Framework, or even MongoDB.

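using System.Collections.Generic;
using Simple.Data;

class CustomerDbDataSource : ICustomerDataSource
{
    // The connection string is supplied once, at construction time.
    public CustomerDbDataSource(string connectionString)
    {
        _connectionString = connectionString;
    }

    public IList<Customer> FetchAllCustomers()
    {
        // Simple.Data returns a dynamic object, so Customers is resolved at runtime.
        var db = Database.OpenConnection(_connectionString);
        return db.Customers.All().ToList<Customer>();
    }

    readonly string _connectionString;
}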


We can see now that worrying about connection strings and file names while defining the interface was premature worrying. These were all implementation details the interface isn’t concerned with, as the interface only exposes the operations clients need, like the ability to fetch customers.

Instead, these classes are “programmed” with implementation specific instructions given by constructor parameters, and the instructions give them everything they need to do the work required by the interface. The classes never change the instructions (they are all saved in read-only fields), but they use the instructions to produce new results.

We have now reached the point where we have two different classes to deal with two different sources of data, but how do we use them?

Consuming the Interface

Returning to our CustomerDump class, one obvious approach to producing results is the following.

public class CustomerDump
{
    public void Render()
    {
        var dataSource = new CustomerCsvDataSource("customers.csv");
        var customers = dataSource.FetchAllCustomers();

        foreach (var customer in customers)
        {
            Console.WriteLine(customer.Name);
        }
    }
}

The above approach can work, but we’ve tied the CustomerDump class to the CSV data source by instantiating CustomerCsvDataSource directly. If we need CustomerDump to only work with a CSV data source, this is reasonable, but we know most of the application needs to work with different data sources so we’ll need to avoid this approach in most places.

Instead of CustomerDump choosing a data source and coupling itself to a specific class, we’ll force someone to give CustomerDump the data source to use.

public class CustomerDump
{
    public CustomerDump(ICustomerDataSource dataSource)
    {
        _dataSource = dataSource;
    }

    public void Render()
    {
        var customers = _dataSource.FetchAllCustomers();

        foreach (var customer in customers)
        {
            Console.WriteLine(customer.Name);
        }
    }

    readonly ICustomerDataSource _dataSource;
}

Now, any logic we have inside of CustomerDump can work with customers from anywhere, and we can add new data sources in the future. We’ve gained a lot of flexibility in an area where the business demands flexibility, and hopefully didn’t build a mountain of abstractions where none were required. All the pieces are small and focused, and the way they fit together depends on the application you are building. Which leads to the next question: who is responsible for putting CustomerDump together?
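As a sketch of what “customers from anywhere” can mean, the block below feeds CustomerDump from a list held in memory, which is also a convenient stand-in when testing. InMemoryCustomerDataSource is a hypothetical class introduced for this illustration; Customer, ICustomerDataSource, and CustomerDump are repeated from above so the example stands alone.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Location { get; set; }
}

public interface ICustomerDataSource
{
    IList<Customer> FetchAllCustomers();
}

public class CustomerDump
{
    public CustomerDump(ICustomerDataSource dataSource)
    {
        _dataSource = dataSource;
    }

    public void Render()
    {
        foreach (var customer in _dataSource.FetchAllCustomers())
        {
            Console.WriteLine(customer.Name);
        }
    }

    readonly ICustomerDataSource _dataSource;
}

// Hypothetical data source that serves whatever customers it was constructed with.
public class InMemoryCustomerDataSource : ICustomerDataSource
{
    public InMemoryCustomerDataSource(IEnumerable<Customer> customers)
    {
        _customers = new List<Customer>(customers);
    }

    public IList<Customer> FetchAllCustomers()
    {
        // Hand out a copy so callers cannot modify the stored list.
        return new List<Customer>(_customers);
    }

    readonly List<Customer> _customers;
}
```

CustomerDump needs no changes to work with the new source: construct it with new InMemoryCustomerDataSource(someCustomers) and call Render().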

At the top level of every application built in this fashion you’ll have some bootstrapping code to arrange all the pieces and set them in motion. For a console mode application it might look like this:

static void Main(string[] args)
{
    // arrange
    var connectionString = @"server=(localdb)\v11.0;database=Customers";
    var dataSource = new CustomerDbDataSource(connectionString);
    var dump = new CustomerDump(dataSource);

    // execute
    dump.Render();
}

Here we have hard-coded values again, but you can imagine hard-coded connection strings and class names getting intermingled or replaced with if/else statements and settings from the app.config file. As the application becomes more complex, we could turn to tools like MEF or StructureMap to manage the construction of the building blocks we need.
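Before reaching for a container, the if/else version of that bootstrapping code might look something like the sketch below. Bootstrapper, CreateDataSource, and the "csv" / "db" keywords are hypothetical names chosen for illustration, and CustomerDbDataSource is reduced to a placeholder so the example does not depend on a database library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Location { get; set; }
}

public interface ICustomerDataSource
{
    IList<Customer> FetchAllCustomers();
}

public class CustomerCsvDataSource : ICustomerDataSource
{
    public CustomerCsvDataSource(string path) { _path = path; }

    public IList<Customer> FetchAllCustomers()
    {
        return File.ReadAllLines(_path)
                   .Select(line => line.Split(','))
                   .Select((values, index) => new Customer
                   {
                       Id = index, Name = values[0], Location = values[1]
                   }).ToList();
    }

    readonly string _path;
}

// Placeholder only: real database access is left out of this sketch.
public class CustomerDbDataSource : ICustomerDataSource
{
    public CustomerDbDataSource(string connectionString)
    {
        _connectionString = connectionString;
    }

    public IList<Customer> FetchAllCustomers()
    {
        throw new NotImplementedException(
            "Database access omitted for: " + _connectionString);
    }

    readonly string _connectionString;
}

// Hypothetical bootstrapping helper: the one place that knows concrete class names.
public static class Bootstrapper
{
    public static ICustomerDataSource CreateDataSource(string kind, string location)
    {
        if (kind == "csv")
        {
            return new CustomerCsvDataSource(location);
        }
        if (kind == "db")
        {
            return new CustomerDbDataSource(location);
        }
        throw new ArgumentException("Unknown data source kind: " + kind, "kind");
    }
}
```

Main could then read the two values from the command line or a configuration file, call Bootstrapper.CreateDataSource, and pass the result to CustomerDump. Only the bootstrapping code changes when a new data source appears.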

Going Further

One of the biggest challenges in building well factored software is knowing when to stop adding abstractions. For example, we can say the CustomerDump class is currently tied too tightly to Console.Out. To remove the dependency we’ll instead inject a Stream for CustomerDump to use.

public CustomerDump(ICustomerDataSource dataSource,
                    Stream output)
{
    _dataSource = dataSource;
    _output = new StreamWriter(output);
}

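For context, here is a sketch of how the rest of the class might use the injected stream. The Render body and the FixedDataSource helper are assumptions added for illustration; the one detail worth noticing is the Flush call, since StreamWriter buffers its output.

```csharp
using System.Collections.Generic;
using System.IO;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Location { get; set; }
}

public interface ICustomerDataSource
{
    IList<Customer> FetchAllCustomers();
}

public class CustomerDump
{
    public CustomerDump(ICustomerDataSource dataSource,
                        Stream output)
    {
        _dataSource = dataSource;
        _output = new StreamWriter(output);
    }

    public void Render()
    {
        foreach (var customer in _dataSource.FetchAllCustomers())
        {
            _output.WriteLine(customer.Name);
        }
        // Push buffered text into the stream without closing it;
        // the caller supplied the stream, so the caller decides when to close it.
        _output.Flush();
    }

    readonly ICustomerDataSource _dataSource;
    readonly StreamWriter _output;
}

// Hypothetical helper so the sketch runs without a file or database.
public class FixedDataSource : ICustomerDataSource
{
    public IList<Customer> FetchAllCustomers()
    {
        return new List<Customer>
        {
            new Customer { Id = 0, Name = "Ada", Location = "London" }
        };
    }
}
```

Passing Console.OpenStandardOutput() reproduces the original console behavior, while a MemoryStream or FileStream sends the same output somewhere else.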
Alternatively, we could say CustomerDump shouldn’t be responsible for both getting and formatting each customer as well as sending the result to the screen. In that case we’ll just have CustomerDump create the formatted string, and leave it to the caller to decide what to do with the result.

public string CreateDump()
{
    var builder = new StringBuilder();
    var customers = _dataSource.FetchAllCustomers();

    foreach (var customer in customers)
    {
        // End each entry with a line break so customers don't run together.
        builder.AppendFormat("{0} : {1}",
                             customer.Name, customer.Location)
               .AppendLine();
    }
    return builder.ToString();
}

Now we might look at the code and decide that getting and formatting are two different responsibilities, so we’ll need someone to pass the list of customers to format instead of having the method use the data source directly. And so on, and so on.
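That next step might look like the sketch below, where the formatting code receives the customers instead of fetching them. CustomerFormatter and its Format method are hypothetical names; the output format mirrors the "name : location" layout used above, with one customer per line.

```csharp
using System.Collections.Generic;
using System.Text;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Location { get; set; }
}

// Hypothetical formatter: knows how to lay out customers, not where they come from.
public static class CustomerFormatter
{
    public static string Format(IEnumerable<Customer> customers)
    {
        var builder = new StringBuilder();
        foreach (var customer in customers)
        {
            builder.AppendFormat("{0} : {1}", customer.Name, customer.Location)
                   .AppendLine();
        }
        return builder.ToString();
    }
}
```

A caller would fetch the list from an ICustomerDataSource and hand it to CustomerFormatter.Format, which means the formatter can be exercised with any sequence of customers and no data source at all. Whether that extra seam is worth having is exactly the judgment call discussed next.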

Where do we stop?

That’s where most samples break down because the right place to stop is the place where we have just enough abstraction to make things work and still meet our requirements for testability, maintainability, scalability, readability, extensibility, and all the other ilities we need. Samples like this can show you the patterns you can use to achieve specific results, but only in the context of a specific application do we know the results we need. We need to apply both YAGNI and SRP in the right places and at the right time.

Comments
Brent Thursday, November 15, 2012
Great post, I would think this would be a very useful example for demonstrating the power of dependency injection to developers who have never really worked with it. You can seamlessly replace your ICustomerDataSource implementation on the fly because you coded to the interface. Nice work.

Xiaohong Thursday, November 15, 2012
When to stop abstraction is most difficult point, I always feel guilty anytime when I stopped.

Charlie Mouse Friday, November 16, 2012
I find these small concise examples so useful compared to the reams of text on this oo patterns. Thanks

Tom Pester Friday, November 16, 2012
Nice buildup leading to a good design. People wondering the reason for this indirection can be helped by reading your post.
Once internalized you can think of a database as an after thought, something that is done at the end of the project. This leads to better OO design.

Gareth Stephenson Friday, November 16, 2012
I hope you build on this example with a way to unit test this kind of pattern, specifically using mock objects (hand rolled or using a mocking framework).
This is a great intro to building loosely coupled applications!

tobi Saturday, November 17, 2012
This is the 101 of abstraction and composition. Great post.

madhu Monday, November 19, 2012
Great post. Need more of these posts to illustrate good use of abstraction in real world applications.

levin Wednesday, November 28, 2012
This is great post. I agree with Brent, it seems to lead right up to dependency injection/composition. I would love to see you write a post of this quality on one/both of those topics.

Andrew Friday, November 30, 2012
You've basically summarized (quite nicely) what my last year of relearning how I wrote code went. The "when do I stop question" is important, but as I've become more comfortable writing code in this fashion, I've grown an almost intuitive sense of knowing when I've reached the beginning of YAGNI. It's one of those things that feels different for each project, but the more you practice it, the quicker you learn to recognize it in your particular situation. Great write up.

Steve Gentile Tuesday, December 11, 2012
Well written Scott - doesn't try to complicate things instead has a nice logical and pragmatic approach - I particularly like the way it dialogues with the reader
by K. Scott Allen