Rethinking Biggy

Wednesday, March 19, 2014

A few weeks ago Rob unveiled Biggy, a simple file-based document store for .NET inspired by NeDB. Since then there have been a few additions, and Biggy now also works with relational databases and MongoDB.

Looking through the code and the enhancement list, I couldn’t help wondering if Biggy might benefit from a different design. I started to open an issue on GitHub based on a completely experimental branch I created, then decided it would be better off as a post.

How It Currently Works

Looking over the Biggy implementation, every different data store becomes coupled to an InMemoryList<T> class through inheritance. The coupling isn’t necessarily wrong, but it does complicate the implementation of each new data store. For example, for JSON storage the Add method has to remember to call into the base class in order to raise the proper events:

public void Add(T item) {
  var json = JsonConvert.SerializeObject(item);
  using (var writer = File.AppendText(this.DbPath)) {
    writer.WriteLine(json);
  }
  base.Add(item);
}

The inheritance relationship also complicates life for consumers, as InMemoryList supports both Clear and Purge methods in the API, but the JSON implementation only supports Clear. Not sure if this was intentional or not, but I did find it confusing.

I also thought an alternate approach that clearly defines the responsibilities of the in-memory data manager and the backing data store might be helpful…

Separating Lists From Stores

First we’ll start off with an abstraction that clearly defines the capabilities of a Biggy in-memory list.

public interface IBiggy<T> : IEnumerable<T>
{
    void Clear();
    int Count();
    T Update(T item);
    T Remove(T item);
    T Add(T item);
    IList<T> Add(IList<T> items);
    IQueryable<T> AsQueryable();

    event EventHandler<BiggyEventArgs<T>> ItemRemoved;
    event EventHandler<BiggyEventArgs<T>> ItemAdded;
    event EventHandler<BiggyEventArgs<T>> Changed;
    event EventHandler<BiggyEventArgs<T>> Loaded;
    event EventHandler<BiggyEventArgs<T>> Saved;
}

Methods like Add and Remove will return the affected item in case something changed, like if the underlying store populates a key or version field. There are two versions of Add, because batch inserts are a common scenario.

The implementation of an IBiggy can now focus on manipulating in-memory data and firing events. All data persistence is handled by stores.

public virtual T Add(T item)
{
    // Keep whatever the store hands back, in case it populated a key or version field.
    var added = _store.Add(item);
    _items.Add(added);
    Fire(ItemAdded, item: added);
    return added;
}
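The post doesn't show the Fire helper or BiggyEventArgs<T>, so here is a minimal sketch of what they might look like. The property names Item and Items, the optional parameters on Fire, and the EventSketch<T> class are all my assumptions for illustration, not code from Biggy, and the store is left out to keep the focus on the events.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event payload; the real BiggyEventArgs<T> may look different.
public class BiggyEventArgs<T> : EventArgs
{
    public T Item { get; set; }
    public IList<T> Items { get; set; }
}

// Hypothetical stand-in for the list, reduced to the event plumbing.
public class EventSketch<T>
{
    public event EventHandler<BiggyEventArgs<T>> ItemAdded;

    private readonly List<T> _items = new List<T>();

    public T Add(T item)
    {
        _items.Add(item);
        Fire(ItemAdded, item: item);
        return item;
    }

    // Assumed shape of Fire: optional parameters let callers name only
    // the payload they have, as in Fire(ItemAdded, item: item).
    protected virtual void Fire(EventHandler<BiggyEventArgs<T>> handler,
                                T item = default(T),
                                IList<T> items = null)
    {
        // An event with no subscribers is null, so check before invoking.
        if (handler != null)
        {
            handler(this, new BiggyEventArgs<T> { Item = item, Items = items });
        }
    }
}
```

Because the event is passed in as a delegate value, Fire works on a copy and a subscriber detaching in the middle of the call can't turn it into a null reference.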

A Biggy store has a simple, brute force API.

public interface IBiggyStore<T>
{
    IList<T> Load();
    void SaveAll(IList<T> items);
    void Clear();
    T Add(T item);
    IEnumerable<T> Add(IEnumerable<T> items);
}

But I’m also thinking some data stores might support additional features that make updates and queries more efficient. These capabilities are segregated into separate interfaces.

public interface IUpdateableBiggyStore<T> : IBiggyStore<T>
{
    T Update(T item);
    T Remove(T item);
}

public interface IQueryableBiggyStore<T> : IBiggyStore<T>
{
    IQueryable<T> AsQueryable();
}

And when constructing a BiggyList<T>, you have to inject a specific data store. BiggyList<T> can query the available interfaces in a data store to understand the store’s capabilities, but once the store is injected, a list client never needs to know about the store.

public BiggyList(IBiggyStore<T> store)
{
    _store = store;
    _queryableStore = _store as IQueryableBiggyStore<T>;
    _updateableBiggyStore = _store as IUpdateableBiggyStore<T>;
    _items = _store.Load();
}
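Here is a rough sketch of how the list might use those capability checks when it comes time to update an item. The interfaces are trimmed to the members the example needs, and BiggyListSketch<T> and CountingStore<T> are hypothetical names for illustration. It also assumes the caller has already changed an item the list holds a reference to, so only the store needs to catch up.

```csharp
using System;
using System.Collections.Generic;

// Trimmed versions of the store interfaces, just enough for this sketch.
public interface IBiggyStore<T>
{
    IList<T> Load();
    void SaveAll(IList<T> items);
}

public interface IUpdateableBiggyStore<T> : IBiggyStore<T>
{
    T Update(T item);
}

public class BiggyListSketch<T>
{
    private readonly IBiggyStore<T> _store;
    private readonly IUpdateableBiggyStore<T> _updateableStore;
    private readonly IList<T> _items;

    public BiggyListSketch(IBiggyStore<T> store)
    {
        _store = store;
        // The cast yields null when the store lacks the capability.
        _updateableStore = store as IUpdateableBiggyStore<T>;
        _items = store.Load();
    }

    public T Update(T item)
    {
        if (_updateableStore != null)
        {
            // The store can update a single record in place.
            return _updateableStore.Update(item);
        }

        // Brute force fallback: write the whole list back out.
        _store.SaveAll(_items);
        return item;
    }
}

// Hypothetical store with no update support, used only to show the fallback.
public class CountingStore<T> : IBiggyStore<T>
{
    public int SaveAllCalls { get; private set; }
    public IList<T> Load() { return new List<T>(); }
    public void SaveAll(IList<T> items) { SaveAllCalls++; }
}
```

A client calling Update can't tell which path ran. Whether the store rewrote one record or the whole file stays between the list and the store.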

Now, the implementation of an actual data store doesn’t need to call into a base class or worry about raising events. The store only does what it is told. Circling back to the JSON backing store, an implementation might look like:

T IBiggyStore<T>.Add(T item)
{
    var json = JsonConvert.SerializeObject(item);
    using (var writer = File.AppendText(DbPath))
    {
        writer.WriteLine(json);
    }
    return item;
}

Another useful benefit to this approach is that each store can specify generic constraints independently of the Biggy list or other data stores. For example, a data store for SQL Server can specify that T has to implement an interface with an ID property, while one for Azure Table Storage can enforce partition and row keys.
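To make the constraint idea concrete, here is a small sketch. It uses an in-memory stand-in rather than a real SQL Server store, and IHaveId, KeyAssigningStore<T>, and Product are hypothetical names I made up for illustration. The store interface is trimmed to two members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed store interface, just enough for this sketch.
public interface IBiggyStore<T>
{
    IList<T> Load();
    T Add(T item);
}

// Hypothetical interface a key-based store could require of its items.
public interface IHaveId
{
    int Id { get; set; }
}

// In-memory stand-in for a store that assigns keys, the way a database
// identity column would. The constraint lives on the store only.
public class KeyAssigningStore<T> : IBiggyStore<T> where T : IHaveId
{
    private readonly List<T> _rows = new List<T>();
    private int _nextId = 1;

    public IList<T> Load()
    {
        // Hand back a copy so callers can't modify the store's rows directly.
        return _rows.ToList();
    }

    public T Add(T item)
    {
        // The constraint is what makes Id visible on a generic T.
        item.Id = _nextId++;
        _rows.Add(item);
        return item;
    }
}

public class Product : IHaveId
{
    public int Id { get; set; }
    public string Name { get; set; }
}
```

The list itself stays unconstrained, so a JSON file store can still accept any serializable T while this store insists on an Id. It also shows why Add returns the item: the caller gets back the populated key.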

Is It Useful?

All the static typing and interfaces might move Biggy away from Rob’s initial vision of a lightweight and easy-to-use library, but I think the ability to cleanly separate stores from lists is valuable not just for extensibility but also for the simplicity of the design.

OdeToCode by K. Scott Allen