Rethinking Biggy

Wednesday, March 19, 2014

A few weeks ago Rob unveiled Biggy, a simple file-based document store for .NET inspired by NeDB. Since then there have been a few additions, and Biggy now also works with relational databases and MongoDB.

Looking through the code and the enhancement list, I couldn’t help wondering if Biggy might benefit from a different design. I started to open an issue on GitHub based on a completely experimental branch I created, then decided it would be better off as a post.

How It Currently Works

Looking over the Biggy implementation, every different data store becomes coupled to an InMemoryList<T> class through inheritance. The coupling isn’t necessarily wrong, but it does complicate the implementation of each new data store. For example, for JSON storage the Add method has to remember to call into the base class in order to raise the proper events:

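public void Add(T item) {
  // Append the serialized item as one more line in the data file.
  var json = JsonConvert.SerializeObject(item);
  using (var writer = File.AppendText(this.DbPath)) {
    writer.WriteLine(json);
  }
  // Easy to forget: without this call the proper events are never raised.
  base.Add(item);
}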

The inheritance relationship also complicates life for consumers, as InMemoryList supports both Clear and Purge methods in the API, but the JSON implementation only supports Clear. Not sure if this was intentional or not, but I did find it confusing.

I also thought an alternate approach that clearly defines the responsibilities of the in-memory data manager and the backing data store might be helpful…

Separating Lists From Stores

First we’ll start off with an abstraction that clearly defines the capabilities of a Biggy in-memory list.

public interface IBiggy<T> : IEnumerable<T>
{
    void Clear();
    int Count();
    T Update(T item);
    T Remove(T item);
    T Add(T item);
    IList<T> Add(IList<T> items);
    IQueryable<T> AsQueryable();

    event EventHandler<BiggyEventArgs<T>> ItemRemoved;
    event EventHandler<BiggyEventArgs<T>> ItemAdded;
    event EventHandler<BiggyEventArgs<T>> Changed;
    event EventHandler<BiggyEventArgs<T>> Loaded;
    event EventHandler<BiggyEventArgs<T>> Saved;
}

Methods like Add and Remove will return the affected item in case something changed, like if the underlying store populates a key or version field. There are two versions of Add, because batch inserts are a common scenario.

The implementation of an IBiggy can now focus on manipulating in-memory data and firing events. All data persistence is handled by stores.

public virtual T Add(T item)
{
    _store.Add(item);
    _items.Add(item);
    Fire(ItemAdded, item: item);
    return item;
}
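
The Fire helper and the event payload are not shown above, so here is a minimal sketch of how they might look. The class name EventingList and the Item and Items properties are assumptions for illustration, not the actual Biggy types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical payload shape; the real BiggyEventArgs<T> members may differ.
public class BiggyEventArgs<T> : EventArgs
{
    public T Item { get; set; }           // the single affected item, if any
    public IList<T> Items { get; set; }   // the affected batch, if any
}

// Hypothetical stand-in for the list, reduced to one event and one method.
public class EventingList<T>
{
    private readonly List<T> _items = new List<T>();

    public event EventHandler<BiggyEventArgs<T>> ItemAdded;

    public T Add(T item)
    {
        _items.Add(item);
        Fire(ItemAdded, item: item);
        return item;
    }

    // Raises the event only when at least one handler is subscribed.
    protected void Fire(EventHandler<BiggyEventArgs<T>> handler,
                        T item = default(T), IList<T> items = null)
    {
        if (handler != null)
        {
            handler(this, new BiggyEventArgs<T> { Item = item, Items = items });
        }
    }
}
```

Keeping the null check inside one helper means every method that raises an event stays a one-liner.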

A Biggy store has a simple, brute force API.

public interface IBiggyStore<T>
{
    IList<T> Load();
    void SaveAll(IList<T> items);
    void Clear();
    T Add(T item);
    IEnumerable<T> Add(IEnumerable<T> items);
}
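
To show how little a store has to do, here is a sketch of one that "persists" to a private list. InMemoryStore is a hypothetical name and is not part of Biggy; something like it could serve as a test double. The interface is repeated so the sketch compiles on its own.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IBiggyStore<T>
{
    IList<T> Load();
    void SaveAll(IList<T> items);
    void Clear();
    T Add(T item);
    IEnumerable<T> Add(IEnumerable<T> items);
}

// Hypothetical store backed by a private list instead of a file or database.
public class InMemoryStore<T> : IBiggyStore<T>
{
    private readonly List<T> _saved = new List<T>();

    // Return a copy so callers cannot change the "persisted" state directly.
    public IList<T> Load() { return _saved.ToList(); }

    // Brute force: discard what was stored and write everything again.
    public void SaveAll(IList<T> items)
    {
        _saved.Clear();
        _saved.AddRange(items);
    }

    public void Clear() { _saved.Clear(); }

    public T Add(T item)
    {
        _saved.Add(item);
        return item;
    }

    public IEnumerable<T> Add(IEnumerable<T> items)
    {
        var batch = items.ToList();   // enumerate the input only once
        _saved.AddRange(batch);
        return batch;
    }
}
```

Notice there are no events and no base class calls anywhere in the store.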

But I’m also thinking some data stores might support additional features that make updates and queries more efficient. These capabilities are segregated into separate interfaces.

public interface IUpdateableBiggyStore<T> : IBiggyStore<T>
{
    T Update(T item);
    T Remove(T item);
}

public interface IQueryableBiggyStore<T> : IBiggyStore<T>
{
    IQueryable<T> AsQueryable();
}

And when constructing a BiggyList<T>, you have to inject a specific data store. BiggyList<T> can query the available interfaces in a data store to understand the store’s capabilities, but once the store is injected, a list client never needs to know about the store.

public BiggyList(IBiggyStore<T> store)
{
    _store = store;
    _queryableStore = _store as IQueryableBiggyStore<T>;
    _updateableBiggyStore = _store as IUpdateableBiggyStore<T>;
    _items = _store.Load();
}
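
One way the list might use those capability checks is sketched below for Remove: delegate to the store when it is updateable, otherwise fall back to the brute force SaveAll. This is an assumption about how the fallback could work, not code from the branch, and CountingStore is a hypothetical store that exists only to make the fallback visible.

```csharp
using System.Collections.Generic;

public interface IBiggyStore<T>
{
    IList<T> Load();
    void SaveAll(IList<T> items);
    void Clear();
    T Add(T item);
    IEnumerable<T> Add(IEnumerable<T> items);
}

public interface IUpdateableBiggyStore<T> : IBiggyStore<T>
{
    T Update(T item);
    T Remove(T item);
}

public class BiggyList<T>
{
    private readonly IBiggyStore<T> _store;
    private readonly IUpdateableBiggyStore<T> _updateableBiggyStore;
    private readonly IList<T> _items;

    public BiggyList(IBiggyStore<T> store)
    {
        _store = store;
        _updateableBiggyStore = _store as IUpdateableBiggyStore<T>; // null if unsupported
        _items = _store.Load();
    }

    public T Remove(T item)
    {
        _items.Remove(item);
        if (_updateableBiggyStore != null)
        {
            // The store knows how to delete a single record.
            _updateableBiggyStore.Remove(item);
        }
        else
        {
            // Brute force fallback: rewrite everything that is left.
            _store.SaveAll(_items);
        }
        return item;
    }
}

// Hypothetical basic store that counts SaveAll calls.
public class CountingStore<T> : IBiggyStore<T>
{
    public int SaveAllCalls;
    private readonly List<T> _saved = new List<T>();

    public IList<T> Load() { return new List<T>(_saved); }
    public void SaveAll(IList<T> items)
    {
        SaveAllCalls++;
        _saved.Clear();
        _saved.AddRange(items);
    }
    public void Clear() { _saved.Clear(); }
    public T Add(T item) { _saved.Add(item); return item; }
    public IEnumerable<T> Add(IEnumerable<T> items)
    {
        var batch = new List<T>(items);
        _saved.AddRange(batch);
        return batch;
    }
}
```

The client calls Remove the same way in both cases; only the list knows which path ran.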

Now, the implementation of an actual data store doesn’t need to call into a base class or worry about raising events. The store only does what it is told. Circling back to the JSON backing store, an implementation might look like:

T IBiggyStore<T>.Add(T item)
{
    var json = JsonConvert.SerializeObject(item);
    using (var writer = File.AppendText(DbPath))
    {
        writer.WriteLine(json);
    }
    return item;
}

Another useful benefit to this approach is that each store can specify generic constraints independently of the Biggy list or other data stores. For example, a data store for SQL Server can specify that T has to implement an interface with an ID property, while one for Azure Table Storage can enforce partition and row keys.
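
As a rough sketch of a store-level constraint, the store below requires T to expose an Id and assigns it on Add, the way a database identity column might. IHaveId, KeyAssigningStore, and Customer are all hypothetical names, and nothing here talks to a real database.

```csharp
using System.Collections.Generic;

public interface IBiggyStore<T>
{
    IList<T> Load();
    void SaveAll(IList<T> items);
    void Clear();
    T Add(T item);
    IEnumerable<T> Add(IEnumerable<T> items);
}

// Hypothetical key interface; the post does not name one.
public interface IHaveId
{
    int Id { get; set; }
}

// Only this store carries the constraint; other stores and the list do not.
public class KeyAssigningStore<T> : IBiggyStore<T> where T : IHaveId
{
    private readonly List<T> _rows = new List<T>();
    private int _nextId = 1;

    public IList<T> Load() { return new List<T>(_rows); }
    public void SaveAll(IList<T> items) { _rows.Clear(); _rows.AddRange(items); }
    public void Clear() { _rows.Clear(); }

    public T Add(T item)
    {
        item.Id = _nextId++;   // the constraint is what makes item.Id available
        _rows.Add(item);
        return item;           // the caller gets the item with its key populated
    }

    public IEnumerable<T> Add(IEnumerable<T> items)
    {
        var added = new List<T>();
        foreach (var item in items) { added.Add(Add(item)); }
        return added;
    }
}

public class Customer : IHaveId
{
    public int Id { get; set; }
    public string Name { get; set; }
}
```

This is also the case where returning the affected item from Add pays off, since the store populated the key.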

Is It Useful?

All the static typing and interfaces might move Biggy away from Rob’s initial vision of a lightweight and easy to use library, but I think the ability to cleanly separate stores from lists is valuable not just for extensibility but also for the simplicity of the design.


Comments
Oskar Austegard Friday, March 14, 2014
Nice. I hope to see this project continue evolving, with serious contributions like this. Separating the backing from the list seems like the right choice, and reminds me of how something like NServiceBus allows for easy backing with a multitude of options.
Khalid Abuhakmeh Wednesday, March 19, 2014
I am a big fan of interfaces, and I think you are right to separate the store from the interface, but like you said: "All the static typing and interfaces might move Biggy away from Rob’s initial vision of a lightweight and easy to use library". I've always respected Rob's idea that stuff should work out of the box with some sane defaults. Massive is still one of my favorite libraries (especially when dealing with Oracle *blech*).
Michael Wednesday, March 19, 2014
Separating concerns by letting features float to the top in the form of interfaces and allowing the store implementations to specialize on their own looks to me like a nice idea to help move this forward in a more flexible way. I love the idea of Biggy and while lightweight/easy to use is a must, that all depends on the consumer. Some might love the initial vision while others may need more flexibility down the road. Who is really looking at this right now? My hunch is that those catching on to Biggy may want to see more flexibility as this project moves along in order to adopt it wholly. Regardless of that debate, the idea is a great one and I look forward to seeing where this goes.
pdsullivan.com Wednesday, March 19, 2014
I am a big fan of biggy. These changes sound like they would be a welcomed improvement, and I am happy to see some thought being put towards improvement of this library.
Richard Saturday, April 5, 2014
Any thoughts on how to make Biggy work with multiple processes, e.g. in a load-balanced scenario? Perhaps via some kind of pub/sub mechanism, or a listener thing?
by K. Scott Allen