One of the questions I've been getting lately goes like this:
Should a Repository class have a Save method?
And the standard answer:
It depends.
It's hard to keep track of all the different approaches to implementing the repository pattern these days, but I assume when someone asks me this type of question they are thinking of using code like this:
var employee = new Employee();
employeeRepository.Save(employee);
var account = new Account();
accountRepository.Save(account);
This style always strikes me as odd. It looks like code that wants to move away from an active record approach, but the code hasn't quite reached escape velocity. That's not to say the approach doesn't work. I can think of a couple of scenarios where this approach can work well:
In other scenarios I think this approach only creates additional complications in sharing connections, transaction management, and maintaining entity identities. That's why I expect most repositories to work in conjunction with a unit of work object.
var employee = new Employee();
employeeRepository.Add(employee);
var account = new Account();
accountRepository.Add(account);
uow.Commit();
It's a small difference in code, but a considerable difference in philosophy. No entities are persisted until the unit of work commits. Behind the scenes, each repository working inside the same logical transaction is associated with the same unit of work. This approach simplifies the issues revolving around connections, transactions, and identity.
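To make the "behind the scenes" part concrete, here is a minimal in-memory sketch of repositories that share one unit of work. The UnitOfWork and Repository<T> classes, the RegisterNew method, and the list standing in for the database are all assumptions made for illustration, not the API of any particular framework.

```csharp
using System;
using System.Collections.Generic;

public class Employee { public string Name { get; set; } = ""; }
public class Account { public string Number { get; set; } = ""; }

// Hypothetical unit of work: collects new entities and persists nothing until Commit.
public class UnitOfWork
{
    private readonly List<object> _pendingAdds = new List<object>();
    private readonly List<object> _store = new List<object>(); // stands in for the database

    public IReadOnlyList<object> Store => _store;

    public void RegisterNew(object entity) => _pendingAdds.Add(entity);

    public void Commit()
    {
        // A real implementation would use one connection and one transaction here.
        _store.AddRange(_pendingAdds);
        _pendingAdds.Clear();
    }
}

// Hypothetical repository: Add only registers the entity with the shared unit of work.
public class Repository<T> where T : class
{
    private readonly UnitOfWork _uow;
    public Repository(UnitOfWork uow) { _uow = uow; }
    public void Add(T entity) => _uow.RegisterNew(entity);
}

public static class Program
{
    public static void Main()
    {
        var uow = new UnitOfWork();
        var employeeRepository = new Repository<Employee>(uow);
        var accountRepository = new Repository<Account>(uow);

        employeeRepository.Add(new Employee { Name = "Ada" });
        accountRepository.Add(new Account { Number = "42" });
        Console.WriteLine(uow.Store.Count); // 0: nothing is persisted before Commit

        uow.Commit();
        Console.WriteLine(uow.Store.Count); // 2: both entities are persisted together
    }
}
```

Because both repositories receive the same UnitOfWork instance, there is a single place to manage the connection, the transaction, and the set of tracked entities.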
Comments
You're uncovering why POJOs and CRUDs are, well, cruddy.
Suggest you explore the Oracle ADF BC framework which has a highly disciplined approach to transactions and commits.
Disclaimers: I am a proud Oracle employee but cannot speak for Oracle at all. I'm also interested in comments about my blog post blog.schemaczar.com/2008-12/crud-in-crud-out/
Using the Session/Request pattern, I can inject my UoW into my repositories and call Commit at the end of my request.
var employee = new Employee();
uow.Add(employee);
var account = new Account();
uow.Add(account);
uow.Commit();
Then the repositories are only responsible for querying... Also when dealing with ORMs that support persistence by reachability, it becomes much clearer when you need to call Save and when you don't. You could have UnitOfWork.Add(T obj) and UnitOfWork.Delete(T obj) constrained to be an aggregate root (marker interface or abstract base class). Hmmm... This is becoming a blog post... :)
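A rough sketch of that idea, assuming a hypothetical IAggregateRoot marker interface and an in-memory list in place of a real ORM session, might look like this. The generic constraint means only aggregate roots can be passed to Add and Delete, and the repository is left with querying only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical marker interface for aggregate roots.
public interface IAggregateRoot { }

public class Employee : IAggregateRoot { public string Name { get; set; } = ""; }
public class Address { public string City { get; set; } = ""; } // not an aggregate root

public class UnitOfWork
{
    private readonly List<IAggregateRoot> _added = new List<IAggregateRoot>();
    private readonly List<IAggregateRoot> _deleted = new List<IAggregateRoot>();
    private readonly List<IAggregateRoot> _store = new List<IAggregateRoot>(); // stands in for the database

    public void Add<T>(T obj) where T : class, IAggregateRoot => _added.Add(obj);
    public void Delete<T>(T obj) where T : class, IAggregateRoot => _deleted.Add(obj);

    // Exposes committed entities so that repositories can query them.
    public IEnumerable<T> Query<T>() where T : class, IAggregateRoot => _store.OfType<T>();

    public void Commit()
    {
        _store.AddRange(_added);
        foreach (var root in _deleted) _store.Remove(root);
        _added.Clear();
        _deleted.Clear();
    }
}

// The repository only queries; it never saves.
public class EmployeeRepository
{
    private readonly UnitOfWork _uow;
    public EmployeeRepository(UnitOfWork uow) { _uow = uow; }

    public List<Employee> FindByName(string name) =>
        _uow.Query<Employee>().Where(e => e.Name == name).ToList();
}

public static class Program
{
    public static void Main()
    {
        var uow = new UnitOfWork();
        var employees = new EmployeeRepository(uow);

        uow.Add(new Employee { Name = "Ada" });
        // uow.Add(new Address()); // would not compile: Address is not an IAggregateRoot

        Console.WriteLine(employees.FindByName("Ada").Count); // 0 before Commit
        uow.Commit();
        Console.WriteLine(employees.FindByName("Ada").Count); // 1 after Commit
    }
}
```

An abstract base class would work in place of the marker interface; the constraint on Add and Delete is what enforces the rule at compile time.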
Try to think about a more complex object model. For example, consider a model of a social network where you have people and relations.
You certainly have a PeopleRepository so you can retrieve all the people, and you may also have a RelationRepository so you can retrieve all the relations. But what about a RelationRepository for a specific person?
When you have a specific person (p), you might want to ask for all the relations she has (p.Relations), which returns a repository containing only that person's relations. Changing this repository should then change the main repository and the other repositories, because if I delete a relation from one person it should also be deleted from the other side.
Another reason to have this design is that you can load only the relations of the people you really need, rather than loading all the relations just to work with 2 out of 2M.
Thanks for the post,
Ido.
I agree there are a ton of different implementations of the repository pattern. I'm curious what API you typically create when saving an existing object (i.e., update), perhaps one that was passed in from another tier (e.g., it came in from a web service call and the original object context is long gone). Your examples above have an Add() method for inserting a new object. But do they also have some Update() method for updating an existing object? From an EF-centric perspective, where do you typically handle the "Attach" of an existing entity? Curious to see how others are doing it. Thanks.
I've managed to avoid Attach for the most part, but if I had to use it I think I'd make it part of the IRepository interface.
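As a purely illustrative sketch of what that could look like, the following assumes a hypothetical IRepository<T> with Add and Attach members, backed by an in-memory unit of work. It does not use or mirror the Entity Framework API. Here Attach simply registers a detached entity so that Commit overwrites the stored copy with the same key.

```csharp
using System;
using System.Collections.Generic;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

// Hypothetical interface; the member names are illustrative assumptions.
public interface IRepository<T> where T : class
{
    void Add(T entity);    // register a new entity for insert
    void Attach(T entity); // register a detached, existing entity for update
}

public class UnitOfWork
{
    private readonly Dictionary<int, Employee> _table = new Dictionary<int, Employee>(); // stands in for the database
    private readonly List<Employee> _inserts = new List<Employee>();
    private readonly List<Employee> _updates = new List<Employee>();

    public void RegisterNew(Employee e) => _inserts.Add(e);
    public void RegisterDirty(Employee e) => _updates.Add(e);
    public Employee Find(int id) => _table[id];

    public void Commit()
    {
        foreach (var e in _inserts) _table.Add(e.Id, e);
        foreach (var e in _updates)
        {
            if (!_table.ContainsKey(e.Id))
                throw new InvalidOperationException("Cannot update an entity that was never saved.");
            _table[e.Id] = e; // replace the stored copy with the attached one
        }
        _inserts.Clear();
        _updates.Clear();
    }
}

public class EmployeeRepository : IRepository<Employee>
{
    private readonly UnitOfWork _uow;
    public EmployeeRepository(UnitOfWork uow) { _uow = uow; }
    public void Add(Employee entity) => _uow.RegisterNew(entity);
    public void Attach(Employee entity) => _uow.RegisterDirty(entity);
}

public static class Program
{
    public static void Main()
    {
        var uow = new UnitOfWork();
        var repo = new EmployeeRepository(uow);

        repo.Add(new Employee { Id = 1, Name = "Ada" });
        uow.Commit();

        // Simulates an object arriving from another tier with no tracking context.
        var incoming = new Employee { Id = 1, Name = "Ada Lovelace" };
        repo.Attach(incoming);
        uow.Commit();

        Console.WriteLine(uow.Find(1).Name); // Ada Lovelace
    }
}
```

With this shape, callers keep the same Add-then-Commit rhythm for updates that they use for inserts, and the repository remains the only place that knows about attaching.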