Seeding an Entity Framework Database From File Data

Monday, September 10, 2012

The migrations feature of the Entity Framework includes a Seed method where you can populate the database with the initial static data an application needs. For example, a list of countries.

 

protected override void Seed(Movies.Infrastructure.MovieData context)
{
context.Countries.AddOrUpdate(c => c.Name,
new Country {Name = "India"},
new Country {Name = "Japan"},
// ..
);
}

You'll find the Seed method in the migration configuration class when you "enable-migrations" in a project. Since the migrations framework will call the Seed method with every database update you make, you have to be sure you don't create duplicated records in the database. The AddOrUpdate method will help to prevent duplicate data. With the first parameter to the method you specify a property the framework can use to identify each entity. The identifier could be the primary key value of the entity, but if your database is generating the value, as it would with identity columns, you won't know the value. To manage a list of countries, we just want to make sure we don't duplicate a country, so identifying a record using the name of the country will suffice.
 
Behind the scenes, AddOrUpdate will ultimately need to first query the database to see if the record exists and if so, update the record. If the record doesn't exist migrations will insert a new record. AddOrUpdate is not something you'd want to use for millions of records, but it works well for reasonable sized data sets.  However, sometimes you already have the data you need in a CSV or XML file and don't want to translate the data into a mountain of C# code. Or, perhaps you already have initialization scripts written in SQL and just need to execute the scripts during the Seed method.
 
It's relatively easy to parse data from an XML file, or read a SQL file and execute a command. What can be tricky with the Seed method is knowing your execution context. Seed might be called as part of running "update-database" in the Visual Studio Package Manager Console window. But the Seed method might also be called when a web application launches in an IIS worker process and migrations are turned on by configuration. These are two entirely different execution contexts that complicate the otherwise mundane chore of locating and opening a file.

Putting It Somewhere Safe

One solution is to embed the data files you need into the assembly with the migration code. Add the files into the project, open the properties window, and set the Build Action property to "Embedded Resource".
 
Embedding Resources
 
Now you can open the file via GetManifestResourceStream. The Seed method might look like:
 
protected override void Seed(CitationData context)
{

var resourceName = "Citations.App_Data.Countries.xml";
var assembly = Assembly.GetExecutingAssembly();
var stream = assembly.GetManifestResourceStream(resourceName);
var xml = XDocument.Load(stream);
var countries = xml.Element("Countries")
.Elements("Country")
.Select(x => new Country
{
Name = (string)x.Element("Name"),
}).ToArray();
context.Countries.AddOrUpdate(c=>c.Name, countries);
}

 

Notes

The name of the resource is derived from the location of the resource ({ddefault_namespace}.{folder}.{subfolder}.{filename}). In the example above, countries.xml lives in the App_Data folder of the project, and the project's default namespace is "Citations". If you don't like the name, you can change it.
 
For .sql scripts you can use context.Database.ExecuteSqlCommand, but you'll need to break the file into separate commands if there are delimiters inside (like "GO").
 
Finally, if the data is truly static you might consider executing the data load during the Up method of a migration.

Comments
gravatar Fred Morrison Monday, September 10, 2012
A quick tip:
The line that reads
Name = (string)x.Element("Name"),
can be simplified to just
Name = x.Element("Name").Value,
(since .Value is a string property of an XElement instance)
scott Monday, September 10, 2012
@Fred:

Yes, but it has a potentially different behavior (if Name returns null).
Kyle Monday, September 10, 2012
Wouldn't it be easier to do this at the database level?
gravatar scott Monday, September 10, 2012
@Kyle Yes, it's mostly personal preference. With this approach all you need is to get the latest from source control and run "update-database", so it is not difficult.
Comments are now closed.
by K. Scott Allen K.Scott Allen
My Pluralsight Courses
The Podcast!