I’m looking at 5-7 files of data in CSV format. The columns and format can change every 3 to 6 months. What I wanted was a strongly typed wrapper / data transfer object for all the CSV file formats I had to work with, and I thought this would be a good opportunity to try a T4 template.
If you haven’t heard of T4 templates before, the technology is worth your time to investigate. Hanselman has a pile of links in his post T4 (Text Template Transformation Toolkit) Code Generation – Best Kept Visual Studio Secret.
Each CSV file I’m working with has a header row:
PID,CaseID,BirthDate,DischargeDate,Status …
1001,2001,1/1/1970,6/1/2009,Foo …
… and so on for ~ 100 columns.
I wanted a template that could look at every CSV file it finds in the same directory as itself and create a class with a string property for each column. The template would derive the name of the class from the filename. Here’s a starting point:
<#@ template language="c#v3.5" hostspecific="true" #>
<#@ import namespace="System.IO" #>
<#@ import namespace="System.Collections.Generic" #>
using System;

namespace Acme.Foo.CsvImport
{
<# foreach(var fileName in GetCsvFileNames())
   {
       var fields = GetFields(fileName);
       var className = GenerateClassName(fileName);
#>
    public class <#= className #> : ImportRecord
    {
        public <#= className #>(string data)
        {
            var fields = data.Split(',');
<# for(int i = 0; i < fields.Length; i++) { #>
            <#= fields[i] #> = fields[<#= i #>];
<# } #>
        }

<# foreach(var field in fields) { #>
        public string <#= field #> { get; set; }
<# } #>
    }
<# } #>
} // end namespace

<#+
    // Class name = first two characters of the file name + "_ImportRecord".
    private string GenerateClassName(string fileName)
    {
        return Path.GetFileName(fileName).Substring(0,2) + "_ImportRecord";
    }

    // Column names come from the header row (the first line of the file).
    private string[] GetFields(string fileName)
    {
        using (var stream = new StreamReader(File.OpenRead(fileName)))
        {
            return stream.ReadLine().Split(',');
        }
    }

    // Every *.csv file sitting in the same directory as the template.
    private IEnumerable<string> GetCsvFileNames()
    {
        var path = Host.ResolvePath("");
        return Directory.GetFiles(path, "*.csv");
    }
#>
The template itself needs some polishing, and the generated code needs some error handling, but this was enough to prove how easy it is to spit out type definitions from a CSV header:
public class AA_ImportRecord : ImportRecord
{
    public AA_ImportRecord(string data)
    {
        var fields = data.Split(',');
        ID = fields[0];
        CaseID = fields[1];
        BirthDate = fields[2];
    }

    public string ID { get; set; }
    public string CaseID { get; set; }
    public string BirthDate { get; set; }
    // ...
}
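Consuming the generated class might look something like the following. Treat it as a minimal sketch rather than the real import code: the empty ImportRecord base class, the ImportDemo class, and the ReadRecords helper are all names I made up for illustration, and the three-column AA_ImportRecord is copied in only so the sample compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

namespace Acme.Foo.CsvImport
{
    // Hypothetical stand-in: the real ImportRecord base class is not shown in this post.
    public abstract class ImportRecord { }

    // Trimmed copy of the generated class, repeated here so the sketch is self-contained.
    public class AA_ImportRecord : ImportRecord
    {
        public AA_ImportRecord(string data)
        {
            var fields = data.Split(',');
            ID = fields[0];
            CaseID = fields[1];
            BirthDate = fields[2];
        }

        public string ID { get; set; }
        public string CaseID { get; set; }
        public string BirthDate { get; set; }
    }

    public static class ImportDemo
    {
        // Hypothetical helper: discards the header row, skips blank lines,
        // and builds one record per remaining line.
        public static List<AA_ImportRecord> ReadRecords(TextReader reader)
        {
            var records = new List<AA_ImportRecord>();
            reader.ReadLine(); // header row, already baked into the generated class

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Trim().Length == 0) continue;
                records.Add(new AA_ImportRecord(line));
            }
            return records;
        }

        public static void Main()
        {
            // In-memory data keeps the sample runnable; a StreamReader over a real file works the same way.
            var csv = "ID,CaseID,BirthDate\n1001,2001,1/1/1970\n";
            foreach (var record in ReadRecords(new StringReader(csv)))
            {
                Console.WriteLine(record.ID + " born " + record.BirthDate); // prints "1001 born 1/1/1970"
            }
        }
    }
}
```

Two caveats with this sketch: a line with fewer columns than the header will throw an IndexOutOfRangeException in the generated constructor, and a plain Split(',') does not understand quoted values that contain commas. Both fall under the error handling the generated code still needs.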