Easily Generate Microsoft Office Files From C#

Wednesday, July 30, 2014

It was over a decade ago when I was first asked to generate Microsoft Office files from a web application. In those days there weren’t many options available to create Office files, but for small applications it was possible to automate Office programs on the server with COM. Since Office wasn’t designed to run on a server, the automation approach always felt like climbing a frozen waterfall. From a support page:

Microsoft does not currently recommend, and does not support, automation of Microsoft Office applications from any unattended, non-interactive client application or component … because Office may exhibit unstable behavior and/or deadlock when Office is run in this environment.

These days, Office files are no longer in a proprietary binary format, and are we can create the files directly without using COM automation. A .docx Word file, for example, is a collection of XML documents zipped into a single file. The official name of the format is Open XML.

There is an SDK to help with reading and writing OpenXML, and a Productivity Tool that can generate C# code for a given file. All you need to do is load a document, presentation, or workbook into the tool and press the “Reflect Code” button.

 

Document Reflector

The downside to this tool is that even a simple document will generate 4,000 lines of code. Another downside is that the generated code assumes it will write directly to the file system, however it is easy to pass in an abstract Stream object instead.

So while this code isn’t perfect, the code does produce valid document and  is useful for figuring out how the SDK classes work together. It’s also not difficult to rework the logic so the class functions as a “view”. You can pass in a model object and have the code dynamically generate a document by replacing hard coded content inside with data from the model.


Comments
gravatar Eber Wednesday, July 30, 2014
For Excel files there's EPPlus
gravatar Pedro Wednesday, July 30, 2014
I've used ClosedXML to create spreadsheets and I found it much easier than the raw OpenXML SDK, which I think it's too low level for most needs.
gravatar Nicholas Stein Wednesday, July 30, 2014
I am only interested in Excel and Word files so I did not Investigate NetOffice and LinqToExcel_x64 EPPlus is good for xlsx files. For Word Files and legacy xls files I use NPOI. Both read and write but you might get locked out if someone else has a file open. Since they are both available on NuGet, they are easy to integrate into my project. Able Perez has combined them into a project that is good for getting started. https://nuget.org/packages/ExcelHandler His blog is about the only documentation he has for the project http://abelperezmartinez.blogspot.com/2012/03/handling-excel-spreadsheets-xls-and.html But by installing his wrapper I got both EPPlus and NPOI installed in your project. I also installed the OpenXml SDK and added references to DocumentFormat and WindowsBase. I can then dredge as deeply as I need to into the api layers to get what I need. If you only need to "read" excel files Excel Data Reader on Codeplex is really great because it puts the Workbook into a DataSet. private DataSet GetSpreadsheetAsDataSet() { IExcelDataReader excelReader = null; FileInfo fi = new FileInfo(ExcelFilePath); switch (fi.Extension) { case ".xls": isExcelXML = false; excelReader = ExcelReaderFactory.CreateBinaryReader(GetWorkbookStream(ExcelFilePath)); break; case ".xlsx": isExcelXML = true; excelReader = ExcelReaderFactory.CreateOpenXmlReader(GetWorkbookStream(ExcelFilePath)); break; case ".xlsm": isExcelXML = true; excelReader = ExcelReaderFactory.CreateOpenXmlReader(GetWorkbookStream(ExcelFilePath)); break; default: throw new Exception("Invalid file extension. Must be .xls, .xlsx, or .xlsm"); } DataSet result = excelReader.AsDataSet(); excelReader.Close(); excelReader.Dispose(); return result; } Each tab is a DataTable. This setup is very convenient for iterating through DataRow and ItemIndex to get values from cells. Another good thing about this library is that you can read a file while it is in use. I find that convenient when developing because I can have the file open while I am debugging. Nick
gravatar Sean O Wednesday, July 30, 2014
For those using PHP, the MSDoc Generator class: http://www.phpclasses.org/package/2631-PHP-Create-Microsoft-Word-document-without-COM-objects.html is quite dated, but is easy to use, can generate .DOCX & legacy .DOC, and still works well today.
gravatar Doeke Thursday, July 31, 2014
Open XML and the productivity tool are nice, and much better options than we had in the past. But OpenXML is the assembler of Office documents. It's too low level for most usage. An higher level API would be nice, but a haven't seen (free) adequate ones though...
gravatar John Atten Thursday, July 31, 2014
For working with Word documents, check out DocX, available via Nuget. DocX provides the higher-level abstraction over OpenXml that Doeke is suggesting. For working with Excel documents, LinqToExcel is wonderful, but takes a dependency on ACE. ExcelDataReader is a little more limited in scope, but has no such requirement (if I recall, this one also is an abstraction on OpenXml, though I could be wrong. All are available via Nuget.
gravatar Dave Robbins Thursday, July 31, 2014
For Excel I have been uisng NPOI, which support OpenXML. It is a port POI and I would say c. compatible syntax, so there are many examples that you can review in the Java implementation that can help. For in depth .Net tutorials, you can read the excellent blog post here: http://www.leniel.net/2009/10/npoi-with-excel-table-and-dynamic-chart.html. NPOI is a very nice abstraction over the OpenXML library as well. The downside is that it's supported by just one guy. NPOI also claims to support Word docs, but I haven't had direct experience with the tool.
gravatar Hari Thursday, September 4, 2014
I want to upload the excel 2010 files and read the content and save it to sql server. After uploading , I don't want to save the file in server. How to achive this?
gravatar Scott Thursday, September 4, 2014
@Hari: Use streams, like in one of the answers here: http://stackoverflow.com/questions/13917038/uploading-documents-in-sql-server-2008-using-asp-net-c-sharp
Comments are now closed.
by K. Scott Allen K.Scott Allen
My Pluralsight Courses
The Podcast!