Tuesday, 19 October 2010

Document Generation with Word 2007 and ASP.NET - part 2

In the first part, we looked at how to build a document ready to accept dynamic data which can be used to generate dynamic reports. Whether we are planning to use a blank template, individual read/write documents or merging documents together, the basic idea is the same:

  1. Get data from data source

  2. Create XML part for this data

  3. Insert this data into the document

  4. Return the result to the client

Getting data from the data source is beyond the scope of this document but in my case, I have an existing database access layer which I call to return a single row of data (a DataRow).
Creating the XML part is quite easy using the XmlWriter class as in the following function:

private void GetData(Stream stream, string quoteRef)
{
    // DataRow dr = etc..
    using (XmlWriter writer = XmlWriter.Create(stream))
    {
        // "Quote" is a placeholder root element name; use whatever your template's custom XML part expects
        writer.WriteStartElement("Quote");
        writer.WriteAttributeString("Reference", quoteRef);
        writer.WriteElementString("QuoteName", "Test New Quote");
        writer.WriteElementString("TotalSellingPrice", Convert.ToDecimal(dr["TotalSellingPrice"]).ToString("N2"));
        writer.WriteElementString("TotalCost", Convert.ToDecimal(dr["TotalCost"]).ToString("N2"));
        writer.WriteEndElement();
    }
}

Since the XML is text, it is convenient to format numbers at this point rather than playing with it in the document (but you can if you need to).
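As a small illustration of formatting at this point: ToString("N2") with no other arguments uses the current thread culture, so the separators can change depending on how the server is configured. The sketch below pins the culture so the text in the XML is predictable. The class and method names are made up for the example, and whether you want the invariant culture or a specific customer culture is an assumption you would need to check against your own reports.

```csharp
using System;
using System.Globalization;

public static class NumberFormatSketch
{
    // Formats a monetary value with thousands separators and two decimals,
    // independent of the server's regional settings.
    public static string FormatMoney(decimal value)
    {
        return value.ToString("N2", CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(FormatMoney(1234.5m)); // 1,234.50
    }
}
```

The same call with a specific CultureInfo (for example one created from "de-DE") would give that culture's separators instead.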
To insert this into the document, we need to use the System.IO.Packaging classes which allow you to work on the zip file (which is the docx). You might need to reference WindowsBase.dll if you haven't already to get these classes.
My function is then:

private void InsertCustomXml(MemoryStream memoryStream)
{
    // Open the document in the stream and replace the custom XML part
    Package pkgFile = Package.Open(memoryStream, FileMode.Open, FileAccess.ReadWrite);
    PackageRelationshipCollection pkgrcOfficeDocument = pkgFile.GetRelationshipsByType(strRelRoot);
    foreach (PackageRelationship pkgr in pkgrcOfficeDocument)
    {
        if (pkgr.SourceUri.OriginalString == "/")
        {
            // Add a custom XML part to the package
            Uri uriData = new Uri("/customXML/item1.xml", UriKind.Relative);
            if (pkgFile.PartExists(uriData))
            {
                // Delete template "/customXML/item1.xml" part
                pkgFile.DeletePart(uriData);
            }
            // Load the custom XML data
            PackagePart pkgprtData = pkgFile.CreatePart(uriData, "application/xml");
            GetData(pkgprtData.GetStream(), "QUO-016952");
        }
    }
    // Close the file (this flushes the changes back into the memory stream)
    pkgFile.Close();
}

This function takes a stream containing a docx document and adds the custom XML to it. There is nothing in this function which you would need to change, except the string parameter passed to GetData. You do need a const defined which matches the relationship type of the main document part:

private const string strRelRoot = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
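If you want to check the delete-then-create logic on its own, without a real docx or any database code, a rough sketch like the following might help. It builds an empty package in memory, writes the part twice and reads it back. The class name, method names and the Quote XML are all invented for the example; only the System.IO.Packaging calls are the same ones used above.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Text;

public static class ReplacePartSketch
{
    static readonly Uri PartUri = new Uri("/customXML/item1.xml", UriKind.Relative);

    // Replaces (or creates) the custom XML part with the given XML text.
    public static void ReplacePart(MemoryStream packageStream, FileMode mode, string xml)
    {
        packageStream.Position = 0;
        using (Package package = Package.Open(packageStream, mode, FileAccess.ReadWrite))
        {
            if (package.PartExists(PartUri))
            {
                package.DeletePart(PartUri);
            }
            PackagePart part = package.CreatePart(PartUri, "application/xml");
            using (Stream partStream = part.GetStream())
            {
                byte[] bytes = Encoding.UTF8.GetBytes(xml);
                partStream.Write(bytes, 0, bytes.Length);
            }
        }
    }

    // Reads the custom XML part back as text.
    public static string ReadPart(MemoryStream packageStream)
    {
        packageStream.Position = 0;
        using (Package package = Package.Open(packageStream, FileMode.Open, FileAccess.Read))
        using (StreamReader reader = new StreamReader(package.GetPart(PartUri).GetStream()))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        MemoryStream ms = new MemoryStream(); // default constructor, so it can grow
        ReplacePart(ms, FileMode.Create, "<Quote>old</Quote>");
        ReplacePart(ms, FileMode.Open, "<Quote>new</Quote>");
        Console.WriteLine(ReadPart(ms));
    }
}
```

Disposing the Package writes the changes back to the stream but leaves the stream itself open, which is what allows the same MemoryStream to be reopened afterwards.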

The reason I use streams is because I use the system in different ways. To retrieve a single document:

public MemoryStream RetrieveDocument(string fileName, bool addCustomXml)
{
    // Read the file into memory - default constructor is expandable
    MemoryStream memoryStream = new MemoryStream();
    byte[] buffer = File.ReadAllBytes(fileName);
    memoryStream.Write(buffer, 0, buffer.Length);
    // If we want to add in the XML, do it here otherwise we might want to add it at top level
    if (addCustomXml)
    {
        InsertCustomXml(memoryStream);
    }
    // Rewind so the caller reads from the start of the document
    memoryStream.Position = 0;
    return memoryStream;
}

NOTE the comment that you need to use the default constructor for the memory stream otherwise it will not be expandable when you insert your custom XML.
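To see the difference for yourself, here is a minimal sketch comparing a stream that wraps the byte array directly with one created by the default constructor. The class and helper names are made up for the demonstration, and the four-byte array just stands in for the file contents.

```csharp
using System;
using System.IO;

public static class MemoryStreamSketch
{
    // Returns true if extra bytes can be appended to the end of the stream.
    public static bool CanGrow(MemoryStream stream)
    {
        try
        {
            stream.Seek(0, SeekOrigin.End);
            stream.Write(new byte[16], 0, 16);
            return true;
        }
        catch (NotSupportedException)
        {
            // Thrown when the stream wraps a fixed-size buffer
            return false;
        }
    }

    public static void Main()
    {
        byte[] fileBytes = new byte[] { 1, 2, 3, 4 };

        // Wraps the array directly: the capacity is fixed at 4 bytes
        MemoryStream fixedStream = new MemoryStream(fileBytes);

        // Default constructor, then copy the bytes in: the stream can expand
        MemoryStream growable = new MemoryStream();
        growable.Write(fileBytes, 0, fileBytes.Length);

        Console.WriteLine(CanGrow(fixedStream)); // False
        Console.WriteLine(CanGrow(growable));    // True
    }
}
```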
If I am merging several documents (in my case using the Aspose Words dlls from Aspose) I merge the docs and THEN add the custom XML to the main document:

public Document MergeDocuments(List<MemoryStream> docs, string templateDoc, string outputPath)
{
    Document dstDoc = new Document(templateDoc);

    foreach (MemoryStream Doc in docs)
    {
        Document srcDoc = new Document(Doc);
        dstDoc.AppendDocument(srcDoc, ImportFormatMode.UseDestinationStyles);
    }
    // Add in the custom XML via the memory stream
    MemoryStream ms = new MemoryStream();
    dstDoc.Save(ms, SaveFormat.Docx);
    InsertCustomXml(ms);
    ms.Position = 0;
    dstDoc = new Document(ms);
    return dstDoc;
}

Aspose merges actual docs whereas I need a memory stream for my functions, so I use the Aspose Document class to append the documents and then save the result to a memory stream to add the custom XML. Once that is done, I return a new Aspose Document so it can be saved appropriately. One of the things that is easier in docx is that, to merge docs, you can simply merge the content and keep a single set of styles, custom XML and the rest of it. It does mean, however, that any custom XML or styles in the individual docs are lost when the documents are merged.
In my web layer then, I call these functions and return the result to the browser using content-disposition to hint that it can be saved instead of viewed inline:

protected void Page_Load(object sender, EventArgs e)
{
    TenderGenerator gen = new TenderGenerator();
    const string TemplateFile = @"~/App_Data/QuoteTemplate.docx";
    List<MemoryStream> myList = new List<MemoryStream>();
    myList.Add(gen.RetrieveDocument(@"c:\work\QuoteDocuments\QUO-016952\Documents\UniqueReport.docx", false));
    myList.Add(gen.RetrieveDocument(@"c:\work\QuoteDocuments\QUO-016952\Documents\Financial Summary.docx", false));
    Document doc = gen.MergeDocuments(myList, Server.MapPath(TemplateFile), "");
    doc.Save(Response, "CustomerDocument.docx", ContentDisposition.Attachment, SaveOptions.CreateSaveOptions(SaveFormat.Docx));
}

In this case, I have a test page, so the doc paths are hard-coded and the whole thing is driven from the page load event. In real life, this will be driven from a button press and a database-driven list of docs. Note that I use the template file stored in App_Data, which already has the custom XML linked to it, so that the top level document can have the custom XML replaced (in the previous post I mentioned that, for some reason, if the custom XML is not present, the new data is not added in its place). Hopefully this is all easy enough to understand.