Monday, February 28, 2011

Reading Content Control values programmatically

Content controls are one of the cool features in Microsoft Word 2010 (first introduced in Word 2007). Imagine a case where users submit their monthly or yearly reports as MS Word files. Those files contain quantitative and qualitative data that needs to be entered into the company's main system for review and future planning. Typically, data entry staff enter these values manually, but with very little effort we can automate the process.

We can give those users a template MS Word file, which they fill in and submit each month or year. The template contains content controls (text box, drop-down list, etc.). We can extract and process their values programmatically, and then enter them into the main system without delay.

To accomplish this, first enable the Developer tab (steps from Microsoft):
  1. Start the application.
  2. Click the File tab.
  3. Click Options.
  4. In the categories pane, click Customize Ribbon.
  5. In the list of main tabs, select Developer.
  6. Click OK to close the Options dialog box.
Create your template by adding controls:
  1. Make sure you are on the Developer tab (marked as [1]).
  2. Drag and drop a control from the Ribbon (marked as [2]) and set its properties.
  3. Set a meaningful Title (marked as [3]), which you can use in code.
  4. For a drop-down list, add the possible values (marked as [4]).
  5. You can check “Content control cannot be deleted” (marked as [5]) to make sure users cannot delete the controls by mistake.
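
As a side note, the possible values added in step 4 are stored inside the document itself, so they can be read back later with the Open XML SDK. The following is only a minimal sketch under some assumptions: `DropDownInspector` and `ReadChoices` are hypothetical names made up for this illustration, and the sketch assumes the control is a plain drop-down list (not a combo box).

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper for illustration; not part of the Open XML SDK.
public static class DropDownInspector
{
    // Returns (display text, value) pairs for the choices of a drop-down
    // content control, or an empty list if the control is not a drop-down.
    public static List<KeyValuePair<string, string>> ReadChoices(SdtElement sdt)
    {
        var choices = new List<KeyValuePair<string, string>>();

        // The list of choices is stored in the control's properties.
        SdtProperties props = sdt.GetFirstChild<SdtProperties>();
        SdtContentDropDownList list =
            props == null ? null : props.GetFirstChild<SdtContentDropDownList>();
        if (list == null)
            return choices;

        foreach (ListItem item in list.Elements<ListItem>())
        {
            string value = item.Value == null ? null : item.Value.Value;
            // The display text may be omitted; fall back to the value then.
            string display = item.DisplayText == null ? value : item.DisplayText.Value;
            choices.Add(new KeyValuePair<string, string>(display, value));
        }
        return choices;
    }
}
```

This can be handy for validating that a submitted value is still one of the choices defined in the template.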

After this, you can use the following code snippet to read all the values from the .docx file.

            // At the top of the file (requires the Open XML SDK):
            // using System;
            // using System.Collections.Generic;
            // using System.Linq;
            // using DocumentFormat.OpenXml.Packaging;
            // using DocumentFormat.OpenXml.Wordprocessing;

            // Open the document read-only; the using block closes it when done
            using (WordprocessingDocument docC = WordprocessingDocument.Open("SampleSale.docx", false))
            {
                MainDocumentPart mainPart = docC.MainDocumentPart;

                // Content controls that are part of a line (inline)
                List<SdtRun> SdtRunList = mainPart.Document.Descendants<SdtRun>().ToList();

                // Content controls that wrap whole paragraphs
                List<SdtBlock> SdtBlockList = mainPart.Document.Descendants<SdtBlock>().ToList();

                for (Int32 i = 0; i < SdtRunList.Count; i++)
                {
                    // The Title is stored as an SdtAlias; First() throws if a control has no title
                    Console.WriteLine("(SdtRun) Title: " + SdtRunList[i].Descendants<SdtAlias>().First().Val.Value);
                    Console.Write("Value: ");
                    foreach (Text t in SdtRunList[i].Descendants<Text>())
                        Console.Write(t.InnerText); // Printing values

                    Console.WriteLine(Environment.NewLine); // Adding an extra blank line
                }

                for (Int32 i = 0; i < SdtBlockList.Count; i++)
                {
                    Console.WriteLine("(SdtBlock) Title: " + SdtBlockList[i].Descendants<SdtAlias>().First().Val.Value);
                    Console.Write("Value: ");
                    foreach (Text t in SdtBlockList[i].Descendants<Text>())
                        Console.Write(t.InnerText); // Printing values

                    Console.WriteLine(Environment.NewLine); // Adding an extra blank line
                }
            }

            Console.ReadKey();


Make sure you read both SdtRun and SdtBlock objects so that no content control is missed. To understand the Open XML object model you can go through several articles, but you should also use the “Open XML SDK 2.0 Productivity Tool” to explore the file yourself. It will speed up the learning process.
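
One possible way to avoid handling each kind of content control separately is to query for their common base class, SdtElement, which SdtRun and SdtBlock (as well as the table-related SdtCell and SdtRow) derive from. The sketch below is only an illustration under stated assumptions: `ContentControlReader` and `ReadValues` are hypothetical names, controls without a Title are skipped, and if two controls share a Title the later one wins.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper for illustration; not part of the Open XML SDK.
public static class ContentControlReader
{
    // Builds a "Title -> text" map for every content control under root.
    public static Dictionary<string, string> ReadValues(OpenXmlElement root)
    {
        var values = new Dictionary<string, string>();

        // SdtElement matches SdtRun, SdtBlock, SdtCell and SdtRow alike.
        foreach (SdtElement sdt in root.Descendants<SdtElement>())
        {
            // The Title from the Properties dialog is stored as SdtAlias.
            SdtProperties props = sdt.GetFirstChild<SdtProperties>();
            SdtAlias alias = props == null ? null : props.GetFirstChild<SdtAlias>();
            if (alias == null || alias.Val == null)
                continue; // assumption: untitled controls are not of interest

            // Concatenate all text runs inside the control
            // (this includes the text of any nested controls).
            string text = string.Concat(sdt.Descendants<Text>().Select(t => t.Text));
            values[alias.Val.Value] = text;
        }
        return values;
    }
}
```

With a document opened as in the snippet above, it could be called as `ContentControlReader.ReadValues(docC.MainDocumentPart.Document)` and the resulting dictionary looped over to print or store each title and value.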
Here is a screenshot of the Productivity Tool (exploring the SampleSale.docx file):


Feel free to leave any comment :)


Download
SampleSale.docx | Code.