Friday, June 12, 2009

Ruby & Word: Counting Words and Pages

Someone recently asked how to get a count of the number of words and pages in a Microsoft Word document. This is done by calling the ComputeStatistics() method on a Range or Document object.

As an example (play along at home), let's imagine that you have a Word document open. Your first step is to use the win32ole library's connect() method to connect to the existing instance of Word:


require 'win32ole'
word = WIN32OLE.connect('Word.Application')

You pass the ComputeStatistics() method an integer representing the type of statistic that you want to calculate. In other words, "What do you want to count?" So let's take a moment to define constants for those values:

WdStatisticCharacters = 3
WdStatisticCharactersWithSpaces = 5
WdStatisticWords = 0
WdStatisticLines = 1
WdStatisticParagraphs = 4
WdStatisticPages = 2

You can call the ComputeStatistics() method on a Document object...

doc = word.ActiveDocument
word_count = doc.ComputeStatistics(WdStatisticWords)
page_count = doc.ComputeStatistics(WdStatisticPages)

...or on a Range object...

paragraph = doc.Paragraphs(27)
word_count = paragraph.Range.ComputeStatistics(WdStatisticWords)
char_count = paragraph.Range.ComputeStatistics(WdStatisticCharacters)
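
Since both Document and Range objects respond to ComputeStatistics(), you could gather every statistic in one pass. Here's a minimal sketch of that idea. The WORD_STATISTICS hash and the collect_statistics method are hypothetical names of my own, not part of win32ole or the Word object model; the integer values are the same ones defined above.

```ruby
# Hypothetical helper, not part of win32ole or the Word object model.
# Maps a readable label to the WdStatistic values listed earlier.
WORD_STATISTICS = {
  words:                  0, # WdStatisticWords
  lines:                  1, # WdStatisticLines
  pages:                  2, # WdStatisticPages
  characters:             3, # WdStatisticCharacters
  paragraphs:             4, # WdStatisticParagraphs
  characters_with_spaces: 5  # WdStatisticCharactersWithSpaces
}.freeze

# target can be anything that responds to ComputeStatistics():
# a Document (word.ActiveDocument) or a Range (doc.Paragraphs(27).Range).
# Returns a hash of label => count.
def collect_statistics(target)
  WORD_STATISTICS.each_with_object({}) do |(label, value), stats|
    stats[label] = target.ComputeStatistics(value)
  end
end
```

With Word running, you would call collect_statistics(doc) or collect_statistics(paragraph.Range) and read, say, the :words or :pages entry from the hash it returns.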

When called on a Document object, the method accepts an optional second parameter, IncludeFootnotesAndEndnotes, a boolean that specifies whether the calculation should include footnotes and endnotes:

word_count = doc.ComputeStatistics(WdStatisticWords, true)

The IncludeFootnotesAndEndnotes parameter defaults to false.
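
One use for this parameter is finding out how many words live in the footnotes and endnotes alone: count once with the flag set, once without, and subtract. This is only a sketch; footnote_word_count is a hypothetical helper name, and it assumes you pass it a Document object (not a Range), since only Document accepts the second parameter.

```ruby
# Hypothetical helper: number of words contributed by footnotes and endnotes.
# doc is expected to be a Word Document object, e.g. word.ActiveDocument.
def footnote_word_count(doc)
  words = 0 # same value as WdStatisticWords

  with_notes    = doc.ComputeStatistics(words, true)
  without_notes = doc.ComputeStatistics(words, false)

  with_notes - without_notes
end
```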

Official details on the ComputeStatistics() method are available on MSDN.

That's all for now. Thanks for stopping by!
