Monday, August 24, 2009

Ruby & Word: Inserting Pictures Into Documents

In an earlier article, I explained how to insert an image into a range of cells in an Excel worksheet. Today we'll look at how to insert an image into a Word document.

Setting the Scene

Feel free to play along at home: Imagine that you have a Word document already open...

require 'win32ole'
word = WIN32OLE.connect('Word.Application')
doc = word.ActiveDocument

...and you want to insert an image from your PC...

image = 'C:\Pictures\Picture1.jpg'

...into a given Range object in the document:

range = doc.Sentences(2)

The InlineShapes Collection

The InlineShapes object represents a collection of pictures, OLE objects, and ActiveX controls contained within a given Document or Range object.

The AddPicture() Method

To insert an image into a document or range, we call the AddPicture() method on the InlineShapes collection. This method accepts up to four parameters:

* FileName (required) - The full path and filename of the image to insert.

* LinkToFile (optional) - If true, the inserted picture will be linked to the file from which it was created. You'll usually set this to false to make the picture an independent copy of the file. The default value is false.

* SaveWithDocument (optional) - Set to true to save the linked picture with the document. The default is false. This value will be ignored unless LinkToFile is set to true.

* Range (optional) - A Range object representing the position in which to insert the picture.

The AddPicture() method returns a reference to the newly created InlineShape object that is your picture.

So, to insert an image at the start of the range that we defined earlier, we could call the method on our range object...

range = doc.Sentences(2)
pic = range.InlineShapes.AddPicture( { 'FileName' => image } )

...or we could call it on the Document object and pass the Range object as a parameter:

range = doc.Sentences(2)
pic = doc.InlineShapes.AddPicture( { 'FileName' => image,
'Range' => range } )

When providing all four parameters, the syntax looks like this:

pic = doc.InlineShapes.AddPicture( { 'FileName' => image,
'LinkToFile' => false,
'SaveWithDocument' => false,
'Range' => range } )

If your only parameter is the FileName, you can bypass the hash format and simply pass the string to the method, like this:

pic = range.InlineShapes.AddPicture( 'C:\Pictures\Picture1.jpg' )

To insert the image at the currently selected position, use the word.Selection.Range object. Either of the following forms works; use one or the other, since running both inserts the picture twice:

pic = word.Selection.Range.InlineShapes.AddPicture( image )

pic = doc.InlineShapes.AddPicture( { 'FileName' => image,
'Range' => word.Selection.Range } )
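
As a rough sketch, the calls shown in this article could be wrapped in one small helper. The name insert_picture and its keyword arguments are my own invention for illustration, not part of the Word object model; the helper assumes only that its target responds to InlineShapes, as Document and Range objects do.

```ruby
# Hypothetical helper (not part of Word's API): inserts a picture into any
# object that exposes an InlineShapes collection, such as a Document or Range.
def insert_picture(target, file_name, range: nil, link_to_file: false, save_with_document: false)
  # Build the named-argument hash described in the parameter list above.
  args = {
    'FileName'         => file_name,
    'LinkToFile'       => link_to_file,
    'SaveWithDocument' => save_with_document
  }
  # Range is optional, so it is only passed along when the caller supplies one.
  args['Range'] = range if range
  target.InlineShapes.AddPicture(args)
end

# Possible usage, assuming word, doc, range and image are set up as above:
#   pic = insert_picture(range, image)
#   pic = insert_picture(doc, image, range: word.Selection.Range)
```

Keeping the hash construction in one place means the optional Range argument is handled the same way whether you start from a Range or from the Document.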

And that's our show for today. Thanks for stopping by!

Thursday, August 20, 2009

About the Book

A few readers have asked me for details on the book-in-progress.

The working title is "Automating Windows Applications with Ruby". As the title implies, the focus is on automating applications, usually via Win32OLE/COM.

This is not an "Introduction to Ruby" book: I make the assumption that the reader already has a working knowledge of the language and now wants to put it to good use driving applications and processes on Windows. Some of the topics covered include:

* Microsoft Excel
* Microsoft Word
* Microsoft Outlook
* Microsoft PowerPoint
* Database Access with ADO
* Microsoft Access
* Microsoft SQL Server
* Microsoft Internet Explorer
* Microsoft Office Document Imaging (OCR)
* iTunes
* Windows Media Player
* Text-to-Speech
* Microsoft Agent
* Hummingbird HostExplorer
* Attachmate Extra
* Windows Management Instrumentation

Each chapter will include step-by-step tutorials, and many chapters will also include a reference section covering commonly-used classes, properties, and methods. Some chapters are extensive, while others may be only a few pages in length.

This isn't intended to be an "Everything You'd Ever Want To Know About..." resource, but it is my intention to provide you with a book that will serve as both a quick-start guide and a go-to reference for automating a wide variety of applications.

Stay tuned...

Wednesday, August 5, 2009

Ruby & Excel: Inserting Pictures Into Cells (New and Improved!)

In a previous article, I discussed a method for inserting images into an Excel worksheet. It seems that the Worksheet.Pictures.Insert() method that I demonstrated in that article, though frequently used, is not officially documented in the Excel Object Model Reference. An astute reader has called my attention to this fact and, in reply, I hereby present the officially documented (and probably preferred) method for adding an image to a worksheet.

The Worksheet object's Shapes collection includes an AddPicture() method that creates a picture from an existing file and returns a Shape object that represents the new picture. The syntax is:

.AddPicture(Filename, LinkToFile, SaveWithDocument, Left, Top, Width, Height)

All seven arguments are required, but this allows you to specify the position and size of the picture in the method call.

The following code inserts an image into the range of cells from C3 to F5 in the active worksheet:

require 'win32ole'

xl = WIN32OLE.connect('Excel.Application')
ws = xl.ActiveSheet

range = ws.Range('C3:F5')

pic = ws.Shapes.AddPicture( {
'FileName' => 'C:\Pictures\Image1.jpg',
'LinkToFile' => false,
'SaveWithDocument' => true,
'Left' => range.Left,
'Top' => range.Top,
'Width' => range.Width,
'Height' => range.Height
} )
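
Because all seven arguments must be supplied, it can be convenient to build the argument hash from a Range in one place. The following is only a sketch: picture_args is a hypothetical helper name of my own, and it relies on nothing beyond the Left, Top, Width, and Height properties used above.

```ruby
# Hypothetical helper: builds the seven-argument hash for Shapes.AddPicture()
# so that the picture covers the given range of cells.
def picture_args(file_name, range, link_to_file: false, save_with_document: true)
  {
    'FileName'         => file_name,
    'LinkToFile'       => link_to_file,
    'SaveWithDocument' => save_with_document,
    # Position and size are copied straight from the range.
    'Left'             => range.Left,
    'Top'              => range.Top,
    'Width'            => range.Width,
    'Height'           => range.Height
  }
end

# Possible usage, assuming ws is the active Worksheet as above:
#   pic = ws.Shapes.AddPicture(picture_args('C:\Pictures\Image1.jpg', ws.Range('C3:F5')))
```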

You can find further details on the AddPicture() method on MSDN.

My thanks to Charles Roper for his inquiry, prompting me to dig a little deeper.

And my thanks to you for stopping by!

Sunday, August 2, 2009

OCR: Converting Images to Text with MODI

Joe Schmoe from Kokomo has a scanned image of a 300-page contract. Joe wishes he could search this file for certain rates and terms, but it's an image, not a text file. OCR might be just what the doctor ordered.

Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of images of handwritten, typewritten or printed text (usually captured by a scanner) into machine-editable text. --Wikipedia

One such OCR solution that you may already have available to you is Microsoft Office Document Imaging (MODI), part of the Microsoft Office suite. Let's look at how you can use Ruby and the MODI API to automate the conversion of a scanned document into text.

Installing MODI

MODI might not have been installed when you installed Microsoft Office, so your first step may be to install it from the Office install disks. If installed, you will probably find an icon for "Microsoft Office Document Imaging" located on your Windows Start/Programs menus under "Microsoft Office Tools". If it's not there, go to your Add/Remove Software control panel, select your Microsoft Office installation, and select the option to add features. Then follow the necessary steps, which may vary depending on your version of Windows and Office.

Accessing the MODI API

To begin with, we'll use the win32ole module to create a new instance of the MODI.Document object:

require 'win32ole'
doc = WIN32OLE.new('MODI.Document')

Loading the Image

The next step is to call the document object's Create() method, passing it the name of the .TIF file to load (the path below is a placeholder; substitute your own):

doc.Create('C:\path\to\image.tif')

NOTE: MODI only works with TIFF files. If your image is in another format (.JPG or .PNG, for example), you can use an image editor (such as Paint.NET or Photoshop) or code library (such as RMagick) to convert it to TIFF format.
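
Since MODI only accepts TIFF input, a script might check the file extension before handing a path to Create(). This is a minimal sketch; tiff_file? is a hypothetical helper of my own, and it inspects only the file name, not the file's contents.

```ruby
# Hypothetical helper: true when the path ends in .tif or .tiff (any case).
# It checks the name only; it does not verify that the file is a valid TIFF.
def tiff_file?(path)
  %w[.tif .tiff].include?(File.extname(path).downcase)
end

# Possible usage, assuming doc is the MODI.Document created above:
#   doc.Create(path) if tiff_file?(path)
```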

Performing the OCR

The OCR() method performs the optical character recognition on the document. The method can be called without parameters...

doc.OCR()

...or with any of three optional parameters:

doc.OCR( { 'LangId' => 9,
'OCROrientImage' => true,
'OCRStraightenImage' => true } )

LangId: An integer representing the language of the document. English = 9, French = 12, German = 7, Italian = 16, Spanish = 10. This value defaults to the user's regional settings.

OCROrientImage: This boolean value specifies whether the OCR engine attempts to determine the orientation (portrait versus landscape) of the page. The value defaults to true.

OCRStraightenImage: This boolean value specifies whether the OCR engine attempts to "deskew" the image to correct minor misalignments. The value defaults to true.

You may find that tweaking these parameters from their default values produces better results, depending on the individual image(s) you're working with.
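
To avoid scattering magic numbers through a script, the language codes listed above could be kept in a lookup table. This is a sketch in which LANG_IDS and ocr_options are hypothetical names of my own, covering only the five languages mentioned in this article.

```ruby
# Language codes as listed above; only these five are included here.
LANG_IDS = {
  english: 9,
  french:  12,
  german:  7,
  italian: 16,
  spanish: 10
}.freeze

# Hypothetical helper: builds the named-argument hash for the OCR() method.
def ocr_options(language, orient: true, straighten: true)
  lang_id = LANG_IDS.fetch(language) do
    raise ArgumentError, "unknown language: #{language.inspect}"
  end
  {
    'LangId'             => lang_id,
    'OCROrientImage'     => orient,
    'OCRStraightenImage' => straighten
  }
end

# Possible usage, assuming doc is a loaded MODI.Document:
#   doc.OCR(ocr_options(:french, straighten: false))
```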

Getting the Text

Naturally, you'll want to get your hands on the text produced by the OCR process. Each page of the Document is represented by an Image object. The Image object contains a Layout object, and that Layout object's Text property represents the text for that image/page. So the hierarchy looks like this:

Document => Images => Image => Layout => Text

To gather the entire text, simply iterate over the Document.Images collection and grab the Layout.Text values. For example:

# Write the text of each page to a file, separated by blank lines
File.open('my_text.txt', 'w') do |f|
  for image in doc.Images
    f.puts("\n" + image.Layout.Text + "\n")
  end
end
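
If you'd rather have the text in a Ruby string than in a file, the same loop can be folded into a small method. This is only a sketch: document_text is a hypothetical helper name, and it assumes nothing beyond the Images collection and the Layout.Text property described above.

```ruby
# Hypothetical helper: returns the text of every page, joined by newlines.
def document_text(doc)
  pages = []
  # Uses each with an accumulator rather than map, on the assumption that
  # the collection offers each but not the rest of Enumerable.
  doc.Images.each { |image| pages << image.Layout.Text }
  pages.join("\n")
end

# Possible usage, assuming doc has been loaded and OCR() has been run:
#   text = document_text(doc)
#   puts text if text.include?('rate')
```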

Text, But Not Formatting

No OCR process can guarantee 100% accuracy, but I've found that MODI does a pretty good job recognizing text. Results will vary, of course, depending on the quality of the TIFF image. Note, however, that it cannot preserve formatting of tabular data. So while the text in a series of columns may be produced with a high degree of accuracy, that text will probably be produced with one value per line. So...

apple orange pear

...comes out as...

apple
orange
pear

Paragraphs of text have, in my experience, been produced with the proper line feeds. Play around with it and see if it meets your needs.

That concludes our show for today. Thanks for tuning in!