Sunday, April 15, 2007

Automating Excel with Ruby: Defining a Range and Inserting Data

As I previously mentioned, calling the Value method on an Excel Range object returns an array of arrays of values. Here's a quick way to get a 2-dimensional array of all values from a worksheet...


data = worksheet.UsedRange.Value

But, as I failed to mention in that article, you can also insert a 2-dimensional array into a Range of cells:

worksheet.Range("A2:G7").Value = data

If you have a lot of data to insert, this will be much faster than inputting the data cell-by-cell. But it is important that the range of cells be the same size (number of rows, number of columns) as your 2d array.

Your Array: Number of Rows

Assuming you have data, a 2-dimensional array, you can determine the number of rows by calling data.size:

rows = data.size


Your Array: Number of Columns

You can determine the number of columns by calling size on one of the data array's rows:

cols = data[0].size


Defining Your Worksheet Range

But to use the Range method, you need to define the column letters (ie, "A2:G7"). You probably already know that you want your starting point to be cell A1 or A2. But how do you programmatically determine the final column letter for an array that is 104 columns wide?

Each cell (or row, or column) object has an Address method that returns its location in letter-number format, so...

worksheet.Cells(5, 3).Address

...returns "$C$5", and we can use that string in our upcoming Range definition. So if you have a range that is 30 columns wide by 100 rows in length, which we want to insert into a Range, starting at cell A1:

first = worksheet.Cells(1,1).Address # => "$A$1"
last = worksheet.Cells(100,30).Address # => "$AD$100"
worksheet.Range("#{first}:#{last}").Value = data

If your worksheet has a header row and you will be inserting your data after that, bump up the row value by 1:

first = worksheet.Cells(2,1).Address # => "$A$2"
last = worksheet.Cells(101,30).Address # => "$AD$101"
worksheet.Range("#{first}:#{last}").Value = data


I hope that some of you find this useful. My thanks goes to Charlie for suggesting the topic.

10 comments:

Mike Woodhouse said...

Alternatively, you could use the handy Range.Resize() method, which would give you something like

Worksheets.Cells(1,1).Resize(100,30)

...which is nice. I suppose one could fold everything into

target.Resize(data.size, data[0].size).Value = data

And thanks for reminding me that I could do more Excel automation with Ruby - I've been working in VBA for so long that I often forget.

Mike

Unknown said...

how can i insert a row to excel doc?
eg:
title1
1 test row
title2 < i want insert a row here
2 test row

i inspect many methods of the range and worksheet,but i can't work out,
please help me

David Mullet said...

@dsf-

In response to your question, I have posted a new article about inserting and deleting rows and columns.

-David

Unknown said...

thx a lot

Anonymous said...

Hello. You have some great articles here and I really appreciate you taking the time to post the information to help us out.

I have a quick question. I really would like to use this method to extract the data out of my excel files but unfortunately, it appears that it will only let me extract the value rather than the text.

I tried "data = worksheet.UsedRange.Text" but that doesn't seem to work. I also tried a few alternatives. Do you have any idea if this is possible and how so? Right now I am using the .each and it is taking forever (about 5 seconds per file...).

Thanks much and have a great day!

David Mullet said...

Thank you very much, Topher!

Microsoft's docs claim that the Range object has a read-only Text property. And the methos does exist, but always returns nil in my testing.

As you mentioned, iterating over rows and cells to access the Text value can be painfully slow.

I don't have much to offer on this, unfortunately. If the issue is formatting, there are possible methods for determining the format of the data (ie, decimal versus currency, versus percentage). You could, for example, query the NumberFormat property for each column:

for column in worksheet.UsedRange.Columns
formats << column.NumberFormat
end

Then use worksheet.UsedRange.Value to get your 2-d array quickly, and use the array or hash of column formats to format your data upon output.

Hope that helps!

~viper~ said...

I had a 1-dimensional array with 100 string elements, and attempted at first to copy what you had written

sheet.Range("A2:A101").Value = array

Your example was of a 2-dimensional array, and, for some reason, all those cells were filled with only the first element in the array. If, however, I first do something like

array[0]= [array[0],'test']

then somehow a pseudo-2d array is created, and the above command works. I had attempted it by chance, and I do not understand why this hack works

David Mullet said...

@viper:

The Range.Value= method expects a 2-dimensional array -- an array of arrays -- even if the Range object spans only a single column. So, as you discovered, you must take your data array into the Second Dimension, so to speak.

Another method for doing that is:

ws.Range("A1:A500").Value = data.collect!{|x| [*x]}

collect! iterates over an array. [*x] converts an object (x) to an array. It's the replacement for the soon-to-be-obsolete to_a method.

David

Unknown said...

How would one go about drawing borders on a range? It's the one thing I haven't quite been able to figure out. So say I want a box around A2:B4 or whatever...the closest I've come is to have the thing draw the complete grid for a given range and I need to be more selective than that!

Unknown said...

Well it figures as soon as I ask I figure it out. This code would draw a box around the range F2:H7

sheet.Range("F2:F7").Borders(1).Linestyle=1 sheet.Range("H2:H7").Borders(2).Linestyle=1
sheet.Range("F2:H2").Borders(3).Linestyle=1 sheet.Range("F7:H7").Borders(4).Linestyle=1