Back to contents PHP Python Ruby Choose a language:

Microsoft Excel spreadsheet files can contain lots of data in cells, formulae, notes and so on.

For more details, read the spreadsheet gem documentation.

Opening the file

Download the Excel file, and load it into the Spreadsheet class. This example is taken from an FOI request for tram accidents. require 'spreadsheet' require 'open-uri' url = "http://www.whatdotheyknow.com/request/82804/response/208592/attach/2/ACCIDENTS%20TRAMS%20Laurderdale.xls" book = nil open url do |f| book = Spreadsheet.open f end

Directly select the worksheet you want by number or name.

sheet = book.worksheet 0 sheet = book.worksheet 'LAUDERDALE AVE'

If you're not sure what sheets there are, this prints them out.

book.worksheets.each do |sheet| puts "Sheet called #{sheet.name} has #{sheet.row_count} rows and #{sheet.column_count} columns" end

Reading values

Read the values of a row in one go.

row = sheet.row(4) p row

Read individual cells like this.

p row[0]

Saving to the datastore

Work out sensible heading names. In this case we remove the "." from "FLEET NO.".

keys = sheet.row(2) keys[1] = keys[1].gsub('.', '') p keys

Make a hash from the rows and save them.

sheet.each_with_index do |row, rownumber| # create dictionary of the row values data = {} row.each_index do |i| data[keys[i]] = row[i] end data['rownumber'] = rownumber # remove the empty column (which has a blank heading) data.delete(nil) # only save if it is a full row (rather than a blank line or a note) if data['DATE'] != 'DATE' and data['DATE'] != nil and data['FLEET NO'] != nil ScraperWiki::save_sqlite(unique_keys=['rownumber'], data) end end