Win32OLE + DRb - Windows = Fun

Keith Fahlgren

Hey,

The following is a message posted to the Boston Ruby Group mailing list
after our second meeting last Tuesday.

[ANNOUNCE]
Boston folks that weren't at the meeting, check out
http://boston.rubygroup.org/boston/show/HomePage for more info.
[/ANNOUNCE]

===========================================

It sounds like quite a few of us have already discovered how nice DRb is
for helping less-able machines do more interesting things. I've used
it twice: once because I couldn't make SSL connections with Net::HTTP
(net/https) on Solaris 5.6, and once when I wanted to be able to use
Word from *nix.

My Word solution consists of a few small parts in different places: a
command-line program for human use, a little module that encapsulates
what I wanted to do with Word, and a DRb server running on the Windows
box.
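
To make the moving parts concrete, here is a minimal sketch of the DRb round trip that runs on any platform. FakeConverter is a hypothetical stand-in for the Word wrapper, and both ends live in one process for brevity, so DRb may skip the network entirely; in the real setup the two halves sit on different machines.

```ruby
require 'drb/drb'

# Hypothetical stand-in for the Word wrapper; no Windows or Word needed.
class FakeConverter
  def wdtowml(file)
    file.sub(/doc$/, "wml")
  end
end

# "Server" side: publish the object. Port 0 asks the OS for any free port.
server = DRb.start_service("druby://127.0.0.1:0", FakeConverter.new)

# "Client" side: a proxy for the published object. In a real setup this
# would be created in another process, usually on another machine.
remote = DRbObject.new_with_uri(server.uri)
result = remote.wdtowml('R:\docs\report.doc')
puts result

server.stop_service
```

The client never loads FakeConverter's code in the two-machine case; it only needs the URI and the method names.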

Note: The following code is not intended to be a secure, elegant
solution ready for production deployment--clean this up if you want to
use it for real.

Requirements: A Windows box with Word, on which you can run Ruby and
which the calling boxen can reach over the network. The Windows box and
the calling boxen should share some drive that you know how to get to.
Mine is just an NFS-mounted drive. Some understanding of the Word object
model is extremely useful.


This example converts Word (.doc) files into WordprocessingML files
(.xml, though I call them .wml), the XML file format in Word 2003.


Command line tool to convert on *nix:
===============================================
#!/usr/bin/env ruby
require 'drb/drb'
PORT = 2774   # Some open port
HOSTNAME = 'foo.bar.com'   # IP of Windows box
DRb.start_service

# Connect to the Windows box
drb = DRbObject.new(nil, "druby://#{HOSTNAME}:#{PORT}")

# Ask it to make sure Word is running
word = drb.start_word

ARGV.each {|f|
  # inelegant way of converting my *nix paths to something the
  # Windows box liked
  unix_filename = File.expand_path(f)
  win_filename = unix_filename.gsub(/\//, "\\")
  win_filename.sub!(/^\\work/, "R:")

  # Call the transformation, macro, whatever
  resp = drb.wdtowml(win_filename)
  puts "Converted to WML file: #{resp}"
}
drb.quit
===============================================
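
The path conversion in the client loop above could be pulled into a small helper so it can be tested on its own. This is only a sketch: to_windows_path and its mount/drive parameters are hypothetical names, with defaults that mirror the /work to R: mapping in the original. Unlike the original regex, it refuses paths that merely start with the same letters (say, /workshop).

```ruby
# Hypothetical helper: map a path under a Unix mount point to the
# Windows drive letter that exposes the same share.
def to_windows_path(unix_path, mount: "/work", drive: "R:")
  unless unix_path == mount || unix_path.start_with?(mount + "/")
    raise ArgumentError, "#{unix_path} is not under #{mount}"
  end
  # Swap the mount prefix for the drive letter, then flip the separators.
  # The block form of gsub sidesteps the backslash-escaping rules that
  # apply to replacement strings.
  (drive + unix_path[mount.length..-1]).gsub("/") { "\\" }
end

puts to_windows_path("/work/docs/report.doc")
```

Failing loudly on a path outside the share is a design choice; the original simply passes such paths through with the slashes flipped.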

My server for the Windows box:
===============================================
require 'drb'
require 'thread'
require 'drb/acl'
require 'wordhelper' # the module that does the work

PORT = 2774
HOSTNAME = 'foo.bar.com'

# Security?
acl = ACL.new(%w(deny all
                 allow localhost
                 allow zoo.bar.com
                 allow goo.bar.com)) # Some set of boxen you like
DRb.install_acl(acl)

# Let people talk to me, bind me to the Word module
DRb.start_service("druby://#{HOSTNAME}:#{PORT}", WordHelper::Word.new)

# Keep running
DRb.thread.join
===============================================
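
If you want to see what an ACL like the one in the server will accept before exposing Word to the network, it can be queried directly without starting a server. The sketch below uses an IP address instead of hostnames so no name lookup is involved; both peer addresses are made up for illustration.

```ruby
require 'drb/acl'

# Same shape as the server's list. With the default deny/allow order,
# an address matching an "allow" entry is accepted, and anything else
# is caught by "deny all".
acl = ACL.new(%w(deny all
                 allow 127.0.0.1))

# Addresses use the Socket#peeraddr layout: [family, port, hostname, ip].
local_peer  = ["AF_INET", 50000, "localhost", "127.0.0.1"]
remote_peer = ["AF_INET", 50000, "stranger.example", "192.0.2.10"]

puts acl.allow_addr?(local_peer)
puts acl.allow_addr?(remote_peer)
```

An ACL only filters by address; it does not authenticate or encrypt anything, which is one reason the note about production use still applies.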

The WordHelper module, where the work is done:
===============================================
module WordHelper
  class Word
    require 'win32ole'

    WORD_HTML = 8  # Ugly, don't use
    WORD_XML = 11  # Much nicer, you should use this
    WORD_95 = 106  # Help old programs
    WORD_DOC = 0  # The regular filetype

    attr_reader :wd, :wrd

    def start_word
      @wd = WIN32OLE.new('Word.Application')
      # Win32OLE sometimes barfs, so try to start Word
      # in two ways
      begin
        @wrd = WIN32OLE.connect('Word.Application')
      rescue WIN32OLERuntimeError
        @wrd = WIN32OLE.new('Word.Application')
      end

      # Set this to 0 if you want to run invisibly
      # Be warned: you'll end up with a lot of zombie Word
      # processes if you're not careful
      @wd.Visible = 1
      return @wd, @wrd
    end

    # Word to WordprocessingML (xml)
    def wdtowml(file)
      begin
        # Expect a proper Windows-ready filename
        doc = @wd.Documents.Open(file)
        new_filename = file.sub(/doc$/, "wml")
        doc.SaveAs(new_filename, WORD_XML)
        doc.Close()
        return new_filename
      rescue
        # Just fail blindly on errors
        @wd.Quit()
        raise "Word encountered an unknown error and crashed."
      end
    end

    # Almost the same method, just as an example
    def wdtohtml(file)
      begin
        # Expect a proper Windows-ready filename
        doc = @wd.Documents.Open(file)
        new_filename = file.sub(/doc$/, "html")
        doc.SaveAs(new_filename, WORD_HTML)
        doc.Close()
        return new_filename
      rescue
        @wd.Quit()
        raise "Word encountered an unknown error and crashed."
      end
    end

    def quit
      @wd.Quit()
    end
  end
end # of WordHelper Module
===============================================
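
Since wdtowml and wdtohtml differ only in extension and format code, one possible cleanup is a single method driven by a lookup table. The sketch below is not the module above: Converter, FakeApp and FakeDoc are hypothetical names, the fakes exist only so it runs without Word, and it closes the document on failure instead of quitting Word.

```ruby
# Sketch of one conversion method shared by every output format.
# `app` is anything that behaves like Word's Application object; on
# Windows that would be WIN32OLE.new('Word.Application').
class Converter
  FORMATS = { "wml" => 11, "html" => 8 }  # same values as WORD_XML, WORD_HTML

  def initialize(app)
    @app = app
  end

  def convert(file, ext)
    format = FORMATS.fetch(ext)            # KeyError on an unknown extension
    doc = @app.Documents.Open(file)
    begin
      new_filename = file.sub(/\.doc\z/i, ".#{ext}")
      doc.SaveAs(new_filename, format)
      new_filename
    ensure
      doc.Close()                          # close even if SaveAs raises
    end
  end
end

# Hypothetical stand-ins so the sketch runs without Word installed.
class FakeDoc
  attr_reader :saved, :closed
  def SaveAs(name, format); @saved = [name, format]; end
  def Close; @closed = true; end
end

class FakeApp
  attr_reader :doc
  def initialize; @doc = FakeDoc.new; end
  def Documents; self; end
  def Open(_file); @doc; end
end

app = FakeApp.new
puts Converter.new(app).convert('R:\docs\report.doc', "wml")
```

Passing the application object in, rather than creating it inside the class, is what makes the fake possible; the DRb server would hand it the real WIN32OLE object.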


Another example with the use of macros or the Ruby equivalent:

Now, if you know that your Word instance will always have a set of
macros (from a template, say), you can call them thusly:
===============================================
    def wdrunmacro(file, macro)
      begin
        # Expect a proper Windows-ready filename
        doc = @wd.Documents.Open(file)
        @wrd.Run("TheMacroIAlwaysRun", doc)
        @wrd.Run(macro, doc)  # the macro name passed in
        doc.Save()
        doc.Close()
        # Saved in place, so hand back the name we were given
        return file
      rescue
        @wd.Quit()
        raise "Word encountered an unknown error and crashed."
      end
    end
===============================================

The above suffers from relying on macros being available whenever the
method is called. With a little work, you should be able to translate
your VBA macros into Ruby code, callable from anywhere.

Here's a stupid example that checks the first character of Body
paragraphs following Heading 1 paragraphs for weirdness, deletes that
first character, removes all the character formatting from the Body
paragraph, and styles the paragraph Heading 2 (I said it was stupid...).

Note: This is written in a very VBAish way, which may or may not be good
for you, since it's a pretty direct mapping.
===============================================
def deletestupid(doc)
  doc.Paragraphs.each do |para|
    if para.Style.NameLocal.match(/Heading\s?1/)
      p = para.Next  # Won't work if this is the last para
      r = p.Range()  # So you can talk about characters
      if p.Style.NameLocal.match(/Body/)
        unless r.Characters.First.Text =~ /[ A-Za-z0-9]/
          # could also be r.Characters(1).Delete()
          r.Characters.First.Delete()

          # Blast away character formatting
          p.Range.Font.Reset()

          # Apply a new paragraph style
          p.Style = doc.Styles("Heading 2")
        end
      end
    end
  end
end
===============================================

I'd love to hear your thoughts, comments, and corrections on the above.

Also here: http://kfahlgren.com/blog/?p=12

HTH,
Keith
 
chiaro scuro

This is interesting stuff. I have recently used similar techniques to
execute excel workbooks in parallel on a computer grid.

Hi!

Interesting.
Thanks.

MCI


--

-- Chiaroscuro --
Liquid Development Blog:
http://feeds.feedburner.com/blogspot/liquiddevelopment
