Scan Microsoft Office files

Discussion in 'Perl Misc' started by Will Fawcett, Sep 3, 2004.

  1. Will Fawcett

    Will Fawcett Guest

    I am trying to put together a script that will allow me to scan
    Microsoft Office files and store "keywords" for those files so they
    are searchable by content not just title.

    If you open a Word file with Perl and look at the actual source, it is
    basically a text file with a bunch of bogus code. I was hoping someone
    here might have heard of a module that can strip the ambiguous code
    out and just store the plain-text words. Or is RegEx my
    only option?

    -Will
    Will Fawcett, Sep 3, 2004
    #1

  2. wfsp

    wfsp Guest

    "Will Fawcett" <> wrote in message
    news:...
    >I am trying to put together a script that will allow me to scan
    > Microsoft Office files and store "keywords" for those files so they
    > are searchable by content not just title.
    >
    > If you open a Word file with Perl and look at the actual source, it is
    > basically a text file with a bunch of bogus code. I was hoping someone
    > here might have heard of a module that can strip the ambiguous code
    > out and just store the plain-text words. Or is RegEx my
    > only option?
    >
    > -Will


    An example:

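    #!/bin/perl5
    use strict;
    use warnings;
    use Win32::OLE qw(in);    # import in() to iterate over OLE collections

    # Attach to an already running Word instance.
    my $w = Win32::OLE->GetActiveObject('Word.Application')
        or die "Word is not running\n";
    my $d = $w->ActiveDocument
        or die "No document is open in Word\n";
    my $paras = $d->Paragraphs;

    # Print each paragraph's style name and text, tab-separated.
    foreach my $para ( in $paras ) {
        my $style = $para->Style->{NameLocal};
        my $text  = $para->Range->{Text};
        print "$style\t$text\n";
    }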

    This assumes Word is running and has a document open. The VBA help files
    document all the methods and properties. A search for Win32::OLE will
    bring up many tutorials and references.
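
    Once the paragraph text is out of Word, turning it into "keywords" is
    plain Perl. What follows is only a rough sketch of one way to do it,
    not a tested indexing scheme: the keywords() function and the %stop
    list are hypothetical names made up for this illustration, and the
    stop-word list is a tiny placeholder you would want to extend.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stop-word list (an assumption); extend it for real documents.
my %stop = map { $_ => 1 } qw(a an and are by for in is it of or the to with);

# Count the words in a chunk of text, ignoring case, one-letter words
# and anything in the stop-word list. Returns a hash reference.
sub keywords {
    my ($text) = @_;
    my %count;
    my $lower = lc $text;
    for my $word ( $lower =~ /[a-z][a-z']+/g ) {
        next if $stop{$word};
        $count{$word}++;
    }
    return \%count;
}

my $counts = keywords("The quarterly report covers the budget and the budget forecast.");

# List the most frequent words first, ties broken alphabetically.
for my $word ( sort { $counts->{$b} <=> $counts->{$a} || $a cmp $b } keys %$counts ) {
    print "$word\t$counts->{$word}\n";
}
```

    In the Win32::OLE loop you would pass each $para->Range->{Text} value
    to keywords() and add the counts together before storing them
    alongside the file name.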
    wfsp, Sep 3, 2004
    #2