Goh, Yong Kwang
Hi.
I copied the code fragment from
http://www.wellho.net/solutions/1480965085.html to extract text from a
Microsoft Word document.
---
use Win32::OLE;
use Win32::OLE::Enum;
print "Opening $ARGV[1]\n";
$document = Win32::OLE->GetObject("$ARGV[1]");
if(!defined $document){
    die "Document still not defined!";
}
print "Opening $ARGV[0]\n";
open (FH,">$ARGV[0]");
print "Extracting Text ...\n";
$paragraphs = $document->Paragraphs();
$enumerate = new Win32::OLE::Enum($paragraphs);
while(defined($paragraph = $enumerate->Next()))
{
    $style = $paragraph->{Style}->{NameLocal};
    print FH "+$style\n";
    $text = $paragraph->{Range}->{Text};
    $text =~ s/[\n\r]//g;
    $text =~ s/\x0b/\n/g;
    print FH "=$text\n";
}
---
When I first ran it, the program kept crashing because $document was
undefined when the Paragraphs method was called.
So I added a check that $document is defined before calling Paragraphs.
But when I run it as

perl doc2txt.pl test.txt test.doc

the Win32::OLE->GetObject method always fails to return a proper handle
to the Word document, so $document remains undefined and the script
cannot go on to call Paragraphs.
Why is this so?
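
One way to narrow this down, offered only as a sketch and not a confirmed fix: GetObject is commonly reported to need a full path and not a bare file name such as test.doc, and Win32::OLE->LastError() should say why the call failed. The helper names absolute_doc_path and open_word_document below are hypothetical, and the guess that the relative path is the cause is an assumption that has not been checked against this setup.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: turn a possibly relative file name into an
# absolute path, resolved against the current working directory.
sub absolute_doc_path {
    my ($name) = @_;
    return File::Spec->rel2abs($name);
}

# Hypothetical helper: open the document via GetObject, or die with
# whatever error Win32::OLE recorded for the failed call.
sub open_word_document {
    my ($name) = @_;
    require Win32::OLE;    # loaded at run time; only available on Windows
    my $path     = absolute_doc_path($name);
    my $document = Win32::OLE->GetObject($path);
    die "Cannot open '$path': " . Win32::OLE->LastError() . "\n"
        unless defined $document;
    return $document;
}

# Only runs when called as a script with the same argument order as
# above: output text file first, Word document second.
if (!caller() && @ARGV >= 2) {
    my $document = open_word_document($ARGV[1]);
    print "Opened ", $document->{Name}, "\n";
}
```

If the error message still appears with an absolute path, the text from LastError() is the next thing to look at, since it distinguishes a missing file from Word not being registered or available on the machine.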