How to convert MS doc to plain text using Perl on unix

Discussion in 'Perl Misc' started by Diandian Zhang, Jan 14, 2005.

  1. Does anyone have an idea, how to do this? Thanks!
     
    Diandian Zhang, Jan 14, 2005
    #1

  2. (Diandian Zhang) writes:
    > Does anyone have an idea, how to do this? Thanks!


    There are some modules on CPAN that deal with RTF (Rich Text Format).
    Maybe they are useful.
     
    Arndt Jonasson, Jan 14, 2005
    #2

  3. "Diandian Zhang" <> wrote in message
    news:...
    > Does anyone have an idea, how to do this? Thanks!


    Save as a text file in Open Office :)
     
    Tintin, Jan 14, 2005
    #3
  4. Diandian Zhang wrote:
    > Does anyone have an idea, how to do this? Thanks!


    If you're on Windows and have Word, use Win32::OLE.
     
    Stephen Patterson, Jan 15, 2005
    #4
  5. Stephen Patterson <> wrote in news:34t8d9F4ebb01U1@individual.net:

    > Diandian Zhang wrote:
    >> Does anyone have an idea, how to do this? Thanks!

    >
    > If you're on windows and have word, Win32::OLE


    Once again, the perils of putting your entire question in the subject
    line are demonstrated.

    The OP needs this on Unix.

    One alternative is to take a look at word2x (google for it).

    On the other hand, if all one wants to do is, say, index the contents
    of a Word file, the following would work to a certain extent:

    #! /usr/bin/perl

    use strict;
    use warnings;

    use File::Slurp;

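    # Path to the Word file, taken from the first command-line argument.
    my $word_file = shift or die "Usage: $0 file.doc\n";
    my $doc = read_file($word_file, binmode => ':raw');

    # Delete every byte except CR, LF, TAB, and printable ASCII (space to ~).
    $doc =~ s/[^\015\012\011\040-\176]//g;
    write_file(\*STDOUT, $doc);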

    __END__

    Sinan
     
    A. Sinan Unur, Jan 15, 2005
    #5
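
    For indexing, the byte-stripping idea in the post above can be taken one
    step further, in the spirit of strings(1): keep only runs of printable
    characters that are long enough to look like words, so stray bytes from
    the binary parts of the file do not end up in the index. What follows is
    only a rough sketch using core Perl. The helper name printable_runs and
    the default minimum length of 4 are made up for illustration, and the
    NUL-dropping step assumes that any UTF-16LE text in the file is plain
    ASCII interleaved with NUL bytes. It does not parse the Word format.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the runs of printable
# ASCII that are at least $min_len characters long, similar to strings(1).
sub printable_runs {
    my ($bytes, $min_len) = @_;
    $min_len = 4 unless defined $min_len;

    # Assumption: UTF-16LE text appears as ASCII bytes interleaved with
    # NULs, so deleting the NULs first rejoins those characters.
    $bytes =~ tr/\000//d;

    # Collect every run of TAB or printable ASCII that is long enough.
    my @runs = $bytes =~ /([\011\040-\176]{$min_len,})/g;
    return @runs;
}

# Run as a script only when a file name is given on the command line.
if (@ARGV) {
    my $word_file = shift @ARGV;
    open my $fh, '<:raw', $word_file
        or die "Cannot open $word_file: $!";
    my $doc = do { local $/; <$fh> };   # slurp the whole file as raw bytes
    close $fh;
    print "$_\n" for printable_runs($doc);
}
```

    Saved under a name of your choosing and run with the path to a .doc file
    as its only argument, it prints one run per line. Raising the minimum
    length cuts more noise but also drops short words.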