How to read a PDF file using ActivePerl?

Discussion in 'Perl Misc' started by johny, Aug 28, 2006.

  1. johny

    johny Guest

    Hi,
    I am trying to read a PDF file using ActivePerl. I tried PDF::API2,
    but without success. For example, I need to get the text that is on
    the third line of the first page.

    Or

    is there a way to save the PDF file as a .txt file and then read
    that file?
    Please help.

    Thanks,
    AJ
     
    johny, Aug 28, 2006
    #1

  2. David Squire

    David Squire Guest

    johny wrote:
    > Hi,
    > I am trying to read a PDF file using ActivePerl. I tried PDF::API2,
    > but without success. For example, I need to get the text that is on
    > the third line of the first page.
    >
    > Or
    >
    > is there a way to save the PDF file as a .txt file and then read
    > that file?
    > Please help.


    Do you need to use Perl? The command-line utility pdftotext is
    available on most UNIX-like systems (and no doubt under Cygwin).

    You need to be aware that there is no guarantee that you can get text
    out of a PDF document. The PDF standard allows arbitrary encodings to be
    used, so you would have to know what the glyph names mean to reconstruct
    the text. In some cases the glyph names are not meaningful. See
    http://www.glyphandcog.com/textext.html

    That being said, pdftotext works in the great majority of cases.


    DS
     
    David Squire, Aug 28, 2006
    #2
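
    For readers who still want to drive this from Perl, the pdftotext
    suggestion above can be sketched roughly as follows. This is a minimal
    sketch, not a definitive solution: it assumes pdftotext (from Xpdf or
    Poppler) is installed and on the PATH, and the helper names page_text
    and nth_line are made up for illustration. What counts as the "third
    line" depends on how pdftotext lays out the page, so the result may not
    match what you see on screen.

```perl
use strict;
use warnings;

# Return line $n (1-based) of $text, or undef if the text has fewer lines.
# Hypothetical helper; handles both Unix and Windows line endings.
sub nth_line {
    my ($text, $n) = @_;
    my @lines = split /\r?\n/, $text;
    return $n >= 1 && $n <= @lines ? $lines[$n - 1] : undef;
}

# Run pdftotext on a single page and return the extracted text.
# Assumes pdftotext is on the PATH. "-f"/"-l" select the first and last
# page, "-layout" keeps the physical layout, and the trailing "-" writes
# the text to standard output instead of a .txt file.
sub page_text {
    my ($pdf, $page) = @_;
    # The file name is only wrapped in double quotes, so names that
    # themselves contain quotes are not handled by this sketch.
    my $text = `pdftotext -f $page -l $page -layout "$pdf" -`;
    die "pdftotext failed (exit status $?)\n" if $? != 0;
    return $text;
}

# Usage: perl thirdline.pl some.pdf
if (@ARGV) {
    my $line = nth_line(page_text($ARGV[0], 1), 3);
    print defined $line ? "$line\n" : "Page 1 has fewer than three lines\n";
}
```

    To save the text to a file instead, as the original question asks,
    running "pdftotext some.pdf some.txt" on the command line and then
    reading some.txt with an ordinary open() should work as well.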