PDF to HTML conversion program?

Discussion in 'HTML' started by Cliff R., Jan 31, 2004.

  1. Cliff R.

    Cliff R. Guest

    Hi, can anyone recommend a good program that converts PDF files to
    HTML? I've tried one called PDF to HTML Converter Pro, but the code it
    creates isn't what I'm looking for. I really just need it to convert
    to basic HTML, keeping bold, italics, paragraph breaks, etc., NOT
    absolutely positioned, styled text that reproduces every line break
    exactly. In this one, every single line has this sort of code at the
    beginning: <div
    id="_506:9699" style="position:absolute;top:9699;left:506"><span
    id="_11" style="font-size:11px;font-family:Helvetica;color=#000000">
    etc., so the code is huge and unnecessarily complicated.

    Any ideas of what to use to create clean, basic HTML from mostly
    text-based PDFs?

    Thanks.
     
    Cliff R., Jan 31, 2004
    #1

  2. Cliff R. wrote:
    > Hi, can anyone recommend a good program that converts PDF files to
    > HTML?


    rm -f *.pdf
    nano foo.html
     
    Leif K-Brooks, Jan 31, 2004
    #2

  3. Terry

    Terry Guest

    Leif K-Brooks wrote:

    > Cliff R. wrote:
    >
    >> Hi, can anyone recommend a good program that converts PDF files to
    >> HTML?

    >
    >
    > rm -f *.pdf
    > nano foo.html
    >


    tsk... and he asked so politely too!
     
    Terry, Jan 31, 2004
    #3
  4. Cliff R. wrote:

    > Any ideas of what to use to create clean, basic HTML from mostly
    > text-based PDFs?


    I dunno about that, but I can go one step better. Ghostscript includes a
    tool, "ps2ascii", that can convert PDF and PostScript files to plain text.
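
    As a rough illustration of how plain text (for example, the output of
    something like "ps2ascii input.pdf output.txt") could be turned into the
    basic HTML asked for above, here is a minimal Python sketch. The function
    name text_to_basic_html is made up for this example, and it assumes
    paragraphs are separated by blank lines. Bold and italics do not survive
    the trip through plain text, so this only recovers paragraph breaks.

```python
import html
import re


def text_to_basic_html(text: str) -> str:
    """Wrap blank-line-separated paragraphs of plain text in <p> tags.

    Hypothetical helper: assumes a blank line marks a paragraph break.
    """
    # Split on one or more blank (whitespace-only) lines.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Join hard-wrapped lines so the browser can reflow the paragraph.
        joined = " ".join(
            line.strip() for line in block.splitlines() if line.strip()
        )
        if joined:
            # Escape <, > and & so the text cannot be read as markup.
            paragraphs.append("<p>" + html.escape(joined) + "</p>")
    return "\n".join(paragraphs)
```

    Calling text_to_basic_html(open("output.txt").read()) would give one <p>
    element per paragraph, with none of the positioning or font styling.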

    --
    Toby A Inkster BSc (Hons) ARCS
    Contact Me - http://www.goddamn.co.uk/tobyink/?page=132
     
    Toby A Inkster, Jan 31, 2004
    #4
  5. Terry wrote:
    >>> Hi, can anyone recommend a good program that converts PDF files to
    >>> HTML?

    >> rm -f *.pdf
    >> nano foo.html

    > tsk... and he asked so politely too!


    It's what I would do. PDF is a (mostly?) presentational format; HTML is
    structural. Anything short of true AI won't be able to convert one to
    the other well.
     
    Leif K-Brooks, Feb 1, 2004
    #5
