Looking for html -> text convertor and index generator

Discussion in 'HTML' started by Spartanicus, Oct 20, 2004.

  1. Spartanicus

    Spartanicus Guest

    Can anyone recommend a good *batch* html -> text convertor? I'm looking
    for something that generates formatted text that if possible reflects
    the original display to make it easier to redo the markup by using S&R.

    I'm also looking for a utility that can generate a list with links and
    target id's based on the header structure of a document.

    Windows executables, freeware preferred, command line utils are fine.

    --
    Spartanicus
    Spartanicus, Oct 20, 2004
    #1

  2. brucie

    brucie Guest

    In alt.html Spartanicus said:

    > Can anyone recommend a good *batch* html -> text convertor?


    the brain. unfortunately i lent mine out a few years ago and it never
    came back, but when i was using it it was far superior to any attempt
    at automating the process.

    start again from scratch (and charge double); you'll get a much better
    end product.

    --


    v o i c e s
    brucie, Oct 20, 2004
    #2

  3. Spartanicus

    Spartanicus Guest

    brucie <> wrote:

    >> Can anyone recommend a good *batch* html -> text convertor?

    >
    >start again from scratch


    I need the bare content to do that.

    --
    Spartanicus
    Spartanicus, Oct 20, 2004
    #3
  4. rf

    rf Guest

    "Spartanicus" <> wrote in message
    news:...
    > brucie <> wrote:
    >
    > >> Can anyone recommend a good *batch* html -> text convertor?

    > >
    > >start again from scratch

    >
    > I need the bare content to do that.


    ?

    Load it into Word and save it as text, then load it into Notepad and start
    fiddling.

    --
    Cheers
    Richard.
    rf, Oct 20, 2004
    #4
  5. rf

    rf Guest

    "rf" <rf@.invalid> wrote in message
    news:Osrdd.33371$...
    >
    > "Spartanicus" <> wrote in message
    > news:...
    > > brucie <> wrote:
    > >
    > > >> Can anyone recommend a good *batch* html -> text convertor?
    > > >
    > > >start again from scratch

    > >
    > > I need the bare content to do that.

    >
    > ?
    >
    > Load it into word and save it as text then load it into notepad and start
    > fiddling.


    Oops. Just saw the word "batch" in your OP.

    You can very easily craft, for example, a VB program to drive Word in a
    sort of "batch" mode.

    --
    Cheers
    Richard.
    rf, Oct 20, 2004
    #5
  6. Leif K-Brooks

    Spartanicus wrote:
    > Can anyone recommend a good *batch* html -> text convertor?


    You could try lynx's -dump option with a quick shell script for the
    batch part.
    Leif K-Brooks, Oct 20, 2004
    #6
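
A minimal sketch of the "lynx -dump plus a small batch script" idea, written in Python rather than shell. It assumes `lynx` is installed and on the PATH; the helper names `lynx_command` and `convert_directory` and the `.txt` output naming are hypothetical choices for illustration, not anything from the thread.

```python
import subprocess
from pathlib import Path


def lynx_command(html_file: Path) -> list[str]:
    """Build the argument list for one `lynx -dump` run."""
    # -dump writes the formatted page to stdout; -nolist drops the
    # numbered list of link targets that lynx otherwise appends.
    return ["lynx", "-dump", "-nolist", str(html_file)]


def convert_directory(src_dir: str, out_dir: str) -> list[Path]:
    """Convert every *.html file in src_dir to a .txt file in out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for html_file in sorted(Path(src_dir).glob("*.html")):
        # check=True raises CalledProcessError if lynx exits with an error.
        result = subprocess.run(
            lynx_command(html_file), capture_output=True, check=True
        )
        target = out / (html_file.stem + ".txt")
        target.write_bytes(result.stdout)
        written.append(target)
    return written
```

Calling `convert_directory("site", "site-text")` would then leave one text file per HTML page, ready for search-and-replace work.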
  7. David Dorward

    Spartanicus wrote:

    > I'm also looking for a utility that can generate a list with links and
    > target id's based on the header structure of a document.


    begin blatant-self-promotion:

    http://dorward.me.uk/software/tocbuilder/

    --
    David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
    Home is where the ~/.bashrc is
    David Dorward, Oct 20, 2004
    #7
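
For readers who want to see the general shape of such an index generator, here is a rough sketch using Python's standard `html.parser`. It is not how tocbuilder works (that is a Perl tool whose internals are not shown in this thread); `HeadingCollector`, `build_toc`, and the `toc-hN` class names are made up for the example. It emits a flat list with one class per heading level rather than nested lists, and it only links headings that already carry an `id`.

```python
from html import escape
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, id, text) for every h1-h6 element."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []    # finished (level, id, text) tuples
        self._current = None  # [level, id, text parts] while inside a heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._current = [int(tag[1]), dict(attrs).get("id"), []]

    def handle_data(self, data):
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._current is not None:
            level, id_, parts = self._current
            # Collapse runs of whitespace inside the heading text.
            text = " ".join("".join(parts).split())
            self.headings.append((level, id_, text))
            self._current = None


def build_toc(html: str) -> str:
    """Return a flat HTML list linking to every heading that has an id."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    lines = ["<ul>"]
    for level, id_, text in collector.headings:
        if id_ is None:
            continue  # nothing to link to
        link = f'<a href="#{escape(id_)}">{escape(text)}</a>'
        lines.append(f'{"  " * level}<li class="toc-h{level}">{link}</li>')
    lines.append("</ul>")
    return "\n".join(lines)
```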
  8. Spartanicus

    Spartanicus Guest

    David Dorward <> wrote:

    >> I'm also looking for a utility that can generate a list with links and
    >> target id's based on the header structure of a document.

    >
    >begin blatant-self-promotion:
    >
    >http://dorward.me.uk/software/tocbuilder/


    Pah, you've got all the text parsing abilities of Perl at your disposal,
    and you're expecting me to insert the header id's manually?

    I'm lazier than you think :)

    --
    Spartanicus
    Spartanicus, Oct 20, 2004
    #8
  9. Andy Dingley

    Andy Dingley Guest

    On Wed, 20 Oct 2004 11:30:03 +0100, Spartanicus <>
    wrote:

    >Can anyone recommend a good *batch* html -> text convertor?


    [complicated stuff it needs to do too]


    XSLT

    Oh, sorry, you wanted HTML conversion, not XHTML. This is just the
    sort of reason I'm always banging on about XHTML being a good thing
    for content authors who encounter this sort of task.


    For HTML I'd use Perl and your favourite HTML parsing module. But it's
    a thankless task.
    --
    Smert' spamionam
    Andy Dingley, Oct 21, 2004
    #9
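
The "HTML parsing module" route can be sketched as follows. This uses Python's standard `html.parser` instead of Perl, purely as a stand-in; `TextExtractor`, `html_to_text`, and the set of tags treated as block-level are assumptions made for the example, and real-world pages will need more care (tables, lists, preformatted text).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Gather text content, breaking lines at block-level elements."""

    BLOCK = {"p", "div", "br", "li", "tr", "ul", "ol", "table", "blockquote",
             "pre", "h1", "h2", "h3", "h4", "h5", "h6"}
    SKIP = {"script", "style"}  # content that is never displayed as text

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the visible text, one non-empty line per block."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    # Normalise whitespace within each line and drop empty lines.
    lines = (" ".join(line.split())
             for line in "".join(extractor.parts).splitlines())
    return "\n".join(line for line in lines if line)
```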
  10. David Dorward

    Spartanicus wrote:

    >>http://dorward.me.uk/software/tocbuilder/


    > Pah, you've got all the text parsing abilities of Perl at your disposal,
    > and you're expecting me to insert the header id's manually?


    Auto addition of ids on headings which don't have them already is a feature
    I've been planning to stick in the next release for a while. One year I
    might get round to making another release :)

    --
    David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
    Home is where the ~/.bashrc is
    David Dorward, Oct 21, 2004
    #10
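
As an illustration of what "auto addition of ids on headings which don't have them" could look like, here is a small Python sketch. It is not tocbuilder's implementation (that feature is only planned above); `slugify` and `add_heading_ids` are hypothetical names, and the regex-based matching is a deliberate simplification that assumes tidy markup with double-quoted `id` attributes and no nested headings.

```python
import re

# An h1-h6 element: tag name, its attribute string, and its inner markup.
HEADING = re.compile(r"<(h[1-6])((?:\s[^>]*)?)>(.*?)</\1>",
                     re.IGNORECASE | re.DOTALL)


def slugify(text: str, used: set) -> str:
    """Turn heading text into an id that is not already in `used`."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-") or "section"
    candidate, n = slug, 2
    while candidate in used:
        candidate = f"{slug}-{n}"
        n += 1
    used.add(candidate)
    return candidate


def add_heading_ids(html: str) -> str:
    """Add an id attribute to every heading that lacks one."""
    # Ids already present anywhere in the document must not be reused.
    used = set(re.findall(r'(?<![\w-])id="([^"]*)"', html))

    def fix(match):
        tag, attrs, inner = match.groups()
        if re.search(r"(?<![\w-])id\s*=", attrs, re.IGNORECASE):
            return match.group(0)  # heading already has an id
        text = re.sub(r"<[^>]+>", "", inner)  # drop inline markup
        return f'<{tag}{attrs} id="{slugify(text, used)}">{inner}</{tag}>'

    return HEADING.sub(fix, html)
```

Duplicate heading texts get a numeric suffix (`-2`, `-3`, ...), so the generated ids stay unique within the document.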
