John Carter
There have been a couple of threads on parsing HTML.
If you are looking for something excruciatingly simple (and simplistic) that
will parse the sloppiest of HTML pages, you could do worse than look at
LittleLexer.
http://littlelexer.rubyforge.org/
At a mere 44 non-blank, non-comment lines of Ruby, it has the
virtue of a certain elegant simplicity.
It includes a rudimentary HTML parser as an example.
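To give a feel for how little code this style of lexing needs, here is a minimal sketch of a regex-driven tokenizer for sloppy HTML. It is not LittleLexer's API or its bundled example; `RULES` and `tokenize_html` are names I made up for illustration, and it leans only on `StringScanner` from Ruby's standard library.

```ruby
require 'strscan'

# Hypothetical rule table (not taken from LittleLexer): each entry pairs a
# token kind with a regex. Rules are tried in order at the current position.
RULES = [
  [:comment,   /<!--.*?-->/m],
  [:close_tag, /<\/\s*[a-zA-Z][^>]*>/],
  [:open_tag,  /<[a-zA-Z][^>]*>/],
  [:text,      /[^<]+/],
  [:stray,     /</] # a lone '<' that starts no tag, so bad markup never stalls the loop
].freeze

# Split the source into [kind, text] pairs. One of the last two rules
# always matches, so every pass through the loop consumes input.
def tokenize_html(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    kind, _pattern = RULES.find { |_, pattern| scanner.scan(pattern) }
    tokens << [kind, scanner.matched]
  end
  tokens
end

tokenize_html("<p>Hi <b>there</p> a < b").each do |kind, text|
  puts "#{kind}: #{text.inspect}"
end
```

The point of the catch-all `:stray` rule is that unbalanced or malformed markup just falls through as junk tokens instead of raising, which is roughly what you want for the sloppiest pages.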
John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand
Note to all marketers. If you want to sell things to me, buy Google words.
I refuse on principle to buy anything sold by spam or popup and I
never follow any links found in a spam. I do however use Google and
will often follow the neat non-irritating Google Word ads that are of
interest to me.