Need a Module Similar to lynx in Perl

Market Mutant · Jan 18, 2004

I used to just decode the HTML to get what I want, but my current project's
download has a table. I looked at it in Lynx and it is super simple, but when
I open the file as HTML, I get a headache. I know I can call lynx -dump from
Perl, but I need something that I can run on both Windows and Linux
without using lynx. Is there any module that can produce output exactly like
lynx's dump?

Walter Roberson · Jan 18, 2004

:I used to just decode the HTML to get what I want, but my current project's
:download has a table. I looked at it in Lynx and it is super simple, but when
:I open the file as HTML, I get a headache. I know I can call lynx -dump from
:Perl, but I need something that I can run on both Windows and Linux
:without using lynx. Is there any module that can produce output exactly like
:lynx's dump?

You probably want to use the LWP module.

Chris · Jan 18, 2004

Market said:
I used to just decode the HTML to get what I want, but my current project's
download has a table. I looked at it in Lynx and it is super simple, but when
I open the file as HTML, I get a headache. I know I can call lynx -dump from
Perl, but I need something that I can run on both Windows and Linux
without using lynx. Is there any module that can produce output exactly like
lynx's dump?

Ew, I totally agree that Lynx makes for some very fine (and simple) web
scraping. When I need that power from both Windows and *nix, I write it
as a Web service (using XML-RPC) and call it from either platform.
Works wonderfully well. This also provides a consistent call interface
and centralizes my code in one location.

Chris

Ben Morrow · Jan 18, 2004

Market Mutant said:
I used to just decode the HTML to get what I want, but my current project's
download has a table. I looked at it in Lynx and it is super simple, but when
I open the file as HTML, I get a headache. I know I can call lynx -dump from
Perl, but I need something that I can run on both Windows and Linux
without using lynx. Is there any module that can produce output exactly like
lynx's dump?

perldoc -q html
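
If I remember right, that FAQ entry ("How do I remove HTML from a string?") points at HTML::Parser and HTML::FormatText. A minimal sketch of the FormatText route, assuming HTML::TreeBuilder (HTML-Tree) and HTML::FormatText (HTML-Format) are installed from CPAN. The helper name html_to_text and the sample markup are made up for illustration, and as far as I know HTML::FormatText does not lay out tables, so do not expect lynx -dump output for tabular pages:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;    # CPAN: HTML-Tree
use HTML::FormatText;     # CPAN: HTML-Format

# html_to_text is a hypothetical helper name, not part of either module.
sub html_to_text {
    my ($html) = @_;

    # Build a parse tree from the HTML string.
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Render the tree as plain text wrapped between the given margins.
    my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 72);
    my $text      = $formatter->format($tree);

    $tree->delete;        # release the parse tree
    return $text;
}

# Made-up sample input.
print html_to_text('<html><body><h1>Title</h1><p>Some <b>bold</b> text.</p></body></html>');
```

Both modules are pure Perl as far as I know, so the same script should run on Windows and Linux without lynx.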

Ben

James Willmore · Jan 18, 2004

Chris said:
Ew, I totally agree that Lynx makes for some very fine (and simple) web
scraping. When I need that power from both Windows and *nix, I write it
as a Web service (using XML-RPC) and call it from either platform.
Works wonderfully well. This also provides a consistent call interface
and centralizes my code in one location.

Or ... how about just using the LWP module? The OP just wants to get HTML
from a page.
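
Something along these lines, as a rough sketch only. It assumes LWP (libwww-perl) is installed. The helper name fetch_html is made up, and the URL is whatever you pass on the command line:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;    # CPAN: libwww-perl

# fetch_html is a hypothetical helper name, not part of LWP.
sub fetch_html {
    my ($url) = @_;

    my $ua       = LWP::UserAgent->new(timeout => 30);
    my $response = $ua->get($url);

    # Stop with the HTTP status line if the request did not succeed.
    die "GET $url failed: " . $response->status_line . "\n"
        unless $response->is_success;

    # decoded_content undoes any Content-Encoding and charset encoding.
    return $response->decoded_content;
}

# Usage: perl fetch.pl URL
my $url = shift @ARGV;
print fetch_html($url) if defined $url;
```

This only gets you the raw HTML. Turning it into lynx-style text is a separate step.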

And ... I bet if the OP used Google .... he would have found this to be
the question of the week

--
Jim

Copyright notice: all code written by the author in this post is
released under the GPL. http://www.gnu.org/licenses/gpl.txt
for more information.

a fortune quote ...
Never tell a lie unless it is absolutely convenient.

Market Mutant · Jan 19, 2004

I want something lynx-like, because I have a table to deal with.
HTML::FormatText needs too many other modules,
and there is no good html->text shit in Perl yet.

I am writing all the code myself just for this project. I hope I can find
something generic for later projects. This really sucks: I have to
write separate code using s/// and split for every piece of HTML I want
converted to text.

James Willmore · Jan 19, 2004

Market Mutant said:
I want something lynx-like, because I have a table to deal with.
HTML::FormatText needs too many other modules,
and there is no good html->text shit in Perl yet.

I am writing all the code myself just for this project. I hope I can find
something generic for later projects. This really sucks: I have to
write separate code using s/// and split for every piece of HTML I want
converted to text.

Well .... if you need to parse the HTML, why not journey to your local
neighborhood CPAN and look over the *many* HTML parsing modules
(http://search.cpan.org/ and search for HTML). I believe there is one that
handles HTML tables. You *could* also use Google and search for the *many*
posts on this subject in this newsgroup.
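
The one I am thinking of is probably HTML::TableExtract. A minimal sketch, assuming that module is installed from CPAN. The helper name table_rows and the sample table are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;    # CPAN

# table_rows is a hypothetical helper name, not part of the module.
# It returns one array reference per table row, across all tables.
sub table_rows {
    my ($html) = @_;

    # No constraints given, so every table in the document is matched.
    my $te = HTML::TableExtract->new();
    $te->parse($html);
    $te->eof;

    my @rows;
    for my $table ($te->tables) {
        for my $row ($table->rows) {
            # Empty cells can come back undefined; normalise them to ''.
            push @rows, [ map { defined $_ ? $_ : '' } @$row ];
        }
    }
    return @rows;
}

# Made-up sample table.
my $html = <<'HTML';
<table>
<tr><th>Symbol</th><th>Price</th></tr>
<tr><td>ABC</td><td>12.50</td></tr>
</table>
HTML

# Print each row as tab-separated text.
print join("\t", @$_), "\n" for table_rows($html);
```

Once the cells are in arrays, the s/// and split gymnastics should not be needed for the table part.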

HTH

--
Jim

a fortune quote ...
Bizarreness is the essence of the exotic

Joe Smith · Jan 20, 2004

Market said:
and there is no good html->text shit in perl yet.

For just text, it is straightforward.

#!/usr/bin/perl -w
# Name: nohtml Author: (e-mail address removed) 07-Nov-2001
# Purpose: Extracts just the text portions of a document.

use strict;
use HTML::Parser ();

sub text_handler { # Ordinary text
    print @_;
}

my $p = HTML::Parser->new(api_version => 3);
$p->handler( text => \&text_handler, "dtext");
$p->parse_file(shift || "-") || die $!;

1;
