[perl-python] get web page programmatically

Xah Lee

# -*- coding: utf-8 -*-
# Python

# Suppose you want to fetch a web page.
from urllib import urlopen
print urlopen('http://xahlee.org/Periodic_dosage_dir/_p2/russell-lecture.html').read()

# note the line
# from <library_name> import <function_name1,function_name2...>
# it reads the library and imports the function names
# to see available functions in a module one can use "dir"
# import urllib; print dir(urllib)

# for more about this module import syntax, see
# http://python.org/doc/tut/node8.html
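
The snippet above uses Python 2 syntax. On Python 3, urlopen lives in urllib.request and print is a function. A minimal sketch of the same fetch, assuming Python 3; fetch_page is a hypothetical helper name, and the 10-second timeout and the UTF-8 fallback are illustrative choices, not part of the original post:

```python
from urllib.request import urlopen


def fetch_page(url, timeout=10):
    """Return the body of `url` as text (hypothetical helper, Python 3)."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset declared in the Content-Type header, if any;
        # falling back to UTF-8 is an assumption made for this sketch.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example call (performs a network request, so it is left commented out):
# print(fetch_page('http://xahlee.org/Periodic_dosage_dir/_p2/russell-lecture.html'))
```

The with block closes the connection once the body has been read, and the timeout keeps the call from hanging on an unresponsive server.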

#---------------------
# Sometimes, in working with HTML pages, you need to create links.
# In a URL, some characters need to be encoded.
# The "quote" function does it; the "unquote" function reverses it. Very nice.

from urllib import quote
print quote("~joe's home page")
print 'http://www.google.com/search?q=' + quote("ménage à trois")
# (rely on the French to teach us interesting words)

# for more about the urllib module, see
# http://python.org/doc/lib/module-urllib.html
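
On Python 3 the same functions are found in urllib.parse. A small sketch under that assumption; search_url is a hypothetical helper, and quote_plus (which encodes spaces as "+") is swapped in for the query string, which the original post does not do:

```python
from urllib.parse import quote, quote_plus, unquote


def search_url(query):
    """Build a Google search URL with the query encoded (illustrative helper)."""
    return "http://www.google.com/search?q=" + quote_plus(query)


encoded = quote("ménage à trois")  # non-ASCII characters become UTF-8 percent-escapes
decoded = unquote(encoded)         # unquote reverses the encoding

print(encoded)
print(decoded)
print(search_url("ménage à trois"))
```

Whether to use quote or quote_plus depends on where the text goes: quote suits path segments, while quote_plus follows the form-encoding convention commonly used in query strings.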

----------------------------
In Perl, it's messy as usual. Long story short, the simplest way is to
use the Perl programs HEAD or GET in /usr/bin or /usr/local/bin. When
one of the networking modules is installed, Perl contaminates your bin
dirs with these programs. In the Unix shell,
GET 'http://yahoo.com/'
should do the job. HEAD is similar, for the HTTP
headers (assuming they are installed).

If you need more complexity, Perl has LWP::Simple and LWP::UserAgent to
begin with (there is a host of spaghetti others). Both of these need
to be installed separately; perhaps consult your sysadmin. The last time I
used them was some 2 years ago, so the following code is untested, but
it should be right. I don't recall which one can't do what. Your mileage may
vary.

use strict;
# use LWP::Simple;
use LWP::UserAgent;
my $ua = new LWP::UserAgent;
$ua->timeout(120);
my $url='http://yahoo.com/';
my $request = new HTTP::Request('GET', $url);
my $response = $ua->request($request);
my $content = $response->content();
print $content;
__END__

# Note the Perl code above. Many Perl modules sport an object-oriented
syntax, often alongside a plain procedural version as
well.

----------------
this post is from the perl-python a-day mailing list. Please see
http://xahlee.org/perl-python/python.html

Xah
(e-mail address removed)
http://xahlee.org/PageTwo_dir/more.html
 
Dan Perl

# note the line
# from <library_name> import <function_name1,function_name2...>
# it reads the library and import the function name
# to see available functions in a module one can use "dir"
# import urllib; print dir(urllib)

After about a month, this tutorial has finally reached the syntax of the
"import" statement!

And a word of advice to Python beginners: "print dir(urllib)" is not very
useful in the sense mentioned here (it prints all the names defined in the
module with no explanations, and those names are not only functions, BTW).
But "help(urllib)" is much more useful. Even with such a simple script,
this tutorial still managed to give some bad advice.
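
A rough sketch of that difference, assuming Python 3 (where these functions live in urllib.parse); the variable names are illustrative only:

```python
import inspect
import urllib.parse

# dir() returns bare names with no explanations, and not all of them are functions.
public_names = [n for n in dir(urllib.parse) if not n.startswith("_")]
function_names = [n for n in public_names
                  if inspect.isfunction(getattr(urllib.parse, n))]
other_names = [n for n in public_names if n not in function_names]

print(len(public_names), "public names,", len(function_names), "of them plain functions")
print("some non-functions:", other_names[:5])

# The docstring is what help() builds on; print its first line for one function.
print(inspect.getdoc(urllib.parse.quote).splitlines()[0])

# Interactive use (prints a full, paged description):
# help(urllib.parse.quote)
```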
 
Chris Mattern

Xah Lee wrote:

<snip>

Just the standard warnings for any novices unfamiliar with Mr. Lee.
Mr. Lee's posts are regularly riddled with severe errors (I found
the assertion that LWP::Simple and LWP::UserAgent aren't part of
the standard base perl install a particularly amusing one in this
particular post). Please be advised that you should get your
perl information from accurate sources. http://learn.perl.org
is an excellent place to start, with pointers to excellent Perl
books and even some readable for free online (notably Beginning
Perl).
--
Christopher Mattern

"Which one you figure tracked us?"
"The ugly one, sir."
"...Could you be more specific?"
 
