getting absolute path ?

Stef Mientki

hello,

I'm trying to convert the links in html pages to absolute links,
these pages can either be webpages or files on local harddisk (winXP).
I've been struggling with this for a while, and this code works a little:

import os

# p is assumed to be the directory of the page; line is one line of its HTML
i = line.find ( 'href=' )
if i < 0 :
    i = line.find ( ' src=' )
if i >= 0 :
    # skip the 5-character attribute text plus the opening quote
    ii = line.find ( '"', i+6 )
    file = line [ i+6 : ii ]
    #print urlparse.urljoin ( p, file )
    if file.find ( 'http:' ) < 0 :
        abspath = os.path.normpath ( os.path.join ( p, file ) )
        line = line.replace ( file, abspath )
print line

but it only covers files on the local disk and just one link per line,
so I guess it's a lot of trouble to catch all cases.
Isn't there a convenient function for this (preferably OS-independent)?
I googled for it, but couldn't find one.

thanks,
Stef Mientki
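
For reference, the commented-out urlparse.urljoin call in the post above is probably the closest thing to the function being asked about. The following is only a minimal sketch, written for Python 3 (where the urlparse module became urllib.parse); the helper name absolutize and the example.com and C:/site addresses are made up for illustration. It assumes a local page is addressed with a file:// URL, so the same call covers both web pages and files on disk.

```python
from urllib.parse import urljoin

def absolutize(base, link):
    """Resolve link against base; an already absolute link is returned unchanged."""
    return urljoin(base, link)

# Hypothetical base addresses: one web page and one local file on Windows.
web_base = "http://example.com/docs/index.html"
file_base = "file:///C:/site/docs/index.html"

print(absolutize(web_base, "img/logo.png"))
print(absolutize(web_base, "../about.html"))
print(absolutize(web_base, "http://other.example/x.css"))
print(absolutize(file_base, "img/logo.png"))
```

Because the joining is done on URLs rather than on filesystem paths, the result does not depend on the operating system's path separator.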
 
kyosohma

I googled a bit too. The Perl forums talk about using a regular
expression. You can probably take that and translate it into the
Python equivalent:

http://forums.devshed.com/perl-prog...e-relatives-links-to-absolute-links-8173.html

I also found this, which appears to be an old c.l.py thread:

http://www.dbforums.com/archive/index.php/t-320359.html

You might have more luck if you google for "relative to absolute
links". I would also take a look at how Django or CherryPy create
their URLs.

Mike
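
The regular-expression approach suggested above might look roughly like this in Python. This is a sketch, not the code from either linked thread: the pattern LINK_RE, the function absolutize_links and the example.com base URL are all made up for illustration, it is written for Python 3, and it only handles quoted href and src attributes. Unlike the find-based code in the question, it rewrites every link on a line.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src="..." with either single or double quotes.
LINK_RE = re.compile(r'''(\b(?:href|src)\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)

def absolutize_links(html, base_url):
    """Return html with every quoted href/src value resolved against base_url."""
    def fix(match):
        prefix, quote, link = match.groups()
        # urljoin leaves links that are already absolute untouched.
        return prefix + quote + urljoin(base_url, link) + quote
    return LINK_RE.sub(fix, html)

line = ('<a href="page2.html"><img src="../img/a.png"></a> '
        '<a href="http://example.org/">x</a>')
print(absolutize_links(line, "http://example.com/docs/index.html"))
```

A regular expression will miss unquoted attribute values and can be fooled by link-like text inside scripts or comments, so for messy pages a real HTML parser is likely the safer choice.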
 
Stef Mientki

thanks Mike,

with your links I managed to write some code that seems to work well.
I'm still surprised that this kind of function isn't available ;-)
cheers,
Stef
