getting full URL from relative links

Discussion in 'Perl Misc' started by slugger3113, Apr 19, 2010.

  1. slugger3113

    slugger3113 Guest

    Hi, I'm trying to get full/absolute URLs from relative links in HTML
    documents. I've been trying to fudge this using File::Basename,
    WWW::Mechanize, etc. but was wondering if there's a more ready-made
    way to do this.

    For example, if my main doc is:

    http://www.abc.com/x/y/z/mydoc.html

    and it contains a relative link to:

    ../../otherdir/yourdoc.html

    how do I get the absolute URL to "yourdoc.html"? Using the above
    modules I've been able to get:

    http://www.abc.com/x/y/z/../../otherdir/yourdoc.html

    when what I want is:

    http://www.abc.com/x/otherdir/yourdoc.html

    Of course I could try and parse all of the possible variations for
    relative paths, but it's making my head hurt and I was wondering if
    there's a module that could help with this. Any thoughts would be
    appreciated.

    thanks
    Scott
     
    slugger3113, Apr 19, 2010
    #1

  2. Jürgen Exner, Apr 19, 2010
    #2

  3. Steve C

    Steve C Guest

    You also need to know if there is a base tag in the head section
    since that changes the meaning of a relative link.
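
    A rough sketch of how a base tag changes the result, assuming the
    href value has already been pulled out of the head section. The
    base href below is a made-up value for illustration, not something
    from the original document:

    ```perl
    use strict;
    use warnings;
    use URI;

    my $doc_url   = 'http://www.abc.com/x/y/z/mydoc.html';
    # Assumed value of <base href="..."> from the head section (hypothetical);
    # leave it undef when the document has no base tag.
    my $base_href = 'http://www.abc.com/a/b/';
    my $link      = '../otherdir/yourdoc.html';

    # A base href may itself be relative, so resolve it against the
    # document URL first.
    my $base = defined $base_href
        ? URI->new_abs($base_href, $doc_url)
        : URI->new($doc_url);

    my $abs = URI->new_abs($link, $base);
    print "$abs\n";
    ```

    With the base href above, the link resolves under /a/ rather than
    under the document's own directory; with $base_href left undef it
    resolves against the document URL instead.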
     
    Steve C, Apr 19, 2010
    #3
  4. C.DeRykus

    C.DeRykus Guest


    See: perldoc URI

    e.g.:

    use URI;
    print URI->new_abs('../../otherdir/yourdoc.html',
                       'http://www.abc.com/x/y/z/'), "\n";
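
    The base does not have to be trimmed to a directory: the full URL
    of the containing document should work as well, since only the part
    up to the last slash is used when resolving. A small sketch along
    those lines, where the second and third links are made-up examples:

    ```perl
    use strict;
    use warnings;
    use URI;

    # URL of the document that contains the links (from the question).
    my $doc_url = 'http://www.abc.com/x/y/z/mydoc.html';

    # Relative links as they might appear in href attributes
    # (the last two are illustrative values).
    my @links = ('../../otherdir/yourdoc.html', 'sibling.html', '/top.html');

    # new_abs resolves each link against the base and collapses
    # the ".." segments along the way.
    my @abs = map { URI->new_abs($_, $doc_url)->as_string } @links;

    print "$_\n" for @abs;
    ```

    The first line printed should be
    http://www.abc.com/x/otherdir/yourdoc.html, which is the form asked
    for in the original question.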
     
    C.DeRykus, Apr 19, 2010
    #4
  5. slugger3113

    slugger3113 Guest

    Hm, it looks like File::Spec will do what I want:

    use File::Spec;

    my $dpath = "/one/two/../three/four";

    my $cpath = File::Spec->canonpath( $dpath );

    print $cpath, "\n";

    result: /one/three/four

    thanks for the tip on "canonical" (whatever that means)!

    Scott
     
    slugger3113, Apr 19, 2010
    #5
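
    One caveat worth checking before relying on this: File::Spec is
    meant for filesystem paths, and its documentation for Unix says
    that canonpath deliberately does not collapse x/../y sections, so
    the result can differ by platform. For a URL that has already been
    joined with its ".." segments, a simplified sketch that works on
    the URL path instead might look like the following. The helper
    name collapse_dots is made up for this example, and it does not
    try to cover every edge case of RFC 3986 (trailing "." or ".."
    segments, for instance):

    ```perl
    use strict;
    use warnings;
    use URI;

    # Hypothetical helper: collapse "." and ".." segments in a URL path.
    sub collapse_dots {
        my ($url) = @_;
        my $uri = URI->new($url);
        my @out;
        # Limit -1 keeps the leading empty segment before the first "/".
        for my $seg (split m{/}, $uri->path, -1) {
            if ($seg eq '..') {
                pop @out if @out > 1;    # never drop the leading empty segment
            }
            elsif ($seg ne '.') {
                push @out, $seg;
            }
        }
        $uri->path(join '/', @out);
        return $uri->as_string;
    }

    print collapse_dots(
        'http://www.abc.com/x/y/z/../../otherdir/yourdoc.html'), "\n";
    ```

    When the relative link and the document URL are still available
    separately, URI->new_abs is probably the simpler route, since it
    does the joining and the collapsing in one step.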