getting full URL from relative links

Discussion in 'Perl Misc' started by slugger3113, Apr 19, 2010.

  1. slugger3113

    slugger3113 Guest

    Hi, I'm trying to get full/absolute URLs from relative links in HTML
    documents. I've been trying to fudge this using File::Basename,
    WWW::Mechanize, etc. but was wondering if there's a more ready-made
    way to do this.

    For example, if my main doc is:

    http://www.abc.com/x/y/z/mydoc.html

    and it contains a relative link to:

    ../../otherdir/yourdoc.html

    how do I get the absolute URL to "yourdoc.html"? Using the above
    modules I've been able to get:

    http://www.abc.com/x/y/z/../../otherdir/yourdoc.html

    when what I want is:

    http://www.abc.com/x/otherdir/yourdoc.html

    Of course I could try and parse all of the possible variations for
    relative paths, but it's making my head hurt and I was wondering if
    there's a module that could help with this. Any thoughts would be
    appreciated.

    thanks
    Scott
    slugger3113, Apr 19, 2010
    #1

  2. Jürgen Exner, Apr 19, 2010
    #2

  3. Steve C

    Steve C Guest

    You also need to check whether the document has a <base> tag in its
    head section, since its href changes the base URL that relative
    links are resolved against.
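
    As a rough sketch of how that plays out with the URI module: the
    base href value below is made up for illustration, and pulling it
    out of the HTML is left out entirely. When a base href is present,
    links resolve against it instead of the document URL.

```perl
use strict;
use warnings;
use URI;

# All three values are hypothetical examples, not taken from a real page.
my $doc_url   = 'http://www.abc.com/x/y/z/mydoc.html';
my $base_href = 'http://www.abc.com/a/b/';   # assumed <base href="...">; undef if the page has none
my $link      = '../../otherdir/yourdoc.html';

# A <base href> may itself be relative, so resolve it against the document URL first.
my $base = defined $base_href
    ? URI->new_abs($base_href, $doc_url)
    : URI->new($doc_url);

my $abs = URI->new_abs($link, $base);
print "$abs\n";   # http://www.abc.com/otherdir/yourdoc.html
```

    With $base_href set to undef, the same link should resolve against
    the document URL instead.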
    Steve C, Apr 19, 2010
    #3
  4. C.DeRykus

    C.DeRykus Guest

    See: perldoc URI

    e.g.:

    use URI;
    print URI->new_abs('../../otherdir/yourdoc.html',
                       'http://www.abc.com/x/y/z/'), "\n";
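
    A slightly fuller sketch of the same call, in case it helps with
    the "all possible variations" worry. Only the first link comes from
    this thread; the other links are hypothetical examples. Note that
    the full document URL can be passed as the base, since the last
    path segment (mydoc.html) is dropped during resolution.

```perl
use strict;
use warnings;
use URI;

# The document that contains the links.
my $base = 'http://www.abc.com/x/y/z/mydoc.html';

# Sample links: the first is from the question, the rest are made up.
my @links = (
    '../../otherdir/yourdoc.html',             # climbs two directories
    'sibling.html',                            # same directory
    './sub/page.html',                         # explicit current directory
    '/top.html',                               # absolute path on the same host
    'http://www.example.org/elsewhere.html',   # already absolute
);

# new_abs returns a URI object; as_string gives the plain URL.
my %abs = map { $_ => URI->new_abs($_, $base)->as_string } @links;

print "$_ => $abs{$_}\n" for @links;
```

    The first link should come out as
    http://www.abc.com/x/otherdir/yourdoc.html, and a link that is
    already absolute is returned unchanged.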

    --
    Charles DeRykus
    C.DeRykus, Apr 19, 2010
    #4
  5. slugger3113

    slugger3113 Guest

    On Apr 19, 11:07 am, Jürgen Exner wrote:
    > For file names there is a module that will compute the canonical path,
    > but I can't remember the name right now. And I don't know if it will
    > work with URLs, either.
    >
    > jue


    Hm, it looks like File::Spec will do what I want:

    use File::Spec;

    my $dpath = "/one/two/../three/four";

    my $cpath = File::Spec->canonpath( $dpath );

    print $cpath, $/;

    result: /one/three/four

    Thanks for the tip on "canonical" (whatever that means)!
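
    One caveat that may be worth checking: the File::Spec::Unix
    documentation says canonpath deliberately does not collapse
    x/../y, because a symlink could make that wrong on a real
    filesystem, so the result above may depend on the platform. The
    sketch below compares the Unix flavour with URI, which does remove
    ".." when resolving a relative reference. The localhost base is
    just a placeholder assumption so that only the path matters.

```perl
use strict;
use warnings;
use File::Spec::Unix;
use URI;

my $dpath = '/one/two/../three/four';

# Unix-style canonpath tidies slashes and "." but leaves ".." in place.
my $unix = File::Spec::Unix->canonpath($dpath);

# URI only collapses ".." for relative references, so strip the
# leading "/" and resolve against a placeholder base.
(my $rel = $dpath) =~ s{^/}{};
my $cpath = URI->new_abs($rel, 'http://localhost/')->path;

print "File::Spec::Unix: $unix\n";
print "URI:              $cpath\n";
```

    For URLs, the URI route seems the safer choice, since it applies
    URL resolution rules instead of filesystem path rules.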

    Scott
    slugger3113, Apr 19, 2010
    #5