URL detection follow-up

Discussion in 'Perl Misc' started by "Dandy" Randy, Sep 10, 2003.

  1. Hi,

    As per my previous posts, I am searching for a way to open a text file that
    contains a few paragraphs of text, locate web URLs and replace them with
    the needed HTML tags such as <a href ...> and </a>. Most of the responses
    suggested using a Perl module ... URI::Find and similar methods.
    Unfortunately, the hosting company I run my scripts from does not have this
    module installed, and they are not prepared to install it just for my needs.
    I also cannot change hosting companies.

    So ... is there ANY other way my goal can be accomplished given I cannot
    use URI::Find? Here is an example of theoretical code:

    #!/usr/bin/perl

    # get text file data
    open (TEXT, "<data/data.txt") or die "Can't open file: $!";
    @data=<TEXT>;
    close(TEXT);

    # find web URLs and replace occurrences with the needed HTML tags
    # (pseudocode -- this is the step I need help with)

    # write the changed data back to the text file
    open (TEXT, ">data/data.txt") or die "Can't open file: $!";
    print TEXT @data;
    close(TEXT);

    Please, it is very important to me to find this solution; if you have any
    ideas, post back. Working code examples are very welcome. Thanks everyone!

    Randy
    "Dandy" Randy, Sep 10, 2003
    #1

  2. "\"Dandy\" Randy" <> wrote in
    news:sMM7b.935815$:

    > methods. Unfortunately, the hosting company I run my scripts from does
    > not have this module installed, and they are not prepared to install
    > it just for my needs. I also cannot change hosting companies.


    perldoc -q lib
    Found in C:\Perl\lib\pod\perlfaq8.pod

    How do I keep my own module/library directory?

    When you build modules, use the PREFIX option when generating
    Makefiles:

    perl Makefile.PL PREFIX=/u/mydir/perl

    then either set the PERL5LIB environment variable before you run
    scripts that use the modules/libraries (see perlrun) or say

    use lib '/u/mydir/perl';

    This is almost the same as

    BEGIN {
        unshift(@INC, '/u/mydir/perl');
    }

    except that the lib module checks for machine-dependent
    subdirectories. See Perl's lib for more information.
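
    In Randy's situation that would mean building URI::Find (and its
    prerequisite URI) with a PREFIX he can write to, then pointing the script
    at that directory. A minimal sketch, assuming the /u/mydir/perl path from
    the FAQ excerpt above and that the modules installed cleanly there:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # tell perl where the privately installed modules live
    # (/u/mydir/perl is just the example directory from the FAQ text above)
    use lib '/u/mydir/perl';

    use URI::Find;

    my $text = 'Visit http://www.example.com/ for more information.';

    # wrap every URI found in the text in an anchor tag
    my $finder = URI::Find->new(sub {
        my ($uri, $original_text) = @_;
        return qq{<a href="$uri">$original_text</a>};
    });
    $finder->find(\$text);

    print $text, "\n";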


    --
    A. Sinan Unur

    Remove dashes for address
    Spam bait: mailto:
    A. Sinan Unur, Sep 10, 2003
    #2

  3. Brian Wakem (Guest)

    ""Dandy" Randy" <> wrote in message
    news:sMM7b.935815$...
    > Hi,
    >
    > As per my previous posts, I am searching for a way to open a text file

    that
    > contains a few paragraphs of text, locate web URL's and replace them with
    > the needed html tags such as <a href & </a> etc. Most of the responses
    > suggested using a perl module ... URI::Find and similair methods.
    > Unfortunately, the hosting company I run my scripts from does not have

    this
    > module installed, and they are not prepared to install it just for my

    needs.
    > I also cannot change hosting companies.
    >
    > So ... is there ANY other way my goal can be accomplished giving I cannot
    > use URI::Find? Here is an example of theoretical code:
    >
    > #!/usr/bin/perl
    >
    > # get text file data
    > open (TEXT, "<data/data.txt") or die "Can't open file: $!";
    > @data=<TEXT>;
    > close(TEXT);
    >
    > # find web URL's and replace occurances with needed HTML tags
    > scan @list > replace;
    >
    > # write the changed data back to the text file
    > open (TEXT, ">data/data.txt") or die "Can't open file: $!";
    > print DATA @text;
    > close(TEXT);
    >
    > Please, it is very important to me to find this solution, if you have any
    > ideas, post back. Working code examples are very welcomed. Thanx everyone!



    If they are all like http://www.domain.com/dir/file.html then you could do
    something like -


    foreach(@data) {
    s!(http://.*?)(?:\s|$)!<a href="$1">$1</a>!gi;
    }

    Not perfect, but it'll get you started.
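
    Dropped into the read-modify-write skeleton from the original post, that
    might look something like this (a sketch only; data/data.txt is the file
    name from the original code):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # get text file data
    open (TEXT, "<", "data/data.txt") or die "Can't open file: $!";
    my @data = <TEXT>;
    close(TEXT);

    # find web URLs and wrap each one in an anchor tag
    foreach (@data) {
        s!(http://.*?)(?:\s|$)!<a href="$1">$1</a>!gi;
    }

    # write the changed data back to the text file
    open (TEXT, ">", "data/data.txt") or die "Can't open file: $!";
    print TEXT @data;
    close(TEXT);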

    --
    Brian Wakem
    Brian Wakem, Sep 10, 2003
    #3
  4. Awesome ... works great ... now ... can you formulate a replacement command
    that will take an email address and add the <a href="mailto:..."> markup so
    that email addresses will also become linked? You've been a great help!

    Randy

    "Brian Wakem" <> wrote in message
    news:bjo6ce$lfb7n$-berlin.de...
    >
    > ""Dandy" Randy" <> wrote in message
    > news:sMM7b.935815$...
    > > Hi,
    > >
    > > As per my previous posts, I am searching for a way to open a text file

    > that
    > > contains a few paragraphs of text, locate web URL's and replace them

    with
    > > the needed html tags such as <a href & </a> etc. Most of the responses
    > > suggested using a perl module ... URI::Find and similair methods.
    > > Unfortunately, the hosting company I run my scripts from does not have

    > this
    > > module installed, and they are not prepared to install it just for my

    > needs.
    > > I also cannot change hosting companies.
    > >
    > > So ... is there ANY other way my goal can be accomplished giving I

    cannot
    > > use URI::Find? Here is an example of theoretical code:
    > >
    > > #!/usr/bin/perl
    > >
    > > # get text file data
    > > open (TEXT, "<data/data.txt") or die "Can't open file: $!";
    > > @data=<TEXT>;
    > > close(TEXT);
    > >
    > > # find web URL's and replace occurances with needed HTML tags
    > > scan @list > replace;
    > >
    > > # write the changed data back to the text file
    > > open (TEXT, ">data/data.txt") or die "Can't open file: $!";
    > > print DATA @text;
    > > close(TEXT);
    > >
    > > Please, it is very important to me to find this solution, if you have

    any
    > > ideas, post back. Working code examples are very welcomed. Thanx

    everyone!
    >
    >
    > If they are all like http://www.domain.com/dir/file.html then you could do
    > something like -
    >
    >
    > foreach(@data) {
    > s!(http://.*?)(?:\s|$)!<a href="$1">$1</a>!gi;
    > }
    >
    > Not perfect, but it'll get you started.
    >
    > --
    > Brian Wakem
    >
    >
    "Dandy" Randy, Sep 10, 2003
    #4
  5. Brian Wakem (Guest)

    ""Dandy" Randy" <> wrote in message
    news:VpN7b.927838$...
    > Awesome ... works great ... now ... can you formulate a replacement

    command
    > that will take an email address and add the <a href="mailto: commands so
    > that email addresses will also become linked? You've been agreat help!
    >
    > Randy
    >


    Nice example of top posting.

    To match email addresses perfectly every time is probably impossible, but a
    simple and effective way of matching 99%+ would be:-

    foreach(@data) {
    s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;
    }
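
    For example (a sketch; the address below is made up purely for
    illustration):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $line = 'Contact randy@example.com for details.';

    # wrap the address in a mailto: link
    $line =~ s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;

    print $line, "\n";
    # Contact <a href="mailto:randy@example.com">randy@example.com</a> for details.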

    --
    Brian Wakem
    Brian Wakem, Sep 10, 2003
    #5
  6. "Brian Wakem" wrote:

    > To match email addresses perfectly every time is probably impossible, but a
    > simple and effective way of matching 99%+ would be:-
    >
    > foreach(@data) {
    > s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;
    > }


    Brian, thanks again, that one worked too. Here is what I'm now using, and it
    seems to work correctly:

    $contents =~ s/http:\/\///g;
    $contents =~ s!(www.*?)(?:\s|$)!<a href="http://$1">$1</a> !gi;
    $contents =~ s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a> !g;

    The first line eliminates the http:// in case the text contained a full URL;
    I then adjusted your code to start looking for www. You may also notice a
    deliberate space after the </a> tags ... this was needed as your code seemed
    to kill the trailing space. Owe you one.
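
    For example, run against a single sample line (the URL and address are
    invented just to show what the three substitutions do):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # sample input
    my $contents = "See http://www.example.com/page.html or mail randy\@example.com\n";

    $contents =~ s/http:\/\///g;                                      # drop any http:// prefix
    $contents =~ s!(www.*?)(?:\s|$)!<a href="http://$1">$1</a> !gi;   # link www... addresses
    $contents =~ s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a> !g;  # link email addresses

    print $contents;
    # prints something like:
    # See <a href="http://www.example.com/page.html">www.example.com/page.html</a> or mail <a href="mailto:randy@example.com">randy@example.com</a>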

    Randy

    P.S. Sorry about the last top post.
    "Dandy" Randy, Sep 10, 2003
    #6
  7. Brian Wakem (Guest)

    ""Dandy" Randy" <> wrote in message
    news:KEN7b.125548$...
    > "Brian Wakem" wrote:
    >
    > > To match email addresses perfectly every time is probably impossible,

    but
    > a
    > > simple and effective way of matching 99%+ would be:-
    > >
    > > foreach(@data) {
    > > s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;
    > > }

    >
    > Brian, thankx again, that one worked too. Here is what i'm now using that
    > seems to work correctly:
    >
    > $contents=~ s/http:\/\///g;
    > $contents=~ s!(www.*?)(?:\s|$)!<a href="http://$1">$1</a> !gi;
    > $contents=~ s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a> !g;
    >
    > The first code eliminates the http:// in case the text contained a full

    url,
    > then adjusted your code to start looking for www. You also may notice a
    > deliberate space after the </a> tags ... this was needed as your code

    seemed
    > to kill the trailing space. Owe you one.



    Yes, it would have swallowed the space.

    s!(http://.*?)(\s|$)!<a href="$1">$1</a>$2!gi;

    instead should sort that out.

    I'm glad they worked for you, but it's important to understand why, in case
    you need to alter something. It's also important to understand why those
    regexes are not perfect and will not work for every URL or email address,
    and from time to time could match things that aren't URLs or email addresses.
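
    Putting the pieces of this thread together, the whole read-replace-write
    cycle might end up looking something like this (a sketch only, using the
    corrected URL pattern above; it inherits the same limitations just
    described):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # slurp the whole text file into one string
    open (TEXT, "<", "data/data.txt") or die "Can't open file: $!";
    my $contents = do { local $/; <TEXT> };
    close(TEXT);

    # link http:// URLs, keeping the whitespace that follows them
    $contents =~ s!(http://.*?)(\s|$)!<a href="$1">$1</a>$2!gi;

    # link email addresses (catches most, not all, and can occasionally misfire)
    $contents =~ s!([-\w.]+\@[-\w.]+)!<a href="mailto:$1">$1</a>!g;

    # write the changed data back to the text file
    open (TEXT, ">", "data/data.txt") or die "Can't open file: $!";
    print TEXT $contents;
    close(TEXT);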

    --
    Brian Wakem
    Brian Wakem, Sep 10, 2003
    #7
