Clickable link conversion regex?

Tuxedo

Can anyone suggest a regex solution to wrap bare URLs in <a href=...> tags?

open(my $fh, '<', 'urls.txt') or die $!;

while (my $line = <$fh>) {
    $line =~ s[...]   # match http or https instances
             [...]s;  # replace with enclosing hrefs
    print $line;
}

The input may contain one or more URLs per line.

Each URL begins with either http:// or https://, but not necessarily at
the start of a line.

Each URL ends at either the end of the line or the next whitespace character.

The input file would look like this, for example:

---------- urls.txt -------

http://www.example.com/hello
http://www.example.com/

bla https://www.example.com/a_page.htm plus a string not part of the URL

-----------

If an http or https string is already preceded by the closing ">" of an
HTML tag, such as:
<a href=http://bla.com>http://bla.com</a>
... then it should be left alone, with no replacement.

Two conditions exist in the input file:

The 'http' or 'https' bit will always begin at the first character on a new
line or have a preceding whitespace immediately before itself, like:

http://someurl.com line w/ whitespace before
http://someother.com
hello http://bla.com also w/ a whitespace before

The match and replace output on the above three lines would then be:

<a href=http://someurl.com>http://someurl.com</a> line w/ whitespace before
<a href=http://someother.com>http://someother.com</a>
hello <a href=http://bla.com>http://bla.com</a> also w/ a whitespace before
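One possible way to get that output is sketched below. It is a best-guess approach, not a tested solution for every input: the sub name linkify() is made up for illustration, the href value is left unquoted only to match the expected output above, and a URL is assumed to run up to the next whitespace. The (?<!\S) lookbehind only lets a match begin at the start of the line or right after whitespace, so URLs already sitting behind "href=" or ">" are skipped.

```perl
use strict;
use warnings;

# linkify() is a hypothetical helper name, not something from the post.
sub linkify {
    my ($line) = @_;

    # (?<!\S)       the URL must start the line or follow whitespace
    # https?://\S+  the scheme, then everything up to the next whitespace
    $line =~ s{(?<!\S)(https?://\S+)}{<a href=$1>$1</a>}g;

    return $line;
}

# The three sample lines, plus one that is already a link.
print linkify($_) for (
    "http://someurl.com line w/ whitespace before\n",
    "http://someother.com\n",
    "hello http://bla.com also w/ a whitespace before\n",
    "<a href=http://bla.com>http://bla.com</a>\n",
);
```

In the original loop the same substitution could be applied to $line directly before printing it.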

If something is written as http://bla, which as in this sentence
isn't a link, it would inadvertently end up being converted into a link,
but that would be a rare occurrence. In other words, without additional
validity checking, the regex would be a best-guess procedure. For a
stricter procedure, each match could perhaps be checked with the
is_web_uri($...) function from Data::Validate::URI, which validates http
and https URIs specifically. That said, any example that illustrates a basic
search and replace concept would be much appreciated, even if it's only a
best-guess URL type of procedure.

Many thanks for any bright ideas!

Tuxedo
 
