RegExp: Problems with matching a(ny) URI


Jesper Stocholm

I need to be able to detect URIs in some text and then replace
them with HTML anchors, that is

http://www.tempuri.org/page.html

should be replaced with

<a href="http://www.tempuri.org/page.html">http://www.tempuri.org/page.html</a>

I have made the following code:

re = new RegExp('(((http|https|ftp):\/\/)?\w+[.\w]+([^\w]*[\s]+))');
str = document.forms[0].longtext.value; //textarea with text to replace
newstr = str.replace(re, '<a href="$1">$1</a>');
document.write(newstr);

However, it doesn't quite work the way I would like it to. It seems to
make only a single match, and it seems to ignore leading and trailing
whitespace.

Can you help me solve this?

A test of the code I have so far can be found at
http://www.stocholm.dk/test/html/regexp.html

Thanks,

:o)
 

Janwillem Borleffs

Jesper Stocholm said:
I need to be able to detect URIs in some text and then replace
them with HTML anchors, that is ....
I have made the following code:

re = new RegExp('(((http|https|ftp):\/\/)?\w+[.\w]+([^\w]*[\s]+))'); ....
However, it doesn't quite work the way I would like it to. It seems to
make only a single match, and it seems to ignore leading and trailing
whitespace.

Try this regexp:

re = /(http|https|ftp)([^ ]+)/ig;

(The i at the end means ignore case, the g means global for multiple
matches)

This doesn't allow spaces in URLs (which aren't allowed anyway).
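As a rough illustration of why the regexp literal with the g flag behaves differently from the original new RegExp('...') call, here is a small sketch. The variable names and the sample text are made up for the example. It shows two things: inside a string literal the backslashes in \w and \s are dropped before RegExp ever sees the pattern, and without g only the first match is replaced.

```javascript
// In a plain string literal, "\w" and "\s" are not recognised escapes,
// so the backslash is silently dropped: RegExp receives "w+s".
var fromString = new RegExp('\w+\s');
// A regexp literal keeps the backslashes.
var fromLiteral = /\w+\s/;

console.log(fromString.source);
console.log(fromLiteral.source);

// Hypothetical sample text with two addresses.
var text = 'http://a.example ftp://b.example';

// Without g, replace() stops after the first match.
var once = text.replace(/(http|https|ftp)([^ ]+)/i, '[$1$2]');
// With g, every match is replaced.
var all = text.replace(/(http|https|ftp)([^ ]+)/ig, '[$1$2]');

console.log(once);
console.log(all);
```

If the pattern has to be built from a string, the backslashes need doubling, for example new RegExp('\\w+\\s', 'ig').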


JW
 

Janwillem Borleffs

Janwillem Borleffs said:
Try this regexp:

re = /(http|https|ftp)([^ ]+)/ig;

Forgot to mention that the replacement should be done as follows:

newstr = str.replace(re, '<a href="$1$2">$1$2</a>');
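Put together, the two pieces might look like the sketch below. The function name linkify and the sample text are only for illustration and are not from the original page. It returns the new string rather than calling document.write, so it can be tried outside a browser.

```javascript
// linkify is a hypothetical helper name used for this example only.
function linkify(text) {
  // i: ignore case, g: replace every match, not just the first one.
  var re = /(http|https|ftp)([^ ]+)/ig;
  // $1 is the scheme, $2 is everything up to the next space.
  return text.replace(re, '<a href="$1$2">$1$2</a>');
}

// Hypothetical sample input.
var sample = 'See http://www.tempuri.org/page.html and ftp://ftp.tempuri.org/file.txt today';
var result = linkify(sample);
console.log(result);
```

Note that this pattern is quite loose: it also matches a word such as "httpd", and because [^ ] only excludes the space character, a match runs across tabs and line breaks. Using [^\s] and requiring :\/\/ after the scheme would be one way to tighten it.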


JW
 

Lasse Reichstein Nielsen

Jesper Stocholm said:
I need to be able to detect URIs in some text and then replace
them with HTML anchors,
I have made the following code:

re = new RegExp('(((http|https|ftp):\/\/)?\w+[.\w]+([^\w]*[\s]+))'); ....
However, it doesn't quite work as I would like it to.

Step back from the technical part and answer this question:
What is a URL?
or more precisely:
What will you accept as a URL?
(The formal definition is in RFC2396
<URL:http://rfc.sunsite.dk/rfc/rfc2396.html>)

If you can answer it, in detail, then I bet it is easier to make a
regular expression to match it (or get help doing it, because then
the job is precisely specified).

/L
 
