RegExp: Problems with matching a(ny) URI

Discussion in 'Javascript' started by Jesper Stocholm, Aug 14, 2003.

  1. I need to be able to detect URIs in some text and then replace
    them with HTML anchors, that is

    should be replaced with

    <a href=""></a>

    I have made the following code:

    re = new RegExp('(((http|https|ftp):\/\/)?\w+[.\w]+([^\w]*[\s]+))');
    str = document.forms[0].longtext.value; //textarea with text to replace
    newstr = str.replace(re, '<a href="$1">$1</a>');

    However, it doesn't quite work as I would like it to. It seems to only
    make a single match, and it seems to ignore leading and trailing

    Can you help me solve this?

    A test of the code I have so far can be found at


    Jesper Stocholm, Aug 14, 2003
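
Two things in the code above are likely behind the symptoms described. The pattern is passed as a quoted string, so the string literal consumes the backslashes before RegExp sees them, and no g flag is given, so replace() stops after the first match. A minimal sketch of both effects follows; the variable names and the sample text are made up for illustration.

```javascript
// In a quoted string the backslash is consumed by the string literal first,
// so '\w' reaches RegExp as a plain 'w' (and '\s' as a plain 's').
var fromString = new RegExp('\w+');
var fromLiteral = /\w+/;

console.log(fromString.source);  // the pattern RegExp actually received
console.log(fromLiteral.source); // the pattern that was intended

// Doubling the backslashes in the string restores the intended pattern.
var doubled = new RegExp('\\w+');

// Without the g flag, replace() only touches the first match.
var text = 'aa bb cc'; // hypothetical sample text
var first = text.replace(/\w+/, 'X');
var all = text.replace(/\w+/g, 'X');
console.log(first);
console.log(all);
```

Writing the pattern as a regex literal (`/.../`) avoids the double-escaping problem altogether.
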
  2. Try this regexp:

    re = /(http|https|ftp)([^ ]+)/ig;

    (The i at the end means ignore case, the g means global for multiple matches.)

    This doesn't allow spaces in URLs (which aren't allowed anyway).

    Janwillem Borleffs, Aug 14, 2003
  3. Forgot to mention that the replacement should be done as follows:

    newstr = str.replace(re, '<a href="$1$2">$1$2</a>');

    Janwillem Borleffs, Aug 14, 2003
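
Put together, the pattern and the replacement from the two replies above can be tried like this. It is only a sketch: the sample text is hypothetical and stands in for the textarea value from the original question.

```javascript
// Pattern and replacement string as suggested in the replies above.
var re = /(http|https|ftp)([^ ]+)/ig;

// Hypothetical sample text; in the question this comes from
// document.forms[0].longtext.value.
var str = 'See http://example.com/a and FTP://example.org/b for details.';

// $1 is the scheme, $2 is everything after it up to the next space.
var newstr = str.replace(re, '<a href="$1$2">$1$2</a>');
console.log(newstr);
```

Note the limits of this pattern: only a space ends a match, so a newline, tab, or trailing full stop becomes part of the link, and any word that merely contains "http" or "ftp" (for example "httpd") is matched as well.
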
  4. Step back from the technical part and answer this question:
    What is a URL?
    or more precisely:
    What will you accept as a URL?
    (The formal definition is in RFC 2396.)

    If you can answer it, in detail, then I bet it is easier to make a
    regular expression to match it (or get help doing it, because then
    the job is precisely specified).

    Lasse Reichstein Nielsen, Aug 14, 2003
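
To illustrate the point, here is one possible answer written down first and turned into a pattern second. The definition is an assumption chosen for the example and is far narrower than RFC 2396: a scheme of http, https, or ftp is required, followed by "://" and a run of characters that are not whitespace, <, > or ", and the match may not end in sentence punctuation. The names urlPattern and linkify are made up for this sketch.

```javascript
// Hypothetical working definition (an assumption, not the RFC grammar):
//   scheme  = http | https | ftp   (case-insensitive)
//   then    "://"
//   then    one or more characters that are not whitespace, <, > or "
//   and the last character must not be one of . , ; : ! ? )
var urlPattern = /\b(?:https?|ftp):\/\/[^\s<>"]+[^\s<>".,;:!?)]/gi;

function linkify(text) {
  // $& inserts the whole matched URL into the replacement.
  return text.replace(urlPattern, '<a href="$&">$&</a>');
}

var sample = 'Read http://example.com/doc.html, then ftp://example.org/file.';
console.log(linkify(sample));
```

Under this definition, text without a scheme (such as a bare host name) is deliberately left alone, and the matched URL is inserted without HTML escaping. Whether either choice is acceptable is exactly the kind of decision the question above asks for.
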
