Does string contain A, and if so, does a section of string contain B

Discussion in 'Javascript' started by Jason Carlton, Dec 6, 2009.

  1. Tricky subject, sorry.

    I'm wanting to check a textarea to see if it contains "<img", and if
    so, does the section between "<img" and the following ">" contain
    "mydomain.com".

    This is particularly tricky since there can be more than one
    "<img...>" in the field.

    I can do this in Perl easily enough:

    while ($comment =~ /(<img[^>]+?>)/sgxi) {
      if ($1 =~ /mydomain\.com/gi) {
        # do whatever
      }
    }


    But how do I create something similar in Javascript?

    TIA,

    Jason
    Jason Carlton, Dec 6, 2009
    #1

  2. Jason Carlton

    Evertjan. Guest

    Jason Carlton wrote on 07 dec 2009 in comp.lang.javascript:

    > Tricky subject, sorry.


    No it is not.


    > I'm wanting to check a textarea to see if it contains "<img", and if
    > so, does the section between "<img" and the following ">" contain
    > "mydomain.com".
    >
    > This is particularly tricky since there can be more than one
    > "<img...>" in the field.
    >
    > I can do this in Perl easily enough:
    >
    > while ($comment =~ /(<img[^>]+?>)/sgxi) {
    > if ($1 =~ /mydomain\.com/gi) {
    > # do whatever
    > }
    >}


    You think that is easy? Look at JavaScript!

    > But how do I create something similar in Javascript?



    var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)


    --
    Evertjan.
    The Netherlands.
    (Please change the x'es to dots in my emailaddress)
    Evertjan., Dec 6, 2009
    #2

  3. Re: Does string contain A, and if so, does a section of string contain B

    On Dec 6, 6:27 pm, "Evertjan." <> wrote:
    > Jason Carlton wrote on 07 dec 2009 in comp.lang.javascript:
    >
    > > Tricky subject, sorry.

    >
    > No it is not.
    >
    > > I'm wanting to check a textarea to see if it contains "<img", and if
    > > so, does the section between "<img" and the following ">" contain
    > > "mydomain.com".

    >
    > > This is particularly tricky since there can be more than one
    > > "<img...>" in the field.

    >
    > > I can do this in Perl easily enough:

    >
    > > while ($comment =~ /(<img[^>]+?>)/sgxi) {
    > >   if ($1 =~ /mydomain\.com/gi) {
    > >     # do whatever
    > >   }
    > >}

    >
    > You think that is easy? Look at JavaScript!
    >
    > > But how do I create something similar in Javascript?

    >
    > var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)
    >
    > --
    > Evertjan.
    > The Netherlands.
    > (Please change the x'es to dots in my emailaddress)



    Awesome! Thanks, Evertjan, that is easy. I couldn't find anything on
    the i.test() function you used, though. Is there a different name for
    that function?

    Similarly, how do I do the opposite and test if any of the "<img...>"
    tags do NOT contain mydomain.com?
    Jason Carlton, Dec 6, 2009
    #3
  4. Jason Carlton wrote:

    > Evertjan. wrote:
    > > Jason Carlton wrote:
    > > > I can do this in Perl easily enough:
    > > >
    > > > while ($comment =~ /(<img[^>]+?>)/sgxi) {
    > > > if ($1 =~ /mydomain\.com/gi) {
    > > > # do whatever
    > > > }
    > > > }


    I presume this can be done better in Perl, too.

    > > [...]
    > > var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)


    That is not equivalent to what you are doing in Perl above, though.
    Incidentally, you should not assume people know other languages than those
    discussed in the target newsgroup, although it is often the case. When in
    doubt, explain what the code in the other language does.
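For readers who do not know Perl: the loop above visits every "<img...>" tag in the string and runs a second test on each tag. A rough JavaScript counterpart might look like the sketch below. The sample string and the variable names are assumptions made up for illustration, not something posted in the thread.

```javascript
// Hypothetical sample input; in the thread this would be the textarea's value.
const comment =
  "Test <img src='http://www.mydomain.com/logo.gif'><br>" +
  "Test <img src='http://www.yahoo.com/logo.gif'>";

// With the g flag, exec() resumes from lastIndex on each call,
// which plays the role of Perl's /g match inside a while loop.
const imgTag = /<img[^>]+>/gi;
const matchingTags = [];
let m;
while ((m = imgTag.exec(comment)) !== null) {
  // m[0] is the whole tag, the counterpart of $1 in the Perl version.
  if (/mydomain\.com/i.test(m[0])) {
    matchingTags.push(m[0]); // stands in for "do whatever"
  }
}
console.log(matchingTags);
```

Collecting the tags in an array is only one way to fill in the "do whatever" step.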

    > Awesome! Thanks, Evertjan, that is easy. I couldn't find anything on
    > the i.test() function you used, though.


    It is _not_ the i.test() function. The `i' (case-*i*nsensitive) belongs to
    the RegExp literal, like in Perl. I am getting the idea here that you do
    not know Perl (and Perl-compatible Regular Expressions) either.
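To make the point concrete, here is a small sketch that splits the one-liner into its parts. The input string is a made-up example; the pattern is the one Evertjan posted.

```javascript
// Hypothetical input, upper-cased on purpose to exercise the i flag.
const str = "<IMG SRC='http://www.mydomain.com/logo.gif'>";

// The trailing i is a flag on the RegExp literal; test() is a method
// that is then called on the resulting RegExp object.
const viaLiteral = /<img[^>]+mydomain\.com[^>]*>/i.test(str);

// The same thing in two steps, using the RegExp constructor.
const re = new RegExp("<img[^>]+mydomain\\.com[^>]*>", "i");
const viaConstructor = re.test(str);

console.log(viaLiteral, viaConstructor, re.ignoreCase);
```

So the thing to look up in a reference is `RegExp.prototype.test`, plus the flags that a regular expression literal accepts.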

    > Is there a different name for that function?


    Any name you want to give it. The property name stands for a reference to a
    Function object; that object can have any number of references to it.
    (However, it is required here that the base object of the reference is a
    RegExp instance).

    > Similarly, how do I do the opposite and test if any of the "<img...>"
    > tags do NOT contain mydomain.com?


    Possibility: Non-capturing negative lookahead (borrowed from PCRE, too).
    RTFM.


    PointedEars
    --
    realism: HTML 4.01 Strict
    evangelism: XHTML 1.0 Strict
    madness: XHTML 1.1 as application/xhtml+xml
    -- Bjoern Hoehrmann
    Thomas 'PointedEars' Lahn, Dec 7, 2009
    #4
  5. Re: Does string contain A, and if so, does a section of string contain B

    > I presume this can be done better in Perl, too.

    TIMTOWTDI.


    > It is _not_ the i.test() function.  The `i' (case-*i*nsensitive) belongs to
    > the RegExp literal, like in Perl.  I am getting the idea here that you do
    > not know Perl (and Perl-compatible Regular Expressions) either.


    Don't be a douche. I'd never seen the switch followed by .test, and
    really have never used a switch in Javascript, so I didn't catch that
    this is what that was. Sue me.


    > Possibility: Non-capturing negative lookahead (borrowed from PCRE, too).
    > RTFM.


    I looked into that before posting, but I'm not sure that (a) I'm doing
    it right, and (b) it's going to do what I'm needing.

    This just returns true on everything:

    booleanResult = /(?!<img[^>]+mydomain\.com[^>]*>)/gi.test(comment);


    This returns false if there's only one <img...> tag that doesn't
    contain mydomain.com, but if I have multiple tags then it returns true
    if any of them do not contain mydomain.com:

    booleanResult = /(?=<img[^>]+mydomain\.com[^>]*>)/gi.test(comment);

    Which means that it would return this as false:

    var comment = "Test <img src='http://www.yahoo.com/logo.gif'>";

    But this as true:

    var comment = "Test <img src='http://www.mydomain.com/logo.gif'><br>Test <img src='http://www.yahoo.com/logo.gif'>";


    I need it to return false if ANY of the instances existed that didn't
    contain mydomain.com.
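
One way to get that behaviour without lookaheads is to do it in two steps: collect every "<img...>" tag, then require that each one mentions the domain. This is only a sketch. The function name is invented here, and treating a comment with no images at all as acceptable is an assumption.

```javascript
// Hypothetical helper: true only if every <img ...> tag mentions mydomain.com.
function allImagesFromMyDomain(comment) {
  // match() with the g flag returns all tags, or null when there are none.
  const tags = comment.match(/<img[^>]*>/gi) || [];
  // every() is true for an empty array, so "no images" passes (an assumption).
  return tags.every(function (tag) {
    return /mydomain\.com/i.test(tag);
  });
}

const onlyYahoo = "Test <img src='http://www.yahoo.com/logo.gif'>";
const mixed =
  "Test <img src='http://www.mydomain.com/logo.gif'><br>" +
  "Test <img src='http://www.yahoo.com/logo.gif'>";

console.log(allImagesFromMyDomain(onlyYahoo)); // false
console.log(allImagesFromMyDomain(mixed));     // false
```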
    Jason Carlton, Dec 7, 2009
    #5
  6. Jason Carlton

    abozhilov Guest

    Re: Does string contain A, and if so, does a section of string contain B

    Evertjan. wrote:

    > var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)


    [^>]+

    `+` is greedy, so the engine has to backtrack after it runs up to `>`.
    You can see this in RegexBuddy with the string:

    <img src="mydomain.com" alt="" /> => the regex engine takes 66 steps
    before it matches.

    If you make the plus lazy:

    <img[^>]+?mydomain\.com[^>]*> => 30 steps
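
The step counts come from RegexBuddy and are not reproduced here. The sketch below only checks that, for this sample string, the greedy and lazy patterns end up with the same match, so the change affects how much work the engine does and not the result. The variable names are made up for illustration.

```javascript
const tag = '<img src="mydomain.com" alt="" />';

const greedy = /<img[^>]+mydomain\.com[^>]*>/i;
const lazy = /<img[^>]+?mydomain\.com[^>]*>/i;

// match() without the g flag returns an array whose first entry is the match.
const greedyMatch = tag.match(greedy)[0];
const lazyMatch = tag.match(lazy)[0];

console.log(greedyMatch === lazyMatch);
```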

    Regards.
    abozhilov, Dec 7, 2009
    #6
  7. Jason Carlton

    Csaba Gabor Guest

    Re: Does string contain A, and if so, does a section of string contain B

    On Dec 7, 9:10 am, Jason Carlton <> wrote:
    ....
    > Which means that it would return this as false:
    >
    > var comment = "Test <img src='http://www.yahoo.com/logo.gif'>";
    >
    > But this as true:
    >
    > var comment = "Test <img src='http://www.mydomain.com/logo.gif'><br>Test <img src='http://www.yahoo.com/logo.gif'>";
    >
    > I need it to return false if ANY of the instances existed that didn't
    > contain mydomain.com.


    I would try something like:
    if (!(1 + comment.replace(
            /<img[^>]+?mydomain\.com[^>]*?>/gi, "<img>")
            .search(/<img[^>]+?>/i)))
        alert("all have mydomain.com");
    else
        alert("non mydomain.com detected");

    That first replace is for degenerate cases of <img> in the string.
    The second replace replaces all properly formed <img ...> elements
    with a dummy element. The search then checks for any rogue
    elements still left.

    However, what about the case of something like:
    <img src='otherdomain.com' title='<img src="mydomain.com">'>
    Everything discussed so far will fail on that - a
    broader approach is necessary if you want to protect
    against more complicated strings.
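
A quick sketch of that failure, using the pattern from earlier in the thread and the replace/search check above. The string is the example just given (spelled here with "otherdomain.com"); the variable names are assumptions.

```javascript
// The image really points at otherdomain.com; mydomain.com only
// appears inside the title attribute.
const tricky =
  "<img src='otherdomain.com' title='<img src=\"mydomain.com\">'>";

// Evertjan's pattern: [^>]+ happily runs through the title text.
const simpleSaysOurs = /<img[^>]+mydomain\.com[^>]*>/i.test(tricky);

// The replace/search check: -1 from search() means "no rogue tag found".
const rogueIndex = tricky
  .replace(/<img[^>]+?mydomain\.com[^>]*?>/gi, "<img>")
  .search(/<img[^>]+?>/i);

console.log(simpleSaysOurs, rogueIndex);
```

Both checks accept the string, even though the actual image source is a different domain.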

    Csaba Gabor from Vienna
    Csaba Gabor, Dec 7, 2009
    #7
  8. Jason Carlton

    Evertjan. Guest

    Thomas 'PointedEars' Lahn wrote on 07 dec 2009 in comp.lang.javascript:

    >> > var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)

    >
    > That is not equivalent to what you are doing in Perl above, though.
    > Incidentally, you should not assume people know other languages than
    > those discussed in the target newsgroup, although it is often the
    > case. When in doubt, explain what the code in the other language
    > does.


    Indeed, I don't know a perl from a swine.

    >> Awesome! Thanks, Evertjan, that is easy. I couldn't find anything on
    >> the i.test() function you used, though.

    >
    > It is _not_ the i.test() function. The `i' (case-*i*nsensitive)
    > belongs to the RegExp literal, like in Perl. I am getting the idea
    > here that you do not know Perl (and Perl-compatible Regular
    > Expressions) either.
    >
    >> Is there a different name for that function?

    >
    > Any name you want to give it. The property name stands for a
    > reference to a Function object; that object can have any number of
    > references to it. (However, it is required here that the base object
    > of the reference is a RegExp instance).
    >
    >> Similarly, how do I do the opposite and test if any of the "<img...>"
    >> tags do NOT contain mydomain.com?

    >
    > Possibility: Non-capturing negative lookahead (borrowed from PCRE,
    > too). RTFM.


    No lookahead needed,
    if "none" of the tags is meant.

    var invertedBooleanResult = !/<img[^>]+mydomain\.com[^>]*>/i.test(str)


    --
    Evertjan.
    The Netherlands.
    (Please change the x'es to dots in my emailaddress)
    Evertjan., Dec 7, 2009
    #8
  9. Jason Carlton wrote:

    >> I presume this can be done better in Perl, too.

    >
    > TIMTOWTDI.
    >
    >> It is _not_ the i.test() function. The `i' (case-*i*nsensitive) belongs
    >> to the RegExp literal, like in Perl. I am getting the idea here that you
    >> do not know Perl (and Perl-compatible Regular Expressions) either.

    >
    > Don't be a douche. I'd never seen the switch followed by .test, and
    > really have never used a switch in Javascript, so I didn't catch that
    > this is what that was. Sue me.
    >
    >
    >> Possibility: Non-capturing negative lookahead (borrowed from PCRE, too).
    >> RTFM.

    >
    > I looked into that before posting, but I'm not sure that (a) I'm doing
    > it right, and (b) it's going to do what I'm needing.


    That's too bad.


    Score adjusted

    PointedEars
    Thomas 'PointedEars' Lahn, Dec 7, 2009
    #9
  10. Re: Does string contain A, and if so, does a section of string contain B

    Csaba Gabor wrote:
    > I would try something like:
    > if (!(1+comment.replace(
    >     /<img[^>]+?mydomain\.com[^>]*?>/gi,"<img>").
    >     search(/<img[^>]+?>/i)))
    >      alert ("all have mydomain.com");
    > else alert ("non mydomain.com detected");
    >
    > That first replace is for degenerate cases of <img> in the string.
    > The second replace replaces all properly formed <img ...> elements
    > with a dummy element.  The search then checks for any rogue
    > elements still left.


    Interesting, but your approach takes two passes over the input
    string before it is completely analyzed.
    What about this one:

    /<img(?:(?!mydomain\.com)[^>])+?>/i;

    It will match the first image which doesn't contain "mydomain.com".
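
A short sketch of that pattern in use. The sample strings are borrowed from earlier in the thread and the variable names are invented for illustration.

```javascript
// Each character inside the tag is consumed only if "mydomain.com"
// does not start at that position.
const noDomain = /<img(?:(?!mydomain\.com)[^>])+?>/i;

const mixed =
  "Test <img src='http://www.mydomain.com/logo.gif'><br>" +
  "Test <img src='http://www.yahoo.com/logo.gif'>";
const onlyOurs = "Test <img src='http://www.mydomain.com/logo.gif'>";

// match() returns null when no offending tag exists.
const firstRogue = mixed.match(noDomain);
const rogueInOurs = onlyOurs.match(noDomain);

console.log(firstRogue && firstRogue[0]);
console.log(rogueInOurs);
```

Negating `noDomain.test(comment)` then gives the "every image is from mydomain.com" answer asked for earlier. Note that a bare `<img>` with no attributes is not matched, because the group must consume at least one character.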

    Regards ;~)
    Asen Bozhilov, Dec 7, 2009
    #10
  11. Re: Does string contain A, and if so, does a section of string contain B

    On Dec 7, 4:59 am, abozhilov <> wrote:
    > Evertjan. wrote:
    > > var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)

    >
    > [^>]+
    >
    > `+` is greedy, so the engine has to backtrack after it runs up to `>`.
    > You can see this in RegexBuddy with the string:
    >
    > <img src="mydomain.com" alt="" /> => the regex engine takes 66 steps
    > before it matches.
    >
    > If you make the plus lazy:
    >
    > <img[^>]+?mydomain\.com[^>]*> => 30 steps
    >
    > Regards.



    Thanks to all of you! This really helped a lot.

    - Jason
    Jason Carlton, Dec 8, 2009
    #11
  12. In comp.lang.javascript message
    <bed20c49-b2fd-4f53-ade1-299b64ede909@g31g2000vbr.googlegroups.com>,
    Sun, 6 Dec 2009 15:16:35, Jason Carlton <> posted:
    >
    >I'm wanting to check a textarea to see if it contains "<img", and if
    >so, does the section between "<img" and the following ">" contain
    >"mydomain.com".
    >
    >This is particularly tricky since there can be more than one
    >"<img...>" in the field.



    Under such circumstances, and particularly if you are not fully familiar
    with all the features of the latest JavaScript RegExps, it may help to
    tackle the problem in more than one pass.

    In this case, consider first replacing all "<img" with a single
    character that is not in the string already (Unicode offers tens of
    thousands). You can then more easily express the condition that a
    substring must not contain the consecutive characters < i m g .
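
One possible reading of that suggestion, as a rough sketch. The choice of U+0001 as the marker, the function name, and the shape of the returned object are all assumptions made for illustration.

```javascript
// Assumption: U+0001 never occurs in the input, so it can stand in for "<img".
const MARK = "\u0001";

function countImgTags(comment) {
  // Pass 1: collapse every "<img" into the single marker character.
  const marked = comment.replace(/<img/gi, MARK);
  // Pass 2: "no further <img before the closing >" is now a negated class.
  const all = marked.match(/\u0001[^>\u0001]*>/g) || [];
  const ours = all.filter(function (tag) {
    return /mydomain\.com/i.test(tag);
  });
  return { total: all.length, fromMyDomain: ours.length };
}

const counts = countImgTags(
  "Test <img src='http://www.mydomain.com/logo.gif'><br>" +
  "Test <img src='http://www.yahoo.com/logo.gif'>"
);
// All images are from mydomain.com only when the two counts agree.
console.log(counts.total === counts.fromMyDomain);
```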

    --
    (c) John Stockton, nr London, UK. ?@merlyn.demon.co.uk Turnpike v6.05 IE 7.
    Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
    I find MiniTrue useful for viewing/searching/altering files, at a DOS prompt;
    free, DOS/Win/UNIX, <URL:http://www.idiotsdelight.net/minitrue/> unsupported.
    Dr J R Stockton, Dec 8, 2009
    #12