Does string contain A, and if so, does a section of string contain B

Jason Carlton · Dec 6, 2009

Tricky subject, sorry.

I'm wanting to check a textarea to see if it contains "<img", and if
so, does the section between "<img" and the following ">" contain
"mydomain.com".

This is particularly tricky since there can be more than one
"<img...>" in the field.

I can do this in Perl easily enough:

while ($comment =~ /(<img[^>]+?>)/sgxi) {
    if ($1 =~ /mydomain\.com/gi) {
        # do whatever
    }
}

But how do I create something similar in Javascript?

TIA,

Jason

Evertjan. · Dec 6, 2009

Jason Carlton wrote on 07 dec 2009 in comp.lang.javascript:

Tricky subject, sorry.

No it is not.

I'm wanting to check a textarea to see if it contains "<img", and if
so, does the section between "<img" and the following ">" contain
"mydomain.com".

This is particularly tricky since there can be more than one
"<img...>" in the field.

I can do this in Perl easily enough:

while ($comment =~ /(<img[^>]+?>)/sgxi) {
if ($1 =~ /mydomain\.com/gi) {
# do whatever
}
}

Do you think that is easy, look at javascript!

But how do I create something similar in Javascript?

var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)

Jason Carlton · Dec 6, 2009

Jason Carlton wrote on 07 dec 2009 in comp.lang.javascript:

Tricky subject, sorry.

No it is not.

I'm wanting to check a textarea to see if it contains "<img", and if
so, does the section between "<img" and the following ">" contain
"mydomain.com".

This is particularly tricky since there can be more than one
"<img...>" in the field.

I can do this in Perl easily enough:

while ($comment =~ /(<img[^>]+?>)/sgxi) {
if ($1 =~ /mydomain\.com/gi) {
# do whatever
}
}

Do you think that is easy, look at javascript!

But how do I create something similar in Javascript?

var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)

Awesome! Thanks, Evertjan, that is easy. I couldn't find anything on
the i.test() function you used, though. Is there a different name for
that function?

Similarly, how do I do the opposite and test if any of the "<img...>"
tags do NOT contain mydomain.com?

Thomas 'PointedEars' Lahn · Dec 7, 2009

Jason said:
Evertjan. said:

Jason said:

I can do this in Perl easily enough:

while ($comment =~ /(<img[^>]+?>)/sgxi) {
if ($1 =~ /mydomain\.com/gi) {
# do whatever
}
}

I presume this can be done better in Perl, too.

[...]
var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)

That is not equivalent to what you are doing in Perl above, though.
Incidentally, you should not assume people know other languages than those
discussed in the target newsgroup, although it is often the case. When in
doubt, explain what the code in the other language does.
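For readers who do not read Perl: the loop quoted above visits every "<img...>" tag in $comment and runs the inner block for each tag that contains "mydomain.com". A rough JavaScript counterpart, offered only as a sketch, uses exec() on a regex with the `g` flag, which resumes from lastIndex on every call. The function name imgTagsMatching and the choice to return an array are illustrative assumptions, not something posted in this thread.

```javascript
// Collect every <img...> tag whose text contains "mydomain.com".
// imgTagsMatching is a hypothetical helper name used for illustration.
function imgTagsMatching(comment) {
  var tagRe = /<img[^>]+?>/gi; // "g" lets exec() walk through the string
  var found = [];
  var m;
  while ((m = tagRe.exec(comment)) !== null) {
    // m[0] plays the role of Perl's $1 here: the whole tag text
    if (/mydomain\.com/i.test(m[0])) {
      found.push(m[0]); // "do whatever" would go here
    }
  }
  return found;
}
```

Calling it on a string with one matching and one non-matching tag should return an array holding only the matching tag.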

Awesome! Thanks, Evertjan, that is easy. I couldn't find anything on
the i.test() function you used, though.

It is _not_ the i.test() function. The `i' (case-*i*nsensitive) belongs to
the RegExp literal, like in Perl. I am getting the idea here that you do
not know Perl (and Perl-compatible Regular Expressions) either.

Is there a different name for that function?

Any name you want to give it. The property name stands for a reference to a
Function object; that object can have any number of references to it.
(However, it is required here that the base object of the reference is a
RegExp instance).
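To make the point about the flag concrete, here is a small sketch; the sample string and variable names are made up for illustration. The literal /.../i is a RegExp object created with the `i` flag, and test() is a method called on that object. The same object can be built with the RegExp constructor, where the flag is passed as a separate string.

```javascript
// Sample input (illustrative only): note the upper-case tag name.
var str = "<IMG SRC='http://www.mydomain.com/logo.gif'>";

// Literal form: the trailing "i" is a flag of the literal, not part of test().
var viaLiteral = /<img[^>]+mydomain\.com[^>]*>/i.test(str);

// Constructor form: same pattern, flag given as the second argument.
// Backslashes must be doubled inside a string literal.
var re = new RegExp("<img[^>]+mydomain\\.com[^>]*>", "i");
var viaConstructor = re.test(str);

// Without the flag the match is case-sensitive, so "<IMG" is not found.
var caseSensitive = /<img[^>]+mydomain\.com[^>]*>/.test(str);
```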

Similarly, how do I do the opposite and test if any of the "<img...>"
tags do NOT contain mydomain.com?

Possibility: Non-capturing negative lookahead (borrowed from PCRE, too).
RTFM.

PointedEars

Jason Carlton · Dec 7, 2009

I presume this can be done better in Perl, too.

TIMTOWTDI.

It is _not_ the i.test() function. The `i' (case-*i*nsensitive) belongs to
the RegExp literal, like in Perl. I am getting the idea here that you do
not know Perl (and Perl-compatible Regular Expressions) either.

Don't be a douche. I'd never seen the switch followed by .test, and
really have never used a switch in Javascript, so I didn't catch that
this is what that was. Sue me.

Possibility: Non-capturing negative lookahead (borrowed from PCRE, too).
RTFM.

I looked into that before posting, but I'm not sure that (a) I'm doing
it right, and (b) it's going to do what I'm needing.

This just returns true on everything:

booleanResult = /(?!<img[^>]+mydomain\.com[^>]*>)/gi.test(comment);

This returns false if there's only one <img...> tag that doesn't
contain mydomain.com, but if I have multiple tags then it returns true
if any of them do not contain mydomain.com:

booleanResult = /(?=<img[^>]+mydomain\.com[^>]*>)/gi.test(comment);

Which means that it would return this as false:

var comment = "Test <img src='http://www.yahoo.com/logo.gif'>";

But this as true:

var comment = "Test <img src='http://www.mydomain.com/logo.gif'><br>Test <img src='http://www.yahoo.com/logo.gif'>";

I need it to return false if ANY of the instances existed that didn't
contain mydomain.com.
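A possible explanation for the two results, sketched with made-up variable names: a regex that consists only of a negative lookahead matches at any position where the inner pattern does not start, and such a position always exists (the end of the string, if nothing else). A regex that consists only of a positive lookahead answers the same yes/no question as the plain pattern. The `g` flag is left out below so that test() always starts at index 0.

```javascript
// The tag pattern from the thread, wrapped three ways.
var plain    = /<img[^>]+mydomain\.com[^>]*>/i;
var negative = /(?!<img[^>]+mydomain\.com[^>]*>)/i;
var positive = /(?=<img[^>]+mydomain\.com[^>]*>)/i;

// Illustrative inputs.
var onlyMine  = "<img src='http://www.mydomain.com/logo.gif'>";
var onlyOther = "Test <img src='http://www.yahoo.com/logo.gif'>";

// The bare negative lookahead succeeds as soon as it finds one position
// where the tag pattern does not begin, so it is true for any input.
var negOnMine  = negative.test(onlyMine); // true
var negOnEmpty = negative.test("");       // true

// The bare positive lookahead agrees with the plain pattern.
var posAgrees = positive.test(onlyOther) === plain.test(onlyOther); // true
```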

abozhilov · Dec 7, 2009

Evertjan. said:
var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)

[^>]+

+ is greedy, so the engine has to backtrack once it reaches `>`. You
can see this in RegexBuddy with the string:

<img src="mydomain.com" alt="" /> => the regex engine takes 66 steps
before it matches.

If you make the plus lazy:

<img[^>]+?mydomain\.com[^>]*> => 30 steps

Regards.

Csaba Gabor · Dec 7, 2009

On Dec 7, Jason Carlton said:
Which means that it would return this as false:

var comment = "Test <img src='http://www.yahoo.com/logo.gif'>";

But this as true:

var comment = "Test <img src='http://www.mydomain.com/logo.gif'><br>Test <img src='http://www.yahoo.com/logo.gif'>";

I need it to return false if ANY of the instances existed that didn't
contain mydomain.com.

I would try something like:
if (!(1 + comment.replace(
        /<img[^>]+?mydomain\.com[^>]*?>/gi, "<img>").
        search(/<img[^>]+?>/i)))
    alert("all have mydomain.com");
else
    alert("non mydomain.com detected");

That first replace is for degenerate cases of <img> in the string.
The second replace replaces all properly formed <img ...> elements
with a dummy element. The search then checks for any rogue
elements still left.

However, what about the case of something like:
<img src='otherdomain.com' title='<img src="mydomain.com">'>
Everything discussed so far will fail on that - a
broader approach is necessary if you want to protect
against more complicated strings.
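To illustrate the failure mode described above, here is a short sketch; the variable names and the host name "otherdomain.com" are placeholders. Every pattern in this thread treats the first ">" after "<img" as the end of the tag, so a ">" inside a quoted attribute value cuts the tag short, and text inside the title attribute is taken for part of the tag.

```javascript
// An <img> whose title attribute contains something that looks like a tag.
var tricky = "<img src='otherdomain.com' title='<img src=\"mydomain.com\">'>";

// The simple check reports a hit, although the real src is another host.
var containsDomain = /<img[^>]+mydomain\.com[^>]*>/i.test(tricky); // true

// The tag-extracting pattern stops at the ">" inside the title attribute.
var firstTag = /<img[^>]+?>/i.exec(tricky)[0];
// firstTag is: <img src='otherdomain.com' title='<img src="mydomain.com">
```

Handling input like this reliably would need something that understands quoted attribute values, which a single pattern of this kind does not.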

Csaba Gabor from Vienna

Evertjan. · Dec 7, 2009

Thomas 'PointedEars' Lahn wrote on 07 dec 2009 in comp.lang.javascript:

var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)

That is not equivalent to what you are doing in Perl above, though.
Incidentally, you should not assume people know other languages than
those discussed in the target newsgroup, although it is often the
case. When in doubt, explain what the code in the other language
does.

Indeed, I don't know a perl from a swine.

It is _not_ the i.test() function. The `i' (case-*i*nsensitive)
belongs to the RegExp literal, like in Perl. I am getting the idea
here that you do not know Perl (and Perl-compatible Regular
Expressions) either.

Any name you want to give it. The property name stands for a
reference to a Function object; that object can have any number of
references to it. (However, it is required here that the base object
of the reference is a RegExp instance).

Possibility: Non-capturing negative lookahead (borrowed from PCRE,
too). RTFM.

No lookahead needed, if "none" of the tags is meant.

var invertedBooleanResult = !/<img[^>]+mydomain\.com[^>]*>/i.test(str)

Thomas 'PointedEars' Lahn · Dec 7, 2009

Jason said:
Don't be a douche. I'd never seen the switch followed by .test, and
really have never used a switch in Javascript, so I didn't catch that
this is what that was. Sue me.

I looked into that before posting, but I'm not sure that (a) I'm doing
it right, and (b) it's going to do what I'm needing.

That's too bad.

Score adjusted

PointedEars

Asen Bozhilov · Dec 7, 2009

Csaba said:
I would try something like:
if (!(1+comment.replace(
/<img[^>]+?mydomain\.com[^>]*?>/gi,"<img>").
search(/<img[^>]+?>/i)))
alert ("all have mydomain.com");
else alert ("non mydomain.com detected");

That first replace is for degenerate cases of <img> in the string.
The second replace replaces all properly formed <img ...> elements
with a dummy element. The search then checks for any rogue
elements still left.

Interesting. But your approach makes two passes before it has completely
analyzed the input string.
What about this one:

/<img(?:(?!mydomain\.com)[^>])+?>/i;

It will match the first image which doesn't contain "mydomain.com".
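A sketch of how such a pattern could be used, assuming it is meant as a non-capturing group that applies a negative lookahead before each character of the tag. The function name allImagesUseDomain and the sample strings are illustrative assumptions. Because the pattern finds a tag that lacks "mydomain.com", negating test() gives "every tag has it" (which is also true when there are no tags at all).

```javascript
// Matches an <img...> tag in which "mydomain.com" never starts at any
// position between "<img" and the closing ">".
var withoutDomain = /<img(?:(?!mydomain\.com)[^>])+?>/i;

// Hypothetical wrapper: true when no offending tag is found.
function allImagesUseDomain(comment) {
  return !withoutDomain.test(comment);
}

// Illustrative inputs.
var mixed = "Test <img src='http://www.mydomain.com/logo.gif'><br>" +
            "Test <img src='http://www.yahoo.com/logo.gif'>";
var allMine = "<img src='http://www.mydomain.com/a.gif'> " +
              "<img src='http://mydomain.com/b.gif'>";

var offender = withoutDomain.exec(mixed)[0];
// offender is: <img src='http://www.yahoo.com/logo.gif'>
```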

Regards ;~)

Jason Carlton · Dec 8, 2009

abozhilov said:
Evertjan. said:

var booleanResult = /<img[^>]+mydomain\.com[^>]*>/i.test(str)

[^>]+

+ is greedy, so the engine has to backtrack once it reaches `>`. You
can see this in RegexBuddy with the string:

<img src="mydomain.com" alt="" /> => the regex engine takes 66 steps
before it matches.

If you make the plus lazy:

<img[^>]+?mydomain\.com[^>]*> => 30 steps

Regards.

Thanks to all of you! This really helped a lot.

- Jason

Dr J R Stockton · Dec 8, 2009

In comp.lang.javascript message <bed20c49-b2fd-4f53-ade1-299b64ede909@g31g2000vbr.googlegroups.com>,
Sun, 6 Dec 2009 15:16:35, Jason Carlton wrote:
I'm wanting to check a textarea to see if it contains "<img", and if
so, does the section between "<img" and the following ">" contain
"mydomain.com".

This is particularly tricky since there can be more than one
"<img...>" in the field.

Under such circumstances, and particularly if you are not fully familiar
with all the features of the latest JavaScript RegExps, it may help to
tackle the problem in more than one pass.

In this case, consider first replacing all "<img" with a single
character that is not in the string already (Unicode offers tens of
thousands). You can then more easily express the condition that a
substring must not contain the consecutive characters < i m g .
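One way the multi-pass idea might look, as a sketch under stated assumptions: the marker character U+E000 (a private-use code point) and the function name allImagesFromDomain are choices made here for illustration, and the code refuses to run if the marker already occurs in the input. After the first pass, "not containing < i m g" becomes a simple negated character class.

```javascript
// Hypothetical helper: true when every <img...> tag mentions mydomain.com.
function allImagesFromDomain(comment) {
  var MARK = "\uE000"; // assumed not to occur in the input
  if (comment.indexOf(MARK) !== -1) {
    throw new Error("marker character already present in input");
  }
  // Pass 1: collapse every "<img" into the single marker character.
  var marked = comment.replace(/<img/gi, MARK);
  // Pass 2: a tag is the marker, then characters that are neither the
  // marker nor ">", then ">".
  var tagRe = new RegExp(MARK + "[^" + MARK + ">]*>", "g");
  var tags = marked.match(tagRe) || [];
  // Pass 3: check each tag on its own.
  for (var i = 0; i < tags.length; i++) {
    if (!/mydomain\.com/i.test(tags[i])) {
      return false;
    }
  }
  return true;
}
```

Like the single-pattern answers, this still treats the first ">" as the end of a tag, so it does not cope with ">" inside quoted attribute values.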
