Regular expression woes

Mark (News)

I'm not really sure where to post this question as it covers so many
platforms, but as the platform isn't relevant, here goes...

I'm trying to (pulling my hair out more like) construct a regular
expression string that says the following: "match if the input string
does not start with the characters http". For example:

"this string" - match
"this http string" - match
"http-and-a-bit-more-text" - no match
"ht" - match
"" - match

I've tried something like ^[^(^http)] but this gives no match on the
last 2. Any ideas? - I'd really appreciate it!
Cheers
Mark
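
A small sketch of why the attempted pattern behaves this way, written in JavaScript only because the thread was posted to comp.lang.javascript. Inside square brackets the parentheses and the inner ^ lose their special meaning, so ^[^(^http)] asks for one leading character that is not any of ( ^ h t p. The sample strings are the ones from the question, plus "some string", which is an added example.

```javascript
// [^(^http)] is a single negated character class: it matches exactly one
// character that is not one of ( ^ h t p. It never looks at four characters.
const attempt = /^[^(^http)]/;

const attemptSamples = [
  "this string",               // starts with "t", which the class excludes
  "this http string",          // same first character
  "http-and-a-bit-more-text",  // starts with "h"
  "ht",                        // starts with "h"
  "",                          // no character at all for the class to consume
  "some string",               // added example: "s" is allowed by the class
];

for (const s of attemptSamples) {
  console.log(JSON.stringify(s), attempt.test(s));
}
```

Only the added "some string" sample is accepted, which suggests the class approach cannot express "does not start with this word".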
 
Paul Lalli

Mark (News) said:
I'm not really sure where to post this question as it covers so many
platforms, but as the platform isn't relevant, here goes...

Incorrect. The platform is exceedingly relevant. Regular expressions
are not a constant across languages. Perl regular expressions are not
the same as Javascript regular expressions are not the same as PHP
regular expressions.

Choose one or the other, tell us what you're *trying* to do, and in what
environment you're doing it, and then someone can help you.

Paul Lalli
 
Leendert Bottelberghs

I'm trying to (pulling my hair out more like) construct a regular
expression string that says the following: "match if the input string
does not start with the characters http". E.g.

e.g.
"this string" - match
"this http string" - match
"http-and-a-bit-more-text" - no match
"ht" - match
"" - match

So don't match if the string starts with "http":

$str !~ m/^http/


-leendert bottelberghs
 
ioneabu

Mark said:
I'm not really sure where to post this question as it covers so many
platforms, but as the platform isn't relevant, here goes...

I'm trying to (pulling my hair out more like) construct a regular
expression string that says the following: "match if the input string
does not start with the characters http". E.g.

wouldn't it be:

$match !~ m/^http/;

Is there an equivalent negation metacharacter for a word and not just a
character class? I was just wondering about that.

wana
 
Chris Mattern

Mark said:
I'm not really sure where to post this question as it covers so many
platforms, but as the platform isn't relevant, here goes...

I'm trying to (pulling my hair out more like) construct a regular
expression string that says the following: "match if the input string
does not start with the characters http". E.g.

e.g.
"this string" - match
"this http string" - match
"http-and-a-bit-more-text" - no match
"ht" - match
"" - match

I've tried something like ^[^(^http)] but this gives no match on the
last 2. Any ideas? - I'd really appreciate it!
Cheers
Mark

Use the "does not match" operator, !~.

if ($my_string !~ /^http/) {
    do_something();
}

If you're not using perl, well I guess your platform *is* relevant...
--
Christopher Mattern

"Which one you figure tracked us?"
"The ugly one, sir."
"...Could you be more specific?"
 
Sherm Pendley

Paul said:
Incorrect. The platform is exceedingly relevant. Regular expressions
are not a constant across languages. Perl regular expressions are not
the same as Javascript regular expressions are not the same as PHP
regular expressions.

Also, what you're trying to do - negate a match condition - is often easier
to do in the host language than in the regex itself. For example, in Perl
you could do what you asked with this:

if ($some_string !~ /^http/) { ... }
# or
unless (/^http/) { ... }

But that just reinforces Paul's point - the platform is very relevant.

sherm--
 
Mark (News)

I appreciate all the effort in providing a solution to the wider
problem, but perhaps I should have been more explicit - my fault.

I'm specifically trying to avoid using the host shell to do the
negation even though I can use this approach in just about any
language. What I'm really after is to contain the logic entirely within
the regular expression.

Why? Intellectual exercise. :) (Kind of like why people climb
mountains, but without having to take my butt off the chair.)

Cheers
Mark
 
Evertjan.

Mark (News) wrote on 04 feb 2005 in comp.lang.javascript:
I'm not really sure where to post this question as it covers so many
platforms, but as the platform isn't relevant, here goes...

I'm trying to (pulling my hair out more like) construct a regular
expression string that says the following: "match if the input string
does not start with the characters http". E.g.

e.g.
"this string" - match
"this http string" - match
"http-and-a-bit-more-text" - no match
"ht" - match
"" - match

In JavaScript the method to use is test(), not match():

var s = "this http string";

if (!/^http/.test(s)) {
  alert("Match!");
} else {
  alert("No match!");
}
 
Rasto Levrinc

Mark said:
I appreciate all the effort in providing a solution to the wider
problem, but perhaps I should have been more explicit - my fault.

I'm specifically trying to avoid using the host shell to do the
negation even though I can use this approach in just about any
language. What I'm really after is to contain the logic entirely within
the regular expression.

You can do it with a zero-width negative look-ahead assertion in perl.

$string=~/^(?!http)/
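
The same look-ahead can be tried in JavaScript engines that support it. This is a minimal sketch, assuming such an engine, run against the five sample strings from the original question. The names notHttp and lookaheadCases are illustrative only.

```javascript
// ^ anchors at the start; (?!http) succeeds only if "http" does NOT follow.
// The assertion is zero-width, so the pattern consumes no characters.
const notHttp = /^(?!http)/;

// [input, outcome requested in the original question]
const lookaheadCases = [
  ["this string", true],
  ["this http string", true],
  ["http-and-a-bit-more-text", false],
  ["ht", true],
  ["", true],
];

for (const [input, expected] of lookaheadCases) {
  const actual = notHttp.test(input);
  console.log(JSON.stringify(input), actual, actual === expected ? "ok" : "unexpected");
}
```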
 
Dietmar Meier

You can do it with a zero-width negative look-ahead assertion in perl.

$string=~/^(?!http)/

Some JavaScript implementations implement regular expressions but
don't implement look-ahead assertions. Here you would need

/^([^h]ttp.*|h[^t]tp.*|ht[^t]p|htt[^p].*|.{0,3})$/.test(string)

ciao, dhgm
 
Evertjan.

Dietmar Meier wrote on 04 feb 2005 in comp.lang.javascript:
You can do it with a zero-width negative look-ahead assertion in perl.

$string=~/^(?!http)/

Some JavaScript implementations implement regular expressions but
don't implement look-ahead assertions. Here you would need

/^([^h]ttp.*|h[^t]tp.*|ht[^t]p|htt[^p].*|.{0,3})$/.test(string)

[The $ cannot be right, I think.]

r = /^(([^h]...)|(.[^t]..)|(..[^t].)|(...[^p]))/.test(s)
 
Evertjan.

Dietmar Meier wrote on 04 feb 2005 in comp.lang.javascript:
Evertjan. said:
/^([^h]ttp.*|h[^t]tp.*|ht[^t]p|htt[^p].*|.{0,3})$/.test(string)
[The $ cannot be right, I think.]

For what value of string do you think, the "$" would lead to the
wrong result?

"xttp://" should return true
"http://" should return false

Yes, you are right here.
r = /^(([^h]...)|(.[^t]..)|(..[^t].)|(...[^p]))/.test(s)

This would not match strings with 3 or less characters.

Yes, you are right again.

Let me try:

r = /^(([^h]...)|(.[^t]..)|(..[^t].)|(...[^p])|(.{0,3}$))/.test(s)

[I could lose some () but I like them for clarity]
 
Grant Wagner

Dietmar Meier said:
You can do it with a zero-width negative look-ahead assertion in
perl.

$string=~/^(?!http)/

Some JavaScript implementations implement regular expressions but
don't implement look-ahead assertions. Here you would need

/^([^h]ttp.*|h[^t]tp.*|ht[^t]p|htt[^p].*|.{0,3})$/.test(string)

Why do people insist on doing things the hardest way possible? Test for
the condition you don't want, then negate it.

if (!/^http/i.test(some_string)) { ... }

By the way, this is pretty much the same solution already provided for
Perl:

if ($some_string !~ /^http/) { ... }

(although I chose to make it case-insensitive, since the protocol in a
URI isn't case-sensitive, it could be upper, lower or mixed case)
 
Mark (News)

"Why do people insist on doing things the hardest way possible."? Well,
as I said in an earlier post, I wanted to do the whole thing within a
regex rather than resorting to the shell. Mainly because, crazy as it
sounds, it's a fun intellectual exercise. :) And anyway, if I always
take the path of least resistance, I'll never learn, right? (But I
guess that's OT.)
 
Richards Noah (IFR LIT MET)

Evertjan. said:
Dietmar Meier wrote on 04 feb 2005 in comp.lang.javascript:
Evertjan. said:
/^([^h]ttp.*|h[^t]tp.*|ht[^t]p|htt[^p].*|.{0,3})$/.test(string)
[The $ cannot be right, I think.]

For what value of string do you think, the "$" would lead to the
wrong result?

"xttp://" should return true
"http://" should return false

Yes, you are right here.
r = /^(([^h]...)|(.[^t]..)|(..[^t].)|(...[^p]))/.test(s)

This would not match strings with 3 or less characters.

Yes, you are right again.

Let me try:

r = /^(([^h]...)|(.[^t]..)|(..[^t].)|(...[^p])|(.{0,3}$))/.test(s)

[I could lose some () but I like them for clarity]

None of those regular expressions will work. For example, your regexp will
not match against "this string", since it differs in 4 places in the first 4
characters.

You cannot negate a string by negating each character. If you really wanted
to do it in that way, you would have to negate all possible combinations of
letters in "http". So, just for fun, it would look something like this
(newlines added for clarity):

/^(
([^h][^t][^t][^p])|

(h[^t][^t][^p])|
([^h]t[^t][^p])|
([^h][^t]t[^p])|
([^h][^t][^t]p)|

(ht[^t][^p])|
(h[^t]t[^p])|
(h[^t][^t]p)|
([^h]tt[^p])|
([^h]t[^t]p)|
([^h][^t]tp)|

(htt[^p])|
(ht[^t]p)|
(h[^t]tp)|
([^h]ttp)|

(.{0,3}$)

)/

The moral of this story: "negating" a string in regular expressions is very,
very ugly (without negative look ahead). Your best bet, as many others have
mentioned, is to do something akin to perl's !~, i.e. match against ^http,
and consider matches to be, well, not matches.
 
Evertjan.

Richards Noah (IFR LIT MET) wrote on 04 feb 2005 in
comp.lang.javascript:
r = /^(([^h]...)|(.[^t]..)|(..[^t].)|(...[^p])|(.{0,3}$))/.test(s)

[I could lose some () but I like them for clarity]

None of those regular expressions will work. For example, your regexp
will not match against "this string", since it differs in 4 places in
the first 4 characters.

s = "this string"
r = /^(([^h]...)|(.[^t]..)|(..[^t].)|(...[^p])|(.{0,3}$))/.test(s)
alert(r)

shows: true as per OQ.

So what is the problem?

Please show a string that does not work.
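
One way to settle the disagreement is to run the posted patterns against the five sample strings from the original question. The sketch below copies both patterns verbatim from the posts above. The helper failures and the table wanted are hypothetical names introduced only for this check.

```javascript
// The five inputs from the original question, with the outcome Mark asked for.
const wanted = [
  ["this string", true],
  ["this http string", true],
  ["http-and-a-bit-more-text", false],
  ["ht", true],
  ["", true],
];

// Patterns copied verbatim from the posts above.
const dietmar = /^([^h]ttp.*|h[^t]tp.*|ht[^t]p|htt[^p].*|.{0,3})$/;
const evertjan = /^(([^h]...)|(.[^t]..)|(..[^t].)|(...[^p])|(.{0,3}$))/;

// Return the inputs for which a pattern disagrees with the wanted outcome.
function failures(pattern) {
  return wanted
    .filter(([input, expected]) => pattern.test(input) !== expected)
    .map(([input]) => input);
}

console.log("dietmar fails on:", failures(dietmar));
console.log("evertjan fails on:", failures(evertjan));
```

On these five inputs the first pattern rejects "this string" and "this http string", because each alternative still requires three of the four letters of "http" to be present. The second pattern uses "." for the other positions and gives the requested result on all five.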
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to Evertjan.]
r = /^(([^h]...)|(.[^t]..)|(..[^t].)|(...[^p])|(.{0,3}$))/.test(s)

Too much work.

[^h]
| h[^t]
| ht[^t]
| htt[^p]
| .{0,3}$

Hope this helps,
Ilya
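
The alternation above is written as a fragment, with whitespace for readability. A minimal JavaScript sketch of it follows, assuming it is meant to be anchored at the start of the string. The leading ^ and the non-capturing group (?: ) are additions made here so that the anchor applies to every branch.

```javascript
// Each branch matches the longest prefix of "http" that the input shares,
// followed by one character that breaks the match. The last branch accepts
// any input shorter than four characters, which cannot start with "http".
const noHttpPrefix = /^(?:[^h]|h[^t]|ht[^t]|htt[^p]|.{0,3}$)/;

const prefixCases = [
  ["this string", true],
  ["this http string", true],
  ["http-and-a-bit-more-text", false],
  ["ht", true],
  ["", true],
];

for (const [input, expected] of prefixCases) {
  const actual = noHttpPrefix.test(input);
  console.log(JSON.stringify(input), actual, actual === expected ? "ok" : "unexpected");
}
```

Compared with the four-character versions earlier in the thread, each branch here stops at the first differing character, so nothing needs to be said about the characters after it.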
 
Mark (News)

Is it true that if zero-width negative look-ahead is not available,
there is always an alternative regex to do the job?
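
For patterns that are regular in the formal sense, the answer should be yes in principle, since regular languages are closed under complement, though the look-ahead-free version can be far larger and harder to read. For the narrower case in this thread, "does not start with a fixed literal", the alternation can be generated mechanically. The sketch below is one way to do that. notStartingWith and escapeRegex are hypothetical helpers, and a non-empty literal prefix is assumed.

```javascript
// Escape characters that are special in a JavaScript regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a look-ahead-free pattern that matches any string NOT starting
// with the given literal prefix (assumed non-empty).
function notStartingWith(prefix) {
  if (prefix.length === 0) {
    throw new Error("prefix must be non-empty: every string starts with ''");
  }
  const chars = prefix.split("");
  // Branch i: the first i characters agree with the prefix, character i differs.
  const branches = chars.map(
    (ch, i) => chars.slice(0, i).map(escapeRegex).join("") + "[^" + escapeRegex(ch) + "]"
  );
  // Final branch: the whole string is shorter than the prefix.
  branches.push("[\\s\\S]{0," + (chars.length - 1) + "}$");
  return new RegExp("^(?:" + branches.join("|") + ")");
}

const generated = notStartingWith("http");
console.log(generated.source);
console.log(generated.test("this http string"), generated.test("http-and-a-bit-more-text"));
```

The final branch uses [\s\S] rather than "." so that short strings containing a newline are still accepted.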
 
