Newbie question on regex.


SomeDeveloper

I'm unable to understand the following statement from
http://search.cpan.org/dist/perl/pod/perlretut.pod.

"Because, for example, \d and \w are sets of characters, it is
incorrect to think of [^\d\w] as [\D\W];"

Of course, being subsumed by \w, \d is clearly redundant. But
redundancy aside, aren't [^\d\w] and [\D\W] still mathematically
equivalent?

Tia (for responding to my possible ignorance),
Some Developer.
 

Gunnar Hjalmarsson

SomeDeveloper said:
I'm unable to understand the following statement from
http://search.cpan.org/dist/perl/pod/perlretut.pod.

"Because, for example, \d and \w are sets of characters, it is
incorrect to think of [^\d\w] as [\D\W];"

Of course, being subsumed by \w, \d is clearly redundant. But
redundancy aside, aren't [^\d\w] and [\D\W] still mathematically
equivalent?

No. Since \d is a subset of \w, [^\d\w] is the same as [^\w] which in
turn is the same as \W.
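
A quick way to convince yourself of this is to test both forms against every ASCII character. This is only a minimal sketch; the ASCII range and the variable name @differ are choices made for the illustration, not something taken from perlretut.

```perl
use strict;
use warnings;

# Collect every ASCII code point on which [^\d\w] and \W disagree.
my @differ = grep {
    my $c = chr $_;
    ($c =~ /\A[^\d\w]\z/ ? 1 : 0) != ($c =~ /\A\W\z/ ? 1 : 0);
} 0 .. 127;

print @differ
    ? "the two forms differ on " . scalar(@differ) . " characters\n"
    : "[^\\d\\w] and \\W agree on all of ASCII\n";
```

Because every digit is also a word character, listing \d next to \w inside the negated class adds nothing, so the list of differences should come out empty.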
 

Lars Eighner

In our last episode,
the lovely and talented SomeDeveloper
broadcast on comp.lang.perl.misc:
I'm unable to understand the following statement from
http://search.cpan.org/dist/perl/pod/perlretut.pod.
"Because, for example, \d and \w are sets of characters, it is
incorrect to think of [^\d\w] as [\D\W];"
Of course, being subsumed by \w, \d is clearly redundant. But
redundancy aside, aren't [^\d\w] and [\D\W] still mathematically
equivalent?

No, they are not mathematically equivalent (much less "still").

[\D\W] is all the characters which are not digits plus all the
characters which are not word characters (alphanumerics and
underscore). Every non-word character is also a non-digit, so [\D\W]
matches all the characters except digits (i.e. [^\d] which is the
same as [\D]).

[^\d\w], as you seem to understand, amounts to [^\w] which is
the same as [\W].
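
To see the difference on concrete input, here is a small sketch that tries both classes on four sample characters. The sample characters and the %match hash are picked just for this illustration.

```perl
use strict;
use warnings;

my %match;

# A letter, a digit, an underscore, and a punctuation character.
for my $c ('a', '7', '_', '!') {
    $match{$c} = {
        union   => ($c =~ /\A[\D\W]\z/  ? 1 : 0),  # non-digit OR non-word
        negated => ($c =~ /\A[^\d\w]\z/ ? 1 : 0),  # neither digit nor word
    };
    printf "%-2s [\\D\\W]: %d  [^\\d\\w]: %d\n",
        $c, $match{$c}{union}, $match{$c}{negated};
}
```

The letter and the underscore are matched by [\D\W] but not by [^\d\w], which is enough to show the two classes are different sets. The digit is matched by neither, and the punctuation character by both.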
 

Ben Morrow

Quoth (e-mail address removed) (SomeDeveloper):
I'm unable to understand the following statement from
http://search.cpan.org/dist/perl/pod/perlretut.pod.

"Because, for example, \d and \w are sets of characters, it is
incorrect to think of [^\d\w] as [\D\W];"

Of course, being subsumed by \w, \d is clearly redundant. But
redundancy aside, aren't [^\d\w] and [\D\W] still mathematically
equivalent?

You are being confused by the fact that \d is a subset of \w. In general
(say, \w and \s) [\S\W] means 'not a space OR not a wordchar' (and will
match everything, since no character is both) whereas [^\s\w] means
'not a space AND not a wordchar'. As you have (half-)realised, in the
case where \d is a subset of \w, [^\d\w] reduces to simply 'not a
wordchar' (\W). But [\D\W] reduces to 'not a digit' (\D), so the two
are still not equivalent.
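
The whitespace and wordchar case can be checked the same way. The sketch below counts how many ASCII characters each class matches, writing the 'not a space AND not a wordchar' class as [^\s\w]. The names @ascii, @either and @neither are made up for the example.

```perl
use strict;
use warnings;

my @ascii = map { chr } 0 .. 127;

# [\S\W]: union of "not whitespace" and "not a word character".
my @either  = grep { /\A[\S\W]\z/ } @ascii;

# [^\s\w]: complement of the union of whitespace and word characters.
my @neither = grep { /\A[^\s\w]\z/ } @ascii;

printf "[\\S\\W]  matches %d of %d characters\n", scalar @either,  scalar @ascii;
printf "[^\\s\\w] matches %d of %d characters\n", scalar @neither, scalar @ascii;
```

No character is both whitespace and a word character, so the first class should match all 128. The second leaves out every letter, digit, underscore and whitespace character, and matches only what remains (punctuation, control characters and the like).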

Ben
 
