regexp - unclosed character class

Mike

Hi,

I am trying to get

"/^(([^<>()[\\]\\.,;:\\s@\"]+(\\.[^<>()[\\]\\.,;:\\s@\"]+)*)|(\".+\"))@((\\[
[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+
[a-zA-Z]{2,}))$/";

to work. It compiles, but when I run it I get:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 148
/^(([^<>()[\]\.,;:\s@"]+(\.[^<>()[\]\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3
{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/

this is the email address validator from
http://www.breakingpar.com/bkp/home.nsf/Doc!OpenNavigator&87256B280015193F87
256C40004CC8C6

that I am trying to get working in Java. Original regexp is

/^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1
,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,})
)$/

please help !

Mike
 
Paul Lutus

Mike said:
Hi,

I am trying to get

[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+
[a-zA-Z]{2,}))$/";

to work.

Well then, try breaking the task into smaller, more manageable pieces.
It compiles,

Post the code in which it "works".
but when I run it

Define "run it". Post your code.
I get:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 148
/^(([^<>()[\]\.,;:\s@"]+(\.[^<>()[\]\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3
{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/

this is the email address validator from
http://www.breakingpar.com/bkp/home.nsf/Doc!OpenNavigator&87256B280015193F87
256C40004CC8C6

that I am trying to get working in Java. Original regexp is

,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,})
)$/

please help !

To put this expression into quotes for Java, just escape each escape
character:

'\' -> '\\'

and escape each quote character:

'"' -> '\"'

and remove the beginning and ending slashes, and POST YOUR CODE, and it
should work.
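
The two substitutions above can be sketched mechanically. This is only an illustration, assuming a hypothetical helper named toJavaLiteral (not part of any library) that turns raw regex text into the source form of a Java string literal. It handles the string-literal syntax only; it does not adjust for differences between the JavaScript and Java regex dialects.

```java
public class EscapeForJava {
    // Hypothetical helper: converts raw regex text into Java string-literal source.
    static String toJavaLiteral(String rawRegex) {
        String escaped = rawRegex
                .replace("\\", "\\\\")   // each '\' becomes '\\'
                .replace("\"", "\\\"");  // each '"' becomes '\"'
        return "\"" + escaped + "\"";    // wrap in the surrounding quotes
    }

    public static void main(String[] args) {
        // Raw regex text of four characters: backslash, s, @, double quote.
        String raw = "\\s@\"";
        System.out.println(toJavaLiteral(raw));
    }
}
```

The backslashes have to be doubled before the quotes are escaped; doing it in the other order would double the backslash that was just added in front of each quote.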
 
Mike

hi Paul,

thanks for the help
Post the code in which it "works".

the following compiles, but produces the run-time error below:

import java.util.regex.*;

public class test {
    public test() {
        String sampleText = "(e-mail address removed)";
        // Pattern as posted; the wrapped literal is rejoined with + so it compiles.
        String sampleRegex =
            "^(([^<>()[\\]\\.,;:\\s@\"]+(\\.[^<>()[\\]\\.,;:\\s@\"]+)*)|(\".+\"))@((\\[["
            + "0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+["
            + "a-zA-Z]{2,}))$";
        java.util.regex.Pattern p = java.util.regex.Pattern.compile(sampleRegex);
        java.util.regex.Matcher m = p.matcher(sampleText);

        if (m.find()) {
            System.out.println("Matched");
        }
    }

    public static void main(String[] args) {
        test fred = new test();
    }
}
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 148
/^(([^<>()[\]\.,;:\s@"]+(\.[^<>()[\]\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3
{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/
To put this expression into quotes for Java, just escape each escape
character:

'\' -> '\\'

So where there is a single backslash I add another, and where there are already two I leave
them alone?
and escape each quote character:
'"' -> '\"'
and remove the beginning and ending slashes, and POST YOUR CODE, and it
should work.

I tried your suggestions - or at least how I interpreted them and it won't
compile.

Here is the pattern I tried:

"^(([^<>()[\\]\\.,;:\\s@\\"]+(\\.[^<>()[\\]\\.,;:\\s@\\"]+)*)|(\\".+\\"))@((
\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\
.)+[a-zA-Z]{2,}))$";

as opposed to the original:

"^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1
,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,})
)$"

What bit did I misunderstand ?

Mike
 
Paul Lutus

Mike said:
hi Paul,

thanks for the help
Post the code in which it "works".

the following compiles, but produces the run-time error below:

import java.util.regex.*;

public class test {
public test (){

String sampleText = "(e-mail address removed)";
String sampleRegex =
0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+[
a-zA-Z]{2,}))$";
java.util.regex.Pattern p = java.util.regex.Pattern.compile(sampleRegex);
java.util.regex.Matcher m = p.matcher(sampleText);

if (m.find()){
System.out.println("Matched");
}

}

public static void main (String [] args){
test fred = new test();
}
}
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 148
/^(([^<>()[\]\.,;:\s@"]+(\.[^<>()[\]\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3
{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/
To put this expression into quotes for Java, just escape each escape
character:

'\' -> '\\'

so when there is a single slash add another, where there are already 2
leave alone ?

No, instead, do exactly what I said:
 
Roedy Green

Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 148
/^(([^<>()[\]\.,;:\s@"]+(\.[^<>()[\]\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3
{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/

what you need is
http://mindprod.com/projregexproofreader.html

Unfortunately no one has written it yet.

I suggest printing this on a piece of paper, and first of all
highlight all quoted characters. Then look at the [] () balancing of
what is left.
 
Paul Lutus

Mike said:
What bit did I misunderstand ?

You "bit" off more than you could chew. The pattern is some
complexity-freak's wet dream, and without access to the original, working
string in its native source file (which you do not provide), it is
impossible to sort it out.

public class Test {

    public static void main(String[] args) {
        // [\\w.] matches a word character or a literal dot;
        // '|' is not an alternation operator inside a character class.
        String patt = "^[\\w.]+\\@\\w+\\.\\w+$";
        String test = "(e-mail address removed)";

        java.util.regex.Pattern p = java.util.regex.Pattern.compile(patt);
        java.util.regex.Matcher m = p.matcher(test);

        System.out.println(m.find() ? "Matched." : "Failed.");
    }
}
 
VisionSet

Mike said:
[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+
[a-zA-Z]{2,}))$/";

to work. It compiles, but when I run it I get:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 148

Instead of this part of your regex:

[^<>()[\\]\\.,;:\\s@\"]

do you mean:

[^<>()\\[\\].,;:\\s@\"]
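
A small check of the two fragments above, as a sketch rather than a definitive test. The class name BracketInClass and the helper compiles are made up for illustration. The idea is that java.util.regex treats an unescaped [ inside a character class as the start of a nested class, so the first fragment should be rejected while the second should be accepted.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class BracketInClass {
    // Hypothetical helper: true if Java's regex engine accepts the pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Literal [ left unescaped inside the class, as in the original post.
        String unescaped = "[^<>()[\\]\\.,;:\\s@\"]";
        // Literal [ and ] both escaped, as suggested above.
        String escaped = "[^<>()\\[\\].,;:\\s@\"]";

        System.out.println("unescaped compiles: " + compiles(unescaped));
        System.out.println("escaped compiles:   " + compiles(escaped));
    }
}
```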
 
Mike

Hi again,
I have tried to follow your advice, but I am still not getting anywhere. I
have added a \ to every occurrence of a \

ie \s becomes \\s and
\dot becomes \\dot
(\".+\") becomes (\\".+\\")

is this what you meant ? When I try and compile I get this:

test.java:8: ';' expected
String sampleRegex =
"^(([^<>()[\\]\\\\.,;:\\s@\\"]+(\\.[^<>()[\\]\\\\.,;:\\s@\\"]+)*)|(\\".+\\")
)@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9
]+\\.)+[a-zA-Z]{2,}))$";
^
test.java:8: illegal character: \92
String sampleRegex =
"^(([^<>()[\\]\\\\.,;:\\s@\\"]+(\\.[^<>()[\\]\\\\.,;:\\s@\\"]+)*)|(\\".+\\")
)@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9
]+\\.)+[a-zA-Z]{2,}))$";
^
test.java:8: illegal character: \64
String sampleRegex =
"^(([^<>()[\\]\\\\.,;:\\s@\\"]+(\\.[^<>()[\\]\\\\.,;:\\s@\\"]+)*)|(\\".+\\")
)@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9
]+\\.)+[a-zA-Z]{2,}))$";

^
16 errors


.....
 
Mike

Paul Lutus said:
You "bit" off more than you could chew. The pattern is some
complexity-freak's wet dream, and without access to the original, working
string in its native source file (which you do not provide), it is
impossible to sort it out.

I provided the link to the article and web page which contains the source.

The page demos the regexp and copied below is the javascript function
implemented in that page.

<script type="text/javascript" language="JavaScript"><!--
function isValidEmail(emailAddress) {
var re =
/^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1
,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,})
)$/
return re.test(emailAddress);
}
// -->
</script>

Mike
 
Mike

Mike W,

I think that you may have hit the nail on the head !

This is the regexp modified in line with your suggestion:

^(([^<>()\\[\\].,;:\\s@\"]+(\\.[^<>()\\[\\].,;:\\s@\"]+)*)|(\".+\"))@((\\[[0-9
]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+[a-z
A-Z]{2,}))$

I have not, however, tried matching against weird and wonderful (but
nevertheless valid) addresses.

many thanks (to everybody who posted) for their help.

Mike




VisionSet said:
[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+
[a-zA-Z]{2,}))$/";

to work. It compiles, but when I run it I get:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 148

Instead of this part of your regex:

[^<>()[\\]\\.,;:\\s@\"]

do you mean:

[^<>()\\[\\].,;:\\s@\"]
 
?

Daniel Sjöblom

Mike said:
I tried your suggestions - or at least how I interpreted them and it won't
compile.

Here is the pattern I tried:

"^(([^<>()[\\]\\.,;:\\s@\\"]+(\\.[^<>()[\\]\\.,;:\\s@\\"]+)*)|(\\".+\\"))@((
\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\
.)+[a-zA-Z]{2,}))$";

as opposed to the original:

"^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1
,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,})
)$"

What bit did I misunderstand ?

You need to escape literal backslashes, and Java regexes also require
that you properly match [ with ], so you need to quote the literal [ in
the part that matches the name part of the email address. Also, " was
already escaped in the original. I think it should be as follows:

^(([^<>()\\[\\]\\\\\\.,;:\\s@\"]+(\\.[^<>()\\[\\]\\\\\\.,;:\\s@\"]+)*)|
(\".+\"))@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|((
[a-zA-Z\\-0-9]+\\.)+[a-zA-Z]{2,}))$

But it would be better really if you read through the regex and tried to
understand it. If you understand it you know what should be escaped.
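
A minimal harness for trying the pattern above, offered as a sketch only. The class name EmailRegexCheck, the method isValid and the sample addresses are all invented for illustration, and the pattern is the one posted above, split across string concatenations for readability. It uses matches() rather than find(); with the ^ and $ anchors already in the pattern the two should agree here.

```java
import java.util.regex.Pattern;

public class EmailRegexCheck {
    // The pattern from the post above, as a Java string literal.
    static final Pattern EMAIL = Pattern.compile(
        "^(([^<>()\\[\\]\\\\\\.,;:\\s@\"]+(\\.[^<>()\\[\\]\\\\\\.,;:\\s@\"]+)*)|(\".+\"))"
        + "@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])"
        + "|(([a-zA-Z\\-0-9]+\\.)+[a-zA-Z]{2,}))$");

    // Hypothetical helper: true if the whole address matches the pattern.
    static boolean isValid(String address) {
        return EMAIL.matcher(address).matches();
    }

    public static void main(String[] args) {
        // Made-up sample inputs, not taken from the thread.
        String[] samples = {
            "mike@example.com",
            "user@[192.168.0.1]",
            "no-at-sign",
            "a b@example.com"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Compiling the pattern once into a static final Pattern avoids recompiling it on every call, which matters if the check runs often.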
 
Mike

I provided the link to the article and web page which contains the
source.
The link you provided does not go anywhere specific. Try it.

You are quite right, it doesn't: Outlook wrapped the link and only marked up
the first half.
The "256C40004CC8C6" on the second line should be tagged on the end. Sorry.

M
 
Mike

You need to escape literal backslashes, and Java regexes also require
that you properly match [ with ], so you need to quote the literal [ in
the part that matches the name part of the email address. Also, " was
already escaped in the original. I think it should be as follows:

^(([^<>()\\[\\]\\\\\\.,;:\\s@\"]+(\\.[^<>()\\[\\]\\\\\\.,;:\\s@\"]+)*)|
(\".+\"))@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|((
[a-zA-Z\\-0-9]+\\.)+[a-zA-Z]{2,}))$

That works !
But it would be better really if you read through the regex and tried to
understand it. If you understand it you know what should be escaped.

Agreed, I haven't got "Matching regular expressions" to hand - a damn good
book, which I have used before !

Thanks for the regexp.

M
 
