regex problem: 'greater than' 'less than' and 'equals' not matching!

falcon

I have a very strange problem. I want to strip everything from a
string except letters, numbers, spaces, and certain symbols listed in
the regex below:

"blahblah !@#$%^&*()--.,<>=".replaceAll("[^A-Za-z0-9/-?:().,'+^| ]","")

I expect to get the following string back:
"blahblah ^().,"

but I actually get the following:
"blahblah ^().,<>="

Notice the greater/less than and equals signs are still there!

I did a quick check using this site:
http://www.fileformat.info/tool/regex.htm and I get the same result
back. What's going on here???
 
colirl

Hi,

I have not actually run the regex with my fixes, but first of all there
are a few problems. Characters like $, |, [, ), \, / and so on are
special cases in regular expressions. If you want to match one of
those, you have to precede it with a backslash. So:

\| # Vertical bar
\[ # An open square bracket
\) # A closing parenthesis
\* # An asterisk
\^ # A caret symbol
\/ # A slash
\\ # A backslash

Try this for your ( and ) and see if it makes any difference! I don't
have the time to test the fix, but that's what you need to do for special
characters.
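
As a rough illustration of backslash-escaping in Java (a minimal sketch; the class and method names are made up for this example), note that the backslash itself has to be doubled inside a Java string literal, and that `Pattern.quote` can be used when a whole piece of text should be matched literally:

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original thread.
class EscapeDemo {
    // Removes every literal '|'. The Java literal "\\|" is the regex \|
    static String stripBars(String s) {
        return s.replaceAll("\\|", "");
    }

    // Pattern.quote builds a regex that matches the given text literally,
    // so no individual metacharacter needs escaping by hand.
    static String stripLiteral(String s, String literal) {
        return s.replaceAll(Pattern.quote(literal), "");
    }

    public static void main(String[] args) {
        System.out.println(stripBars("a|b|c"));             // abc
        System.out.println(stripLiteral("f(x)*2", "(x)*")); // f2
    }
}
```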
 
falcon

colirl,
That doesn't seem to work. Besides, it is replacing most of the right
characters with blanks; for some reason it keeps the relational symbols
(><=).
 
falcon

Sorry, it does work. I had to move some chars in my regex string
around, but adding backslashes to the characters which have special
meaning was apparently the problem. Thanks colirl!
 
colirl

OK, you need to escape the - symbol because, as I said, special
characters need to be escaped.

Try [^A-Za-z0-9/\-?:().,'+^| ] as your expression. It gets rid of
the ><= for me.
 
colirl

So then you get

blahblah ^()--.,

Solution (the backslash is doubled because it sits inside a Java string literal):
"blahblah !@#$%^&*()--.,<>=".replaceAll("[^A-Za-z0-9/\\-?:().,'+^| ]","")


Enjoy :)
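
For anyone who wants to check the difference, here is a small sketch that runs the original and the escaped character class side by side. The class and method names are invented for the example; the two patterns and the input string are the ones from this thread.

```java
// Hypothetical demo class comparing the two patterns from the thread.
class HyphenFixDemo {
    static final String INPUT = "blahblah !@#$%^&*()--.,<>=";

    // Original pattern: "/-?" is read as a range from '/' to '?'.
    static String original(String s) {
        return s.replaceAll("[^A-Za-z0-9/-?:().,'+^| ]", "");
    }

    // Escaped pattern: "\\-" in Java source is the regex \-, a literal hyphen.
    static String escaped(String s) {
        return s.replaceAll("[^A-Za-z0-9/\\-?:().,'+^| ]", "");
    }

    public static void main(String[] args) {
        System.out.println(original(INPUT)); // blahblah ^().,<>=
        System.out.println(escaped(INPUT));  // blahblah ^()--.,
    }
}
```

With the escaped hyphen, the two '-' characters in the input are now kept, because '-' is an allowed literal instead of a range operator.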
 
Jussi Piitulainen

falcon said:
I have a very strange problem. I want to strip everything from a
string except letters, numbers, spaces, and certain symbols listed in
the regex below:

"blahblah !@#$%^&*()--.,<>=".replaceAll("[^A-Za-z0-9/-?:().,'+^| ]","")

/-? is a range from '/' to '?', and that range contains :;<=> (as well as the digits).
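
One way to see this for yourself is to list every printable ASCII character that the class [/-?] accepts. This is only a sketch, and `RangeDemo` and `matching` are names made up for the example:

```java
// Hypothetical helper that lists which characters a regex accepts.
class RangeDemo {
    // Returns every character between 'from' and 'to' (inclusive)
    // that, taken on its own, matches the given regex.
    static String matching(String regex, char from, char to) {
        StringBuilder sb = new StringBuilder();
        for (char c = from; c <= to; c++) {
            if (String.valueOf(c).matches(regex)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Scan printable ASCII, from space to tilde.
        System.out.println(matching("[/-?]", ' ', '~')); // /0123456789:;<=>?
    }
}
```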
 
falcon

Jussi,
I already fixed the problem, but it's amusing that I missed seeing /-?
as *from '/'* *to '?'*

Thanks :)
 
Rob Skedgell

colirl said:
OK, you need to escape the - symbol because, as I said, special
characters need to be escaped.

Try [^A-Za-z0-9/\-?:().,'+^| ] as your expression. It gets rid of
the ><= for me.

Or you can put a '-' unescaped as the last character in a character
class, since it doesn't form part of a range e.g. "[a-z-]" will match
"-". Admittedly the 1.5.0 javadocs for java.util.regex.Pattern at
<http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html#cc>
don't make that clear: "inside a character class ... the expression -
becomes a range forming metacharacter."
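
A quick sketch of the trailing-hyphen form, for comparison with the escaped version. The class and method names are invented for the example, and the second pattern is simply the thread's class with the '-' moved to the end:

```java
// Hypothetical demo class for an unescaped '-' at the end of a class.
class TrailingHyphenDemo {
    // "[^a-z-]": a-z is a range, the final '-' is a literal hyphen.
    static String keepLowerAndHyphen(String s) {
        return s.replaceAll("[^a-z-]", "");
    }

    // The thread's character class with '-' moved to the last position,
    // so no backslash is needed.
    static String clean(String s) {
        return s.replaceAll("[^A-Za-z0-9/?:().,'+^| -]", "");
    }

    public static void main(String[] args) {
        System.out.println("-".matches("[a-z-]"));               // true
        System.out.println(keepLowerAndHyphen("a-b=c<d"));       // a-bcd
        System.out.println(clean("blahblah !@#$%^&*()--.,<>=")); // blahblah ^()--.,
    }
}
```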
 
