Regular expressions

Pimousse

I am definitely not a regexp expert! ;)

Here's my problem.

I'm working on a string that looks like this:
sin(1)+12.22-x2+x3+1/x4+1*17+56*ln(32+sin(26*x1))+exp(58)+1+ln(12)+sin(26)+pow(2,3)

etc etc ....

This string is built by a user, and I have to rewrite it so that the
JVM can understand it.

x1 ... x5 are arguments of type Double.

I already have some regexps that transform ln into Math.log, sin into
Math.sin, etc. This is working perfectly, thanks to the help of some
people from this forum ;)

But as I'm working with java.lang.Math, I must use Double everywhere,
especially with functions like pow(): if I write something like pow(2,3),
Java throws an exception (is that normal behaviour?).

So I must transform all numbers like 2 into 2.0 and 12558651 into 12558651.0.
But I mustn't change 12.36 into 12.36.0, .45 into .45.0 or x1 into x1.0.
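
For reference, here is a minimal sketch of how plain compiled Java treats integer literals. The class and method names are made up for illustration. In compiled code, Math.pow accepts int arguments because they are widened to double, so the exception described above may come from whatever evaluates the generated string rather than from java.lang.Math itself. That is an assumption, since the evaluator is not shown in this thread. Integer division is a case where the ".0" suffix really does change the result:

```java
final class LiteralWidening {
    // int arguments are widened to double before Math.pow is called.
    static double powOfInts() {
        return Math.pow(2, 3);
    }

    // Both operands are int, so the division truncates.
    static int intDivision() {
        return 1 / 2;
    }

    // One double operand is enough to get floating-point division.
    static double doubleDivision() {
        return 1.0 / 2;
    }

    public static void main(String[] args) {
        System.out.println("Math.pow(2, 3) : " + powOfInts());      // 8.0
        System.out.println("1 / 2          : " + intDivision());    // 0
        System.out.println("1.0 / 2        : " + doubleDivision()); // 0.5
    }
}
```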

I wrote this regexp:

System.out.println("x1 : "+"x1".matches("(?<!\\.|x)[0-9]+[^\\.]"));
//--> false
System.out.println("12 : "+"12".matches("(?<!\\.|x)[0-9]+[^\\.]"));
//--> true
System.out.println(".12 : "+".12".matches("(?<!\\.|x)[0-9]+[^\\.]"));
//--> false
System.out.println("1245.12 : "+"1245.12".matches("(?<!\\.|x)[0-9]+[^\\.]"));
//--> false

So I thought I had found the right regexp! Actually not! :(

I'm using code that looks like this:

Pattern p = Pattern.compile(myPattern);
Matcher m = p.matcher(myChain);
StringBuffer buf = new StringBuffer(myChain);
int pos=0;
while(true){
    if(m.find(pos)){
        buf.insert(m.end()-1,".0");
        pos = m.end()+m.group().length()-1;
        System.err.println("i : "+(++i)+" - "+buf.toString());
        System.err.println("I found the text \"" + m.group() +
            "\" starting at index " + m.start() +
            " and ending at index " + m.end() + " --> pos : "+pos);
        m = p.matcher(buf.toString());
    }
    else{
        a = buf.toString();
        break;
    }
}

The string I test is :
pow(2,3)+sin(1)+12.22-x2+x3+1/x4+1*17+56*ln(32+sin(26*x1))+exp(58)+1+ln(12)+sin(26)

Here's the result if I use the pattern "(?<!\\.|x)[0-9]+[^\\.]" :
pow(2.0,3.0)+sin(1.0)+1.02.22.0-x2+x3+1.0/x4+1.0*17.0+56.0*ln(32.0+sin(26.0*x1))+exp(58.0)+1.0+ln(12.0)+sin(26.0)
This pattern has a big problem with numbers like 78.69 or 1.8963: it
transforms 12.22 into 1.02.22.0!
But the rest is "perfect".
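
A small sketch of why the earlier matches() checks did not predict this result. matches() succeeds only if the pattern covers the entire input, whereas find() looks for the pattern anywhere inside it. On "12.22" the [0-9]+ part can back off to "1" and let [^\\.] consume the "2", so find() reports the partial match "12", and inserting ".0" at end()-1 lands in the middle of the number. The class and method names below are illustrative only:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatchesVersusFind {
    // The first pattern from the post above.
    static final Pattern FIRST_ATTEMPT = Pattern.compile("(?<!\\.|x)[0-9]+[^\\.]");

    // True only if the pattern covers the whole input.
    static boolean wholeStringMatches(String s) {
        return FIRST_ATTEMPT.matcher(s).matches();
    }

    // First substring the pattern matches, or null if there is none.
    static String firstPartialMatch(String s) {
        Matcher m = FIRST_ATTEMPT.matcher(s);
        return m.find() ? m.group() : null;
    }

    // Same insertion as in the loop above, applied to the first match only.
    static String insertAfterFirstMatch(String s) {
        Matcher m = FIRST_ATTEMPT.matcher(s);
        StringBuffer buf = new StringBuffer(s);
        if (m.find()) {
            buf.insert(m.end() - 1, ".0");
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(wholeStringMatches("12.22"));    // false
        System.out.println(firstPartialMatch("12.22"));     // 12
        System.out.println(insertAfterFirstMatch("12.22")); // 1.02.22
    }
}
```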

If I use "(?<!\\.|x)[0-9]+[^\\.0-9]", here's the result :
pow(2.0,3.0)+sin(1.0)+12.22.0-x2+x3+1.0/x4+1.0*17.0+56.0*ln(32.0+sin(26.0*x1))+exp(58.0)+1.0+ln(12.0)+sin(26.0)
There's still a mistake, as it changes 12.22 into 12.22.0 ....

Another pattern I tried was : "\\b(?<!\\.|x)[0-9]+[^\\.0-9]\\b" !
Here's the result :
pow(2.0,3)+sin(1)+12.22-x2+x3+1.0/x4+1.0*17.0+56.0*ln(32.0+sin(26.0*x1))+exp(58)+1.0+ln(12)+sin(26)
Here, there's no longer a problem with 12.22, but now some integers are missed ..

Can anyone help me find my mistakes?
Maybe I shouldn't use regexps ...
Tonight, I promise, I'll buy a book on mastering regexps from Amazon ... :)
 
Filip Larsen

Pimousse wrote
Here's my problem.

I'm working on a string that looks like :
sin(1)+12.22-x2+x3+1/x4+1*17+56*ln(32+sin(26*x1))+exp(58)+1+ln(12)+sin(26)+pow(2,3)

etc etc ....

This string is built by a user, and I have to rewrite it so that the
JVM can understand it.
Maybe I shouldn't use regexps ...

Perhaps the syntax your users are using is close enough for you to use
the BeanShell parser (http://www.beanshell.org/) or one of the more
evaluation-specific parsers like JEP (http://www.singularsys.com/jep/)
or JEPLite (http://jeplite.sourceforge.net/).

If you really need to parse the expressions yourself, for instance
because you want to do mathematical transformations on the expression
and not just so that the JVM can evaluate its value, then you really
should look into building your own parser with tools like JavaCC
(http://javacc.dev.java.net/), which seems to already have grammars
close to what you want.


Regards,
 
Pimousse

I want to apologize to those who are still answering this question .....
Due to a mail server malfunction at work, this topic has been
posted twice!

Here are the solutions found by the community and me:

\\b(?<!\\.)[0-9]+\\b and inserting "(double)" at matcher.start()

\\b(?<!\\.|x)[0-9]+[^\\.0-9] and inserting ".0" at matcher.start()
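
A sketch of how these could be applied without tracking insert positions by hand. It uses Matcher.replaceAll, where $0 in the replacement stands for the matched text. The first method uses the first pattern above verbatim. The second appends ".0" with a different pattern, (?<![\\w.])[0-9]+(?![\\w.]), which is not from this thread: it is an assumption about what "a standalone integer" should mean here (not preceded or followed by a letter, digit, underscore or dot). The class and method names are made up:

```java
import java.util.regex.Pattern;

final class IntegerLiteralRewriter {
    // First pattern quoted above.
    static final Pattern THREAD_PATTERN = Pattern.compile("\\b(?<!\\.)[0-9]+\\b");

    // Hypothetical alternative, not from the thread: digits with no word
    // character or dot on either side.
    static final Pattern WHOLE_INTEGER = Pattern.compile("(?<![\\w.])[0-9]+(?![\\w.])");

    // Prefixes each match with a cast, e.g. 2 becomes (double)2.
    static String castToDouble(String expr) {
        return THREAD_PATTERN.matcher(expr).replaceAll("(double)$0");
    }

    // Appends ".0" to each standalone integer, e.g. 2 becomes 2.0.
    static String appendPointZero(String expr) {
        return WHOLE_INTEGER.matcher(expr).replaceAll("$0.0");
    }

    public static void main(String[] args) {
        String expr = "pow(2,3)+12.22-x2+.45+1/x4";
        // pow((double)2,(double)3)+(double)12.22-x2+.45+(double)1/x4
        System.out.println(castToDouble(expr));
        // pow(2.0,3.0)+12.22-x2+.45+1.0/x4
        System.out.println(appendPointZero(expr));
    }
}
```

Note that the cast variant also touches the integer part of 12.22, which is harmless because (double)12.22 is still a valid expression. The lookbehind on \\w in the second pattern is what keeps x1 unchanged.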

More explanations at:
http://forum.java.sun.com/thread.jsp?forum=31&thread=542064

Thanks to all.

Pimousse
 
 
