Regular expressions ...

Pimousse

I'm definitely not a regexp expert! ;)
Here's my problem.

I'm working on a string that looks like this:
sin(1)+12.22-x2+x3+1/x4+1*17+56*ln(32+sin(26*x1))+exp(58)+1+ln(12)+sin(26)+pow(2,3)

and so on.

This string is built by a user, and I have to rewrite it so that the
JVM can understand it.

x1 ... x5 are arguments of type Double.

I already have some regexps that transform ln to Math.log, sin to
Math.sin, etc. This is working perfectly, thanks to the help of some
people from this forum ;)

But as I'm working with java.lang.Math, I must use Double everywhere,
especially with functions like pow(): indeed, if I do something like pow(2,3),
Java throws an exception (normal behaviour?).

So I must transform all numbers like 2 and 12558651 into 2.0 and 12558651.0.
But I mustn't change 12.36 into 12.36.0, .45 into .45.0 or x1 into x1.0.

I wrote this regexp:

System.out.println("x1 : "+"x1".matches("(?<!\\.|x)[0-9]+[^\\.]"));
//--> false
System.out.println("12 : "+"12".matches("(?<!\\.|x)[0-9]+[^\\.]"));
//--> true
System.out.println(".12 : "+".12".matches("(?<!\\.|x)[0-9]+[^\\.]"));
//--> false
System.out.println("1245.12 : "+"1245.12".matches("(?<!\\.|x)[0-9]+[^\\.]"));
//--> false
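
A note on these tests: matches() only answers whether the whole input fits the pattern, while the replacement loop relies on find(), which looks for any substring that fits. The sketch below (the class and helper names are made up for illustration) runs the same pattern both ways, which may help explain why the tests pass and the replacement still misbehaves.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchesVsFind {
    // Same pattern as in the tests above.
    static final Pattern P = Pattern.compile("(?<!\\.|x)[0-9]+[^\\.]");

    // What matches() reports: does the WHOLE input fit the pattern?
    static boolean whole(String s) {
        return P.matcher(s).matches();
    }

    // What find() reports: the first substring that fits, or null if none.
    static String first(String s) {
        Matcher m = P.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(whole("1245.12")); // false
        System.out.println(first("1245.12")); // 1245
        System.out.println(first("12+3"));    // 12+
    }
}
```

With find(), the engine backtracks so that [^\\.] can consume the last digit of "1245", and in "12+3" the same class swallows the "+" that follows the number. Either way, m.end() no longer points just past the number.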

So I thought I had found the right regexp! Actually not! :(
I'm using code that looks like this:

Pattern p = Pattern.compile(myPattern);
Matcher m = p.matcher(a);
StringBuffer buf = new StringBuffer(myChain);
int pos=0;
while(true){
    if(m.find(pos)){
        buf.insert(m.end()-1,".0");
        pos = m.end()+m.group().length()-1;
        System.err.println("i : "+(++i)+" - "+buf.toString());
        System.err.println("I found the text \"" + m.group() +
            "\" starting at index " + m.start() +
            " and ending at index " + m.end() + " --> pos : "+pos);
        m = p.matcher(buf.toString());
    }
    else{
        a = buf.toString();
        break;
    }
}


The string I'm testing is:
pow(2,3)+sin(1)+12.22-x2+x3+1/x4+1*17+56*ln(32+sin(26*x1))+exp(58)+1+ln(12)+sin(26)

Here's the result if I use the pattern (?<!\\.|x)[0-9]+[^\\.] :
pow(2.0,3.0)+sin(1.0)+1.02.22.0-x2+x3+1.0/x4+1.0*17.0+56.0*ln(32.0+sin(26.0*x1))+exp(58.0)+1.0+ln(12.0)+sin(26.0)
This pattern has a big problem with numbers like 78.69 or 1.8963: it
transforms 12.22 into 1.02.22.0!
But the rest is "perfect".

If I use (?<!\\.|x)[0-9]+[^\\.0-9], here's the result :
pow(2.0,3.0)+sin(1.0)+12.22.0-x2+x3+1.0/x4+1.0*17.0+56.0*ln(32.0+sin(26.0*x1))+exp(58.0)+1.0+ln(12.0)+sin(26.0)
There's still a mistake, as it changes 12.22 into 12.22.0 ....

Another pattern I tried was \\b(?<!\\.|x)[0-9]+[^\\.0-9]\\b
Here's the result:
pow(2.0,3)+sin(1)+12.22-x2+x3+1.0/x4+1.0*17.0+56.0*ln(32.0+sin(26.0*x1))+exp(58)+1.0+ln(12)+sin(26)
Here, there are no more problems with 12.22, but there are with integers...

Can anyone help me point out my mistakes?
Maybe I shouldn't use regexps...
Tonight, I promise, I'll buy a book on mastering regexps from Amazon... :)
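
One way to avoid the position bookkeeping altogether is to let the regex engine do the whole replacement in a single pass. The sketch below is not from the thread: it assumes a number should be rewritten only when it has no digit, letter, underscore or dot on either side, and the names IntToDouble and fix are made up for illustration.

```java
import java.util.regex.Pattern;

public class IntToDouble {
    // Assumption: an "integer" is a run of digits with no word character
    // or '.' on either side, so x1, 12.22 and .45 are left alone.
    private static final Pattern INT =
            Pattern.compile("(?<![\\w.])\\d+(?![\\w.])");

    static String fix(String expr) {
        // $0 is the whole match; ".0" is appended to it.
        return INT.matcher(expr).replaceAll("$0.0");
    }

    public static void main(String[] args) {
        System.out.println(fix("pow(2,3)+12.22-x2+.45+1/x4"));
        // prints pow(2.0,3.0)+12.22-x2+.45+1.0/x4
    }
}
```

The lookbehind and lookahead only inspect the neighbouring characters without consuming them, so the match is exactly the digits and nothing after them, unlike a trailing [^\\.] which becomes part of the match.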
 
Carl Howells

Pimousse said:
Can anyone help me point out my mistakes?
Maybe I shouldn't use regexps...
Tonight, I promise, I'll buy a book on mastering regexps from Amazon... :)

We have a winner!

This is NOT a particularly appropriate place to use a simple regex
package. You want a more sophisticated parser than a regex package can
supply. Look into something like javaCC for this. It should solve your
problems in a much simpler manner than creating giant regexes.
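
To give a rough idea of what a parser-based approach looks like, here is a small hand-written recursive-descent evaluator. It is only a sketch, not JavaCC output: the grammar, the class name ExprEval and the supported function set (sin, ln, exp, pow) are assumptions based on the example string in the question. Since every value is a double from the start, the int/double question never comes up.

```java
import java.util.HashMap;
import java.util.Map;

public class ExprEval {
    private final String src;
    private final Map<String, Double> vars;
    private int pos = 0;

    private ExprEval(String src, Map<String, Double> vars) {
        this.src = src.replace(" ", "");
        this.vars = vars;
    }

    public static double eval(String src, Map<String, Double> vars) {
        ExprEval e = new ExprEval(src, vars);
        double v = e.expr();
        if (e.pos != e.src.length())
            throw new IllegalArgumentException("Unexpected input at " + e.pos);
        return v;
    }

    private boolean eat(char c) {
        if (pos < src.length() && src.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    private void expect(char c) {
        if (!eat(c)) throw new IllegalArgumentException("Expected '" + c + "' at " + pos);
    }

    private boolean atNumberChar() {
        return pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.');
    }

    // expr := term (('+' | '-') term)*
    private double expr() {
        double v = term();
        while (true) {
            if (eat('+')) v += term();
            else if (eat('-')) v -= term();
            else return v;
        }
    }

    // term := factor (('*' | '/') factor)*
    private double term() {
        double v = factor();
        while (true) {
            if (eat('*')) v *= factor();
            else if (eat('/')) v /= factor();
            else return v;
        }
    }

    // factor := '-' factor | '(' expr ')' | number | variable | name '(' expr [',' expr] ')'
    private double factor() {
        if (eat('-')) return -factor();
        if (eat('(')) { double v = expr(); expect(')'); return v; }
        int start = pos;
        if (atNumberChar()) {
            while (atNumberChar()) pos++;
            return Double.parseDouble(src.substring(start, pos));
        }
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
        String name = src.substring(start, pos);
        if (name.isEmpty()) throw new IllegalArgumentException("Unexpected input at " + pos);
        if (!eat('(')) {                       // plain variable such as x1
            Double v = vars.get(name);
            if (v == null) throw new IllegalArgumentException("Unknown variable " + name);
            return v;
        }
        double a = expr();                     // first argument
        double b = eat(',') ? expr() : Double.NaN; // optional second argument
        expect(')');
        switch (name) {
            case "sin": return Math.sin(a);
            case "ln":  return Math.log(a);
            case "exp": return Math.exp(a);
            case "pow": return Math.pow(a, b);
            default: throw new IllegalArgumentException("Unknown function " + name);
        }
    }

    public static void main(String[] args) {
        Map<String, Double> vars = new HashMap<>();
        vars.put("x4", 4.0);
        System.out.println(eval("pow(2,3)+1/x4", vars)); // prints 8.25
    }
}
```

A generated parser would replace the hand-written expr/term/factor methods, but the structure (one rule per precedence level) stays the same.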
 
Pimousse

Carl Howells said:
We have a winner!

So glad to be THE one! ;)

Carl Howells said:
This is NOT a particularly appropriate place to use a simple regex
package. You want a more sophisticated parser than a regex package
can supply. Look into something like javaCC for this. It should
solve your problems in a much simpler manner than creating giant
regexes.

Oh, I solved it another way...

Pattern p = Pattern.compile("\\b(?<!\\.)[0-9]+\\b");
Matcher m = p.matcher(myChain);
StringBuffer buf = new StringBuffer(myChain);
int pos=0;
while(true){
    if(m.find(pos)){
        buf.insert(m.start(),"(double)");
        pos = m.end()+9;
        m = p.matcher(buf);
    }
    else{
        myChain = buf.toString();
        m = null;
        p = null;
        buf = null;
        break;
    }
}
Indeed, 2.0 and (double)2 are the same... ;)
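
For comparison, the same cast-insertion can probably be written without the manual loop, using Matcher.replaceAll with the pattern from this post. This is a sketch only; the class and method names (CastInts, cast) are made up for illustration.

```java
import java.util.regex.Pattern;

public class CastInts {
    // Pattern taken from the post: a digit run at a word boundary
    // that is not directly preceded by a dot.
    private static final Pattern INT = Pattern.compile("\\b(?<!\\.)[0-9]+\\b");

    static String cast(String expr) {
        // $0 is the matched digits; "(double)" is put in front of them.
        return INT.matcher(expr).replaceAll("(double)$0");
    }

    public static void main(String[] args) {
        System.out.println(cast("pow(2,3)+x2+12.22"));
        // prints pow((double)2,(double)3)+x2+(double)12.22
    }
}
```

Note that the integer part of 12.22 also gets a cast in front of it, which is harmless because (double)12.22 is still a valid expression with the same value.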

But you're right, maybe I'm not using the right method... ;)
I only wanted to make the JVM understand (it's hard to make a
computer understand something ;)) that I was using doubles and not integers!

Thanks for the idea of javaCC. Next time, I'll use LEP....
 
TechBookReport

Pimousse said:
Thanks for the idea of javaCC. Next time, I'll use LEP....

Really, you need to take a look at JEP which does exactly what you need.
Take a quick trip to http://www.singularsys.com/jep/ and you'll be a
happier bunny...

Pan
===============================================
TechBookReport: http://www.techbookreport.com
 
