regex puzzle

Roedy Green

Is there a way within a regex to find strings of the form:

1 .. 20?

For reasons hard to explain I don't want to capture then check outside
the regex.
 
Knute Johnson

Roedy Green said:
Is there a way within a regex to find strings of the form:

1 .. 20?

For reasons hard to explain I don't want to capture then check outside
the regex.

So does 1 .. 20 mean 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
or 1 .. 20?

For the literal string 1 .. 20: ^\d+ \.\. \d+$
 
Roedy Green

Knute Johnson said:
So does 1 .. 20 mean 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
or 1 .. 20?

nope 1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20

hopefully a little more terse.

and would not match 2099
 
Roedy Green

Roedy Green said:
nope 1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20

hopefully a little more terse.

and would not match 2099

I think you need a new regex operator that collects characters 0-9,
0-9\. etc then parses a number, then compares it with low and high
bound in the regex operator.

It seems like such a simple problem.

Regex gets foiled so easily, like parsing HTML tags when they don't
appear in canonical order.
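
The range-checking operator wished for above does not exist in java.util.regex. A helper that generates the pattern from the two bounds gets part of the way there while keeping the check inside the regex. This is only a sketch: the class name RangeRegex and the method between are made up for illustration, and it assumes small non-negative ranges, since the pattern grows with the size of the range.

```java
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class RangeRegex {
    // Hypothetical helper: builds (?<!\d)(?:low|...|high)(?!\d).
    // The lookarounds reject numbers embedded in longer digit runs.
    static Pattern between(int low, int high) {
        if (low < 0 || low > high) {
            throw new IllegalArgumentException("bad range");
        }
        String alternatives = IntStream.rangeClosed(low, high)
                .mapToObj(Integer::toString)
                .collect(Collectors.joining("|"));
        return Pattern.compile("(?<!\\d)(?:" + alternatives + ")(?!\\d)");
    }

    public static void main(String[] args) {
        Pattern p = between(1, 20);
        for (String s : new String[] { "7", "20", "21", "2099", "x13y" }) {
            System.out.println("'" + s + "' -> " + p.matcher(s).find());
        }
    }
}
```

The source stays terse even though the generated pattern is not. The order of the alternatives does not matter here, because the trailing (?!\d) forces the matcher to backtrack from "2" to "20" when another digit follows.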
 
Stefan Ram

Knute Johnson said:
So does 1 .. 20 mean 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
or 1 .. 20?

public final class Main {
    public static void main(final String[] args) {
        // String.matches tests the whole string, so no anchors are needed.
        final String r = "[1-9]|1[0-9]|20";
        for (int i = 0; i < 22; ++i) {
            System.out.println((i < 10 ? " " : "") + i + " " + (i + "").matches(r));
        }
    }
}
 
Knute Johnson

Roedy Green said:
I think you need a new regex operator that collects characters 0-9,
0-9\. etc then parses a number, then compares it with low and high
bound in the regex operator.

It seems like such a simple problem.

Regex gets foiled so easily, like parsing HTML tags when they don't
appear in canonical order.

import java.util.regex.*;

public class test5 {
    public static void main(String[] args) {
        String sequence = "1..20";
        String subject = "1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20";

        System.out.println(matchSeq(sequence, subject) ? "matches" : "no match");
    }

    // Expands "from..to" into "from from+1 ... to" and tests whether the
    // subject is exactly that space-separated list.
    public static boolean matchSeq(String seq, String subject) {
        Pattern p1 = Pattern.compile("(\\d+)\\.\\.(\\d+)");
        Matcher m1 = p1.matcher(seq);
        if (m1.matches()) {
            int from = Integer.parseInt(m1.group(1));
            int to = Integer.parseInt(m1.group(2));
            String s2 = Integer.toString(from);
            for (int i = from + 1; i <= to; i++)
                s2 = s2.concat(String.format(" %d", i));
            // s2 holds only digits and spaces, so it is safe to use as a pattern.
            Pattern p2 = Pattern.compile(s2);
            Matcher m2 = p2.matcher(subject);
            return m2.matches();
        } else {
            throw new IllegalArgumentException("bad sequence");
        }
    }
}
 
markspace

Stefan Ram said:
{ final java.lang.String r = "[1-9]|1[0-9]|20";


This is the type of solution I'm used to seeing for problems of this
type. Roedy said he didn't want to match substrings (so "2009" should
not be matched). I think you just have to add some "not digit" to the
outside of that.

"\\D([1-9]|1[0-9]|20)\\D"

Not tested.
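
A small sketch to check the untested pattern above. The class name DelimiterCheck and the method numbersFound are made up for illustration. It collects group 1 for every find() hit, which shows what happens when there is no non-digit character available on one side of the number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DelimiterCheck {
    // Pattern from the post: a non-digit is required on BOTH sides,
    // and each \D consumes one character of the input.
    static final Pattern DELIMITED = Pattern.compile("\\D([1-9]|1[0-9]|20)\\D");

    // Collects group 1 (the number itself) for every match in the text.
    static List<String> numbersFound(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = DELIMITED.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        for (String s : new String[] { "20", " 20 ", "x2099x", "1 2 3" }) {
            System.out.println("'" + s + "' -> " + numbersFound(s));
        }
    }
}
```

The pattern does reject "x2099x", but it also misses a bare "20" because \D has nothing to match at the start or end of the string. In "1 2 3" only the 2 is reported: the 1 and the 3 sit at the string boundaries, and the spaces around the 2 are consumed by that one match.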
 
Robert Klemme

markspace said:
This is the type of solution I'm used to seeing for problems of this
type. Roedy said he didn't want to match substrings (so "2009" should
not be matched). I think you just have to add some "not digit" to the
outside of that.

"\\D([1-9]|1[0-9]|20)\\D"

But better use lookaround for that:

package regexp;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RoedyNumberMatch {

    private static final String[] P_FIXES = { "", "foo", "bar ", "20" };

    private static final Pattern PAT = Pattern
            .compile("(?<!\\d)(?:20|1\\d|[1-9])(?!\\d)");

    public static void main(String[] args) {
        final Matcher m = PAT.matcher("");

        for (int i = 0; i < 30; ++i) {
            System.out.println("i = " + i);

            for (final String pre : P_FIXES) {
                for (final String post : P_FIXES) {
                    final String s = pre + i + post;
                    System.out.println("s = '" + s + "'");

                    if (m.reset(s).matches()) {
                        System.out.println("Matches: '" + m.group() + "'");
                    }

                    if (m.reset(s).find()) {
                        System.out.println("Finds: '" + m.group() + "'");
                    }
                }
            }

            System.out.println();
        }
    }

}

Also here: https://gist.github.com/rklemme/10451386

Cheers

robert
 
Arne Vajhøj

Roedy Green said:
I think you need a new regex operator that collects characters 0-9,
0-9\. etc then parses a number, then compares it with low and high
bound in the regex operator.

It seems like such a simple problem.

Sure but it is not a regex problem.

Arne
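
One way to read the point above, as a sketch only: let the regex find the digit runs and do the numeric comparison in ordinary code. Roedy ruled this approach out for his case, so it is shown purely for comparison. The names RangeScan and findInRange are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RangeScan {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns every maximal run of digits whose value lies in [low, high].
    // Unlike the pure-regex patterns in this thread, leading zeros are
    // accepted: "007" is parsed as 7.
    static List<Integer> findInRange(String text, int low, int high) {
        List<Integer> hits = new ArrayList<>();
        Matcher m = DIGITS.matcher(text);
        while (m.find()) {
            String run = m.group();
            if (run.length() > 9) {
                continue; // assumption: skip runs that could overflow an int
            }
            int value = Integer.parseInt(run);
            if (value >= low && value <= high) {
                hits.add(value);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findInRange("1 20 21 2099 x7y", 1, 20));
    }
}
```

Because \d+ is greedy and always takes the whole run of digits, "2099" is parsed as one number and rejected by the comparison, so no lookaround is needed.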
 
