regexp

gbgkille69

Hello,

I have a text file like that:
2005-10-17;AXC dfgh k;29,26;275 2005-10-17;KLCACM Rfhekksn Allerg
FGH;9,65;434 2005-10-17;TYhdkdkj F12;50,5;276 2005-10-17

I'd like to extract the values like that:
2005-10-17
AXC dfgh k
29,26
275
2005-10-17
KLCACM Rfhekksn Allerg FGH
:

but the code below only produces:
Found a match: 2005-10-17;AXC
g1: 2005-10-17
g2:
Found a match: 2005-10-17;KLCACM
g1: 2005-10-17
g2:
Found a match: 2005-10-17;TYhdkdkj
g1: 2005-10-17
g2:

Any idea how I could achieve this? A record in the file is
<date>;<name>;<value>;<value><space>
followed by the next <date>; and so on...

code:

String regex = "([0-9]{4}-[0-9]{2}-[0-9]{2});(\\w*)*";
String targetString = "2005-10-17;AXC dfgh k;29,26;275 "
+ "2005-10-17;KLCACM Rfhekksn Allerg FGH;9,65;434 2005-10-17;TYhdkdkj "
+ "F12;50,5;276 2005-10-17";

Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(targetString);


while (matcher.find()) {
System.out.println("Found a match: " + matcher.group(0) +
"\ng1: " + matcher.group(1) +
"\ng2: " + matcher.group(2)
//+"\ng3: " + matcher.group(3)
);
}
 
carlos

I have a text file like that:
2005-10-17;AXC dfgh k;29,26;275 2005-10-17;KLCACM Rfhekksn Allerg
FGH;9,65;434 2005-10-17;TYhdkdkj F12;50,5;276 2005-10-17

I'd like to extract the values like that:
2005-10-17
AXC dfgh k
29,26
275
2005-10-17
KLCACM Rfhekksn Allerg FGH

String data = "2005-10-17;AXC dfgh k;29,26;275 2005-10-17;" +
"KLCACM Rfhekksn Allerg FGH;9,65;434 2005-10-17;" +
"TYhdkdkj F12;50,5;276 2005-10-17";

String[] dataArray = data.split(";");

for (int i = 0 ; i < dataArray.length ; i++)
{
    // print each element, not the array reference itself
    System.err.println(dataArray[i]);
}
 
gbgkille69

Many thanks, but that won't work, as a record ends with a number and a
space and not a ;
With split the output would be:
:
29,26
275 2005-10-17
KLCACM...
:
In other words, every record starts with a date. I'd like to use a
regexp in order to keep the program simple, i.e. so that I can pass a
regexp as one single input parameter. Furthermore, the regexp will do a
syntax check for me, which split wouldn't do.

any idea?
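
For what it's worth, a single pattern with four capturing groups might do what is asked here. The original (\\w*)* fails for two reasons: \w does not match the spaces inside the name, and a repeated group only keeps its last iteration, which here is the empty string. The sketch below assumes the name never contains a ;, that the third field looks like 29,26 and that the fourth is a plain integer. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecordRegexDemo {
    // <date>;<name>;<value>;<value>
    // The name may contain spaces, so match "anything except ';'" instead of \w.
    // Assumption: third field is digits with an optional ",digits" part,
    // fourth field is digits only.
    static final Pattern RECORD = Pattern.compile(
            "([0-9]{4}-[0-9]{2}-[0-9]{2});([^;]*);([0-9]+(?:,[0-9]+)?);([0-9]+)");

    // Returns one String[4] per record found in the text.
    static List<String[]> parse(String text) {
        List<String[]> records = new ArrayList<>();
        Matcher m = RECORD.matcher(text);
        while (m.find()) {
            records.add(new String[] {
                m.group(1), m.group(2), m.group(3), m.group(4) });
        }
        return records;
    }

    public static void main(String[] args) {
        String data = "2005-10-17;AXC dfgh k;29,26;275 2005-10-17;"
                + "KLCACM Rfhekksn Allerg FGH;9,65;434 2005-10-17;"
                + "TYhdkdkj F12;50,5;276 2005-10-17";
        for (String[] record : parse(data)) {
            for (String field : record) {
                System.out.println(field);
            }
        }
    }
}
```

With the sample data this finds three records. The lone date at the very end is skipped because no ; follows it, so anything that does not fit the pattern is silently ignored rather than reported. A stricter syntax check would need extra code around the loop.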
 
Roedy Green

In other words, every record starts with a date. I'd like to use a
regexp in order to keep the program simple, i.e. so that I can pass a
regexp as one single input parameter. Furthermore, the regexp will do a
syntax check for me, which split wouldn't do.

You can take it apart yourself with repeated indexOfs in only a few
lines of code. I am presuming there are no fields with an embedded ;
protected by some quoting convention.

You can also take it apart char by char with a finite state automaton.

See http://mindprod.com/jgloss/finitestate.html

If you were of the habit of using hammers to kill mosquitoes, you
could write a parser. See http://mindprod.com/jgloss/parser.html
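
The repeated-indexOf idea could look roughly like the sketch below. It assumes what is presumed above (no ; inside a field) and also that the last value of a record contains no space. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexOfDemo {
    // Walks the text left to right: three fields end at ';',
    // the fourth ends at the space that precedes the next date.
    static List<String> fields(String text) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            for (int i = 0; i < 3; i++) {
                int semi = text.indexOf(';', pos);
                if (semi < 0) {
                    // incomplete trailing record: keep what is left and stop
                    out.add(text.substring(pos));
                    return out;
                }
                out.add(text.substring(pos, semi));
                pos = semi + 1;
            }
            int space = text.indexOf(' ', pos);
            if (space < 0) {
                // last record in the text, nothing follows it
                out.add(text.substring(pos));
                return out;
            }
            out.add(text.substring(pos, space));
            pos = space + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        String data = "2005-10-17;AXC dfgh k;29,26;275 2005-10-17;"
                + "KLCACM Rfhekksn Allerg FGH;9,65;434 2005-10-17";
        for (String field : fields(data)) {
            System.out.println(field);
        }
    }
}
```

This does no validation at all. It simply trusts the layout, which is the price of skipping the regex.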
 
Nigel Wade

Is the content of the file entirely contained within one line? Or are there
multiple records per line, or one record per line?

It would make more sense to me to split the file/string into individual records,
then process each record. This should be much simpler to handle, and the code
should be easier to understand. After creating a set of records, you can split
each one into fields with the ";" field separator.
 
carlos

Many thanks, but that won't work, as a record ends with a number and a
space and not a ;
With split the output would be:
:
29,26
275 2005-10-17
KLCACM...
:
In other words, every record starts with a date. I'd like to use a
regexp in order to keep the program simple, i.e. so that I can pass a
regexp as one single input parameter. Furthermore, the regexp will do a
syntax check for me, which split wouldn't do.


Ok, so ' 2005-' could be used as delimiter.

I came up with targetString.split(" [0-9]{4}-"); to match
this, but unfortunately split discards whatever the expression
matches, so the year gets lost.

Does anyone know how to alter this expression in such a
way that it also includes the delimiter ?
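
One way to keep the year might be a zero-width lookahead: the expression then consumes only the space and merely checks that a date follows. A minimal sketch, assuming a name never contains a space followed by something that looks like a date (the class and method names are made up):

```java
public class LookaheadSplitDemo {
    // " (?=...)" matches a single space, but only where a date follows.
    // The lookahead itself consumes nothing, so the date stays in the result.
    static String[] records(String text) {
        return text.split(" (?=[0-9]{4}-[0-9]{2}-[0-9]{2})");
    }

    public static void main(String[] args) {
        String data = "2005-10-17;AXC dfgh k;29,26;275 2005-10-17;"
                + "KLCACM Rfhekksn Allerg FGH;9,65;434 2005-10-17;"
                + "TYhdkdkj F12;50,5;276 2005-10-17";
        for (String record : records(data)) {
            // each record can now be split on ";" as suggested earlier
            for (String field : record.split(";")) {
                System.out.println(field);
            }
        }
    }
}
```

The spaces inside the names are left alone because no date follows them. On the sample data this gives four pieces, the last one being the dangling date at the end.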
 
Roedy Green

Is the content of the file entirely contained within one line? Or are there
multiple records per line, or one record per line?

It would make more sense to me to split the file/string into individual records,
then process each record. This should be much simpler to handle, and the code
should be easier to understand. After creating a set of records, you can split
each one into fields with the ";" field separator.

Don't be afraid to use some custom logic to handle the stuff that is
awkward in regex, and get the regex to do only what it does naturally.

Similarly, don't be afraid to use two or more regexes, one to find the
big pattern and others to take that pattern apart rather than trying
to do it all in one does-everything-but-eat regex.
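
As a rough illustration of that division of labour: one coarse pattern only locates a whole record, a trivial split on ";" takes it apart, and plain code converts the comma decimal. This is only a sketch under the assumptions that fields never contain a ; and that the last value has no space in it. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TwoStepDemo {
    // Step 1: a coarse pattern whose only job is to find one whole record.
    static final Pattern RECORD = Pattern.compile(
            "[0-9]{4}-[0-9]{2}-[0-9]{2};[^;]*;[^;]*;[^; ]+");

    static List<String[]> parse(String text) {
        List<String[]> out = new ArrayList<>();
        Matcher m = RECORD.matcher(text);
        while (m.find()) {
            // Step 2: a second, trivial pattern takes the record apart.
            // The limit -1 keeps empty fields instead of dropping them.
            out.add(m.group().split(";", -1));
        }
        return out;
    }

    // Custom logic for the part that is awkward in a regex:
    // turn a comma decimal such as "29,26" into a double.
    static double decimal(String s) {
        return Double.parseDouble(s.replace(',', '.'));
    }

    public static void main(String[] args) {
        String data = "2005-10-17;AXC dfgh k;29,26;275 2005-10-17;"
                + "KLCACM Rfhekksn Allerg FGH;9,65;434 2005-10-17";
        for (String[] f : parse(data)) {
            System.out.println(f[0] + " | " + f[1] + " | "
                    + decimal(f[2]) + " | " + Integer.parseInt(f[3]));
        }
    }
}
```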
 
Alan Krueger

wang said:
use java.util.StringTokenizer.

No, you don't want to do that. In the usual case, StringTokenizer
treats multiple occurrences of a delimiter as a single token break. In
a delimited format like the OP described, this may break if any of the
records can be an empty string.
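
A small comparison may make the difference visible. The record with an empty name is an invented example, not taken from the original data, and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class EmptyFieldDemo {
    // StringTokenizer never returns empty tokens, so ";;" is one break.
    static List<String> withTokenizer(String record) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(record, ";");
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // String.split with limit -1 keeps every field, including empty ones.
    static List<String> withSplit(String record) {
        return Arrays.asList(record.split(";", -1));
    }

    public static void main(String[] args) {
        String record = "2005-10-17;;29,26;275"; // hypothetical: empty name
        System.out.println(withTokenizer(record).size() + " tokens");
        System.out.println(withSplit(record).size() + " fields");
    }
}
```

With the tokenizer the empty name vanishes and the remaining values shift one position to the left, so the record comes back with three tokens instead of four fields.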
 
