Removing Dates from Strings

Bertram Hurtig

Hi,

I have a big file - each line may start with a date and time (German date
formatting). To be able to sort and compare these lines, I want to
remove the date and time substrings.

For the date and time, I got this regex:
\d\d\.\d\d\.\d\d\d\d\s\d\d:\d\d:\d\d

I know there are classes like Pattern and Matcher, but as far as I can
see they only tell me whether the String contains a date + time - I have
no idea how to also get the relevant index positions so that I can remove
the substrings found.

I know I could do this "manually" by writing my own parsing method,
but I would prefer something nice and short (and performance optimized) -
and I don't want to reinvent the wheel.... ;-)

Any ideas?

Thanks in advance,


Bertram
 
Christian

Use capturing groups and you are done: just surround the interesting parts
with (), then you can retrieve them.
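
A minimal sketch of that idea, assuming the date and the time each go into their own group. The class name GroupDemo and the helper cut() are made up for illustration. Matcher.start() and Matcher.end() give the index positions of the match, which is what is needed to cut the substring out:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo
{
    // Group 1 captures the date, group 2 the time.
    static final Pattern STAMP = Pattern.compile(
        "(\\d\\d\\.\\d\\d\\.\\d{4})\\s(\\d\\d:\\d\\d:\\d\\d)");

    // Hypothetical helper: removes the first timestamp found,
    // or returns the line unchanged if there is none.
    static String cut(String line)
    {
        Matcher m = STAMP.matcher(line);
        if (!m.find())
            return line;
        // start() and end() are the index positions of the whole match.
        return line.substring(0, m.start()) + line.substring(m.end());
    }

    public static void main(String[] args)
    {
        String line = "12.12.2007 10:39:59 abc def";
        Matcher m = STAMP.matcher(line);
        if (m.find())
        {
            System.out.println("date:  " + m.group(1));
            System.out.println("time:  " + m.group(2));
            System.out.println("range: " + m.start() + ".." + m.end());
        }
        System.out.println("rest:  [" + cut(line) + "]");
    }
}
```

Note that cut() leaves the whitespace after the timestamp in place; trim the result or extend the pattern with \\s* if that matters.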
 
Jeff Higgins

Maybe java.text.SimpleDateFormat.parse(String text, ParsePosition pos).
 
Stefan Ram

Bertram Hurtig said:
To be able to sort and compare these lines, I want to
remove the date and time substrings.

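public class Main
{
    public static void main( final java.lang.String[] args )
    {
        final java.util.Scanner source = new java.util.Scanner
        ( "12.12.2007 10:39:59 abc def\n" +
          "12.12.2007 10:39:59 abc def\n" );

        // Drop the first two whitespace-separated tokens of each line,
        // whatever they contain. replaceFirst with '^' touches only the
        // start of the line; replaceAll would also eat later token pairs
        // on longer lines.
        while( source.hasNextLine() )
        { java.lang.System.out.println
          ( source.nextLine().replaceFirst( "^(?:\\S+\\s+){2}", "" )); }
    }
}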

abc def
abc def
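
If lines without a date prefix must be left alone, one option is to match the actual date pattern instead of "any two tokens". A rough sketch, assuming the dd.MM.yyyy HH:mm:ss layout from the original question; the names StripPrefix, PREFIX and strip() are invented for this example. The pattern is compiled once, which should help with a big file:

```java
import java.util.regex.Pattern;

class StripPrefix
{
    // Compiled once and reused; '^' restricts the match to the line start,
    // the trailing \s* also swallows the blank(s) after the time.
    static final Pattern PREFIX = Pattern.compile(
        "^\\d\\d\\.\\d\\d\\.\\d{4}\\s\\d\\d:\\d\\d:\\d\\d\\s*");

    // Returns the line without a leading timestamp; lines that do not
    // start with one come back unchanged.
    static String strip(String line)
    {
        return PREFIX.matcher(line).replaceFirst("");
    }

    public static void main(String[] args)
    {
        String[] lines =
        {
            "12.12.2007 10:39:59 abc def",
            "abc def ghi",
            "abc 12.12.2007 10:39:59"
        };
        for (String line : lines)
            System.out.println(strip(line));
    }
}
```

The regex only checks the shape of the prefix (digits and separators), not whether it is a valid calendar date.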
 
Jeff Higgins

Jeff Higgins wrote:
Maybe java.text.SimpleDateFormat.parse(String text, ParsePosition pos).

import java.text.ParsePosition;
import java.text.SimpleDateFormat;

public class Main
{
    public static void main(String[] args)
    {
        String[] source =
        {
            "12.12.2007 10:39:59 abc def",
            "abc def",
            "12.12.2007 10:39:59 abc def ghi",
            "12.12.2007 10:39:59 abc 12.12.2007 10:39:59",
            "12.12.2007 10:39:59 12.12.2007 10:39:59 abc",
            "abc def ghi 12.12.2007 10:39:59"
        };
        // The trailing blank in the pattern also consumes the separator
        // between the timestamp and the rest of the line.
        SimpleDateFormat format =
            new SimpleDateFormat("dd.MM.yyyy HH:mm:ss ");
        ParsePosition pos = new ParsePosition(0);
        for(String s : source)
        {
            // parse() returns null if s does not start with a date;
            // otherwise pos points just past the parsed prefix.
            if(format.parse(s, pos) != null)
                System.out.println(s.substring(pos.getIndex()));
            else
                System.out.println(s);
            pos.setIndex(0);    // reuse the position for the next line
        }
    }
}

abc def
abc def
abc def ghi
abc 12.12.2007 10:39:59
12.12.2007 10:39:59 abc
abc def ghi 12.12.2007 10:39:59
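
One thing to keep in mind with this approach: SimpleDateFormat is lenient by default, so a prefix such as "45.13.2007 10:39:59" is still accepted and stripped. If only real calendar dates should count, setLenient(false) can be used. A small sketch to show the difference; the class name StrictDemo and the strip() helper are made up for this example:

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;

class StrictDemo
{
    // Strips a leading timestamp, parsing either leniently or strictly.
    static String strip(String s, boolean lenient)
    {
        SimpleDateFormat format =
            new SimpleDateFormat("dd.MM.yyyy HH:mm:ss ");
        format.setLenient(lenient);
        ParsePosition pos = new ParsePosition(0);
        // null means: no acceptable date at the start of s.
        return format.parse(s, pos) != null
            ? s.substring(pos.getIndex())
            : s;
    }

    public static void main(String[] args)
    {
        String bogus = "45.13.2007 10:39:59 abc";
        System.out.println("lenient: " + strip(bogus, true));
        System.out.println("strict:  " + strip(bogus, false));
    }
}
```

Whether the strict behaviour is wanted depends on the data; for pure "remove whatever looks like a timestamp" the default is probably fine.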
 
Jeff Higgins

import java.text.NumberFormat;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Comparator;

public class Main
{
    public static void main(String[] args)
    {
        String[] source =
        {
            "12.12.2007 10:39:59 776854 3.1416e0",
            "144209 456332.987e3",
            "12.12.2007 10:39:59 567445 1e3",
            "12.12.2007 10:39:59 999222 19",
            "334978, 332.987e3",
            "999224 789.3e-15"
        };

        Arrays.sort(source, new IgnoreDateComparator());
        for(String s : source)
            System.out.println(s);
    }

    // Orders lines by the integer that follows the optional date prefix.
    static class IgnoreDateComparator
        implements Comparator<String>
    {
        @Override
        public int compare(String s1, String s2)
        {
            return Long.compare(key(s1), key(s2));
        }

        // Skips a leading date (if any) and parses the integer after it.
        private static long key(String s)
        {
            SimpleDateFormat df =
                new SimpleDateFormat("dd.MM.yyyy HH:mm:ss ");
            NumberFormat nf = NumberFormat.getIntegerInstance();
            ParsePosition pos = new ParsePosition(0);

            // On success pos ends up just past the date;
            // on failure it stays at 0.
            df.parse(s, pos);
            Number n = nf.parse(s, pos);
            if(n == null)
                throw new IllegalArgumentException("no number in: " + s);
            return n.longValue();
        }
    }
}

144209 456332.987e3
334978, 332.987e3
12.12.2007 10:39:59 567445 1e3
12.12.2007 10:39:59 776854 3.1416e0
12.12.2007 10:39:59 999222 19
999224 789.3e-15
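
If the lines should simply be ordered by whatever text follows the timestamp, rather than by a leading number, a comparator can be built from a "strip the prefix" function. This is only a sketch: it assumes Java 8 or later for Comparator.comparing, and the names SortIgnoringDate, rest() and BY_REST are invented here:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.regex.Pattern;

class SortIgnoringDate
{
    // Leading dd.MM.yyyy HH:mm:ss plus the blanks after it.
    static final Pattern PREFIX = Pattern.compile(
        "^\\d\\d\\.\\d\\d\\.\\d{4}\\s\\d\\d:\\d\\d:\\d\\d\\s*");

    // The part of the line used as the sort key.
    static String rest(String line)
    {
        return PREFIX.matcher(line).replaceFirst("");
    }

    // Compares two lines by their text after the optional timestamp.
    static final Comparator<String> BY_REST =
        Comparator.comparing(SortIgnoringDate::rest);

    public static void main(String[] args)
    {
        String[] lines =
        {
            "12.12.2007 10:39:59 pear",
            "apple",
            "13.12.2007 08:00:00 banana"
        };
        Arrays.sort(lines, BY_REST);
        for (String line : lines)
            System.out.println(line);
    }
}
```

The original lines are kept intact; only the comparison ignores the timestamp. The key is recomputed on every comparison, so for very large inputs it may be worth stripping each line once up front.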
 
