Hendrik Maryns
(This is in Java, but the regex is general, therefore x-post to
c.l.p.m., f-up to c.l.j.h.)
Hi all,
I want to discard the header of some file. The header is everything
before a line beginning with "#BOS". However, I do not want #BOS to be
part of the match, since I need it later on.
I thought of using a regex to do that. I came up with
.*(?s)(?=#BOS)
However, this gave me nothing.
(To be precise, I have
Scanner corpus = new Scanner(inFile);
Pattern header = Pattern.compile(".*(?s)(?=#BOS)", Pattern.MULTILINE);
corpus.skip(header);
and it gives me
java.util.NoSuchElementException
at java.util.Scanner.skip(Scanner.java:1706)
at de.uni_tuebingen.sfb.lichtenstein.binarytrees.Converter2.main(Converter2.java:61)
so if any of the Java people see a problem there, please point it out.)
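For anyone who wants to reproduce this without the file, here is a reduced sketch. It assumes that Matcher.lookingAt() is a fair stand-in for Scanner.skip(), since both only accept a match that starts at the current position. The class name HeaderRegexRepro, the helper matchesAtStart and the sample strings are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeaderRegexRepro {
    // Same pattern and flag as in the snippet above.
    static final Pattern HEADER =
            Pattern.compile(".*(?s)(?=#BOS)", Pattern.MULTILINE);

    // True if the pattern matches starting at the very beginning of the text
    // (hypothetical helper, used here instead of Scanner.skip).
    static boolean matchesAtStart(String text) {
        Matcher m = HEADER.matcher(text);
        return m.lookingAt();
    }

    public static void main(String[] args) {
        // Made-up input: #BOS sits on the third line, after a two-line header.
        String multiLineHeader = "header 1\nheader 2\n#BOS 1\nbody\n";
        // Made-up input: #BOS sits on the very first line.
        String sameLine = "header #BOS 1\nbody\n";

        System.out.println(matchesAtStart(multiLineHeader));
        System.out.println(matchesAtStart(sameLine));
    }
}
```

The first call reports no match and the second one does, so the pattern only seems to succeed when #BOS is on the first line of the input.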
So to pinpoint my problem: I want a regex which matches any number of
lines until it finds a line beginning with #BOS, but does not include
#BOS in the match.
Other tries looked like this:
.*?(?s)(?=#BOS)
(.|\n)*?(?=#BOS) (this freezes the program)
.*(?=#BOS) with the MULTILINE option to Pattern.compile
.*(?s)^(?=#BOS)
and several others, but I find no solution. So my last resort is asking
here.
TIA, H.
--
Hendrik Maryns
http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html