Reading CSV-type files...

Andreas Leitgeb

I've got a file in a CSV-type format:
elem1:elem2:elem3
foo:xyz:fjhl
but sometimes some fields might also be empty:
:abc:jkfh

I've tried using a java.util.Scanner (on each line)
for this (with a delimiter of ":"), but it doesn't
account for empty fields, so in the third example
line, it would give me the "abc" as *first* element,
not as second. :-(

Is there some setting for Scanner that I've missed,
or is there something similar to Scanner that works
more like what I need,
or do I have to code that myself?
 
Daniel Pitts

Have you tried myString.split(":")?
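A minimal sketch of that suggestion, assuming each line is already available as a String. The class name ColonSplitDemo and the helper fields() are made up for illustration. One detail worth knowing: plain split(":") keeps a leading empty field but silently drops trailing empty ones, so the sketch passes a negative limit to keep every field.

```java
import java.util.Arrays;

class ColonSplitDemo {
    // Splits one line on ':' and keeps empty fields, including trailing ones.
    static String[] fields(String line) {
        // A negative limit stops split() from discarding trailing empty strings.
        return line.split(":", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fields(":abc:jkfh"))); // [, abc, jkfh]
        System.out.println(Arrays.toString(fields("foo:xyz:")));  // [foo, xyz, ]
    }
}
```

With this, "abc" ends up at index 1 for the line ":abc:jkfh", which is the position the original question asked for.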
 
Manish Pandit

Tokenizing the file might not scale if the file is huge. You might want
to give csvjdbc a look (http://csvjdbc.sourceforge.net/), which gives a
JDBC API over CSV files. You can set the delimiter to ":" instead of the
default comma.

-cheers,
Manish
 
Googmeister

How about reading in a line with Scanner,
and then using the split() method in String?
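Roughly, that could look like the sketch below. It assumes the Scanner is only used for nextLine(); the class LineSplitDemo and the method readRows() are hypothetical names, and the Scanner here reads from a String so the example is self-contained. For a real file one would construct it from a File instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class LineSplitDemo {
    // Reads every remaining line from the Scanner and splits it on ':'.
    static List<String[]> readRows(Scanner in) {
        List<String[]> rows = new ArrayList<>();
        while (in.hasNextLine()) {
            // Limit -1 keeps empty fields at the end of a line as well.
            rows.add(in.nextLine().split(":", -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        Scanner in = new Scanner("elem1:elem2:elem3\nfoo:xyz:fjhl\n:abc:jkfh\n");
        for (String[] row : readRows(in)) {
            System.out.println(row.length + " fields, second = " + row[1]);
        }
    }
}
```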
 
Andreas Leitgeb

Googmeister said:
How about reading in a line with Scanner,
and then using the split() method in String?

Yes, thanks a lot. That's what I'm gonna do for
the problem at hand.

I had read about String.split before, but was obviously too
fixated on the Scanner being able to read directly from a
File, although for tokenizing each line that of course
wasn't actually necessary in my case.

I haven't found any reference on what kind of stream
the Scanner uses when working on a File. Can it be
expected to use buffered streams, or is using a Scanner
directly on a File just playing lottery with performance?
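The thread leaves this question unanswered. One way to sidestep it, rather than depend on whatever Scanner does internally, is to make the buffering explicit with a BufferedReader and split each line as suggested above. This is only a sketch: BufferedSplitDemo and readRows() are made-up names, and a StringReader stands in for the file so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class BufferedSplitDemo {
    // Wraps the given Reader in a BufferedReader and splits each line on ':'.
    static List<String[]> readRows(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                rows.add(line.split(":", -1)); // -1 keeps trailing empty fields
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // For a real file, pass a java.io.FileReader for its path instead.
        Reader source = new StringReader("foo:xyz:fjhl\n:abc:jkfh\n");
        for (String[] row : readRows(source)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```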
 
Andreas Leitgeb

Manish Pandit said:
Tokenizing the file might not scale if the file is huge. You might want
to give csvjdbc a look (http://csvjdbc.sourceforge.net/), which gives a
JDBC API over CSV files. You can set the delimiter to ":" instead of the
default comma.

Thanks, I appreciate the answer.

In my case, however, the job to be done for each parsed
line definitely outweighs any optimisations in parsing.
 
David Segall

Check out Roedy Green's CSV reader
<http://www.mindprod.com/products1.html#CSV>.
 