better way to parse Tabs

MileHighCelt

I have been looking through user groups and APIs, and there seems to be
some difficulty with parsing a tab-delimited file. If a user uploads a file
that is tab-delimited and could contain nulls, is this necessarily the
best approach:

InputStream in = theFile.getInputStream();
BufferedReader r = new BufferedReader(new InputStreamReader(in));
String line;
while ((line = r.readLine()) != null) {
    StringTokenizer st = new StringTokenizer(line, "\t", true);
    MyBean region = new MyBean();

    region.setLab_id(Integer.parseInt(st.nextToken()));
    region.setControl_nmbr(st.nextToken());
    ...
}

As I understand it, the "true" argument in the tokenizer will handle
the nulls/no value as empty strings?

Is there a better way to solve this? There are probably 50 columns
per row, with a variety of date, int, and String values.

Thank you for your opinion and advice.
 
Jean-Francois Briere

As I understand it, the "true" argument in the tokenizer will handle
the nulls/no value as empty strings?

No. If you read the documentation carefully, you will see that "true"
means that it returns the delimiters (in your case "\t") as tokens.
So if you want to use a StringTokenizer, you'll have to always check for
the "\t" token and also take care of consecutive "\t" tokens.
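To make the returnDelims behaviour concrete, here is a small sketch. The class name TokenizerDemo, the helper tokens(), and the sample line are made up for illustration; only StringTokenizer itself comes from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Hypothetical helper: collects every token StringTokenizer yields for a line.
    static List<String> tokens(String line, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(line, "\t", returnDelims);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "42\t\tABC"; // three columns, the middle one empty

        // Without delimiters the empty column vanishes: only "42" and "ABC" come back.
        System.out.println(tokens(line, false).size()); // 2

        // With returnDelims=true each "\t" is itself a token: "42", "\t", "\t", "ABC".
        // No empty string is ever produced for the missing value.
        System.out.println(tokens(line, true).size()); // 4
    }
}
```

Either way the caller has to infer the empty column from two adjacent "\t" tokens, which is the bookkeeping described above.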

Is there a better way to solve this?

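Yes. You could do (Java 1.4+):

String[] elements = line.split("\t", -1);

In the returned string array, the empty elements (\t\t) will be empty
strings. The -1 limit matters: a plain line.split("\t") discards
trailing empty strings, so a row whose last columns are empty would
come back as a shorter array.
All you'll have to do is convert the relevant elements to Date or int
(always checking first for an empty string, of course).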

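region.setLab_id(convertToInt(elements[0]));
region.setControl_nmbr(elements[1]); // no conversion needed
....
region.setSomeDateField(convertToDate(elements[n]));
....

// An empty column maps to 0; non-numeric text still throws NumberFormatException.
int convertToInt(String str)
{
    return (str.length() == 0) ? 0 : Integer.parseInt(str);
}

// An empty column maps to null; SimpleDateFormat.parse throws the checked ParseException.
Date convertToDate(String str) throws ParseException
{
    return (str.length() == 0) ? null :
        someSimpleDateFormatInstance.parse(str);
}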
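For anyone who wants to try this end to end, a self-contained sketch follows. The class name TabRowParser, the helper names, the sample row, and the "yyyy-MM-dd" date pattern are all assumptions for illustration; the thread does not say what date format the file uses. It also shows how the limit argument of String.split affects trailing empty columns.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class TabRowParser {
    // Assumed pattern; replace with whatever the uploaded file really uses.
    private static final String DATE_PATTERN = "yyyy-MM-dd";

    // A negative limit keeps trailing empty strings, so the column count stays fixed.
    static String[] splitRow(String line) {
        return line.split("\t", -1);
    }

    // Empty column becomes 0, as in the reply above.
    static int toInt(String s) {
        return s.isEmpty() ? 0 : Integer.parseInt(s);
    }

    // Empty column becomes null; malformed dates raise ParseException.
    static Date toDate(String s) throws ParseException {
        if (s.isEmpty()) {
            return null;
        }
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat fmt = new SimpleDateFormat(DATE_PATTERN);
        fmt.setLenient(false);
        return fmt.parse(s);
    }

    public static void main(String[] args) throws ParseException {
        String line = "7\t\t2006-03-15\t"; // four columns; second and fourth are empty

        System.out.println(line.split("\t").length); // 3: trailing empty column dropped
        String[] e = splitRow(line);
        System.out.println(e.length);                // 4: all columns kept

        System.out.println(toInt(e[0]));             // 7
        System.out.println(toInt(e[1]));             // 0
        System.out.println(toDate(e[3]));            // null
    }
}
```

With roughly 50 columns per row, checking elements.length against the expected column count before indexing gives a clearer error than an ArrayIndexOutOfBoundsException on a short row.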

Regards
 
MileHighCelt

Thank you - that is exactly what I was looking for! I will swap out
the StringTokenizer I implemented as it sure seems slow.
 
