Java String.split() returns an additional first empty string

Hanif

Hi folks,
I am trying to use split on a String, and the result is an array of
Strings. The first String in the array is always an empty string.

Please note that StringTokenizer did not work, but with split I am
very close to the number of tokens I want.

Any ideas?

Thanks a lot
Hanif

---------------------------------------------------------------------------
Here is some code that would illustrate the problem:

String data="XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";

String t_data=data.trim();
String[] TOKENS = t_nodeData_.split("XXXX:");
System.out.println("TOKENS.length : " + TOKENS.length);
System.out.println("TOKENS[0].length : " + TOKENS[0].length());
for(int i = 0; i < TOKENS.length; i++) {
    System.out.println("TOKENS["+i+"] :"+TOKENS[i]);
}


Result :
t_data :
XXXX:aaaaaaa
bbbbbbb
XXXX:cccccc
ddddd
XXXX:eeeeee
ffffff

TOKENS.length : 4

TOKENS[0].length : 0
TOKENS[0] :
TOKENS[1] :aaaaaaa
bbbbbbb

TOKENS[2] :cccccc
ddddd


TOKENS[3] :eeeeee
ffffff
 
Collin VanDyck

It is because your source String starts with the token on which you are
splitting. You could always create a second array by copying non-empty
strings from the first array. Perhaps this would give you the array you are
looking for?
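
A minimal sketch of the copying approach described above, assuming every empty token is unwanted. The class name NonEmptyTokens and the helper splitNonEmpty are hypothetical names chosen for illustration; they do not come from the original post.

```java
import java.util.ArrayList;
import java.util.List;

final class NonEmptyTokens {

    // Splits the input and copies only the non-empty tokens into a new array.
    static String[] splitNonEmpty(String input, String regex) {
        List<String> kept = new ArrayList<>();
        for (String token : input.split(regex)) {
            if (!token.isEmpty()) {
                kept.add(token);
            }
        }
        return kept.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String data = "XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";
        String[] tokens = splitNonEmpty(data.trim(), "XXXX:");
        System.out.println("tokens.length : " + tokens.length); // tokens.length : 3
    }
}
```

Note that this also drops empty tokens in the middle of the input, which may not be wanted if an empty record is meaningful.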
 
VisionSet

Oooh, you wait till Paul sees this, you'll be for it!

What output exactly are you *expecting* from what input?

String.split(), if you read the API, splits *around* matches of the regex
you give it. So you will get a leading empty string if the regex matches at
the very start of the String.
 
Paul Lutus

Hanif said:
String data="XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";

String t_data=data.trim();
String[] TOKENS = t_nodeData_.split("XXXX:");

No, this is wrong, and in two (some might say three) ways. Do it this way:

String[] tokens = t_data.split(":");

And the variable "t_nodeData_" appears nowhere in your code. You must always
post the actual code (copy and paste), not something you made up for your
newsreader.

And avoid the use of ALL UPPERCASE for ordinary variable names.

Finally, if you might have one or more empty trailing fields that you want
represented in the resulting array, do it this way:

String[] tokens = t_data.split(":",-1);
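
A small sketch of what the limit argument changes, using a made-up input "a:b::" that ends in empty fields. The class name SplitLimitDemo and its two helpers are hypothetical. For comparison, it also splits the data from the original post on ":" alone.

```java
import java.util.Arrays;

final class SplitLimitDemo {

    // One-argument form: trailing empty strings are discarded.
    static String[] dropTrailing(String s) {
        return s.split(":");
    }

    // Negative limit: trailing empty strings are kept.
    static String[] keepTrailing(String s) {
        return s.split(":", -1);
    }

    public static void main(String[] args) {
        String sample = "a:b::"; // made-up input with two empty trailing fields
        System.out.println(Arrays.toString(dropTrailing(sample))); // [a, b]
        System.out.println(Arrays.toString(keepTrailing(sample))); // [a, b, , ]

        // The data from the original post, split on ":" alone.
        String data = "XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";
        String[] byColon = data.trim().split(":");
        System.out.println(byColon.length); // 4
        System.out.println(byColon[0]);     // XXXX
    }
}
```

On the posted data, splitting on ":" alone leaves the "XXXX" marker inside the tokens, so whether ":" or "XXXX:" is the right delimiter depends on the output that is actually wanted.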
 
David W. Burhans

The initial zero length string is not additional, it is the first
substring in your example data that is terminated by your regular
expression.

Here is an excerpt from the two-argument split() javadoc:

    The array returned by this method contains each
    substring of this string that is terminated by
    another substring that matches the given expression
    or is terminated by the end of the string.

Even though you are calling the one-argument split() method, you still
need to know how the two-argument split() method works, because of
the following excerpt from the one-argument split() javadoc:

    This method works as if by invoking the two-argument
    split method with the given expression and a limit
    argument of zero.
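
To make the javadoc wording concrete, here is a small sketch on a made-up input that both starts and ends with the delimiter. The class name LeadingEmptyDemo and the helper countTokens are hypothetical. With a limit of zero the trailing empty string is discarded, but the leading one, which is a substring terminated by the first match, is kept.

```java
import java.util.Arrays;

final class LeadingEmptyDemo {

    // Returns how many tokens split() produces for the given limit.
    static int countTokens(String input, String regex, int limit) {
        return input.split(regex, limit).length;
    }

    public static void main(String[] args) {
        String input = "XXXX:aXXXX:bXXXX:"; // made-up input: delimiter at both ends

        // Limit 0 (same as the one-argument form): leading "" kept, trailing "" dropped.
        System.out.println(Arrays.toString(input.split("XXXX:", 0)));  // [, a, b]
        System.out.println(countTokens(input, "XXXX:", 0));            // 3

        // Negative limit: both the leading and the trailing "" are kept.
        System.out.println(countTokens(input, "XXXX:", -1));           // 4
    }
}
```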
 
Hanif

Hi,
Thanks for all your replies. I agree with you.
Issue resolved:
it is simplest to skip the first item of the array, since the
data will always start with the token string pattern.
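
A minimal sketch of skipping the first item, assuming the input may or may not start with the delimiter. The class name SkipFirstToken and the helper dropLeadingEmpty are hypothetical; the guard avoids discarding a real token when the first element is not empty.

```java
import java.util.Arrays;

final class SkipFirstToken {

    // Returns the tokens without the leading empty string, if there is one.
    static String[] dropLeadingEmpty(String[] tokens) {
        if (tokens.length > 0 && tokens[0].isEmpty()) {
            return Arrays.copyOfRange(tokens, 1, tokens.length);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String data = "XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";
        String[] tokens = dropLeadingEmpty(data.trim().split("XXXX:"));
        System.out.println("tokens.length : " + tokens.length); // tokens.length : 3
    }
}
```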

 
Paul Lutus

Hanif said:
Hi,
Thanks for all your replies. I agree with you.
Issue resolved:
it is simplest to skip the first item of the array, since the
data will always start with the token string pattern.

If you believe this, then you seem to have missed the point. Why not filter
out the desired section first, then apply "split()" on the result?

Unless I don't understand your meaning.
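
One possible reading of "filter first, then split", sketched under the assumption that it means removing the leading marker before calling split(). The class name StripThenSplit and the helper splitRecords are hypothetical, and Pattern.quote is used so the marker is treated as literal text rather than as a regex.

```java
import java.util.regex.Pattern;

final class StripThenSplit {

    // Removes a leading marker, if present, then splits on the remaining markers.
    static String[] splitRecords(String input, String marker) {
        String body = input.trim();
        if (body.startsWith(marker)) {
            body = body.substring(marker.length());
        }
        return body.split(Pattern.quote(marker));
    }

    public static void main(String[] args) {
        String data = "XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";
        String[] tokens = splitRecords(data, "XXXX:");
        System.out.println("tokens.length : " + tokens.length); // tokens.length : 3
    }
}
```

With the marker stripped up front, no empty first element is produced, so nothing has to be skipped afterwards.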
 
