Java String split returns an additional first empty string

Discussion in 'Java' started by Hanif, Oct 16, 2003.

  1. Hanif

    Hanif Guest

    Hi folks,
    I am trying to use split() on a String; the result is an array of
    Strings, and the first String in the array is always an empty
    string.

    Please note that StringTokenizer did not work, but with split I am
    very close to the number of tokens I want.

    Any ideas?

    Thanks a lot
    Hanif

    ---------------------------------------------------------------------------
    Here is some code that would illustrate the problem:

    String data="XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";

    String t_data=data.trim();
    String[] TOKENS = t_nodeData_.split("XXXX:");
    System.out.println("TOKENS.length : " + TOKENS.length);
    System.out.println("TOKENS[0].length : " + TOKENS[0].length());
    for(int i = 0; i < TOKENS.length; i++) {
    System.out.println("TOKENS["+i+"] :"+TOKENS[i]);
    }


    Result :
    t_data :
    XXXX:aaaaaaa
    bbbbbbb
    XXXX:cccccc
    ddddd
    XXXX:eeeeee
    ffffff

    TOKENS.length : 4

    TOKENS[0].length : 0
    TOKENS[0] :
    TOKENS[1] :aaaaaaa
    bbbbbbb

    TOKENS[2] :cccccc
    ddddd


    TOKENS[3] :eeeeee
    ffffff
    Hanif, Oct 16, 2003
    #1

  2. It is because your source String starts with the token on which you are
    splitting. You could always create a second array by copying non-empty
    strings from the first array. Perhaps this would give you the array you are
    looking for?
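
    A minimal sketch of the copying approach suggested above, assuming a
    reasonably modern Java (generics and String.isEmpty() did not exist in
    2003). The class and method names are hypothetical, chosen only for
    illustration:

```java
import java.util.ArrayList;
import java.util.List;

class NonEmptyTokens {
    // Hypothetical helper: splits the text and keeps only the non-empty pieces.
    static String[] splitNonEmpty(String text, String regex) {
        List<String> kept = new ArrayList<>();
        for (String token : text.split(regex)) {
            if (!token.isEmpty()) {
                kept.add(token);
            }
        }
        return kept.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String data = "XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";
        String[] tokens = splitNonEmpty(data.trim(), "XXXX:");
        System.out.println("tokens.length : " + tokens.length); // prints 3 for this input
    }
}
```

    Note that this also drops empty strings from the middle of the array,
    which may or may not be what you want.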



    Collin VanDyck, Oct 16, 2003
    #2

  3. VisionSet

    VisionSet Guest

    "Hanif" <> wrote in message
    news:...
    > Hi folks,
    > I am trying to use the split on a String, the result returns an
    > array of Strings. I always get the first String in the array as an
    > empty string.
    >
    > Please note that the StringTokenizer did not work, but with using
    > split i am very close to the number of token i want.
    >


    Oooh, you wait till Paul sees this, you'll be for it!

    What output exactly are you *expecting* from what input?

    String.split(), if you read the API, splits *around* matches of the regex
    you give it. So you will get an empty string as the first element if the
    regex matches at the very start of the String.
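
    A small illustration of that point, as a sketch only. The helper name is
    hypothetical and the inputs are made up for the example:

```java
class LeadingEmptyDemo {
    // Hypothetical helper: true when the first element of the split result is empty.
    static boolean startsWithEmptyToken(String text, String regex) {
        String[] tokens = text.split(regex);
        return tokens.length > 0 && tokens[0].isEmpty();
    }

    public static void main(String[] args) {
        // The delimiter matches at index 0, so the text before it is the empty string.
        System.out.println("XXXX:a XXXX:b".split("XXXX:").length);           // 3
        System.out.println(startsWithEmptyToken("XXXX:a XXXX:b", "XXXX:")); // true

        // No match at the start: the first element is ordinary text.
        System.out.println("a XXXX:b".split("XXXX:").length);               // 2
        System.out.println(startsWithEmptyToken("a XXXX:b", "XXXX:"));      // false
    }
}
```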

    --
    Mike W
    VisionSet, Oct 16, 2003
    #3
  4. Paul Lutus

    Paul Lutus Guest

    Hanif wrote:

    > Hi folks,
    > I am trying to use the split on a String, the result returns an
    > array of Strings. I always get the first String in the array as an
    > empty string.
    >
    > Please note that the StringTokenizer did not work, but with using
    > split i am very close to the number of token i want.
    >
    > Any Ideas !
    >
    > Thanks a lot
    > Hanif
    >
    >
    > ---------------------------------------------------------------------------
    > Here is some code that would illustrate the problem:
    >
    > String data="XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";
    >
    > String t_data=data.trim();
    > String[] TOKENS = t_nodeData_.split("XXXX:");


    No, this is wrong, and in two (some might say three) ways. Do it this way:

    String[] tokens = t_data.split(":");

    And the variable "t_nodeData_" appears nowhere in your code. You must always
    post the actual code (copy and paste), not something you made up for your
    newsreader.

    And avoid the use of ALL UPPERCASE for ordinary variable names.

    Finally, if you might have one or more empty final fields that you want
    represented in the resulting array, do it this way:

    String[] tokens = t_data.split(":",-1);
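
    To make the effect of the negative limit concrete, here is a short sketch
    using a made-up input line (the class and helper names are hypothetical):

```java
import java.util.Arrays;

class TrailingFieldsDemo {
    // Hypothetical helper: number of fields produced for a given limit.
    static int fieldCount(String line, int limit) {
        return line.split(":", limit).length;
    }

    public static void main(String[] args) {
        String line = "a:b::";
        // Default behaviour (limit 0): trailing empty strings are discarded.
        System.out.println(Arrays.toString(line.split(":")));     // [a, b]
        // Negative limit: trailing empty strings are kept.
        System.out.println(Arrays.toString(line.split(":", -1))); // [a, b, , ]
    }
}
```

    The limit only affects empty strings at the end of the array; a leading
    empty string is returned either way.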


    --
    Paul Lutus
    http://www.arachnoid.com
    Paul Lutus, Oct 16, 2003
    #4
  5. The initial zero length string is not additional, it is the first
    substring in your example data that is terminated by your regular
    expression.

    Here is an excerpt from the two argument split() javadoc:
    The array returned by this method contains each
    substring of this string that is terminated by
    another substring that matches the given expression
    or is terminated by the end of the string.

    Even though you are calling the one-argument split() method, you still
    need to know how the two-argument split() method functions, because of
    the following excerpt from the one-argument split() javadoc:
    This method works as if by invoking the two-argument
    split method with the given expression and a limit
    argument of zero.
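
    The two javadoc excerpts can be checked against the data from the
    original post with a sketch like the following (the class and method
    names are hypothetical):

```java
import java.util.Arrays;

class SplitLimitZeroDemo {
    // Hypothetical helper: compares split(regex) with split(regex, 0).
    static boolean sameAsLimitZero(String text, String regex) {
        return Arrays.equals(text.split(regex), text.split(regex, 0));
    }

    public static void main(String[] args) {
        String data = "XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";
        String trimmed = data.trim();

        System.out.println(sameAsLimitZero(trimmed, "XXXX:")); // true

        // The first substring is the (empty) text terminated by the first "XXXX:".
        String[] tokens = trimmed.split("XXXX:");
        System.out.println(tokens.length);      // 4
        System.out.println(tokens[0].length()); // 0
    }
}
```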
    David W. Burhans, Oct 16, 2003
    #5
  6. Hanif

    Hanif Guest

    Hi,
    Thanks for all your replies. I agree with your opinion.
    Issue resolved:
    it would be better if I skip the first item of the array, since the
    data will always start with the token string pattern.
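
    Roughly what that looks like, as a sketch assuming Java 6 or later for
    Arrays.copyOfRange (the class and method names are hypothetical):

```java
import java.util.Arrays;

class SkipFirstToken {
    // Hypothetical helper: drops the first element only when it is empty.
    static String[] splitSkippingLeadingEmpty(String text, String regex) {
        String[] tokens = text.split(regex);
        if (tokens.length > 0 && tokens[0].isEmpty()) {
            return Arrays.copyOfRange(tokens, 1, tokens.length);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String data = "XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";
        String[] tokens = splitSkippingLeadingEmpty(data.trim(), "XXXX:");
        System.out.println("tokens.length : " + tokens.length); // prints 3 for this input
    }
}
```

    Checking that the first element really is empty guards against input
    that does not start with the pattern after all.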
    Hanif, Oct 17, 2003
    #6
  7. Paul Lutus

    Paul Lutus Guest

    Hanif wrote:

    > Hi,
    > Thanks for all your replies. i agree with your opinion.
    > issue resolved:
    > it would be better if i leave the first item from the array, since the
    > data will always start with the token string pattern.


    If you believe this, then you seem to have missed the point. Why not filter
    out the desired section first, then apply "split()" on the result?

    Unless I don't understand your meaning.
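
    One possible reading of this suggestion, sketched under the assumption
    that "filtering" means removing the leading delimiter before splitting.
    The class, constant, and method names are hypothetical:

```java
class StripThenSplit {
    // Assumed delimiter from the original post; it contains no regex metacharacters.
    static final String DELIMITER = "XXXX:";

    // Hypothetical helper: removes a leading delimiter, then splits the rest.
    static String[] records(String text) {
        String body = text.trim();
        if (body.startsWith(DELIMITER)) {
            body = body.substring(DELIMITER.length());
        }
        return body.split(DELIMITER);
    }

    public static void main(String[] args) {
        String data = "XXXX:aaaaaaa\nbbbbbbb\nXXXX:cccccc\nddddd\nXXXX:eeeeee\nffffff\n";
        String[] tokens = records(data);
        System.out.println("tokens.length : " + tokens.length); // prints 3 for this input
    }
}
```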

    --
    Paul Lutus
    http://www.arachnoid.com
    Paul Lutus, Oct 17, 2003
    #7