Split string whose length is varying

Discussion in 'C Programming' started by Java and Swing, Oct 6, 2005.

  1. Say I have a string containing numbers separated by commas, such as
    "0,1,2,3,4,5". I want to split the string at the commas and return
    an array containing 0, 1, ..., 5.

    Suggestions? I've tried something like:

    int *Split(char *msg) {
        int len = 0;
        char *tmp;
        char *sub_string = NULL;
        int *results;

        tmp = (char *) malloc(strlen(msg) * sizeof(char));
        strcpy(tmp, msg);

        sub_string = strtok(tmp, ",");
        while (sub_string != NULL) {
            len++;
            sub_string = strtok(NULL, ",");
        }

        results = (int *) malloc(len * sizeof(int));
        sub_string = strtok(msg, ","); <<<<----crashes here
        while (sub_string != NULL) {
            *results = (int) atoi(sub_string);
            results += sizeof(int);
            sub_string = strtok(NULL, ",");
        }

        return results;
    }

    Above, I have marked the line where it crashes. I think it's because
    strtok needs a pointer to the string, so I changed that line to:

    sub_string = strtok(&msg, ",");

    That got me past the error, but when I try to print the results after
    the function returns, they're not right. Maybe it's my function, or my
    way of printing the results.

    main() {
        char *msg = "1,2,3,4,5";
        int *results = Split(msg);
        // How do I print the results here??
    }

    Thanks in advance.
    Java and Swing, Oct 6, 2005
    #1

  2. Java and Swing wrote on 10/06/05 10:57:
    > Say I have a string which contains numbers separated by a comma... such
    > as "0,1,2,3,4,5"...I want to split the string at the commas and return
    > an array containing, 0,1...5.
    >
    > Suggestions? I've tried something like...
    >
    > int *Split(char *msg) {
    > int len = 0;
    > char *tmp;
    > char *sub_string = NULL;


    Why bother to initialize? You never make use of
    the initialized value; it just gets overwritten the
    first time you call strtok().

    > int *results;
    >
    > tmp = (char *) malloc(strlen(msg) * sizeof(char));
    > strcpy(tmp, msg);


    The classic C off-by-one error: your `tmp' region
    is too small to contain `msg'. (You've also failed to
    check for malloc() failure, and you've created a memory
    leak by not free()ing `tmp' when you're done with it.)
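
    To make the off-by-one concrete: strlen() does not count the
    terminating '\0', so a copy needs strlen(msg) + 1 bytes. A minimal
    sketch, assuming a helper named copy_string (a hypothetical name,
    not something from the original post):

```c
#include <stdlib.h>
#include <string.h>

/* copy_string is a hypothetical helper name used only for illustration.
 * Returns a heap-allocated copy of src, or NULL if malloc() fails.
 * The caller must free() the result. */
char *copy_string(const char *src)
{
    size_t size = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(size);

    if (copy == NULL)
        return NULL;                 /* allocation failed; nothing to free */

    memcpy(copy, src, size);         /* copies the '\0' as well */
    return copy;
}
```

    The caller is expected to check the result against NULL and to
    free() it when done, which addresses the other two complaints.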

    > sub_string = strtok(tmp, ",");
    > while (sub_string != NULL) {
    > len++;
    > sub_string = strtok(NULL, ",");
    > }


    This loop just counts the number of commas in `msg'.
    Well, almost: if the input held ",1,,3," this loop
    would count two, not four -- but if you don't need to
    worry about that sort of thing, there are simpler ways
    to obtain the count.
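
    One such simpler way, sketched here under the assumption that a
    plain character scan is acceptable: walk the string once and count
    the commas, with no copy and no strtok(). The name count_char is
    hypothetical.

```c
#include <stddef.h>

/* count_char is a hypothetical helper for illustration.
 * Counts occurrences of c in s without modifying or copying s. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (*s == c)
            n++;
    return n;
}
```

    For a well-formed list such as "0,1,2,3,4,5", the number of fields
    is then count_char(msg, ',') + 1.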

    > results = (int *) malloc(len * sizeof(int));
    > sub_string = strtok(msg, ","); <<<<----crashes here


    Remember, strtok() modifies the string it is scanning.
    You've handed it a pointer to a string literal, which must
    be treated as a constant: any attempt to modify the string
    causes undefined behavior.
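
    A small sketch of that point, assuming a hypothetical helper named
    count_tokens: strtok() overwrites each delimiter it finds with '\0',
    so it has to be handed writable storage rather than a literal.

```c
#include <string.h>

/* count_tokens is a hypothetical name for illustration.
 * buf must point to writable storage: strtok() overwrites each
 * delimiter it finds with '\0'. The buffer is modified in place. */
int count_tokens(char *buf, const char *delims)
{
    int n = 0;
    char *tok;

    for (tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;
    return n;
}
```

    Calling it on an array declared as char msg[] = "1,2,3,4,5"; is
    fine, because the array is a writable copy of the literal. Calling
    it on char *msg = "1,2,3,4,5"; is not.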

    > while (sub_string != NULL) {
    > *results = (int) atoi(sub_string);


    Why cast an int to an int? Also, atoi() is not a
    good way to convert strings to numbers: it returns 0
    for strings like "abba", which you cannot tell apart
    from a genuine "0"; its behavior is undefined if the
    value is out of range for an int; and on strings like
    "23skiddoo" it quietly returns 23, leaving you unaware
    of the strange input. Try strtol() instead.
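
    A minimal sketch of the strtol() approach, assuming a hypothetical
    helper named parse_int that rejects both "abba" and "23skiddoo"
    instead of silently accepting them:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* parse_int is a hypothetical helper for illustration.
 * Stores the value in *out and returns 1 on success; returns 0 if s
 * has no digits, has trailing junk, or does not fit in an int. */
int parse_int(const char *s, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(s, &end, 10);

    if (end == s)                 /* no digits at all, e.g. "abba" */
        return 0;
    if (*end != '\0')             /* trailing junk, e.g. "23skiddoo" */
        return 0;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                 /* out of range for an int */

    *out = (int)value;
    return 1;
}
```

    The end pointer is what atoi() cannot give you: it shows where
    parsing stopped, so the caller can tell a clean number from a
    number followed by garbage.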

    > results += sizeof(int);


    This is horribly wrong. You need to go back to your
    textbook and review what it says about pointer arithmetic.
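
    For reference, a sketch of the rule in question: adding 1 to an
    int * already advances by sizeof(int) bytes, so adding sizeof(int)
    skips several elements at a time. Moving the pointer at all also
    loses the address that has to be returned and later freed. Indexing
    avoids both problems; fill_squares below is a hypothetical example
    function, not part of the original code.

```c
#include <stddef.h>

/* fill_squares is a hypothetical example function.
 * Writes 0, 1, 4, 9, ... into the first n elements of out. */
void fill_squares(int *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = (int)(i * i);    /* out[i] is *(out + i): one element per step */
}
```

    Because out itself never changes, the caller still holds a pointer
    to the first element after the loop.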

    > sub_string = strtok(NULL, ",");
    > }
    >
    > return results;
    > }
    >
    > ...above you can see I have pointed out where it crashes. I think it's
    > because strtok needs a pointer to the string...so I changed that line
    > to..
    >
    > sub_string = strtok(&msg, ",");


    This is wrong, too, and your compiler should have
    complained about it -- unless, perhaps, you forgot to
    #include <string.h>, which would be yet another error.

    The practice of making random changes in hopes that
    a bug you don't understand will somehow disappear is not
    the use of reason but of magic. Even if you do get the
    program to work for the test cases you throw at it, you
    won't understand why it works and you won't know whether
    it works only by the strangest of unstable coincidences.
    Maybe it only works if the last number happens to be odd,
    or if there's an even number of numbers altogether, or
    if the moon is waning. The technique is Not Recommended.

    > ..which got me past the error,


    ... by introducing a completely different one ...

    > but then when I try to print the results
    > after the function...it's not right. Maybe it's my function or my way
    > of printing results.
    >
    > main() {
    > char *msg = "1,2,3,4,5";
    > int *results = Split(msg);
    > // How do I print the results here??
    > }
    >
    > Thanks in advance.
    >
    Eric Sosman, Oct 6, 2005
    #2

  3. Thanks for the input; I will try to make the changes you suggested.
    What simpler ways are there to count the number of commas in msg?

    Any good online tutorials? It's been quite a few years since I've
    done C.

    Thanks.
    Java and Swing, Oct 6, 2005
    #3
  4. Java and Swing wrote:

    > thanks for the input, i will try to make changes as you suggested.
    > what simpler ways are there to count the number of commas in msg?


    It is proper Usenet etiquette to include the text you are replying to.
    To do this using Google groups, please follow the instructions below,
    penned by Keith Thompson:

    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.

    (That said, assuming count is an int and cp is a char *:

    count=0;
    while( (cp=strchr(msg,',')) != NULL ) {
        count++;
        cp++;
    }

    is much better for counting commas.)

    --
    Christopher Benson-Manica | I *should* know what I'm talking about - if I
    ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
    Christopher Benson-Manica, Oct 6, 2005
    #4
  5. When I try:

    void count(char *msg) {
        int count = 0;
        char *cp;

        while ( (cp = strchr(msg, ',')) != NULL ) {
            printf("cp = %s\n", cp);
            count++;
            cp++;
        }

        printf("Count: %d", count);
    }

    main() {
        char *msg = "1,2,4,5";
        count(msg);
    }

    ...the output I get is ",2,4,5" forever.

    Java and Swing, Oct 6, 2005
    #5
  6. I fixed it...

    void count(char *msg) {
        int count = 0;
        char *cp;
        char *bak = (char *) malloc((strlen(msg) * sizeof(char)) + 1);
        strcpy(bak, msg);

        while ( (cp = strchr(bak, ',')) != NULL ) {
            printf("cp = %s\n", cp);
            count++;
            bak++;
        }
        printf("Count: %d", count);
    }

    Java and Swing, Oct 6, 2005
    #6
  7. Java and Swing wrote:

    > char *bak = (char *) malloc((strlen(msg) * sizeof(char)) + 1);


    There is no need to cast malloc's return value (in fact, the cast
    can hide errors). If your compiler complains when you leave the cast
    out, it is because you didn't include <stdlib.h>.

    deaden
    deaden, Oct 6, 2005
    #7
  8. Java and Swing wrote:

    > i fixed it...


    > while ( (cp = strchr(bak, ',')) != NULL ) {
    > printf("cp = %s\n", cp);
    > count++;
    > bak++;
    > }


    Well, it's certainly gratifying to see that you grasped the concept
    well enough to fix the bug in my code. :)

    I hate when I do that.

    Christopher Benson-Manica, Oct 6, 2005
    #8
  9. However, I realized that the code does not count the number of
    tokens; it counts the number of characters minus 1.

    For example, pass it "1,2,3,4" and you get back count = 6. I changed
    it to use strtok and have it working fine now.

    Java and Swing, Oct 6, 2005
    #9
  10. Java and Swing wrote:

    > however, i did realize that the code does not count the number of
    > tokens. it counts the number of characters minus 1.


    > for example, pass it, "1,2,3,4" and you get back count = 6. I changed
    > to use strtok and have it working fine now.


    I'm not doing myself any favors here (I already look like a dork), but
    the *real* fix should be

    cp=msg;
    count=0;
    while( (cp=strchr(cp,',')) != NULL ) {
        count++;
        cp++;
    }

    Do you see why you got 6 with

    > > > while ( (cp = strchr(bak, ',')) != NULL ) {
    > > > printf("cp = %s\n", cp);
    > > > count++;
    > > > bak++;
    > > > }


    ? (Try to figure it out - if you get it, then you really get this now.)

    In any case, do NOT use strtok() for counting the commas - I
    promise you it can be done better without it, just please overlook the
    fact that it took me three times to get it right...
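
    Putting the pieces of this thread together, here is one possible
    sketch of the original task: count the commas with the strchr()
    loop above, then convert each field with strtol(). The function
    name split_ints and its count out-parameter are assumptions made
    for illustration, not the original poster's interface.

```c
#include <stdlib.h>
#include <string.h>

/* split_ints is a hypothetical name; this is one possible design.
 * Parses comma-separated decimal integers from msg (which is not
 * modified) and returns a malloc()ed array, storing its length in
 * *count. Returns NULL on allocation failure or malformed input
 * (an empty field or non-digit characters). The caller must free()
 * the result. */
int *split_ints(const char *msg, size_t *count)
{
    size_t n = 1;                     /* fields = commas + 1 */
    size_t i;
    const char *cp = msg;
    int *results;

    /* Count commas, restarting each search just past the previous hit. */
    while ((cp = strchr(cp, ',')) != NULL) {
        n++;
        cp++;
    }

    results = malloc(n * sizeof *results);
    if (results == NULL)
        return NULL;

    cp = msg;
    for (i = 0; i < n; i++) {
        char *end;
        long value = strtol(cp, &end, 10);

        /* Each field must contain digits and end at ',' or at the '\0'. */
        if (end == cp || (*end != ',' && *end != '\0')) {
            free(results);
            return NULL;
        }
        results[i] = (int)value;      /* range checking omitted for brevity */
        cp = (*end == ',') ? end + 1 : end;
    }

    *count = n;
    return results;
}
```

    To print the results, the caller loops over the returned array,
    for example for (i = 0; i < count; i++) printf("%d\n", results[i]);
    and calls free(results) afterwards. Returning the length through
    count is what makes that loop possible, since a bare int * carries
    no size information.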

    Christopher Benson-Manica, Oct 6, 2005
    #10