C-style string parsing

Discussion in 'C++' started by Christopher Benson-Manica, Oct 14, 2003.

  1. I have a C-style string (null-terminated) that consists of items in one of the
    following formats:
    14 characters
    5 characters space 8 characters
    6 characters colon 8 characters
    5 characters colon 8 characters

    Items are delimited by semicolons or commas. I have to produce a string
    delimited only by semicolons and containing items in the first two formats
    only. For example,

    "AAAAAAAAAAAAAA,AAAAAA:AAAAAAAA;AAAAA AAAAAAAA,AAAAA:AAAAAAAA" ->
    "AAAAAAAAAAAAAA;AAAAAAAAAAAAAA;AAAAA AAAAAAAA;AAAAA AAAAAAAA"

    Posting to comp.lang.c yielded the following:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int myfunc( const char *idlist )
    {
        int items=0;
        char *newstr=(char *)malloc( strlen(idlist)+1 );
        if( !newstr ) {
            return( -2 );
        }
        int srcidx=0;
        int destidx=0;
        int chars=0;

        for( ; idlist[srcidx] ; srcidx++ ) {
            if( idlist[srcidx] == ':' ) {
                if( chars == 5 ) { // 5 characters colon: store a space instead
                    newstr[destidx++]=' ';
                    chars++;
                }
                else if( chars != 6 ) {
                    free( newstr );
                    return( -1 ); // Invalid format
                }
            }
            else if( idlist[srcidx] == ';' || idlist[srcidx] == ',' ) {
                if( chars != 14 ) { // Invalid format
                    free( newstr );
                    return( -1 );
                }
                newstr[destidx++]=';';
                chars=0;
                items++;
            }
            else if( ++chars > 14 ) {
                free( newstr );
                return( -1 );
            }
            else {
                newstr[destidx++]=idlist[srcidx];
            }
        }
        newstr[destidx]='\0';
        if( chars == 14 ) {
            items++;
        }
        else if( !items || chars ) { // items == 0 || chars != 0
            free( newstr );
            return( -1 );
        }
        printf( "The string '%s' has %d items.\n", newstr, items );
        /* Call a function using newstr here */
        free( newstr );
        return( 0 );
    }

    I'd like to know how to improve this function (specifically, the call to
    malloc()) to make it more like typical C++. One thing: Don't tell me to use
    std::string's, because it isn't an option (the C++ code at my company uses
    C-style strings almost exclusively).

    --
    Christopher Benson-Manica | Upon the wheel thy fate doth turn,
    ataru(at)cyberspace.org | upon the rack thy lesson learn.
     
    Christopher Benson-Manica, Oct 14, 2003
    #1

  2. Hello.

    > return( -1 ); // Invalid format


    You can do:

    const int INVALID_FORMAT= -1;

    And then

    return INVALID_FORMAT;

    is self-documenting.
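
    The same idea extends from one constant to a small set of result codes. The sketch below is only an illustration: the enumeration and function names are hypothetical, and the values simply mirror the -1 and -2 used in the posted function. It isolates the check the posted function makes after its loop.

```cpp
#include <cassert>

// Hypothetical names: the thread does not show any real enumeration.
enum ParseResult {
    PARSE_OK             =  0,
    PARSE_INVALID_FORMAT = -1,  // replaces a bare "return( -1 );"
    PARSE_OUT_OF_MEMORY  = -2   // replaces a bare "return( -2 );"
};

// Mirrors the test made once the whole input has been scanned:
// the input is acceptable if the last item is complete (14 characters),
// or if at least one item was seen and no partial item is left over.
ParseResult check_end_of_input(int items, int chars)
{
    assert(items >= 0 && chars >= 0);
    if (chars == 14) {
        return PARSE_OK;
    }
    if (items == 0 || chars != 0) {
        return PARSE_INVALID_FORMAT;
    }
    return PARSE_OK;
}
```

    With names like these, a caller can compare the result against PARSE_INVALID_FORMAT rather than a literal -1, and the return statements need no trailing comment.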

    > else if( !items || chars ) { // items == 0 || chars != 0


    Why comment what you intend to do instead of doing it?

    else if (items == 0 || chars != 0) {

    > I'd like to know how to improve this function (specifically, the call to
    > malloc()) to make it more like typical C++. One thing: Don't tell me to use


    Use new / delete instead of malloc / free.
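
    A minimal sketch of what that swap might look like, assuming the conversion is pulled out into a helper. The name normalize_ids and its return convention (null on any failure) are assumptions, not part of the thread. Since the buffer is an array, the matching pair is new[] / delete[], and every exit path has to release it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Returns a new[]-allocated, semicolon-delimited copy of idlist, or a null
// pointer on invalid input or allocation failure.
// The caller owns the result and must release it with delete[].
char *normalize_ids(const char *idlist)
{
    assert(idlist != nullptr);
    // The output is never longer than the input, so strlen + 1 is enough.
    // std::nothrow keeps the "null on failure" behaviour that malloc had.
    char *out = new (std::nothrow) char[std::strlen(idlist) + 1];
    if (!out) {
        return nullptr;
    }
    std::size_t dest = 0;
    int chars = 0;   // characters seen in the current item
    int items = 0;   // items closed by a delimiter so far
    bool ok = true;
    for (std::size_t src = 0; ok && idlist[src] != '\0'; ++src) {
        const char c = idlist[src];
        if (c == ':') {
            if (chars == 5) {            // "5 characters colon": store a space
                out[dest++] = ' ';
                ++chars;
            } else if (chars != 6) {     // "6 characters colon": drop the colon
                ok = false;
            }
        } else if (c == ';' || c == ',') {
            if (chars != 14) {
                ok = false;
            } else {
                out[dest++] = ';';
                chars = 0;
                ++items;
            }
        } else if (++chars > 14) {
            ok = false;
        } else {
            out[dest++] = c;
        }
    }
    if (ok && chars != 14 && (items == 0 || chars != 0)) {
        ok = false;
    }
    if (!ok) {
        delete[] out;                    // single cleanup point for all errors
        return nullptr;
    }
    out[dest] = '\0';
    return out;
}
```

    Routing every failure through one flag keeps the delete[] in a single place, which is easier to get right than repeating it before each early return.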

    > std::string's, because it isn't an option (the C++ code at my company uses
    > C-style strings almost exclusively).


    You can be one of the exceptions ;)

    Regards.
     
    Julián Albo, Oct 14, 2003
    #2

  3. Julián Albo spoke thus:

    > const int INVALID_FORMAT= -1;


    > And then


    > return INVALID_FORMAT;


    Well, the actual function uses an enumerated error code - I left it out for
    clarity.

    >> else if( !items || chars ) { // items == 0 || chars != 0


    > Why comment what you intend to do instead of doing it?


    > else if (items == 0 || chars != 0) {


    Because I want my code to be l337? ;)

    > You can be one of the exceptions ;)


    I think they have error handling code for exceptions like me ;)

    --
    Christopher Benson-Manica | Upon the wheel thy fate doth turn,
    ataru(at)cyberspace.org | upon the rack thy lesson learn.
     
    Christopher Benson-Manica, Oct 15, 2003
    #3
  4. Christopher Benson-Manica wrote:

    > I'd like to know how to improve this function (specifically, the call to
    > malloc()) to make it more like typical C++. One thing: Don't tell me to
    > use std::string's, because it isn't an option (the C++ code at my company
    > uses C-style strings almost exclusively).


    Don't be too set against std::string. If you need to write code that will be
    used by other people in your company, and they insist on using C-style
    strings, then simply make appropriate use of std::string::c_str(). Just
    because other people you work with want to make things hard on themselves
    doesn't mean that you have to.
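
    A rough sketch of that approach, under stated assumptions: normalize_to_string is a hypothetical helper that does the same conversion into a std::string, and legacy_length stands in for whatever existing function expects a const char *. Neither name comes from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for an existing interface that takes a C-style string.
std::size_t legacy_length(const char *ids)
{
    return std::strlen(ids);
}

// Converts idlist to the semicolon-delimited form. On success, stores the
// result in out and returns true; on invalid input, returns false and leaves
// out untouched. No manual allocation or cleanup is needed on any path.
bool normalize_to_string(const char *idlist, std::string &out)
{
    assert(idlist != nullptr);
    std::string result;
    int chars = 0;   // characters seen in the current item
    int items = 0;   // items closed by a delimiter so far
    for (const char *p = idlist; *p != '\0'; ++p) {
        if (*p == ':') {
            if (chars == 5) {            // "5 characters colon": store a space
                result += ' ';
                ++chars;
            } else if (chars != 6) {     // "6 characters colon": drop the colon
                return false;
            }
        } else if (*p == ';' || *p == ',') {
            if (chars != 14) {
                return false;
            }
            result += ';';
            chars = 0;
            ++items;
        } else if (++chars > 14) {
            return false;
        } else {
            result += *p;
        }
    }
    if (chars != 14 && (items == 0 || chars != 0)) {
        return false;
    }
    out.swap(result);
    return true;
}
```

    The std::string stays local to the new code; anything that still wants a C-style string receives out.c_str(), for example legacy_length(out.c_str()). The pointer is only valid while the string is alive and unmodified.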

    Sean
     
    Sean Fraley, Oct 15, 2003
    #4
  5. Christopher Benson-Manica wrote:

    > >> else if( !items || chars ) { // items == 0 || chars != 0
    >
    > > Why comment what you intend to do instead of doing it?
    >
    > > else if (items == 0 || chars != 0) {
    >
    > Because I want my code to be l337? ;)


    Doing things that the compiler can do for you is being l337? }:)

    Regards.
     
    Julián Albo, Oct 15, 2003
    #5
  6. Sean Fraley spoke thus:

    > Don't be too set against std::string. If you need to write code that will be
    > used by other people in your company, and they insist on using C-style
    > strings, then simply make appropriate use of std::string::c_str(). Just
    > because other people you work with want to make things hard on themselves
    > doesn't mean that you have to.


    Well, it doesn't seem to be too useful to create a std::string just for
    parsing purposes and then convert back to a c_str... (un?)fortunately, the de
    facto paradigm here is still C anyway. Not that *I'm* necessarily sad about
    that (I *like* C!). The real problem comes from the fact that all the code
    uses custom classes and template classes as substitutes for the STL...

    --
    Christopher Benson-Manica | Upon the wheel thy fate doth turn,
    ataru(at)cyberspace.org | upon the rack thy lesson learn.
     
    Christopher Benson-Manica, Oct 15, 2003
    #6
  7. Christopher Benson-Manica wrote:

    > Well, it doesn't seem to be too useful to create a std::string just for
    > parsing purposes and then convert back to a c_str... (un?)fortunately,
    > the de facto paradigm here is still C anyway. Not that *I'm* necessarily
    > sad about that (I *like* C!). The real problem comes from the fact that
    > all the code uses custom classes and template classes as substitutes for
    > the STL...


    Y'all are probably using C-style C++. Unless your C code is actually so
    sloppy that C++ can't compile it.

    Follow this simple regimen:

    - use std::string, and any other highest-level C++ thing, at whim

    - have fewer bugs and tighter code than your colleagues

    - count said bugs.

    Here's Bjarne's "Don't use new[] like malloc()" interview:

    http://www.artima.com/intv/goldilocksP.html

    --
    Phlip
     
    Phlip, Oct 15, 2003
    #7