C-style string parsing

  • Thread starter Christopher Benson-Manica

Christopher Benson-Manica

I have a C-style string (null-terminated) that consists of items in one of the
following formats:
14 characters
5 characters space 8 characters
6 characters colon 8 characters
5 characters colon 8 characters

Items are delimited by semicolons or commas. I have to produce a string
delimited only by semicolons and containing items in the first two formats
only. For example,

"AAAAAAAAAAAAAA,AAAAAA:AAAAAAAA;AAAAA AAAAAAAA,AAAAA:AAAAAAAA" ->
"AAAAAAAAAAAAAA;AAAAAAAAAAAAAA;AAAAA AAAAAAAA;AAAAA AAAAAAAA"

Posting to comp.lang.c yielded the following:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int myfunc( const char *idlist )
{
    int items=0;
    char *newstr=(char *)malloc( strlen(idlist)+1 );
    if( !newstr ) {
        return( -2 );
    }
    int srcidx=0;
    int destidx=0;
    int chars=0;

    for( ; idlist[srcidx] ; srcidx++ ) {
        if( idlist[srcidx] == ':' ) {
            if( chars == 5 ) {
                newstr[destidx++]=' ';
                chars++;
            }
            else if( chars != 6 ) {
                return( -1 ); // Invalid format
            }
        }
        else if( idlist[srcidx] == ';' || idlist[srcidx] == ',' ) {
            if( chars != 14 ) { // Invalid format
                return( -1 );
            }
            newstr[destidx++]=';';
            chars=0;
            items++;
        }
        else if( ++chars > 14 ) {
            return( -1 );
        }
        else {
            newstr[destidx++]=idlist[srcidx];
        }
    }
    newstr[destidx]='\0';
    if( chars == 14 ) {
        items++;
    }
    else if( !items || chars ) { // items == 0 || chars != 0
        return( -1 );
    }
    printf( "The string '%s' has %d items.\n", newstr, items );
    /* Call a function using newstr here */
    free( newstr );
    return( 0 );
}

I'd like to know how to improve this function (specifically, the call to
malloc()) to make it more like typical C++. One thing: Don't tell me to use
std::string's, because it isn't an option (the C++ code at my company uses
C-style strings almost exclusively).
 
Julián Albo

Hello.

> return( -1 ); // Invalid format

You can do:

const int INVALID_FORMAT = -1;

And then

return INVALID_FORMAT;

which is self-documenting.

> else if( !items || chars ) { // items == 0 || chars != 0

Why comment what you intend to do instead of doing it?

else if (items == 0 || chars != 0) {

> I'd like to know how to improve this function (specifically, the call to
> malloc()) to make it more like typical C++.

Use new / delete (here, the array forms new[] / delete[]) instead of
malloc / free.

> One thing: Don't tell me to use
> std::string's, because it isn't an option (the C++ code at my company uses
> C-style strings almost exclusively).

You can be one of the exceptions ;)

Regards.
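
The suggestions in this reply (a named constant instead of a bare -1, the explicit `items == 0 || chars != 0` test, and new[] / delete[] instead of malloc / free) could be combined roughly as follows. This is only a sketch, not the code anyone in the thread posted: `SUCCESS`, `OUT_OF_MEMORY`, `CharBuffer`, and `normalize` are names invented here for illustration. The small owner class is an addition of this sketch; it is there because the early `return( -1 )` paths in the posted function skip `free( newstr )`, and a destructor releases the buffer on every path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <new>

// Named result codes; only INVALID_FORMAT is taken from the thread.
const int SUCCESS = 0;
const int INVALID_FORMAT = -1;
const int OUT_OF_MEMORY = -2;

// Hypothetical minimal owner for a new[] buffer: delete[] runs on every return.
class CharBuffer {
public:
    explicit CharBuffer(std::size_t n) : p_(new (std::nothrow) char[n]) {}
    ~CharBuffer() { delete[] p_; }
    char *get() const { return p_; }
private:
    CharBuffer(const CharBuffer &);            // copying disabled (C++98 style)
    CharBuffer &operator=(const CharBuffer &);
    char *p_;
};

// Same state machine as the posted loop. `out` must hold strlen(idlist) + 1
// chars; the output is never longer than the input.
int normalize(const char *idlist, char *out, int *itemsOut)
{
    assert(idlist != 0 && out != 0 && itemsOut != 0);
    int items = 0;
    int chars = 0;
    std::size_t dest = 0;

    for (std::size_t src = 0; idlist[src] != '\0'; ++src) {
        const char c = idlist[src];
        if (c == ':') {
            if (chars == 5) {            // 5:8 becomes "5 space 8"
                out[dest++] = ' ';
                ++chars;
            } else if (chars != 6) {     // 6:8 simply drops the colon
                return INVALID_FORMAT;
            }
        } else if (c == ';' || c == ',') {
            if (chars != 14) {
                return INVALID_FORMAT;
            }
            out[dest++] = ';';
            chars = 0;
            ++items;
        } else if (++chars > 14) {
            return INVALID_FORMAT;
        } else {
            out[dest++] = c;
        }
    }
    out[dest] = '\0';
    if (chars == 14) {
        ++items;
    } else if (items == 0 || chars != 0) {
        return INVALID_FORMAT;
    }
    *itemsOut = items;
    return SUCCESS;
}

int myfunc(const char *idlist)
{
    CharBuffer buf(std::strlen(idlist) + 1);
    if (buf.get() == 0) {
        return OUT_OF_MEMORY;
    }
    int items = 0;
    const int rc = normalize(idlist, buf.get(), &items);
    if (rc != SUCCESS) {
        return rc;                       // buffer is still released here
    }
    std::printf("The string '%s' has %d items.\n", buf.get(), items);
    /* Call a function using buf.get() here */
    return SUCCESS;
}
```

Splitting the parsing from the allocation is a design choice of this sketch; it keeps the buffer's lifetime in one place and lets the parsing be exercised on its own.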
 
Christopher Benson-Manica

Julián Albo said:
> const int INVALID_FORMAT= -1;
> return INVALID_FORMAT;

Well, the actual function uses an enumerated error code - I left it out for
clarity.

> Why comment what you intend to do instead of doing it?
> else if (items == 0 || chars != 0) {

Because I want my code to be l337? ;)

> You can be one of the exceptions ;)

I think they have error handling code for exceptions like me ;)
 
Sean Fraley

Christopher said:
> I'd like to know how to improve this function (specifically, the call to
> malloc()) to make it more like typical C++. One thing: Don't tell me to
> use std::string's, because it isn't an option (the C++ code at my company
> uses C-style strings almost exclusively).

Don't be too set against std::string. If you need to write code that will be
used by other people in your company, and they insist on using C-style
strings, then simply make appropriate use of std::string::c_str(). Just
because other people you work with want to make things hard on themselves
doesn't mean that you have to.

Sean
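
As a rough illustration of that advice, the sketch below builds the result in a std::string and hands it to C-style code through c_str(). It is an assumption-laden example rather than anything posted in the thread: `normalizeIds` and `legacyReport` are hypothetical names, and `legacyReport` merely stands in for whatever existing function expects a `const char *`.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Builds the normalized list in a std::string. Returns false on a format
// error, in which case `out` and `items` are left untouched.
bool normalizeIds(const char *idlist, std::string &out, int &items)
{
    assert(idlist != 0);
    std::string result;
    int count = 0;
    int chars = 0;

    for (const char *p = idlist; *p != '\0'; ++p) {
        if (*p == ':') {
            if (chars == 5) {            // 5:8 becomes "5 space 8"
                result += ' ';
                ++chars;
            } else if (chars != 6) {     // 6:8 simply drops the colon
                return false;
            }
        } else if (*p == ';' || *p == ',') {
            if (chars != 14) {
                return false;
            }
            result += ';';
            chars = 0;
            ++count;
        } else if (++chars > 14) {
            return false;
        } else {
            result += *p;
        }
    }
    if (chars == 14) {
        ++count;
    } else if (count == 0 || chars != 0) {
        return false;
    }
    out.swap(result);
    items = count;
    return true;
}

// Hypothetical stand-in for existing code that only accepts C-style strings.
void legacyReport(const char *str, int items)
{
    std::printf("The string '%s' has %d items.\n", str, items);
}

int myfunc(const char *idlist)
{
    std::string ids;
    int items = 0;
    if (!normalizeIds(idlist, ids, items)) {
        return -1;                       // invalid format
    }
    legacyReport(ids.c_str(), items);    // pointer is valid while ids is unmodified
    return 0;
}
```

There is no explicit allocation or cleanup here; the std::string releases its storage on every return path, and callers of `myfunc` still see only C-style strings.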
 
Julián Albo

Christopher Benson-Manica wrote:
> Because I want my code to be l337? ;)

Doing things that the compiler can do for you is being l337? }:)

Regards.
 
Christopher Benson-Manica

Sean Fraley said:
> Don't be too set against std::string. If you need to write code that will be
> used by other people in your company, and they insist on using C-style
> strings, then simply make appropriate use of std::string::c_str(). Just
> because other people you work with want to make things hard on themselves
> doesn't mean that you have to.

Well, it doesn't seem to be too useful to create a std::string just for
parsing purposes and then convert back to a c_str... (un?)fortunately, the de
facto paradigm here is still C anyway. Not that *I'm* necessarily sad about
that (I *like* C!). The real problem comes from the fact that all the code
uses custom classes and template classes as substitutes for the STL...
 
Phlip

Christopher said:
> Well, it doesn't seem to be too useful to create a std::string just for
> parsing purposes and then convert back to a c_str... (un?)fortunately, the de
> facto paradigm here is still C anyway. Not that *I'm* necessarily sad about
> that (I *like* C!). The real problem comes from the fact that all the code
> uses custom classes and template classes as substitutes for the STL...

Y'all are probably using C-style C++. Unless your C code is actually so sloppy
that C++ can't compile it.

Follow this simple regimen:

- use std::string, and any other highest-level C++ thing, at whim

- have fewer bugs and tighter code than your colleagues

- count said bugs.

Here's Bjarne's "Don't use new[] like malloc()" interview:

http://www.artima.com/intv/goldilocksP.html
 
