Alternative libraries for C, that are efficient

John Reye

Hi,

recently I learned that various "standard C library" functions have
deficiencies in terms of high performance.
Examples include the return values of fgets(), strcpy(), strcat()
(thanks Eric Sosman for mentioning the last 2)

Example: char * strcat ( char * destination, const char *
source );
Return Value: destination is returned.

How useless is that! I already know the destination, in the first
place.
Why not return a pointer to the end of the concatenated string, or the
size of the string. This would cost the library no extra performance
cost whatsoever!


Are these deficiencies _only_ string-related?
Is there no good library which I can use for optimal computing?
Because if I use the "standard C library" I'll end up _not_ using many
of its routines.
Where is a library, that is done right??

Would this library only be a string-handling library, or
buffer-handling library? Or are there other parts of the "standard C library"
that are also deficient?

Surely there must be a good library somewhere, or do all C programmers
really carry their own routines with them?

Thanks.
 
John Reye

Surely there must be a good library somewhere, or do all C programmers
really carry their own routines with them?

Or are the deficiencies of the "standard C library" only an exception,
so that except for a few minor exceptions, I'll get fabulous
performance from it? So is it actually a really high-performance
library, except for a few minor blemishes?

Or should I really consider C++ and Boost (which is still actively
developed today), with its monstrous features. ;)
 
tom st denis

Hi,

recently I learned that various "standard C library" functions have
deficiencies in terms of high performance.
Examples include the return values of fgets(), strcpy(), strcat()
(thanks Eric Sosman for mentioning the last 2)

Example: char * strcat ( char * destination, const char *
source );
Return Value: destination is returned.

How useless is that! I already know the destination, in the first
place.
Why not return a pointer to the end of the concatenated string, or the
size of the string. This would cost the library no extra performance
cost whatsoever!

It's so you can use it in subsequent calls e.g.

puts(strcat(foo, bar));

You could easily write your own mystrcat_end function to do what you
want.

Tom
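
A minimal sketch of what such a helper might look like. The name mystrcat_end is taken from the post above; it is not a standard function, and the caller is assumed to have made the destination large enough for the combined string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of the standard library.
   Appends src to dst and returns a pointer to the terminating
   null character of the result. The caller must ensure that dst
   has room for the combined string. */
char *mystrcat_end(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t dst_len = strlen(dst);  /* the end of dst still has to be found once */
    size_t src_len = strlen(src);

    memcpy(dst + dst_len, src, src_len + 1);  /* +1 copies the terminator too */
    return dst + dst_len + src_len;
}
```

Passing the returned pointer as the destination of the next call, as in end = mystrcat_end(end, "more"), means the text already written is not scanned again.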
 
Jens Gustedt

On 04/12/2012 06:01 PM, John Reye wrote:
recently I learned that various "standard C library" functions have
deficiencies in terms of high performance.
Examples include the return values of fgets(), strcpy(), strcat()
(thanks Eric Sosman for mentioning the last 2)

Example: char * strcat ( char * destination, const char *
source );
Return Value: destination is returned.

How useless is that!

very useful for constructs such as

char const* s = strcat(strcpy(malloc(100), "Hello "), "world");

or

char const* t = strcat((char[100]){ "Hello " }, s);

I already know the destination, in the first place.

No, not necessarily. The first argument is an expression. This feature
avoids evaluating that expression multiple times.

Your mileage just varies from that of the C library designers.

Jens
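
The point about the first argument being an expression can be illustrated with a small sketch. Everything here is hypothetical: next_buffer() stands in for any expression with a side effect (it hands out a different buffer on each call), and greeting() is just an example caller.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical pool of scratch buffers, handed out in rotation. */
static char pool[4][64];
static int next_slot = 0;

/* Has a side effect: every call returns a different buffer. */
char *next_buffer(void)
{
    char *buf = pool[next_slot];
    next_slot = (next_slot + 1) % 4;
    buf[0] = '\0';
    return buf;
}

const char *greeting(const char *name)
{
    assert(name != NULL && strlen(name) < 50);

    /* next_buffer() is evaluated exactly once. The return values of
       strcpy() and strcat() carry that same pointer through the chain,
       so no temporary variable is needed to remember it. */
    return strcat(strcpy(next_buffer(), "Hello "), name);
}
```

Without the returned pointer, the caller would need a named temporary for the result of next_buffer(); calling it a second time would yield a different buffer.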
 
Ben Pfaff

John Reye said:
Example: char * strcat ( char * destination, const char *
source );
Return Value: destination is returned.

How useless is that! I already know the destination, in the first
place.
Why not return a pointer to the end of the concatenated string, or the
size of the string. This would cost the library no extra performance
cost whatsoever!

SUSv4 has stpcpy() that does just that.
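
A short sketch of how stpcpy() might be used to join several strings without rescanning. The helper name join3 is made up for illustration, and the feature-test macro assumes a platform that exposes POSIX.1-2008 declarations this way.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX.1-2008 declarations such as stpcpy() */
#include <assert.h>
#include <string.h>

/* Hypothetical helper: writes a, b and c one after another into out
   and returns the length of the result. stpcpy() returns a pointer to
   the terminating null it wrote, so each copy starts where the
   previous one ended. */
size_t join3(char *out, size_t out_size,
             const char *a, const char *b, const char *c)
{
    assert(out != NULL && a != NULL && b != NULL && c != NULL);
    assert(strlen(a) + strlen(b) + strlen(c) < out_size);  /* debug-only size check */

    char *end = stpcpy(out, a);
    end = stpcpy(end, b);
    end = stpcpy(end, c);
    return (size_t)(end - out);
}
```

The size check in the assert scans the inputs itself, so it is only there as a debugging aid; a build with NDEBUG defined drops it.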
 
ImpalerCore

Or are the deficiencies of the "standard C library" only an exception,
so that except for a few minor exceptions, I'll get fabulous
performance from it? So is it actually a really high-performance
library, except for a few minor blemishes?

Or should I really consider C++ and Boost (which is still actively
developed today), with its monstrous features. ;)

There are many C libraries that specialize in a single task (like
zlib, fftw), and some libraries that take a broader approach to
enhancing C functionality (GLib). At the same time, many of these
extensions are scattered about, each supported only on some subset of
operating systems or compilers' libraries, with no centralized location.
Unfortunately, there is no Boost for C to function as a de facto
repository of these kinds of functions.

On a different note, performance for performance's sake is not really
useful unless you have something that works. I place it last on my
software writing priorities.

1. Make it work.
2. Make it pretty. (well designed API, documentation)
3. Make it robust. (handle errors gracefully)
4. Make it fast. (if you really need to)

Best regards,
John D.
 
James Kuyper

On 04/12/2012 06:01 PM, John Reye wrote: ....

very useful for constructs such as

char const* s = strcat(strcpy(malloc(100), "Hello "), "world");

Using the possibly null value returned by malloc() is not a good idea,
though it is a commonplace bit of lazy programming.
 
James Kuyper

Hi,

recently I learned that various "standard C library" functions have
deficiencies in terms of high performance.

The only thing you've learned from recent threads in this forum is that
some C standard library functions are specified by the standard to have
less than ideal interfaces. That's not exactly an earth-shattering piece
of news - it's true of almost every library. If you've learned of
significant (or even insignificant but measurable) performance problems
with these functions, you learned of it in some other context.

It's meaningless to talk about the performance of *the* C standard
library. There's no single C standard library - just a standard which
documents what the library does and how to use it, and many different
implementations of that library. Performance is meaningful only when you
specify a particular implementation of that library. Which particular
implementation have you seen bad performance figures for? What were
those figures?

....
Are these deficiencies _only_ string-related?

No - there are lots of other kinds of minor problems with the C standard
library.

Is there no good library which I can use for optimal computing?

I think that you can safely assume that no library you've ever heard of,
and no library that you ever will hear of, is perfectly optimal. There's
lots of good sub-optimal libraries out there, including various
implementations of the C standard library.

Because if I use the "standard C library" I'll end up _not_ using many
of its routines.

Most people use only a small portion of the C standard library; there's
nothing unusual about that. I think that what you mean is something
different - you plan on refusing to use many C standard library
functions that could be useful, because you consider them less than optimal.

Surely there must be a good library somewhere, or do all C programmers
really carry their own routines with them?

I almost never write my own routines for any purpose for which a C
standard library routine can be made to serve, even if the library
routine is a less-than perfect fit to my desires. The main exception is
that I will often manually inline code that is roughly equivalent to
some of the standard str*() and mem*() functions, especially strtok().
For the other str*() and mem*() functions, I'll usually inline the code
only if I need a pointer to the end of the memory that was processed,
something that the standard library functions don't provide. The
in-lining is generally trivial for most of those functions.
 
Kaz Kylheku

"Premature optimization is the root of all evil"

If you're designing a programming language, and working with an interpreted
implementation for the time being, is it a case of premature optimization to be
concerned with whether the language features will be suitable for compilation?
 
John Reye

It's so you can use it in subsequent calls e.g.
puts(strcat(foo, bar));

I'd rather use
strcat(foo, bar);
puts(foo);

and have a well-designed interface of strcat, to return something
useful.
 
John Reye

very useful for constructs such as

char const* s = strcat(strcpy(malloc(100), "Hello "), "world");

What's wrong with:

char * s;
if ((s = (char *)malloc(100)) == NULL) {
    fprintf(stderr, "Out of memory\n");
    exit(1);
}
strcpy(s, "Hello ");
strcat(s, "world");

Are you trying to tell me that yours will be more efficient?
Methinks not.

or

char const* t = strcat((char[100]){ "Hello " }, s);

char *t = (char *)((char[100]){ "Hello " });
strcat(t, s);

So the function's interfaces were designed in order to write succinct
code, which has absolutely no advantage if I write it differently???
And as a bonus I cannot use it when I want to find the end of the
string.

Great. ;)


No, not necessarily. The first argument is an expression. This feature
avoids evaluating that expression multiple times.

Sorry, I don't follow. Could you explain it (perhaps with an example).
 
John Reye

The only thing you've learned from recent threads in this forum is that
some C standard library functions are specified by the standard to have
less than ideal interfaces.

Yes that's it. Plus: these badly designed interfaces may force
you to do things (e.g. strlen) that you would not have to do
otherwise.

It's meaningless to talk about the performance of *the* C standard
library. There's no single C standard library

I meant the badly designed interface.

No - there are lots of other kinds of minor problems with the C standard
library.


I think that you can safely assume that no library you've ever heard of,
and no library that you ever will hear of, is perfectly optimal.

What about a library that has better interfaces?
You can always change the internals, but if the function interfaces
are bad, you've got a deficiency from the start. ;)

you plan on refusing to use many C standard library
functions that could be useful, because you consider them less than optimal.

I'd only refuse it if I thought the interface was bad.

And I'm trying to find my own style.
For what do I want to use the standard C lib, and for what my own
routines?

I almost never write my own routines for any purpose for which a C
standard library routine can be made to serve, even if the library
routine is a less-than perfect fit to my desires. The main exception is
that I will often manually inline code that is roughly equivalent to
some of the standard str*() and mem*() functions, especially strtok().
For the other str*() and mem*() functions, I'll usually inline the code
only if I need a pointer to the end of the memory that was processed,
something that the standard library functions don't provide. The
in-lining is generally trivial for most of those functions.

Could you perhaps give a brief example of what you mean by inlining
here?
Thanks.
 
John Reye

1. Make it work.
2. Make it pretty. (well designed API, documentation)
3. Make it robust. (handle errors gracefully)
4. Make it fast.   (if you really need to)

1. Consider what you need: does it need to be fast? Is it a throw-away
program? Is it part of a large project?
You only need one rule. You cannot generalize, 'cause it always...
depends. ;)
 
Barry Schwarz

Hi,

recently I learned that various "standard C library" functions have
deficiencies in terms of high performance.

Have you actually measured the performance impact of calling strlen
after each fgets?

Examples include the return values of fgets(), strcpy(), strcat()
(thanks Eric Sosman for mentioning the last 2)

Example: char * strcat ( char * destination, const char *
source );
Return Value: destination is returned.

How useless is that! I already know the destination, in the first
place.

Useless is in the eye of the beholder.

char buffer[100] = "The input is ";
puts(strcat(buffer, flag ? "valid" : "invalid"));

or

strcat(strcpy(new_large_buffer,
              original_text_in_small_buffer),
       additional_text);

Why not return a pointer to the end of the concatenated string, or the
size of the string. This would cost the library no extra performance
cost whatsoever!

Was that true 25+ years ago when the language was being designed?

Are these deficiencies _only_ string-related?
Is there no good library which I can use for optimal computing?

Any library where fgets returns something other than the destination
pointer would be non-portable in the extreme. Some might be tempted
to say it is not even C.

The approach taken by some compiler writers when they need to tweak a
standard function (such as add an extra parameter to make it safer) is
to create a new function with a similar name that is reserved for the
implementation (such as prepending an _ or appending a _s). Then the
user has the choice of using the standard function or the system
specific extension.

Because if I use the "standard C library" I'll end up _not_ using many
of its routines.

So you would rather loop through fgetc yourself? You might want to
take one of your programs which is suffering from this fgets
deficiency and replace the calls to fgets and strlen with the fgetc
loop and determine the real benefit.
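
For anyone who wants to run that comparison, here is a minimal sketch of such an fgetc loop. The name read_line is hypothetical; it mimics fgets() but returns the number of characters stored, so no strlen() call is needed afterwards. Whether it actually beats fgets() plus strlen() depends on the implementation and has to be measured.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fgets()-like reader built on fgetc().
   Stores at most size-1 characters, stops after a newline,
   always null-terminates, and returns the number of characters
   stored (0 means end of file or an error with nothing read). */
size_t read_line(char *buf, size_t size, FILE *fp)
{
    assert(buf != NULL && size > 0 && fp != NULL);

    size_t n = 0;
    int ch;

    while (n + 1 < size && (ch = fgetc(fp)) != EOF) {
        buf[n++] = (char)ch;
        if (ch == '\n')
            break;          /* keep the newline, like fgets() does */
    }
    buf[n] = '\0';
    return n;
}
```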

Or are you planning to rewrite all the standard functions you don't
like? For how many different systems will you do this? And in what
newsgroup will you discuss any problems you run into using these
functions?

Where is a library, that is done right??

Not for any C system.

Would this library only be a string-handling library, or
buffer-handling library? Or are there other parts of the "standard C library"
that are also deficient?

Only you can tell what is deficient.

Surely there must be a good library somewhere, or do all C programmers
really carry their own routines with them?

Most suffer these horrendous performance problems in silence by using
the standard library. Others have been known to write their own
wrapper functions to encapsulate the standard functions. But this is
usually done to add functionality (such as common error checking)
rather than deal with undocumented performance issues.
 
James Kuyper

1. Consider what you need: does it need to be fast? Is it a throw-away
program? Is it part of a large project?
You only need one rule. You cannot generalize, cause it always...
depends. ;)

"Make it work" is always first - if you don't need the program to work
correctly, you don't need to write a new program; there's plenty of
existing programs that already don't do whatever it is that the new
program was supposed to do.
 
Willem

James Kuyper wrote:
) On 04/12/2012 02:32 PM, John Reye wrote:
)>> 1. Make it work.
)>> 2. Make it pretty. (well designed API, documentation)
)>> 3. Make it robust. (handle errors gracefully)
)>> 4. Make it fast. (if you really need to)
)>
)> 1. Consider what you need: does it need to be fast? is it throw-away-
)> program? is it part of a large project?
)> You only need one rule. You cannot generalize, cause it always...
)> depends. ;)
)
) "Make it work" is always first - if you don't need the program to work
) correctly, you don't need to write a new program; there's plenty of
) existing programs that already don't do whatever it is that the new
) program was supposed to do.

That's the kind of thinking that brought us 2-megabyte XML blobs
as "database entities". Some stuff you just can't refactor when
it turns out it's a performance killer, so you *have* to account
for that from the beginning. Not to mention 'if it works, ship'


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Willem

Barry Schwarz wrote:
) On Thu, 12 Apr 2012 09:01:40 -0700 (PDT), John Reye
)
)>Hi,
)>
)>recently I learned that various "standard C library" functions have
)>deficiencies in terms of high performance.
)
) Have you actually measured the performance impact of calling strlen
) after each fgets?
)
)>Examples include the return values of fgets(), strcpy(), strcat()
)>(thanks Eric Sosman for mentioning the last 2)
)>
)>Example: char * strcat ( char * destination, const char *
)>source );
)>Return Value: destination is returned.
)>
)>How useless is that! I already know the destination, in the first
)>place.
)
) Useless is in the eye of the beholder.
)
) char buffer[100] = "The input is ";
) puts(strcat(buffer, flag ? "valid" : "invalid"));

Not much difference there:

char buffer[100] = "The input is ";
strcat(buffer, flag ? "valid" : "invalid");
puts(buffer);

You only have to type the variable name twice.
That's a very minor inconvenience. So it's marginally useful.
Returning a pointer to the nul terminator would be a lot more useful.

) or
)
) strcat(strcpy(new_large_buffer,
) original_text_in_small_buffer),
) additional_text);

If strcpy and strcat returned a pointer to the nul character at the end of
the string, that would still work, *and* it would be a lot more efficient.

I've seen code where they build a giant string by repeatedly strcat()ing
a few words on the end. That's needless O(n^2) performance right there.
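
To put a number on that, here is a small sketch that builds a string with repeated strcat() calls and counts how many characters have to be skipped just to find the end each time. The function name append_rescanning is made up, and the strlen() call only stands in for the search that strcat() has to do internally because it is handed the start of the string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demonstration: appends piece to out count times using
   strcat(), and returns the total number of characters that had to be
   skipped over to locate the end of out across all the calls. */
size_t append_rescanning(char *out, size_t out_size,
                         const char *piece, size_t count)
{
    assert(out != NULL && piece != NULL);
    assert(strlen(piece) * count < out_size);

    size_t skipped = 0;
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        skipped += strlen(out);  /* same search strcat() repeats below */
        strcat(out, piece);
    }
    return skipped;
}
```

For n pieces of k characters each, the total skipped is k * n * (n - 1) / 2, which grows with the square of n. With a pointer to the current end kept between appends, that search disappears entirely.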

)>Why not return a pointer to the end of the concatenated string, or the
)>size of the string. This would cost the library no extra performance
)>cost whatsoever!
)
) Was that true 25+ years ago when the language was being designed?

Yes. Duh. It's *NO* performance loss. Not negligible, but zero. Nil.
The library just copied a number of characters to somewhere else.
And it added a trailing zero to boot. So it knows where that zero went.

)>Are these deficiencies _only_ string-related?
)>Is there no good library which I can use for optimal computing?
)
) Any library where fgets returns something other than the destination
) pointer would be non-portable in the extreme. Some might be tempted
) to say it is not even C.
)
) The approach taken by some compiler writers when they need to tweak a
) standard function (such as add an extra parameter to make it safer) is
) to create a new function with a similar name that is reserved for the
) implementation (such as prepending an _ or appending a _s). Then the
) user has the choice of using the standard function or the system
) specific extension.

I think gnu libc has a few of those. It would have been an easy fix
to accept those extra functions into the C standard interface, and then
everybody would have been happy.

)>Surely there must be a good library somewhere, or do all C programmers
)>really carry their own routines with them?
)
) Most suffer these horrendous performance problems in silence by using
) the standard library. Others have been known to write their own
) wrapper functions to encapsulate the standard functions. But this is
) usually done to add functionality (such as common error checking)
) rather than deal with undocumented performance issues.

The best approach I've seen is to have the build script check for
platform-specific extensions (such as strlcpy and strlcat), and if
those don't exist, include wrappers which emulate their behaviour.
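
A rough sketch of that approach for strlcpy. It assumes the build script defines HAVE_STRLCPY when the platform has the function (the usual autoconf-style naming); my_strlcpy is a hypothetical wrapper name chosen so it cannot clash with a system declaration.

```c
#include <assert.h>
#include <string.h>

#ifdef HAVE_STRLCPY
/* The build script found a native strlcpy(); use it directly. */
#define my_strlcpy strlcpy
#else
/* Fallback that emulates strlcpy(): copies at most size-1 characters,
   always null-terminates when size > 0, and returns strlen(src) so
   the caller can detect truncation (result >= size). */
static size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    assert(src != NULL && (dst != NULL || size == 0));

    size_t src_len = strlen(src);
    if (size > 0) {
        size_t n = src_len < size - 1 ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
#endif
```

Callers use my_strlcpy everywhere, so the rest of the code does not care which variant was selected at build time.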


I believe that these functions return the original pointer because
the language designers had some half-assed idea about viewing strings
as some kind of opaque type that you could manipulate through functions.
(I.E. conceptually, strcat takes two strings and returns one string,
if you conveniently forget that the first string is changed, and also
needs to have enough memory backing it to hold the whole string.)


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
James Kuyper

What about a library that has better interfaces.

I guarantee you that the library with better interfaces will still be
sub-optimal.

Could you perhaps give a brief example of what you mean by inlining
here?

Here's an example of code to concatenate some strings:

outstring[0] = '\0';
for(str=0; str < num_strings; str++)
    strcat(outstring, in_strings[str]);

My inline equivalent would be as follows:

char *p = outstring;
*p = '\0'; // result is an empty string if there are no inputs
for(str=0; str < num_strings; str++)
{
    const char *q = in_strings[str];
    while((*p++ = *q++) != '\0')
        ;
    p--; // move back to terminating null character
}

It's slightly more wordy, but keeps track of the current end of the
string, avoiding the time wasted by strcat() searching for the end of
the string. It's a minor issue, and I wouldn't bother with it if it were
significantly more difficult to do. Most of the str*() and mem*()
functions are easy to inline.

Here's some code in a utility library that I inherited responsibility
for several years ago. It caused problems when called inside a strtok()
loop in the calling routine:

if((p=strtok(local_string, ","))!=NULL){
    do{
        /* Byte_sum = MFAIL if datatype_to_DFNT fails.
           Get HDF_num_type for p */
        if((HDF_num_type=datatype_to_DFNT(p))==MFAIL)byte_sum=MFAIL;

        /* If not MFAIL, increment byte_sum by the number of
           bytes returned from DFKNTsize */
        else byte_sum=byte_sum+(long int)DFKNTsize(HDF_num_type);

    }while((p=strtok(NULL,","))!=NULL && byte_sum!=MFAIL);
}

To resolve this problem, a subordinate re-wrote the code, following my
advice, as follows:

while (local_string != NULL && byte_sum != MFAIL) {
    next_string = strchr(local_string, ',');
    if (next_string)
        *next_string++ = '\0';
    HDF_num_type = datatype_to_DFNT(local_string);
    if (HDF_num_type==MFAIL)
        byte_sum = MFAIL;
    else
        byte_sum += (long int)DFKNTsize(HDF_num_type);
    local_string = next_string;
}
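
The reason the rewrite helps is that strtok() keeps its position in hidden internal state, so a strtok() sequence started inside a called function overwrites the caller's position. The strchr() version keeps its position in local variables. A self-contained sketch of the same pattern, with hypothetical function names and separators, nesting one split inside another:

```c
#include <assert.h>
#include <string.h>

/* Splits text in place at each sep and returns the number of fields.
   The position lives in local variables, so calls can be nested. */
static size_t count_fields(char *text, char sep)
{
    assert(text != NULL);

    size_t fields = 0;
    while (text != NULL) {
        char *next = strchr(text, sep);
        if (next != NULL)
            *next++ = '\0';  /* end this field, step past the separator */
        fields++;
        text = next;
    }
    return fields;
}

/* Outer loop over ';'-separated records. The inner call splits each
   record at ',' without disturbing the outer loop's position. */
size_t count_all_fields(char *text)
{
    assert(text != NULL);

    size_t total = 0;
    while (text != NULL) {
        char *next = strchr(text, ';');
        if (next != NULL)
            *next++ = '\0';
        total += count_fields(text, ',');
        text = next;
    }
    return total;
}
```

One behavioural difference to keep in mind: unlike strtok(), this pattern does not merge adjacent separators, so "a,,b" yields an empty field in the middle.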
 
