Boost process and C

Ian Collins

CBFalconer said:
.... snip ...



And, if you write the library in truly portable C, without any
silly extensions and/or entanglements, you just compile the library
module. All the compiler vendor needs to do is meet the
specifications of the C standard.

I think the point of the original paragraph has been lost: the reason to
have a standard library is to remove the need for everyone to roll their
own and get newcomers producing useful applications faster. Sure there
are those of us with our own tried and trusted solutions, but that
doesn't help the newcomers.

The same thing happened with C++: before the standard there were many
incompatible commercial and private libraries; now most of these have
been replaced with standard implementations.

It's a win-win situation: you can still use your own if you choose.
 
websnarf

CBFalconer said:
The last time I took an (admittedly cursory) look at Bstrlib, I
found it cursed with non-portabilities

You perhaps would like to name one?
[...] and unwarranted assumptions,

Such as?
not to mention an interminable API.

It is not necessary to learn or use the entire API to use it
effectively. If it seems large you can blame the current crop of users
who have motivated all the extensions from its original base functions.
In each case I could not make a strong enough argument against the
function's inclusion. You appear to be the only person obsessed with
this non-issue.
[...] This is a criticism very few can make of the standard C string operations.

The C standard says that whether or not "string literals" are writable
is platform specific. It doesn't even specify what wchar_t contains
-- for example, the WATCOM compiler supports the old UCS-2 in its 16
bit compilers, and UTF-32 in its 32 bit compilers (while not
implementing a properly functioning setlocale function). So much for
portability. Bstrlib is *designed* for portability (including semantic
behavior irrespective of platform).

Every C string function which writes makes the one unwarranted
assumption that it cannot make -- i.e., that the size of the buffer
that holds the destination will be large enough for whatever operation
it is doing. fgets() makes the assumption that the input stream is in
text mode, or that it doesn't read a '\0' or that you just don't care
about that case. Nearly every string function *assumes* that the
parameters are non-aliasing (that's worse than just an unwarranted
assumption -- it's just degenerate in terms of functionality). It is
*assumed* that you don't make interleaved calls to strtok on different
strings. And of course, the format of C strings *assumes* that '\0'
will never be considered part of a string's content -- this assumption
ends up permeating all system string APIs, for any platform that uses C
as its main implementation language.
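
As a concrete illustration of the strtok point, here is a minimal sketch in standard C. The function name count_outer_tokens_interleaved is made up for this example and is not part of any library. It starts tokenizing one string, tokenizes a second string in the middle of the loop, and then tries to resume the first.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts how many space-separated tokens the outer
   loop sees when a second strtok() sequence is interleaved with it. */
int count_outer_tokens_interleaved(void)
{
    char outer[] = "a b c";
    char inner[] = "x,y";
    int count = 0;

    char *tok = strtok(outer, " ");
    while (tok != NULL) {
        count++;

        /* This inner sequence overwrites strtok's single saved position. */
        char *sub = strtok(inner, ",");
        while (sub != NULL) {
            sub = strtok(NULL, ",");
        }

        /* Resumes from the exhausted inner string, not from "b c". */
        tok = strtok(NULL, " ");
    }
    return count;
}
```

Without the inner loop the function would count three tokens. With it, the outer loop stops after the first one, because strtok keeps one saved position shared by every caller.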

In terms of API size, Bstrlib is about 80 base C functions (which
includes the stream API functions) and 18 macros (there are 22 "bonus"
functions for doing MIME parsing and other miscellaneous utilities, and
there is a C++ API). The string.h file for one of my compilers has
about 57 extern functions. Then we need to add
(f|s|v|vs|sn|vsn|vf|)printf, (f|)puts, (f|)gets, (s|)scanf, ato(f|i|l),
strto(d|l), which is 19 all by itself. Add in the wide characters and
you'll nearly double that count. Bstrlib is not dramatically more
"interminable" than the standard C library.

And you just can't compare the size of the APIs to judge how easy it is
to use or understand. To know all of the undefined scenarios of the C
library functions, you have to do a function by function examination of
the standard. With Bstrlib, all you have to do is make sure each
parameter is well defined, you don't abuse the write protection, and
don't destroy a value if you use an alias of it later. So the effort
to understand each function in Bstrlib is far lower. The functions in
Bstrlib also tend, on average, to do a lot more per function -- so your
investment in understanding has a higher payoff. And the thing is open
source, so there is never any ambiguity about any Bstrlib function that
you cannot authoritatively figure out on your own.

So I don't know what you last looked at, because you are just plain
wrong.
 
websnarf

Keith said:
Ian said:
Where would you draw the line on topicality?
My interpretation is [...]
Potential improvements?

To which the familiar refrain is "if you don't like the features of C,
use some other language".

If you don't like the features of C, you can either:

(1) Do without, or

(2) Use some other language that provides those features, or

(3) Use a C compiler that implements those features as an extension
(thus tying your code to that specific implementation), or

(4) Push to have the new features added to the next revision of the C
standard, and then wait for the new standard to be published, and
then wait for implementations to support the new standard. If
you're *very* lucky, this might take as little as a decade.

If you think the standardization process is too slow, you can discuss
it in comp.std.c.

It's not just slow -- it's *WRONG*. They add things that shouldn't be
added, and they rarely remove things, even things where there simply is
no question that they need to be removed.
[...] If you know of an alternative other than the ones
I've mentioned (and they've *all* been mentioned here), feel free to
suggest it.

How about doing what the GMP authors did, what Hans Boehm did with his
Boehm garbage collector, and what I've done in my own private
libraries (a self-debugging heap with sub-heaps with fast mass-free
capabilities, coroutines, etc.)? Just *do it* and shake your head at
the standards committee who are of no help. (Then there is the Walter
Bright/Jacob Navia solution -- but I don't have the energy to build my
own compiler.)
 
P.J. Plauger

[...] If you know of an alternative other than the ones
I've mentioned (and they've *all* been mentioned here), feel free to
suggest it.

How about doing what the GMP authors did, what Hans Boehm did with his
Boehm garbage collector, and what I've done in my own private
libraries (a self-debugging heap with sub-heaps with fast mass-free
capabilities, coroutines, etc.)? Just *do it* and shake your head at
the standards committee who are of no help. (Then there is the Walter
Bright/Jacob Navia solution -- but I don't have the energy to build my
own compiler.)

For once we are in agreement. Standardization works best when it
*codifies existing practice*. Absent existing practice, standards
committees indulge in speculative invention. Even if that invention
is done well it may not have much of a market.

So if you think you know how to evolve C, *just do it*. Build a
following and your bright new addition will be an obvious candidate
for a future DR, or revision of the C Standard. And meanwhile you
don't have to wait a decade for those stodgy old, uh, committee
members to do the obvious.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
Robert Latest

If you don't like the features of C, you can either:

[ suggestions (1) through (4) snipped ]

(5) Write a compiler that supports the needed features,
preferably for a popular platform with a wide user
base such as Windows, and make it available for free.

If people like the language extensions, they will become popular
quickly and make it into other implementations as well, thus
creating what is known as a "de facto standard". No need to wait
for an ISO committee.

robert
 
regis

Giorgos said:
No, please. This looks strangely familiar if you know LISP :p

Plus, it doesn't really work for functions with an arbitrary number of
arguments, and this creates an inconsistency in the elegantly simple
syntax of C.

I know of no infix scheme for functions in Lisp.
In Lisp, this would look like:

(Vect_Scale
  (Vect_Dot
    (Vect_Sub v u)
    (Vect_Sub w v)
  )
  (Vect_Sub_va p q r s)
)

which is much like it looks in C without infix notation:

Vect_Scale (
    Vect_Dot (
        Vect_Sub (u,v),
        Vect_Sub (w,v)
    ),
    Vect_Sub_va (p, q, r, s, ARGS_END)
);
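
For readers unfamiliar with the ARGS_END idiom, a sentinel-terminated variadic function can be sketched as follows. Everything here is an assumption made for illustration: the Vect type, passing vectors by pointer, and the meaning "subtract every later argument from the first" are not taken from the code under discussion.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical 3-component vector type. */
typedef struct { double x, y, z; } Vect;

/* Typed null pointer used as the end-of-arguments marker. The cast matters:
   a bare NULL may be passed as an int through a variadic call. */
#define ARGS_END ((const Vect *)NULL)

/* Returns *first minus every following vector, up to ARGS_END. */
Vect Vect_Sub_va(const Vect *first, ...)
{
    Vect result = *first;
    va_list ap;

    va_start(ap, first);
    for (;;) {
        const Vect *v = va_arg(ap, const Vect *);
        if (v == ARGS_END) {
            break;
        }
        result.x -= v->x;
        result.y -= v->y;
        result.z -= v->z;
    }
    va_end(ap);
    return result;
}
```

A call would look like Vect_Sub_va(&p, &q, &r, ARGS_END). Forgetting the terminator is undefined behaviour, which is one of the costs of this style.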
 
RSoIsCaIrLiIoA

Please post the source code to the last project you wrote in
assembler, and which ported unchanged to, say, an 8080, an 8051, an
8086, a 6800, a 68000, an IBM 360/370, a PDP11, an HP3000. I trust
it ran to completion, and produced the same output on all systems
for the same input.

"Please post the source code to the last project you wrote in"
some portable C code more complex than
 
CBFalconer

CBFalconer wrote:
.... snip ...

You perhaps would like to name one?

I took another 2 minute look, and was immediately struck by the use
of int for sizes, rather than size_t. This limits reliably
available string length to 32767. I did find an explanation and
justification for this. Conceded, such a size is probably adequate
for most usage, but the restriction is not present in standard C
strings.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
 
websnarf

P.J. Plauger said:
[...] If you know of an alternative other than the ones
I've mentioned (and they've *all* been mentioned here), feel free to
suggest it.

How about doing what the GMP authors did, what Hans Boehm did with his
Boehm garbage collector, and what I've done in my own private
libraries (a self-debugging heap with sub-heaps with fast mass-free
capabilities, coroutines, etc.)? Just *do it* and shake your head at
the standards committee who are of no help. (Then there is the Walter
Bright/Jacob Navia solution -- but I don't have the energy to build my
own compiler.)

For once we are in agreement. Standardization works best when it
*codifies existing practice*.

Ooooh! You mean like TR 24731? Are you sure that you are codifying
actual existing practice?
[...] Absent existing practice, standards
committees indulge in speculative invention.

You mean like complex numbers (but specifically excluding Gaussian
integers)?
[...] Even if that invention
is done well it may not have much of a market.

So if you think you know how to evolve C, *just do it*. Build a
following and your bright new addition will be an obvious candidate
for a future DR, or revision of the C Standard.

You know, debugging heaps have been around forever (speaking of
*market*). strlcat, strlcpy, and strtok_r have been around for some
time (certainly from well before 1999), and have had plenty of pickup
in certain environments. The C standard didn't pick them up, why would
they pick up any other useful, successful extension? I don't buy your
claim, because it's not consistent with observed reality.

You might also like to observe the efforts of Bjarne Stroustrup, Guido
van Rossum and Roberto Ierusalimschy. These people have garnered a
significant market following that has gone completely unnoticed by the
Standards committee.
[...] And meanwhile you
don't have to wait a decade for those stodgy old, uh, committee
members to do the obvious.

As I posted elsewhere, I am thinking more of the word "flawed" than old
or stodgy.
 
websnarf

CBFalconer said:
I took another 2 minute look, and was immediately struck by the use
of int for sizes, rather than size_t. This limits reliably
available string length to 32767.

Using size_t is also not any more *portable* than using int. Any lack
of portability is merely a reflection of the intrinsic
non-portability of the language itself. size_t can only reliably
contain values as high as 65535 -- so you are saying this difference in
limits (between 32k and 64k) embodies a significant universe of string
manipulation to warrant a claim against Bstrlib's portability?
[...] I did find an explanation and
justification for this. Conceded, such a size is probably adequate
for most usage, but the restriction is not present in standard C
strings.

You're going to need to concede on more grounds than that. There is a
reason many UNIX systems tried to add a ssize_t type, and why TR 24731
has added rsize_t to their extension. (As a side note, I strongly
suspect that Microsoft, in fact, added this whole rsize_t thing to TR
24731 when they realized that Bstrlib, or things like it, actually has
far better real world safety because of its use of ints for string
lengths.) Using a long would be incorrect since there are some systems
where a long value can exceed a size_t value (and thus lead to falsely
sized mallocs.) There is also the matter of trying to codify read-only
and constant strings and detecting errors efficiently (negative lengths
fit the bill.) Using ints is the best choice because at worst it's
giving up things (super-long strings) that nobody cares about, it
allows in an efficient way for all desirable encoding scenarios, and it
avoids any wrap around anomalies causing under-allocations. If I tried
to use size_t I would give up a significant amount of safety and design
features (or else I would have to put more entries into the header,
making it less efficient).
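
The "wrap around" concern can be made concrete with a small sketch. The function names are hypothetical and this is not Bstrlib's code; it only shows how an unchecked size_t sum for a concatenation buffer can wrap to a small value, and what a checked unsigned version and a signed-int version might look like.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Unchecked: a + b + 1 is computed modulo SIZE_MAX + 1, so a huge input
   can produce a tiny allocation size. */
size_t concat_size_unchecked(size_t a, size_t b)
{
    return a + b + 1;
}

/* Checked unsigned version: returns 0 when the true sum does not fit.
   0 is never a valid result because of the +1 for the terminator. */
size_t concat_size_checked(size_t a, size_t b)
{
    if (a > SIZE_MAX - 1 || b > SIZE_MAX - 1 - a) {
        return 0;
    }
    return a + b + 1;
}

/* Signed version: negative lengths are rejected as errors, and overflow
   is detected before the addition is performed. Returns -1 on failure. */
int concat_len_int(int a, int b)
{
    if (a < 0 || b < 0) {
        return -1;
    }
    if (a > INT_MAX - 1 - b) {
        return -1;
    }
    return a + b + 1;
}
```

Either representation can be made safe with an explicit check. The signed form additionally leaves the whole negative range free for error and marker values.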

Hundreds of people have downloaded Bstrlib and seriously looked at it
already. I've gotten plenty of feedback over its lifetime which has
led to its evolution (so it's not like it hasn't already had
significant review). You, of all people, are not going to find any
serious flaw in it by perusing it for 2 minutes.

"Cursed with non-portabilities" indeed ...
 
Keith Thompson

Using size_t is also not any more *portable* than using int. Any lack
of portability is merely a reflection of the intrinsic
non-portability of the language itself. size_t can only reliably
contain values as high as 65535 -- so you are saying this difference in
limits (between 32k and 64k) embodies a significant universe of string
manipulation to warrant a claim against Bstrlib's portability?

Yes.

I can't speak for Chuck, but size_t can reliably contain the size of
any object that the system can create. That's what it's for. If the
maximum value of size_t is 65535, then the system isn't going to be
able to create objects bigger than 65535 bytes.

It's entirely possible to have a conforming implementation in which
int is 16 bits, but size_t is 32 bits. On such a system, using int to
represent sizes needlessly sacrifices the ability to handle objects
bigger than 32767 bytes. For that matter, it's conceivable (but
unlikely) that size_t could be smaller than int.
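
The minimum ranges both sides are citing can be written down as compile-time checks. This is only a sketch: the helper names are invented for the example, _Static_assert needs a C11 compiler, and SIZE_MAX comes from <stdint.h> (C99 and later).

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Minimum ranges the C standard guarantees on every implementation. */
_Static_assert(INT_MAX >= 32767, "int must reach at least 32767");
_Static_assert(SIZE_MAX >= 65535, "size_t must reach at least 65535");

/* Returns 1 if the object size n is representable as a non-negative int.
   Both sides are widened so the comparison is correct whichever type
   happens to be larger on a given implementation. */
int size_fits_in_int(size_t n)
{
    return (uintmax_t)n <= (uintmax_t)INT_MAX;
}

/* Converts a size to an int length, or returns -1 if it does not fit. */
int size_to_int_len(size_t n)
{
    return size_fits_in_int(n) ? (int)n : -1;
}
```

On an implementation with 16-bit int and 32-bit size_t, size_to_int_len would reject any size above 32767. That range of objects is what an int-based length gives up.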

[...]
Hundreds of people have downloaded Bstrlib and seriously looked at it
already. I've gotten plenty of feedback over its lifetime which has
led to its evolution (so it's not like it hasn't already had
significant review). You, of all people, are not going to find any
serious flaw in it by perusing it for 2 minutes.

"Cursed with non-portabilities" indeed ...

Do you have any figures on the number of people who have looked at it
briefly, rejected it because it's too complex, and not bothered to
mention it to you? I'm not saying there are a lot of such people
(obviously I don't have any such figures myself), but I don't think
you can exclude the possibility that it's a common thing.
 
Walter Banks

You know, debugging heaps have been around forever (speaking of
*market*). strlcat, strlcpy, and strtok_r have been around for some
time (certainly from well before 1999), and have had plenty of pickup
in certain environments. The C standard didn't pick them up, why would
they pick up any other useful, successful extension? I don't buy your
claim, because it's not consistent with observed reality.

TR18037 codifies embedded standalone practices of many different
compiler companies.
You might also like to observe the efforts of Bjarne Stroustrup . . .
These people have garnered a
significant market following that has gone completely unnoticed by the
Standards committee.

I guess that's why Bjarne Stroustrup attends some of the standards meetings.

My point is the process may be long at times but it is a collection of
checks and balances that achieves broad input from a wide range of
sources. A single driven individual can start the process but cannot
force the outcome. It is more than the WG14 committee that needs
to be convinced of the merits of changes.

The TRs have become a kind of standards beta test, and with broad
rather than limited use, ideas can change.

Walter..
 
websnarf

Keith said:
Yes.

I can't speak for Chuck, but size_t can reliably contain the size of
any object that the system can create. That's what it's for. If the
maximum value of size_t is 65535, then the system isn't going to be
able to create objects bigger than 65535 bytes.

It's entirely possible to have a conforming implementation in which
int is 16 bits, but size_t is 32 bits. On such a system, using int to
represent sizes needlessly sacrifices the ability to handle objects
bigger than 32767 bytes. For that matter, it's conceivable (but
unlikely) that size_t could be smaller than int.

None of this speaks of the portability complaint CBF was claiming to
make. It's also an extremely marginal complaint that ignores the
tremendous benefit of this design choice, as I've already thoroughly
explained.
[...]
Hundreds of people have downloaded Bstrlib and seriously looked at it
already. I've gotten plenty of feedback over its lifetime which has
led to its evolution (so it's not like it hasn't already had
significant review). You, of all people, are not going to find any
serious flaw in it by perusing it for 2 minutes.

"Cursed with non-portabilities" indeed ...

Do you have any figures on the number of people who have looked at it
briefly, rejected it because it's too complex, and not bothered to
mention it to you? [...]

How do you suppose I would obtain such figures? I instead count the
actual feedback, and from that feedback count the number of people who
complained of portability problems (none), hidden assumptions (none), and
interminable API (none -- well even less than none, since people keep
asking me to add things into it.)

I also roughly count the number of people who have included Bstrlib in
their project but for some reason don't tell me about this fact (just
using Google). While some of them may be irresponsible and just
using the library without looking at the source at all -- I highly
doubt that that represents the majority, or that I can find even a
significant percentage of all such people this way.

It's not like a language standard where the audience is obvious
(compiler vendors) and so if you wanted to know if the standard was
rejected, you could just count up the number of people who upgraded
their compiler, but didn't implement the standard, say.
[...] I'm not saying there are a lot of such people (obviously I don't have any such
figures myself),

You can look in the sourceforge project page; the number of downloads
is not a secret.

There have, in fact, been thousands of downloads, but clearly a
significant number of them are probably just people getting the latest
version -- so I'm just estimating. There are a lot of "first day of
release" downloads of my library, indicating I have repeat customers
(about a hundred within the first few days of any new release) who are
very interested in keeping up to date with the latest version. But at
the same time, there are many more conservative people with ongoing
projects who I can see from their sources are using old versions of the
library.
[...] but I don't think
you can exclude the possibility that it's a common thing.

That's true, but I have received enough feedback to suggest that this
library has survived quite a thorough amount of auditing. Some
bugs have been caught, and the design has been improved because of it.

For the average source file in any environment, you are lucky if you
can get *one* person to review it. Bstrlib has clearly far exceeded
that.
 
Ben C

Um, the same is true for C++.

Yes of course, I never intended to imply that it wasn't.

The point I was making was that operator overloading doesn't mix so
easily with things that might need to be allocated and freed manually--
i.e. objects of user-defined types. You start needing constructors and
destructors, which C++ (but not C) has.
 
jacob navia

(e-mail address removed) wrote:
[snip]
None of this speaks of the portability complaint CBF was claiming to
make. It's also an extremely marginal complaint that ignores the
tremendous benefit of this design choice, as I've already thoroughly
explained.

Look, I have used size_t for my string library, and a "flags" field for
annotating info like "read only", etc. Your method looks much more
interesting since it avoids an extra field and saves 32 bits per string.

True, I would have an upper limit of 2GB for strings but I think that
will be enough... Anyway having a 2GB limit or a 4GB limit is not so
different. You will have to have a limit *anyway*
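
The two designs being compared can be sketched side by side. The struct and macro names are hypothetical, and neither layout is claimed to match the actual headers of either library; the sketch only shows a separate flags field versus folding the read-only marker into a signed field.

```c
#include <stddef.h>

/* Design A (assumed): unsigned length plus an explicit flags field. */
#define STR_READONLY 0x1u

struct str_flags {
    size_t len;
    unsigned flags;   /* STR_READONLY and similar annotations */
    char *data;
};

/* Design B (assumed): signed fields, where a non-positive capacity
   marks the string as write protected, so no flags field is needed. */
struct str_signed {
    int cap;          /* <= 0 means "do not modify" */
    int len;
    char *data;
};

/* Returns 1 if the string may be modified. */
int a_is_writable(const struct str_flags *s)
{
    return (s->flags & STR_READONLY) == 0;
}

/* Returns 1 if the string may be modified. */
int b_is_writable(const struct str_signed *s)
{
    return s->cap > 0;
}
```

Design B trades the upper half of the length range for the marker values, which is the trade-off argued over in this thread.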

Besides, if all Chuck has to say about your library is that "it should
have been a size_t" well... it is not a big problem. You are lucky that
he is not complaining that it doesn't run in a 486 :)

jacob
 
Ben C

CBFalconer said:
I took another 2 minute look, and was immediately struck by the use
of int for sizes, rather than size_t. This limits reliably
available string length to 32767.
[snip]
[...] I did find an explanation and
justification for this. Conceded, such a size is probably adequate
for most usage, but the restriction is not present in standard C
strings.
You're going to need to concede on more grounds than that. There is a
reason many UNIX systems tried to add a ssize_t type, and why TR 24731
has added rsize_t to their extension. (As a side note, I strongly
suspect that Microsoft, in fact, added this whole rsize_t thing to TR
24731 when they realized that Bstrlib, or things like it, actually has
far better real world safety because of its use of ints for string
lengths.) Using a long would be incorrect since there are some systems
where a long value can exceed a size_t value (and thus lead to falsely
sized mallocs.) There is also the matter of trying to codify
read-only and constant strings and detecting errors efficiently
(negative lengths fit the bill.) Using ints is the best choice
because at worst it's giving up things (super-long strings) that nobody
cares about,

I think it's fair to expect the possibility of super-long strings in a
general-purpose string library.
it allows in an efficient way for all desirable encoding scenarios,
and it avoids any wrap around anomalies causing under-allocations.

What anomalies? Are these a consequence of using signed long, or
size_t?
If I tried to use size_t I would give up a significant amount of
safety and design features (or else I would have to put more entries
into the header, making it less efficient).

If you only need a single "special" marker value (for which you were
perhaps using -1), you could consider using ~(size_t) 0.

Things will go wrong for at most one possible string length, but that's
more than can be said for using int.

But whatever the difference in efficiency, surely correctness and safety
first, efficiency second has to be the rule for a general-purpose
library?
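
The ~(size_t) 0 suggestion could look something like the following. This is a sketch only: LEN_INVALID, struct sz_string and sz_is_valid are names made up for the example.

```c
#include <stddef.h>

/* Hypothetical marker value: all bits set, i.e. the largest size_t.
   The outer cast keeps the type size_t even on an implementation where
   size_t is narrower than int and ~ would promote its operand. */
#define LEN_INVALID ((size_t)~(size_t)0)

/* Hypothetical string header using an unsigned length. */
struct sz_string {
    size_t len;   /* LEN_INVALID marks an error or unset string */
    char *data;
};

/* Returns 1 if s describes a usable string, 0 otherwise. */
int sz_is_valid(const struct sz_string *s)
{
    return s != NULL && s->data != NULL && s->len != LEN_INVALID;
}
```

Only one length value is lost to the marker. The cost is that a single marker cannot distinguish several conditions (error, read-only, constant) the way a range of negative values can.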
 
REH

Ben C said:
Yes of course, I never intended to imply that it wasn't.

The point I was making was that operator overloading doesn't mix so
easily with things that might need to be allocated and freed manually--
i.e. objects of user-defined types. You start needing constructors and
destructors, which C++ (but not C) has.

Why? And why do you think objects of user-defined types have to be
"allocated and freed manually"?

struct foo {
    int x, y;
};

foo operator+ (const foo& a, const foo& b)
// or, if you are in the "I hate references" camp:
// foo operator+ (foo a, foo b)
{
    const foo z = {a.x + b.x, a.y + b.y};
    return z;
}

foo x = {1, 2};
foo y = {3, 4};
foo z = x + y;

simplistic, but no constructors.

REH
 
Rod Pemberton

Richard Heathfield said:
jacob navia said:


Nobody is stopping you. Why not get on with it? For example: what containers
do you think a standard C container library should make available? What
should the APIs look like? And how will you persuade people to use the new
library instead of whatever they are using right now?

I have asked these questions before. You seem reluctant to pursue them.



That's right. You are free to throw your reputation down the tubes, and
nobody here can stop you, try as they might.

His reputation is far better than yours. No one has ever heard of you. And
if they did once learn of you, they sure don't remember you... The same
can be said of Keith Thompson, Martin Ambuhl, Chuck Falconer, etc...
Everyone knows about LCC-Win32 and Jacob Navia.



Rod Pemberton
 
Ian Collins

Rod said:
His reputation is far better than yours. No one has ever heard of you. And
if they did once learn of you, they sure don't remember you... The same
can be said of Keith Thompson, Martin Ambuhl, Chuck Falconer, etc...
Everyone knows about LCC-Win32 and Jacob Navia.

That kind of assumes everyone uses Windows....
 
