Null terminated strings: bad or good?

Keith Thompson · Jan 7, 2009

jacob navia said:
Bartc said:

I pass what it expects obviously. In my implementation,
I always append a zero to the string stored so the conversion
is very fast.

OK so you use a length *and* a zero-terminated string? [...]
Makes sense (that's what I do). But then you lose one advantage of
length+string which is dealing with arbitrary binary data, which can
include zeros.

It is only the zero at String.length that counts. Embedded zeroes inside
the string are NOT significant.

When you say they're not significant, do you mean that they're taken
as part of the string? For example, if the length field is 3 and the
character array contains { 'x', '\0', 'y', '\0' }, does that denote a
"string" of length 3 containing an 'x', a null character, and a 'y'?

I ask because saying that they're not significant could imply that
they're ignored.

jacob navia · Jan 7, 2009

Keith said:
jacob navia said:

Bartc said:

I pass what it expects obviously. In my implementation,
I always append a zero to the string stored so the conversion
is very fast.
OK so you use a length *and* a zero-terminated string? [...]
Makes sense (that's what I do). But then you lose one advantage of
length+string which is dealing with arbitrary binary data, which can
include zeros.

It is only the zero at String.length that counts. Embedded zeroes inside
the string are NOT significant.

When you say they're not significant, do you mean that they're taken
as part of the string? For example, if the length field is 3 and the
character array contains { 'x', '\0', 'y', '\0' }, does that denote a
"string" of length 3 containing an 'x', a null character, and a 'y'?

I ask because saying that they're not significant could imply that
they're ignored.

No they are not ignored, I mean they do not denote the end of the string
and are treated as any other character. In your example the string has 3
characters 'x', '\0', and 'y'.
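
The layout being described can be sketched roughly as follows. This is a minimal sketch, not jacob navia's actual implementation: the thread only mentions a `String.length` field, so the `String` struct, its `data` member, and the helper names are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string type; only the "length" field is
   mentioned in the thread, the rest is assumed for illustration. */
typedef struct {
    size_t length;     /* number of characters, embedded '\0' included */
    const char *data;  /* data[length] is always '\0' for C interop    */
} String;

/* Wrap a buffer holding `length` characters followed by a final '\0'. */
static String string_from(const char *buf, size_t length)
{
    String s = { length, buf };
    return s;
}

/* Length as a plain C function would see it: stops at the first '\0'. */
static size_t c_visible_length(String s)
{
    return strlen(s.data);
}
```

With the array { 'x', '\0', 'y', '\0' } and a length of 3, the stored length stays 3 while `strlen` stops after the 'x', which is the advantage Bartc says is lost once the data is handed to a function that expects a zero-terminated string.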

Rafael · Jan 7, 2009

Hello Jacob, List

jacob navia wrote:

No they are not ignored, I mean they do not denote the end of the string
and are treated as any other character. In your example the string has 3
characters 'x', '\0', and 'y'.

In that case, how should one find the length of a string (given that
its length varies at run time)?

Rafael

jacob navia · Jan 7, 2009

Rafael said:
Hello Jacob, List

jacob navia wrote:

In that case, how should one find the length of a string (given that
its length varies at run time)?

Rafael

The length of the string is explicitly stored in another field, not
encoded in the characters of the string as it is now.

Keith Thompson · Jan 7, 2009

Tony said:
It's not about "disallowing" anything. It's about not perverting the common
case for the exceptional case.

In C as it's currently defined, a string's length may be as large
as SIZE_MAX-1, though an implementation may impose a smaller limit
on object sizes.

A string package that can only handle strings up to, say, 65535
bytes imposes what I think is an unnecessary limitation. A limit of
255 bytes is, I believe, quite unreasonable. Supporting huge strings
shouldn't be much more difficult than supporting short strings.
There's no need to "pervert" anything.

CBFalconer · Jan 7, 2009

Tony said:
1. Ease of IO (writing to/from disk files, for example).
2. No need to calculate length (a real bummer with null terminated
strings).

None of that appears in C. The writing from a string can be done
by:

int wrtstring(char *s, FILE *f) {
    char ch;
    int err;

    while (ch = *s++) {
        if (EOF == (err = putc(ch, f))) break;
    }
    return err;
}

Note that putc can be a macro, and thus can use the file system
buffers directly. This can make the overall writing very
efficient. However, a macro may evaluate its arguments more than
once, so they must not have side effects. Thus the ch is required.
If no error occurs, wrtstring returns the last char written.

CBFalconer · Jan 7, 2009

Keith said:
You misunderstood my example. In my hypothetical implementation,
calloc() *did* succeed. It returned a pointer to an object whose
size exceeds SIZE_MAX bytes.

You assert that "no object can exceed SIZE_MAX". I see no *direct*
statement of this in the standard. If there is one, surely you can
provide a citation. (You don't get to just make up rules like this.)

I see no reason for a 'direct' statement. sizeof returns the size
of ANY object. The type returned by sizeof is size_t (which must
be capable of holding that result). SIZE_MAX is the maximum value
of a size_t value. These facts suffice to prove that no object can
exceed SIZE_MAX in size.

I have cross-posted to comp.std.c for further resolution, if any.

CBFalconer · Jan 7, 2009

Tony said:
.... snip ...

I don't see that as an issue. Everything doesn't have to be
scalable to the largest integer size on a machine. Strings to me
are of "reasonable" length. For instance, if there is a period
marking the end of a sentence, then that is probably one string.
A whole file of sentences and paragraphs is not a string. 32
bits for a length field is just because it's easy to use on a
32-bit platform. If someone needs a billion byte string, well I'm
not even going to try to conceive of that because it sounds silly.

Sorry, you're wrong. A string does not end on a period, or a '\n',
etc. It ends on the first '\0'. Nothing (other than SIZE_MAX)
limits the length of the string. The reason is simple - the
standard so specifies.

Thus, assuming adequate memory, a whole text file can be stored in
a single string.

Richard · Jan 7, 2009

CBFalconer said:
Sorry, you're wrong. A string does not end on a period, or a '\n',
etc. It ends on the first '\0'. Nothing (other than SIZE_MAX)
limits the length of the string. The reason is simple - the
standard so specifies.

Thus, assuming adequate memory, a whole text file can be stored in
a single string.

Nonsense.

Chapter and verse for the content format of a "text file" please.

jameskuyper · Jan 7, 2009

CBFalconer said:
I see no reason for a 'direct' statement. sizeof returns the size
of ANY object.

No, it only returns the size of the object whose name is the right
operand of sizeof. When you have an object, but no way to name it for
use as the operand of sizeof, then the requirement on sizeof cannot
constrain the size of that object.

Keith Thompson · Jan 7, 2009

CBFalconer said:
I see no reason for a 'direct' statement. sizeof returns the size
of ANY object. The type returned by sizeof is size_t (which must
be capable of holding that result). SIZE_MAX is the maximum value
of a size_t value. These facts suffice to prove that no object can
exceed SIZE_MAX in size.

I have cross-posted to comp.std.c for further resolution, if any.

One more time. The argument to sizeof is either a parenthesized type
name or an expression. sizeof does not, in general, compute the size
of an object. In particular, an object created by calloc cannot be
named by an expression, and therefore cannot be an operand of sizeof.

You claim that "sizeof returns the size of ANY object". I see
nothing in the standard that directly supports this claim. If you
can prove it from the standard, please do so. I'm not interested
in any response that doesn't include one or more specific citations
from the standard.
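
The distinction being argued here can be illustrated with a small sketch. The function names and the `ELEMS` constant are assumptions made for the example. It shows only that sizeof works from the type of its operand, which is the point Keith is making, and it does not settle the SIZE_MAX question.

```c
#include <stddef.h>
#include <stdlib.h>

enum { ELEMS = 1000 };  /* arbitrary element count for the illustration */

/* sizeof applied to the pointer that calloc returned. */
static size_t size_of_pointer(void)
{
    char *p = calloc(ELEMS, 1);
    size_t n = sizeof p;   /* size of the type char *, not of the block */
    free(p);
    return n;
}

/* sizeof applied to the expression *p. */
static size_t size_of_pointee(void)
{
    char *p = calloc(ELEMS, 1);
    size_t n = sizeof *p;  /* size of the type char, operand not evaluated */
    free(p);
    return n;
}

/* sizeof applied to a declared array, whose type carries the count. */
static size_t size_of_array(void)
{
    char a[ELEMS];
    return sizeof a;       /* ELEMS * sizeof(char) */
}
```

Neither `sizeof p` nor `sizeof *p` reports the 1000 bytes that were requested; only the declared array, whose type includes its element count, yields that number.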

Richard Tobin · Jan 7, 2009

CBFalconer said:
I see no reason for a 'direct' statement. sizeof returns the size
of ANY object.

It returns the size of any object that you can give as an argument to
sizeof(). It also returns the size of any type you can give as an
argument to sizeof(). You can't give the object allocated by calloc()
to sizeof(), because all you have is a void * pointer, rather than the
allocated object itself. To give sizeof() the object itself, you have
to use a cast or a declared variable, so what it boils down to is that
you can give sizeof() any object that you can write the type of. So
the ability of calloc() to return very large objects isn't any different
from the fact that you can write down a type name that is too large
to fit in a size_t.

Since there are names for types that don't fit in a size_t, the
possibility of calloc() (or anything else) returning excessively
large objects doesn't introduce any new problem.

-- Richard
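
Whatever the standard permits, portable code can avoid depending on the answer by rejecting requests whose total byte count does not fit in a size_t. The following is a minimal sketch of that check. The names `product_exceeds_size_max` and `checked_calloc` are hypothetical and are not taken from the thread or from any library.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if nmemb * size cannot be represented in a size_t. */
static int product_exceeds_size_max(size_t nmemb, size_t size)
{
    return nmemb != 0 && size > SIZE_MAX / nmemb;
}

/* Hypothetical wrapper: refuses requests larger than SIZE_MAX bytes
   instead of relying on what a given calloc does with them. */
static void *checked_calloc(size_t nmemb, size_t size)
{
    if (product_exceeds_size_max(nmemb, size))
        return NULL;
    return calloc(nmemb, size);
}
```

With this wrapper, a request such as `checked_calloc(SIZE_MAX, 2)` fails up front, so the caller never holds an object whose size it cannot express.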

Ben Bacarisse · Jan 7, 2009

CBFalconer said:
... The writing from a string can be done by:

int wrtstring(char *s, FILE *f) {
    char ch;
    int err;

    while (ch = *s++) {
        if (EOF == (err = putc(ch, f))) break;
    }
    return err;
}

Undefined behaviour when *s is initially equal to 0
(e.g. wrtstring("", stdout)).
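
One way to avoid returning an uninitialized err is to give it a value before the loop. The following is a minimal sketch under that assumption. The name `wrtstring_fixed` and the choice of 0 as the result for an empty string are illustrative, and this is not necessarily the fix the original author had in mind.

```c
#include <stdio.h>

/* Writes the zero-terminated string s to f.
   Returns EOF on a write error; otherwise 0 for an empty string,
   or the value putc returned for the last character written. */
static int wrtstring_fixed(const char *s, FILE *f)
{
    int err = 0;   /* defined even if the loop body never runs */
    char ch;

    while ((ch = *s++) != '\0') {
        if (EOF == (err = putc(ch, f)))
            break;
    }
    return err;
}
```

Calling `wrtstring_fixed("", stdout)` now returns 0 instead of reading an indeterminate value. A null s is still not handled, as in the original.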

Richard · Jan 7, 2009

CBFalconer said:
None of that appears in C. The writing from a string can be done
by:

int wrtstring(char *s, FILE *f) {
    char ch;
    int err;

    while (ch = *s++) {
        if (EOF == (err = putc(ch, f))) break;
    }
    return err;
}

Note that putc can be a macro, and thus can use the file system
buffers directly. This can make the overall writing very
efficient. However, a macro may evaluate its arguments more than
once, so they must not have side effects. Thus the ch is required.
If no error occurs, wrtstring returns the last char written.

Standard beginner error #1: you forgot the input condition where *s ==
0. The result is UB and dirty underpants flying out your nostrils or
whatever it is.

We won't bother being so pedantic as to mention the case where s is NULL
to start with.

CBFalconer · Jan 7, 2009

Ben said:
Undefined behaviour when *s is initially equal to 0
(e.g. wrtstring("", stdout)).

True. But I believe the underlined patch fixes that.

jameskuyper · Jan 7, 2009

CBFalconer said:
Keith said:

CBFalconer said:

Keith Thompson wrote:
[...]
I claim that calloc didn't really succeed. It just converted the
total size requested using the usual unsigned conversions. It
returned a pointer to a physical object, which was NOT (SIZE_MAX *
2) big. It can't be, since no object can exceed SIZE_MAX. THIS IS
NOT A TYPE. THIS calloc IS FAULTY.

You misunderstood my example. In my hypothetical implementation,
calloc() *did* succeed. It returned a pointer to an object whose
size exceeds SIZE_MAX bytes.

You assert that "no object can exceed SIZE_MAX". I see no *direct*
statement of this in the standard. If there is one, surely you can
provide a citation. (You don't get to just make up rules like this.)

I see no reason for a 'direct' statement. sizeof returns the size
of ANY object. The type returned by sizeof is size_t (which must
be capable of holding that result). SIZE_MAX is the maximum value
of a size_t value. These facts suffice to prove that no object can
exceed SIZE_MAX in size.

I have cross-posted to comp.std.c for further resolution, if any.

One more time. The argument to sizeof is either a parenthesized type
name or an expression. sizeof does not, in general, compute the size
of an object. In particular, an object created by calloc cannot be
named by an expression, and therefore cannot be an operand of sizeof.

You claim that "sizeof returns the size of ANY object". I see
nothing in the standard that directly supports this claim. If you
can prove it from the standard, please do so. I'm not interested
in any response that doesn't include one or more specific citations
from the standard.

How about this. No exceptions are mentioned, thus it covers all.

It covers all "what"? To make your argument valid, it would have to be
"all objects", but what this covers is "all sizeof expressions". To
prove relevance of this citation to your argument, you have to
identify a sizeof expression applied to the object allocated by the
call to calloc().

6.5.3.4 The sizeof operator

... snip ...

Semantics

[#2] The sizeof operator yields the size (in bytes) of its
operand, which may be an expression or the parenthesized
name of a type. The size is determined from the type of the
operand. The result is an integer. If the type of the
operand is a variable length array type, the operand is
evaluated; otherwise, the operand is not evaluated and the
result is an integer constant.

Keith Thompson · Jan 7, 2009

CBFalconer said:
Keith said:

CBFalconer said:

Keith Thompson wrote:
[...]
I claim that calloc didn't really succeed. It just converted the
total size requested using the usual unsigned conversions. It
returned a pointer to a physical object, which was NOT (SIZE_MAX *
2) big. It can't be, since no object can exceed SIZE_MAX. THIS IS
NOT A TYPE. THIS calloc IS FAULTY.

You misunderstood my example. In my hypothetical implementation,
calloc() *did* succeed. It returned a pointer to an object whose
size exceeds SIZE_MAX bytes.

You assert that "no object can exceed SIZE_MAX". I see no *direct*
statement of this in the standard. If there is one, surely you can
provide a citation. (You don't get to just make up rules like this.)

I see no reason for a 'direct' statement. sizeof returns the size
of ANY object. The type returned by sizeof is size_t (which must
be capable of holding that result). SIZE_MAX is the maximum value
of a size_t value. These facts suffice to prove that no object can
exceed SIZE_MAX in size.

I have cross-posted to comp.std.c for further resolution, if any.

One more time. The argument to sizeof is either a parenthesized type
name or an expression. sizeof does not, in general, compute the size
of an object. In particular, an object created by calloc cannot be
named by an expression, and therefore cannot be an operand of sizeof.

You claim that "sizeof returns the size of ANY object". I see
nothing in the standard that directly supports this claim. If you
can prove it from the standard, please do so. I'm not interested
in any response that doesn't include one or more specific citations
from the standard.

How about this. No exceptions are mentioned, thus it covers all.

6.5.3.4 The sizeof operator

... snip ...

Semantics

[#2] The sizeof operator yields the size (in bytes) of its
operand, which may be an expression or the parenthesized
name of a type. The size is determined from the type of the
operand. The result is an integer. If the type of the
operand is a variable length array type, the operand is
evaluated; otherwise, the operand is not evaluated and the
result is an integer constant.

Nope.

The standard *does not say* that the sizeof operator can be used to
determine the size of any object. It can determine the size of a type
or of an expression. In particular, it cannot be used (directly) to
determine the size of an anonymous object, such as one created by a
call to calloc(). So the description of the sizeof operator is
irrelevant to the question.

Wojtek Lerch · Jan 7, 2009

CBFalconer said:
How about this. No exceptions are mentioned, thus it covers all.

You'd think that; but what about sizeof(char[SIZE_MAX][2])?

Anyway, since no exceptions are mentioned, apparently the intent was to
cover all possible operands. But the operand of sizeof is not an object;
the operand is an expression or a type. Even if the standard means to
forbid types larger than SIZE_MAX bytes and expressions that have such
types, that doesn't necessarily mean that objects larger than SIZE_MAX bytes
are forbidden too.

6.5.3.4 The sizeof operator

... snip ...

Semantics

[#2] The sizeof operator yields the size (in bytes) of its
operand, which may be an expression or the parenthesized
name of a type. The size is determined from the type of the
operand. The result is an integer. If the type of the
operand is a variable length array type, the operand is
evaluated; otherwise, the operand is not evaluated and the
result is an integer constant.

Flash Gordon · Jan 7, 2009

CBFalconer said:
Keith said:

CBFalconer said:

Keith Thompson wrote:
[...]
I claim that calloc didn't really succeed. It just converted the
total size requested using the usual unsigned conversions. It
returned a pointer to a physical object, which was NOT (SIZE_MAX *
2) big. It can't be, since no object can exceed SIZE_MAX. THIS IS
NOT A TYPE. THIS calloc IS FAULTY.
You misunderstood my example. In my hypothetical implementation,
calloc() *did* succeed. It returned a pointer to an object whose
size exceeds SIZE_MAX bytes.

You assert that "no object can exceed SIZE_MAX". I see no *direct*
statement of this in the standard. If there is one, surely you can
provide a citation. (You don't get to just make up rules like this.)
I see no reason for a 'direct' statement. sizeof returns the size
of ANY object. The type returned by sizeof is size_t (which must
be capable of holding that result). SIZE_MAX is the maximum value
of a size_t value. These facts suffice to prove that no object can
exceed SIZE_MAX in size.

I have cross-posted to comp.std.c for further resolution, if any.

One more time. The argument to sizeof is either a parenthesized type
name or an expression. sizeof does not, in general, compute the size
of an object. In particular, an object created by calloc cannot be
named by an expression, and therefore cannot be an operand of sizeof.

You claim that "sizeof returns the size of ANY object". I see
nothing in the standard that directly supports this claim. If you
can prove it from the standard, please do so. I'm not interested
in any response that doesn't include one or more specific citations
from the standard.

How about this. No exceptions are mentioned, thus it covers all.

6.5.3.4 The sizeof operator

... snip ...

Semantics

<snip>

Is irrelevant since there is no way to make the object created by calloc
the operand of sizeof. I'm sure this point has been made already but you
seem to have missed it.

Richard · Jan 7, 2009

Keith Thompson said:
CBFalconer said:

Keith said:

Keith Thompson wrote:
[...]
I claim that calloc didn't really succeed. It just converted the
total size requested using the usual unsigned conversions. It
returned a pointer to a physical object, which was NOT (SIZE_MAX *
2) big. It can't be, since no object can exceed SIZE_MAX. THIS IS
NOT A TYPE. THIS calloc IS FAULTY.

You misunderstood my example. In my hypothetical implementation,
calloc() *did* succeed. It returned a pointer to an object whose
size exceeds SIZE_MAX bytes.

You assert that "no object can exceed SIZE_MAX". I see no *direct*
statement of this in the standard. If there is one, surely you can
provide a citation. (You don't get to just make up rules like this.)

I see no reason for a 'direct' statement. sizeof returns the size
of ANY object. The type returned by sizeof is size_t (which must
be capable of holding that result). SIZE_MAX is the maximum value
of a size_t value. These facts suffice to prove that no object can
exceed SIZE_MAX in size.

I have cross-posted to comp.std.c for further resolution, if any.

One more time. The argument to sizeof is either a parenthesized type
name or an expression. sizeof does not, in general, compute the size
of an object. In particular, an object created by calloc cannot be
named by an expression, and therefore cannot be an operand of sizeof.

You claim that "sizeof returns the size of ANY object". I see
nothing in the standard that directly supports this claim. If you
can prove it from the standard, please do so. I'm not interested
in any response that doesn't include one or more specific citations
from the standard.

How about this. No exceptions are mentioned, thus it covers all.

6.5.3.4 The sizeof operator

... snip ...

Semantics

[#2] The sizeof operator yields the size (in bytes) of its
operand, which may be an expression or the parenthesized
name of a type. The size is determined from the type of the
operand. The result is an integer. If the type of the
operand is a variable length array type, the operand is
evaluated; otherwise, the operand is not evaluated and the
result is an integer constant.

Nope.

The standard *does not say* that the sizeof operator can be used to
determine the size of any object. It can determine the size of a type
or of an expression. In particular, it cannot be used (directly) to
determine the size of an anonymous object, such as one created by a
call to calloc(). So the description of the sizeof operator is
irrelevant to the question.

I would like to make a point and then ask a serious question here:

*IF* Chuck were called Bill Cunningham you would probably have stopped
responding to him by now.

How on earth can you encourage him being so wrong so often by replying
to him? *IF* Falconer were a polite and well meaning poster who was here
to improve his knowledge in addition to aiding and abetting new C
adopters then all well and good. But he isn't. I mean, hell, we ALL make
mistakes. However, he is here to bully, provoke and generally piss
people off with his arrogant and, at times, nonsensical ramblings and
obvious lack of real practical C in the real world.
