sizeof strings

mohi

Hello everyone,
this one might seem weird, but I don't know why I am confused by such
simple stuff.

The problem is:

char name[10];

This defines an array of char with size 10, where positions 0-8 (9 in all)
can be characters and name[9] should be '\0'. Right?

But when I strcpy() a string of 10 characters (strlen() = 10) into
name, it works fine
and shows that name[9] is a valid character from the string.
So where did the '\0' character of name[9] go?

And why is it not complaining?

Thank you,
Mohan Gupta
 
santosh

mohi said:
hello everyone ,
this one might seem wired but i dont know why iam confused with this
simple stuff



the problem is :

char name[10];

this defines a array of char with size 10 where 0-8 positions(count 9)
can be characters and name[9] should be '\0' .right?

Not necessarily. Even when storing a string in 'name' you need not
consume all its elements to do so. A one-character string stored
in 'name' is perfectly fine, though it may be a waste of storage.

but when i strcpy() a string of 10 characters (strlen()=10) into
name ,it works fine
and shows name[9]=a valid character from the string .
so where did the '\0' character of name[9] gone?????

The strlen function returns the number of characters its argument
points to, up to, but *not* including the null character, which is not
a part of the string, though it's needed to create a string. Therefore
your 'name' array is one character too short to hold a string of ten
characters. It can hold a string of at most nine characters.

strcpy will blindly copy its second argument to its first. It's your
responsibility to make sure that the buffer pointed to by the first
argument has sufficient space to contain the string pointed to by the
second. Otherwise strcpy will overrun the destination buffer and
undefined behaviour will result.

and why is it not complaining???

By pure luck. Try not to depend too much on it in programming.

<snip>
 
Thad Smith

santosh said:
mohi wrote:
char name[10];

this defines a array of char with size 10 where 0-8 positions(count 9)
can be characters and name[9] should be '\0' .right?

Not necessarily. Even when storing a string in 'name' you need not
consume all it's elements to do so. A one character string stored
in 'name' is perfectly fine, though it may be a waste of storage.
but when i strcpy() a string of 10 characters (strlen()=10) into
name ,it works fine
and shows name[9]=a valid character from the string .
so where did the '\0' character of name[9] gone?????

The strlen function returns the number of characters it's argument
points to, up to, but *not* including the null character, which is not
a part of the string, though it's needed to create a string.

The null character terminates the string, but is defined by the standard as
part of the string:
7.1.1:
"A string is a contiguous sequence of characters terminated by and
including the first null character."
 
Richard

Thad Smith said:
santosh said:
mohi wrote:
char name[10];

this defines a array of char with size 10 where 0-8 positions(count 9)
can be characters and name[9] should be '\0' .right?

Not necessarily. Even when storing a string in 'name' you need not
consume all it's elements to do so. A one character string stored
in 'name' is perfectly fine, though it may be a waste of storage.
but when i strcpy() a string of 10 characters (strlen()=10) into
name ,it works fine
and shows name[9]=a valid character from the string .
so where did the '\0' character of name[9] gone?????

The strlen function returns the number of characters it's argument
points to, up to, but *not* including the null character, which is not
a part of the string, though it's needed to create a string.

The null character terminates the string, but is defined by the
standard as part of the string:
7.1.1:
"A string is a contiguous sequence of characters terminated by and
including the first null character."

Which means that strlen is, by definition, wrong.....
 
Richard

this one might seem wired but i dont know why iam confused with this
simple stuff
the problem is :

char name[10];

this defines a array of char with size 10 where 0-8 positions(count 9)
can be characters and name[9] should be '\0' .right?

but when i strcpy() a string of 10 characters (strlen()=10) into
name ,it works fine
and shows name[9]=a valid character from the string .
so where did the '\0' character of name[9] gone?????

and why is it not complaining???

Show the [exact] code to complement the words, otherwise we can
only but guess.

He has explained it well enough that you should not have to guess.

He has simply overwritten the buffer and has been lucky enough for it
not to crash.

There is, of course, an inconsistency between strlen and the standard's
definition of a string: the same thing that every C programmer in the
world gets used to in lesson 2.

Of course, if he used a debugger such as gdb he could examine the memory
and see the terminating null character at Addr+10. But that advice would
be far too "off topic" for c.l.c.
 
Lew Pitcher

hello everyone ,
this one might seem wired but i dont know why iam confused with this
simple stuff



the problem is :

char name[10];

this defines a array of char with size 10 where 0-8 positions(count 9)
can be characters and name[9] should be '\0' .right?

Yes... and no.

All you have done with
char name[10];
is defined the variable "name" to reference an array of 10 character
elements. You may fill this array in any way you please, so long as you do
not try to store an element value that is outside of the value range
permitted by a char entity.

A /string/ is a specific type of character array, which contains zero or
more non-zero char elements, followed by a single char element of zero
(\0). A string with 10 non-zero characters (e.g. "0123456789") is contained
in a character array of 11 elements (i.e.
{'0','1','2','3','4','5','6','7','8','9',0}), with the final element being
the \0 character.

but when i strcpy() a string of 10 characters (strlen()=10) into
name ,it works fine
and shows name[9]=a valid character from the string .
so where did the '\0' character of name[9] gone?????

You overflowed your name[] array.

A string of 10 characters (that is, strlen() returns 10) is stored in a
character array of 11 elements. The first 10 elements contain the non-zero
characters, and the 11th element contains the terminating \0.

When you strcpy()ed the string, you copied the first 10 characters of the
string to elements name[0] to name[9]. Since that's all the space that you
allocated to the name[] array, the next character (the terminating \0)
didn't wind up in the name[] array, but was placed somewhere else. This is
bad - you've introduced a buffer overflow, and caused your program to
exhibit undefined behaviour (which could include successful completion, if
you are lucky).

and why is it not complaining???

Because strcpy() is a function (albeit one implemented by the
standard library), and the size of an array defined outside a function
is not implicitly determinable from within that function (that's how C
works). You, as the programmer, are expected to cope with that, and ensure
that you do not pass strcpy() invalid values.

--
Lew Pitcher

Master Codewright & JOAT-in-training | Registered Linux User #112576
http://pitcher.digitalfreehold.ca/ | GPG public key available by request
---------- Slackware - Because I know what I'm doing. ------
 
Harald van Dijk

Thad Smith said:
[...]
The null character terminates the string, but is defined by the
standard as part of the string:
7.1.1:
"A string is a contiguous sequence of characters terminated by and
including the first null character."

Which means that strlen is, by definition, wrong.....

It means that the length of a string does not mean the length of an array
of char. Similarly, a pointer to a string does not mean a pointer to an
array of char. This is not wrong, but this is more confusing than
necessary.
 
Stephen Sprunk

mohi said:
the problem is :

char name[10];

this defines a array of char with size 10 where 0-8 positions(count 9)
can be characters and name[9] should be '\0' .right?

name[9] will be '\0' if you put a string of length exactly 9 in it. It
may be something else if you put a shorter string in.

but when i strcpy() a string of 10 characters (strlen()=10) into
name ,it works fine
and shows name[9]=a valid character from the string .
so where did the '\0' character of name[9] gone?????

and why is it not complaining???

The '\0' went into name[10], which invokes undefined behavior since it's
not part of the object (which ended at name[9]).

One possible example of undefined behavior is that there is no error and
nothing bad happens. In this particular case, that doesn't surprise me;
there are good odds that the next 2-6 bytes after 'name' are unused (due to
alignment constraints) and the machine never notices you did something
wrong. However, a minor change in your program may result in memory
corruption of other variables, or porting it to stricter systems may
result in an error, or something entirely random may happen for no
specific reason at all. That's the joy of UB.

S
 
santosh

Thad said:
santosh said:
mohi wrote:
char name[10];

this defines a array of char with size 10 where 0-8 positions(count
9) can be characters and name[9] should be '\0' .right?

Not necessarily. Even when storing a string in 'name' you need not
consume all it's elements to do so. A one character string stored
in 'name' is perfectly fine, though it may be a waste of storage.
but when i strcpy() a string of 10 characters (strlen()=10) into
name ,it works fine
and shows name[9]=a valid character from the string .
so where did the '\0' character of name[9] gone?????

The strlen function returns the number of characters it's argument
points to, up to, but *not* including the null character, which is
not a part of the string, though it's needed to create a string.

The null character terminates the string, but is defined by the
standard as part of the string:
7.1.1:
"A string is a contiguous sequence of characters terminated by and
including the first null character."

Thanks for the heads-up.
 
CBFalconer

santosh said:
mohi said:
this one might seem wired but i dont know why iam confused with
this simple stuff. the problem is :

char name[10];

this defines a array of char with size 10 where 0-8 positions
(count 9) can be characters and name[9] should be '\0' .right?

Not necessarily. Even when storing a string in 'name' you need not
consume all it's elements to do so. A one character string stored
in 'name' is perfectly fine, though it may be a waste of storage.

The object name can hold up to 10 chars. If any one of them is a
'\0' then name holds a string, and that string ends with the char
(if any) before that '\0'. If there is no '\0' in the char array,
that array just holds chars, not a string, and should not be passed
to any functions that want string input.

Strings are a data format, and are held in objects of the type
'array of char'.
 
Andy

hello everyone ,
this one might seem wired but i dont know why iam confused with this
simple stuff



the problem is :

char name[10];

this defines a array of char with size 10 where 0-8 positions(count 9)
can be characters and name[9] should be '\0' .right?

but when i strcpy() a string of 10 characters (strlen()=10) into
name ,it works fine
and shows name[9]=a valid character from the string .
so where did the '\0' character of name[9] gone?????

I:\c16>type z2.c
#include <stdio.h>
#include <string.h>

int main() {
    char string1[10];
    char string2[5] = "wxyz";

    printf("string2 = '%s'\n", string2);
    strcpy(string1, "1234567890");
    printf("string1 = '%s'\n", string1);
    printf("string2 = '%s'\n", string2);
    strcpy(string1, "1234567890a");
    printf("string1 = '%s'\n", string1);
    printf("string2 = '%s'\n", string2);

    return 0;
}

I:\c16>tcc z2.c
Turbo C Version 2.01 Copyright (c) 1987, 1988 Borland International
z2.c:
Turbo Link Version 2.0 Copyright (c) 1987, 1988 Borland International

Available memory 389890

I:\c16>z2
string2 = 'wxyz'
string1 = '1234567890'
string2 = ''
string1 = '1234567890a'
string2 = 'a'

I:\c16>

and why is it not complaining???

It does not do any checking.
 
Barry Schwarz

santosh said:
mohi said:
this one might seem wired but i dont know why iam confused with
this simple stuff. the problem is :

char name[10];

this defines a array of char with size 10 where 0-8 positions
(count 9) can be characters and name[9] should be '\0' .right?

Not necessarily. Even when storing a string in 'name' you need not
consume all it's elements to do so. A one character string stored
in 'name' is perfectly fine, though it may be a waste of storage.

The object name can hold up to 10 chars. If any one of them is a
'\0' then name holds a string, and that string ends with the char
(if any) before that '\0'. If there is no '\0' in the char array,

Since the '\0' is part of the string, it ends with the '\0'.

that array just holds chars, not a string, and should not be passed
to any functions that want string input.

Strings are a data format, and are held in objects of the type
'array of char'.


 
Keith Thompson

Lew Pitcher said:
A /string/ is a specific type of character array, which contains zero or
more non-zero char elements, followed by a single char element of zero
(\0). A string with 10 non-zero characters (i.e. "0123456789") is contained
in a character array of 11 elements ( i. e.
{'0','1','2','3','4','5','6','7','8','9',0} ), with the final element being
the \0 character.

Quibble: a string is a specific *kind* of character array. There is
no string *type* in C. (I know you didn't mean "type" in that sense.)

[...]
A string of 10 characters (that is, strlen() returns 10) is stored in a
character array of 11 elements. The first 10 elements contain the non-zero
characters, and the 11'th element contains the terminating \0

Since a string includes its terminating '\0', it would be clearer to
say "A string of length 10" rather than "A string of 10 characters".
The '\0' is one of the 11 characters that make up the string.
 
Lew Pitcher

Quibble: a string is a specific *kind* of character array. There is
no string *type* in C. (I know you didn't mean "type" in that sense.)

Agreed. You knew what I meant, but I should have phrased myself better. Your
phrasing is clearer than mine.

[...]
A string of 10 characters (that is, strlen() returns 10) is stored in a
character array of 11 elements. The first 10 elements contain the
non-zero characters, and the 11'th element contains the terminating \0

Since a string includes its terminating '\0', it would be clearer to
say "A string of length 10" rather than "A string of 10 characters".
The '\0' is one of the 11 characters that make up the string.

Again, agreed.

Thanks for the clarifications :)

--
Lew Pitcher

 
Richard

Keith Thompson said:
Quibble: a string is a specific *kind* of character array. There is
no string *type* in C. (I know you didn't mean "type" in that sense.)

It is a character array. Not a "kind" of anything.

[...]
A string of 10 characters (that is, strlen() returns 10) is stored in a
character array of 11 elements. The first 10 elements contain the non-zero
characters, and the 11'th element contains the terminating \0

Since a string includes its terminating '\0', it would be clearer to
say "A string of length 10" rather than "A string of 10 characters".
The '\0' is one of the 11 characters that make up the string.

 
Keith Thompson

Richard said:
It is a character array. Not a "kind" of anything.
[...]

I don't know what you mean; perhaps you can clarify.

All strings are character arrays. Not all character arrays are
strings. Specifically, a character array without a '\0' is not a
string. For that matter, a character array that contains a '\0' other
than in the final position is not a string, though it does contain
one.

I thought that saying that "a string is a specific *kind* of character
array" was a reasonable way to express that.

(I'm assuming here that a "contiguous sequence of characters",
referred to in the definition of "string" in C99 7.1.1p1, is an array.
I'm not certain that the two are entirely equivalent. But that
doesn't seem to be what you were disputing.)
 
Chris Thomasson

Keith Thompson said:
Richard said:
Keith Thompson said:
[...]
A /string/ is a specific type of character array, which contains zero or
more non-zero char elements, followed by a single char element of zero
(\0). A string with 10 non-zero characters (i.e. "0123456789") is contained
in a character array of 11 elements ( i. e.
{'0','1','2','3','4','5','6','7','8','9',0} ), with the final element being
the \0 character.

Quibble: a string is a specific *kind* of character array. There is
no string *type* in C. (I know you didn't mean "type" in that sense.)

It is a character array. Not a "kind" of anything.
[...]

I don't know what you mean; perhaps you can clarify.

All strings are character arrays. Not all character arrays are
strings. Specifically, a character array without a '\0' is not a
string.


For that matter, a character array that contains a '\0' other
than in the final position is not a string, though it does contain
one.

__________________________________________________________________
#include <stdio.h>
#include <string.h>

int main() {
    char const names[] = "Chris\0Keith\0Richard\0Lew\0Greg\0Lisa";
    char const *pos = names;
    char const *const epos = names + sizeof(names);

    printf("%lu\n", (unsigned long)sizeof(names));

    do {
        printf("pos(%p)->\"%s\"\n", (void*)pos, pos);
        pos += strlen(pos) + 1;
    } while (pos < epos);

    printf("____________________________________________________\n"
           "names(%p) / pos(%p) / epos(%p)\n", (void*)names,
           (void*)pos, (void*)epos);

    return 0;
}

__________________________________________________________________

Is `names' a string, or a string of strings?

[...]
 
Keith Thompson

Chris Thomasson said:
For that matter, a character array that contains a '\0' other
than in the final position is not a string, though it does contain
one.

__________________________________________________________________
#include <stdio.h>
#include <string.h>

int main() {
    char const names[] = "Chris\0Keith\0Richard\0Lew\0Greg\0Lisa";
    char const *pos = names;
    char const *const epos = names + sizeof(names);

    printf("%lu\n", (unsigned long)sizeof(names));

    do {
        printf("pos(%p)->\"%s\"\n", (void*)pos, pos);
        pos += strlen(pos) + 1;
    } while (pos < epos);

    printf("____________________________________________________\n"
           "names(%p) / pos(%p) / epos(%p)\n", (void*)names,
           (void*)pos, (void*)epos);

    return 0;
}

__________________________________________________________________

Is `names' a string, or a string of strings?

It's neither.

A string is, by definition, "a contiguous sequence of characters
terminated by and including the first null character". The array
``names'' is not terminated by the first null character; it continues
after that. You could say that it contains a *sequence* of strings,
but there's no such thing (in C) as a "string of strings".

In fact, the array ``names'' contains, by my count, 34 strings, some
of which overlap and some of which have the same values. Some of
these are: "Chris", "hris", "ris", "is", "s", "", "Keith", "eith",
.....

I still have no idea what Richard Nolastname Riley was complaining
about; he can clarify it or not, as he chooses.
 
Keith Thompson

Keith Thompson said:
Chris Thomasson said:
For that matter, a character array that contains a '\0' other
than in the final position is not a string, though it does contain
one.
...
char const names[] = "Chris\0Keith\0Richard\0Lew\0Greg\0Lisa";
...
Is `names' a string, or a string of strings?

It's neither.
....

What about str after the strcpy call?

char str[256];
strcpy( str, "foo" );

The array contains characters after the zero terminator which might be
non-zero, yet it logically contains the string "foo".

Of course. It *contains* the string "foo" (in its first 4 elements).
That doesn't mean it *is* a string. It *is* an array of 256
characters; those 256 characters do not constitute a string.
 
Chris Thomasson

Keith Thompson said:
Keith Thompson said:
[...]
For that matter, a character array that contains a '\0' other
than in the final position is not a string, though it does contain
one.
...
char const names[] = "Chris\0Keith\0Richard\0Lew\0Greg\0Lisa";
...
Is `names' a string, or a string of strings?

It's neither.
....

What about str after the strcpy call?

char str[256];
strcpy( str, "foo" );

The array contains characters after the zero terminator which might be
non-zero, yet it logically contains the string "foo".

Of course. It *contains* the string "foo" (in its first 4 elements).
That doesn't mean it *is* a string. It *is* an array of 256
characters; those 256 characters do not constitute a string.

Perfect.
 
