[C] simple string question


Leor Zolman

You are absolutely right! I don't read the comp.lang.c ng because there is
nothing to learn there.

Wow. I'm not sure which has more shock value, the statement above or
Janet Jackson's stunt last Sunday...

But moral outrage notwithstanding ;-) , check out Agent for reading
your news. I was pleasantly surprised the first time I hit "post
follow-up message" to a cross-posted post: it immediately informed me
of the cross-posting, and offered me the choice (via several buttons)
of whether I wanted my reply to go to just the current group or to
all. Nice feature.
-leor


Leor Zolman
BD Software
(e-mail address removed)
www.bdsoft.com -- On-Site Training in C/C++, Java, Perl & Unix
C++ users: Download BD Software's free STL Error Message
Decryptor at www.bdsoft.com/tools/stlfilt.html
 

AirPete

Christopher said:
As others pointed out, you missed a '\0' character at the end of the
string.

Which is just a waste of space in a constant length string.
strcpy() does it even more nicely, assuming the destination is
sufficiently large. Among other things, it prevents the above mistake
entirely.

strcpy() relies on the terminating '\0', which was not needed.


- Pete
 

Christopher Benson-Manica

In comp.lang.c AirPete said:
Which is just a waste of space in a constant length string.

An array of characters not followed by a NUL character isn't a
"string" as far as C is concerned. Without the NUL, you can't print
the array or pass it to any of the <string.h> functions. Sounds like
a problem to me.

strcpy() relies on the terminating '\0', which was not needed.

Unless you want to actually *use* the string for something, of course.
 

Leor Zolman

Which is just a waste of space in a constant length string.

I think the problem here is that the OP has not (yet?) clarified
whether his "constant length string" is required to be nul-terminated,
and most of us have been going under the assumption that it is,
because that's the usual MO of C... So I hear your point, but until we
get a clarification on his intent, this horse has pretty much expired.
-leor

 

Gary

Richard Heathfield said:
That's odd. I thought I was a C expert until I started using the comp.lang.c
newsgroup. Its regular contributors taught me an immense amount about the
language.

If you wish to learn about C, perhaps you should look at comp.lang.c a
little harder.

Good Lord, it was a joke. (Well, it was supposed to be. Clearly all ng's
have some teaching value, except for those that purposely misspell stuff,
like warez, etc.)
 

AirPete

Christopher said:
An array of characters not followed by a NUL character isn't a
"string" as far as C is concerned. Without the NUL, you can't print
the array or pass it to any of the <string.h> functions. Sounds like
a problem to me.

If the string is constant length, there isn't much use for <string.h>
functions, anyway.
They mostly modify a string's length, making it not constant length, and
strlen() isn't needed because you already know how long it is.
Unless you want to actually *use* the string for something, of course.

fwrite(string, sizeof(char), sizeof(string)/sizeof(char), stdout);
fread(string, sizeof(char), sizeof(string)/sizeof(char), stdin);
if (memcmp(string, "abcd", 4) == 0) { /* ... */ }

I can't think of much else you would need to do with a constant length
string, and anything else could be /very/ easily written.
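
For readers following along, here is a minimal sketch of what treating the array as raw fixed-width bytes could look like. The names FIELD_LEN, field_set, field_equals and field_print are made up for illustration and do not come from the thread; the only library calls used are memcpy, memcmp and printf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FIELD_LEN 4   /* hypothetical fixed field width */

/* Store exactly FIELD_LEN bytes in the field; no '\0' is written. */
void field_set(char *field, const char *bytes)
{
    assert(field != NULL && bytes != NULL);
    memcpy(field, bytes, FIELD_LEN);
}

/* Compare by explicit length, since there is no terminator to stop at. */
int field_equals(const char *field, const char *bytes)
{
    assert(field != NULL && bytes != NULL);
    return memcmp(field, bytes, FIELD_LEN) == 0;
}

/* Print the field; the precision limits printf to FIELD_LEN bytes,
 * so the missing terminator is never read. Returns printf's result. */
int field_print(const char *field)
{
    assert(field != NULL);
    return printf("%.*s\n", FIELD_LEN, field);
}
```

Every call has to carry the length explicitly, which is the trade-off being argued about here: one byte is saved, but none of the str* functions can be used on the field.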

- Pete
 

Dik T. Winter

> I think the problem here is that the OP has not (yet?) clarified
> whether his "constant length string" is required to be nul-terminated,
> and most of us have been going under the assumption that it is,
> because that's the usual MO of C... So I hear your point, but until we
> get a clarification on his intent, this horse has pretty much expired.

Well, considering the original snippet, which was something like:
char string[4] = {};
string = 'a ';
I would say it was clear (count the number of characters in the
"string"-literal).
 

Christopher Benson-Manica

In comp.lang.c Dik T. Winter said:
char string[4] = {};
string = 'a ';
I would say it was clear (count the number of characters in the
"string"-literal).

That's if we assume the OP realized there was a NUL character to
contend with, which I doubt.
 

Leor Zolman

I think the problem here is that the OP has not (yet?) clarified
whether his "constant length string" is required to be nul-terminated,
and most of us have been going under the assumption that it is,
because that's the usual MO of C... So I hear your point, but until we
get a clarification on his intent, this horse has pretty much expired.

Well, considering the original snippet, which was something like:
char string[4] = {};
string = 'a ';

My point exactly ;-)
I would say it was clear (count the number of characters in the
"string"-literal).

If you also count in the '=' operator and single quotes, it doesn't
add up to a large degree of confidence in the OP's intentions.
-leor




 

Richard Heathfield

Leor said:
But in the case of copying into a "string" (char array) buffer, it
seems to me that copying "too little" is exactly what you'd _want_ it
to do.

Not at all. Think about what "too little" means. It means "not enough",
"less than is good". Imagine a computer program that kept sending letters
addressed to:

Mr Leor Zol
12 West Lex
San Francis

(Made up address, obviously.)
As the subject of the post is "simple string question", I don't
see the point of reading extra requirements into the problem;

Neither did I, but I don't think "make sure you have enough space" counts as
an /extra/ requirement. It's just a requirement, and a fundamental one,
when dealing with any kind of data.
I didn't
see any indication in the OP's (admittedly sparse) code that "string"
was to be used in any way other than as a nul-terminated string,

Nor did I.
and
in that context I'd vote for strcpy/strncpy being the best choice.

I'd agree with strcpy. I'm still not sure why you think strncpy is
appropriate.
IMO, a budding C programmer also needs to understand the concept of
nul-terminated strings and all of the implications of using them,

Of course. Er, I agree, and whoever said anything different?
including fundamental efficiency issues that are in the spirit of C...
I'd place the order of importance of all the things we've discussed
as:
1. not overflowing buffers

Right! So make sure the target buffer has enough room for all the data that
the program will attempt to store in it.
2. doing things efficiently

Yes, and is it not inefficient to learn as one's primary method of copying
strings a technique that can so easily drop important data without notice?
3. filling in dead space (?)

Why is that important?
 

Leor Zolman

Not at all. Think about what "too little" means. It means "not enough",
"less than is good". Imagine a computer program that kept sending letters
addressed to:

Mr Leor Zol
12 West Lex
San Francis

(Made up address, obviously.)

I was basing my comments on the assertion that the OP had intended
(whether he knew it or not) for "string" to be nul-terminated. If
you're in the "he meant const length string to mean there's no
nul-terminator" camp, we've been arguing at cross-purposes, and I
agree with all your points in that context...
Neither did I, but I don't think "make sure you have enough space" counts as
an /extra/ requirement. It's just a requirement, and a fundamental one,
when dealing with any kind of data.

Sorry, if I ever gave the impression I wasn't concerned about making
sure we had enough space, I sure never meant to say that.
Nor did I.


I'd agree with strcpy. I'm still not sure why you think strncpy is
appropriate.

I thought I made that clear as well earlier: In the general case when
you don't know the length of the source string, it is safer than using
strcpy. And I believe I explicitly said it wasn't necessary in this
particular case if you're copying from a fixed string you know isn't
going to be too long.
Of course. Er, I agree, and whoever said anything different?
I never implied you said differently. I was embarking on a new point,
sort of...
Right! So make sure the target buffer has enough room for all the data that
the program will attempt to store in it.


Yes, and is it not inefficient to learn as one's primary method of copying
strings a technique that can so easily drop important data without notice?

Again, wouldn't you only be dropping important data if you didn't
consider string to be nul-terminated? What did the OP _really_ mean by
"const length string"?? That's the crux of the question. I sure wish
he'd answer it. _I_ interpreted it to mean a fixed length char array
containing a "variable length nul-terminated sequence of characters".
Maybe that was assuming too much. When the OP says "Oh, I'm using all
four bytes and not expecting to use them with any functions that
expect nul termination", my assumption will have been proven wrong.
Why is that important?

Because under my assumptions, characters after the first nul and
before the end of the array are dead space. [And yes, I _do_ know what
happens when you "assume" ;-) ]
Cheers,
-leor




 

Richard Heathfield

Leor said:
I thought I made that clear as well earlier:

No, this is the heart of the matter.
In the general case when
you don't know the length of the source string, it is safer than using
strcpy.

No, it isn't. The only safe and correct thing to do, if you don't know the
length of the source string, is to ***find out***.
And I believe I explicitly said it wasn't necessary in this
particular case if you're copying from a fixed string you know isn't
going to be too long.

Sure; I'm talking general case. And, in the general case, if you don't know
whether your target buffer is big enough for all the data you need, strncpy
is *not* a good enough solution. Yes, it's safe enough *if* you use it
right - but then, so is strcpy *if* you use it right. And strcpy copies
/all/ the data you need, not just the first chunk. Therefore, as a general
purpose tool it is superior. (And quicker to type.)

I never implied you said differently. I was embarking on a new point,
sort of...

Fine. Misunderstandings all round! :)
Again, wouldn't you only be dropping important data if you didn't
consider string to be nul-terminated?

Non-issue. If it's not null-terminated, it's not a string. End of story.
What did the OP _really_ mean by
"const length string"??

Who cares? Arguing about strncpy vs. strcpy is far more interesting. :)
Why is that important?

Because under my assumptions, characters after the first nul and
before the end of the array are dead space. [And yes, I _do_ know what
happens when you "assume" ;-) ]

Anything after the null terminating character is of little interest, IMHO.
 

B. v Ingen Schenau

AirPete said:
Which is just a waste of space in a constant length string.

But in C, the term 'string' is defined as 'a contiguous sequence of
characters terminated by and including the first null character.' (C
standard clause 7.1.1/1).
Without the nul-terminator, you can not call it a string, regardless of what
you do with it.
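
A small sketch of that definition in code, assuming a hypothetical helper named holds_string (not a standard function): a buffer of known size holds a string only if a '\0' appears somewhere inside it, which memchr can check without ever reading past the end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the first `size` bytes of buf contain a '\0',
 * i.e. buf holds a string in the standard's sense; 0 otherwise. */
int holds_string(const char *buf, size_t size)
{
    assert(buf != NULL);
    return memchr(buf, '\0', size) != NULL;
}
```

With this, a four-element array filled with 'a', 'b', 'c', 'd' reports 0, while char s[5] = "abcd" reports 1. Note that C accepts char a[4] = "abcd"; and silently drops the terminator, which is one way to end up with the first kind of array by accident.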

Bart v Ingen Schenau
--
a.c.l.l.c-c++ FAQ: http://www.snurse-l.org/acllc-c++/faq.html (currently
unavailable)
a.c.l.l.c-c++ FAQ mirror: http://www.inglorion.com/acllcc++.html
c.l.c FAQ: http://www.eskimo.com/~scs/C-faq/top.html
c.l.c++ FAQ: http://www.parashift.com/c++-faq-lite/
 

Christopher Benson-Manica

In comp.lang.c AirPete said:
They mostly modify a string's length, making it not constant length, and
strlen() isn't needed because you already know how long it is.

Many functions, such as strcmp() and strchr(), in no way affect the
length of their arguments. Of course, you can use memcmp() and
memchr() to get the same results, but there is no mem* analogue to
strstr(). The bottom line is that the byte you save by not appending
a '\0' to character arrays is more than offset by the hoops you have
to jump through to compensate for its absence. I'm really not sure
why you seem to be obsessed with avoiding the str* functions. I'm
certainly willing to listen to valid reasons; I just can't think of
any.
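
To make the "hoops" concrete, here is a rough sketch of the kind of substitute for strstr() one would have to write for unterminated arrays. The name find_bytes and its signature are assumptions for illustration; standard C has no such function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the first occurrence of needle in hay,
 * with both lengths passed explicitly because neither is terminated.
 * Returns a pointer into hay, or NULL if there is no match. */
const char *find_bytes(const char *hay, size_t hay_len,
                       const char *needle, size_t needle_len)
{
    assert(hay != NULL && needle != NULL);

    if (needle_len == 0)
        return hay;              /* like strstr: empty needle matches at the start */
    if (needle_len > hay_len)
        return NULL;

    for (size_t i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    }
    return NULL;
}
```

It is not much code, but every caller now has to track two lengths by hand, where strstr(hay, needle) would have needed none.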
 

Leor Zolman

No, it isn't. The only safe and correct thing to do, if you don't know the
length of the source string, is to ***find out***.

Yes, I believe I see what you mean now re. strncpy, after taking
another look at the Standard's description of it. If the length of the
source text is greater than the capacity of the destination as
conveyed via the size argument, a NUL won't get appended. You're
right, sorry.
Sure; I'm talking general case. And, in the general case, if you don't know
whether your target buffer is big enough for all the data you need, strncpy
is *not* a good enough solution.

In this discussion, situations where you don't know the size of the
target (that means destination, right? Just making sure...) weren't
even on my radar screen. I was only talking about not knowing the size
of the source.
Yes, it's safe enough *if* you use it
right - but then, so is strcpy *if* you use it right. And strcpy copies
/all/ the data you need, not just the first chunk. Therefore, as a general
purpose tool it is superior. (And quicker to type.)

Terms like "data you need" start to get rather subjective; if you saw
a piece of code such as:
strncpy(dest, source, 50);
it would be hard to argue that the coder "needs" stuff past the first
50 characters. But I have a hard time thinking of strcpy as "safer"
when it could easily lead to a buffer overrun, whereas with strncpy
the worst thing likely to happen (assuming the size you provide is
actually sufficient for the destination buffer you're providing), not
that it would be "acceptable", would be a NUL not getting written.

I think of strcpy being to strncpy as gets is to fgets.
Fine. Misunderstandings all round! :)

yah, I thought I read and understood what the docs were saying about
strncpy and I wasn't quite with it there with them. Sorry again.
Non-issue. If it's not null-terminated, it's not a string. End of story.

I didn't say "a string", I said "string" (as in the OP's array of that
name...and as of this writing, unless more has landed from him while I
was composing this, we still don't know what his thinking was in that
regard.)
What did the OP _really_ mean by
"const length string"??

Who cares? Arguing about strncpy vs. strcpy is far more interesting. :)

Heh.
3. filling in dead space (?)

Why is that important?

Because under my assumptions, characters after the first nul and
before the end of the array are dead space. [And yes, I _do_ know what
happens when you "assume" ;-) ]

Anything after the null terminating character is of little interest, IMHO.

Which was my rationale for preferring strcpy over the other functions;
it doesn't end up taking time to fill in that area that is of little
interest.
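
A quick way to see the difference, sketched under the assumption of an 8-byte buffer (BUF_SIZE and untouched_bytes are illustrative names only): fill the buffer with a marker byte, copy a short string in, and count how many marker bytes survive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 8 };   /* hypothetical buffer size for the demonstration */

/* Fill a BUF_SIZE buffer with '#', copy src into it with strcpy or
 * strncpy, and return how many '#' bytes are left untouched afterwards.
 * src must be shorter than BUF_SIZE and must not contain '#'. */
size_t untouched_bytes(const char *src, int use_strncpy)
{
    char buf[BUF_SIZE];
    size_t left = 0;

    assert(src != NULL && strlen(src) < sizeof buf);
    memset(buf, '#', sizeof buf);

    if (use_strncpy)
        strncpy(buf, src, sizeof buf);  /* writes all BUF_SIZE bytes, padding with '\0' */
    else
        strcpy(buf, src);               /* writes strlen(src) + 1 bytes and stops */

    for (size_t i = 0; i < sizeof buf; i++)
        if (buf[i] == '#')
            left++;
    return left;
}
```

For a two-character source, strcpy leaves five marker bytes alone, while strncpy overwrites all of them with padding.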

Take care,
-leor


 

Richard Heathfield

Leor said:
Yes, I believe I see what you mean now re. strncpy, after taking
another look at the Standard's description of it. If the length of the
source text is greater than the capacity of the destination as
conveyed via the size argument, a NUL won't get appended. You're
right, sorry.

That's one of the problems with strncpy. There are plenty more. For a start,
what if you incorrectly specify the third parameter? (It happens, believe
me.)
In this discussion, situations where you don't know the size of the
target (that means destination, right? Just making sure...)
Right.

weren't
even on my radar screen. I was only talking about not knowing the size
of the source.

Finding out how big the source is, is easy. strlen() does it. Presumably you
know how big your target is. So call strlen on your source, and check that
the receiving buffer is big enough. If it isn't, well, Houston, we have a
problem. We might have been clever enough to make the target buffer
dynamic, in which case we can resize. If not, well, we're stuck.
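
As a sketch of that check (copy_if_fits is a hypothetical name, and returning -1 is just one way to report "Houston, we have a problem"):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst only if all of it fits, terminator included.
 * Returns 0 on success, or -1 (leaving dst untouched) if dst is too small. */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    if (strlen(src) >= dst_size)
        return -1;          /* caller decides: resize, reject, report */

    strcpy(dst, src);       /* safe: we just proved it fits */
    return 0;
}
```

Nothing is ever silently truncated; the caller either gets the whole string or an error to act on.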

Terms like "data you need" start to get rather subjective; if you saw
a piece of code such as:
strncpy(dest, source, 50);
it would be hard to argue that the coder "needs" stuff past the first
50 characters,

It would be hard to argue that the coder thought the problem through. If he
only wants the first 50 characters, why can the source buffer hold more
than 50? (If it can't, strcpy will work fine.) And why not just nail it:

source[49] = '\0';
strcpy(dest, source);

In the majority of cases (i.e. strlen(source) < 50), this code will be
quicker, since it will have to write fewer bytes. And it's never slower.
But I have a hard time thinking of strcpy as "safer"
when it could easily lead to a buffer overrun,

Only if you are silly enough not to check that your receiving buffer is big
enough. And so can strncpy lead to a buffer overrun, for the same reason.

whereas with strncpy
the worst thing likely to happen (assuming the size you provide is
actually sufficient for the destination buffer you're providing), not
that it would be "acceptable", would be a NUL not getting written.

That's enough. But look at just a few of the things that could go wrong:

strncpy(target, source, sizeof target); /* possibly no null terminator */
strncpy(target, source, sizeof source); /* tyop in third parameter! */
strncpy(target, source, strlen(target)); /* less data than you hoped? */
strncpy(target, source, sizeof target + 1); /* instead of - 1 */

Three of these four can lead to a buffer overrun.
I think of strcpy being to strncpy as gets is to fgets.

The comparison is invalid. The gets() function /cannot/ be used safely,
whereas strcpy can be. It's a sharp tool, and you can cut yourself on it if
you're not careful, but it is a powerful and legitimate tool in the hands
of a competent practitioner.

The fgets/strncpy comparison is probably fair. I rarely use fgets() in real
code, because it's too awkward.

Have a look at http://users.powernet.co.uk/eton/c/fgetdata.html if you want
to see what I wrote for use in "quickie" programs. For real programs, I use
the CLINT library - http://www.rjgh.co.uk/prg/c/wnn/index.php - so that my
target buffer is *always* big enough (because it stretches).
 

nrk

Richard said:
That's one of the problems with strncpy. There are plenty more. For a
start, what if you incorrectly specify the third parameter? (It happens,
believe me.)

The first problem is a strawman. Since the third argument is well known,
and due to the way the standard specifies strncpy must behave, you only have
to check and see if dst[n-1] is '\0' or not, to tackle this situation.

The second is a bogeyman that you can use to scare anyone off any library
function. Idiots can wreak havoc with anything. By your logic, we should
probably never use snprintf, fgets, fread, fwrite, memcpy, memmove,
strncmp, strncat, malloc, realloc and such like. I am not saying that this
mistake is not made, but only that a fear of such mistakes will result in
total paralysis that prevents you from writing any meaningful program. C
does not have bounds checking. You should be aware of the risks you take
when you program in this language.

strncpy is pretty useful when you know that most of the time your input is
going to be less than a certain amount of characters. It can be used
safely by those who pay due care to what it is supposed to do.
Finding out how big the source is, is easy. strlen() does it. Presumably
you know how big your target is. So call strlen on your source, and check
that the receiving buffer is big enough. If it isn't, well, Houston, we
have a problem. We might have been clever enough to make the target buffer
dynamic, in which case we can resize. If not, well, we're stuck.

If I know that most of the time my input is between 18-20 characters, why
waste time with strlen, malloc and strcpy? I would go for an array of 21
characters, try to strncpy 21 characters into it, and check if array[20] is
'\0' or not after the strncpy to see if I've hit the rare case.
Terms like "data you need" start to get rather subjective; if you saw
a piece of code such as:
strncpy(dest, source, 50);
it would be hard to argue that the coder "needs" stuff past the first
50 characters,

It would be hard to argue that the coder thought the problem through. If
he only wants the first 50 characters, why can the source buffer hold more
than 50? (If it can't, strcpy will work fine.) And why not just nail it:

source[49] = '\0';
strcpy(dest, source);

I qualify my input parameters with const as far as possible. Modifying the
source unnecessarily is not only not an option, but is also bad style in my
books. Also, if this is a solution, so is:
dst[49] = 0;
strncpy(dst, src, 49);
In the majority of cases (i.e. strlen(source) < 50), this code will be
quicker, since it will have to write fewer bytes. And it's never slower.


Only if you are silly enough not to check that your receiving buffer is
big enough. And so can strncpy lead to a buffer overrun, for the same
reason.

I can twist that argument around to tackle your bogeyman "third argument
incorrect" argument against strncpy: "Only if you're silly enough to pass
the wrong size for the receiving buffer", as you yourself go on to point
out. So, there we go: that argument is a bogeyman by your own admission
:)

Barring that bogeyman argument, if I used strncpy, I *don't* have to check.
All I have to check is that the src was no longer than I expected, which
can be done in a very straight-forward and simple manner. In fact, you can
(and I do), wrap these operations into a function and use it safely. IMHO,
creating a buffer overrun with strncpy is less likely than with strcpy.
YMMSTV. Of course, if you always wanted all of the source regardless of
size, well, that's what strcpy is for :)
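
A minimal sketch of such a wrapper, using the 20-character case from earlier in this post. EXPECTED_MAX and copy_expected are illustrative names, and terminating the buffer in the overflow case is an added choice, not something stated above.

```c
#include <assert.h>
#include <string.h>

enum { EXPECTED_MAX = 20 };   /* hypothetical: longest input normally expected */

/* dst must have room for EXPECTED_MAX + 1 chars.
 * Returns 0 if all of src fit (dst is then a valid string),
 * or -1 if src was longer than EXPECTED_MAX (the rare case). */
int copy_expected(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    /* strncpy pads with '\0' up to n when src is shorter than n, so the
     * last byte is '\0' exactly when src had at most EXPECTED_MAX chars. */
    strncpy(dst, src, EXPECTED_MAX + 1);

    if (dst[EXPECTED_MAX] != '\0') {
        dst[EXPECTED_MAX] = '\0';   /* assumption: keep dst a valid string anyway */
        return -1;
    }
    return 0;
}
```

The single comparison on the last byte answers both questions at once: whether there was enough room, and whether dst is a valid string.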
That's enough. But look at just a few of the things that could go wrong:

strncpy(target, source, sizeof target); /* possibly no null terminator */

strawman.

strncpy(target, source, sizeof source); /* tyop in third parameter! */

bogeyman.

strncpy(target, source, strlen(target)); /* less data than you hoped? */

bogeyman. malloc(strlen(target))... One sees the mushroom cloud.

strncpy(target, source, sizeof target + 1); /* instead of - 1 */

Nope. sizeof target is perfectly fine. It is because of misunderstanding
strncpy that you're playing around with the -1, +1 stuff. Also, if you
ever see that code and don't realize you're making a mistake, you probably
shouldn't be programming in C. (That's a metaphorical you, not "you" RJH).
Three of these four can lead to a buffer overrun.

strcpy(dst, src);

gives me no information about possible errors. At least, strncpy tells me
how much will be written.
The comparison is invalid. The gets() function /cannot/ be used safely,
whereas strcpy can be. It's a sharp tool, and you can cut yourself on it
if you're not careful, but it is a powerful and legitimate tool in the
hands of a competent practitioner.

Yes. That is not a fair comparison. But strncpy is a perfectly safe and
useful function, that you should try to use if appropriate, just as you
should use strcpy when appropriate.
The fgets/strncpy comparison is probably fair. I rarely use fgets() in
real code, because it's too awkward.

This comparison is also not fair. fgets is idiotic. It may or may not
leave a newline in your buffer and may or may not consume a complete line
of input. All the while, it gives you the false illusion that you can use
it to read one "line" from the input stream. There is no easy way to check
how much exactly was read by fgets. If I had a choice, I would make fgets
return the number of characters read instead of uselessly returning the dst
pointer. scanf is complicated and its usage is error-prone (by me
at least). However, I usually try to take the time to get a working scanf
solution when it is feasible and shun fgets. strncpy on the other hand
suffers none of those drawbacks. You know exactly, in one comparison,
whether you had enough space for all of the source, and whether your dst is
a valid string or not.
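
For comparison, the usual workaround for the fgets ambiguity looks roughly like the sketch below. read_line and its return codes are made up for illustration, and it relies on strlen, so it assumes the input contains no embedded '\0' bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one chunk with fgets and report what actually happened.
 * Returns  1 if a complete line was read (the newline is stripped),
 *          0 if only part of a line fit, or the input ended without '\n',
 *         -1 on end of file or error with nothing read. */
int read_line(char *buf, size_t size, FILE *in)
{
    size_t len;

    assert(buf != NULL && in != NULL && size > 1);

    if (fgets(buf, (int)size, in) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    return 0;
}
```

That is an extra pass over the buffer plus a three-way result to handle, against the single comparison strncpy needs.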

My point is that all library functions have pros and cons. Some like gets
are hopelessly broken. But being dogmatic about rejecting one and favoring
the other without thinking the issues through is ridiculous. So, yes,
there are good, valid situations where strncpy is an excellent fit for the
problem. Don't blindly reject it in favor of strcpy.

-nrk.
 

Alan

Alan said:
hi all,

I want to define a constant length string, say 4
then in a function at some time, I want to set the string to a constant
value, say a
below is my code but it fails
what is the correct code?
many thx!


char string[4] = {0};

string = 'a '; /* <-- failed */

sorry, maybe I did not make the question clear so there are confusions and
discussions about the "nul-terminated". In fact, I did not take too much
notice about the "nul-terminated" I must admit.

what I want to do is,
copy characters from some fixed positions at a source file, and then write
those fixed length characters to a new binary file. And there are times that
I assign values directly to those fixed length characters instead of reading
from a source file and then write to the binary file.
After writing the binary file, I will read the fixed length characters based
on the length of characters.
Here is how I do it:

char array[10] = {0};

....
/* get the beginning position of the characters from source file */
fgets(line, sizeof(line), sourceFile);
strncpy(array, line+27, 10); // and copy to "array"
array[10] = 0;

....
strcpy(array, "abcde "); // assign values myself

....
if (fwrite(&array, sizeof(array), 1, binaryFile) != 1)
{
printf("Error writing array to binary file!\n");
}
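
One thing worth noting about the snippet above: array has ten elements, so array[10] = 0 writes one byte past its end. If the ten characters are only ever written out and read back by length, no terminator is needed at all, and the copy could be sketched as below. extract_field is a hypothetical helper, and failing on short lines (rather than padding) is an assumption about what is wanted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `width` bytes starting at `offset` in the string `line` into
 * `field`. No '\0' is written: `field` is raw bytes of exactly `width`.
 * Returns 0 on success, -1 if the line is too short to hold the field. */
int extract_field(const char *line, size_t offset, char *field, size_t width)
{
    size_t len;

    assert(line != NULL && field != NULL);

    len = strlen(line);
    if (offset > len || width > len - offset)
        return -1;

    memcpy(field, line + offset, width);
    return 0;
}
```

With char array[10], a call such as extract_field(line, 27, array, sizeof array) fills all ten bytes and never touches an eleventh, and the existing fwrite of sizeof(array) bytes then writes exactly the field.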
 

CBFalconer

.... self-recursively about strncpy, strcpy, strlen, gets, etc ...

As far as I am concerned a better solution exists in the BSD
strlcpy and strlcat routines. I have published an implementation
(see URL below, download section), and that includes references to
the original BSD description and rationale. They are much harder
to misuse, and generally will do just what you wish.
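
For readers who have not met strlcpy, the sketch below illustrates its documented behaviour: copy at most size - 1 characters, always terminate when size is nonzero, and return the length of the source so truncation can be detected. It is a from-scratch illustration under the made-up name sketch_strlcpy, not the published implementation mentioned above and not the BSD source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative strlcpy-like copy (hypothetical name, to avoid clashing
 * with a system strlcpy). Returns strlen(src); if the result is >= size,
 * the copy was truncated. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len;

    assert(src != NULL);
    assert(dst != NULL || size == 0);

    src_len = strlen(src);
    if (size > 0) {
        size_t n = (src_len < size - 1) ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';      /* always terminated, unlike strncpy */
    }
    return src_len;
}
```

A caller would typically write if (sketch_strlcpy(buf, s, sizeof buf) >= sizeof buf) to detect truncation, which gives the bound of strncpy and the guaranteed terminator of strcpy in a single call.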
 

Mac

Mac said:
E. Robert Tisdale said:
Alan wrote:

I want to define a constant length string, say 4
then, in a function at some time,
I want to set the string to a constant value,
say a below is my code but it fails.
What is the correct code?

char string[4] = {0};

string = 'a '; /* <-- failed */

I'm going to assume that you really meant an array of characters.
In C, a *string* must be terminated by a nul character '\0'.

The Alan's code does terminate the string.

Why did you change "The OP's" to "The Alan's?" I'm sure I make grammatical
mistakes from time to time, but I would appreciate it if you did not add
new ones when quoting me.
More precisely, it leaves it *empty*.
The assignment is obviously wrong,
but the initialization leaves the string terminated.
#include <string.h>

char array[4] = {0, 0, 0, 0};

This is a verbose way of doing the same thing as the OP.
Yes.
memcpy(array, "a ", 4);

This leaves array as an array of chars, as you said.
I just need to emphasize to Alan that
that is probably not a good idea.

Well, at least you didn't change the sentence to a grammatically incorrect
one.
Agreed, if Alan really believes that
his string array is a character *string*.
His right-hand-side 'a ' contains four non-nul characters
so I can only assume that he believes string is really
just an array of four characters and *not* a string.
Actually, I think that Alan has *not* decided this point
and is still confused.

I agree that the OP is probably confused.

--Mac
 
