Constant strings

BartC

Malcolm McLean said:
And you very rarely want to call strchr with an explicit string literal, but
you quite often want to parse a string literal passed from above.
The poor rules for string literals are seldom much of a practical problem,
because if in a real program a function modifies a string passed to it, the
caller is going to want to examine the result. So he can't pass a string
literal.
const rules mean that you have to have two versions of strchr, which isn't
too bad. But they also mean that anyone writing a strchr-like function to
parse a string needs to write two versions. That's unacceptable.

In general, you need one function taking const char*, as that will accept,
without errors, either char* or const char* arguments.

The problem in this example arises when you return the same input string, or a
slice of it: then you want the output type to match that of the argument,
either const char* or char*. That's tricky to do with just the one function
and its one return type.

This is a more general problem with 'const': it can quickly turn into a
nuisance if you strive to do the right thing everywhere, and it can
distract you from other things. (Perhaps it's better just to accept that you
are programming in C, after all, where you need to live a little more
dangerously.)

It doesn't help that as C implements it (a necessity, given its fairly low
level and its role in implementing everything else), any type (parameter
type etc) can be arbitrarily complex and 'const's can appear anywhere in the
type structure: const/non-const pointers to arrays of const/non-const
elements, or to structs of a mix of const/non-const elements including
const/non-const pointers to other const/non-const structs and arrays...

Even if a compiler can untangle it all, frequently a human can't!
 
Malcolm McLean

No, they don't.

They would only need two versions if one were to modify the input. If
that were the case, two functions with different names would be in order.
Take the following function

char *getnextcsvfield(char *str)

CSV files are basically lists of numbers or strings separated by commas; however
there are rules for quotes and escapes which make the function not entirely
trivial to write.
Now the vast majority of people are going to get the line, use the function
to iterate over the fields, and call strtod or similar to extract the data.
But just occasionally you'll get someone who wants to modify the field in
place. That's legitimate.
So if the function is specified

const char *getnextcsvfield(const char *str)

He's got to cast the constness away. Allowed, but driving a coach and horses
through the system.
If the function is specified

char *getnextcsvfield(char *str)

we can't pass it a string literal. Not that in reality anyone would want to do
that anyway, except for testing. But a nuisance.

So the only option is two functions, one taking a const and one not. That's also
unacceptable.
This is the sort of thing that makes languages which attempt to be a safer C
difficult to use. The programmer is constantly programming around the
restrictions.
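
As a rough illustration of the common case described above (walk the fields and convert each one with strtod), here is a minimal sketch. The names next_csv_field and sum_csv_fields are hypothetical, and the quote and escape rules are deliberately ignored, so this is not a real CSV parser. Nothing in this usage needs write access, so every pointer can be a const char *.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for getnextcsvfield(): returns a
   pointer to the field after the next comma, or NULL if there is none.
   Quotes and escapes are NOT handled. */
const char *next_csv_field(const char *str)
{
    const char *comma = strchr(str, ',');
    return comma != NULL ? comma + 1 : NULL;
}

/* Sums the numeric fields of one line; strtod() stops at the comma. */
double sum_csv_fields(const char *line)
{
    double total = 0.0;
    for (const char *field = line; field != NULL;
         field = next_csv_field(field))
        total += strtod(field, NULL);
    return total;
}
```

A caller would pass a whole line, for example sum_csv_fields(line), and never needs a writable pointer into it.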
 
Ike Naar

And you very rarely want to call strchr with an explicit string literal, [...]

I sometimes use

if (strchr("KMGTP",c))

as a shorthand for

if (c=='K' || c=='M' || c=='G' || c=='T' || c=='P')
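
One caveat with this shorthand: strchr treats the terminating null character as part of the string, so strchr("KMGTP", '\0') returns a non-null pointer, whereas the chain of comparisons is false for c == '\0'. A minimal sketch of a wrapper that keeps the two forms equivalent follows; the name is_size_suffix is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true only for one of the five suffix letters.
   The explicit '\0' check is needed because strchr() also matches the
   string's terminating null character. */
int is_size_suffix(int c)
{
    return c != '\0' && strchr("KMGTP", c) != NULL;
}
```

If c can never be '\0' at the call site, the bare strchr test behaves the same as the comparison chain.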
 
Kaz Kylheku

And you very rarely want to call strchr with an explicit string literal, [...]

I sometimes use

if (strchr("KMGTP",c))

as a shorthand for

if (c=='K' || c=='M' || c=='G' || c=='T' || c=='P')

On the same note, I just remembered something. Check out this utility:

http://www.kylheku.com/cgit/c-snippets/tree/autotab.c

It has tons of strchr calls over literals to do set testing, just like
your shorthand above.

This is a little program which analyzes a file to determine the tabstop,
expandtab and shiftwidth settings for Vim.
 
Barry Schwarz

Take the following function

char *getnextcsvfield(char *str)

CSV files are basically lists of numbers or strings separated by commas; however
there are rules for quotes and escapes which make the function not entirely
trivial to write.
Now the vast majority of people are going to get the line, use the function
to iterate over the fields, and call strtod or similar to extract the data.
But just occasionally you'll get someone who wants to modify the field in
place. That's legitimate.
So if the function is specified

const char *getnextcsvfield(const char *str)

A function named get... should never modify whatever source data it is
processing.

A function which can modify the source data should never deceive the
user with a const.
 
Keith Thompson

Or if one were to permit the caller to modify the input.
Take the following function

char *getnextcsvfield(char *str)

CSV files are basically lists of numbers or strings separated by commas; however
there are rules for quotes and escapes which make the function not entirely
trivial to write.
Now the vast majority of people are going to get the line, use the function
to iterate over the fields, and call strtod or similar to extract the data.
But just occasionally you'll get someone who wants to modify the field in
place. That's legitimate.
So if the function is specified

const char *getnextcsvfield(const char *str)

He's got to cast the constness away. Allowed, but driving a coach and horses
through the system.
If the function is specified

char *getnextcsvfield(char *str)

we can't pass it a string literal. Not that in reality anyone would want to do
that anyway, except for testing. But a nuisance.

Well, in C you *can* pass it a string literal, because string literals
aren't const, but yes, if string literals were const (which I personally
would greatly prefer), then you wouldn't be able to pass a string
literal.

If the intent of the function is to allow the string to be modified, why
*should* you be able to pass a string literal to it?
So the only option is two functions, one taking a const and one
not. That's also unacceptable.

One function that lets you modify the string you pass to it, and another
that doesn't. Two different functions *with significantly different
semantics*. What's unacceptable about that?
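
As a sketch of what such a pair could look like: a read-only lookup, plus a separate setter that takes a writable char * and says so in its signature. The names csv_field and csv_set_field are hypothetical, quoting rules are ignored, and the setter only handles a replacement of exactly the same length as the old field, since anything else would mean shifting the rest of the string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical read-only lookup: start of field `index` (0-based),
   or NULL if the line has too few fields. Ignores quoting. */
const char *csv_field(const char *line, size_t index)
{
    while (index-- > 0) {
        line = strchr(line, ',');
        if (line == NULL)
            return NULL;
        line++;                 /* step past the comma */
    }
    return line;
}

/* Hypothetical setter: overwrites a field in place, but only when the
   replacement has exactly the same length. Returns 1 on success. */
int csv_set_field(char *line, size_t index, const char *replacement)
{
    /* The cast is safe here because `line` itself is writable. */
    char *field = (char *)csv_field(line, index);
    if (field == NULL)
        return 0;
    size_t old_len = strcspn(field, ",");
    if (strlen(replacement) != old_len)
        return 0;
    memcpy(field, replacement, old_len);
    return 1;
}
```

The cast lives in one place, inside the function whose parameter type already advertises that the string may be modified.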

And how often are you going to want to modify data in place in a string
containing CSV data? You can only do that if the replacement data is
exactly the same length as the original data; you can't change
"...,9,..." to "...,10,..." without shifting the rest of the string and
perhaps allocating extra memory. I'd say it's perfectly acceptable in
this case to provide *only* a function that doesn't let you modify
anything. (And if you really need to modify the string, you can either
write that second function or *carefully* use casts to override the
"const".)
This is the sort of thing that makes languages which attempt to be a
safer C difficult to use. The programmer is constantly programming
around the restrictions.

So what's your solution? Would you drop "const" from the language?
Suppose you have a function that modifies a string, and you accidentally
pass it a string literal; do you not *want* the compiler to warn you?
(C compilers typically won't warn in that case, but I don't think that's
an argument for loosening the const rules.)
 
Stefan Ram

Barry Schwarz said:
A function named get... should never modify whatever source data it is
processing.

»getc«, »getchar« and »gets« (»gets_s«) all seem to advance
the associated file position indicator for the stream used.
 
BartC

Keith Thompson said:
So what's your solution? Would you drop "const" from the language?

Am I allowed to say Yes? I don't think it would be missed. It's easy enough
though to just not use it, or to remove const keywords with an editor (then
maybe the code will be a lot clearer!).
Suppose you have a function that modifies a string, and you accidentally
pass it a string literal; do you not *want* the compiler to warn you?

There are plenty of other things to worry about. If you are calling a
function which does in-place modifications of an argument, then you need to
know about it, as you might not want that to happen even if your string is
writeable; you just don't want it written to by that function.

Modification of a string literal at least has a chance of being detected by
the hardware.

Using 'const's might just be giving a false sense of security.
 
Kaz Kylheku

Am I allowed to say Yes? I don't think it would be missed.

Dennis Ritchie wasn't crazy about qualifiers either, so you wouldn't be in bad
company.

See http://www.lysator.liu.se/c/dmr-on-noalias.html

"Let me begin by saying that I'm not convinced that even the pre-December
qualifiers (`const' and `volatile') carry their weight; I suspect that what
they add to the cost of learning and using the language is not repaid in
greater expressiveness."
 
Barry Schwarz

»getc«, »getchar« and »gets« (»gets_s«) all seem to advance
the associated file position indicator for the stream used.

But they don't alter the data extracted from the stream.
 
Keith Thompson

BartC said:
Am I allowed to say Yes?

Certainly -- and I'm allowed to say that I completely disagree.

(Incidentally, the question was addressed to Malcolm, which in no way
implies that your input is unwelcome.)
I don't think it would be missed. It's easy enough
though to just not use it, or to remove const keywords with an editor (then
maybe the code will be a lot clearer!).


There are plenty of other things to worry about. If you are calling a
function which does in-place modifications of an argument, then you need to
know about it, as you might not want that to happen even if your string is
writeable; you just don't want it written to by that function.

Exactly -- and you can express that by defining that function's
parameter as "const char*".

char s[] = "hello"; /* s is writable */
strlen(s); /* strlen() promises not to modify the string */
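
The same promise can be made by any user-written function. A minimal sketch, with the hypothetical name count_char: because the parameter is a const char *, an accidental write through it is a constraint violation that the compiler must diagnose, while the caller remains free to pass a writable array.

```c
#include <stddef.h>

/* Hypothetical example: counts occurrences of c in s.
   The const in the parameter type is the promise not to modify *s. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == c)
            n++;
        /* *s = c;   <- would be rejected: assignment through a
                        pointer to const-qualified char */
    }
    return n;
}
```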
Modification of a string literal at least has a chance of being detected by
the hardware.

Yes, but not at compile time. I like to detect errors as early as
possible.
Using 'const's might just be giving a false sense of security.

I might be hit by a meteorite; why bother to wear a seatbelt?
 
Seungbeom Kim

const rules mean that you have to have two versions of strchr, which isn't
too bad. But they also mean that anyone writing a strchr-like function to
parse a string needs to write two versions. That's unacceptable.

Still a nuisance, but not too bad as one version can often delegate
the actual work to the other version and just add a cast. For example:

const char *strchr_c(const char *str, int c) { /* actual work */ }
char *strchr_nc(char *str, int c) {
    return (char *)strchr_c(str, c);
}

// or

char *strchr_nc(char *str, int c) { /* actual work */ }
const char *strchr_c(const char *str, int c) {
    return strchr_nc((char *)str, c);
}

(Casting in itself is typically a no-op, and the compiler may be smart
enough to inline the delegating function or make it into a thunk.)

With generics, a single function (such as std::find in C++) works with
many different types, including const/non-const; needing const/non-const
versions for strchr is just analogous to needing float/double/long double
versions for each math function.
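
In C itself, C11's _Generic can put a single name in front of the two versions, much as <tgmath.h> does for the math functions. The following is only a sketch under that assumption (a C11 compiler): strchr_c and strchr_nc are the delegating pair from above, given a simple body here, and the macro name strchr_g is hypothetical.

```c
#include <stddef.h>

/* Read-only version does the actual work. */
const char *strchr_c(const char *str, int c)
{
    for (;; str++) {
        if (*str == (char)c)
            return str;
        if (*str == '\0')
            return NULL;
    }
}

/* Writable version delegates and casts the result back. */
char *strchr_nc(char *str, int c)
{
    return (char *)strchr_c(str, c);
}

/* Hypothetical type-generic front end: selects a version from the
   argument's type. &(s)[0] turns an array argument into a pointer. */
#define strchr_g(s, c) \
    _Generic(&(s)[0], const char *: strchr_c, char *: strchr_nc)((s), (c))
```

With this, strchr_g on a char array yields a char * that can be written through, while strchr_g on a const char * yields a const char *, so the return type follows the argument.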
 
Seungbeom Kim

A function named get... should never modify whatever source data it is
processing.

And we're not talking about such a function that does; it's just a matter
of allowing the *caller* to modify the source data through the return value.
 
BartC

Keith Thompson said:
Exactly -- and you can express that by defining that function's
parameter as "const char*".

That won't work:

(1) The function *needs* to modify the thing pointed to by its argument

(2) It might not be your function to modify

Nothing to do with 'const' anymore, just awareness of the side-effects of a
function. If you're going to pass it a writeable string that you don't want
modified, then const is not going to help. (Maybe pass it a copy, use
another function, or just be aware it could be modified. Or if it is a
literal you're thinking of passing, to think twice about that!)
char s[] = "hello"; /* s is writable */
strlen(s); /* strlen() promises not to modify the string */

In the case of strlen(), a short note in the function docs can state the
same thing.
Yes, but not at compile time. I like to detect errors as early as
possible.

But as I showed in my OP, potential errors such as 'char* q="ABC"' are not
picked up unless using an obscure gcc option, which isn't included in -Wall
and -Wextra, and in other compilers may not be detectable at all.
I might be hit by a meteorite; why bother to wear a seatbelt?

This is exactly my point: there might be so many signs everywhere warning
about meteorites, that they obscure the more ordinary ones, like stop signs.
 
Rosario193

As far as I can gather from my experiment below, a string constant in source
code has a 'char*' type, not 'const char*'. Why is that?

Here, I can't get a compiler to complain about passing a string constant as
a char* parameter where it is clearly going to be modified. But it doesn't
like the q=p line which does the same. The q="Bart" line shows up the issue
more simply:

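char* change_initial(char* s,char c){
    *s=c;
    return s;
}

int main (void) {
    const char *p;
    char* q;

    change_initial("Bart",'C'); /* no complaint, yet it writes to a literal */

    q=p;        /* the compiler complains: const is discarded */
    q="Bart";   /* no complaint */
    *q='C';     /* writes to a string literal: undefined behaviour */

}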

"const" would exist for declare memory space too and not one type
so "const int *p;" means that p can point only to memory that is const

Happy Easter to all
 
Rosario193

"const" would exist for declare memory space too and not one type
so "const int *p;" means that p can point only to memory that is const

Happy Easter to all

But if one thinks about it more carefully, where is the const memory in a PC?
It does not exist... Memory is either readable and writable, or neither
readable nor writable, at least I think so.
 
Keith Thompson

BartC said:
That won't work:

(1) The function *needs* to modify the thing pointed to by its argument

Then I must have misunderstood. You said "you just don't want it
written to by that function". Can you clarify? Does the function
modify the string or not?
(2) It might not be your function to modify

A function with a char* parameter pointing to a string that it does not
modify *should* declare that parameter as "const char*". If it doesn't, and
if you're not able to fix it, then you'll just have to deal with that.

[...]
char s[] = "hello"; /* s is writable */
strlen(s); /* strlen() promises not to modify the string */

In the case of strlen(), a short note in the function docs can state the
same thing.

Sure, but it doesn't have to, because we have "const".
But as I showed in my OP, potential errors such as 'char* q="ABC"' are not
picked up unless using an obscure gcc option, which isn't included in -Wall
and -Wextra, and in other compilers may not be detectable at all.

Right, that's one special case, already explained and repeatedly
acknowledged as a flaw in the language. The solution is to remember to
add the const yourself.
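
A minimal sketch of that habit, reusing change_initial from the experiment quoted earlier in the thread. The names NAME and renamed_copy are hypothetical. With the const written by hand, passing NAME straight to change_initial would need a cast or draw a diagnostic, so the mistake shows up at compile time. gcc also offers -Wwrite-strings, which for C gives string literals the type const char[] so that such assignments are diagnosed; that may be the option referred to above, though that is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* From the experiment quoted earlier in the thread. */
char *change_initial(char *s, char c)
{
    *s = c;
    return s;
}

/* const added by hand, since C does not add it to the literal. */
static const char *const NAME = "Bart";

/* Hypothetical helper: copies NAME into a writable buffer and
   modifies the copy. Returns NULL if the buffer is too small. */
char *renamed_copy(char *buf, size_t size)
{
    if (size < strlen(NAME) + 1)
        return NULL;
    strcpy(buf, NAME);
    return change_initial(buf, 'C');
}
```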
This is exactly my point: there might be so many signs everywhere warning
about meteorites, that they obscure the more ordinary ones, like stop signs.

Are you suggesting that errors involving code that modifies things that
shouldn't be modified are comparable in rarity to meteorites?
 
Keith Thompson

Rosario193 said:
"const" would exist for declare memory space too and not one type
so "const int *p;" means that p can point only to memory that is const

In what language? That's not what "const" means in C.
 
Keith Thompson

Rosario193 said:
But if one thinks about it more carefully, where is the const memory in a PC?
It does not exist... Memory is either readable and writable, or neither
readable nor writable, at least I think so.

PCs do have read-only memory. They can also have regions of RAM
treated as read-only by the operating system, so that a program can read
it but not write it.

Try reading and writing a string literal in a C program and see what happens.
 
BartC

Keith Thompson said:
Then I must have misunderstood. You said "you just don't want it
written to by that function". Can you clarify? Does the function
modify the string or not?

Yes it does. I'm just saying there are plenty of situations where functions
modify string arguments etc, but you don't want your data changed. 'const'
is only of use in certain places.
Sure, but it doesn't have to, because we have "const".

Which in this case, isn't very useful: you can pass writeable or read-only
strings to it, just like you can to a function not using 'const'; what have
we gained?

(Maybe the compiler can use these hints to help with optimising and so on,
but they don't do much for the clarity of the source code.)
Are you suggesting that errors involving code that modifies things that
shouldn't be modified are comparable in rarity to meteorites?

Well, issues linked to const or non-const arguments are as rare as meteorite
strikes in my programming.

Take this fragment of code posted by Stefan Ram in 'question about scanf':

char const * const string = "abc17";
char const * p = string;

Maybe he was being deliberately obfuscatory here, or maybe not. But those
consts do nothing for the clarity of the code (actually they make things
worse: I didn't know you could write char const as well as const char; I
thought I had that sussed).
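
For reference, "char const *" and "const char *" are two spellings of the same type; a const that comes after the * applies to the pointer itself rather than to the characters. The sketch below spells this out for the two quoted declarations. The function name count_digits_in_example is hypothetical, and the compile-time check assumes a C11 compiler.

```c
#include <stddef.h>

/* Both spellings name the same type (checked at compile time, C11). */
_Static_assert(_Generic((char const *)0, const char *: 1, default: 0),
               "char const * and const char * are the same type");

/* Hypothetical wrapper around the two quoted declarations. */
size_t count_digits_in_example(void)
{
    /* Pointer cannot be reseated, and the characters cannot be
       written through it. */
    char const *const string = "abc17";
    /* p itself may move; the characters stay read-only through it. */
    char const *p = string;
    size_t digits = 0;

    for (; *p != '\0'; p++)
        if (*p >= '0' && *p <= '9')
            digits++;
    return digits;
}
```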

Maybe they will help trap any attempts to write via the pointer p, but then
the code will likely not work anyway. But there are plenty of other ways to
shoot yourself in the foot (running past the end of the string for example).
All I know is the code is much easier on the eye without those consts.

I mean, most people surely don't bother with marking arbitrary files in
their file systems as read-only, why the need to do it with type-specs in a
language?
 
