Constant strings

Malcolm McLean

Applying it blindly would be useless. It would render it impossible to
get anything done. The sole justification for declaring something const
is to ensure that any code which might modify the thing so-declared is a
constraint violation which will generate a warning message indicating
that such code should not be written. Applying 'const' to anything other
than something which should not be written to is pointless, and would
usually prevent the code from even compiling. Why are you suggesting
that such ridiculous coding practices would be necessary?

Some people use const to document that a pointer is input rather than output.
Say we've got an employee with an embedded pointer giving his name. Say we also write
the function

char *employee_surnameincaps(const Employee *employee);

that tells a reasonably intelligent caller that employee_surnameincaps is going to construct a
temporary string with the employee's surname capitalised, either dynamically allocated or a static
buffer. It's not going to modify his surname in place. We're not actually enforcing that, however.
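
A minimal sketch of what such a function might look like, taking the dynamically allocated option. The thread does not show the `Employee` layout, so the single `surname` member here is an assumption made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: the post only says the name is an embedded pointer. */
typedef struct {
    char *surname;
} Employee;

/* Returns a malloc'd upper-case copy of the surname, or NULL on failure.
   The caller owns the result and must free() it. */
char *employee_surnameincaps(const Employee *employee)
{
    assert(employee != NULL && employee->surname != NULL);

    size_t len = strlen(employee->surname);
    char *caps = malloc(len + 1);
    if (caps == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++)
        caps[i] = (char)toupper((unsigned char)employee->surname[i]);
    caps[len] = '\0';
    return caps;
}
```

Note that the `const` here only makes the `Employee` members themselves read-only. A statement such as `employee->surname[0] = 'X';` would still compile, which is the sense in which nothing is actually enforced.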
 
James Kuyper

Take this code:

/* somewhere in an external library */
void show_inverted(Image_t bm) {
.....
}

You've just declared show_inverted as taking bm by value. If
show_inverted is going to contain a copy of the original image, then no
problem occurs if it modifies that copy, and there is therefore no
reason why you need to use const.

If, on the other hand, "Image_t" is a typedef for a pointer, such as
"struct Image*", then this provides a prime example why it's a bad idea
to define such typedefs. What you need, in order to properly protect
certain pieces of code, is the ability to declare some pointers as
"const struct Image*", and you can't construct such a type using that
typedef. "const Image_t" is equivalent to "Image_t const" - both mean
"struct Image * const", which is a very different type from "struct
Image const *". "struct Image * const" means that the pointer value
should not be written to, whereas "struct Image const*" means the same
thing as "const struct Image*", which is that the object pointed at
should not be written to.
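
The difference can be seen in a short sketch. The `struct Image` members are invented for illustration; only the typedef shape comes from the discussion.

```c
#include <assert.h>
#include <stddef.h>

struct Image { int width, height; };   /* hypothetical members */
typedef struct Image *Image_t;         /* pointer hidden behind a typedef */

/* 'const Image_t' is 'struct Image *const': the pointer is const, the image is not. */
void clear_via_typedef(const Image_t bm)
{
    assert(bm != NULL);
    bm->width = 0;                     /* compiles: the pointee is still writable */
    /* bm = NULL; */                   /* would be a constraint violation */
}

/* Spelled without the typedef, the pointee can be protected. */
int area(const struct Image *bm)
{
    assert(bm != NULL);
    /* bm->width = 0; */               /* would be a constraint violation */
    return bm->width * bm->height;
}
```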

/* In your code */
Image_t img;

.... hundreds of lines later...

show_inverted(img);

But you understand that either (1) show_inverted() requires the image to be
writeable, but is unchanged at the end; or (2) show_inverted() will corrupt
the image. How best to make use of 'const' here?

Maybe, you change all these calls (if you have many dozens of them scattered
everywhere) so that they look like this:

show_inverted((const Image_t)img);

But a few problems with this:

(0) It has no useful effect.

It's possible to declare objects as 'const' in themselves, but only if
they never need to be modified after initialization, which is fairly
unusual, especially for large objects, (such as those that might contain
an entire image). More commonly, 'const' is used in 'const T*', to mark
the object pointed at as something that should not be modified. A
pointer to a non-const object will end up being converted to "const T*",
but almost never by explicit conversion. It usually is done by passing a
"T*" argument to a function declared as taking a "const T*" parameter -
precisely the situation you complained about being allowed in an earlier
message. Allowing such implicit conversions is a key part of what makes
'const' useful - saying that the reverse conversion can NOT occur
implicitly is another key part.

It's also feasible to do such an implicit conversion without a function
call:

struct Image img;

// code that fills in img

const struct Image *cpimg = &img;

// Code that uses cpimg to do things that should not modify 'img'.

Within the scope of a "const T*" declaration, the 'const' ensures that
attempts to modify the pointed-at object through that pointer will be
constraint violations, guaranteeing that you will be warned of the
necessity of re-writing your code to avoid that problem. Such code is
problematic, whether or not 'const' is used. 'const' serves only to
enable warnings about such code.
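
A small sketch of both directions of the conversion, assuming a made-up `struct Image` with two members. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct Image { int width, height; };   /* hypothetical members */

/* Read-only view: callers may pass 'struct Image *' or 'const struct Image *'. */
long pixel_count(const struct Image *img)
{
    assert(img != NULL);
    return (long)img->width * img->height;
}

/* Needs write access, so the parameter is not const-qualified. */
void halve(struct Image *img)
{
    assert(img != NULL);
    img->width /= 2;
    img->height /= 2;
}

long demo(void)
{
    struct Image img = { 8, 6 };
    const struct Image *cpimg = &img;  /* implicit conversion, no cast needed */

    halve(&img);                       /* fine: img itself is not const */
    /* halve(cpimg); */                /* constraint violation: discards 'const' */
    return pixel_count(cpimg);         /* sees the halved image */
}
```

The commented-out call is the reverse conversion that cannot happen implicitly. Uncommenting it should produce a diagnostic from a conforming compiler.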
 
BartC

James Kuyper said:
You've just declared show_inverted as taking bm by value. If
show_inverted is going to contain a copy of the original image, then no
problem occurs if it modifies that copy, and there is therefore no
reason why you need to use const.
If, on the other hand, "Image_t" is a typedef for a pointer, such as
"struct Image*", then this provides a prime example why it's a bad idea
to define such typedefs. What you need, in order to properly protect
certain pieces of code, is the ability to declare some pointers as
"const struct Image*", and you can't construct such a type using that
typedef. "const Image_t" is equivalent to "Image_t const" - both mean
"struct Image * const", which is a very different type from "struct
Image const *". "struct Image * const" means that the pointer value
should not be written to, whereas "struct Image const*" means the same
thing as "const struct Image*", which is that the object pointed at
should not be written to.

This is more a prime example of why const is of limited use.

Image_t is supposed to be an opaque type, in reality probably some sort of
handle implemented as a pointer (to a descriptor which itself points to the
data) or struct, which implements the descriptor.

It might happen that a pointer directly points to the image data, but that
would be unusual; that might be the case at a lower level where you have to
pass dimensions, image depth etc. explicitly.

But with all these possibilities:

* A typedef-ed pointer to a descriptor
* A typedef-ed descriptor struct
* Some typedef-ed index into a table of image descriptors
* Even a typedef-ed pointer to raw image data
* Etc.

You can't protect the data with a simple const cast. That only works with a
simple non-typedef-ed pointer to image data (where you have to deal with
other details of the image that you would not need to bother with, using the
opaque type).

And without having this knowledge, const would be useless. (And even with
the knowledge, it's a bad idea to use const, as the idea of using an opaque
type is that the implementation could change, but ought to still work, after
recompiling.)

(0) It has no useful effect.

You finally agree with me!
 
Keith Thompson

Ian Collins said:
It's also the wrong solution. You transform the data, then parse it.

Not if you just want to convert, say, the 3rd comma-delimited field to
upper case.

But I don't think a rare special case like that justifies dropping
"const" from the language.
 
Keith Thompson

Malcolm McLean said:
I don't have an easy answer, but essentially const makes the wrong
distinction, which is between opaque and transparent pointers, not
between writeable and non-writeable ones. Normally it's none of
caller's business whether a subroutine is setting flags in a structure
passed to it or not.

I'm not sure what you mean by "opaque" vs. "transparent".

The caller "owns" the structure, and needs to control whether it's
modified. Should a function that prints a structure, or that sends its
value somewhere, modify it? Should the owner of a structure not have a
way to specify that its value should not be modified?

In some cases, you might have fields within a structure that can be
modified without changing the *logical* value. C++ has mechanisms (the
"mutable" keyword) for dealing with that; C does not.
 
Keith Thompson

BartC said:
How is it going to get the job done otherwise? If changing the passed data
is a problem for the caller, then it arranges for it not to be a problem
(such as passing a copy of the data).

What "job" are you talking about? Please provide a concrete example.

If you have data that you don't want modified, and you have a function
that would modify that data, then don't pass that data to that function.
If you "need" to do so, then there's something wrong with your design,
something that won't be corrected by removing "const".

This must occur everywhere where function parameters do not have 'const' in
their types. Does every single call to such functions always apply a (const
char*) or equivalent cast to each argument just in case it might write to
it? (Which would be lying anyway as it wouldn't be read-only, but a weak
attempt to apply write-protection)

No, of course not; such a cast would be pointless (which is why nobody
has suggested casting arguments to const char*).

Do you need to apply a cast when passing a non-const argument to
strlen()?

The "const" goes on the parameter declaration, not on the call. Adding
"const" allows the function to be called with either const or non-const
data. The only case that's forbidden is passing const data to a
function with a non-const parameter.

void no_change(const char *param);
void change(char *param);

char modifiable[LEN];
char read_only[LEN];

no_change(modifiable); /* ok */
no_change(read_only); /* ok */
change(modifiable); /* ok */
change(read_only); /* error, caught by compiler */

But usually you would have some clue as to what each function did and how it
worked.

Otherwise, for 'const' to be any use at all, you would have to blindly apply
it just about everywhere.

No, you don't apply it to anything that you want to be writable. Don't
apply const blindly; apply it intelligently (which requires
understanding what it means).

As for an example, this is derived from an actual project: you have a
tokeniser function F which takes a string representing a source file, and
generates a list of tokens. String/name tokens might make use of pointers
(ie. slices) into the caller's string.

But it may be necessary for the caller's string to be modified (convert to
lower case, convert string const escapes, add 0-terminators etc) for this to
work.

That's fine if the caller agrees, otherwise (because it needs to preserve
the original for error reports, or it doesn't own the data) then it might
just pass a copy.

strtok() is an example of this. You can pass a pointer to a modifiable
string to strtok(), and it will be modified. You can't pass a pointer
to a non-modifiable (const) string to strtok(). If you need to use
strtok() on read-only data, you need to make a copy of it.

That's how the language works right now. Without "const", the only
difference would be that the compiler wouldn't warn you if you pass a
pointer to a read-only string, because you'd have no way to tell the
compiler that a string is read-only.
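
A sketch of the copy-then-tokenise pattern described above. The function name `count_fields` and the comma delimiter are assumptions chosen for illustration; `strtok()` itself behaves as described.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts non-empty comma-separated fields without touching the caller's string.
   strtok() writes '\0' terminators into its argument, so it runs on a copy.
   Returns -1 if the copy cannot be allocated. */
int count_fields(const char *line)
{
    assert(line != NULL);

    size_t len = strlen(line);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, line, len + 1);

    int n = 0;
    for (char *tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ","))
        n++;

    free(copy);
    return n;
}
```

Because the parameter is `const char *`, the function can be called with read-only data, and passing `line` straight to `strtok()` inside it would be diagnosed.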

How is this an argument against "const"?

Another example (contrived this one), you have a function F that needs to
invert, reverse, or do some operation to some data (string or image for
example) in order to display the result or whatever. But the operation is
reversible. After the call, the caller's data is unchanged, but it needs to
be writeable.

Right, "const" doesn't handle cases where a function needs to
temporarily modify data and then change it back to its original state.
I don't think that's a common case, certainly not common enough to
justify throwing away the benefits of "const" in other cases.

Or maybe F needs to do some irreversible operation, and it is simplest to do
it in-place (like my first example). Maybe that's OK for the caller, maybe
not. But /how is const going to help in specifying such a function/? It
won't. That's why I say it's not of vital importance to the language.

It helps because it prevents the caller from performing an irreversible
operation on read-only data. If the caller needs to change the data,
then the caller *shouldn't* have declared that data "const".

The existence of "const" doesn't prevent you from writing and calling
functions that modify data. You just don't use "const" for those cases.
 
Keith Thompson

BartC said:
[...]

This is more a prime example of why const is of limited use.

Image_t is supposed to be an opaque type, in reality probably some sort of
handle implemented as a pointer (to a descriptor which itself points to the
data) or struct, which implements the descriptor.

It might happen that a pointer directly points to the image data, but that
would be unusual; that might be the case at a lower level where you have to
pass dimensions, image depth etc. explicitly.

But with all these possibilities:

* A typedef-ed pointer to a descriptor
* A typedef-ed descriptor struct
* Some typedef-ed index into a table of image descriptors
* Even a typedef-ed pointer to raw image data
* Etc.

You can't protect the data with a simple const cast. That only works with a
simple non-typedef-ed pointer to image data (where you have to deal with
other details of the image that you would not need to bother with, using the
opaque type).

The "simple const cast" you keep talking about *doesn't do anything*.
If you don't understand that, we're not going to get anywhere.

The functions declared in <stdio.h> are similar to what you're talking
about. FILE is an opaque type, and the functions take FILE* arguments.
They can modify the contents of the FILE object. None of those
functions take a "const FILE*" argument. The caller doesn't need to
know what's inside the FILE object.

You can write code that doesn't use "const" if it makes sense to do so,
and the existence of "const" doesn't make it inconvenient to do so.

And without having this knowledge, const would be useless. (And even with
the knowledge, it's a bad idea to use const, as the idea of using an opaque
type is that the implementation could change, but ought to still work, after
recompiling.)


You finally agree with me!

The cast that you introduced to this discussion makes no sense. It's a
misuse of "const" and of casting.

This:

func((const char*)arg);

is effectively equivalent to this:

func(arg);

regardless of how func defines its parameter. This does not constitute
an argument against proper use of "const".
 
Malcolm McLean

I'm not sure what you mean by "opaque" vs. "transparent".
An opaque structure is one which has a documented name, and it's got functions which operate on
it and have documented behaviour. But the members aren't documented and are subject to
change.
An example is the options parser on my website. You create one by passing argv and argc to a
constructor function. Then you call functions with a scanf-like interface to extract the options.
Then you call an error function to check for any errors, and a destructor to destroy it.

In fact it caches the data. So if the user calls a program with a path, there will be one copy in argv,
another held internally, and a third passed back to the caller. But there's no reason to know that, and
if it became necessary to avoid the overhead it could be changed so that it holds onto argv
directly. Whilst getting options is basically a read operation, it reports if user tries to specify
an option which isn't read. So it needs to store a flag to indicate that an option has been
read, and so presumably is supported. That's to make it easier to use, caller doesn't have to pass
in a list of supported options then go through the list again extracting them.
 
Stephen Sprunk

This particular issue isn't bothering me at the minute. I was just
intrigued at the more rigorous enforcement of const types on one hand,
compared with the lax approach used with string constants. It's never
really come up before because I refuse to use 'const' anywhere.

Then you're throwing away one of the few type-safety features that C
offers, which doesn't seem wise.

(And I have tried using g++ to compile my code (with a view to
simplifying using libraries only having a C++ interface), but it's even
more of a nightmare getting it to compile my C code, most of which is
auto-generated in various ways.)

Sounds like your auto-generator is broken; getting it to produce code in
the C-like subset of C++ should be trivial.

S
 
Keith Thompson

Malcolm McLean said:
An opaque structure is one which has a documented name, and it's got
functions which operate on it and have documented behaviour. But the
members aren't documented and are subject to change.

Right, I know what "opaque" means; I just didn't understand the way you
used it in this context.

I don't have an easy answer, but essentially const makes the wrong
distinction, which is between opaque and transparent pointers, not
between writeable and non-writeable ones. Normally it's none of
caller's business whether a subroutine is setting flags in a
structure passed to it or not.

I agree that "const" does not always deal well with opaque data
structures. It enforces component-wise (effectively bitwise)
read-only semantics, not necessarily logical read-only semantics.
(As I've mentioned, C++ adds "mutable" to deal with this; I'm not
sure whether adding "mutable" to C would be a good idea.)
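
To illustrate the component-wise point, here is a sketch assuming a hypothetical descriptor that points at separately stored pixel data. The layout and the function name are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle: a descriptor that points at separately stored pixels. */
struct Image {
    int width, height;
    unsigned char *pixels;
};

void invert_first_pixel(const struct Image *img)
{
    assert(img != NULL && img->pixels != NULL);
    /* img->width = 0; */      /* rejected: member of a const-qualified object */
    /* accepted: 'pixels' is itself const here, but what it points at is not */
    img->pixels[0] = (unsigned char)(255 - img->pixels[0]);
}
```

Through the const-qualified pointer, `img->pixels` has type `unsigned char *const`, so the qualifier protects the pointer stored in the descriptor and not the pixel data behind it.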

But I find that "const" does deal well with transparent data
structures (such as the ones used to implement opaque data
structures).

The fact that "const" is not necessarily suitable for *all* cases
where you want to specify read-only semantics doesn't argue against
continuing to use it where it makes sense.

Your statement that const makes the distinction between opaque and
transparent pointers doesn't make much sense to me. Is that really
what you meant?

[...]
 
Keith Thompson

BartC said:
You might try casting the argument to (const T*), but then you just get a
compiler error here; what good will that do? You need to call F, so a
solution has to be found. And you wouldn't have deliberately put in this
cast unless you'd already researched the side-effects of F, in which case
you don't need the compiler to tell you what you already know!

Or maybe you routinely just use (const T*) casts everywhere you ever pass a
pointer to anything, but then I wouldn't want to have to read your code!

Actually I can't see the point of using 'const T*' as any function parameter
type, since it will accept both const and non-const arguments (but see
below).
[...]

More briefly (since this point may have been lost in previous walls of
text):

If you think that casting a T* argument to (const T*) does anything
at all, or if you think that anyone else in this discussion has
advocated such casts, then I submit that you do not properly
understand what "const" means in C as it's currently defined.
I advise you to correct that gap in your knowledge before criticizing
the current language definition. I'll be happy to help with any
questions.

A concrete example:

char message[] = "hello";
strlen(message);

message is non-const. strlen has a const char* parameter. No cast is
needed on the call, and adding such a cast would do nothing.
 
James Kuyper

This is more a prime example of why const is of limited use.

Image_t is supposed to be an opaque type,

The following is a fully opaque type:

typedef struct Image Image_t;

You should generally avoid hiding the fact that something is a pointer
or an array by using a typedef. C provides several syntactic features
(pointers to qualified types is just one example) that can lead to
considerable confusion if the user of a typedef is not aware of the fact
that it is a pointer or an array.
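
A sketch of how that opaque typedef might be used, with the header part and the implementation part shown in one listing for brevity. The function names and the struct members are assumptions and do not come from any real image library.

```c
#include <assert.h>
#include <stdlib.h>

/* --- what a header would expose: an incomplete type and functions on it --- */
typedef struct Image Image_t;
Image_t *image_create(int width, int height);
int image_width(const Image_t *img);   /* 'const' composes normally with this typedef */
void image_destroy(Image_t *img);

/* --- what the implementation file would contain --- */
struct Image { int width, height; };   /* hypothetical members, hidden from callers */

Image_t *image_create(int width, int height)
{
    Image_t *img = malloc(sizeof *img);
    if (img != NULL) {
        img->width = width;
        img->height = height;
    }
    return img;
}

int image_width(const Image_t *img)
{
    assert(img != NULL);
    return img->width;
}

void image_destroy(Image_t *img)
{
    free(img);
}
```

Since the `*` stays visible at every use, `const Image_t *` means what a reader expects: a pointer to an image that should not be modified through it.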

....
You finally agree with me!

No, you believe that 'const' has no useful effect. I believe that
misusing 'const' in the way you've suggested serves no useful effect,
while 'const', properly used, can be very useful. The fact that you
suggested misusing 'const' in that fashion suggests an extremely severe
misunderstanding of what 'const' means, that calls into question the
validity of all of your judgements about its usefulness.
 
Stephen Sprunk

How is it going to get the job done otherwise? If changing the passed
data is a problem for the caller, then it arranges for it not to be a
problem (such as passing a copy of the data).

It'd be annoying if every time we called a function, we had to make a
copy of all the data just in case that function tried to modify it; in
most cases, the function won't so that effort is wasted. If only there
were a way for a function to indicate that it won't do so, which
compilers could then enforce for us... Oh wait, there is; it's called
"const"!

Also, what about the functions we use to copy our data? _They_ might
modify the original data as well, so now you've got an infinite loop!
Gosh, maybe that's why memcpy(), strcpy(), et al use const for the
source arguments!

This must occur everywhere where function parameters do not have
'const' in their types. Does every single call to such functions
always apply a (const char*) or equivalent cast to each argument just
in case it might write to it? (Which would be lying anyway as it
wouldn't be read-only, but a weak attempt to apply write-protection)

You can't pass a (const char *) argument to a function taking (char *);
that's the entire point.

You _can_ pass a (char *) argument to a function taking (const char *).

But usually you would have some clue as to what each function did
and how it worked.

And one of those clues is whether the arguments are const; that clearly
indicates to both programmer and compiler that the argument won't be
changed by the function.

Yes, malicious programmers can cast away the constness, but if you have
to worry about that, then you have far bigger problems to worry about.

Otherwise, for 'const' to be any use at all, you would have to
blindly apply it just about everywhere.

As for an example, this is derived from an actual project: you have
a tokeniser function F which takes a string representing a source
file, and generates a list of tokens. String/name tokens might make
use of pointers (ie. slices) into the caller's string.

But it may be necessary for the caller's string to be modified
(convert to lower case, convert string const escapes, add
0-terminators etc) for this to work.

That's fine if the caller agrees, otherwise (because it needs to
preserve the original for error reports, or it doesn't own the data)
then it might just pass a copy.

You have two choices in that case; either the function makes copies and
returns those, or it documents that it modifies its input and the caller
has to make the copy. Examples of both can be seen in the wild.

Another example (contrived this one), you have a function F that
needs to invert, reverse, or do some operation to some data (string
or image for example) in order to display the result or whatever. But
the operation is reversible. After the call, the caller's data is
unchanged, but it needs to be writeable.

Such a function should put the inverted/reversed/etc. data in its own
temporary buffer, which gets destroyed on exit, leaving the original
unchanged.
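
A sketch of that approach for the string case, assuming the "operation" is a simple reversal and the "display" is a write to standard output. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Writes the reverse of 'src' into 'dst', which must hold strlen(src) + 1 bytes. */
void reverse_into(char *dst, const char *src)
{
    size_t len = strlen(src);
    for (size_t i = 0; i < len; i++)
        dst[i] = src[len - 1 - i];
    dst[len] = '\0';
}

/* Prints 's' reversed. The work happens in a private buffer that is freed
   before returning, so the parameter can be const. Returns -1 on failure. */
int show_reversed(const char *s)
{
    assert(s != NULL);

    char *tmp = malloc(strlen(s) + 1);
    if (tmp == NULL)
        return -1;
    reverse_into(tmp, s);
    puts(tmp);
    free(tmp);
    return 0;
}
```

The cost is one allocation and one copy per call. In exchange, the caller's data never needs to be writable.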

Or maybe F needs to do some irreversible operation, and it is
simplest to do it in-place (like my first example). Maybe that's OK
for the caller, maybe not. But /how is const going to help in
specifying such a function/? It won't. That's why I say it's not of
vital importance to the language.

If the function is going to modify its input, simply don't specify the
argument as const. If the caller needs to preserve its data, it will
make a temporary copy of its data to pass to the function.

It might be of help in the odd place, but that's offset by the extra
clutter that can make it harder to see real bugs.

No, const helps _prevent_ and _diagnose_ real bugs.

S
 
Malcolm McLean

Your statement that const makes the distinction between opaque and
transparent pointers doesn't make much sense to me. Is that really
what you meant?

I accidentally wrote nonsense.

The distinction we really need to draw is between opaque and transparent pointers. const doesn't do
that, it distinguishes between writeable and non-writeable ones, and even that not very well,
because it doesn't stick to nested pointers embedded within the structure.

const isn't appropriate for an opaque pointer, because the whole point is that caller doesn't need to
know whether it's being modified internally or not. With transparent pointers, it does have a purpose,
but a pretty minor one, because normally if a transparent pointer is modified, then the purpose of
the call is to create that modification and examine it or use it in some way. If it's not modified,
then the purpose of the structure is to provide input parameters for the call. So "const" isn't
especially helpful - not to say there won't be a few situations where it's very helpful, but as a
general rule it's a reminder you don't need.
 
Keith Thompson

Ike Naar said:
void no_change(const char *param);
void change(char *param);

char modifiable[LEN];
char read_only[LEN];

There seems to be a 'const' missing here.

Yes.

no_change(modifiable); /* ok */
no_change(read_only); /* ok */
change(modifiable); /* ok */
change(read_only); /* error, caught by compiler */

Corrected example:

void no_change(const char *param);
void change(char *param);

char modifiable[LEN];
const char read_only[LEN];

no_change(modifiable); /* ok */
no_change(read_only); /* ok */
change(modifiable); /* ok */
change(read_only); /* error, caught by compiler */
 
Keith Thompson

Malcolm McLean said:
I accidentally wrote nonsense.

It happens to all of us.

The distinction we really need to draw is between opaque and
transparent pointers. const doesn't do that, it distinguishes between
writeable and non-writeable ones, and even that not very well, because
it doesn't stick to nested pointers embedded within the structure.

const isn't appropriate for an opaque pointer, because the whole point
is that caller doesn't need to know whether it's being modified
internally or not. With transparent pointers, it does have a purpose,
but a pretty minor one, because normally if a transparent pointer is
modified, then the purpose of the call is to create that modification
and examine it or use it in some way. If it's not modified, then the
purpose of the structure is to provide input parameters for the
call. So "const" isn't especially helpful - not to say there won't be
a few situations where it's very helpful, but as a general rule it's a
reminder you don't need.

I mostly agree with your point about const with opaque pointers. For
opaque data structures (FILE is a good example), you typically want the
functions defined in the interface to deal with the internals. Whether
those functions modify members of the FILE structure (assuming it's a
structure) are typically of no relevance to client code, which will deal
only with FILE* pointers created by functions in the interface.

Which is why fopen, fread, et al take arguments of type FILE* and not
const FILE*.

But there's plenty of code behind the scenes that implements that
abstract interface, and I would *hope* that it makes judicious use of
"const" to avoid and detect logical errors.

How much C code actually uses opaque types? Without having done a
survey, I'd say "not a whole lot" and "probably not enough".
 
BartC

Keith Thompson said:
BartC said:
You might try casting the argument to (const T*), but then you just get a
compiler error here; what good will that do? You need to call F, so a
solution has to be found. And you wouldn't have deliberately put in this
cast unless you'd already researched the side-effects of F, in which case
you don't need the compiler to tell you what you already know!

Or maybe you routinely just use (const T*) casts everywhere you ever pass a
pointer to anything, but then I wouldn't want to have to read your code!

Actually I can't see the point of using 'const T*' as any function parameter
type, since it will accept both const and non-const arguments (but see
below).
[...]

More briefly (since this point may have been lost in previous walls of
text):

If you think that casting a T* argument to (const T*) does anything
at all, or if you think that anyone else in this discussion has
advocated such casts, then I submit that you do not properly
understand what "const" means in C as it's currently defined.

You think so?

My remark was about casting a T* argument to const T*, when passed to a
function taking a T* parameter.

It doesn't exactly do nothing:

strcpy("abc","def");

generates no warnings from gcc, but applying a cast does:

strcpy((const char*)"abc","def");

A bit of an imposition though to add everywhere.
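
Some background on why the uncast call is accepted: in C a string literal has type `char[N]`, not `const char[N]`, so passing one as the destination of `strcpy()` violates no constraint even though writing to it is undefined behaviour. The sketch below shows the usual way to get the protection without casts at each call, by declaring pointers to literals as `const char *`. GCC also has a `-Wwrite-strings` option that gives literals a const-qualified type in C. The names `fixed`, `buffer`, and `demo` are illustrative only.

```c
#include <assert.h>
#include <string.h>

int demo(void)
{
    const char *fixed = "abc";    /* pointer to a literal: const added by hand */
    char buffer[] = "abc";        /* array initialised from a literal: a private, writable copy */

    assert(sizeof buffer == 4);   /* room for "def" plus the terminator */
    strcpy(buffer, "def");        /* fine: buffer is modifiable */
    /* strcpy(fixed, "def"); */   /* constraint violation: discards 'const' */
    /* strcpy("abc", "def"); */   /* no diagnostic required, but undefined behaviour */

    /* returns 1 when the literal is untouched and the array was overwritten */
    return strcmp(fixed, "abc") == 0 && strcmp(buffer, "def") == 0;
}
```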

I advise you to correct that gap in your knowledge before criticizing
the current language definition. I'll be happy to help with any
questions.

A concrete example:

char message[] = "hello";
strlen(message);

message is non-const. strlen has a const char* parameter. No cast is
needed on the call, and adding such a cast would do nothing.

Indeed. But that is not what I said, which was more along the lines of the
const in the 'const char*' formal parameter type of strlen() doing nothing,
in terms of generating warnings. (Although I will admit it might help out
the compiler with its optimising.)
 
Malcolm McLean

How much C code actually uses opaque types? Without having done a
survey, I'd say "not a whole lot" and "probably not enough".

I use opaque types pretty heavily. But a lot of people would turn to C++ for the same problems.
 
BartC

Stephen Sprunk said:
Then you're throwing away one of the few type-safety features that C
offers, which doesn't seem wise.


Sounds like your auto-generator is broken; getting it to produce code in
the C-like subset of C++ should be trivial.

I wasn't specifically generating C++ code. I just tried a C++ compiler on
the off-chance it would work.

For example C++ doesn't like pointers to unbounded arrays, which my source
language likes to use to differentiate between pointers to arrays and
non-arrays (more useful than const/non-const IMO).

Fixing that isn't so trivial.
 
