Non-constant constant strings


Aleksandar Kuktin

[snip]

char *read_only[] = { "Rick", "Jane", "Marc", 0 };
char **read_write;

char **init_readwrite(char **readonly) {
    unsigned int i, count;
    char **readwrite;

    for (count=0, i=0; readonly[i]; i++) {
        count++;
    }
    readwrite = malloc(count * sizeof(*readwrite));
    /* no check */
    return memcpy(readwrite, readonly, count * sizeof(*readwrite));
}

read_write = init_readwrite(read_only);

...And then you operate on read_write and ignore read_only.


read_only is an array of pointers; the things that the pointers point at
are not modifiable. It should therefore, for safety, have been declared
as "const char*[]".

You allocate enough space to copy over all of the pointers to
read_write. Then you do copy them over. The new pointers in read_write
still point at the same locations as the ones in read_only; those
locations still cannot be safely written to, so nothing has been gained
by the copy. It is therefore incorrectly named.


Correct. I also made a slight error because I didn't copy the terminating
zero.

My trigger-happiness got the better of me again. My understanding was that
the OP had wanted a list of lines and that he wanted to exchange the
lines according to some rule.

Only later did I realize he actually wants what amounts to run-time macro
expansion. That, obviously, requires a different approach...
That's why you need to create a deep copy, as in Rick's code. It copies
the strings themselves to memory that is guaranteed writable.

....this being one of the better suited ones. Another possibility would be
to malloc() a big flat buffer, copy the lines into it (possibly doing
macro expansion while copying) and manipulate it as if it were a mmap()-
ed file. A change in macro expansion rules can be effected by re-copying-
and-expanding the lines.
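
A deep copy would look something like this (just a sketch: it assumes the
array ends with a null pointer and skips most error handling):

#include <stdlib.h>
#include <string.h>

/* Deep copy: the new array gets its own writable copy of every string. */
char **deep_copy(char **readonly)
{
    size_t count;
    for (count = 0; readonly[count]; count++)
        ;

    char **readwrite = malloc((count + 1) * sizeof *readwrite);
    if (!readwrite)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(readonly[i]) + 1;
        readwrite[i] = malloc(len);
        memcpy(readwrite[i], readonly[i], len);   /* copy the string itself */
    }
    readwrite[count] = NULL;                      /* keep the terminating null pointer */
    return readwrite;
}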
 

Rick C. Hodgin

char defaultOption = "4";
Did you mean "char *defaultOption" or "char defaultOption[]" rather than
"char defaultOptions", or did you mean '4' rather than "4"?
I meant char defaultOption[] = "4";

Then you've already got your wish; defaultOption contains a writable
string.

I was using this as an example of me creating a writable string. My issue relates to the identical syntax used to encode the string contents, but in a different context, as in this:

char* list[] = { "one", "two", "three", null };

In the defaultOption I encode a literal that is not a constant. In list I encode three literals that are constants. In both of them I use double-quote, text, double-quote, but the compiler makes the items in the list[] array of pointers all read-only/constant by convention. That is my issue.

My example was only to demonstrate that read-write strings are encoded the same way as constant strings.
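
For illustration, the distinction looks like this (a sketch; the list_ro/list_rw
names are made up):

char defaultOption[] = "4";     /* array initialized from the literal: writable */
char *q = "4";                  /* pointer to the literal itself: not writable  */

char *list_ro[] = { "one", "two", "three", 0 };  /* elements point at the literals: read-only */

/* Writable entries today mean spelling the arrays out separately: */
static char one[]   = "one";
static char two[]   = "two";
static char three[] = "three";
char *list_rw[] = { one, two, three, 0 };        /* elements point at writable storage */
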
...

I find that both documentation and clarity are best served by defining
each variable with the smallest scope that is consistent with the way it
will be used (except that I will not create a separate compound
statement for the sole purpose of more tightly constraining the scope -
that would require a separate compound statement for each variable; that
way lies madness).

I find that both documentation and clarity are best served by defining
all variables at a common location, and then using a GUI which has smart
windowing or hover abilities to indicate from where it came.
Among other benefits, that approach minimizes the distance I have to
search for the definition of the variable (since such search normally
starts at a point where the variable is being used).

The GUI minimizes the distance involved by creating a constant lookup
window that shows all code definitions when they're needed. In addition,
add-on tools like Whole Tomato's Visual Assist X (for use in Visual
Studio) allows Ctrl+Alt+F to find all references, showing other source
code line uses, etc.

We're beyond the days of text-based editors. :)
I'm curious - are you familiar with the threading support that was added
to C2011?

No. I'm not familiar (I'm sure) with the prior standards either. I know how to program in C and I just do so ... yet without knowing standards. :)
It's not at all similar to your way of handling threads, but
it does have the advantage of being based upon existing common practice.
As a result, it can be implemented as a thin wrapper over many existing
threading systems, such as those provided by POSIX or Windows. It
requires a somewhat thicker wrapper on other, more exotic threading
systems, but it should be widely implementable.
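
(For reference, a minimal sketch of what that C11 interface looks like;
nothing beyond <threads.h> is assumed:)

#include <stdio.h>
#include <threads.h>

/* C11 thread entry points take a void * and return an int. */
static int worker(void *arg)
{
    printf("hello from %s\n", (const char *)arg);
    return 0;
}

int main(void)
{
    thrd_t t;
    if (thrd_create(&t, worker, "a C11 thread") != thrd_success)
        return 1;

    int result;
    thrd_join(t, &result);   /* wait for the thread and collect its return value */
    return result;
}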

Interesting. My design logic comes from looking ahead. We've moved from
single-core systems to multi-core, and soon we will have many-core. These
will be CPUs with many cores, without much processing power per thread, but the
ability to do a lot of work in parallel. As such, within a single function
there will be the need to do a lot of thread-level parallelism, such as
being able to schedule both branches of an IF ahead of knowing the results,
provided all dependencies are satisfied, so that by the time the results are
known the branch has already been taken. This kind of micro-threading will
be made possible by having many cores (64+) that can work in parallel easily.
The language and OS must work in harmony to provide these facilities so that
they can be spawned, executed, and terminated within a minimal amount of
clock cycles.

I remember in the old days the 8087 FPUs used to monitor all 8086 CPU instructions and ignore them, as the 8086 monitored all 8087 instructions and
ignored them. Perhaps something similar is required, but with peek-ahead
setting such as using a new JMPWAIT instruction which causes a particular
thread to jump ahead to a location and wait for the "controlling thread" to
catch up, and then it kicks off, so the cache is already filled, memory reads
have been made, etc.

In any event, that's the logic behind my threading model.

Best regards,
Rick C. Hodgin
 

James Kuyper

Makes sense. I still would've opted for the deprecated allowance and phased it out over time.

The combination of those two sentences doesn't work. If it ever made
sense to accommodate their needs, then it still makes sense - machine
generated code is at least as popular as it has ever been, possibly more
so. Deprecating that allowance would only make sense if the allowance
itself doesn't make sense.
 

Rick C. Hodgin

Do you open the input file and the output file in binary mode or text mode?
That's a bad idea, if you're reading and writing text files - that's
what text mode is for.

I don't know how you do it, but most of my text processing is on source code
files. I use the terminating end-of-line characters to break out lines
during the read as I create structures.

Binary. Always binary.
It makes your code less portable. Even for code intended exclusively for
a platform that uses a specific method of handling line endings, if that
method is anything other than '\n' (as is, in fact, the case on your
system), it just makes more work for yourself.

Windows uses two-character line endings. It's pretty universal.

The work I have in this method is more on token parsing, identifying groups
of related characters, and so on. I process through every file I load byte
by byte anyway ... it's no more work to have a token which identifies line-
ending characters. It's actually just a setting in my token lookup logic,
which is a series of related structures which are parsed by a small engine
which goes through the source file identifying everything it can into known
groups, later parsed out into known tokens, later parsed out into known logic,
whereby errors are reported.

It works quite well. :) Binary. Always binary. :)

Best regards,
Rick C. Hodgin
 

Rick C. Hodgin

The combination of those two sentences doesn't work. If it ever made
sense to accommodate their needs, then it still makes sense

Your explanation as to why it was setup the way it was makes sense. I still
would've opted for it to not continue over time, but to be allowed for some
while, and as I say a prior version of a compiler could be used to parse those
old generated files with a common object format that could be linked together
for the foreseeable future, just by maintaining compatibility with that obj
file format.
- machine
generated code is at least as popular as it has ever been, possibly more
so. Deprecating that allowance would only make sense if the allowance
itself doesn't make sense.

I doubt people today are using the same generated source code files they were
back then. And I would still argue that it's nothing short of a catering hack
to include the ability to allow buggy code generator logic to pass through.

I'm sorry, but it's absolutely lame. It was a lame decision, in my opinion,
and whereas I probably would've opted to allow it for a time as through a
newly branded "deprecated feature," I would not have allowed it in moving
forward. We are better than that as (1) human beings, (2) developers, and
(3) all of our products should be better than that as well. We don't cater
to bad designs for bad reasons over the long term. If we need a Grandfather
Clause to get us by for a time, that's one thing ... but we don't keep it
going when it was a catering hack from the start.

All of this is my opinion. YMMV.

Best regards,
Rick C. Hodgin
 

Keith Thompson

Rick C. Hodgin said:
Not at all. I presumed "those members" involved in the ANSI
authorizing were a small number, power-seeking representatives of the
entire C language developer base of us "little people," while the much
larger group (developers who were coding in C in general) contained
many developers, and it was many of them who screamed "WHAT!" and then
passed out.

(Extremely long line reformatted. Can you find a way to *consistently*
format your articles with shorter lines?)

I'll assume that

screamed "WHAT!" and then passed out

is not meant to be taken literally. But if a substantial number of C
programmers were greatly upset by the decision to make string literals
effectively read-only, I presume you can provide evidence. Please do
so.

And I believe your characterization of the members of the ANSI C
committee is inaccurate.

[...]
Appendix J of what?

Of the ISO C standard, a recent draft of which can be downloaded at no
charge from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
 

James Kuyper

On 01/21/2014 12:22 PM, Rick C. Hodgin wrote:
....
I don't know how you do it, but most of my text processing is on source code
files. I use the terminating end-of-line characters to break out lines
during the read as I create structures.

Yes, and that's (very marginally) easier to do if you only have to look
for '\n' rather than '\r\n' - which would be the case if you used text
mode rather than binary mode. Your input files could still have
'\r\n', and your output files would still have '\r\n': text mode takes
care of those things for you automatically. However, internal to your
program you can match those sequences just by searching for '\n'.
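
In code, the difference is just this (a sketch; nothing in it is
platform-specific):

#include <stdio.h>
#include <string.h>

/* Text mode: fopen without 'b'.  The library maps the platform's line
 * ending to a single '\n' on input, so the program only searches for '\n'. */
int count_lines(const char *name)
{
    FILE *fp = fopen(name, "r");
    if (!fp)
        return -1;

    int lines = 0;
    char buf[4096];
    while (fgets(buf, sizeof buf, fp))
        if (strchr(buf, '\n'))   /* "\r\n" in the file arrives here as '\n' */
            lines++;
    fclose(fp);
    return lines;
}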

....
Windows uses two-character line endings. It's pretty universal.

It's not even close to universal - it's specific to DOS/Windows and few
other places.
The work I have in this method is more on token parsing, identifying groups
of related characters, and so on. I process through every file I load byte
by byte anyway ... it's no more work to have a token which identifies line-
ending characters. ...

Actually, it is more work than that, as this thread has already shown.
As a side effect of that decision, you've had to type extra '\r'
characters in your string literals. As far as I can see, the sole effect
of your decision is that you have to occasionally type "\r\n" where you
otherwise would have been able to type "\n". You haven't identified a
single compensating advantage, not even one tiny enough to make up for
that admittedly very minor disadvantage.
... It's actually just a setting in my token lookup logic,
which is a series of related structures which are parsed by a small engine
which goes through the source file identifying everything it can into known
groups, later parsed out into known tokens, later parsed out into known logic,
whereby errors are reported.

And it wouldn't make your program any more complicated (slightly less
so, in fact) to re-write it to work with text mode. If you did so, it
would continue to work, without modification to the line ending token,
even if someone someday decides to try porting it to a platform using a
different convention for line endings.
It works quite well. :) Binary. Always binary. :)

That's a recipe for locking your code to a single platform. If it
simplified your code in any way, that might make sense for code that you
are certain will never be ported anywhere else (though such certainty is
often delusional). However, it actually makes your code (very slightly)
more complicated.
 

Keith Thompson

Rick C. Hodgin said:
char defaultOption = "4";
Did you mean "char *defaultOption" or "char defaultOption[]" rather than
"char defaultOptions", or did you mean '4' rather than "4"?

I meant char defaultOption[] = "4";
I would like to be able to specify that with a const prefix, as in this
type of syntax:

char* list[] =
{
    "foo1",
    const "foo2",
    "foo3"
};

In this case, I do not want the second element to be changed, but the
first and third... they can change.

If I were to suggest a new language feature to support that, I'd want an
explicit marker for a string that I *do* want to be able to change.

I realize that. My view, of course, differs. :)
In your proposed C-like language, what would this snippet print?
for (int i = 0; i < 2; i ++) {
    char *s = "hello";
    if (i == 0) {
        s[0] = 'H';
    }
    puts(s);
}

Interesting.

FWIW, I don't believe in defining variables in this way in C. I believe an
initialization block should exist so it is being done explicitly, both for
documentation purposes, and clarity in reading the source code (it's very
easy to miss a nested declaration when a group of variables is created of
a similar type).

In my proposed language, it would print "Hello" both times because the
char* s definition would've been pulled out of the loop and defined as a
function variable.

That's fine; if you don't want to write code like that, you don't have
to. But I didn't ask how you'd re-write it; I asked how *that code*
should behave.

You're proposing (I think) a change to the language. That change would
affect compiler writers as well as developers. A compiler writer needs
to do *something* with the specific code I wrote above.

The language definition can either:

1. Define clearly how the code behaves;

2. State that the behavior is unspecified, undefined, or
implementation-defined; or

3. Introduce a new rule making the above code a constraint violation or
syntax error.

Which of those do you advocate? (Any suggestion that declarations in
nested blocks should be banned is a non-starter; that's been a feature
of C and its predecessors as far back as I can find documentation, and a
lot of existing code depends on it.)

[...]
Exactly. So, you don't code that way. :)

I don't code that way because string literals are read-only.
You make everything a function-level variable and it's done. You make
all code items read/write unless they are explicitly prefixed with a
const or have some macro wrapper like _rw("foo") or _fo("foo") to
explicitly name them.

Macros do not add functionality. They have to expand to *something*.
Not just string literals, but a separation of the "before" and the "after."
Programming today is, by default, targeted at multiple CPUs. There are
functions which run top-down, but on the whole we are creating
multi-threaded congruent code execution engines running on commensurate
hardware. The time for a new language syntax is at hand.

I propose new extensions to C in general:

in (thread_name) {
    // Do some code in this thread
} and in (other_thread_name) {
    // Do some code in this thread
}

And a new tjoin keyword to join threads before continuing:
tjoin this, thread_name, other_thread_name

Are you aware that the 2011 ISO C standard includes a specification of
threading? I suggest you study it before proposing changes.

[SNIP]
And I have other ideas. You can read about them on this page. This page
specifically relates to extensions to Visual FoxPro, but my intention is
my RDC (Rapid Development Compiler) which is C-like, but relaxes a lot of
stringent errors in C reporting them only as warnings, such as pointer-to-
pointer conversions, allowing for them to be perfectly valid, and many
other changes as well.

http://www.visual-freepro.org/wiki/index.php/VXB++

So your problem with C is that it's too stringent. Hmmm.

[...]
I don't expect to get anywhere trying to change anything in C. :) It's why
I'm moving to my own language. I hit the GCC group a while back asking for
the self() extension, which allows recursion to the current function without
it being explicitly named or even explicitly populated with parameters. They
said it was not a good idea. I asked for something else (can't remember what)
and they said the same. So ... it was fuel for me to get started on my own.

Good luck with that. But if you're giving up on C and inventing your
own new language, comp.lang.c is not the best place to discuss it.
 

Rick C. Hodgin

Yes, and that's (very marginally) easier to do if you only have to look
for '\n' rather than '\r\n' - which would be the case if you used text
mode rather that binary mode. Your input files file could still have
'\r\n', and your output files would still have '\r\n': text mode takes
care of those things for you automatically. However, internal to your
program you can match those sequences just by searching for '\n'.

My algorithm actually looks for either ASCII-10 or ASCII-13, in any order, and
then looks at the next character. If it's the corresponding character, as in "\r\n"
or "\n\r", then it accepts the pair as a single line ending. If it isn't, it treats
the first character as a single-character line ending and continues processing. This allows
combinations like "\r\r" or "\n\n" to be recognized as two lines.
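
In sketch form, that logic amounts to something like this (the function name
is made up for the example):

#include <stddef.h>

/* A CR or LF ends a line; if it is immediately followed by the *other*
 * character ("\r\n" or "\n\r") the pair is consumed as one line ending,
 * so "\r\r" and "\n\n" count as two lines. */
size_t count_line_endings(const char *buf, size_t len)
{
    size_t lines = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r' || buf[i] == '\n') {
            lines++;
            if (i + 1 < len &&
                (buf[i + 1] == '\r' || buf[i + 1] == '\n') &&
                buf[i + 1] != buf[i])
                i++;   /* swallow the partner of a two-character ending */
        }
    }
    return lines;
}
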
...
It's not even close to universal - it's specific to DOS/Windows and few
other places.

It's pretty universal ... "in Windows."
Actually, it is more work than that, as this thread has already shown.
As a side effect of that decision, you've had to type extra '\r'
characters in your string literals. As far as I can see, the sole effect
of your decision is that you have to occasionally type "\r\n" where you
otherwise would have been able to type "\n". You haven't identified a
single compensating advantage, not even one tiny enough to make up for
that admittedly very minor disadvantage.

I don't have to type in the extra "\r", but I choose to do so because when we
do periodically bring up the text files in editors that care about the line
ending combination, it doesn't generate that error. It works perfectly fine
with or without the "\r" ... it's just done as a nicety.
And it wouldn't make your program any more complicated (slightly less
so, in fact) to re-write it to work with text mode. If you did so, it
would continue to work, without modification to the line ending token,
even if someone someday decides to try porting it to a platform using a
different convention for line endings.

Perhaps. But it's not important enough for me for it to be an issue. I've
already coded for all combinations of line endings. That part is coded. At
this point it would be more work for me to retrofit it. :)
That's a recipe for locking your code to a single platform. If it
simplified your code in any way, that might make sense for code that you
are certain will never be ported anywhere else (though such certainty is
often delusional). However, it actually makes your code (very slightly)
more complicated.

I use binary files with this logic on Linux as well. It works the same. My
logic accounts for combinations that I've seen, so any combination of \r or
\n, repeated or not repeated, all parses out properly.

Best regards,
Rick C. Hodgin
 

James Kuyper

Your explanation as to why it was setup the way it was makes sense. I still
would've opted for it to not continue over time, but to be allowed for some
while, ...

If you're only going to deprecate it now, and remove it later, why even
create it in the first place? If it makes sense to remove it, it makes
even more sense not to create it. The arguments in favor of allowing
the extra comma were not time dependent - it wasn't a matter of old
legacy code, but of current desires of those who like to write source
code generators.
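
The convenience being argued for is that a generator can emit every element
the same way, comma included, without special-casing the last one. For
example (a made-up array, purely illustrative):

/* Every line below can come from the same "emit one element" routine;
 * the allowance is what makes the comma after the last element legal. */
static const int sensor_ids[] = {
    101,
    102,
    103,
};
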
... and as I say a prior version of a compiler could be used to parse those
old generated files with a common object format that could be linked together
for the foreseeable future, just by maintaining compatibility with that obj
file format.

It has nothing to do with object files; the same object file will be
created with or without the extra comma.
I doubt people today are using the same generated source code files they were
back then.

I'm sure that some are; code has a long range of lifetimes, and some
code that is still in use has been around a lot longer than you seem to
consider likely. However, more important to the argument I was
describing (NOT endorsing!) is the fact that new source code generators
are being created all the time, and newly generated source code is also
being created. If the desires of the creators of those generators are
not worth catering to now, then it never made sense to cater to those
desires (I might have some sympathy with that conclusion - but you're
the one who said it "makes sense", not me).
 

Rick C. Hodgin

In your proposed C-like language, what would this snippet print?
for (int i = 0; i < 2; i ++) {
    char *s = "hello";
    if (i == 0) {
        s[0] = 'H';
    }
    puts(s);
}

In my proposed language, it would print "Hello" both times because the
char* s definition would've been pulled out of the loop and defined as a
function variable.

That's fine; if you don't want to write code like that, you don't have
to. But I didn't ask how you'd re-write it; I asked how *that code*
should behave.

I answered you. How should it behave?

In my compiler, I would pull the variable out and make it a function-variable
defined at the top, so it would've been altered the first time through and
both times would print Hello.
You're proposing (I think) a change to the language. That change would
affect compiler writers as well as developers. A compiler writer needs
to do *something* with the specific code I wrote above.

No ... I'm creating my own new language, RDC, which is C-like, but dumps a
lot of what I view as "hideous baggage left over from a bygone era" ... while
also adding a lot of new features I see as looking to the future of multiple
cores, GUI developer environments, touch screens, eventual 3D interfaces, and
more.
The language definition can either:
1. Define clearly how the code behaves;
2. State that the behavior is unspecified, undefined, or
implementation-defined; or
3. Introduce a new rule making the above code a constraint violation or
syntax error.
Which of those do you advocate? (Any suggestion that declarations in
nested blocks should be banned is a non-starter; that's been a feature
of C and its predecessors as far back as I can find documentation, and a
lot of existing code depends on it.)

I choose 1 in general, with a periodic injection of 2.
[...]
Exactly. So, you don't code that way. :)
I don't code that way because string literals are read-only.

Well ... They shouldn't be. :)
Macros do not add functionality. They have to expand to *something*.

Call it something else then. I would introduce the cask and be done:

char* list[] = { (|rw|)"foo" };

The (|rw|) cask is injected as a single override in the GUI, and indicates
that the following token is to be read/write.
Are you aware that the 2011 ISO C standard includes a specification of
threading? I suggest you study it before proposing changes.

I'm not proposing changes to C. These are my extensions to a C-like language
called RDC (Rapid Development Compiler) which is very C-like, but it is not C.
So your problem with C is that it's too stringent. Hmmm.

In some areas, yes. I also don't believe in going deeper than a pointer to
a pointer. I think if you're coding further out than that you're probably
doing something wrong.

I come from an assembly background, and I desire to give the developers a full
set of tools they can use, leaving them, as competent human beings and skilled
developers, to best make the use of those tools. Compiler warnings will exist
where many errors do today ... but so long as there's logic in what's being
done, as in:

int i;
int* iptr;
char* cptr;

i = 5;
iptr = &i;
cptr = iptr; // No cast, no error, because it's simply pointer to pointer
             // and valid, but the compiler would generate a warning.
*cptr = '2';

My compiler won't force a cast. It will generate a warning, but no more than
that. :) The developer should have all of the tools necessary, and without
clunky syntax hoops to jump through (unions galore, and so on).

Only if something weird happens will they get an error:
*cptr = "Hello there, Billy!";

Error! :)
Good luck with that. But if you're giving up on C and inventing your
own new language, comp.lang.c is not the best place to discuss it.

Agreed. This is all just back story as to why I desire to have read/write
string literals in all cases unless explicitly cast as const.

Best regards,
Rick C. Hodgin
 

Rick C. Hodgin

If you're only going to deprecate it now, and remove it later, why even
create it in the first place? If it makes sense to remove it, it makes
even more sense not to create it. The arguments in favor of allowing
the extra comma were not time dependent - it wasn't a matter of old
legacy code, but of current desires of those who like to write source
code generators.

I would've deprecated it back then had I been the decision maker. Today I
will not support it. In the future, if I change my mind on something that I
initially introduce, I will deprecate it and phase it out over time, but I
won't phase out anything unless there's some exceedingly valid reason to do
so, such as we've moved from binary computers to quantum computers, or such.
Would have to be major.
It has nothing to do with object files; the same object file will be
created with or without the extra comma.

Yes, but the parsing engine is the compiler, which would've read that syntax
in the beginning, and generated the object file. That old version of the
compiler that supported the extra comma syntax could be used well into the
future as new compilers are written which handle it without the extra comma
allowance. In that way, legacy code that cannot be changed can still be
supported through the object file format of the code generated by the compiler
which supported it.
I'm sure that some are; code has a long range of lifetimes, and some
code that is still in use has been around a lot longer than you seem to
consider likely. However, more important to the argument I was
describing (NOT endorsing!) is the fact that new source code generators
are being created all the time, and newly generated source code is also
being created. If the desires of the creators of those generators are
not worth catering to now, then it never made sense to cater to those
desires (I might have some sympathy with that conclusion - but you're
the one who said it "makes sense", not me).

Yes. I would stand up in front of all of them in a large room and say, "NO!
YOU CANNOT DO THIS ANY LONGER. THERE ARE BETTER WAYS. CLEARER PATHS. YOU
DON'T NEED TO WALLOW IN EXTRA COMMA LAND ANY LONGER. COME OUT AND BE FREE!"

And I think I would get a standing ovation. Perhaps not.

Best regards,
Rick C. Hodgin
 

James Kuyper

I would've deprecated it back then had I been the decision maker. ...

That's what I don't understand - why introduce a new feature as
"deprecated"? Or, perhaps, by "back then", you're referring to some
particular time after that feature was first introduced? If so, what
time was that?
... Today I
will not support it. ...

I don't see the distinction - deprecating a feature is definitely not
supporting it.

....
allowance. In that way, legacy code that cannot be changed can still be
supported through the object file format of the code generated by the compiler
which supported it.

Well, legacy code was never the primary issue. It was people wanting to
generate new code with that feature.

....
Yes. I would stand up in front of all of them in a large room and say, "NO!
YOU CANNOT DO THIS ANY LONGER. THERE ARE BETTER WAYS. CLEARER PATHS. YOU
DON'T NEED TO WALLOW IN EXTRA COMMA LAND ANY LONGER. COME OUT AND BE FREE!"

And I think I would get a standing ovation. Perhaps not.

Certainly not from the people who were requesting the feature. Those
words might make them doubt your emotional stability, but they don't say
anything likely to change their minds.
 

Rick C. Hodgin

allowance. In that way, legacy code that cannot be changed can still be
Well, legacy code was never the primary issue. It was people wanting to
generate new code with that feature.

AH! I misunderstood. No. In that case I never would've introduced it.
Certainly not from the people who were requesting the feature. Those
words might make them doubt your emotional stability, but they don't
say anything likely to change their minds.

They wouldn't be the first to doubt my emotional stability. :)

Best regards,
Rick C. Hodgin
 

Öö Tiib

That's a recipe for locking your code to a single platform. If it
simplified your code in any way, that might make sense for code that you
are certain will never be ported anywhere else (though such certainty is
often delusional). However, it actually makes your code (very slightly)
more complicated.

Software can use text mode only for files produced by itself for itself
on the same platform. That is a rather narrow corner case today.

Life is complicated and the world is interconnected, so the varying
line endings of different platforms are a nuisance that one has to
deal with by supporting them all. If software must work on Mac,
Windows and Linux and should eat text files produced by itself
and other text editors on Mac, Windows and Linux, then
"always binary" is a good choice.
 

Kaz Kylheku

Why do we have "void function(void)" when "function()" would work
sufficiently at that level in a source file?

The void type was introduced by C++, because Stroustrup wanted stronger type
safety. C++ introduced void * pointers, and using void to declare function
returns.

However, in C++, a function with no parameters is just (); it does not
mean "unspecified parameters".

The C people "ported" void into C, and invented the (void) hack to mean "empty
parameter list", so that () could continue to mean "unspecified number of
parameters" for compatibility with the old-style C that is described in
the first edition of the Kernighan and Ritchie text.
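
In other words, under the C rules being described here (a compile-only
illustration; the names are made up):

void f();        /* old-style declaration: unspecified number and types of parameters */
void g(void);    /* prototype: takes no arguments at all */

void caller(void)
{
    f(1, 2, 3);  /* accepted - the compiler has no parameter information for f */
    g();         /* fine; g(1) would be a constraint violation */
}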

Then, for better compatibility with C (something they no longer give
a damn about today), the C++ people back-ported the (void) kludge into C++.

It's allowed even in contexts that could never be C, like:

MyClass::MyClass(void) { ... }

Needless to say, don't do this. Only use (void) in C++ that also compiles as C,
such as declarations in header files that are included in both C++ and C.
It's the same here. "Oh, another comma ... was the developer finished? Was
there supposed to be more? What is missing? What was left out? Please ...

You could say the same thing about statement/declaration-terminating
semicolons.
Nobody needs that kind of stress in their life. :)

Decently designed languages do not have comma and semicolon diseases
to begin with.
 

Eric Sosman

That's what I don't understand - why introduce a new feature as
"deprecated"? Or, perhaps, by "back then", you're referring to some
particular time after that feature was first introduced? If so, what
time was that?


I don't see the distinction - deprecating a feature is definitely not
supporting it.

...

Well, legacy code was never the primary issue. It was people wanting to
generate new code with that feature.

...

Certainly not from the people who were requesting the feature. Those
words might make them doubt your emotional stability, but they don't say
anything likely to change their minds.

I'm one who would not readily change his mind, because (in
part) as things stand I can write stuff like:

const char *archiveFormats[] = {
#if CPIO_SUPPORTED
    "cpio",
#endif
#if TAR_SUPPORTED
    "tar",
#endif
#if ZIP_SUPPORTED
    "ZIP",
#endif
#if APK_SUPPORTED
    "apk",
#endif
};

It's *possible* to manage this sort of thing without introducing
an extra comma, but it's ugly as all-get-out:

const char *archiveFormats[] = {
#if CPIO_SUPPORTED
    "cpio"
#if TAR_SUPPORTED | ZIP_SUPPORTED | APK_SUPPORTED
    ,
#endif
#endif
#if TAR_SUPPORTED
    "tar"
#if ZIP_SUPPORTED | APK_SUPPORTED
    ,
#endif
#endif
#if ZIP_SUPPORTED
    "ZIP"
#if APK_SUPPORTED
    ,
#endif
#endif
#if APK_SUPPORTED
    "apk"
#endif
};
 

Kaz Kylheku

I doubt people today are using the same generated source code files they were
back then.

C has built in code-generation: macros.

#define ELEM(A,B,C) { FOO(A), BAR(B), 0, (void *) (C) },

struct foobar array[] = {
    ELEM(3, 2, 4)
    ELEM(1, 2, 3)
};

This re-generates each time you compile it.

I wouldn't write it that way myself; I'd leave the comma out and have:

struct foobar array[] = {
    ELEM(3, 2, 4),
    ELEM(1, 2, 3)
};

yet, there is code like that out there, and probably situations in which
it makes sense to hide the comma in the "macrology".

Conditionally generating a comma in the C macro language is difficult to
impossible, and requiring the macro caller to specify it can sometimes
break the abstraction of the macro.
And I would still argue that it's nothing short of a catering hack
to include the ability to allow buggy code generator logic to pass through.

That is pure nonsense. If the code generator developer knows that commas can
be treated as terminating rather than separating punctuation, then it's fine to
design the code generator logic that way.
I'm sorry, but it's absolutely lame. It was a lame decision, in my opinion,

What's lame is not knowing C, but criticizing it.

Nobody cares what you "would have" done back when these decisions were
made, because back then you had -X years of experience in C.

A good 10-15 years of coding should be required for anyone who is to have
any input on the future direction of a language.
 

James Kuyper

Software can use text mode only for files produced by itself for itself
on the same platform. That is a rather narrow corner case today.

In my experience, the special features of text mode as compared to
binary mode are conventions associated with operating systems. As such,
files adhering to those conventions can be used to communicate between
any two programs compiled for that operating system, whether or not
they're running on the same platforms or different platforms.
I wouldn't be surprised to learn that there are conventions for the
layout of text files that are associated with things other than
operating systems - but offhand I can't think of any.
Life is complicated and the world is interconnected, so the varying
line endings of different platforms are a nuisance that one has to
deal with by supporting them all. If software must work on Mac,
Windows and Linux and should eat text files produced by itself
and other text editors on Mac, Windows and Linux, then
"always binary" is a good choice.

I think using specialized routines to translate line endings (such as
dos2unix) is the more reasonable way to go. Otherwise a file editing
program could end up creating a document containing a mixture of Mac,
Windows, and Linux line endings. Then you have to decide how to
interpret the result: how many line endings does "\r\n\r" encode? How
many does "\n\r\n" encode?

How many programs do you know of that can correctly handle some of the
other, more exotic possibilities in use by currently existing systems,
such as lines whose length is not indicated by a special character at
the end of the line, but by a count at the beginning? Or block-oriented
files where all of the characters between the end of a line and the end
of a block are null (or even more confusing, blanks)? The standard
defines processing of text-mode files in ways that are compatible with
all of those possibilities. As a result, my text-oriented C code doesn't
need to know anything about those possibilities; I just let the
<stdio.h> library take care of it. My code will work on such systems
without modification, and without me having to write it in any way that
is different from the way I would write it if I were only targeting the
unix-like systems where it normally runs.
 

Ian Collins

James said:
In my experience, the special features of text mode as compared to
binary mode are conventions associated with operating systems. As such,
files adhering to those conventions can be used to communicate between
any two programs compiled for that operating system, whether or not
they're running on the same platforms or different platforms.
I wouldn't be surprised to learn that there are conventions for the
layout of text files that are associated with things other than
operating systems - but offhand I can't think of any.

Most if not all of the programmer's editors I've used on Windows
recognise Unix line endings and gcc on Unix recognises Windows endings.
Text mode is something of a curse!
 
