Requesting advice how to clean up C code for validating string represents integer

  • Thread starter robert maas, see http://tinyurl.com/uh3t

Chris Dollin

Richard said:
C&V, please.

I'm not talking about C. That's why I ended "And I'm done.".

[And the lvalue stuff fed into what became denotational semantics,
in which the meaning of constructs is given by equations over values;
hence "there are only values".]
 

robert maas, see http://tinyurl.com/uh3t

From: Keith Thompson said:
As I already told you, it invokes undefined behavior. That means
the standard imposes no requirements.

Clearly this makes it totally useless for validating input.
I'll stick with my use of strscn etc. for semi-manual validation of input.
I suggest you get your own copy of
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
so you can look these things up yourself rather than depending on the
rest of us to do it for you.

I'd be glad to do that if and when you please tell me how to view
such a file in a meaningful way over a VT100 terminal emulator.
In case of further questions as to my situation here, please study:
<http://www.rawbw.com/~rem/NewPub/mySituation.html>
There will be a test on that material the next time you post here.
 

Keith Thompson

These attributions are messed up. I wrote the block of text starting
with "Unfortunately, sscanf()," and the one starting with "As I
already told you". I don't remember who wrote the other stuff.

I think the problem is that you seem to be writing your own
attribution lines. You write, for example:

From: So-and-So <[email protected]>
Text that So-and-So wrote ...

The standard form for attribution lines and quoted text is:

So-and-So said:
> Text that So-and-So wrote ...

Any decent newsreader will do this for you automatically.
Clearly this makes it totally useless for validating input.
I'll stick with my use of strscn etc. for semi-manual validation of input.

I think you're right. You *might* be able to do the job with sscanf()
if you can find out how your implementation handles overflow *and* if
you don't care too much about portability. It's not something I'd
recommend.

There may also be open-source implementations of sscanf() that behave
sensibly on overflow. If you can grab one of them, and if it's
implemented in portable C, you can rename it and use it in your own
code.
I'd be glad to do that if and when you please tell me how to view
such a file in a meaningful way over a VT100 terminal emulator.
In case of further questions as to my situation here, please study:
<http://www.rawbw.com/~rem/NewPub/mySituation.html>
There will be a test on that material the next time you post here.

Sorry, I don't think I'll bother reading that, and I'm not going to
take your test. However much I might sympathize with your situation,
this really isn't the place to discuss it.

I'm posting this through a VT100 terminal emulator, but I'm able to
download and view PDF files on the computer where I'm running the
emulator. You may or may not be able to do something similar.

But perhaps you should try reading this:
<http://groups.google.com/group/comp.lang.c/msg/268fb3f0e9caa789>
It's an article I posted here two weeks ago, and it was a direct
followup to something you posted. In it, I wrote:
| Are you unable to read PDF files? I understand you're using an old
| and limited computer system, but there are cost-free PDF viewers that
| should work for you.
|
| If you can't deal with PDF, search for "n869.txt"; it's a committee
| draft dated Jan 18, 1999. There are a few significant differences
| between n869 and the actual C99 standard.

At least one of the regulars here uses n869.txt as his primary
reference.

Also, Adobe Reader is able to export a PDF file as plain text, and
I've just created an ASCII copy of n1124.pdf (n1124.txt). If you're
interested, let me know so we can figure out how to get it to you.
It's about a megabyte of text, or about 25% of that if I compress it.
The formatting isn't as nice as n869.txt; it's up to you to decide
which one you want to use.

I've dropped alt.support.shyness from the Newsgroups line.
 

CBFalconer

.... snip ...


I'd be glad to do that if and when you please tell me how to view
such a file in a meaningful way over a VT100 terminal emulator.
.... snip ...

You can get a bz2 compressed version of N869.TXT from:

<http://cbfalconer.home.att.net/download/>

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
 

Nelu

CBFalconer said:
.... snip ...

You can get a bz2 compressed version of N869.TXT from:

<http://cbfalconer.home.att.net/download/>

There's also a program called pdftotext that I used once to
convert n1124.pdf to txt. If one doesn't mind some strange
characters that remain in the txt file after the conversion then
it does a pretty good job.
 

CBFalconer

Keith said:
.... snip ...

I've dropped alt.support.shyness from the Newsgroups line.

I had to do likewise, since the news server doesn't recognize that
group and rejects the whole post. This just illustrates the
foolishness of untrammelled cross-posting.

 

Robert Maas, see http://tinyurl.com/uh3t

From: Mark McIntyre said:
It defines it as "a region of data storage... the contents of which
can represent values". This seems an entirely reasonable definition to
me. As someone has already said, the word has a wide variety of exact
meanings in many walks of life, so being precise is /not/
inappropriate.

Let me check if I understand you correctly: You're saying that it's
perfectly good, even desirable, to use a word (in a supposed
technical definition of a programming language) which in ordinary
usage has a "wide variety of exact meanings in many walks of life",
and to utterly fail to specify **which** of those wide variety of
meanings is intended, thereby leaving that part of the
specification [cough, cough] grossly ambiguous?

My opinion on this matter is the exact opposite of that. That
whenever a word in ordinary usage has a wide variety of different
meanings, then such a word should *not* be used in any technical
specification without first defining which of the various meanings
is intended.

For example, do you see any introductory mathematical text on
"group theory" starting off by saying "a group is whatever you
thought it meant when you were in sensitivity training or wherever
your vibes are" and then trying to derive any useful mathematical
consequence from such a starting point?

If a "region" can mean something like the Western USA, or the Great
Plains of the USA, which partly overlap with the former, or the
Mississippi River drainage area, which greatly overlaps with the
great plains, or the midwest, which overlaps with the drainage
area, or the "south" which overlaps with both the great plains and
the drainage area, or the East Coast which overlaps with the
"south", or New England which is a part of the East Coast, or the
USA which includes disconnected states Alaska and Hawaii plus
various disconnected islands such as Catalina, or the British
Empire at its height which included disconnected parts all around
the Globe, then how can that be of any value in pinning down
whether some random set of memory cells in a C core image, not
necessarily contiguous, not necessarily having any bearing on any
formal C constructs such as arrays or structs, is or is not a
"region" for the purpose of the specification [cough cough] of
"lvalue"? Is an "lvalue" just any random subset of physical or
virtual memory?? The spec, as you interpret it, seems to be like
that, totally useless.
This is pretty close to the C definition,

What the **** C definition are you talking about? It's not in the
formal spec we've been discussing here.
if you think it through.

Please explain what thinking can convert what you've been saying
about "lvalue" (refers to any "region" which can be "anything that
might be called a 'region' in ordinary informal
usage outside the field of programming languages")
into anything even remotely similar to the original lisp sense I
cited.
I don't think Lisp required the block of memory to be complex.

I agree, but that's not relevant. The point is that the memory
associated with an object has a single toplevel "handle" by which a
user-level program can gain access to exactly that memory occupied
by that object and nothing else.
(At the machine level, and in MacLisp, it's possible to convert a
handle into a number representing the machine address of the entry
point of that object, then perform arithmetic to get any other
random numeric value, then convert that number back into a "handle"
onto some random new starting point where a user-level program may
or may not find something resembling a consistent object. That's
not what counts for user-level program getting access via the handle.)

Now in C you are allowed to do exactly that sort of machine-level
manipulation of pointers, and thereby gain access to places in
memory that really aren't properly accessible via formal definition
of the pointer as pointing to an *object*. But if you write code
that performs such hackery, it won't necessarily produce consistent
results on different implementations, or even at different times
within the same implementation.

Now if the C standard were to state clearly what constitutes
"reasonably accessible" per staying within a single allocation unit
(simple variable, array, struct, malloc-block, etc.) plus any
additional allocation units you can get by following pointers from
your starting unit, and define both objects and lvalues
consistently with respect to that understanding, and thus
guaranteeing (as much as possible) that programs restricted to such
access would be portable, I would like that. But the "new and
wonderful" ANSI/ISO C standard we've been discussing doesn't do
that, so I rate it as crap in this respect.
The standard itself doesn't define region. You would have to check
back in ISO/IEC 2382-1:1993 "Information Technology - Vocabulary
Part 1: Fundamental Terms" to see what ISO defined it as.

Your browser may not have a PDF reader available. Google recommends
visiting our text version of this document.
Clicking on the link "text version" doesn't work. It just takes me
to another listing of search results. Google appears to be broken today.

I looked through the first two groups of search results, but the
only plain-text documents I could find were FAA regulations and
fisheries, so obviously Google is giving me partial matches which
are totally unrelated to what I asked for.

If you know where there's a text version available please post the URL.
if a particular compiler optimizes space by moving the long int
array ahead of either of the short int arrays to reduce amount of
padding needed to respect long boundaries, so that
a[7] a[8] c[0] c[1] c[2]
form a contiguous block of memory, is that considered a "region"
hence an "object"??
There's nothing which requires these to be contiguous, so I can't
see how they can be considered either an object or a single region.

The San Francisco Bay Area is considered a region, despite that
fact that the parts West of the San Andreas Fault are slowly moving
north-west relative to the rest of the Bay Area. There's nothing
requiring the SF Bay Area to be contiguous, even though at the
moment it's contiguous, more or less anyway. So your appeal to
ordinary common usage outside the field of programming languages,
whereby a bunch of land that is merely temporarily and incidentally
contiguous, neither permanently nor required to be contiguous,
contradicts your claim that my example above wouldn't count as a
"region" hence an "object" per the shoddy ANSI/ISO spec [cough
cough].
FWIW, the IAU had no need to define that since it can be inferred from
an amazing property known as "common sense".

Common Sense says that Neptune hasn't cleared its neighborhood of
Pluto, and the Earth hasn't cleared its neighborhood of Luna, and
Jupiter hasn't cleared its neighborhood of Io or Europa or Ganymede
or Callisto. So none of those is a planet per IAU's 2006 definition???
 

Robert Maas, see http://tinyurl.com/uh3t

From: Keith Thompson said:
And what exactly do you mean by "ANSI"?

Whatever is the **current** (latest official) standard.
At the moment, that seems to be C99.

Or whatever the GNU C compiler uses when the -ansi switch is turned
on. From the man pages here:
-ansi Support all ANSI standard C programs.
This turns off certain features of GNU C that are incompatible
with ANSI C, such as the asm, inline and typeof keywords, and
predefined macros such as unix and vax that identify the type of
system you are using. It also enables the undesirable and
rarely used ANSI trigraph feature, and disallows `$' as part of
identifiers.
-pedantic
Issue all the warnings demanded by strict ANSI standard C; re-
ject all programs that use forbidden extensions.
Valid ANSI standard C programs should compile properly with or
without this option (though a rare few will require `-ansi').
However, without this option, certain GNU extensions and tradi-
tional C features are supported as well. With this option, they
are rejected. There is no reason to use this option; it exists
only to satisfy pedants.
Please tell me whether GNU C MAN pages are referring to C99 or what?

Can you forgive me for reading the GNU C MAN pages and thinking the
jargon used in them was in any way acceptable here in the
newsgroup? (Especially after somebody here suggested that I use
both the -ansi and -pedantic switches, and somebody here suggested
I read the man pages?) I suspect you-all here have deliberately
baited me into looking at those man pages and using the jargon
therein, just so you could then slap me down for using it.
The current official C standard, C99, was issued by ISO (and
later adopted by ANSI).

So it would seem reasonable, in articles posted any time from about
2000 to the present and for a while in the future, to call it "ANSI
C", no??
The previous standard, which is still in wide use, is C90, also
issued by ISO (and adopted by ANSI).

I would consider that "old ANSI C", specifically "previous ANSI C".
I suggest avoiding the use of "ANSI" as an adjective; many people
still use "ANSI C" to refer to the language defined by the ANSI
C89 and ISO C90 standard documents, but strictly speaking that
usage is incorrect.

So you're forbidding me to use perfectly correct jargon, just
because some other people use the same jargon in an incorrect way?
So I'm not allowed to refer to this newsgroup as "comp.lang.c"
because some other people call it by some incorrect name instead?
I'm not allowed to refer to this country where California is
located as "The United States of America", because somebody else
uses the same term for something entirely different?
If you instead refer to "C90" or "C99", you avoid the ambiguity.

What about C89 as an alternate designation for C90?
(ISO standardized it in 1989, but ANSI didn't join in approving it
until 1990, right. So from ISO's point of view, it's really C89?)

What term should I use to refer to the part of C that hasn't
changed from C89/C90 to C99, such as that if M and N are short
integers then M+N is an expression, of type short integer, that
results from adding M and N while discarding overflow (thus
wraparound)? For the most part, it's *that* language, the common
subset of C89/C90 and C99, that I want to describe in my
Cookbook/Matrix WebTree.
On the web page, I saw, there were two functions marked "GNU-c",
both of them incorrectly. Both functions are defined by C99, but
not by C90.

Ah, that was because those functions were in both the original K&R
and also in C89/C90, and prior to 1999 were also in GNU C as a
nonstandard add-on, and I wasn't aware that C99 existed at the time
I wrote that. How do those paragraphs in my Matrix look now? Should
I perhaps flag them as C99, to warn users of older C systems?

As a more general question, which paragraphs of my Matrix document
currently define something that is in C99 but not C89/C90, such
that a warning about that would be appropriate?
<OT>
The phrase "the GNU C version of the stdlib library" doesn't make much
sense, unless you're referring to glibc. I use gcc on Linux, where
the C runtime library is glibc. I also use gcc on Solaris, where the
C runtime library is the one provided by Solaris. gcc is a compiler,
not a complete implementation.
</OT>

How is that off-topic for comp.lang.c or for discussion of errata
in my Cookbook/Matrix WebTree?

In any case, when I issue a gcc command on FreeBSD Unix, or on
RedHat Linux, in either case the gnu compiler goes ahead and links
in various library modules based on headers that I've specified. I
have *not* specified which version of the library is to be used,
such as specifying the directory-path in quotes. Rather I've
specified the generic header name in pointy brackets. Somehow gcc
picks one particular library to load from, without me having to
tell it. Apparently gcc is pre-configured (by whoever originally
installed it on that machine) to automatically use one particular
library, whether it's glibc or something else. From the point of
view of the application programmer, the compiler and whatever
loader the compiler selects and whatever libraries the compiler
selects are all part of the compiler system which is used
transparently to compile source files and build executables.

What jargon would *you* use to include the compiler and the loader
and the libraries together as a transparent unit??
Be careful with the term "original c" (or, preferably, "original C").
Versions of C existed long before the first ANSI standard.

I thought it was clear already: "original c" means K&R.
Yes, that was long before any ISO or ANSI standard C.
Nobody disputes that fact. (Well, except for somebody like |-|erc.)
n1124.pdf, referenced above, is the C99 standard with two
Technical Corrigenda merged into it. Any post-C99 changes are
marked with change bars.

I have no way to view that here:
<http://www.rawbw.com/~rem/NewPub/mySituation.html>
I don't suppose you know of a plain-text or HTML version?
Why is it that the Common Lisp folks (well one of them, KMP,
anyway) have produced a nice HTML version of the ANSI standard:
<http://www.lispworks.com/documentation/HyperSpec/Front/>
and Sun Microsystems has produced equally good HTML documentation
for all their API, which is the default standard for Java, but the
idiots who have access to the ANSI standard (1999) C haven't
figured out how to do a similar thing for it? Is it really that
hard for somebody with PDF access to copy the text from the ANSI
C99 standard into a Web framework?
 

Richard Heathfield

Robert Maas, see http://tinyurl.com/uh3t said:
If a "region" can mean something like the Western USA, or the Great
Plains of the USA, which partly overlap with the former, or the
Mississippi River drainage area, which greatly overlaps with the
great plains, or the midwest, which overlaps with the drainage
area, or the "south" which overlaps with both the great plains and
the drainage area, or the East Coast which overlaps with the
"south", or New England which is a part of the East Coast, or the
USA which includes disconnected states Alaska and Hawaii plus
various disconnected islands such as Catalina, or the British
Empire at its height which included disconnected parts all around
the Globe,

How many of these are regions of data storage, the contents of which
can represent values? If any of them are, then they are objects as far
as the C Standard is concerned. (And I don't have a problem with that.)

(At the machine level, and in MacLisp, it's possible to convert a
handle into a number representing the machine address of the entry
point of that object, then perform arithmetic to get any other
random numeric value, then convert that number back into a "handle"
onto some random new starting point where a user-level program may
or may not find something resembling a consistent object. That's
not what counts for user-level program getting access via the
handle.)

Now in C you are allowed to do exactly that sort of machine-level
manipulation of pointers,

No, you're not. You can do the first bit - converting a pointer value
into an integer - but the result of doing so is implementation-defined
and need not be meaningful in terms of object locations. But the rest?
No, as soon as you do that, you're firmly in Undefined Behaviour
territory. Yes, sure, on some machines you can get away with it. On
others, however, you can't. C doesn't offer any guarantees in that
regard.
Now if the C standard were to state clearly what constitutes
"reasonably accessible" per staying within a single allocation unit
(simple variable, array, struct, malloc-block, etc.) plus any
additional allocation units you can get by following pointers from
your starting unit, and define both objects and lvalues
consistently with respect to that understanding, and thus
guaranteeing (as much as possible) that programs restricted to such
access would be portable, I would like that.

Then you'll like the C Standard, because it does state this very
clearly, provided you know what you're talking about...
But the "new and
wonderful" ANSI/ISO C standard we've been discussing doesn't do
that, so I rate it as crap in this respect.

....and, from the above statement, I deduce that you don't.
 

Erik de Castro Lopo

Robert said:
So it would seem reasonable, in articles posted any time from about
2000 to the present and for a while in the future, to call it "ANSI
C", no??

ANSI is an American standard (not that there is anything wrong with
that :)) but since this is an international newsgroup, the
international ISO standard should probably take precedence.
I would consider that "old ANSI C", specifically "previous ANSI C".

How about "ISO C90 standard"?

Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"So if you were to say 'Microsoft,' I'd say 'Where? Let me grab
me my stake and crucifix!'." -- Bridget Kulakauskas
 

Flash Gordon

Robert Maas, see http://tinyurl.com/uh3t wrote, On 18/03/07 06:52:
Whatever is the **current** (latest official) standard.
At the moment, that seems to be C99.

Was making that clear really so difficult?
Or whatever the GNU C compiler uses when the -ansi switch is turned
on. From the man pages here:

Please tell me whether GNU C MAN pages are referring to C99 or what?

If you read the rest of the GCC documentation you will find one of two
things:
1) It tells you that it is referring to C89
2) It is a version pre-dating C99 which would therefore obviously not be
referring to C99.

Try reading enough of the documentation to know how to use the tools you
are using, i.e. most of it.
Can you forgive me for reading the GNU C MAN pages and thinking the
jargon used in them was in any way acceptable here in the
newsgroup? (Especially after somebody here suggested that I use
both the -ansi and -pedantic switches, and somebody here suggested
I read the man pages?) I suspect you-all here have deliberately
baited me into looking at those man pages and using the jargon
therein, just so you could then slap me down for using it.

If you keep suggesting people are out to get you then it is quite likely
to become self fulfilling.

The suggestion to read the documentation for GCC was simply so that you
know how to use the tools you are using. Nothing more and nothing less.
Do you think it is a bad idea to know the tools you are using?
So it would seem reasonable, in articles posted any time from about
2000 to the present and for a while in the future, to call it "ANSI
C", no??

It is reasonable to allow for historic usage and be clear when it matters.
I would consider that "old ANSI C", specifically "previous ANSI C".

It is also old ISO C.
So you're forbidding me to use perfectly correct jargon, just
because some other people use the same jargon in an incorrect way?

Since when does the word "suggest" mean forbid?

What about C89 as an alternate designation for C90?

Except when referring to specific section numbers, yes.
(ISO standardized it in 1989, but ANSI didn't join in approving it
until 1990, right. So from ISO's point of view, it's really C89?)

Wrong, and I'm sure that Keith would not have made that mistake.
What term should I use to refer to the part of C that hasn't
changed from C89/C90 to C99, such as that if M and N are short
integers then M+N is an expression, of type short integer, that
results from adding M and N while discarding overflow (thus
wraparound)? For the most part, it's *that* language, the common
subset of C89/C90 and C99, that I want to describe in my
Cookbook/Matrix WebTree.

Where it does not matter which standard, using the term ANSI C or, by
preference in an international setting, ISO C is not unreasonable. Where
it matters that it is the common subset, which *does* matter when
writing reference documentation (and sometimes matters here), you should
be specific about it.

In any case, when I issue a gcc command on FreeBSD Unix, or on
RedHat Linux, in either case the gnu compiler goes ahead and links
in various library modules based on headers that I've specified. I

<snip more ranting>

Stop talking complete and utter rubbish. It has been pointed out to you
more than once already that the headers do NOT control what happens when
you link. I have NEVER come across a compiler where including a header
causes the relevant library to be linked in.

Your situation is your problem.
I don't suppose you know of a plain-text or HTML version?

<snip ranting>

Such things have been posted here during the time you have been here.
 

pete

Robert Maas, see http://tinyurl.com/uh3t wrote:
What term should I use to refer to the part of C that hasn't
changed from C89/C90 to C99, such as that if M and N are short
integers then M+N is an expression, of type short integer, that
results from adding M and N while discarding overflow (thus
wraparound)?

The term you should use is:
"I make so many mistakes that they almost cancel each other out"

If M and N are short integers,
then M+N is an expression of type int and not of type short.

If (M+N) overflows, then the result is undefined.

ISO/IEC 9899:1999 (E)
6.3.1 Arithmetic operands
6.3.1.1 Boolean, characters, and integers
2 The following may be used in an expression
wherever an int or unsigned int may be used:
— An object or expression with an integer type whose integer
conversion rank is less than the rank of int and unsigned int.
If an int can represent all values of the original type,
the value is converted to an int;
otherwise, it is converted to an unsigned int.
These are called the integer promotions.48)
48) The integer promotions are applied only:
as part of the usual arithmetic conversions,
to certain argument expressions,
to the operands of the unary +, -, and ~ operators,
and to both operands of the shift operators,
as specified by their respective subclauses.
6.3.1.8 Usual arithmetic conversions
1 Many operators that expect operands of arithmetic type
cause conversions and yield result types in a similar way.
The purpose is to determine a common real type for the operands
and result. For the specified operands, each operand is converted,
without change of type domain, to a type whose corresponding
real type is the common real type.
Unless explicitly stated otherwise,
the common real type is also the corresponding real type of
the result, whose type domain is the type domain of the operands if
they are the same, and complex otherwise.
This pattern is called the usual arithmetic conversions:

Otherwise, the integer promotions are performed on both operands.

6.5.6 Additive operators
Semantics
4 If both operands have arithmetic type,
the usual arithmetic conversions are performed on them.

3.4.3
1 undefined behavior
behavior, upon use of a nonportable or erroneous program construct
or of erroneous data,
for which this International Standard imposes no requirements
2 NOTE Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results,
to behaving during translation or program execution in a documented
manner characteristic of the environment
(with or without the issuance of a diagnostic message),
to terminating a translation or execution
(with the issuance of a diagnostic message).
3 EXAMPLE An example of undefined behavior is the behavior on integer
overflow.

/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
    printf("sizeof ((char)0 + (char)0) is %u.\n",
           (unsigned)sizeof ((char)0 + (char)0));
    return 0;
}

/* END new.c */
 

pete

pete said:
The term you should use is:
"I make so many mistakes that they almost cancel each other out"

If M and N are short integers,
then M+N is an expression of type int and not of type short.

If (M+N) overflows, then the result is undefined.

ISO/IEC 9899:1999 (E)

I will now provide references from ISO/IEC 9899: 1990

3 Definitions and conventions
Examples
2. An example of undefined behavior
is the behavior on integer overflow.

5.1.2.3 Program execution
2. In executing the fragment
char c1, c2;
/*...*/
c1 = c1 + c2;
the “integral promotions” require that the abstract machine
promote the value of each variable to int size
and then add the two ints and truncate the sum.
Provided the addition of two chars can be done without creating
an overflow exception, the actual execution need only produce
the same result, possibly omitting the promotions.

/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
    char c1 = 0;
    char c2 = 0;

    printf("sizeof(c1 + c2) is %u.\n", (unsigned)sizeof(c1 + c2));
    return 0;
}

/* END new.c */
 

Mark McIntyre

Robert Maas, see http://tinyurl.com/uh3t wrote:

(I wrote, some weeks ago)
Your browser may not have a PDF reader available. Google recommends
visiting our text version of this document.
Clicking on the link "text version" doesn't work. It just takes me
to another listing of search results. Google appears to be broken today.

Apparently you wrote this in response to something I wrote. I have no
clue what the above refers to, or why google pointed you at some
random text link.
If you know where there's a text version available please post the URL.

A text version of ISO/IEC 2382? No idea. Buy one from ISO?
The San Francisco Bay Area is considered a region,

You're pretty much making my point by this bollocks. The Bay Area may
be a postal region or tax region (for all I know) but it's not a
geological region, nor yet a weather region.
Common Sense says that Neptune hasn't cleared its neighborhood of
Pluto, and the Earth hasn't cleared its neighborhood of Luna, and
Jupiter hasn't cleared its neighborhood of Io or Europa or Ganymede
or Callisto. So none of those is a planet per IAU's 2006 definition???

Apparently you're unfamiliar with the meaning of yet another phrase...
:)


--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 

Keith Thompson

Whatever is the **current** (latest official) standard.
At the moment, that seems to be C99.

Ok. You need to be aware, though, that a lot of people use the term
"ANSI C" to refer to the language defined by the 1989 ANSI standard.
Which is why I refer to "C90" or "C99", to avoid the potential
ambiguity.
Or whatever the GNU C compiler uses when the -ansi switch is turned
on. From the man pages here:
[snip]

gcc's "-ansi" option is equivalent to "-std=c90" or
"-std=iso9899:1990". (I've suggested that "-std=c89" should also be
accepted.) So, as you can see, the term "ANSI C" is ambiguous.
Please tell me whether GNU C MAN pages are referring to C99 or what?

The gcc documentation refers to C89 when it discusses the "-ansi"
option.
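
A quick way to check which standard a particular set of options actually selects is to look at the predefined macro __STDC_VERSION__. What follows is only a minimal sketch; the function name c_standard_version is made up for illustration. C89/C90 does not define the macro at all, Amendment 1 defines it as 199409L, and C99 defines it as 199901L.

```c
/* Hypothetical helper, for illustration only: reports the C standard
   the translation unit was compiled under, as far as the predefined
   macros reveal it. */
long c_standard_version(void)
{
#if defined(__STDC_VERSION__)
    /* 199409L for C90 plus Amendment 1, 199901L for C99;
       later standards use larger values. */
    return __STDC_VERSION__;
#else
    /* Macro not defined: C89/C90, or a pre-standard compiler. */
    return 0L;
#endif
}
```

Compiled with gcc's "-ansi", this should return 0, since the macro is left undefined in that mode; with "-std=c99" it should return 199901L.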
Can you forgive me for reading the GNU C MAN pages and thinking the
jargon used in them was in any way acceptable here in the
newsgroup?

Forgive you? Sure, whatever, it's an easy mistake to make.
(Especially after somebody here suggested that I use
both the -ansi and -pedantic switches, and somebody here suggested
I read the man pages?) I suspect you-all here have deliberately
baited me into looking at those man pages and using the jargon
therein, just so you could then slap me down for using it.

Paranoid nonsense. I don't think anybody here cares enough to bait
you like that.
So it would seem reasonable, in articles posted any time from about
2000 to the present and for a while in the future, to call it "ANSI
C", no??

It would *seem* reasonable, but as I've explained, it does cause
problems.
I would consider that "old ANSI C", specifically "previous ANSI C".

Or you could call it "ANSI C89".
So you're forbidding me to use perfectly correct jargon, just
because some other people use the same jargon in an incorrect way?

Forbidding you? What the hell are you talking about? I am not
"forbidding" you to do anything. I am offering you advice, as you can
tell by my use of the word "suggest" above.

[...]
What about C89 as an alternate designation for C90?
(ISO standardized it in 1989, but ANSI didn't join in approving it
until 1990, right. So from ISO's point of view, it's really C89?)

Not quite. ANSI issued a C standard (the first one) in 1989. ISO
adopted it, with some non-substantive changes, in 1990; ANSI then
adopted the 1990 ISO standard as an ANSI standard.

The 1989 ANSI standard and the 1990 ISO standard describe the same
language. You can sensibly refer to that language as either C89 or
C90. (Or you can call it Ralph Kramden if you want to; you don't need
my permission.)
What term should I use to refer to the part of C that hasn't
changed from C89/C90 to C99, such as that if M and N are short
integers then M+N is an expression, of type short integer, that
results from adding M and N while discarding overflow (thus
wraparound)? For the most part, it's *that* language, the common
subset of C89/C90 and C99, that I want to describe in my
Cookbook/Matrix WebTree.

I don't know of a name for the common subset (i.e., the intersection)
of C90 and C99.

Incidentally, I'm not sure what the stuff about M and N is supposed to
mean.
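
For what it's worth, in both C90 and C99 the operands of M + N undergo the integer promotions before the addition, so the expression has type int, not short. Any narrowing happens only when the result is converted back to short, and for an out-of-range value that conversion is implementation-defined rather than guaranteed wraparound. A minimal sketch, with made-up helper names, assuming an implementation where int is wider than short:

```c
#include <stddef.h>

/* Hypothetical helpers, for illustration only. */

/* Size of the expression m + n for two shorts. The operands are
   promoted to int first, so this is sizeof(int), not sizeof(short). */
size_t size_of_short_sum(void)
{
    short m = 1;
    short n = 2;

    return sizeof(m + n);
}

/* The addition itself is carried out in int. Assuming int is wider
   than short, the sum of two shorts always fits, so nothing is
   discarded here. Narrowing would only occur if the caller stored
   the result back into a short. */
int add_shorts(short m, short n)
{
    return m + n;
}
```

On an implementation with 16-bit short and 32-bit int, add_shorts(20000, 20000) yields 40000, a value that no 16-bit short can hold.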
Ah, that was because those functions were in both the original K&R
and also in C89/C90, and prior to 1999 were also in GNU C as a
nonstandard add-on, and I wasn't aware that C99 existed at the time
I wrote that. How do those paragraphs in my Matrix look now? Should
I perhaps flag them as C99, to warn users of older C systems?

I think you mean that they *weren't* in C89/C90. I don't remember
where your Matrix is.
As a more general question, which paragraphs of my Matrix document
currently define something that is in C99 but not C89/C90, such
that a warning about that would be appropriate?

I'm afraid that would require more of my time than I'm currently
willing to spend on this.
How is that off-topic for comp.lang.c or for discussion of errata
in my Cookbook/Matrix WebTree?

It's off-topic for comp.lang.c because it concerns specific
implementations, not the actual language.

[...]
What jargon would *you* use to include the compiler and the loader
and the libraries together as a transparent unit??

"The implementation".
I thought it was clear already: "original c" means K&R.
Yes, that was long before any ISO or ANSI standard C.
Nobody disputes that fact. (Well, except for somebody like |-|erc.)

There were versions of C *before* K&R1. See
<http://cm.bell-labs.com/cm/cs/who/dmr/chist.html>. (Versions of C
prior to K&R1 are now almost entirely of only historical interest.
K&R itself is nearly so; support for at least the C89/C90 standard is
now almost, but not quite, universal.)
I have no way to view that here:
<http://www.rawbw.com/~rem/NewPub/mySituation.html>
I don't suppose you know of a plain-text or HTML version?

This has already been answered several times. Please pay attention.
Once again:

<http://cbfalconer.home.att.net/download/n869_txt.bz2> is a compressed
(with bzip2) plain-text draft. It's from not long before the final
C99 standard; there are a few significant differences, but not many.
(No, I don't have a list.)

n1124.pdf is what I use. Adobe Reader is able to translate it to
plain text. The result isn't as nicely formatted as n869.txt, but it
has the advantage that it includes the full C99 standard plus the two
Technical Corrigenda. I already offered to send you a copy; you'll
need to contact me and let me know how to get it to you (compressed?
if so, how? e-mail? ftp? www?). I reserve the right to withdraw this
offer at some indefinite point in the future.
Why is it that the Common Lisp folks (well one of them, KMP,
anyway) have produced a nice HTML version of the ANSI standard:
<http://www.lispworks.com/documentation/HyperSpec/Front/>
and Sun Microsystems has produced equally good HTML documentation
for all their API, which is the default standard for Java, but the
idiots who have access to the ANSI standard (1999) C haven't
figured out how to do a similar thing for it? Is it really that
hard for somebody with PDF access to copy the text from the ANSI
C99 standard into a Web framework?

Most of us find the PDF version sufficiently useful. Most of those
who prefer something other than PDF probably use n869.txt. If you
don't find either of those usable, alternatives have already been
offered.
 
Robert Maas, see http://tinyurl.com/uh3t

From: Yevgen Muntyan said:
Just don't mention GNU-only functions at all.

I'd like to do that, but I don't know of any search engine where I
type in the name of a C library function and get back the status of
that particular function and the various standards (C99, C89/C90,
K&R) and implementations. So occasionally I will read about some
function, try it, find it works, and describe it in my document,
without any way to know it isn't standard.
Is this how *you* get the documentation? You should consider something
better, like C library manuals, man pages, or the C standard. Man pages
are pretty good; they tell you which standards a given function
conforms to.

I'm not aware of any man pages that tell whether a given function
in C is standard and if so which standard. The C library manuals
are precisely what I intended the reader to consult after using
Google to find it in the first place. In some cases I have direct
links from my document to individual sections of an online C
library manual, where the chapter/section of the manual closely
matches the way I have the material organized. For example, with
the trigonometric functions I just have a general description of
the typical functions available in most programming languages, with
direct links to the Common Lisp and C manual sections on that topic.
Don't write documentation about stuff you don't know.

There's no such thing as knowing a subject absolutely and
completely. If everyone waited until they knew *everything* about a
topic before they wrote *any* documentation about that topic, no
documentation would ever get written.
 
Robert Maas, see http://tinyurl.com/uh3t

From: Keith Thompson said:
Overflow in sscanf() for any numeric type invokes UB (which, of
course, includes the possibility of raising a signal).

Yet another reason not to use sscanf to validate user input from
Web forms, <cliche>as if the camel's back weren't already broken by
previous straws</cliche>.
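
For comparison, strtol() has fully defined behavior on overflow: it returns LONG_MAX or LONG_MIN and sets errno to ERANGE. That makes it usable for this kind of validation. The following is only a sketch under stated assumptions, not the code discussed earlier in the thread: the name parse_long is made up, and the policy of rejecting leading whitespace and trailing characters is one choice among several.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only.
   Returns 1 and stores the value in *out if s is, in its entirety,
   a decimal integer (optional sign, then digits) that fits in a long.
   Returns 0 otherwise and leaves *out untouched. */
int parse_long(const char *s, long *out)
{
    char *end;
    long value;

    if (s == NULL || *s == '\0')
        return 0;

    /* strtol() skips leading whitespace on its own; reject it here
       so that only the number itself is accepted. */
    if (isspace((unsigned char)*s))
        return 0;

    errno = 0;
    value = strtol(s, &end, 10);

    if (end == s)            /* no digits were consumed */
        return 0;
    if (*end != '\0')        /* trailing characters after the number */
        return 0;
    if (errno == ERANGE)     /* value does not fit in a long */
        return 0;

    *out = value;
    return 1;
}
```

With this, "123" and "-45" are accepted, while "", "abc", "12abc", " 7" and a digit string too long for a long are all rejected without any undefined behavior.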
 
