Pointer question

DSF

Hello,

Is a pointer harmless as long as it's not dereferenced, or fed to a
function that dereferences it? After all, a pointer is just a number
until you try to access where it points to (at?) It would seem
logical that the answer to this is yes, but C can sometimes appear to
be "highly illogical," so I thought I'd ask here.

The reason I ask is that I'm creating a few string functions for my
personal library and I noticed that two of the pointers in the
following code may point to well beyond the string 'str' if 'pos' or
'n' are out of range.

/* deletes n chars from string str starting at position pos */
/* if pos > length of str, returns error */
/* if pos + n > length of str, deletes from pos to end of str */

int strdel(char *str, size_t pos, size_t n)
{
    char *p1 = str + pos;
    char *p2 = p1 + n;
    char *l = str + strlen(str);

    if(p1 < l)
    {
        if(p2 > l)
            p2 = l;
        while(*p2)
            *p1++ = *p2++;
        *p1 = *p2;
        return 0;
    }
    return 1;
}

DSF
 
Daniel Giaimo

First of all, yes, AFAIK, a pointer is harmless unless dereferenced.
However, and I do agree this isn't likely to be a problem, but there is
technically an error in this code. Specifically, if p1 points to a
point in memory that is very near the limit of addressable memory, it
is possible for the line which calculates p2 to overflow if n is too
large. In this case you will get undefined behavior and likely a crash.
 
Seebs

Is a pointer harmless as long as it's not dereferenced, or fed to a
function that dereferences it? After all, a pointer is just a number
until you try to access where it points to (at?) It would seem
logical that the answer to this is yes, but C can sometimes appear to
be "highly illogical," so I thought I'd ask here.

Answer: There have existed machines on which a "pointer" variable would
be loaded into a special kind of register, which the machine knew to be
used for "addresses". On some such machines, loading an invalid address
can cause problems.

Or, to put it in C standardese terms: A pointer that isn't to allocated space
may be a trap representation, and any access of an object which is a trap
representation is undefined behavior, not just dereferencing.

In short:
        int *p1, *p2;
        if (p1 == p2) {
        }

1. This is undefined behavior.
2. There have existed real implementations on which it blew up.
/* deletes n chars from string str starting at position pos */
/* if pos > length of str, returns error */
/* if pos + n > length of str, deletes from pos to end of str */
int strdel(char *str, size_t pos, size_t n)

Note that this name is reserved for future extensions or use by
the implementation. (Yeah, I know, annoying.)
{
char *p1 = str + pos;
char *p2 = p1 + n;
char *l = str + strlen(str);

I would do this differently.
        size_t len = strlen(str);
        if (pos > len)
                return 1;
        if (pos + n > len)
                return 1;
        memmove(str + pos, str + pos + n, len - (pos + n) + 1);
        return 0;

(Please don't actually run this without sanity-checking it, I haven't tested
it and I'm sorta muddle-headed right now. I mean, more than usual even.)

-s
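
For readers who want something closer to the behavior described in the original comments (clamp instead of fail when pos + n runs past the end), here is a minimal sketch. The name str_delete is hypothetical, chosen only because identifiers beginning with str followed by a lowercase letter are reserved. All range checks are done on size_t indices, so no out-of-range pointer is formed and pos + n is never computed in a way that can wrap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of any library.
   Deletes up to n chars from str starting at index pos.
   Returns 1 if pos is greater than the string length, 0 otherwise. */
int str_delete(char *str, size_t pos, size_t n)
{
    size_t len;

    assert(str != NULL);
    len = strlen(str);

    if (pos > len)
        return 1;          /* start index is out of range */
    if (n > len - pos)     /* len - pos cannot wrap: pos <= len here */
        n = len - pos;     /* clamp: delete from pos to end of string */

    /* Move the tail plus the terminating '\0'. Every pointer formed
       here lies within str or one past its last char. */
    memmove(str + pos, str + pos + n, len - pos - n + 1);
    return 0;
}
```

Called as str_delete(buf, 5, 6) on a buffer holding "hello world", this should leave "hello" in buf. A huge n such as (size_t)-1 is simply clamped to the end of the string.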
 
spinoza1111

Answer:  There have existed machines on which a "pointer" variable would
be loaded into a special kind of register, which the machine knew to be
used for "addresses".  On some such machines, loading an invalid address
can cause problems.

Although this is a logical possibility, your (self-confessed) lack of
academic computer science is on exhibit here, because the preferred
method is to wait until the effective address is calculated.
Or, to put it in C standardese terms:  A pointer that isn't to allocated space
may be a trap representation, and any access of an object which is a trap
representation is undefined behavior, not just dereferencing.

You really have no clue. The neologism "trap representation" is not
part of computer science. Instead, it was incoherently invented by the
C standards committees which were formed of unqualified people like
you in order to preserve vendor profits.

Like "sequence points" it is ersatz science invented ad-hoc to
standardize a language that CANNOT BE STANDARDIZED by fantasizing a
language which is itself held harmless from any errors but is almost
impossible to implement in reality.

You see, Petey, "trap representation" makes no sense. If the register
or storage area holds the value for any length of time, it IS legal.

Since you majored in psychology and have not by your own admission
taken computer science at university level, you expect machine
designers to design registers which when loaded with a value but
before effective address calculation will "throw" a "trap" (a word
which is itself outdated) when loaded with a "bad" value. Of course, I
can load a register in the real world with a negative number and add a
positive number in the real world for a good purpose, but your fantasy
machine will have a fit.

You want to impose rules which actually predate Von Neumann here,
since you think you can legislate what a pointer should look
like...or, more precisely, you think you can meta-theorize about what
REAL compiler writers must do.


In short:
        int *p1, *p2;
        if (p1 == p2) {
        }

1.  This is undefined behavior.
2.  There have existed real implementations on which it blew up.

You mean there "could" exist fanciful implementations in which
registers magically know when a bad, Communist address is being loaded
into them. You don't know where they are but they might exist.
Note that this name is reserved for future extensions or use by
the implementation.  (Yeah, I know, annoying.)


I would do this differently.
        size_t len = strlen(str);
        if (pos > len)
                return 1;
        if (pos + n > len)
                return 1;
        memmove(str + pos, str + pos + n, len - (pos + n) + 1);
        return 0;

(Please don't actually run this without sanity-checking it, I haven't tested
it and I'm sorta muddle-headed right now.  I mean, more than usual even.)

Meaning: you're a little creep who calls people "morons" and takes
credit for fixing their errors when they do so themselves and you
watch, but when it comes to your errors, you plead ADHD.

Please don't pay any attention to Peter Seebach.
 
spinoza1111

First of all, yes, AFAIK, a pointer is harmless unless dereferenced.
However, and I do agree this isn't likely to be a problem, but there is
technically an error in this code.  Specifically, if p1 points to a
point in memory that is very near the limit of addressable memory, it
is possible for the line which calculates p2 to overflow if n is too
large.  In this case you will get undefined behavior and likely a crash.

Details at eleven. The lie of "C Standards" is that you WON'T get a
crash in C but you WILL get a crash in C no matter what, because C is
poorly designed. C is designed to crash, because it ain't C if it
don't crash (that is, das ist, it ain't C if it don't alias).
 
Nick Keighley

[...] The neologism "trap representation" is not
part of computer science. Instead, it was incoherently invented by the
C standards committees which were formed of unqualified  people [...]

When I did my computer science degree I did a course that analysed
various constructs in programming languages in a formal or semi-formal
manner. It did this in terms of a meta-language (if you know about VDM
then you've roughly got the idea). The meta-language had a concept
called "bottom" that was written U+22A5 (it looked like an inverted
capital T http://en.wikipedia.org/wiki/Table_of_logic_symbols). All
variables were initialised to \bot when they were created (defined)
and it was an error to read a variable holding bottom.

When I came across an Algol-60 implementation that loaded -0 (it was a
ones-complement machine) into each newly defined ("declared", if I recall
correctly, was the Algol-60 term) variable and terminated with a stack
trace if you read a -0 from a variable, it looked very much like the
meta-language's \bot, and it also looks (to me) very like C's "trap
representation".

To say a formalisable concept has *never* been mentioned in computer
science is pretty brave.
 
Ben Bacarisse

First of all, yes, AFAIK, a pointer is harmless unless dereferenced.

Even referencing a bad pointer can be undefined behaviour.
I.e. writing int *ip; printf("%p\n", ip); can do pretty much anything.
The reason is that C is designed to be implemented on a wide range of
hardware -- some not yet conceived of. Hardware that might check that
a pointer is invalid could be very useful for secure systems, and C
permits all operations on invalid pointers to do anything the
implementation wishes to accommodate such designs.
However, and I do agree this isn't likely to be a problem, but there is
technically an error in this code. Specifically, if p1 points to a
point in memory that is very near the limit of addressable memory, it
is possible for the line which calculates p2 to overflow if n is too
large. In this case you will get undefined behavior and likely a
crash.

That's one way that constructing such a pointer may go wrong, but one
could imagine another hardware-checked architecture where constructing
any pointer outside of the permitted range (basically from the start to
one-element past the end) gives a run-time error. C is designed so
that this is possible.
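
The permitted range described above (from the start of an object to one element past its end) is what makes the usual begin/end loop legal. A small sketch, with sum_ints as a made-up helper name: the end pointer is formed and compared against, but never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums the n ints starting at a.
   a must point to an array of at least n ints. */
long sum_ints(const int *a, size_t n)
{
    const int *end;
    long total = 0;

    assert(a != NULL);
    end = a + n;              /* one past the last element: valid to form */

    for (const int *p = a; p != end; ++p)
        total += *p;          /* p is strictly before end, so *p is fine */
    return total;
}
```

Forming a + n + 1, by contrast, is already outside what the language permits, even if the result is never used.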
 
Keith Thompson

Ben Bacarisse said:
Even referencing a bad pointer can be undefined behaviour.
I.e. writing int *ip; printf("%p\n", ip); can do pretty much anything.

This:
    int *ip; printf("%p\n", (void*)ip);
would have been a slightly better example. But even this:
    int *ip;
    int *ip2 = ip;
or this:
    int *ip;
    ip;
has the same problem: undefined behavior that's likely not to cause
any visible problems on most systems.

[...]
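
One common way to stay clear of this is to give every pointer a value at its definition. NULL is a valid pointer value, so it may be read, copied, compared, and printed. The sketch below uses a hypothetical find_negative helper to show the pattern; it is an illustration, not something posted in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns a pointer to the first negative element
   of a[0..n-1], or NULL if there is none. */
int *find_negative(int *a, size_t n)
{
    int *hit = NULL;          /* initialized, so reading it is defined */

    assert(a != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (a[i] < 0) {
            hit = &a[i];
            break;
        }
    }
    return hit;               /* NULL means "not found" */
}
```

Because hit always holds either NULL or the address of an element, the caller can safely write int *ip2 = find_negative(v, n); and then test ip2 against NULL before dereferencing it.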
 
spinoza1111

[...] The neologism "trap representation" is not
part of computer science. Instead, it was incoherently invented by the
C standards committees which were formed of unqualified  people [...]

When I did my computer science degree I did a course that analysed
various constructs in programming languages in a formal or semi-formal
manner. It did this in terms of a meta-language (if you know about VDM
then you've roughly got the idea). The meta-language had a concept
called "bottom" that was written U+22A5 (it looked like an inverted
capital T http://en.wikipedia.org/wiki/Table_of_logic_symbols). All
variables were initialised to \bot when they were created (defined)
and it was an error to read a variable holding bottom.

When I came across an Algol-60 implementation that loaded -0 (it was a
ones-complement machine) into each newly defined ("declared", if I recall
correctly, was the Algol-60 term) variable and terminated with a stack
trace if you read a -0 from a variable, it looked very much like the
meta-language's \bot, and it also looks (to me) very like C's "trap
representation".

To say a formalisable concept has *never* been mentioned in computer
science is pretty brave.

The problem: "trap representation" doesn't have this precise meaning.
It can mean "bottom" but it can also mean NAN or infinity. The fact is
that just because actual computer scientists used a similar word
doesn't make "trap representation" scientific.
 
spinoza1111

Even referencing a bad pointer can be undefined behaviour.
I.e. writing int *ip; printf("%p\n", ip); can do pretty much anything.
The reason is that C is designed to be implemented on a wide range of
hardware -- some not yet conceived of.  Hardware that might check that
a pointer is invalid could be very useful for secure systems, and C
permits all operations on invalid pointers to do anything the
implementation wishes to accommodate such designs.


That's one way that constructing such a pointer may go wrong, but one
could imagine another hardware-checked architecture where constructing
any pointer outside of the permitted range (basically from the start to
one-element past the end) gives a run-time error.  C is designed so
that this is possible.

This is utter nonsense. The design of C has no cause and effect
relationship with regards to new hardware design. Hardware designers
worry about C only in designing for the efficient execution of
existing C code.

This hardware-checked architecture is (1) not RISC (2) proprietary and
microprogrammed in an unfashionable way (3) object oriented, therefore
best programmed in an OO language (not C).

It was a pretense that C was being designed "so that hardware
designers can program cool machines". C is gradually being abandoned
for OS code as C programmers retire and die. The real mission was to
refrain from any hint of cleaning up the mess that is C semantics,
because this would have forced greedy vendors to rehire compiler
people.
 
spinoza1111

This:
    int *ip; printf("%p\n", (void*)ip);
would have been a slightly better example.  But even this:
    int *ip;
    int *ip2 = ip;
or this:
    int *ip;
    ip;
has the same problem: undefined behavior that's likely not to cause
any visible problems on most systems.

So how did the C standard prevent this? Is not the above "standard
C"?

"Standard C" is a fraud, since many possible strings of "standard C"
have "undefined" semantics. But the MEANING of being standard is
usually to be well-defined. Many examples of "standard C" are "well-
defined" only in the sense that their "definition" is "undefined".

The standard gave compiler writers no way of reporting "your code is
not standard". This is because the remit of the standard was to write
a document that would not require vendors to change anything. Instead,
the "standard" merely became a way to abuse ordinary programmers for
making intelligent assumptions under pressure.

Take the absurdity of evaluating actual parameters right to left.
Sure, a lot of programmers might have a Hebrew or Asian language
background and read from right to left (or top down as in the case of
classical Chinese). But all intelligent programmers world wide
recognize that because of colonialist Western hegemony, programming
languages read left to right.

To assume that it's sensible that actual parameters be evaluated the
other way is insane because the reason why this bug was a feature was
strictly "convenience in the old days".

The C standard kiddies didn't have either the balls or the remit to
change this! Their remit was simply to produce an unreadable document
that reads like Genesis in the Bible, where God manages to create
women twice and Cain screws his sisters to give Adam grandchildren.

At least men believed the Bible for good reasons. People here treat
the Standard as Holy Writ as pure money worship:

What sphinx of cement and aluminum bashed open
their skulls and ate up their brains and imagi-
nation?
Moloch! Solitude! Filth! Ugliness! Ashcans and unob-
tainable dollars! Children screaming under the
stairways! Boys sobbing in armies! Old men
weeping in the parks!
Moloch! Moloch! Nightmare of Moloch! Moloch the
loveless! Mental Moloch! Moloch the heavy
judger of men!
Moloch the incomprehensible prison! Moloch the
crossbone soulless jailhouse and Congress of
sorrows! Moloch whose buildings are judgment!
Moloch the vast stone of war! Moloch the stun-
ned governments!
Moloch whose mind is pure machinery! Moloch whose
blood is running money! Moloch whose fingers
are ten armies! Moloch whose breast is a canni-
bal dynamo! Moloch whose ear is a smoking
tomb!

(Allen Ginsberg, HOWL)
[...]

--
Keith Thompson (The_Other_Keith) (e-mail address removed)  <http://www.ghoti.net/~kst>
Nokia
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
 
Nick

pointlessly arguing with said:
One can easily construct well-spelled, grammatically correct nonsense
in English, too - but that isn't a *fault* of English, and the ability
to construct nonsense in C isn't a fault of C. One can stop people
doing stupid things only by removing the flexibility that also allows
them to do clever things.

This is why it's pointless arguing with him. I gave up when he claimed
that this was false; that by definition if it is clear it is correct.

You're trying to build a solid house of argument on quicksand. And he's
running around with a pump agitating the sand.
 
Richard Bos

Nick Keighley said:
To say a formalisable concept has *never* been mentioned in computer
science is pretty brave.

*Sigh* No, it's a _troll_. Stop feeding it already, damn your skin.

Richard
 
spinoza1111

spinoza1111 wrote:


The C Standard does not prevent anyone from typing nonsense, as your
egregious contributions to this newsgroup clearly demonstrate. The above
code does not breach any syntax rules or violate any constraints, so the
Standard doesn't require any implementation to diagnose the nonsense
code, but nonsense code it remains.

One can easily construct well-spelled, grammatically correct nonsense in
English, too - but that isn't a *fault* of English, and the ability to
construct nonsense in C isn't a fault of C. One can stop people doing
stupid things only by removing the flexibility that also allows them to
do clever things.

I don't want you to be able to "do clever things" in a language where
your "clever things" will break "things". What you think is clever was
clever in 1971.
 
spinoza1111

Nick wrote:



<g> Nicely put.

If you think that clumsy image was "nicely put"...you need to take an
evening class in literary appreciation and basic writing, dawg.

You see, by saying that "you're trying to build a solid house of
argument on quicksand", your friend used an image which means you're a
fool who doesn't know how to build a solid house of argument. And
"running around with a pump agitating the sand" is incoherent.

I'd have said "you're trying to build a solid house of argument with
bricks, and he's the Big Bad Wolf trying and failing to blow it down".

How stupid can you get? I mean, I looked up my high school IQ score:
it was only 120. But the mean here seems to be 80.
 
Herbert Rosenau

Hello,

Is a pointer harmless as long as it's not dereferenced, or fed to a
function that dereferences it?

No. It depends on the implementation.

After all, a pointer is just a number
until you try to access where it points to (at?) It would seem
logical that the answer to this is yes, but C can sometimes appear to
be "highly illogical," so I thought I'd ask here.

No, same as above.
The reason I ask is that I'm creating a few string functions for my
personal library and I noticed that two of the pointers in the
following code may point to well beyond the string 'str' if 'pos' or
'n' are out of range.

There are some traps.

It is undefined behavior to compare two pointers using <, <=, >, >= when
the two pointers do not point into the same object, or when one or both
point outside the range of that object (one past the end is still allowed).

A pointer may have more significant bits than an (unsigned) int, long
or long long. So moving/copying pointers to/from other types may
result in something other than a valid address, ending in some strange
behavior.

Not every possible value of a pointer is valid on all implementations.
A pointer may contain padding bits and/or trap representations,
letting some CPU instructions fail or trap when they do nothing more
than access the value.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2R in German is here!
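
Two of the traps listed above can be sidestepped in code. A rough sketch, with index_in and round_trip as made-up names: the first function finds a pointer's position using only ==, which is defined for any two valid pointers, and the second stores a pointer in uintptr_t rather than in int or long. Note that uintptr_t is an optional type in <stdint.h>; this sketch assumes the implementation provides it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the index i for which a + i == p,
   or n if there is none. Only == is used, never <, <=, >, >=. */
size_t index_in(const char *a, size_t n, const char *p)
{
    assert(a != NULL);
    for (size_t i = 0; i < n; ++i)
        if (a + i == p)
            return i;
    return n;
}

/* Hypothetical helper: round-trips a pointer through uintptr_t.
   Where uintptr_t exists, the result compares equal to the original. */
void *round_trip(void *p)
{
    uintptr_t bits = (uintptr_t)p;   /* wide enough for any void * */
    return (void *)bits;
}
```

The linear scan is slower than a relational comparison, but it stays defined even when p points into an unrelated object.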
 
Phil Carmody

Richard Heathfield said:
The C Standard does not prevent anyone from typing nonsense, as your
egregious contributions to this newsgroup clearly demonstrate.


@comp_lang_c =~ s/Scott Nudds fly out of your nose/random nonsense be posted to usenet under the moniker spinoza1111/;

The bind moggles. Gonna run sparse over all my code twice from now on...

Phil
 
Keith Thompson

christian.bau said:
Just noticed: Yes, if the caller passes very large values for pos and
n, it is quite possible that p1 < l evaluates to true even though pos
[...] your code will therefore end up accessing memory that it shouldn't
and likely crash.

Take any valid pointer char* p and call

strdel (p, 0 - (size_t) p, 1);

and I'd say a crash is quite likely.

That depends on what strdel does. There's no such function in the C
standard (or even in POSIX).
 
DSF

Just noticed: Yes, if the caller passes very large values for pos and
n, it is quite possible that p1 < l evaluates to true even though pos
[...] your code will therefore end up accessing memory that it shouldn't
and likely crash.

Take any valid pointer char* p and call

strdel (p, 0 - (size_t) p, 1);

and I'd say a crash is quite likely.

I have actually modified the code to eliminate pointers from the
range checking. See my post "Pointer Question II - The Rebirth" for
the changes made and why I started a new thread.
Also see my reply to Peter Nilsson in the same thread as to very
large input parameters.
For my own curiosity, I restored my original code (posted in this
thread) and tried it with your parameters.

Walking through the code:
pos evaluates to 4,293,664,481
char *p1 = str + pos;
p1 evaluates to NULL.
char *p2 = p1 + n;
produces error "Pointer arithmetic in invalid memory"
This error is from CodeGuard, a memory watchdog.
char *l = str + strlen(str);

if(p1 < l)
{
if(p2 > l)
p2 = l;
while(*p2)
*p1++ = *p2++;
Crashes here trying to write p1 since p1 == NULL

I thought you might find that interesting.
DSF
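
The walk-through above is consistent with plain modular arithmetic. The sketch below models a 32-bit flat address space with uint32_t. That is an assumption about the platform, and real out-of-range pointer arithmetic is undefined rather than guaranteed to wrap. Under that model, a pos of 4,293,664,481 would correspond to str sitting at address 1,302,815, since the two add up to exactly 2^32. The names model_add, would_wrap and checked_add are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Models "address + offset" on a 32-bit flat address space.
   Unsigned arithmetic wraps modulo 2^32, which is well defined. */
uint32_t model_add(uint32_t addr, uint32_t offset)
{
    return (uint32_t)(addr + offset);
}

/* Returns nonzero when addr + offset would wrap past 2^32 - 1. */
int would_wrap(uint32_t addr, uint32_t offset)
{
    return offset > UINT32_MAX - addr;
}

/* Same sum as model_add, but refuses (via assert) to wrap. */
uint32_t checked_add(uint32_t addr, uint32_t offset)
{
    assert(!would_wrap(addr, offset));
    return (uint32_t)(addr + offset);
}
```

With addr = 1302815 and offset = 0u - addr, model_add yields 0, matching the observation that p1 came out as NULL, and would_wrap reports the problem before any pointer is formed.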
 
DSF

That depends on what strdel does. There's no such function in the C
standard (or even in POSIX).

That's because it's a function I wrote. See my post "Pointer
Question II - The Rebirth" for the latest info and the str naming
issue.
DSF
 
