pointer = &membuff[-2];

Bo S.

Is it possible to post a basic C question in this newsgroup without
starting a flame war about "elegant" code or see answers like
"silly, nonportable hack"? I've almost stopped reading this ng since
there are usually more than 500 posts every day and most of it is
opinions and comments on others' opinions.

Anyway, I take a chance: I have stripped out a piece of code that
we have in a large application that in some circumstances will do
what this example program shows. The interesting part is the

pointer = &buffer[-2]; /* is it really -2, or 4294967294 */
x = pointer->data; /* Reading something >2 unsigned's down */

It (seems to) work, I'm just wondering if we're lucky or not.
I would appreciate it if anyone can explain why, or why not, without
arguing about good or bad programming style (this code has already
been redesigned but exists alive and well in older versions of
our product).

I guess buffer[-2] can be written *(buffer-2) and &buffer[-2]
will simply be (buffer-2) and pointer->data will access memory
at, say, (buffer-2)+2. But isn't n1-n2 = 4294967294 in the
example below? Ok, 4294967294+2 is 0...or?

Here's an example program:

#define _POSIX_SOURCE 1
#include <stdlib.h>
#include <stdio.h>

signed int
main(signed int argc, char ** argv)
{
    unsigned int n1 = 0;
    unsigned int n2 = 2;
    unsigned int n3;
    unsigned int *membuff;
    struct dummy {
        unsigned int d1;
        unsigned int d2;
        unsigned int d3;
    } *sp;

    membuff = malloc(10*sizeof(unsigned int));
    membuff[0] = 54;

    sp = (struct dummy *) &membuff[n1 - n2];
    n3 = sp->d3;

    printf("Result = %u", n3);

    exit(0);

}

Bo
 
Joe Pfeiffer

Bo S. said:
pointer = &buffer[-2]; /* is it really -2, or 4294967294 */

If it's a 32 bit system, does it matter? If you take an address and add
4294967294 to it, and then truncate to 32 bits, do you get something
different from subtracting 2?

When I taught 2's complement arithmetic in sophomore assembly language
classes, the example I liked to use was an unscrupulous car seller
"rolling back" a car's odometer. Take 20,000 miles off the car's
mileage and it's worth more.

So... in addition to being illegal, car odometers are now made so they
won't wind backwards. No problem -- on a 100,000 mile odometer, just
roll it 80,000 miles forward instead. No way to tell the difference.

In effect, that's the same thing you're seeing here.

(of course, more modern yet odometers also put up a flag when they wrap
around so you can tell....)
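
The odometer analogy can be tried directly with fixed-width unsigned arithmetic. This is only a minimal sketch: it models a 32-bit address as a plain uint32_t value rather than a real pointer, and the function names add_wrapping and roll_odometer are made up for illustration.

```c
#include <stdint.h>

/* Adds an offset to a 32-bit "address" the way a 32-bit adder would:
   the sum is reduced modulo 2^32, like an odometer rolling over. */
uint32_t add_wrapping(uint32_t address, uint32_t offset)
{
    return (uint32_t)(address + offset);
}

/* A 100,000 mile odometer: rolling forward wraps past 99,999. */
unsigned long roll_odometer(unsigned long reading, unsigned long miles)
{
    return (reading + miles) % 100000UL;
}
```

Under these assumptions, adding 4294967294 to an address gives the same 32-bit result as subtracting 2, and rolling the odometer 80,000 miles forward leaves the same reading as rolling it 20,000 miles back. None of this says anything about what the C language guarantees for real pointers.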
 
Keith Thompson

The indexing operator is not defined in terms of any particular integer
type. If you write p[-2], then the value -2 is added to the pointer
value p, and the result is dereferenced. There is no implicit
conversion of the index value (to size_t, ptrdiff_t, or anything else);
the computation just uses the value of the index.

However, in your case the index expression is ``n1 - n2''. Both n1 and
n2 are of type unsigned int, with values 0 and 2, respectively -- which
means that the result of the subtraction is *not* -2, but UINT_MAX-2
(on your system, 4294967294U).

So your statement
sp = (struct dummy *) &membuff[n1 - n2];
is equivalent to
sp = (struct dummy *) &membuff[4294967294U];
and the behavior is undefined.

It's likely that &membuff[4294967294U] behaves the same way as
&membuff[-2], but the language doesn't guarantee it. Probably the
generated code adds the value 4294967294U to the base address,
and it quietly wraps around, ignoring any overflow. If you're
curious, take a look at the generated code (many compilers use "-S"
to generate an assembly listing).

Converting the offset value from unsigned int to int is probably
cleaner:

int offset = n1 - n2;
sp = (struct dummy *) &membuff[offset];

Here, &membuff[offset] is equivalent to &membuff[-2].

You seem to be retrieving 3 unsigned int values starting *before*
the beginning of the allocated block of memory. The language
doesn't define the behavior of attempting to do that. I'll take
your word for it that that's what you want to do and that it works.

BTW, 500 posts every day? We *might* have that many in a month
(not including spam, but my NNTP server does a good job of filtering
that out).
 
Eric Sosman

Is it possible to post a basic C question in this newsgroup without
starting a flame war about "elegant" code or see answers like
"silly, nonportable hack"?

Yes, but choosing to antagonize from the outset is not guaranteed
to suppress flames.

Oh, sorry: s/./, you imbecile./
pointer = &buffer[-2]; /* is it really -2, or 4294967294 */

It's really minus two.
x = pointer->data; /* Reading something >2 unsigned's down */

It (seems to) work, I'm just wondering if we're lucky or not.

That's too deep a question for me. Getting away with murder
can be good (if you think of the murder as a one-time event) or
bad (if getting away with one influences you to attempt others
until eventually you don't get away.)

Anyhow, you're just repeating Question 6.17 of the FAQ. (Since
you imply you've been a reader of this forum for some span of time,
you're surely aware of how to find the FAQ, right?)

Oh, sorry: s/?/, you pfule?/
 
James Kuyper

Is it possible to post a basic C question in this newsgroup without
starting a flame war about "elegant" code or see answers like
"silly, nonportable hack"? ...

I haven't noticed any flame wars recently about the elegance of code.

One good way to reduce (but unfortunately, not eliminate) complaints
about "silly, nonportable hack" is to avoid posting messages for which
the complaint is justified.

.... I've almost stopped reading this ng since
there are usually more than 500 posts every day and most of it is
opinions and comments on others' opinions.

Your estimate of the volume of this newsgroup seems to be high by about
an order of magnitude.
If you'd prefer a quieter, more polite environment, try
comp.lang.c.moderated - but it's excessively quiet.
 
BGB

Bo S. said:
pointer = &buffer[-2]; /* is it really -2, or 4294967294 */

If it's a 32 bit system, does it matter? If you take an address and add
4294967294 to it, and then truncate to 32 bits, do you get something
different from subtracting 2?

When I taught 2's complement arithmetic in sophomore assembly language
classes, the example I liked to use was an unscrupulous car seller
"rolling back" a car's odometer. Take 20,000 miles off the car's
mileage and it's worth more.

So... in addition to being illegal, car odometers are now made so they
won't wind backwards. No problem -- on a 100,000 mile odometer, just
roll it 80,000 miles forward instead. No way to tell the difference.

In effect, that's the same thing you're seeing here.

(of course, more modern yet odometers also put up a flag when they wrap
around so you can tell....)


so, you are saying that at that point it is not really that the car has
just become brand-new again?...
 
Shao Miller

Is it possible to post a basic C question in this newsgroup without
starting a flame war about "elegant" code or see answers like
"silly, nonportable hack"? I've almost stopped reading this ng since
there are usually more than 500 posts every day and most of it is
opinions and comments on others' opinions.

I hope that doesn't happen.
Anyway, I take a chance: I have stripped out a piece of code that
we have in a large application that in some circumstances will do
what this example program show. The interesting part is the

pointer = &buffer[-2]; /* is it really -2, or 4294967294 */
x = pointer->data; /* Reading something >2 unsigned's down */

It (seems to) work, I'm just wondering if we're lucky or not.
I would appreciate if anyone can explain why, or why not, without
arguing about good or bad programming style (this code has already
been redesigned but exists alive and well in older versions of
our product).

I guess buffer[-2] can be written *(buffer-2) and &buffer[-2]
will simply be (buffer-2) and pointer->data will access memory
at, say, (buffer-2)+2. But isn't n1-n2 = 4294967294 in the
example below? Ok, 4294967294+2 is 0...or?

Here's an example program:

Just out of curiosity: Are you trying to gain access to an
implementation's 'malloc'-internal details? I notice that you are
attempting to access outside of the explicitly allocated memory.

Here's some feed-back, for whatever it's worth:

#define _POSIX_SOURCE 1
#include <stdlib.h>
#include <stdio.h>

signed int main(signed int argc, char ** argv) {
    /* It can be nice to put constant values here */
    enum cv {
        buff_elem_count = 10,
        test_num = 54,
        zero = 0
    };
    unsigned int n1 = 0;
    unsigned int n2 = 2;
    unsigned int n3;
    unsigned int * membuff;
    /*
     * 'struct dummy' can have an unpredictable
     * alignment requirement which might
     * not be the same as 'unsigned int',
     * even though it must be divisible by
     * the alignment requirement of
     * 'unsigned int' due to the first member
     */
    struct dummy {
        unsigned int d1;
        unsigned int d2;
        unsigned int d3;
    } * sp;

    /* Modified size computation */
    membuff = malloc(buff_elem_count * sizeof *membuff);
    /* Check for null pointer value */
    if (!membuff) {
        puts("Out of memory.");
        return EXIT_FAILURE;
    }

    /* Modified to use testing value */
    membuff[0] = test_num;

    /*
     * If the pointer arithmetic result below does
     * not point to an 'unsigned int', the behaviour
     * is undefined.
     * If the alignment requirement of
     * 'struct dummy' is not satisfied by the pointer
     * arithmetic result below, the behaviour is
     * undefined.
     */
    sp = (struct dummy *) (membuff + (signed int)n1 - n2);
    /*
     * If 'sp' does not point to a contiguous range of
     * accessible memory with size 'sizeof (struct dummy)'
     * (if there's a hole), then the behaviour below is
     * undefined.
     * If there is padding between members of
     * 'struct dummy', '&sp->d3' is not necessarily the
     * same as '*(unsigned int (*)[3])sp + 2'.
     */
    n3 = sp->d3;

    /* Newline added */
    printf("Result = %u\n", n3);

    return EXIT_SUCCESS;
}
 
Shao Miller

/*
* If the pointer arithmetic result below does
* not point to an 'unsigned int', the behaviour
* is undefined.

Sorry, I meant (as in another comment), that if the result doesn't meet
the _alignment_ requirement for 'unsigned int', the behaviour is undefined.
 
Shao Miller

Sorry, I meant (as in another comment), that if the result doesn't meet
the _alignment_ requirement for 'unsigned int', the behaviour is undefined.

Sorry; it's late. I also meant that it ought to point to within the
same array object, else the behaviour is undefined. This implies an
element type of 'unsigned int', so I was better off the first time. :)
 
Shao Miller

...
Here's an example program:
...

Here's another example program :) :

#define _POSIX_SOURCE 1
#include <stdlib.h>
#include <stdio.h>

enum node_cv {
    node_n1,
    node_n2,
    node_n3,
    node_cv_fields,
    node_cv_zero = 0
};
typedef unsigned int a_node[node_cv_fields];

signed int main(signed int argc, char ** argv) {
    enum cv {
        test_count = 54,
        zero = 0
    };
    a_node test_nodes[] = {
        {10, 4, 0},
        { 6, 0, 1},
        { 0, 3, 2},
        { 0, 9, 3},
    };
    a_node * cur_node;
    signed int tests, distance;

    /* Set initial node */
    cur_node = test_nodes;

    /* Walk the nodes */
    for (tests = 0; tests < test_count; ++tests) {
        distance =
            (signed int) node_n1[*cur_node] -
            node_n2[*cur_node];
        printf(
            "Node @ %p (n3: %u): n1: [%u] n2: [%u] Distance: %d\n",
            (void *) cur_node,
            node_n3[*cur_node],
            node_n1[*cur_node],
            node_n2[*cur_node],
            distance
        );
        cur_node = (a_node *) (*cur_node + distance);
    }
    return EXIT_SUCCESS;
}
 
Noob

Keith said:
Converting the offset value from unsigned int to int is probably
cleaner:

int offset = n1 - n2;
sp = (struct dummy *) &membuff[offset];

Cleaner, perhaps, but it has implementation-defined behavior
when n1 < n2

"When [...] an unsigned integer is converted to its corresponding
signed integer, if the value cannot be represented the result is
implementation-defined."
 
James Kuyper

Is it possible to post a basic C question in this newsgroup without
starting a flame war about "elegant" code or see answers like
"silly, nonportable hack"? I've almost stopped reading this ng since
there are usually more than 500 posts every day and most of it is
opinions and comments on others' opinions.

Anyway, I take a chance: I have stripped out a piece of code that
we have in a large application that in some circumstances will do
what this example program show. The interesting part is the

pointer = &buffer[-2]; /* is it really -2, or 4294967294 */

It is really -2.
x = pointer->data; /* Reading something >2 unsigned's down */

It (seems to) work, I'm just wondering if we're lucky or not.

Code like this can work, though it does not work in your example below.
You could make it work by preceding it with code like the following:

struct databin{
int other;
int data;
} *full = malloc(10 * sizeof *full);

if(full)
{
struct databin *buffer = full + 2;
struct databin *pointer;
I would appreciate if anyone can explain why, or why not, without
arguing about good or bad programming style (this code has already
been redesigned but exists alive and well in older versions of
our product).

I guess buffer[-2] can be written *(buffer-2) and &buffer[-2]
will simply be (buffer-2) and pointer->data will access memory
at, say, (buffer-2)+2. But isn't n1-n2 = 4294967294 in the
example below? Ok, 4294967294+2 is 0...or?

Here's an example program:

#define _POSIX_SOURCE 1
#include <stdlib.h>
#include <stdio.h>

signed int
main(signed int argc, char ** argv)

The "signed" keyword is needed only for 'signed char'; its use with
other integer types is permitted, but has no effect, because those types
are already signed (unless you also use the "unsigned" keyword, in which
case adding "signed" would create a syntax error).
{
unsigned int n1 = 0;
unsigned int n2 = 2;
unsigned int n3;
unsigned int *membuff;
struct dummy {
unsigned int d1;
unsigned int d2;
unsigned int d3;
} *sp;

membuff = malloc(10*sizeof(unsigned int));

Better:
membuff = malloc(10 * sizeof *membuff);
membuff[0] = 54;

sp = (struct dummy *) &membuff[n1 - n2];

Yes, n1-n2 is UINT_MAX-1, not -2, as you suggest above. This gives this
code undefined behavior. Note, however, that the behavior would be
undefined, for essentially the same reason, if n1 and n2 were signed, so
that n1-n2 is -2. The only values you can safely put in that subscript
are 0 through 10. 10 is safe only because of the '&' - any other use of
membuff[10] would have undefined behavior.

You don't have to dereference the resulting pointer to have problems,
and dereferencing it only to access sp->d3 doesn't avoid the problems.
The expression &membuff[-2] is equivalent to membuff-2, and it has
undefined behavior all by itself, before its value is even converted to
(struct dummy *). On many real-world implementations, membuff-2 could be
an invalid pointer value, which means that even storing it in an address
register (as might be done somewhere in that expression) could cause
your program to abort or a signal to be raised (among many other
possibilities).

As a further, minor quibble, it's possible (though rather unlikely) that
the alignment requirements of struct dummy are stricter than those of
unsigned int, so it's not guaranteed that you could, for instance, even
safely cast (struct dummy*) &membuff[2].

If your code actually works, despite these problems, that's merely an
unfortunate coincidence. You'd have been better off if it had failed,
forcing you to fix these defects.
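
For contrast, here is a small sketch of the case where a negative subscript is fine: the pointer being indexed already points at least two elements into an array. The struct mirrors the databin suggestion earlier in this reply; the helper name read_two_before is hypothetical, and the caller is assumed to guarantee that buffer - 2 is still inside the same array object.

```c
#include <stddef.h>

struct databin {
    int other;
    int data;
};

/* Reads the 'data' member of the element two places before 'buffer'.
   Precondition (assumed, not checked): buffer - 2 points into the same
   array object as buffer, e.g. buffer was formed as full + 2. */
int read_two_before(const struct databin *buffer)
{
    const struct databin *pointer = &buffer[-2]; /* same as buffer - 2 */
    return pointer->data;
}
```

Called with full + 2, where full points to the first of at least three elements, this reads full[0].data. Called with the first element of an allocation, it would have the undefined behavior described above.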
 
ImpalerCore

The one place that I've seen that kind of code is when designing an
allocator replacement. For example, if one wants to track how much
memory is in use, one method is to tag each allocation with its size
that occurs before the actual pointer returned.

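\code snippet
#include <stdint.h>
#include <stdlib.h>

/*
 * Bytes reserved in front of each allocation to record its size.
 * Note: this offset keeps the returned pointer suitably aligned only
 * for types whose alignment is no stricter than that of size_t.
 */
#define PTR_HEADER_SIZE (sizeof (size_t))

/* Running total of bytes handed out and not yet freed */
size_t current_memory = 0;

void* track_malloc( size_t size )
{
  void* mem = NULL;
  void* p = NULL;

  /* Refuse requests where size plus header would overflow size_t */
  if ( size > SIZE_MAX - PTR_HEADER_SIZE )
    return NULL;

  p = malloc( size + PTR_HEADER_SIZE );

  if ( p )
  {
    /* Record the requested size in the header, then skip past it */
    *((size_t*)p) = size;
    mem = (unsigned char*)p + PTR_HEADER_SIZE;

    current_memory += size;

    /* ... */
  }

  return mem;
}

void track_free( void* p )
{
  void* actual_p = NULL;
  size_t p_size;

  if ( p )
  {
    /* Step back to the header to recover the real block and its size */
    actual_p = (unsigned char*)p - PTR_HEADER_SIZE;
    p_size = *((size_t*)actual_p);

    free( actual_p );

    current_memory -= p_size;

    /* ... */
  }
}
\endcode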

This kind of thing is necessary if one wants to keep track of the
amount of memory in use, since 'free' does not communicate the number
of bytes freed from an allocation. I use this technique to create an
allocator that places a maximum bound on memory to test error-handling
at arbitrary out-of-memory conditions.

You may be seeing something similar. In essence, you want to retain a
pointer to the actual object, but add information to the "negative"
side of the pointer for bookkeeping tasks.
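
One caveat with a header of sizeof (size_t) bytes is that the pointer handed back may no longer be aligned for every type. A possible variation, sketched here under the assumption of a C11 compiler (for max_align_t), pads the header with a union. The names alloc_header, aligned_track_malloc, aligned_track_size and aligned_track_free are hypothetical and not part of the snippet above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header: the max_align_t member pads the size field out to
   the strictest fundamental alignment, so header + 1 stays aligned. */
union alloc_header {
    size_t size;
    max_align_t align;
};

/* Allocates 'size' usable bytes preceded by a header recording the size. */
void *aligned_track_malloc(size_t size)
{
    union alloc_header *h;

    if (size > SIZE_MAX - sizeof *h)   /* size + header would overflow */
        return NULL;
    h = malloc(sizeof *h + size);
    if (!h)
        return NULL;
    h->size = size;
    return h + 1;                      /* first byte after the header */
}

/* Returns the size recorded for a pointer from aligned_track_malloc. */
size_t aligned_track_size(const void *p)
{
    const union alloc_header *h = (const union alloc_header *)p - 1;
    return h->size;
}

/* Releases a pointer from aligned_track_malloc; NULL is ignored. */
void aligned_track_free(void *p)
{
    if (p)
        free((union alloc_header *)p - 1);
}
```

Stepping back by one header here is well defined because the pointer really does sit one header past the start of the malloc'ed block, which is the difference from indexing before the start of an allocation.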

Best regards,
John D.
 
Keith Thompson

Noob said:
Keith said:
Converting the offset value from unsigned int to int is probably
cleaner:

int offset = n1 - n2;
sp = (struct dummy *) &membuff[offset];

Cleaner, perhaps, but it has implementation-defined behavior
when n1 < n2

"When [...] an unsigned integer is converted to its corresponding
signed integer, if the value cannot be represented the result is
implementation-defined."

D'oh! You're right -- and for reasons that I emphasized in the same
article.

I actually can't think of a good clean way to compute the difference
between two unsigned int values as a signed int. You can convert
both values to int:

int offset = (int)n1 - (int)n2;

but that potentially fails if either value exceeds INT_MAX. Or you can
write:

int offset = (n1 >= n2) ? (int)(n1 - n2) : -(int)(n2 - n1);

(the equivalent if statement might be more legible), but that's ugly.

You could convert to a wider type, but there's no guarantee that there
is a wider type.
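
One way to sidestep the implementation-defined conversion is to compute the magnitude in unsigned arithmetic and range-check it before converting. This is a sketch only: the helper name signed_difference and its bool-plus-out-parameter shape are assumptions for illustration, and it conservatively rejects any difference whose magnitude exceeds INT_MAX.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a - b in *out as an int and returns true,
   or returns false (leaving *out untouched) when the magnitude of the
   difference is larger than INT_MAX. */
bool signed_difference(unsigned int a, unsigned int b, int *out)
{
    /* Magnitude of the difference, computed without wrapping. */
    unsigned int magnitude = (a >= b) ? a - b : b - a;

    if (magnitude > (unsigned int)INT_MAX)
        return false;

    /* magnitude fits in int, so the conversion is exact. */
    *out = (a >= b) ? (int)magnitude : -(int)magnitude;
    return true;
}
```

With n1 = 0 and n2 = 2 this yields -2 without relying on how the implementation converts an out-of-range unsigned value to int.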
 
Keith Thompson

Keith Thompson said:
However, in your case the index expression is ``n1 - n2''. Both n1 and
n2 are of type unsigned int, with values 0 and 2, respectively -- which
means that the result of the subtraction is *not* -2, but UINT_MAX-2
(on your system, 4294967294U).
[...]

That's UINT_MAX-1, not UINT_MAX-2.
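
The corrected value is easy to check. A minimal sketch (the function names are made up for illustration): unsigned subtraction is reduced modulo UINT_MAX + 1, so 0 - 2 comes out as UINT_MAX - 1.

```c
#include <limits.h>

/* Unsigned subtraction never yields a negative value; the result is
   reduced modulo UINT_MAX + 1. */
unsigned int unsigned_difference(unsigned int a, unsigned int b)
{
    return a - b;
}

/* Returns 1 if 0u - 2u equals UINT_MAX - 1, which holds on any
   conforming implementation. */
int zero_minus_two_is_uint_max_minus_one(void)
{
    return unsigned_difference(0u, 2u) == UINT_MAX - 1u;
}
```

On a system with a 32-bit unsigned int, UINT_MAX is 4294967295, so the result is 4294967294, matching the number in the original question.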
 
