size of a pointer on 4-bit system


BartC

Shao Miller said:
Off-topic, but: Do any of these care about '2 * * x' versus '2**x'?

In most of them, *x doesn't have the meaning it has in C, so it would be
a syntax error. But I don't think ** for exponentiation is that universal
either.
 

glen herrmannsfeldt

Keith Thompson said:
glen herrmannsfeldt said:
PL/I has no reserved words, but requires space between adjacent keywords
and variables. There are some special exceptions. I believe like C,
both GOTO and GO TO are allowed in free-form Fortran and PL/I. [...]

Like C? C allows `goto`; `go to` is a syntax error.

Oops. I haven't written a goto in C for a long time,
and even in the other languages I wrote it as goto.

-- glen
 

Keith Thompson

The unary sizeof operator binds more tightly than multiplication, so
`sizeof(2 * *x)` yields the size of `2 * *x`, while `sizeof 2 * *x`
is equivalent to `(sizeof 2) * (*x)`.

It's not usually necessary to use parentheses with sizeof applied to an
expression, but as with any other operator you sometimes need
parentheses to deal with precedence.
As for `**`: PL/I inherited it from Fortran, and gawk (GNU awk) has it
in addition to the ^ operator.

Now, why doesn't C have one?

x**y and x^y are already well-defined expressions in C
(multiplication by a dereference and exclusive-or, respectively).

I suppose ^^ could have been used (since you can't have a
short-circuit xor). But the systems programming domain for which
C was originally designed didn't have much use for exponentiation.
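
[Editor's note: a minimal illustration of the two parses just described;
the variables are invented for the example.]

#include <stdio.h>

int main(void)
{
    int y = 3;
    int *p = &y;
    int x = 5;

    /* x**p today parses as x * (*p): multiplication by a dereference. */
    printf("%d\n", x * *p);   /* prints 15 */

    /* x^y is bitwise exclusive-or, not exponentiation. */
    printf("%d\n", x ^ y);    /* 5 ^ 3 == 6, not 125 */

    return 0;
}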
 

Keith Thompson

BartC said:
In most of them, *x doesn't have the meaning it has in C, so it would be
a syntax error. But I don't think ** for exponentiation is that universal
either.

It's not universal, but I think most languages that have an
exponentiation operator spell it "**".

C *could* have had ** as an exponentiation operator from the
beginning. Then multiplication by a dereference would have to
be written with a space or parentheses (`x * *y` or `x*(*y)`) --
just as we have to do for `x - -y`.
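
[Editor's note: the `x - -y` point comes from C's maximal-munch
tokenization, which already applies today; a small invented example.]

#include <stdio.h>

int main(void)
{
    int x = 5, y = 3;

    int a = x - -y;   /* fine: 5 - (-3) == 8 */
    /* int b = x--y;     does not compile: maximal munch lexes it as x -- y */

    printf("%d\n", a);
    return 0;
}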
 

glen herrmannsfeldt

(snip)
It's not universal, but I think most languages that have an
exponentiation operator spell it "**".

Well, when BASIC was new, ASCII had an up-arrow character, later
replaced by the caret. EBCDIC has no up-arrow or caret, so it wasn't
likely that any IBM language would use them.

I normally don't use ^ for exponentiation in posts, as it seems
too much like the XOR operator.
C *could* have had ** as an exponentiation operator from the
beginning. Then multiplication by a dereference would have to
be written with a space or parentheses (`x * *y` or `x*(*y)`) --
just as we have to do for `x - -y`.

Yes, but now it is too late for that one.

Which combination of characters doesn't currently have any meaning?
My thought so far is !*! which I believe isn't used. You can't
apply unary * to the result from !, and there is no postfix !
that could come before binary *.

I suppose ^^ could also work, though I keep hoping for a logical
(not bitwise) XOR operator.

-- glen
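
[Editor's note: lacking a ^^ operator, the common idiom is to normalize
both operands to 0 or 1 before comparing; a and b here are invented.]

#include <stdio.h>

int main(void)
{
    int a = 5, b = 0;

    /* !a and !b are always 0 or 1, so != behaves as a logical XOR. */
    if (!a != !b)
        printf("exactly one of a and b is nonzero\n");

    return 0;
}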
 

Ben Bacarisse

Richard Damon said:
Tim Rentsch said:
"Tim Rentsch" <[email protected]> wrote in message
The sizeof operator can be defined in a way that satisfies all
the Standard's requirements but still allows this example to
allocate only 100 nibbles.

So what would be the value of sizeof(*x)? It can only really
be 1 or 0. [snip elaboration]

Actually that isn't right. Using 'sizeof' on a non-standard
datatype may behave in unexpected ways. Here's a hint: sizeof
yields a result of type size_t; size_t is an (unsigned) integer
type; integer types larger than character types may have padding
bits. Can you fill in the rest?

What, that extra information is stored in those padding bits? How is
that going to help?

What bit-pattern could be returned by sizeof(*x) that would make
100*sizeof(*x) yield 50 instead of 100?

one simple solution is to have the sizeof operator not return a size_t
for the nybble type, but some special type that does math "funny" to get
the right value, for example a fixed point type with 1 fractional bit.
When doing arithmetic on this type, the compiler can get the "right"
answer, and also do the right thing when converting it to a standard
type.

I don't see how that can be permitted, at least without twisting things
some more! The type of a sizeof expression is size_t (an unsigned
integer type) and the result must be "an integer".

If (and this is a moot point) a value can have padding bits, the result
could be a trap representation and then all bets are off and the
implementation could do what it likes, but I don't think that's how
values work.
It would say that

malloc(100 * sizeof(*x));

and

size_t size = sizeof(*x);
malloc(100 * size);

might return different-sized allocations, but as long as the "nybble"
type is given a name in the implementation's reserved name space (like
_Nybble), the use of that name leads to "undefined behavior" by the
standard, which is then defined by the implementation to be useful.

Are you imagining a sort of nuclear option for UB? I.e. that just using
some non-standard type anywhere in a program makes all of it undefined?
That's probably defensible, but it's not how most people like to talk
about such things.
 

James Kuyper

On 02/02/2013 09:25 PM, Ben Bacarisse wrote:
....
Are you imagining a sort of nuclear option for UB? ...

Undefined behavior IS the nuclear option. It's hard to go any more
"nuclear" than "no requirements".
 

James Kuyper

Because for most usages they act just like regular types! Note that the
change in behavior that I described is not "dramatically" different from
that of normal types; sizeof just returns a type that can express
fractions of a "byte" and defines math to make this work as normal.

Remember, the original question was how to make an implementation that
allowed for a type that was smaller than char. This method keeps totally
conforming behavior for programs that do not use this feature, and
mostly the expected result when using the feature.

Yes, it would be possible based on this logic to make an implementation
that did something totally weird when using a type like _Nybble, but
then, based on the standard, that IS allowed. It should only be done with
an implementation that has provided a useful meaning to this form of
undefined behavior.

Explicit use of _Nybble can have arbitrarily weird consequences, because
it's a reserved name, so your code can't define it itself, and because
most uses of an identifier that has no definition are constraint
violations. That's not the case which justifies bothering to explicitly
allow for extended integer types.

The case which does matter is if an extended integer type is used by the
implementation for one of the standard things that can be typedefs for
an integer type, such as size_t. Using a size_t that's a typedef for an
extended integer type does not have undefined behavior (at least, not
for that reason), and must have behavior consistent with the
requirements of the standard. Which rules out, among other things,
having sizeof() behave as oddly as you suggest it could, if used on
that typedef. If used directly on the extended type, there's no problem,
but when used on the standard typedef, it must behave normally. It must
do all the things that the standard requires of a type suitable for use
as the corresponding typedef.

If it weren't for the sizeof() issue, _Nybble could obviously meet the
requirements for int4_t, int_least4_t, and int_fast4_t. Less
obviously, it could also meet the requirements for clock_t, fpos_t,
time_t, though selecting it for any of those purposes would render the
corresponding features of C nearly useless.
 

BartC

Because for most usages they act just like regular types! Note that the
change in behavior that I described is not "dramatically" different from
that of normal types; sizeof just returns a type that can express
fractions of a "byte" and defines math to make this work as normal.

That would need to include all conversions to int being rounded up, so that
99*sizeof(nybble) gives 50 rather than 49. But there would still be plenty
of dodgy areas:

nybble A[39];
int x,y;

x=sizeof(A); // 20
y=sizeof(A[0]); // 1

Now, x/y will yield 20 instead of 39.

On the other hand, a new bitsizeof() version would give 156/4, or 39, as
expected (and would work for other types too, and would benefit from being
a 64-bit result rather than 32-bit). An explicit conversion would be needed
to give a rounded-up byte size (e.g. bitstobytes()); a sketch of that
arithmetic follows.
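
[Editor's sketch of the proposed arithmetic. bitsizeof and bitstobytes
are not real C, so the bit counts for the hypothetical 4-bit element are
computed by hand here.]

#include <limits.h>
#include <stdio.h>

#define NYBBLE_BITS 4UL   /* hypothetical 4-bit element type */

/* What a bitstobytes() conversion might do: round bits up to whole bytes. */
static unsigned long bits_to_bytes(unsigned long bits)
{
    return (bits + CHAR_BIT - 1) / CHAR_BIT;
}

int main(void)
{
    unsigned long n = 39;                   /* nybble A[39]; */
    unsigned long total = n * NYBBLE_BITS;  /* bitsizeof(A) would be 156 */

    printf("%lu\n", total / NYBBLE_BITS);   /* 156/4 == 39, as expected */
    printf("%lu\n", bits_to_bytes(total));  /* 20 bytes of real storage */
    return 0;
}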
 

Roberto Waltman

Bart said:
The 80286 processor supports different sized pointers for data and
functions, and either of them could be larger than an int.

So did the humble 8088/8086 in the "compact" and "medium" memory
models.
 

James Kuyper

On 2/3/13 7:46 AM, James Kuyper wrote: ....
The case which does matter is if an extended integer type is used by the
implementation for one of the standard things that can be typedefs for
an integer type, such as size_t. Using a size_t that's a typedef for an
extended integer type does not have undefined behavior (at least, not
for that reason), and must have behavior consistent with the
requirements of the standard. Which rules out, among other things,
having sizeof() behave as oddly as you suggest it could, if used on
that typedef. If used directly on the extended type, there's no problem,
but when used on the standard typedef, it must behave normally. It must
do all the things that the standard requires of a type suitable for use
as the corresponding typedef.

If it weren't for the sizeof() issue, _Nybble could obviously meet the
requirements for int4_t, int_least4_t, and int_fast4_t. Less
obviously, it could also meet the requirements for clock_t, fpos_t,
time_t, though selecting it for any of those purposes would render the
corresponding features of C nearly useless.


I never said size_t would be an extended integer type.


No, you did not. However, an implementation choosing to identify _Nybble
as one of the extended integer types it supports, and then using it in
one of the standard typedefs, is the only way in which there could be a
non-trivial connection between the language it implements and the
requirements of the C standard. That non-trivial connection is that it
cannot conform to the standard's requirements of sizeof expressions, but
the reasons why are subtle enough to be worthy of some discussion.
Otherwise, everything is covered by "undefined behavior", and there's
nothing worth saying about it in this forum (which hasn't apparently
stopped people).
 

Shao Miller

On 2/3/13 7:46 AM, James Kuyper wrote: ...
The case which does matter is if an extended integer type is used by the
implementation for one of the standard things that can be typedefs for
an integer type, such as size_t. Using a size_t that's a typedef for an
extended integer type does not have undefined behavior (at least, not
for that reason), and must have behavior consistent with the
requirements of the standard. Which rules out, among other things,
having sizeof() behave as oddly as you suggest it could, if used on
that typedef. If used directly on the extended type, there's no problem,
but when used on the standard typedef, it must behave normally. It must
do all the things that the standard requires of a type suitable for use
as the corresponding typedef.

If it weren't for the sizeof() issue, _Nybble could obviously meet the
requirements for int4_t, int_least4_t, and int_fast4_t. Less
obviously, it could also meet the requirements for clock_t, fpos_t,
time_t, though selecting it for any of those purposes would render the
corresponding features of C nearly useless.


I never said size_t would be an extended integer type.


No, you did not. However, an implementation choosing to identify _Nybble
as one of the extended integer types it supports, and then using it in
one of the standard typedefs, is the only way in which there could be a
non-trivial connection between the language it implements and the
requirements of the C standard. That non-trivial connection is that it
cannot conform to the standard's requirements of sizeof expressions, but
the reasons why are subtle enough to be worthy of some discussion.
Otherwise, everything is covered by "undefined behavior", and there's
nothing worth saying about it in this forum (which hasn't apparently
stopped people).


This response could be evidence of the problem with using "type" that I
had tried to point out earlier. If you don't call it a "type", nobody
can confuse it with "extended integer type". '_Nybble' and '_Nsize_t'
could be extension keywords and could be defined as part of the
'type-specifier' syntax but, like 'typedef' as a
'storage-class-specifier', could have different meaning and constraints.
 

Ben Bacarisse

James Kuyper said:
On 02/02/2013 09:25 PM, Ben Bacarisse wrote:
...

Undefined behavior IS the nuclear option. It's hard to go any more
"nuclear" than "no requirements".

I meant nuclear in terms of the discussion. When talking about how well
or easily C can support feature X, the discussion is interesting (to me
at least) only in so far as the "nuclear option" is not invoked. All
features can be supported in an extended C -- just add

__Undefined;

at the top and you can have bit addressing, arbitrary precision numbers,
closures and what have you. What's interesting is how well the new
facility integrates with the rest of the language. How much of the
existing semantics must be thrown out? How much of the normal behaviour
of C can be applied to the new types or features? Once you use a
_Nibble type, it's perfectly permissible for the expression 1 + 2 to
have the value "banana", but we'd really like most of C to remain
unchanged.
 

Shao Miller

I meant nuclear in terms of the discussion. When talking about how well
or easily C can support feature X, the discussion is interesting (to me
at least) only in so far as the "nuclear option" is not invoked. All
features can be supported in an extended C -- just add

__Undefined;

at the top and you can have bit addressing, arbitrary precision numbers,
closures and what have you. What's interesting is how well the new
facility integrates with the rest of the language. How much of the
existing semantics must be thrown out? How much of the normal behaviour
of C can be applied to the new types or features? Once you use a
_Nibble type, it's perfectly permissible for the expression 1 + 2 to
have the value "banana", but we'd really like most of C to remain
unchanged.

Agreed, and Objective-C comes to mind.

But to be honest, I'd prefer to use a 'malloc' wrapper, "array access"
wrapper function (which could be inlined during translation), etc. that
know about nibbles and perform the required computations, rather than to
spend hours defining a new
language^H^H^H^H^H^H^H^H^H^H^H^H^H^Hextensions. $0.02.
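
[Editor's sketch of the wrapper approach described above, packing two
nibbles per byte; all names here are invented, not a real API.]

#include <stdlib.h>

/* Allocate room for n 4-bit elements, two per byte, rounded up. */
static unsigned char *nib_alloc(size_t n)
{
    return calloc((n + 1) / 2, 1);
}

/* Read the i-th nibble: low half of its byte for even i, high for odd. */
static unsigned nib_get(const unsigned char *p, size_t i)
{
    return (i % 2) ? (unsigned)(p[i / 2] >> 4) : (unsigned)(p[i / 2] & 0x0F);
}

/* Write the i-th nibble, preserving its neighbour in the same byte. */
static void nib_set(unsigned char *p, size_t i, unsigned v)
{
    v &= 0x0F;
    if (i % 2)
        p[i / 2] = (unsigned char)((p[i / 2] & 0x0F) | (v << 4));
    else
        p[i / 2] = (unsigned char)((p[i / 2] & 0xF0) | v);
}

Nothing here requires extending the language: sizeof keeps its normal
meaning and only the wrappers know the packing.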
 

Tim Rentsch

Ben Bacarisse said:
Tim Rentsch said:
BartC said:
There's no reason a conforming implementation couldn't provide its
own four-bit data types, including arrays of such data types. The
rule that all non-bitfield types must be multiples of 'char' in
width applies only to Standard-defined types, not to types provided
as part of extensions.

OK, I agree. It does have the complication that

nybble *x;
x=malloc(100*sizeof(*x));

won't allocate 100 nybbles, unless sizeof works differently.
[snip]

The sizeof operator can be defined in a way that satisfies all
the Standard's requirements but still allows this example to
allocate only 100 nibbles.

So what would be the value of sizeof(*x)? It can only really
be 1 or 0. [snip elaboration]

Actually that isn't right. Using 'sizeof' on a non-standard
datatype may behave in unexpected ways. Here's a hint: sizeof
yields a result of type size_t; size_t is an (unsigned) integer
type; integer types larger than character types may have padding
bits. Can you fill in the rest?

I can't. Can the value of an operator that yields an integer have
padding bits? The wording is not always 100% unambiguous, but my
understanding is that padding bits are part of the representation.

The point of my comment is not that it is possible but that it is
feasible. It's easy enough to make sizeof give an appropriate
value. The question is, can we devise a representation for size_t
that can hold such values reliably but still falls within the
Standard's boundaries for integer types, and also has a reasonable
implementation?
My first thought was that you want the padding bits (always
supposing they can be there in a value) to encode the fact the
result is a nibble size rather than a byte size, but I still can't
square that with the need for the result to be both "an integer"
and "the size (in bytes) of its operand".

I don't know quite how to respond to this. What we're talking
about is an extension to standard C. It seems obvious that an
extension to C necessarily involves undefined behavior, otherwise
it wouldn't be an extension. On top of that, I already said in
the previous posting that "Using 'sizeof' on a non-standard
datatype may behave in unexpected ways." The Standard doesn't
impose any requirements for sizeof applied to such data types, or
indeed any type outside of those defined in the Standard. The
interesting part is what happens downstream from sizeof. In
particular, can we define semantics for size_t that both satisfy
the requirements for regular C data types and also allow size_t
values for the four-bit data types to be used in the same sorts of
ways that their Standard-defined counterparts are, and still have
them work as expected? (And obviously I think the answer is yes.)
 

Tim Rentsch

Richard Damon said:
Tim Rentsch said:
"Tim Rentsch" <[email protected]> wrote in message
The sizeof operator can be defined in a way that satisfies all
the Standard's requirements but still allows this example to
allocate only 100 nibbles.

So what would be the value of sizeof(*x)? It can only really
be 1 or 0. [snip elaboration]

Actually that isn't right. Using 'sizeof' on a non-standard
datatype may behave in unexpected ways. Here's a hint: sizeof
yields a result of type size_t; size_t is an (unsigned) integer
type; integer types larger than character types may have padding
bits. Can you fill in the rest?

What, that extra information is stored in those padding bits? How is
that going to help?

What bit-pattern could be returned by sizeof(*x) that would make
100*sizeof(*x) yield 50 instead of 100?

one simple solution is to have the sizeof operator not return a
size_t for the nybble type, but some special type that does math
"funny" to get the right value, for example a fixed point type
with 1 fractional bit. When doing arithmetic on this type, the
compiler can get the "right" answer, and also do the right thing
when converting it to a standard type.

This is basically the idea, except the result isn't a new type
but is always a size_t. The key insight is that size_t can be
what is in effect a fixed-point type (with three fraction bits,
for example), but still satisfy the requirements for being an
integer type by designating the fraction bits as "padding bits".
Any combination of fraction bits other than all zeroes would be
a trap representation, allowing both standard behavior and
extended behavior in the same data type (ie, size_t).

It would say that

malloc(100 * sizeof(*x));

and

size_t size = sizeof(*x);
malloc(100 * size);

might return different-sized allocations, but as long as the
"nybble" type is given a name in the implementation's reserved
name space (like _Nybble), the use of that name leads to
"undefined behavior" by the standard, which is then defined by
the implementation to be useful.

I would find discrepancies like this disquieting. And there are
other cases, eg, calloc(), where preserving the fractional
information in size_t could be important. It seems better to
have size_t be able to carry around the extra information,
since generally that should yield higher fidelity overall.
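
[Editor's sketch of the arithmetic behind this scheme: sizes carried in
eighths of a byte, with the three low-order bits as the "fraction".
Purely illustrative; a real implementation would hide all of this.]

#include <stdio.h>

int main(void)
{
    /* Model: a size is stored as bytes * 8, so the three low-order bits
       carry the fraction.  A 4-bit _Nybble then has encoded size 4,
       i.e. half a byte. */
    unsigned long nyb = 4;

    printf("%lu\n", (100 * nyb) >> 3);   /* 400/8 == 50 bytes for 100 nibbles */
    printf("%lu\n", (99 * nyb) >> 3);    /* truncates to 49; a real scheme
                                            needs a rounding (or trapping) rule */
    return 0;
}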
 

Tim Rentsch

BartC said:
Tim Rentsch said:
BartC said:
The sizeof operator can be defined in a way that satisfies all
the Standard's requirements but still allows this example to
allocate only 100 nibbles.

So what would be the value of sizeof(*x)? It can only really
be 1 or 0. [snip elaboration]

Actually that isn't right. Using 'sizeof' on a non-standard
datatype may behave in unexpected ways. Here's a hint: sizeof
yields a result of type size_t; size_t is an (unsigned) integer
type; integer types larger than character types may have padding
bits. Can you fill in the rest?

What, that extra information is stored in those padding bits?
How is that going to help?

What bit-pattern could be returned by sizeof(*x) that would
make 100*sizeof(*x) yield 50 instead of 100?

Richard Damon gives an idea along similar lines to what I was
thinking. My response to his posting has the details.
 

Tim Rentsch

BartC said:
Because for most usages they act just like regular types!
Note that the change in behavior that I described is not
"dramatically" different from that of normal types; sizeof just
returns a type that can express fractions of a "byte" and
defines math to make this work as normal.

That would need to include all conversions to int being rounded
up, [snip elaboration].

Another possibility would be to trap any conversion that would
lose information. The implementation could provide various
rounding functions (guaranteed to succeed) for converting
possibly non-integral size_t values. Alternatively, assigning a
size_t to a floating-point type could be used to extract the
fractional value safely, or provide rounding as desired. Needless
to say the actual code generated wouldn't necessarily have any
floating-point conversions in it, under the "as if" rule.
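
[Editor's sketch of what such rounding helpers might look like,
continuing the eighths-of-a-byte model above; every name here is
hypothetical.]

/* Encoded sizes carry eighths of a byte in the three low-order bits. */
static unsigned long size_floor_bytes(unsigned long enc)
{
    return enc >> 3;            /* round down to whole bytes */
}

static unsigned long size_ceil_bytes(unsigned long enc)
{
    return (enc + 7) >> 3;      /* round up to whole bytes */
}

/* Extracting the exact value via floating point, as suggested above. */
static double size_exact_bytes(unsigned long enc)
{
    return enc / 8.0;           /* a nibble's encoded 4 becomes 0.5 */
}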
 

Tim Rentsch

Ben Bacarisse said:
Richard Damon said:
The sizeof operator can be defined in a way that satisfies all
the Standard's requirements but still allows this example to
allocate only 100 nibbles.

So what would be the value of sizeof(*x)? It can only really
be 1 or 0. [snip elaboration]

Actually that isn't right. Using 'sizeof' on a non-standard
datatype may behave in unexpected ways. Here's a hint: sizeof
yields a result of type size_t; size_t is an (unsigned) integer
type; integer types larger than character types may have padding
bits. Can you fill in the rest?

What, that extra information is stored in those padding bits? How is
that going to help?

What bit-pattern could be returned by sizeof(*x) that would make
100*sizeof(*x) yield 50 instead of 100?

one simple solution is to have the sizeof operator not return a size_t
for the nybble type, but some special type that does math "funny" to get
the right value, for example a fixed point type with 1 fractional bit.
When doing arithmetic on this type, the compiler can get the "right"
answer, and also do the right thing when converting it to a standard
type.

I don't see how that can be permitted, at least without twisting
things some more! The type of a sizeof expression is size_t (an
unsigned integer type) and the result must be "an integer".

Here's another way of looking at it that may help. Using sizeof
is supposed to give the size of its operand in bytes. For a
four-bit data type, that should be a number strictly between
zero and one. Under 6.5 p5, such a circumstance qualifies as an
exceptional condition and therefore is undefined behavior.
If (and this is a moot point) a value can have padding bits,
the result could be a trap representation and then all bets are
off and the implementation could do what it likes, but I don't
think that's how values work.


Are you imagining a sort of nuclear option for UB? I.e. that
just using some non-standard type anywhere in a program makes
all of it undefined? That's probably defensible, but it's not
how most people like to talk about such things.

I was meaning to say something stronger, or at least something
I think of as stronger. There is local undefined behavior at sizeof,
because of the exceptional condition, and another local undefined
behavior for a size_t with non-zero fraction bits. However, once
these two local undefined behaviors are defined, everything else
proceeds definedly (not counting things like pointers to the new
type, etc, which also have to be defined, but I think you get the
idea). The presence of undefined behavior is repaired purely
locally by defining the semantics for these two specific cases, and
otherwise has no effect (again assuming that other aspects have
been defined suitably).
 

Shao Miller

Richard Damon said:
The sizeof operator can be defined in a way that satisfies all
the Standard's requirements but still allows this example to
allocate only 100 nibbles.

So what would be the value of sizeof(*x)? It can only really
be 1 or 0. [snip elaboration]

Actually that isn't right. Using 'sizeof' on a non-standard
datatype may behave in unexpected ways. Here's a hint: sizeof
yields a result of type size_t; size_t is an (unsigned) integer
type; integer types larger than character types may have padding
bits. Can you fill in the rest?

What, that extra information is stored in those padding bits? How is
that going to help?

What bit-pattern could be returned by sizeof(*x) that would make
100*sizeof(*x) yield 50 instead of 100?

one simple solution is to have the sizeof operator not return a
size_t for the nybble type, but some special type that does math
"funny" to get the right value, for example a fixed point type
with 1 fractional bit. When doing arithmetic on this type, the
compiler can get the "right" answer, and also do the right thing
when converting it to a standard type.

This is basically the idea, except the result isn't a new type
but is always a size_t. The key insight is that size_t can be
what is in effect a fixed-point type (with three fraction bits,
for example), but still satisfy the requirements for being an
integer type by designating the fraction bits as "padding bits".
Any combination of fraction bits other than all zeroes would be
a trap representation, allowing both standard behavior and
extended behavior in the same data type (ie, size_t).

Why bother calling it a "trap representation"? 'sizeof (_Nibble)'
doesn't involve any lvalues, so any usage of the result doesn't have to
do with trap representations until it's been stored somewhere and read
back [6.2.6.1p5]. If, instead, you say that the undefined behaviour is
invoked by '_Nibble' not being a complete object type but also not
contradicting the constraints of 6.5.3.4p1, then you don't _need_ any
trap representations nor even a 'size_t' result...

But if you want to call it a trap representation when you store the
undefined 'size_t' value because that object representation will have
padding bits and does "not represent a value of the object type", so be
it. It just seems a bit arbitrary... Why not call it "an invalid
value", instead?
 
