size of a pointer on 4-bit system

BartC · Feb 5, 2013

Shao Miller said:
On 2/4/2013 20:17, BartC wrote:

Why do they have to be crude? Functions that the implementation provides
as extensions needn't be crude at all, and could translate with whatever
efficiency is possible.

A function library created with ordinary C with no implementation help will
always be lacking.

There is no overloading of the functions (so you'd need separate sets to
deal with 1-bit and 4-bit, unless you made the width a parameter, which is
unwieldy). No overloading of built-in operators (so you'd be writing
indexbit(a,i) instead of a[i]). And so on. There's no integration with the
language.

And without being a standard language feature, every implementation will be
different.

bit s[256]; //32 bytes
bit* p = &s[123];

/* Is this allowed? */
struct {
int i;
bit ba[5];
double d;
} foo;
size_t sz = sizeof foo;

That's tricky, because C already has bit-fields inside structs! So I'm not
sure how they would interact, or whether successive odd-length bit-arrays
would be packed together, so the next one could start in the middle of a
byte.

(In my implementation, I don't have single bits as independent variables or
struct members; they're only allowed in arrays or as pointer targets (so
it's possible to point to the middle of an ordinary int for example). Your
ba[5] example would start on a byte boundary, and its size would be rounded
up to the next byte, ie. 1 byte.)
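
For comparison, here is a minimal sketch of the "ordinary C, width as a parameter" library approach described above. The names field_get and field_set are hypothetical and do not come from any implementation mentioned in the thread. The sketch assumes CHAR_BIT is 8 and a width of 1, 2, 4 or 8, so that a field never straddles a byte boundary.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helpers: read and write the `width`-bit field at index `i`
   of a packed byte array. Assumes CHAR_BIT == 8 and width in {1, 2, 4, 8},
   so a field never crosses a byte boundary. Field 0 occupies the least
   significant bits of byte 0 (an arbitrary choice for this sketch). */
static unsigned field_get(const unsigned char *a, size_t i, unsigned width)
{
    size_t bitpos = i * width;                      /* absolute bit offset */
    unsigned shift = (unsigned)(bitpos % CHAR_BIT); /* offset inside byte  */
    unsigned mask = (1u << width) - 1u;
    return (a[bitpos / CHAR_BIT] >> shift) & mask;
}

static void field_set(unsigned char *a, size_t i, unsigned width, unsigned v)
{
    size_t bitpos = i * width;
    unsigned shift = (unsigned)(bitpos % CHAR_BIT);
    unsigned mask = ((1u << width) - 1u) << shift;
    unsigned char *byte = &a[bitpos / CHAR_BIT];
    /* Clear the old field, then merge in the new value. */
    *byte = (unsigned char)((*byte & ~mask) | ((v << shift) & mask));
}
```

With an unsigned char s[32] standing in for bit s[256], field_set(s, 123, 1, 1) plays the role of s[123] = 1. The extra width argument on every call is the unwieldiness the post complains about.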

glen herrmannsfeldt · Feb 5, 2013

(snip)

It's 'sizeof', not 'sizeof()', if you please. Else we could
discuss the '()+()' operator, the '*()' and '()*()' operators,
etc.

You mean the operators used in #define macros to avoid precedence
problems with the expressions used?

#define square(x) ((x)*(x))

or

#define square(x) x*x
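
To make the precedence problem concrete, the sketch below keeps both spellings of the macro side by side. The names SQUARE_SAFE, SQUARE_NAIVE, safe_result and naive_result are made up for this illustration so that the two definitions can coexist.

```c
/* Two spellings of the macro, renamed (hypothetically) so both can coexist. */
#define SQUARE_SAFE(x)  ((x)*(x))
#define SQUARE_NAIVE(x) x*x

/* Expands to ((1 + 2)*(1 + 2)), which is 9. */
static int safe_result(void)
{
    return SQUARE_SAFE(1 + 2);
}

/* Expands to 1 + 2*1 + 2, which is 5 because * binds tighter than +. */
static int naive_result(void)
{
    return SQUARE_NAIVE(1 + 2);
}
```

Neither form helps with arguments that have side effects: the argument text is still expanded twice.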

-- glen

glen herrmannsfeldt · Feb 5, 2013

(snip, regarding the 4004, I wrote)

I agree, that for some of the very small 4-bitters (at least the 4004
allowed you to read static data from ROM - more than a few
four-bitters didn't), a C compiler is a bit silly at best, although a
limited C-like language might be nice in some cases.

Well, do they at least have load immediate?

If I remember, the 8048 has a way to get data from ROM, but not
so convenient as it could be.

Still, it's a
rare four-bitter that has more than a couple of K words of instruction
memory and 128 nibbles of data memory - at that point the overhead to
implement an eight-bitter becomes trivial (an 8051 core can be done in
less than 5000* transistors - just adding 128 nibbles of ram will take
3000+, and 2K bytes of ROM are the equivalent of 16,000+). Nor does a
large program memory make sense on four-bitters; the invariably worse
code density on a four-bitter quickly grows the required ROM.

It seems to be 2300 for the 4004, and Intel released the schematics
as part of an anniversary celebration. I would have thought the 8051
needed more than 5000, though.

More interesting is a case where a larger CPU has smaller (finer?)
than byte addressing. This has been done (bit addressing on Burroughs
B1xxx machines, for example), but making it possible to take advantage
of that in a C program would clearly require some interesting
extensions.

I don't know the Burroughs machines. As I understand it, the
IBM Stretch is also bit addressable.

*While that may seem a lot compared to ~6000 for an 8080, the 8051
will have a number of peripherals (timers, I/O ports, clock
generators, interrupt controller) that would be external on an 8080.
Nor is that a sort of lower limit.

-- glen

James Kuyper · Feb 5, 2013

(snip)

You mean the operators used in #define macros to avoid precedence
problems with the expressions used?

#define square(x) ((x)*(x))

In responding to that question, I wanted to refer to the "()" operator,
but the standard never uses that term, so I'll be making a slight
<detour>
The standard defines what the term operator means in 6.4.6p2:
"A punctuator is a symbol that has independent syntactic and semantic
significance. Depending on context, it may specify an operation to be
performed (which in turn may yield a value or a function designator,
produce a side effect, or some combination thereof) in which case it is
known as an operator (other forms of operator also exist in some
contexts)."

The standard never provides a comprehensive definition that includes
those other forms, but it does refer to "the sizeof operator" in several
different contexts. It also refers to "the subscript operator []", so
it's clear that constructs that work like "()" can be called operators,
but it never refers to "the parenthesis operator ()". However, with
those precedents, I think we're entitled to refer to "the parenthesis
operator ()" whenever parentheses are used to create a
primary-expression in accordance with 6.5.1p4.
</detour>

As a macro, square() can be applied to arguments and in contexts where
the following explanation is completely meaningless. However, in normal
use, expansion of square() results in an expression making one use of
the binary '*' operator and three uses of the parenthesis operator. It
does not make sense to describe it in terms of a ()*() operator.

Similarly, sizeof(5) is an expression making use of one sizeof operator
and one parenthesis operator; it doesn't make sense to talk about it in
terms of a combined sizeof() operator, for the same reason that it
doesn't make sense to talk about a combined ()*() operator.
However, in sizeof(int), the parentheses are part of the syntax of the
sizeof expression, so in that context sizeof() is an operator.
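
A small sketch of the two grammatical forms may help here. The function names are invented for the illustration; the only fact relied on is that the constant 5 has type int.

```c
#include <stddef.h>

/* sizeof applied to an expression: parentheses are optional and, when
   present, belong to the operand (a parenthesized primary expression). */
static size_t size_of_expr_bare(void)
{
    return sizeof 5;
}

static size_t size_of_expr_parens(void)
{
    return sizeof(5);
}

/* sizeof applied to a type name: the grammar requires the parentheses. */
static size_t size_of_type(void)
{
    return sizeof(int);
}

/* return sizeof int;   would not compile: a type name needs parentheses */
```

All three functions return the same value, since each operand has type int.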

Keith Thompson · Feb 5, 2013

James Kuyper said:
Similarly, sizeof(5) is an expression making use of one sizeof operator
and one parenthesis operator; it doesn't make sense to talk about it in
terms of a combined sizeof() operator, for the same reason that it
doesn't make sense to talk about a combined ()*() operator.
However, in sizeof(int), the parentheses are part of the syntax of the
sizeof expression, so in that context sizeof() is an operator.

Yes, that's the way the language defines it, no argument there.

But it's not the way I would have defined it.

I think it would have been more consistent to restrict the idea
of an "operator" to something that takes one or more operands,
each of which is an expression, with the operator and its operands
also being an expression. I dislike referring to the `(int)` in
`sizeof (int)` as an operand. (It would be fine if C permitted
parenthesized type names to be expressions, but it doesn't.)
Similarly, I dislike treating `.` as an operator; its left operand
is an expression (of struct or union type), but its right "operand"
can only be an identifier that names a member, and that cannot be
used as expression by itself.

If it had been up to me, the `sizeof` in `sizeof expr` would be
considered an operator, but the `sizeof` in `sizeof (type-name)`
would not; instead, the whole thing would be just a special kind
of expression. And `.member-name` might be treated as a postfix
operator that can be applied to an expression of struct or union
type; either that, or `prefix.member-name` would be another special
kind of expression. (The latter avoids having a potentially
unlimited number of distinct postfix operators.)

Likewise for `_Alignof` and `->`, of course.

But if I were looking for perfect consistency, I wouldn't be
programming in C.

Roberto Waltman · Feb 5, 2013

Robert said:
More interesting is a case where a larger CPU has smaller (finer?)
than byte addressing. This has been done (bit addressing on Burroughs
B1xxx machines, for example), but making it possible to take advantage
of that in a C program would clearly require some interesting
extensions.

The Intel 8051/2 have a small area of memory configured as single bit
variables that can be directly addressed by the instruction set
without the usual mask/shift/or/and operations.

The Keil C compiler for the '51 has a "bit" type mapped directly to
those. It is possible to set them, clear, compare, etc., but you can
not take their address, declare a pointer to them, etc.

glen herrmannsfeldt · Feb 5, 2013

(snip on operators, or not)

Yes, that's the way the language defines it, no argument there.

But it's not the way I would have defined it.

I think it would have been more consistent to restrict the idea
of an "operator" to something that takes one or more operands,
each of which is an expression, with the operator and its operands
also being an expression. I dislike referring to the `(int)` in
`sizeof (int)` as an operand. (It would be fine if C permitted
parenthesized type names to be expressions, but it doesn't.)
Similarly, I dislike treating `.` as an operator; its left operand
is an expression (of struct or union type), but its right "operand"
can only be an identifier that names a member, and that cannot be
used as expression by itself.

I suppose, but compare to Fortran where % (the structure member
character), () (array subscript or string substring) and ()
(function call) are not operators.

Unlike C, you can't reference a structure member, subscript,
or call a function from the return value of a function in
Fortran. The value of a function returning a structure, string,
or array has to be assigned to an appropriate variable, and
you reference the member, substring, or element of that variable.
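
For the C side of that comparison, here is a short sketch in which the postfix forms are applied directly to a function's return value. All names (make_point, table, pick and so on) are hypothetical and exist only for this illustration.

```c
struct point { int x, y; };

static struct point make_point(void)
{
    struct point p = { 3, 4 };
    return p;
}

/* Returns a pointer to the first element of a static array. */
static const int *table(void)
{
    static const int t[3] = { 10, 20, 30 };
    return t;
}

static int twice(int n)
{
    return 2 * n;
}

/* Returns a pointer to a function taking int and returning int. */
static int (*pick(void))(int)
{
    return twice;
}

static int member_of_result(void)  { return make_point().x; } /* member access */
static int element_of_result(void) { return table()[1]; }     /* subscript     */
static int call_of_result(void)    { return pick()(21); }     /* function call */
```

No intermediate variable is needed in any of the three cases.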

Also, Fortran has functions that work in ways similar to the sizeof
operator, including, in newer standards, C_SIZEOF(). Yes, it is defined
as a function, though I don't believe that you could write one
in Fortran (or C).

If it had been up to me, the `sizeof` in `sizeof expr` would be
considered an operator, but the `sizeof` in `sizeof (type-name)`
would not; instead, the whole thing would be just a special kind
of expression. And `.member-name` might be treated as a postfix
operator that can be applied to an expression of struct or union
type; either that, or `prefix.member-name` would be another special
kind of expression. (The latter avoids having a potentially
unlimited number of distinct postfix operators.)

Likewise for `_Alignof` and `->`, of course.

But if I were looking for perfect consistency, I wouldn't be
programming in C.

-- glen

Shao Miller · Feb 5, 2013

A function library created with ordinary C with no implementation help will
always be lacking.

Yes; sorry. I meant that the implementation could allow #include
<bitstuff.h> to do something useful, such as exposing extensions with
non-reserved identifiers. Any functions (or function-like thingies)
that the implementation provides as extensions needn't be implemented in C.

There is no overloading of the functions (so you'd need separate sets to
deal with 1-bit and 4-bit, unless you made the width a parameter, which is
unwieldy). No overloading of built-in operators (so you'd be writing
indexbit(a,i) instead of a[i]). And so on. There's no integration with the
language.

I'd rather that such an extension actually not be confused with C...
For example, the subscript operator is defined (in part) in terms of
pointer arithmetic and indirection. Pointers point to bytes, not
nibbles. So I'd rather see a different syntax, personally.
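
As a reminder of what that definition means in practice, the sketch below spells the same element access three ways. The function names are invented for the example; the equivalence itself is the standard's definition of E1[E2] as (*((E1)+(E2))).

```c
/* E1[E2] is defined as (*((E1)+(E2))), so all three functions below
   read the same element. */
static int via_subscript(const int *a, int i)
{
    return a[i];
}

static int via_pointer(const int *a, int i)
{
    return *(a + i);
}

/* Addition commutes, so the operands of [] can be swapped. */
static int via_swapped(const int *a, int i)
{
    return i[a];
}
```

Because a + i steps in units of the pointed-to object, whose size is a whole number of bytes, a sub-byte element type has no natural place in this definition.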

And without being a standard language feature, every implementation will be
different.

Yup.

bit s[256]; //32 bytes
bit* p = &s[123];

/* Is this allowed? */
struct {
int i;
bit ba[5];
double d;
} foo;
size_t sz = sizeof foo;

That's tricky, because C already has bit-fields inside structs! So I'm not
sure how they would interact, or whether successive odd-length bit-arrays
would be packed together, so the next one could start in the middle of a
byte.

Well that's the thing... Why does 'bit' or '_Nybble' or whatever have
to be a complete object type? If we drop that claim, perhaps we can
avoid complications like the above, addressing, etc., then use a
convenient (but different) syntax for dealing with these things.
Someone _learning_ from the code also has a better chance of recognizing
such usage as extensions, too. (I'd think.)

(In my implementation, I don't have single bits as independent variables
or struct members; they're only allowed in arrays or as pointer targets
(so it's possible to point to the middle of an ordinary int for
example). Your ba[5] example would start on a byte boundary, and its
size would be rounded up to the next byte, ie. 1 byte.)

You have a C implementation? Your pointers can point to bits?

Shao Miller · Feb 5, 2013

(snip)

You mean the operators used in #define macros to avoid precedence
problems with the expressions used?

#define square(x) ((x)*(x))

or

#define square(x) x*x

Heheh, sure. My favourite is probably:

#define Countof(array) (sizeof (array) / sizeof *(array))

Beyond that, 'sizeof()' is obviously valid because it's a description of
the terminal syntax elements for 'sizeof ( type-name )'. But that
leaves out 'sizeof unary-expression', so plain 'sizeof' seems more
appropriate in a discussion of this operator. 'sizeof()' just looks
irritatingly like whoever types it thinks it's a function! (And
sometimes they do!)
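
A short usage sketch for the Countof macro follows. The array primes and the function count_of_primes are hypothetical names chosen for the example.

```c
#include <stddef.h>

#define Countof(array) (sizeof (array) / sizeof *(array))

static const int primes[] = { 2, 3, 5, 7, 11 };

/* The operand is a true array, so the macro yields its element count. */
static size_t count_of_primes(void)
{
    return Countof(primes);
}

/* Pitfall: inside a function, a parameter declared as `const int a[]` is
   really a pointer, so Countof(a) would compute
   sizeof (const int *) / sizeof (const int) rather than the element count. */
```

Note that both uses of sizeof in the macro take an expression operand, so the parentheses around array are there only for macro hygiene.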

Shao Miller · Feb 5, 2013

(snip)

You mean the operators used in #define macros to avoid precedence
problems with the expressions used?

#define square(x) ((x)*(x))

In responding to that question, I wanted to refer to the "()" operator,
but the standard never uses that term, so I'll be making a slight
<detour>
The standard defines what the term operator means in 6.4.6p2:
"A punctuator is a symbol that has independent syntactic and semantic
significance. Depending on context, it may specify an operation to be
performed (which in turn may yield a value or a function designator,
produce a side effect, or some combination thereof) in which case it is
known as an operator (other forms of operator also exist in some
contexts)."

The standard never provides a comprehensive definition that includes
those other forms, but it does refer to "the sizeof operator" in several
different contexts. It also refers to "the subscript operator []", so
it's clear that constructs that work like "()" can be called operators,
but it never refers to "the parenthesis operator ()". However, with
those precedents, I think we're entitled to refer to "the parenthesis
operator ()" whenever parentheses are used to create a
primary-expression in accordance with 6.5.1p4.
</detour>

Parentheses are the terminal syntax elements for a few different cases,
including casts, function calls, compound literals. Typing "the '()'
operator" doesn't help too much to distinguish between these cases.

BartC · Feb 5, 2013

You have a C implementation? Your pointers can point to bits?

No, only an implementation of bit-types in another language. But many of the
issues are the same.

And pointers can point to bits; they just require an ordinary pointer plus a
bit-offset.

That would be an extremely useful addition to C; in fact probably more
useful than bit-types themselves. But it's unlikely they will ever form a
part of the language when it's so easy to throw something together with
shifts and masks and bit-pointers emulated using (char*,int) structs. It
doesn't even have binary literals so I can't see it happening!
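
The emulation mentioned above might look roughly like the following sketch. The struct and function names are hypothetical, and the bit numbering (bit 0 is the least significant bit of each byte) is an assumption made for the example, not something the thread specifies.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical emulation of a "pointer to bit": a byte address plus a
   bit offset in the range 0 .. CHAR_BIT-1. Bit 0 is assumed to be the
   least significant bit of each byte. */
struct bitptr {
    unsigned char *byte;
    unsigned bit;
};

/* Point at bit number bit_index, counted from the start of base. */
static struct bitptr bitptr_make(void *base, size_t bit_index)
{
    struct bitptr p;
    p.byte = (unsigned char *)base + bit_index / CHAR_BIT;
    p.bit = (unsigned)(bit_index % CHAR_BIT);
    return p;
}

/* Advance by n bits, carrying any overflow into the byte address. */
static struct bitptr bitptr_add(struct bitptr p, size_t n)
{
    size_t total = p.bit + n;
    p.byte += total / CHAR_BIT;
    p.bit = (unsigned)(total % CHAR_BIT);
    return p;
}

static int bitptr_read(struct bitptr p)
{
    return (*p.byte >> p.bit) & 1;
}

static void bitptr_write(struct bitptr p, int value)
{
    unsigned char mask = (unsigned char)(1u << p.bit);
    if (value)
        *p.byte |= mask;
    else
        *p.byte &= (unsigned char)~mask;
}
```

Since base is a void pointer, the same helpers can point into the middle of an ordinary int as well as into a byte array. What the emulation cannot provide is operator syntax such as *p or p + 1.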

David Thompson · Feb 11, 2013

Well, when BASIC was new ASCII had an up-arrow character, later
replaced by tilde. EBCDIC has no up-arrow or tilde, so it wasn't
likely that any IBM language would use them.

Nit: 5/14 was uparrow and became officially circumflex often called
caret or hat. IIRC EBCDIC of those days did not have circumflex but
did have hook aka logical-not, which was usually a good mapping.

(Also 5/15 was leftarrow and became underline.)

glen herrmannsfeldt · Feb 11, 2013

David Thompson said:
On Sun, 3 Feb 2013 00:03:02 +0000 (UTC), glen herrmannsfeldt

Nit: 5/14 was uparrow and became officially circumflex often called
caret or hat. IIRC EBCDIC of those days did not have circumflex but
did have hook aka logical-not, which was usually a good mapping.

Yes. EBCDIC has not and cent, ASCII has tilde and circumflex.
They usually get cross mapped, but not always the same way.

After a while, I forget which one should go to which.

Most fun is seeing PL/I with the tilde-equal operator.

(Also 5/15 was leftarrow and became underline.)

Yes, that too.

-- glen

Tim Rentsch · Feb 12, 2013

Keith Thompson said:
Tim Rentsch said:

If you think about this a little while I expect you'll see that
expressions like (size_t)1 / 2 must not set any "fraction" bits.
This expression is well-defined by the Standard - it must behave
exactly like 0 in all respects for all subsequent operations (ie,
operations whose behavior is defined by the Standard). The only
value operations that produce size_t representations with non-zero
fractions must have an element of undefined behavior, or possibly
implementation-defined behavior. This expression is completely
defined so it mustn't do that.

Hmm.

There can be representations for an integer 0 other than all-bits-zero,
so I'm not sure that having `(size_t)1 / 2` set some of the
padding/fraction bits to 1 would be forbidden. But certainly
`(size_t)1 / 2 * 2` must be 0.

On the other hand, we'd want `sizeof (_Nybble[2])` to denote 1
8-bit byte, and `sizeof (_Nybble[2]) / 2` to denote 1 4-bit nybble.

To get that result, you could use

sizeof (char[ sizeof(nibble[2]) / 2. ])

taking advantage of a companion extension that allows floating-point
values as integer expressions in array bounds expressions. (Note
for purists: after doing a #include <nibble.h>.)

<editorial>
Spelling nibble with a 'y' rather than 'i' is an affectation.
There is good reason to use 'y' rather than 'i' in byte, and
it also makes sense phonetically. Neither is true for nibble.
</editorial>

The only way I can think of to make this work consistently
would be to add another padding bit to size_t, a flag that
indicates whether it's an ordinary C size value or something
that takes nybbles into account.

Possible. I don't think it's absolutely necessary, but it
might be worth trying out.

I'm even more convinced that it wouldn't be worth the effort.
The C language is not conveniently portable to 4-bit
addressable systems (or trinary systems, or analog systems, or
...).

I think you're assuming the target of such an implementation
would be a four-bit system. That needn't be the case. Certainly
I would like a C implementation on a large system that supported
four-bit, two-bit, and one-bit quantities.
