sizeof (size_t) and sizeof (pointer)

Discussion in 'C++' started by Alex Vinokur, Nov 12, 2007.

  1. Alex Vinokur

    Alex Vinokur Guest

    Alex Vinokur, Nov 12, 2007
    #1

  2. Ron Natalie (Guest)

    Alex Vinokur wrote:
    > Does it have to be? :
    > sizeof (size_t) >= sizeof (pointer)
    >

    No. size_t only has to be big enough to represent the
    maximum number of objects that could be created. There
    are implementations where the size of the pointer is bigger
    than even the number of chars that could be allocated (i.e.,
    not all the bits in the pointer were used to contribute
    to the address). It's also not the case that all pointers
    need to be the same size.
    Ron Natalie, Nov 12, 2007
    #2

  3. * Alex Vinokur:
    > Does it have to be? :
    > sizeof (size_t) >= sizeof (pointer)


    No.

    Cheers, & hth.,

    - Alf

    --
    A: Because it messes up the order in which people normally read text.
    Q: Why is it such a bad thing?
    A: Top-posting.
    Q: What is the most annoying thing on usenet and in e-mail?
    Alf P. Steinbach, Nov 12, 2007
    #3
  4. Ron Natalie wrote:
    > It's also not the case that all pointers
    > need to be the same size.


    Is that really so? I thought that it must be possible to cast any
    pointer to and from a void*. If there were different-sized pointers then
    it could be rather problematic.
    Juha Nieminen, Nov 12, 2007
    #4
  5. * Juha Nieminen:
    > Ron Natalie wrote:
    >> It's also not the case that all pointers
    >> need to be the same size.

    >
    > Is that really so? I thought that it must be possible to cast any
    > pointer to and from a void*. If there were different-sized pointers then
    > it could be rather problematic.


    There are different size pointers, e.g. member pointers tend to be
    larger than others. Then there are function pointers in general, which
    for freestanding functions tend to be the same size as data pointers,
    but cannot be cast to void*. And that's intentionally "problematic".

    Cheers, & hth.,

    - Alf

    Alf P. Steinbach, Nov 12, 2007
    #5
  6. Juha Nieminen wrote:
    > Ron Natalie wrote:
    >> It's also not the case that all pointers
    >> need to be the same size.

    >
    > Is that really so? I thought that it must be possible to cast any
    > pointer to and from a void*. If there were different-sized pointers
    > then it could be rather problematic.


    What if void* is at least as large as the largest of them? If the
    sizes do differ, it would make sense, no?

    V
    --
    Please remove capital 'A's when replying by e-mail
    I do not respond to top-posted replies, please don't ask
    Victor Bazarov, Nov 12, 2007
    #6
  7. Bo Persson (Guest)

    Juha Nieminen wrote:
    :: Ron Natalie wrote:
    ::: It's also not the case that all pointers
    ::: need to be the same size.
    ::
    :: Is that really so? I thought that it must be possible to cast any
    :: pointer to and from a void*. If there were different-sized
    :: pointers then it could be rather problematic.

    You can only reliably cast the pointer back to the original type. So,
    as long as void* is among the largest types, other pointers can be
    smaller.



    Bo Persson
    Bo Persson, Nov 12, 2007
    #7
  8. Alex Vinokur wrote:
    > Does it have to be? :
    > sizeof (size_t) >= sizeof (pointer)
    > ...


    Firstly, by 'pointer' you probably mean something like 'void*', since,
    say, an object of 'SomeType (SomeClass::*)()' is also a "pointer" (a
    pointer-to-member-function) and it most likely will be [a lot] bigger
    than a 'size_t' on the same implementation.

    Secondly, even the ordinary 'void*' can be bigger than 'size_t'. In
    fact, typical DOS/Win16 implementations used to have a 16-bit 'size_t'
    and 32-bit 'void*' pointers (depending on the memory model).
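
    As a rough illustration of the sizes being compared, the sketch below collects them for whatever implementation compiles it. `SomeClass`, `PointerSizes` and `measure_sizes` are hypothetical names introduced here, and the values differ between platforms; the standard fixes no ordering between `size_t` and these pointer types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class, used only to form a pointer-to-member-function type.
struct SomeClass {
    int method() { return 0; }
};

// The sizes discussed in the thread, in bytes, for this implementation.
struct PointerSizes {
    std::size_t size_t_bytes;
    std::size_t void_ptr_bytes;
    std::size_t func_ptr_bytes;
    std::size_t member_func_ptr_bytes;
};

// Collects the sizes; no relationship between them is assumed.
inline PointerSizes measure_sizes() {
    return PointerSizes{
        sizeof(std::size_t),
        sizeof(void*),
        sizeof(void (*)()),
        sizeof(int (SomeClass::*)()),
    };
}
```

    Printing the four fields (for example with `std::printf` and `%zu`) on a typical 64-bit Linux build with GCC commonly shows 8, 8, 8 and 16, but nothing in the language requires those numbers.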

    --
    Best regards,
    Andrey Tarasevich
    Andrey Tarasevich, Nov 13, 2007
    #8
  9. Juha Nieminen wrote:
    >> It's also not the case that all pointers
    >> need to be the same size.

    >
    > Is that really so? I thought that it must be possible to cast any
    > pointer to and from a void*. If there were different-sized pointers then
    > it could be rather problematic.


    Well, to be C++-pedantic, you cannot expect to cast literally _any_ pointer
    to 'void*' that way. Only pointers to object types can be cast to
    'void*' and back. Having noted that, the round-trip conversion 'T*' ->
    'void*' -> 'T*' is indeed guaranteed to preserve the original value of
    'T*' pointer, but that only means that value representation of 'void*'
    is at least as "precise" (as big) as the value representation of any
    'T*' type. Yet various 'T*' can still have different representations
    (including different sizes).
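
    The round trip described above can be sketched as follows. `round_trip` is a hypothetical helper name; the sketch covers only non-const object types, and it shows the one guarantee in question (getting the original pointer value back), not anything about the sizes involved.

```cpp
#include <cassert>

// Converts an object pointer to void* and back to its original type.
// Only this exact round trip is guaranteed to preserve the value.
template <typename T>
T* round_trip(T* p) {
    void* erased = p;                // implicit conversion: T* -> void*
    return static_cast<T*>(erased);  // explicit conversion back to T*
}
```

    For instance, with `int x = 42;`, `round_trip(&x)` compares equal to `&x`. Casting `erased` to some unrelated `U*` instead is not covered by the guarantee.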

    --
    Best regards,
    Andrey Tarasevich
    Andrey Tarasevich, Nov 13, 2007
    #9
  10. Ron Natalie wrote:
    > size_t only has to be big enough to represent the
    > maximum number of objects that could be created.


    Hmm... By definition, size_t has to be big enough to represent the
    number of bytes in a single object. If you prefer to express it in terms
    of "number of objects", it should probably sound like "size_t only has
    to be big enough to represent the maximum number of [contiguous] bytes
    that could be allocated for a single object". Although I don't see the
    point in trying to reformulate it like that.

    To say that it should represent "maximum number of objects that could be
    created" is misleading. Quite the opposite, in the general case there's
    nothing that prevents one from creating more objects than size_t can count.

    --
    Best regards,
    Andrey Tarasevich
    Andrey Tarasevich, Nov 13, 2007
    #10
  11. James Kanze (Guest)

    On Nov 12, 6:39 pm, Juha Nieminen wrote:
    > Ron Natalie wrote:
    > > It's also not the case that all pointers
    > > need to be the same size.


    > Is that really so? I thought that it must be possible to cast any
    > pointer to and from a void*. If there were different-sized pointers then
    > it could be rather problematic.


    It is, and your assumption is wrong. You can cast any pointer
    to an object to void*, but all you are guaranteed then is that
    you can cast it back to the original type without loss of
    information. There are also guarantees concerning accessing
    objects as arrays of char or unsigned char. All of which,
    together, more or less imply that sizeof(void*) == sizeof(char*)
    (an explicit requirement in the C standard), and that
    sizeof(void*) >= sizeof(T*) for all object types T. I've worked
    on systems where char* was larger than int*, and of course,
    systems where void (*)() had a different size than char* were
    quite frequent once upon a time---you can still find their
    descendants in use today. (The same systems often had a size_t
    which was smaller than a data pointer. And in some cases, data
    pointers or function pointers which were larger than any
    integral type.)

    You cannot, of course, convert a pointer to function, or any
    pointer to member, to a void*; an attempt to do so is an error,
    and requires a diagnostic.
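
    A compile-time check of the implicit conversions mentioned here might look like the sketch below. `S` and `converts_to_void_ptr` are hypothetical names. It tests implicit conversion only; since C++11 an explicit `reinterpret_cast` between function pointers and `void*` is conditionally supported, which this check does not cover.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class, used only to form pointer-to-member types.
struct S {
    int data;
    void method() {}
};

// True when From converts implicitly to void*.
template <typename From>
constexpr bool converts_to_void_ptr() {
    return std::is_convertible<From, void*>::value;
}

static_assert(converts_to_void_ptr<int*>(), "object pointers convert to void*");
static_assert(!converts_to_void_ptr<void (*)()>(), "function pointers do not");
static_assert(!converts_to_void_ptr<int S::*>(), "data member pointers do not");
static_assert(!converts_to_void_ptr<void (S::*)()>(), "member function pointers do not");
```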

    Some standards place stricter requirements on the
    implementation: POSIX does require that object pointers and function
    pointers have the same size and representation, for example.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
    James Kanze, Nov 13, 2007
    #11
  12. James Kanze (Guest)

    On Nov 12, 1:51 pm, Alex Vinokur wrote:
    > Does it have to be? :
    > sizeof (size_t) >= sizeof (pointer)


    Formally, there's no relation. Both can be pretty much anything
    the implementation wants. Practically, you have the
    relationship reversed: I can't think of a reason why
    sizeof(size_t) <= sizeof(char*) would ever fail to hold. (Of course, if
    by "pointer", you mean any pointer, and not just a pointer to an
    object, anything goes. I've used systems where the size of a
    function pointer was two bytes, but size_t and data pointers
    were four bytes.)

    James Kanze, Nov 13, 2007
    #12
  13. Ron Natalie (Guest)

    Juha Nieminen wrote:
    > Ron Natalie wrote:
    >> It's also not the case that all pointers
    >> need to be the same size.

    >
    > Is that really so? I thought that it must be possible to cast any
    > pointer to and from a void*. If there were different-sized pointers then
    > it could be rather problematic.


    Correct. void* has to have the same representation as char*, and that
    can hold any object pointer. Pointers to larger objects need not be as
    big. It is certainly not the
    case that you can do this and I've worked on machines where it would
    fail bizarrely:

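    char char_array[8];
    char* charp = char_array + 1;    // deliberately misaligned for int

    int* intp = (int*) charp;        // may drop low-order address bits
    char* charp2 = (char*) intp;     // not guaranteed to equal charp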

    There's no guarantee that int* need represent all the legal char*
    values. On some machines it would shift the low order bits of
    the pointer off in the conversion.
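
    One way to sidestep the pointer conversion altogether is to copy bytes rather than reinterpret the address. The sketch below is a suggestion, not something proposed in the thread, and it assumes the goal is simply to read or write an `int` at an arbitrary `char` offset; `load_int` and `store_int` are hypothetical names.

```cpp
#include <cassert>
#include <cstring>

// Reads an int whose bytes start at src, which may be misaligned for int.
// No int* is ever formed from the char address.
inline int load_int(const char* src) {
    int value;
    std::memcpy(&value, src, sizeof value);
    return value;
}

// Writes the bytes of value starting at dst, which may be misaligned for int.
inline void store_int(char* dst, int value) {
    std::memcpy(dst, &value, sizeof value);
}
```

    With `char buf[sizeof(int) + 1]`, calling `store_int(buf + 1, 12345)` followed by `load_int(buf + 1)` gives back 12345 without relying on what an `int*` can represent.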

    I did compiler work and ported UNIX to a Denelcor HEP supercomputer
    decades ago. This machine encoded the operand size in the low order
    bits of the non-character pointer. This can lead to all sorts of
    fun if you manage to do something like this:

    int x;
    union {
        int* ip;
        short* sp;
    } carbide;
    carbide.ip = &x;

    short* sp = carbide.sp;

    *sp = 5; // boom... sp has an "int" sized operand representation

    I know this because the BSD UNIX kernel did effectively the above all
    over internally.
    Ron Natalie, Nov 13, 2007
    #13
  14. Bo Persson wrote:
    > You can only reliably cast the pointer back to the original type. So,
    > as long as void* is among the largest types, other pointers can be
    > smaller.


    At least in gcc in a 32-bit linux system it seems that a method
    pointer is 8 bytes long, while a void* is 4 bytes.

    I know this is not related to standard C++ per se, but why does a
    method pointer need to be larger than a function pointer? I can't think
    of any technical reason for this, because a method cannot be called
    through a pointer without an object anyways, so any additional info the
    function pointer needs would be in that object, wouldn't it?
    Juha Nieminen, Nov 14, 2007
    #14
  15. On Wed, 14 Nov 2007 14:28:53 +0200, Juha Nieminen wrote:
    > At least in gcc in a 32-bit linux system it seems that a method
    > pointer is 8 bytes long, while a void* is 4 bytes.
    >
    > I know this is not related to standard C++ per se, but why does a
    > method pointer need to be larger than a function pointer? I can't think
    > of any technical reason for this, because a method cannot be called
    > through a pointer without an object anyways, so any additional info the
    > function pointer needs would be in that object, wouldn't it?


    Because a method pointer can point to a virtual function or a non-virtual
    function, and when declaring the method pointer, you cannot know where it
    will point to.

    Say, you have this:

    class A;
    typedef void (A::*Aptr) ();
    Aptr ptrtable[2];

    Are the pointers stored in ptrtable virtual or not? You don't know.
    You don't even know whether A has virtual functions or not, and thus
    whether there is a need to express virtual functions. So the pointer
    must be able to represent both cases.

    In fact, they may be both, virtual and non-virtual:

    class A
    {
    public:
        virtual void afunc() { }
        void bfunc() { }
    };

    int main()
    {
        ptrtable[0] = &A::afunc;
        ptrtable[1] = &A::bfunc;
    }

    This is valid code.

    On the 64-bit and 32-bit Linux systems, GCC and ICC implement method
    pointers as a pair of two pointer-size integers, with the following
    semantics:

    * If the first value is even, the second will be zero.
      In this case, the first value is a pointer to the member function,
      which is not virtual. To follow the pointer, just read the pointer
      and jump to that address.
    * If the first value is odd, this indicates a virtual function.
      In this case, the following algorithm will be applied to acquire
      the actual function address:
        1. Add the second value to the address of the instance for which
           you are calling the method. Read a pointer from that address.
        2. Add the first value, minus 1, to that address. Read a pointer
           from that resulting address.
        3. Jump to the address indicated by that pointer.

    With testing I couldn't figure out the situations where the second
    value would actually be non-zero, but I trust there are some.
    On different platforms, the mechanics behind method pointers can
    obviously be different.
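
    To look at what a particular compiler actually stores, the bytes of a method pointer can be copied out and inspected. The sketch below does only that; `raw_bytes` is a hypothetical helper, and how the bytes should be interpreted is implementation-specific, so the description above should be read as applying to the GCC/ICC builds mentioned and not in general.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

struct A {
    virtual void afunc() {}
    void bfunc() {}
};

using Aptr = void (A::*)();

// Copies the object representation of a method pointer into a byte array.
// What the bytes mean depends entirely on the compiler and ABI.
inline std::array<unsigned char, sizeof(Aptr)> raw_bytes(Aptr p) {
    std::array<unsigned char, sizeof(Aptr)> bytes{};
    std::memcpy(bytes.data(), &p, sizeof p);
    return bytes;
}
```

    Dumping `raw_bytes(&A::afunc)` and `raw_bytes(&A::bfunc)` in hex, and comparing `sizeof(Aptr)` with `sizeof(void*)`, is enough to check the two-word layout on a given platform.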

    --
    Joel Yliluoma - http://bisqwit.iki.fi/
    : comprehension = 1 / (2 ^ precision)
    Joel Yliluoma, Nov 15, 2007
    #15
  16. Ron Natalie (Guest)

    Joel Yliluoma wrote:

    >
    > Are the pointers stored in ptrtable virtual or not? You don't know.
    > You don't even know whether A has virtual functions or not, and thus
    > whether there is need to express virtual functions. So you need to
    > be able.
    >

    Further, in the case of virtual/multiple inheritance it needs to be able to
    have the offset to adjust the "this" pointer as well.

    If your compiler is ABSOLUTELY standards compliant, all pointers to
    member functions need to be the same size (regardless of whether
    there is virtual or multiple inheritance). This is because there
    is no "void*"-like super pointer for pointer-to-member and someone
    made the stupid-assed decision that you should thus be able to
    cast between pointer-to-member types and back without losing
    information.
    Ron Natalie, Nov 15, 2007
    #16
  17. James Kanze (Guest)

    On Nov 15, 1:54 pm, Ron Natalie wrote:
    > Joel Yliluoma wrote:


    > > Are the pointers stored in ptrtable virtual or not? You
    > > don't know. You don't even know whether A has virtual
    > > functions or not, and thus whether there is need to express
    > > virtual functions. So you need to be able.


    > Further, in the case of virtual/multiple inheritance it needs
    > to be able to have the offset to adjust the "this" pointer as
    > well.


    > If your compiler is ABSOLUTELY standards compliant, all
    > pointers to member functions need to be the same size
    > (regardless of whether there are virtual / multiple
    > inheritance). This is because there is no "void*" like super
    > poitner for pointer-to-member and someone made the
    > stupid-assed decision that you should thus be able to cast
    > between pointer-to-member types and back without losing
    > information.


    I don't think that that's the only reason. You can have a
    pointer to a member of an incomplete type, so the compiler
    cannot possibly know whether there are virtual functions,
    multiple inheritance, etc. or not. VC++ does the optimizations
    you refer to, unless you specify otherwise. With the result
    that you cannot reliably pass pointer to member functions as
    arguments: something like:

    class Toto ;

    void
    f( Toto* p, void (Toto::*f)() )
    {
        (p->*f)() ;   // parentheses required: () binds tighter than ->*
    }

    will not work.

    Because you can have pointers to member of an incomplete type,
    all pointers to member functions must have the same
    representation.
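
    A small sketch of the point about incomplete types: the method pointer type is usable, and its size already fixed, before the class definition is seen. `TotoMethod`, `size_before_definition`, `invoke` and the members of `Toto` are hypothetical names for illustration. To stay on uncontroversial ground, the call itself is placed after the class is complete.

```cpp
#include <cassert>
#include <cstddef>

class Toto;                                    // incomplete at this point

using TotoMethod = void (Toto::*)();           // legal for an incomplete class

// The compiler must already commit to a size here, knowing nothing about
// virtual functions or inheritance in Toto.
constexpr std::size_t size_before_definition = sizeof(TotoMethod);

void invoke(Toto* p, TotoMethod m);            // declaration needs no definition of Toto

class Toto {
public:
    virtual ~Toto() {}
    virtual void run() { ++calls; }
    int calls = 0;
};

static_assert(sizeof(TotoMethod) == size_before_definition,
              "same type, so the size cannot change once Toto is complete");

void invoke(Toto* p, TotoMethod m) {
    (p->*m)();                                 // note the parentheses around p->*m
}
```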

    James Kanze, Nov 15, 2007
    #17
  18. Joel Yliluoma wrote:
    > On the 64-bit and 32-bit Linux systems, GCC and ICC implement method
    > pointers as a pair of two pointer-size integers, with the following
    > semantics:
    >
    > [ridiculously convoluted semantics omitted]


    I don't know why so many compiler writers implement method pointers in such
    a complicated way. The easy way to do it is:

    * A method pointer is internally just a function pointer (perhaps with a
    different calling convention, like fast-this).

    * A call (x->*p)(args...) just does p(x,args...).

    * When taking the address of a method, if it can be represented in the
    above format, do so; otherwise, generate a proxy function equivalent to

    rtntype T::proxy(args) { return method(args); }

    and point to that.

    This avoids the need for complicated special-case logic for method pointers;
    the proxy function always looks the same, and while the code it generates
    may be complicated, the logic for generating it is already implemented. What
    these other representations amount to is a gratuitous runtime state-machine
    implementation of something that could have been compiled to native code
    with less implementation effort and probably greater runtime efficiency. Not
    to mention that this representation could be easily standardized as part of
    an ABI, and is good for implementing delegates.
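
    The proposal can be imitated at user level, which may make it easier to picture. This is only an emulation under stated assumptions, not what any compiler does internally: `Widget`, `FlatMethod` and `proxy` are hypothetical names, and the proxy is generated here by a template instead of by the compiler.

```cpp
#include <cassert>

struct Widget {
    explicit Widget(int b) : base(b) {}
    virtual ~Widget() {}
    int add(int x) { return base + x; }            // non-virtual method
    virtual int scale(int x) { return base * x; }  // virtual method
    int base;
};

// The suggested representation: an ordinary function pointer that takes
// the object as an explicit first argument.
using FlatMethod = int (*)(Widget*, int);

// One proxy function per method; virtual dispatch, if any, happens inside it.
template <int (Widget::*Method)(int)>
int proxy(Widget* self, int arg) {
    return (self->*Method)(arg);
}
```

    A "method pointer" is then just `FlatMethod m = &proxy<&Widget::add>;`, and a call is `m(&w, 3)`, with no special case at the call site for virtual functions.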

    -- Ben
    Ben Rudiak-Gould, Nov 28, 2007
    #18
  19. Ron Natalie (Guest)

    Ben Rudiak-Gould wrote:

    >
    > * When taking the address of a method, if it can be represented in the
    > above format, do so; otherwise, generate a proxy function equivalent to
    >
    > rtntype T::proxy(args) { return method(args); }
    >

    How is this any more efficient or less convoluted than storing the method
    pointer and a constant to add to the "this" pointer?
    Ron Natalie, Nov 29, 2007
    #19
  20. Ron Natalie wrote:
    > How is this any more efficent or less convoluted than storing the method
    > pointer and a constant to add to the "this" pointer?


    If Base::f() is virtual there's no method pointer you can store, because if
    Derived overrides f() and x is a Derived, (x.*&Base::f)() calls
    Derived::f(). (This is slightly odd given that x.Base::f() calls Base::f().
    If they'd given that semantics to member pointers, none of this complexity
    would exist.) So we need to encode a second case for virtual functions in
    there somehow, and test it at each call site where the call might be virtual
    (i.e. where Base is an incomplete type or contains some virtual method
    compatible with the method pointer's type).
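
    The difference between the two call forms can be checked with a short sketch. The return values 1 and 2 and the function names `call_via_member_pointer` and `call_qualified` are made up for illustration.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int f() { return 1; }
};

struct Derived : Base {
    int f() override { return 2; }
};

// Calling through a pointer to a virtual member dispatches on the
// dynamic type of the object.
inline int call_via_member_pointer(Base& x) {
    int (Base::*p)() = &Base::f;
    return (x.*p)();
}

// A qualified call names Base::f directly and suppresses virtual dispatch.
inline int call_qualified(Base& x) {
    return x.Base::f();
}
```

    Given a `Derived d;`, `call_via_member_pointer(d)` returns 2 while `call_qualified(d)` returns 1, which is the asymmetry described above.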

    If f() is non-virtual, but implemented in a virtual base class of Base, then
    we have a method pointer but no offset. This would need another case, except
    that the standard doesn't require implementations to handle it. (It says
    that &Base::f is a BasicBase::*, which isn't compatible with Base::*.)

    What if f() is virtual and implemented in a virtual base class? On most
    implementations, this is simpler than the previous case. We can handle it
    like any other virtual function, because the compiler generated a
    pointer-adjusting thunk to put into the vtable -- and it did that so the
    vtable could be a vector of function pointers instead of a vector of
    pointer-plus-this-adjustment-with-special-case-for-virtual-base thingies.
    Vtable entries are method pointers, and they're always, to my knowledge,
    implemented in just the way I'm suggesting that surface-language method
    pointers should be. Almost all of the necessary code is already in the compiler.

    This technique is certainly faster for the trivial case, and almost
    certainly faster for the general non-virtual case, since the pointers are
    half the size and each call requires an indirect jump and an unconditional
    direct jump instead of an indirect jump and a conditional jump (with
    potential misprediction). I see two problems with it. One is that it's
    almost certainly slower for virtual methods (two indirect jumps), but I
    think pointers to virtual methods are much rarer than pointers to
    non-virtual methods in the wild. The other is that you can't implement
    semantics-preserving casts from Base::* to Derived::* (or vice versa) in
    nontrivial cases without horrible convolutions. The only sensible way I can
    see to do it is to turn the cast into

    switch (p) {
    case &Base::f: return &Derived::f;
    case &Base::g: return &Derived::g;
    // ...
    }

    which is only workable if you have some way of guaranteeing that you haven't
    generated duplicate thunks for the same method. This isn't a fatal problem
    since the standard doesn't require such casts to work (unless you cast the
    pointer back before using it). It's akin to casting from void (*)(Base*) to
    void (*)(Derived*), which would be even harder to implement.

    -- Ben
    Ben Rudiak-Gould, Nov 30, 2007
    #20
