Abstraction layer between C and CPU

Discussion in 'C Programming' started by Luke Wu, Jan 21, 2005.

  1. Luke Wu

    Luke Wu Guest

    Hello,

    From spending some time in clc, I've come to realize that C's model of
    the CPU can be totally different from the actual CPU.

    Is it safe to say that almost nothing can be gleaned about physical CPU
    behaviour from C level behaviour?

    For example:

    - do the addresses returned by & have to have (by Standard) any direct
    relationship to real addresses?

    -- if the address of 2 objects is different by 'x' in C (when using &
    operator), are they so in the hardware?

    - do the elements of an array usually end up being placed side by side
    in most implementations (does the standard require this) ?
    how about multidimensional arrays?

    -- I've seen code where two or more arrays would be declared side by
    side and then the first array with extended indexing would be used to
    access the elements of the second/more array. Does this suggest that C
    guarantees that declared variables of the same storage class are placed
    in ascending order in memory?

    int a[10], b[10], c[20];
    int i;

    for(i = 0; i < 40; i++)
    {
        a[i] = 0; /* zero initializes all three arrays */
    }



    - I read in an article once (can't find it now) that a "byte" in C
    doesn't necessarily have to be an octet of bits at the hardware level

    Any help would be appreciated.
    (Are there any links on the net that point to details of C's
    abstraction layer? I can't seem to find any. I guess these details
    are woven through the standards documents, but I'm talking about a
    single cohesive document)
     
    Luke Wu, Jan 21, 2005
    #1

  2. Luke Wu wrote:
    > From spending some time in clc, I've come to realize that C's
    > model of the CPU can be totally different from the actual CPU.


    C doesn't model a CPU, it models an abstract machine.

    > Is it safe to say that almost nothing can be gleaned about
    > physical CPU behaviour from C level behaviour.


    True. The whole point of high level languages is to avoid
    dealing with low level implementation details.

    > For example: ...


    Your examples are really questions on how implementations might
    work. Comp.lang.c is not the place for such questions since the
    standard merely supplies semantics. How those semantics are
    actually implemented is not specified.[*]

    Personally, I think your questions, whilst naturally curious,
    are nonetheless dangerous. I've seen countless examples of
    newbie programmers who try to analyse C semantics from things
    like disassemblies, only to develop false conclusions.

    When code based on such false conclusions is ported to other
    machines, it can often lead to bugs which are difficult to
    diagnose and debug.

    > ...
    > - I read in an article once (can't find it now) that a "byte"
    > in C doesn't necessarily have to be an octet of bits at the
    > hardware level.


    Correct. Some architectures are incapable of addressing octets.

    > Any help would be appreciated.
    > (Are there any links on the net that point to details of C's
    > abstraction layer?


    The standards _are_ the abstraction layer. Your questions are
    about realisations of that abstraction.

    > I can't seem to find any. I guess these details
    > are woven through the standards documents, but I'm talking
    > about a single cohesive document)


    You should perhaps look at compiler writing books.

    [*] Of course, the standard authors are quite mindful of what
    can be implemented efficiently on various existing and future
    architectures.

    --
    Peter
     
    Peter Nilsson, Jan 21, 2005
    #2

  3. Luke Wu wrote:
    > ...
    > From spending some time in clc, I've come to realize that C's model of
    > the CPU can be totally different from the actual CPU.


    From a purely abstract, theoretical point of view: yes, of course it can be.

    > Is it safe to say that almost nothing can be gleaned about physical CPU
    > behaviour from C level behaviour.


    That's correct.

    > For example:
    >
    > - do the addresses returned by & have to have (by Standard) any direct
    > relationship to real addresses?


    If by "real addresses" you mean machine addresses, then no, they don't
    have to have any relationship.

    > -- if the address of 2 objects is different by 'x' in C (when using &
    > operator), are they so in the hardware?


    I don't exactly understand what you mean by "different by 'x'". By 'x'
    what? "Bytes" in C sense of the word? Machine bytes? Difference returned
    by binary '-' operator?

    > - do the elements of an array usually end up being placed side by side
    > in most implementations (does the standard require this) ?


    Yes, they do. This means that any padding present between the elements
    of the array is part of the element itself, not something added
    specifically by the array object. This follows from the fact that in C

    sizeof(array) = sizeof(element) * number_of_elements

    This is required by the standard.
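
    As a minimal sketch of that size relation (the function names are
    hypothetical, chosen only for illustration), the element count of an
    array can be recovered from sizeof alone, with no padding term needed:

```c
#include <stddef.h>

/* Hypothetical helper: element count of a 10-int array,
   computed purely from sizeof. */
size_t element_count(void)
{
    int array[10];
    return sizeof array / sizeof array[0];
}

/* Returns 1 if sizeof(array) equals sizeof(element) * number_of_elements. */
int size_relation_holds(void)
{
    double array[7];
    return sizeof array == sizeof(double) * 7;
}
```

    This only works on a real array object; once the array has decayed to
    a pointer, sizeof reports the size of the pointer instead.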

    > how about multidimensional arrays?


    Multidimensional arrays in C are just arrays of arrays, which means that
    the above applies to them as well. Arrays cannot "insert" extra padding
    between elements.

    > -- I've seen code where two or more arrays would be declared side by
    > side and then the first array with extended indexing would be used to
    > access the elements of the second/more array.
    > Does this suggest that C
    > guarantees that declared variables of the same storage class are placed
    > in ascending order in memory?


    No, there's no such guarantee. Such access is completely illegal in C.
    The behavior is undefined.

    > - I read in an article once (can't find it now) that a "byte" in C
    > doesn't necessarily have to be an octet of bits at the hardware level


    That's true. "Byte" in C (C-byte) is essentially synonymous with 'char'
    type. '[unsigned|signed] char' objects in C always consist of 1 C-byte
    by definition. A C-byte might consist of any number of machine bytes,
    which means that the number of bits in C-byte might be different from 8
    (could be 16, for example).

    > (Are there any links on the net that point to details of C's
    > abstraction layer? I can't seem to find any. I guess these details
    > are woven through the standards documents, but I'm talking about a
    > single cohesive document)


    C99 standard has a number of sections specifically dedicated to these
    issues.

    --
    Best regards,
    Andrey Tarasevich
     
    Andrey Tarasevich, Jan 21, 2005
    #3
  4. Luke Wu

    Mike Wahler Guest

    "Luke Wu" <> wrote in message
    news:...
    > Hello,
    >
    > From spending some time in clc, I've come to realize that C's model of
    > the CPU



    C doesn't really model 'a CPU'; it defines an 'abstract machine'
    and doesn't directly refer to a "CPU" component.

    >can be totally different from the actual CPU.
    >
    > Is it safe to say that almost nothing can be gleaned about physical CPU
    > behaviour from C level behaviour.


    Not almost nothing, but nothing. However, by examining an
    assembly listing which many compilers can emit, one can
    glean some platform-specific information. But then you're
    outside the realm of C.

    >
    > For example:
    >
    > - do the addresses returned by & have to have (by Standard) any direct
    > relationship to real addresses?


    No. This is especially true for platforms which feature
    'virtual memory' and/or separate 'process spaces' as in
    e.g. Microsoft Windows.


    > -- if the address of 2 objects is different by 'x' in C (when using &
    > operator), are they so in the hardware?


    Not necessarily. Also note that the addresses of two separate
    objects will not necessarily reflect their relationship in
    source code. e.g.:

    int i;
    int j;

    the address of 'j' need not be greater than address of 'i'
    nor is their difference guaranteed to be sizeof(int).
    (the only time this *is* guaranteed is when the
    objects are adjacent elements (the subscript of one
    is one more or less than the subscript of the other)
    of the same array).

    (but the addresses of two separate objects are always
    guaranteed to be different)

    >
    > - do the elements of an array usually


    Not usually, but always.

    >end up being placed side by side


    At contiguous addresses (as reported by the & operator),
    whose difference is sizeof(array's element type).

    int array[2];

    &array[1] is guaranteed to lie exactly
    sizeof(int) bytes past &array[0];
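
    To make the distinction between element distance and byte distance
    concrete, here is a small sketch (function names are made up for
    illustration). Subtracting the int pointers yields 1, while
    subtracting the same addresses as char pointers yields sizeof(int):

```c
#include <stddef.h>

/* Distance between two adjacent int elements, counted in elements. */
ptrdiff_t distance_in_elements(void)
{
    int array[2] = {0, 0};
    return &array[1] - &array[0];
}

/* The same distance, counted in bytes by going through char pointers. */
ptrdiff_t distance_in_bytes(void)
{
    int array[2] = {0, 0};
    return (char *)&array[1] - (char *)&array[0];
}
```

    Both subtractions are well defined only because the two pointers
    point into the same array object.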

    > in most implementations


    All conforming implementations.

    >(does the standard require this) ?


    Yes.



    > how about multidimensional arrays?


    Yes. "multi-dimensional arrays" in C are really
    "arrays of arrays"

    for the array:

    int arr2d[2][3] = {1, 2, 3, 4, 5, 6};
    (sometimes written for clarity as:
    int arr2d[2][3] = { {1,2,3}, {4,5,6} };)

    the values are stored (contiguously) in memory in the
    order in which the initializer values appear above. That is:
    arr2d[0][0] == 1
    arr2d[0][1] == 2
    arr2d[0][2] == 3
    arr2d[1][0] == 4
    arr2d[1][1] == 5
    arr2d[1][2] == 6

    That is, C arrays are stored in 'row major' order, unlike
    some other languages.
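
    One way to observe the row-major layout without stepping outside any
    array is to copy the whole object into a flat array with memcpy. This
    is only a sketch; flatten_2x3 is a hypothetical name:

```c
#include <string.h>

/* Copies the bytes of a 2x3 int array, in storage order, into flat[6]. */
void flatten_2x3(int flat[6])
{
    int arr2d[2][3] = { {1, 2, 3}, {4, 5, 6} };

    /* sizeof arr2d is exactly 6 * sizeof(int): no padding between rows */
    memcpy(flat, arr2d, sizeof arr2d);
}
```

    After the call, flat holds 1, 2, 3, 4, 5, 6 in that order, i.e. the
    last subscript varies fastest.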

    >
    > -- I've seen code where two or more arrays would be declared side by
    > side


    C has rather 'free' formatting rules, e.g. more than one
    declaration or statement can appear on a single line.
    int array1[] = {1,2,3}; int array2[] = {4,5,6};

    However I recommend against this practice.

    > and then the first array with extended indexing


    What do you mean by 'extended indexing'? C does not define
    such a term.

    >would be used to
    > access the elements of the second/more array.


    Any integral expression whose value when added to the address
    of an array's first element is within the bounds of that array
    can be used to index into it. The fact that these values might
    themselves be stored in an array is of no consequence. As a
    matter of fact, some 'convoluted' code could be written in which
    array element values are used to index that same array. But imo
    this is a rather dangerous practice.


    >Does this suggest that C
    > guarantees that declared variables of the same storage class are placed
    > in ascending order in memory?


    No. This is only guaranteed for elements of the same array.

    >
    > int a[10], b[10], c[20];


    This is a valid way to define several objects, but I recommend
    one object per line. Easier to read and maintain.

    > int i;
    >
    > for(i = 0; i < 40; i++)
    > {
    > a[i] = 0; /* zero initializes all three arrays */


    NO, NO, NO!

    You must process each array individually. Their positions
    in memory relative to one another are not specified. Also
    note that what you wrote above is *not* initialization
    but assignment, which is not the same thing. An object is initialized
    when it is defined:

    int a[10] = {1,2,3}; /* first three elements are initialized with
    1, 2, and 3, respectively, all others to zero */

    FWIW, you can initialize all the elements of an array to zero like this:

    int a[10] = {0};

    (If this definition appears at file scope, or is qualified
    with 'static' at block scope, all elements are initialized to
    zero implicitly -- but I like to include the initializer(s)
    anyway, for clarity, but that is a 'style' issue).
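
    For arrays that already hold values, a portable way to clear the
    three arrays from the question is to loop over each one within its
    own bounds. The sketch below assumes hypothetical helper names
    (COUNT_OF, zero_ints, clear_three_arrays):

```c
#include <stddef.h>

/* Element count of a true array (not a pointer). */
#define COUNT_OF(arr) (sizeof (arr) / sizeof (arr)[0])

/* Sets n ints starting at p to zero; n must not exceed the array's length. */
static void zero_ints(int *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        p[i] = 0;
}

/* Hypothetical demo: dirties a, b and c, clears each one on its own,
   and returns how many nonzero elements remain. */
size_t clear_three_arrays(void)
{
    int a[10], b[10], c[20];
    size_t i, nonzero = 0;

    for (i = 0; i < COUNT_OF(a); i++) a[i] = 1;
    for (i = 0; i < COUNT_OF(b); i++) b[i] = 2;
    for (i = 0; i < COUNT_OF(c); i++) c[i] = 3;

    /* each array is cleared separately, never via a[10..39] */
    zero_ints(a, COUNT_OF(a));
    zero_ints(b, COUNT_OF(b));
    zero_ints(c, COUNT_OF(c));

    for (i = 0; i < COUNT_OF(a); i++) nonzero += (a[i] != 0);
    for (i = 0; i < COUNT_OF(b); i++) nonzero += (b[i] != 0);
    for (i = 0; i < COUNT_OF(c); i++) nonzero += (c[i] != 0);
    return nonzero;
}
```

    No assumption is made about where a, b and c sit relative to each
    other, so the code behaves the same on any conforming implementation.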

    > }
    >
    >
    >
    > - I read in an article once (can't find it now) that a "byte" in C
    > doesn't necessarily have to be an octet of bits at the hardware level


    Correct. It's simply the 'smallest addressable unit of storage',
    which is required to have a minimum size of eight bits, but can
    be larger (and often is on certain architectures). From a C
    perspective, 'byte' and 'character' are synonymous.

    This 'abstraction' is there to make the language as platform
    neutral as possible, allowing for implementation on the widest
    possible variety of existing architectures as well as those that
    have yet to be conceived.

    >
    > Any help would be appreciated.
    > (Are there any links on the net that point to details of C's
    > abstraction layer? I can't seem to find any. I guess these details
    > are woven through the standards documents, but I'm talking about a
    > single cohesive document)


    This single cohesive document *is* the ISO standard, but I'll be
    the first to admit it's not easy to read. What you need are
    some books. See www.accu.org for peer reviews.

    -Mike
     
    Mike Wahler, Jan 21, 2005
    #4
  5. Luke Wu

    -berlin.de Guest

    Luke Wu <> wrote:
    > From spending some time in clc, I've come to realize that C's model of
    > the CPU can be totally different from the actual CPU.


    > Is it safe to say that almost nothing can be gleaned about physical CPU
    > behaviour from C level behaviour.


    That's why there is a standard, i.e. in order to be able to write
    programs that _don't_ depend on the specific CPU you are using but
    that can be ported easily from one to the next system. Otherwise
    you wouldn't have much more than a (high-level) assembler.

    > For example:


    > - do the addresses returned by & have to have (by Standard) any direct
    > relationship to real addresses?


    No. With many modern operating systems the concept of "real addresses"
    (in the sense of physical addresses) doesn't even make much sense, since
    there's what's called "virtual memory", and the mapping between
    physical addresses and what a program sees is completely at the discretion
    of the operating system. What the program sees as a fixed address can
    be mapped to varying physical addresses (or even get written out to swap
    space).

    > -- if the address of 2 objects is different by 'x' in C (when using &
    > operator), are they so in the hardware?


    No - one of the objects could even be in swap space on the disk while
    the other is in memory.

    > - do the elements of an array usually end up being placed side by side
    > in most implementations (does the standard require this) ?
    > how about multidimensional arrays?


    As long as what the program sees as the addresses are contiguous in
    (virtual) memory everything is fine. But in the sense of physical
    memory the elements could be far apart.

    > -- I've seen code where two or more arrays would be declared side by
    > side and then the first array with extended indexing would be used to
    > access the elements of the second/more array. Does this suggest that C
    > guarantees that declared variables of the same storage class are placed
    > in ascending order in memory?


    > int a[10], b[10], c[20];
    > int i;


    > for(i = 0; i < 40; i++)
    > {
    > a[i] = 0; /* zero initializes all three arrays */
    > }


    No, you can't rely on that, even if you only care about the "virtual"
    addresses. Accessing an array element outside of its defined range
    of indices is forbidden and leads to undefined behaviour. That code
    may work on a certain platform when compiled with a certain compiler
    but there's no guarantee that it works with any other compiler or on
    a different platform.

    > - I read in an article once (can't find it now) that a "byte" in C
    > doesn't necessarily have to be an octet of bits at the hardware level


    There's no "byte" in C. What you have is a char (as the smallest
    type), and how many bits a char has on the system you're working on
    can be found out from the CHAR_BIT macro from <limits.h>. The only
    guarantee you have is that CHAR_BIT is at least 8, i.e. a char has
    at least 8 bits - but it can be more.
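
    A short sketch of querying this at run time (the function names are
    hypothetical). Since sizeof counts in units of char, multiplying by
    CHAR_BIT gives the number of storage bits an object occupies:

```c
#include <limits.h>
#include <stddef.h>

/* Bits in one char (one C byte) on this implementation. */
int bits_per_byte(void)
{
    return CHAR_BIT;
}

/* Storage bits occupied by an int, including any padding bits. */
size_t bits_in_int_storage(void)
{
    return sizeof(int) * CHAR_BIT;
}
```

    On most desktop systems bits_per_byte() returns 8, but portable code
    should rely only on the value being at least 8.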

    > (Are there any links on the net that point to details of C's
    > abstraction layer? I can't seem to find any. I guess these details
    > are woven through the standards documents, but I'm talking about a
    > single cohesive document)


    Most of the things you're asking about you won't find in the standard
    because they aren't relevant from a C language point of view. How C
    code gets compiled to have the resulting executable work as expected
    (i.e. as required by the standard) is due to the people writing the
    compiler. The standard does not make any requirements how they use the
    CPU they are dealing with to manage this. The C standard is basically
    a recipe along the lines of "Given this code as input, the resulting
    program must behave in that way", but how this is achieved (and
    with what kind of hardware) isn't relevant.

    Regards, Jens
    --
    \ Jens Thoms Toerring ___ -berlin.de
    \__________________________ http://www.toerring.de
     
    -berlin.de, Jan 21, 2005
    #5
  6. Luke Wu

    Luke Wu Guest

    Thank you for the responses.

    I am now getting the 'feel' for C's abstraction away from hardware
    details from reading clc posts. I think I'm almost done erasing all
    the assumptions that I got into my head from reading books like The C
    Companion, by Allen I. Holub.
     
    Luke Wu, Jan 21, 2005
    #6
  7. Andrey Tarasevich wrote:
    > ...
    >> - do the elements of an array usually end up being placed side by side
    >> in most implementations (does the standard require this) ?

    >
    > Yes, they do. This means that any padding present between the elements
    > of the array is part of the element itself, not something added
    > specifically by the array object. This follows from the fact that in C
    >
    > sizeof(array) = sizeof(element) * number_of_elements
    >
    > This is required by the standard.
    >
    >> how about multidimensional arrays?

    >
    > Multidimensional arrays in C are just arrays of arrays, which means that
    > the above applies to them as well. Arrays cannot "insert" extra padding
    > between elements.
    > ...


    Although it is worth noting that the above requirements are still
    formulated at language level. Which means that if some compiler by means
    of "compiler magic" can satisfy these requirements and at the same time
    place array elements out of order/apart from each other in machine
    memory, there wouldn't be anything wrong with it.

    --
    Best regards,
    Andrey Tarasevich
     
    Andrey Tarasevich, Jan 21, 2005
    #7
  8. Luke Wu

    Mike Wahler Guest

    <-berlin.de> wrote in message
    news:...
    > Luke Wu <> wrote:


    > > - I read in an article once (can't find it now) that a "byte" in C
    > > doesn't necessarily have to be an octet of bits at the hardware level

    >
    > There's no "byte" in C.


    Au contraire.

    ISO/IEC 9899:1999 (E)

    3.6

    1 byte
    addressable unit of data storage large enough to hold
    any member of the basic character set of the execution
    environment

    -Mike
     
    Mike Wahler, Jan 21, 2005
    #8
  9. Mike Wahler wrote:

    > Jens.Toerring wrote:
    >
    >>Luke Wu wrote:

    >
    >>>- I read in an article once (can't find it now) that a "byte" in C
    >>>doesn't necessarily have to be an octet of bits at the hardware level

    >>
    >>There's no "byte" in C.

    >
    > Au contraire.
    >
    > ISO/IEC 9899:1999 (E)
    >
    > 3.6
    >
    > 1 byte
    > addressable unit of data storage large enough to hold
    > any member of the basic character set of the execution
    > environment


    Note that a byte is not a data type
    but the *size* of a unit of storage.

    In practice, a byte is 8 binary digits (bits) almost everywhere
    including machines where four characters are normally "packed"
    into 32 bit "words".
     
    E. Robert Tisdale, Jan 21, 2005
    #9
  10. Luke Wu

    Mike Wahler Guest

    "E. Robert Tisdale" <> wrote in message
    news:csq1nd$dht$...
    > Mike Wahler wrote:
    >
    > > Jens.Toerring wrote:
    > >
    > >>Luke Wu wrote:

    > >
    > >>>- I read in an article once (can't find it now) that a "byte" in C
    > >>>doesn't necessarily have to be an octet of bits at the hardware level
    > >>
    > >>There's no "byte" in C.

    > >
    > > Au contraire.
    > >
    > > ISO/IEC 9899:1999 (E)
    > >
    > > 3.6
    > >
    > > 1 byte
    > > addressable unit of data storage large enough to hold
    > > any member of the basic character set of the execution
    > > environment

    >
    > Note that a byte is not a data type


    Note that I never claimed that it is.

    > but the *size* of a unit of storage.
    >
    > In practice, a byte is 8 binary digits (bits) almost everywhere


    Then imo your 'everywhere' is rather limited.

    > including machines where four characters are normally "packed"
    > into 32 bit "words".


    Note that on some machines a byte is 32 bits.

    -Mike
     
    Mike Wahler, Jan 21, 2005
    #10
  11. Mike Wahler wrote:

    > E. Robert Tisdale wrote:
    >
    >>Mike Wahler wrote:
    >>
    >>>Jens.Toerring wrote:
    >>>
    >>>>Luke Wu wrote:
    >>>
    >>>>>- I read in an article once (can't find it now) that a "byte" in C
    >>>>>doesn't necessarily have to be an octet of bits at the hardware level
    >>>>
    >>>>There's no "byte" in C.
    >>>
    >>>Au contraire.
    >>>
    >>>ISO/IEC 9899:1999 (E)
    >>>
    >>> 3.6
    >>>
    >>>1 byte
    >>> addressable unit of data storage large enough to hold
    >>> any member of the basic character set of the execution
    >>> environment

    >>
    >>Note that a byte is not a data type

    >
    > Note that I never claimed that it is.


    I never claimed that you claimed that it is. :)

    >>but the *size* of a unit of storage.
    >>
    >>In practice, a byte is 8 binary digits (bits) almost everywhere

    >
    > Then imo your 'everywhere' is rather limited.
    >
    >>including machines where four characters are normally "packed"
    >>into 32 bit "words".

    >
    > Note that on some machines a byte is 32 bits.


    Name ten.

    Perspective is important.

    I know that you don't mean to imply
    that this is a real problem for C programmers.

    Most C programmers will never write a single line of code
    that will be ported to a processor with 32 bit bytes.
     
    E. Robert Tisdale, Jan 21, 2005
    #11
  12. Mike Wahler wrote:
    > "Luke Wu" <> wrote in message
    > news:...
    >
    >>
    >>- do the addresses returned by & have to have (by Standard) any direct
    >>relationship to real addresses?

    >
    >
    > No. This is especially true for platforms which feature
    > 'virtual memory' and/or separate 'process spaces' as in
    > e.g. Microsoft Windows.
    >


    Well, there must somewhere be a mapping between pointer
    values and actual addresses (even in the abstract machine).
    Though there can be several layers of mappings. So there
    must be a relationship, but depending on what you mean by
    direct, it might not be direct.

    But even though this mapping must exist even in the abstract
    machine, one cannot portably use it for anything,
    as a) there is no specified mechanism for doing so and b) it
    will be very different between platforms.
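
    The closest thing the language offers is the optional uintptr_t type
    from <stdint.h>. The sketch below assumes the implementation provides
    it; round_trip_ok is a hypothetical name. The integer value itself is
    implementation-defined, and the only portable use is converting it
    back to the original pointer:

```c
#include <stdint.h>

/* Returns 1 if a pointer survives a round trip through uintptr_t. */
int round_trip_ok(void)
{
    int object = 42;
    void *before = &object;

    /* the numeric value is implementation-defined; do not interpret it */
    uintptr_t as_integer = (uintptr_t)before;

    void *after = (void *)as_integer;
    return after == before && *(int *)after == 42;
}
```

    Nothing here says anything about physical addresses; the integer is
    just another representation of the same pointer value.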



    --
    Thomas.
     
    Thomas Stegen, Jan 21, 2005
    #12
  13. Mike Wahler wrote:
    > "E. Robert Tisdale" <> wrote in message
    > news:csq1nd$dht$...

    [snip usual char byte 8 bit not 8 bit discussion]

    Semi OT perhaps but...

    Outside a C perspective, didn't IBM first coin the term byte to
    refer to 8 bit entities? As far as I know machines such as the
    pdp-11 (I think) had 9 bit entities, but never used the term byte.

    It is also clear though that in a C context byte does not mean
    this. It is also clear that one should establish a context,
    implicitly or explicitly, when discussing bytes with anyone.
    Are we in the C locale, or in the mere mortals locale?

    Here in comp.lang.c the context should be clear to everyone.

    --
    Thomas.
     
    Thomas Stegen, Jan 21, 2005
    #13
  14. Thomas Stegen wrote:
    > Mike Wahler wrote:
    >
    >> "E. Robert Tisdale" <> wrote in message
    >> news:csq1nd$dht$...

    >
    > [snip usual char byte 8 bit not 8 bit discussion]
    >
    > Semi OT perhaps but...
    >
    > Outside a C perspective, didn't IBM first coin the term byte to
    > refer to 8 bit entities? As far as I know machines such as the
    > pdp-11 (I think) had 9 bit entities, but never used the term byte.
    >
    > It is also clear though that in a C context byte does not mean
    > this. It is also clear that one should establish a context,
    > implicitly or explicitly, when discussing bytes with anyone.
    > Are we in the C locale, or in the mere mortals locale?
    >
    > Here in comp.lang.c the context should be clear to everyone.
    >


    Perhaps, using the term "octet" for a group of 8 bits would be
    much better. A byte may be an octet and is the most basic
    addressable unit in an execution environment. Therefore, a byte,
    according to this definition, may also be 4 bits.

    To reply to the original context, C does not have
    a "byte" data type. In C, a char contains, at least, enough
    bits to represent any element of the basic character set.
    A char may at least be a byte or higher.

    I don't see how you can safely assume a char to contain at least
    8 bits. The standard doesn't say so explicitly.

    --
    "I'm learning to program because then I can write
    programs to do my homework faster." - Andy Anfilofieff
     
    Jonathan Burd, Jan 21, 2005
    #14
  15. Luke Wu

    CBFalconer Guest

    Mike Wahler wrote:
    >

    .... snip ...
    >
    > FWIW, you can initialize all the elements of an array to zero
    > like this:
    >
    > int a[10] = {0};
    >
    > (If this definition appears at file scope, or is qualified
    > with 'static' at block scope, all elements are initialized to
    > zero implicitly -- but I like to include the initializer(s)
    > anyway, for clarity, but that is a 'style' issue).


    But be aware that, on some systems, this may result in heavy
    bloating of the final executable file with long strings of zero
    bytes. This has nothing whatsoever to do with the language, but
    you should be aware of the possibility.

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
     
    CBFalconer, Jan 21, 2005
    #15
  16. Jonathan Burd wrote:
    > Thomas Stegen wrote:
    >
    >> Mike Wahler wrote:
    >>
    >>> "E. Robert Tisdale" <> wrote in message
    >>> news:csq1nd$dht$...

    >>
    >>
    >> [snip usual char byte 8 bit not 8 bit discussion]
    >>
    >> Semi OT perhaps but...
    >>
    >> Outside a C perspective, didn't IBM first coin the term byte to
    >> refer to 8 bit entities? As far as I know machines such as the
    >> pdp-11 (I think) had 9 bit entities, but never used the term byte.
    >>
    >> It is also clear though that in a C context byte does not mean
    >> this. It is also clear that one should establish a context,
    >> implicitly or explicitly, when discussing bytes with anyone.
    >> Are we in the C locale, or in the mere mortals locale?
    >>
    >> Here in comp.lang.c the context should be clear to everyone.
    >>

    >
    > Perhaps, using the term ``octet" for a group of 8 bits would be
    > much better. A byte may be an octet and is the most basic
    > addressable unit in an execution environment. Therefore, a byte,
    > according to this definition, may also be 4 bits.
    >
    > To reply to the original context, C does not have
    > a ``byte" data type. In C, a char contains, at least, enough
    > bits to represent any element of the basic character set.
    > A char may at least be a byte or higher.

    Correction: A char must at least be a byte.
    >
    > I don't see how you can safely assume a char to contain at least
    > 8 bits. The standard doesn't say so explicitly.
    >



    --
    "I'm learning to program because then I can write
    programs to do my homework faster." - Andy Anfilofieff
     
    Jonathan Burd, Jan 21, 2005
    #16
  17. Jonathan Burd wrote:
    > Thomas Stegen wrote:
    >
    >> Mike Wahler wrote:
    >>
    >>> "E. Robert Tisdale" <> wrote in message
    >>> news:csq1nd$dht$...

    >>
    >>
    >> [snip usual char byte 8 bit not 8 bit discussion]
    >>
    >> Semi OT perhaps but...
    >>
    >> Outside a C perspective, didn't IBM first coin the term byte to
    >> refer to 8 bit entities? As far as I know machines such as the
    >> pdp-11 (I think) had 9 bit entities, but never used the term byte.
    >>
    >> It is also clear though that in a C context byte does not mean
    >> this. It is also clear that one should establish a context,
    >> implicitly or explicitly, when discussing bytes with anyone.
    >> Are we in the C locale, or in the mere mortals locale?
    >>
    >> Here in comp.lang.c the context should be clear to everyone.
    >>

    >
    > Perhaps, using the term ``octet" for a group of 8 bits would be
    > much better. A byte may be an octet and is the most basic
    > addressable unit in an execution environment. Therefore, a byte,
    > according to this definition, may also be 4 bits.
    >
    > To reply to the original context, C does not have
    > a ``byte" data type. In C, a char contains, at least, enough
    > bits to represent any element of the basic character set.
    > A char may at least be a byte or higher.
    >
    > I don't see how you can safely assume a char to contain at least
    > 8 bits. The standard doesn't say so explicitly.
    >


    Alright, CHAR_BIT is at least 8 bits. My bad.

    Regards,
    Jonathan.

    --
    "I'm learning to program because then I can write
    programs to do my homework faster." - Andy Anfilofieff
     
    Jonathan Burd, Jan 21, 2005
    #17
  18. Luke Wu

    pete Guest

    Jonathan Burd wrote:

    > > A char may at least be a byte or higher.


    (sizeof(char) == 1) /* always just exactly one. */

    > > I don't see how you can safely assume a char to contain at least
    > > 8 bits. The standard doesn't say so explicitly.
    > >

    >
    > Alright, CHAR_BIT is at least 8 bits. My bad.


    --
    pete
     
    pete, Jan 21, 2005
    #18
  19. On Fri, 21 Jan 2005 10:11:21 +0000, Thomas Stegen
    <> wrote:

    > Mike Wahler wrote:
    >> "E. Robert Tisdale" <> wrote in message
    >> news:csq1nd$dht$...

    > [snip usual char byte 8 bit not 8 bit discussion]
    >
    > Semi OT perhaps but...
    >
    > Outside a C perspective, didn't IBM first coin the term byte to
    > refer to 8 bit entities? As far as I know machines such as the
    > pdp-11 (I think) had 9 bit entities, but never used the term byte.


    This was discussed here recently.

    Yes, Werner Buchholz at IBM invented the term in 1956, originally just
    as a 1 to 6 bit field used for I/O but by the end of the year it had
    come to refer to 8 bit quantities. The DEC PDP-11 was a 16 bit machine,
    and DEC did use the term byte to refer to half-words of 8 bits (as far
    as I know no PDP-11 actually used words like 'byte' at all, they couldn't
    usually speak <g>). The DEC PDP-10 programmers used 'bytes' to refer to
    variable bit fields (from 1 to 36 bits I believe).

    > It is also clear though that in a C context byte does not mean
    > this. It is also clear that one should establish a context,
    > implicitly or explicitly, when discussing bytes with anyone.
    > Are we in the C locale, or in the mere mortals locale?


    Use "characters" or "chars" to refer to the C entities and "octets" to
    refer to the 8 bit quantities, and shun the overloaded term "bytes".

    > Here in comp.lang.c the context should be clear to everyone.


    It isn't; even those of us who have worked on machines with odd byte
    lengths often now use it only about 8 bit quantities, because that
    represents the vast majority of machines these days (most of the DSP
    programmers I know refer to the basic -- and only -- memory units as
    "words").

    Chris C
     
    Chris Croughton, Jan 21, 2005
    #19
  20. Luke Wu

    Mike Wahler Guest

    "E. Robert Tisdale" <> wrote in message
    news:csq6qk$g26$...
    > Mike Wahler wrote:
    >
    > > E. Robert Tisdale wrote:
    > >
    > >>Mike Wahler wrote:
    > >>
    > >>>Jens.Toerring wrote:
    > >>>
    > >>>>Luke Wu wrote:
    > >>>
    > >>>>>- I read in an article once (can't find it now) that a "byte" in C
    > >>>>>doesn't necessarily have to be an octet of bits at the hardware level
    > >>>>
    > >>>>There's no "byte" in C.
    > >>>
    > >>>Au contraire.
    > >>>
    > >>>ISO/IEC 9899:1999 (E)
    > >>>
    > >>> 3.6
    > >>>
    > >>>1 byte
    > >>> addressable unit of data storage large enough to hold
    > >>> any member of the basic character set of the execution
    > >>> environment
    > >>
    > >>Note that a byte is not a data type

    > >
    > > Note that I never claimed that it is.

    >
    > I never claimed that you claimed that it is. :)
    >
    > >>but the *size* of a unit of storage.
    > >>
    > >>In practice, a byte is 8 binary digits (bits) almost everywhere

    > >
    > > Then imo your 'everywhere' is rather limited.
    > >
    > >>including machines where four characters are normally "packed"
    > >>into 32 bit "words".

    > >
    > > Note that on some machines a byte is 32 bits.

    >
    > Name ten.


    No need.

    >
    > Perspective is important.


    More important is abstraction and portability.

    >
    > I know that you don't mean to imply
    > that this is a real problem for C programmers.


    It can be for some.

    >
    > Most C programmers will never write a single line of code
    > that will be ported to a processor with 32 bit bytes.


    There you go again, with your 'most [insert whatever]'.
    You can't have any idea what 'most' C programmers do
    or don't do. You can't know who they are, or how many
    of them there are.

    -Mike
     
    Mike Wahler, Jan 21, 2005
    #20