Variadic functions calling variadic functions with the argument list, HLL bit shifts on LE processor

Discussion in 'C Programming' started by Ross A. Finlayson, Mar 3, 2005.

  1. Hi,

    I hope you can help me understand the varargs facility.

    Say I am programming in ISO C including stdarg.h and I declare a
    function as so:

    void log_printf(const char* logfilename, const char* formatter, ...);

    Then, I want to call it as so:

    int i = 1;
    const char* s = "string";

    log_printf("logfile.txt", "Int: %d String %s", i, s);

    Then, in the definition of log_printf I have something along the lines
    of this:

    void log_printf(const char* logfilename, const char* formatter, ...){

    va_list ap;

    FILE* logfile;
    if(logfilename == NULL) { return; }
    if(formatter == NULL) { return; }
    logfile = fopen(logfilename, "a");
    if( logfile == NULL ) { return; }

    va_start(ap, formatter);

    fprintf(logfile, formatter, ap );

    va_end(ap);

    fprintf(logfile, "\n");
    fclose(logfile);
    return;

    }

    As you can see I want to call fprintf with its variable argument list
    that is the same list of arguments as passed to log_printf.

    Please explain the (a) correct implementation of this function. I
    looked to the C Reference Manual and Schildt's handy C/C++ Programmer's
    Reference and various platform's man pages and documentation with
    regards to this, and don't quite get it. I searched for va_start(ap,
    fmt).


    Then, also I have some questions about shift on little-endian
    processors. As an obligatory topicality justification, performance
    issues of C are on-topic on comp.lang.c. Anyways, in the HLL C when
    you shift the 32 bit unsigned value 0x00010000 one bit right the result
    is equal to 0x00008000. Now, on little-endian architectures that is
    represented in memory as

    00 00 01 00

    and then

    00 80 00 00

    On the LE register as well, its representation is

    00 00 01 00

    and then shifting it one bit right in the HLL leads to

    00 80 00 00

    but one would hope that the shr for shift right instruction would
    instead leave:

    00 00 00 80

    so I am wondering if besides using assembler instructions there are
    other well known methods in the high level language for shifting,
    particularly in terms of one bit shift and multiples of eight bit
    shifts.

    That's basically about the conflict between msb->lsb bit fill order but
    LSB->MSB, LE or Little-Endian byte order, and correspondingly between
    lsb->msb and MSB->LSB, BE or Big-Endian, in terms of high level shift
    and instruction level shift. I have not decompiled the output of the
    compiler on the LE machine, maybe that would be the most direct way to
    see what the compiler does to accomplish bit shifts across byte
    boundaries that preserve numeric values, but if you happen to know
    offhand, that would be interesting and appreciated. I'm basically
    looking at the basic necessity of assembler for a bit scanner. It is
    acceptably easily done in the HLL, I'm just concerned about trivial
    issues of compiler instruction generation.

    Here's another question, but it's about the linker. Say, a unit
    contains multiple symbols that have by default the extern storage
    class. I guess it's up to the linker to either draw in all symbols of
    the module or pull the used symbols out of the compiled module. I look
    at the output with gcc, and it draws in symbols that are unused, just
    because some other symbol in the same compilation unit is used. For
    example:

    [space:~/Desktop/fc_test] nm Build/test
    ....
    00003c7c T _fc_mem_free
    00003c38 T _fc_mem_malloc
    00003cc4 T _fc_mem_realloc
    00003d10 T _fc_mem_stackalloc
    00003d6c T _fc_mem_stackfree
    ....

    All of those definitions are in the same compilation or translation
    unit, but only fc_mem_stackalloc and fc_mem_stackfree are actually
    used, and there is no reason for that code to be in the output. I
    change the optimization to -O3 and the unnecessary symbols are still
    there.

    [space:~/Desktop/fc_test] space% cc -v
    Reading specs from /usr/libexec/gcc/darwin/ppc/2.95.2/specs
    Apple Computer, Inc. version gcc-926, based on gcc version 2.95.2
    19991024 (release)

    It's one consideration to divide the compilation units so that each
    compilation unit contains only one function, but that gets very bulky,
    as some of the translation units in this project have dozens of nearly
    identical functions, and I would rather not have hundreds of source
    code files if I could hint to the linker to remove unnecessary code.
    Some programs linking to the static library only need one of those
    hundreds of generated functions.

    So:

    1. variadic function arguments going to called variadic function?
    2. HLL shift on Little-Endian?
    3. linker hints for unneeded symbols besides compiling into separate
    units

    Then I have some questions about best practices with signals, Microsoft
    SEH or Structured Exception Handling, portability and runtime
    dependency of longjmp, and stuff.

    Thank you!

    Ross F.
    Ross A. Finlayson, Mar 3, 2005
    #1

  2. Artie Gold

    Guest

    Re: Variadic functions calling variadic functions with the argument list, HLL bit shifts on LE processors

    Ross A. Finlayson wrote:
    > Hi,
    >
    > I hope you can help me understand the varargs facility.

    [snip]

    > va_start(ap, formatter);
    >
    > fprintf(logfile, formatter, ap );
    >
    > va_end(ap);

    [snip]

    > As you can see I want to call fprintf with its variable argument list
    > that is the same list of arguments as passed to log_printf.
    >

    You want vfprintf(); that's exactly what it's for.
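
For illustration, a minimal corrected log_printf along those lines might read as below; this is a sketch keeping the original's early returns, with vfprintf consuming the argument list that log_printf received:

```c
#include <stdarg.h>
#include <stdio.h>

/* Sketch: same shape as the original, but vfprintf takes the va_list
 * instead of fprintf being handed the raw ap. */
void log_printf(const char *logfilename, const char *formatter, ...)
{
    va_list ap;
    FILE *logfile;

    if (logfilename == NULL || formatter == NULL) { return; }
    logfile = fopen(logfilename, "a");
    if (logfile == NULL) { return; }

    va_start(ap, formatter);
    vfprintf(logfile, formatter, ap);  /* consumes the variadic arguments */
    va_end(ap);

    fprintf(logfile, "\n");
    fclose(logfile);
}
```
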
    >
    > Then, also I have some questions about shift on little-endian
    > processors. As an obligatory topicality justification, performance
    > issues of C are on-topic on comp.lang.c. Anyways, in the HLL C when


    Erm, no -- at least not at the implementation level.

    [snip]

    >
    > 1. variadic function arguments going to called variadic function?
    > 2. HLL shift on Little-Endian?
    > 3. linker hints for unneeded symbols besides compiling into separate
    > units


    The operation of particular linkers is not topical here.
    >
    > Then I have some questions about best practices with signals, Microsoft
    > SEH or Structured Exception Handling, portability and runtime
    > dependency of longjmp, and stuff.
    >
    > Thank you!
    >
    > Ross F.
    >


    HTH,
    --ag
    --
    Artie Gold -- Austin, Texas
    http://it-matters.blogspot.com (new post 12/5)
    http://www.cafepress.com/goldsays
    Artie Gold, Mar 3, 2005
    #2

  3. Artie Gold wrote:

    > The operation of particular linkers is not topical here.
    > >
    > > Then I have some questions about best practices with signals, Microsoft
    > > SEH or Structured Exception Handling, portability and runtime
    > > dependency of longjmp, and stuff.
    > >
    > > Thank you!
    > >
    > > Ross F.
    > >

    > [snip]


    Thanks!

    Yeah that vfprintf is just what I needed.

    http://www.cppreference.com/stdio/vprintf_vfprintf_vsprintf.html

    About the linker, some linkers do carefully extract only the needed
    symbols; it is a linker- and linker-option-dependent issue. Even so, it
    affects the design of the C program, because getting reduced code size
    without myriad linker configurations ends up requiring the clutter of
    hundreds of compilation units.

    About the HLL shift on LE architecture, it generally doesn't matter,
    yet I want to consider good practices for designing HLL C source code
    that can be used without reconfiguration on whatever processor the
    generated instructions execute.

    About the signals and stuff, that's basically about a shim, and other
    considerations of, say, having a library component allocate heap
    memory, report errors, and other runtime aspects.

    Hey thanks again.

    Ross F.
    Ross A. Finlayson, Mar 3, 2005
    #3
  4. >I hope you can help me understand the varargs facility.

    [snip]

    > va_start(ap, formatter);
    >
    > fprintf(logfile, formatter, ap );

    The only reason for the existence of the vfprintf() function is to
    deal with this kind of problem.

    [snip]

    >Then, also I have some questions about shift on little-endian
    >processors. As an obligatory topicality justification, performance
    >issues of C are on-topic on comp.lang.c. Anyways, in the HLL C when
    >you shift the 32 bit unsigned value 0x00010000 one bit right the result
    >is equal to 0x00008000.


    Yes, regardless of endianness. You are supposed to write programs
    so they don't care about endianness.
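
    That endian-independence can be checked in C itself; a tiny sketch,
    assuming C99's uint32_t (shift_right_one is a made-up helper name):

```c
#include <stdint.h>

/* The shift operates on the VALUE; the result is the same whether the
 * bytes of x sit in memory little-endian or big-endian. */
uint32_t shift_right_one(uint32_t x)
{
    return x >> 1;  /* 0x00010000 -> 0x00008000 on any endianness */
}
```
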

    >Now, on little-endian architectures that is
    >represented in memory as
    >
    >00 00 01 00
    >
    >and then
    >
    >00 80 00 00
    >
    >On the LE register as well, its representation is


    There's no such thing as a "LE register" on most popular CPU
    architectures. Registers have whatever width they have, and the
    shift is done on the value. A key feature of registers on many
    CPUs is that they do not have a memory address and therefore do not
    have endianness.

    >
    >00 00 01 00
    >
    >and then shifting it one bit right in the HLL leads to
    >
    >00 80 00 00
    >
    >but one would hope that the shr for shift right instruction would
    >instead leave:
    >
    >00 00 00 80
    >
    >so I am wondering if besides using assembler instructions there are
    >other well known methods in the high level language for shifting,
    >particularly in terms of one bit shift and multiples of eight bit
    >shifts.


    The compiler needs to use assembler instructions. There
    isn't much else that it CAN do. You were expecting magic?
    Transmute Endianness incantations?

    >That's basically about the conflict between msb->lsb bit fill order but
    >LSB->MSB, LE or Little-Endian byte order, and correspondingly between
    >lsb->msb and MSB->LSB, BE or Big-Endian, in terms of high level shift
    >and instruction level shift.


    What conflict? A shift operation generally maps directly to an
    assembly-language shift instruction. Sometimes a shift for a variable
    amount might have to use a loop.

    It's shifting a VALUE.

    >I have not decompiled the output of the
    >compiler on the LE machine, maybe that would be the most direct way to
    >see what the compiler does to accomplish bit shifts across byte
    >boundaries that preserve numeric values, but if you happen to know
    >offhand, that would be interesting and appreciated.


    A value gets shifted in a register. Registers don't HAVE byte boundaries,
    or at least not ones that make any difference, performance wise.

    >I'm basically
    >looking at the basic necessity of assembler for a bit scanner. It is
    >acceptably easily done in the HLL, I'm just concerned about trivial
    >issues of compiler instruction generation.
    >
    >Here's another question, but it's about the linker. Say, a unit
    >contains multiple symbols, that have by default the extern storage
    >class. I guess it's up to the linker to either draw in all symbols of
    >the module or pull the used symbols out of the compiled module. I look
    >at the output with gcc, and it draws in symbols that are unused, just
    >because some other symbol in the same compilation unit is used. for


    Can you suggest a way for gcc to determine what part of the object
    to leave out based on which symbols are wanted and which aren't?
    It's not easy.
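
    As an aside, and assuming a much newer toolchain than the gcc 2.95.2
    quoted below: later GCC/binutils can drop unreferenced functions at
    link time without one-file-per-function, via per-function sections
    plus linker garbage collection. The file names here are hypothetical,
    made up from the nm output:

```shell
# Hypothetical file names; requires GCC/binutils newer than 2.95.2.
cc -ffunction-sections -fdata-sections -c fc_mem.c main.c
cc -Wl,--gc-sections -o test main.o fc_mem.o
# On Apple's ld the rough equivalent is:
cc -Wl,-dead_strip -o test main.o fc_mem.o
```
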

    >example:
    >
    >[space:~/Desktop/fc_test] nm Build/test
    >...
    >00003c7c T _fc_mem_free
    >00003c38 T _fc_mem_malloc
    >00003cc4 T _fc_mem_realloc
    >00003d10 T _fc_mem_stackalloc
    >00003d6c T _fc_mem_stackfree
    >...
    >
    >[snip]


    Gordon L. Burditt
    Gordon Burditt, Mar 3, 2005
    #4
  5. Gordon Burditt wrote:

    > [snip]
    >
    > Can you suggest a way for gcc to determine what part of the object
    > to leave out based on which symbols are wanted and which aren't?
    > It's not easy.
    >
    > [snip]
    >
    > Gordon L. Burditt


    Hi,

    Not really, no.

    About Little-Endian, I'm aware many chips have big and little-endian
    modes, eg SPARC, PowerPC, Itanium, but the Intel x86 chips are
    little-endian, and as well the low order bytes of the word are aliased
    to specific sections of the register, and there are zillions of those
    things sitting around.

    About the code size handling, that's a good question, I don't offhand
    know the binary format of the object files, ELF, COFF, PE, a.out, with
    DWARF or STABS or ???

    I guess there's basically static and dynamic analysis. That's
    different than static and extern storage, with the auto and the hey hey
    and the I-don't-know. It's relocatable code, mostly: if you can tell
    the compiler that you aren't calling any functions by fixed literal
    addresses in the objects, or even if you are, then it should be able to
    resolve that, subject to limitations in the intermediate object format.
    It must, or otherwise it wouldn't satisfy the symbols in unused
    compilation units, i.e., when linking unused objects.

    Some compilers do that, reduced size, they have to do it somehow. If I
    have 45 compilation objects in the build, I'd rather not have 145, but
    if I have a compilation unit with 20 nearly identical functions with
    widely varying runtime, and the program only uses one of those
    functions, then I definitely want the least amount of stuff there, from
    compile time.

    About Little-Endian and shift, I'm trying to think of an example. Say
    I have a 32 bit unsigned word with each byte being an encoded symbol.

    0x AA BB CC DD

    If a certain bit in another register is set, I want to shift that
    right.

    0x 00 AA BB CC

    Then, regardless of whether CC or DD is in the low byte, I test that
    byte for a bit. If it's set, there's this processing, and then what's
    left is conditionally shifted 16 bits.

    0x 00 00 00 AA
    or
    0x 00 00 AA BB

    Now, all I'm interested in is the low byte there. So, on the x86
    register eax, say, it is as so

    ----------- eax -----------
                  ---- ax ----
                  -ah-   -al-
     AA     00     00     00

    then, if I want to move that byte off then I think it has to go onto ah
    or al to be moved into a memory location at a byte offset. So, I'm
    causing myself grief about flipping the original number to 0x DD CC BB
    AA and shifting it the other way, for the Little-Endian case, because
    those bit tests (and + jnz, I suppose) or moves work off of the byte
    size register aliases off of the x86.

    Then I get to thinking about my near complete lack of knowledge of how
    the Big-Endian processor best or most happily works with moving bytes
    off of the register into memory, using C. What are some good examples
    of useful knowledge from C of the byte order of the processor, and how
    it loads bytes to and from memory, or casting int & 0xFF to byte?
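
    If it helps, the endian-independent way to pull a given byte out of
    that word is a shift plus a mask; a small sketch assuming C99's
    uint32_t, with byte_at a made-up helper name:

```c
#include <stdint.h>

/* (w >> (8*i)) & 0xFF picks byte i counting from the low-order end of
 * the VALUE, on big- and little-endian machines alike. */
unsigned char byte_at(uint32_t w, int i)
{
    return (unsigned char)((w >> (8 * i)) & 0xFFu);
}
```
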

    Besides this kind of stuff, all the non-fixed width or standard char*
    or wchar_t* function integer variables have size of sizeof(register_t)
    from machine/types.h, but not actually including that file or using
    that type, partially because it's signed and thus right shifts have
    sign extension, which is unfeasible for sequential testing of each bit
    of the register, since it sign-extends the right shift of the initially
    single-bit mask. That's not pedantically about the C language
    specification, but it is about proper usage.
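
    A sketch of the kind of bit scan that motivates the unsigned type;
    count_set_bits is a made-up example name, assuming C99's uint32_t:

```c
#include <stdint.h>

/* With an unsigned mask, >> shifts in zeros, so the mask visits every
 * bit from the top down and then becomes zero, ending the loop.  A
 * signed mask starting at the sign bit would shift in ones forever. */
int count_set_bits(uint32_t w)
{
    int n = 0;
    uint32_t mask;
    for (mask = 0x80000000u; mask != 0; mask >>= 1) {
        if (w & mask) { n++; }
    }
    return n;
}
```
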

    So, I should probably go compile some shifts and step through their
    disassembly on the Pentium, to perhaps enhance but hopefully diminish
    my deep personal confusion about low-voltage electronics.

    Then, I'm wondering about signal-friendly uses of basically heap memory
    or I/O functions. That's about installing a handler to clean up the
    custom library junk, for reliability, but also to some extent about the
    existence, on one single-core processor, of one thread of execution.

    Also, when I'm accessing this buffer full of bytes, I assume the buffer
    itself is word aligned and its length is an integral number of words,
    and I am concerned that bounds checking will complain. So, that is not
    really a problem, it just needs some code to treat the beginning and
    end of the memory buffer as bytes and the rest as words.
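
    A sketch of that head/middle/tail treatment, assuming C99's stdint.h;
    sum_bytes is a made-up example, and since the word pass still sums the
    bytes by value through shifts and masks, the result is
    endian-independent:

```c
#include <stddef.h>
#include <stdint.h>

/* Sum a byte buffer: peel off leading bytes up to a word boundary,
 * run over whole aligned words, then finish the tail byte by byte. */
unsigned long sum_bytes(const unsigned char *p, size_t n)
{
    unsigned long total = 0;

    /* leading bytes until p is word aligned */
    while (n > 0 && ((uintptr_t)p % sizeof(uint32_t)) != 0) {
        total += *p++; n--;
    }
    /* aligned middle, one word at a time; the masks extract each byte
     * of the VALUE, so byte order never matters */
    while (n >= sizeof(uint32_t)) {
        uint32_t w = *(const uint32_t *)p;
        total += (w & 0xFFu) + ((w >> 8) & 0xFFu)
               + ((w >> 16) & 0xFFu) + ((w >> 24) & 0xFFu);
        p += sizeof(uint32_t); n -= sizeof(uint32_t);
    }
    /* trailing bytes */
    while (n > 0) { total += *p++; n--; }
    return total;
}
```
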

    I was playing with vprintf and getting undefined behavior for not
    calling va_start; it was printing FEEDFACE. Now, CAFEBABE I have seen
    before, and DEADBEEF, but FEEDFACE was a new one to me. It works great,
    I'm really happy about that.

    Hey, I compile with -pedantic -ansi, and besides complaining about long
    long not being ANSI from stdlib.h, it says "ANSI C does not support
    const or volatile functions" for something along the lines of const
    char* name(int number);.

    Anyways that's a lot of unrelated garbage to your question. So, no, I
    don't offhand know how the linker works.

    Thanks! Warm regards,

    Ross F.
    Ross A. Finlayson, Mar 3, 2005
    #5
  6. >About Little-Endian, I'm aware many chips have big and little-endian
    >modes, eg SPARC, PowerPC, Itanium, but the Intel x86 chips are
    >little-endian, and as well the low order bytes of the word are aliased
    >to specific sections of the register, and there are zillions of those
    >things sitting around.


    So what? Write your code so it doesn't matter what endian the
    processor is.

    >About Little-Endian and shift, I'm trying to think of an example. Say
    >I have a 32 bit unsigned word with each byte being an encoded symbol.
    >
    >0x AA BB CC DD


    It's a VALUE. Quit putting those stupid spaces in there.

    >If a certain bit in another register is set, I want to shift that
    >right.
    >
    >0x 00 AA BB CC
    >
    >Then, regardless of whether CC or DD is in the low byte, I test that
    >byte for a bit. If it's set, there's this processing, and then what's
    >left is conditionally shifted 16 bits.
    >
    >0x 00 00 00 AA
    >or
    >0x 00 00 AA BB
    >
    >Now, all I'm interested in is the low byte there. So, on the x86
    >register eax, say, it is as so
    >
    >-------eax -----
    >--eah--- ---eal---
    > -ah- -al-
    >AA 00 00 00
    >
    >then, if I want to move that byte off then I think it has to go onto ah
    >or al to be moved into a memory location at a byte offset.


    A one-byte value does not have byte order.

    >So, I'm
    >causing myself grief about flipping the original number to 0x DD CC BB
    >AA and shifting it the other way, for the Little-Endian case, because
    >those bit tests (and + jnz, I suppose) or moves work off of the byte
    >size register aliases off of the x86.


    What the heck are you talking about? You think flipping the order
    is FASTER? It's not. You think flipping the order and shifting gives
    you the same answer? It doesn't. Reversing the order of BYTES does not
    reverse the order of BITS.

    If you're doing a 32-bit shift, IT DOES A 32-BIT SHIFT. Stupid register
    aliases have nothing to do with it. By definition, AL refers to the low-order
    byte of EAX on an Intel processor, and has nothing to do with whether the
    processor is big-endian or little-endian. If you're doing a 32-bit bitwise
    AND, it does a full 32-bit bitwise AND. byte size register aliases have
    nothing to do with it.

    >Then I get to thinking about my near complete lack of knowledge of how
    >the Big-Endian processor best or most happily works with moving bytes
    >off of the register into memory, using C. What are some good examples
    >of useful knowledge from C of the byte order of the processor, and how
    >it loads bytes to and from memory, or casting int & 0xFF to byte?


    When you use C, you're not supposed to CARE about the endianness, or lack
    thereof, of the processor.

    If it wants to load a 32-bit value out of memory, IT LOADS A 32-bit VALUE
    OUT OF MEMORY. There are no speed issues about which end is done first,
    so you shouldn't care. The bytes wind up in the right places in the register
    based on endianness or lack thereof.

    >So, I should probably go compile some shifts and step through their
    >disassembly on the Pentium, to perhaps enhance but hopefully diminish
    >my deep personal confusion about low-voltage electronics.


    The specification of the software interface has nothing to do with whether
    the Pentium uses low-voltage electronics or quantum warp fields or orbital
    mind-control lasers. It still works the same.

    Gordon L. Burditt
    Gordon Burditt, Mar 3, 2005
    #6
  7. Gordon Burditt wrote:
    > >About Little-Endian, I'm aware many chips have big and little-endian
    > >modes, eg SPARC, PowerPC, Itanium, but the Intel x86 chips are
    > >little-endian, and as well the low order bytes of the word are

    aliased
    > >to specific sections of the register, and there are zillions of

    those
    > >things sitting around.

    >
    > So what? Write your code so it doesn't matter what endian the
    > processor is.
    >
    > >About Little-Endian and shift, I'm trying to think of an example.

    Say
    > >I have a 32 bit unsigned word with each byte being an encoded

    symbol.
    > >
    > >0x AA BB CC DD

    >
    > It's a VALUE. Quit putting those stupid spaces in there.
    >
    > >If a certain bit in another register is set, I want to shift that
    > >right.
    > >
    > >0x 00 AA BB CC
    > >
    > >Then, regardless of whether CC or DD is in the low byte, I test that
    > >byte for a bit. If it's set, there's this processing, and then

    what's
    > >left is conditionally shifted 16 bits.
    > >
    > >0x 00 00 00 AA
    > >or
    > >0x 00 00 AA BB
    > >
    > >Now, all I'm interested in is the low byte there. So, on the x86
    > >register eax, say, it is as so
    > >
    > >-------eax -----
    > >--eah--- ---eal---
    > > -ah- -al-
    > >AA 00 00 00
    > >
    > >then, if I want to move that byte off then I think it has to go onto

    ah
    > >or al to be moved into a memory location at a byte offset.

    >
    > A one-byte value does not have byte order.
    >
    > >So, I'm
    > >causing myself grief about flipping the original number to 0x DD CC

    BB
    > >AA and shifting it the other way, for the Little-Endian case,

    because
    > >those bit tests (and + jnz, I suppose) or moves work off of the byte
    > >size register aliases off of the x86.

    >
    > What the heck are you talking about? You think flipping the order
    > is FASTER? It's not. You think flipping the order and shifting

    gives
    > you the same answer? It doesn't. Reversing the order of BYTES does

    not
    > reverse the order of BITS.
    >
    > If you're doing a 32-bit shift, IT DOES A 32-BIT SHIFT. Stupid register
    > aliases have nothing to do with it. By definition, AL refers to the low-order
    > byte of EAX on an Intel processor, and has nothing to do with whether the
    > processor is big-endian or little-endian. If you're doing a 32-bit bitwise
    > AND, it does a full 32-bit bitwise AND. Byte size register aliases have
    > nothing to do with it.
    >
    > >Then I get to thinking about my near complete lack of knowledge of how
    > >the Big-Endian processor best or most happily works with moving bytes
    > >off of the register into memory, using C. What are some good examples
    > >of useful knowledge from C of the byte order of the processor, and how
    > >it loads bytes to and from memory, or casting int & 0xFF to byte?

    >
    > When you use C, you're not supposed to CARE about the endianness, or lack
    > thereof, of the processor.
    >
    > If it wants to load a 32-bit value out of memory, IT LOADS A 32-bit VALUE
    > OUT OF MEMORY. There are no speed issues about which end is done first,
    > so you shouldn't care. The bytes wind up in the right places in the register
    > based on endianness or lack thereof.
    >
    > >So, I should probably go compile some shifts and step through their
    > >disassembly on the Pentium, to perhaps enhance but hopefully diminish
    > >my deep personal confusion about low-voltage electronics.

    >
    > The specification of the software interface has nothing to do with whether
    > the Pentium uses low-voltage electronics or quantum warp fields or orbital
    > mind-control lasers. It still works the same.
    >
    > Gordon L. Burditt


    Well, sure, C has no concept of endian-ness, and the code might be
    compiled on a PDP- or Middle-Endian system, but it's a fact that for
    all intents and purposes the byte order of the platform is Big-Endian,
    or Little-Endian.

    Now, with dealing with an int, it is true that there is no reason to
    care what the byte order is, unless you have to deal with ints written
    to file or stream as a sequential sequence of bytes.

    Basically I'm talking processing data that is a sequence of bytes,
    8-bit octets. It can be processed by loading a byte at a time, where
    if CHAR_BIT does not equal 8, for example on some strange processor
    with 36 bit words and 9 bit bytes that is basically mythical, the data
    in question is still stored as a sequence of 8-bit octets of binary
    digits, or bytes.

    As well, the sequence of the bytes increases as the memory address
    increases. That is to say, the most significant byte of the data
    stream is at the lower memory address, similarly to how the most
    significant byte of a Big-Endian integer is stored at the lower memory
    address.

    So, to be portable and byte-order agnostic, load a byte at a time of
    the input data stream and process it.

    To be slightly more pragmatic, load as many bytes as fit on a register
    at once onto the register. It's important from the processor's
    perspective that the memory address from which those BYTES_PER_WORD
    many bytes are loaded is aligned to the size of the word, that
    sizeof(word_t) divides evenly into the address. On some platforms,
    where that is not the case, the program receives a SIGBUS (bus error)
    signal, and on others the processor grumpily stops what it's doing and
    loads the unaligned word. Here, a word means the generic register
    size, and not necessarily the octet pair, where the DWORD type from
    windows.h is the type of the register word on 32-bit systems, or a type
    defined as project_word_t, or project_sword_t, where the default word
    type is unsigned.

    So anyways, a point here is that on the little-endian machine, when the
    word from the data sequence is loaded, say 0xAABBCCDD, then the number
    on the register is 0xDDCCBBAA. Now, let's say for some bizarre reason
    the idea is to test each bit in data sequence in order, even if a table
    lookup might _seem_ more efficient, because the loop fits on a cache
    line and there are more aligned accesses. Anyways, on the
    Little-Endian processor, then there is the consideration of scanning
    what are bits 31 to 0 of the data sequence, on the register they are
    bits 7 to 0, 15 to 8, 23 to 16, and 31 to 24.

    If the fill order of the bits is reversed, so that the idea is to scan
    bit 0 to bit 7 of each byte in order, then on the little-endian
    processor the register word can simply be processed as bit 0 to bit 31.
    I digress.

    http://groups-beta.google.com/group/comp.lang.asm.x86/msg/4f29d6567da68957

    There are very real considerations of how to organize the data within a
    word where only pieces of the word have meaning, that is, instead of
    being an integer it is basically a char[4], or char[8] on 64-bit
    processor words, and on different endian architectures, those arrays
    will essentially be in reversed order when loaded onto the register by
    loading that word from memory.

    One of those considerations is in taking one of those char values and
    extracting it from the int value. When that's done, then if the value
    happens to be on one of those aliased byte registers on the x86, then
    it is moved off into memory with one instruction, otherwise it has to
    be copied to the register and then moved. In the high level language
    C, it's possible to design the organization of the embedded codes
    within the integer so that they naturally fall upon these boundaries,
    making it easier for the compiler to not generate what would be
    unneeded operations in light of design and compile time decisions.

    The same holds true, in an opposite kind of way, for big-endian
    processors and their methods of moving part of the contents of the
    register to a byte location.

    The point in making these kinds of design decisions, and spending the
    time to come to these useful conclusions instead of just using getc and
    putc on a file pointer (which is fine), is that they allow some
    rational accommodation of the processor features without totally
    specializing for each processor. Whether the processor is Big- or
    Little-Endian, or neither, makes a difference in the correct algorithms
    in terms of words and sequences of bytes, so the functions are
    specialized for what are basically high level characteristics, whose
    immediate consequence is the word as an array or vector of smaller
    sized scalar integers.

    The data streams are coming in network order, that means big-endian.
    Data is most efficiently loaded onto the processor register a processor
    register word at a time.

    Anyways, I guess I'm asking not to hear "no, program about bytes" but
    rather, "yes, C is a useful high level programming language that
    generates code closely approximating the machine's features, and here
    are some good practices for working with sequences of data organized
    in bytes."

    Thanks again, I appreciate your insight. Warm regards,

    Ross F.


    Is this wrong?

    #ifndef fc_typedefs_h
    #define fc_typedefs_h

    #include <stddef.h> /* wchar_t */

    #include <limits.h>
    #if CHAR_BIT != 8
    #error This application is supported only on systems with CHAR_BIT==8
    #endif

    #define PLATFORM_BYTE_ORDER be

    /** This is the type of the generic data pointer. */
    typedef void* fc_ptr_t;

    /** This is the scalar unsigned integer type of the generic data
    pointer. */
    typedef unsigned int fc_ptr_int_t;

    /** This is the size of memory blocks in bytes, sizeof's size_t. */
    typedef unsigned int fc_size_t;

    /** This should be the unsigned word size of the processor, 32, 64,
    128, ..., eg register_t, and not subject to right shift sign extension.
    */
    // typedef unsigned int fc_word_t
    typedef unsigned int fc_word_t;

    /** This should be the signed word size of the processor, 32, 64, 128,
    ...., eg register_t. */
    typedef int fc_sword_t;

    /** This is sizeof(fc_word_t) * 8, the processor register word width in
    bits. */
    #define fc_word_width 32

    /** This macro suffixes the function identifier with the word width
    type size identifier. */
    #define fc_word_width_suffix(x) x##_32

    /** This is the number of bits in a byte, eg CHAR_BIT, and must be 8.
    */
    #define fc_bits_in_byte 8

    /** This is fc_word_width / fc_bits_in_byte -1, eg 32/8 -1 = 3, 64/8 -
    1 = 7. */
    #define fc_align_word_mask (0x00000003)

    /** This is fc_word_width - 1. */
    #define fc_word_max_shift 31

    /** This is the literal suffix for literals 0x12345678 of fc_word_t, eg
    empty or UL, then ULL, ..., and should probably be unchanged. */
    #define UL UL

    /** This is the default unsigned scalar integer type. */
    typedef unsigned int fc_uint_t;
    /** This is the default signed scalar integer type. */
    typedef int fc_int_t;

    /** This type is returned from functions as error or status code. */
    typedef fc_int_t fc_ret_t;

    /** This is the type of the symbol from the coding alphabet, >= 8 bits
    wide, sizeof == 1. */
    typedef unsigned char fc_symbol_t;

    /** This is a character type for use with standard functions. */
    typedef char fc_char;
    /** This is a wide character type for use with standard functions. */
    typedef wchar_t fc_wchar_t;

    /** This is a fixed-width scalar type. */
    typedef char fc_int8;
    /** This is a fixed-width scalar type. */
    typedef unsigned char fc_uint8;
    /** This is a fixed-width scalar type. */
    typedef short fc_int16;
    /** This is a fixed-width scalar type. */
    typedef unsigned short fc_uint16;
    /** This is a fixed-width scalar type. */
    typedef int fc_int32;
    /** This is a fixed-width scalar type. */
    typedef unsigned int fc_uint32;

    #if fc_word_width == 64

    /** This is a fixed-width scalar type. */
    typedef long long fc_int64;
    /** This is a fixed-width scalar type. */
    typedef unsigned long long fc_uint64;

    #endif /* fc_word_width == 64 */

    /** This is a Boolean type. */
    typedef fc_word_t fc_bool_t;

    #ifndef NULL
    #define NULL ((void*)0)
    #endif



    #endif /* fc_typedefs_h */
    Ross A. Finlayson, Mar 3, 2005
    #7
  8. In article <>,
    Ross A. Finlayson <> wrote:
    :Basically I'm talking processing data that is a sequence of bytes,
    :8-bit octets. It can be processed by loading a byte at a time, where
    :if CHAR_BIT does not equal 8, for example on some strange processor
    :with 36 bit words and 9 bit bytes that is basically mythical, the data
    :in question is still stored as a sequence of 8-bit octets of binary
    :digits, or bytes.

    Mythical? The Sigma Xerox 5, 7, and 9; the Honeywell L6 and L66;
    The PDP-6 and PDP-10 (used for TOPS-20). And probably others.

    For awhile in the early 80's, it looked like 36 bit words were going
    to replace 32 bit words.

    Now, if you'd said "legendary" instead of "mythical"...
    --
    Reviewers should be required to produce a certain number of
    negative reviews - like police given quotas for handing out
    speeding tickets. -- The Audio Anarchist
    Walter Roberson, Mar 3, 2005
    #8
  9. >Well, sure, C has no concept of endian-ness, and the code might be
    >compiled on a PDP- or Middle-Endian system, but it's a fact that for
    >all intents and purposes the byte order of the platform is Big-Endian,
    >or Little-Endian.


    And the processor's price tag might be denominated in dollars or
    it might be in Euros, but that isn't relevant either.

    >Now, with dealing with an int, it is true that there is no reason to
    >care what the byte order is, unless you have to deal with ints written
    >to file or stream as a sequential sequence of bytes.


    In that case, you have to deal with the bytes in the ENDIAN ORDER THEY
    WERE WRITTEN IN, not the endian order of the native processor.

    >Basically I'm talking processing data that is a sequence of bytes,
    >8-bit octets. It can be processed by loading a byte at a time, where
    >if CHAR_BIT does not equal 8, for example on some strange processor
    >with 36 bit words and 9 bit bytes that is basically mythical, the data
    >in question is still stored as a sequence of 8-bit octets of binary
    >digits, or bytes.
    >
    >As well, the sequence of the bytes increases as the memory address
    >increases. That is to say, the most significant byte of the data
    >stream is at the lower memory address, similarly to how the most
    >significant byte of a Big-Endian integer is stored at the lower memory
    >address.
    >
    >So, to be portable and byte-order agnostic, load a byte at a time of
    >the input data stream and process it.
    >
    >To be slightly more pragmatic, load as many bytes as fit on a register
    >at once onto the register. It's important from the processor's
    >perspective that the memory address from which those BYTES_PER_WORD
    >many bytes are loaded is aligned to the size of the word, that
    >sizeof(word_t) divides evenly into the address.


    You are often NOT guaranteed this in network packets.
    The required alignment is NOT necessarily that the address
    is a multiple of the word size of that type. For example, an
    8-byte quantity might have to be aligned on a multiple of *4*,
    not 8.

    >On some platforms,
    >where that is not the case, the program results SIGBUS Bus Error
    >signal, and on others the processor grumpily stops what it's doing and
    >loads the unaligned word.


    Or sometimes it loads part of the *WRONG* word.

    >Here, a word means the generic register
    >size, and not necessarily the octet pair, where the DWORD type from
    >windows.h is the type of the register word on 32-bit systems, or a type
    >defined as project_word_t, or project_sword_t, where the default word
    >type is unsigned.


    There is no <windows.h> in C, nor any of the *-crap-word types.

    >
    >So anyways, a point here is that on the little-endian machine, when the
    >word from the data sequence is loaded, say 0xAABBCCDD, then the number
    >on the register is 0xDDCCBBAA. Now, let's say for some bizarre reason
    >the idea is to test each bit in data sequence in order, even if a table
    >lookup might _seem_ more efficient, because the loop fits on a cache
    >line and there are more aligned accesses. Anyways, on the
    >Little-Endian processor, then there is the consideration of scanning
    >what are bits 31 to 0 of the data sequence, on the register they are
    >bits 7 to 0, 15 to 8, 23 to 16, and 31 to 24.


    The bits are numbered 0x00000001, 0x00000002, 0x00000004, 0x00000008,
    0x00000010, ... 0x80000000. You can use a value of 1 advanced by shifting
    to sequence through them. C does *NOT* define a bit numbering.
    There is no universal agreement on little-endian machines whether bit 31
    refers to the least-significant or most-significant bit of a 32-bit word.
    There is no universal agreement on big-endian machines either.

    >If the fill order of the bits is reversed, then the idea is to scan in
    >order bit 7 to bit 0 of each byte, then on the little-endian processor
    >the register word can be processed as bit 0 to bit 31 on the register.
    >I digress.


    There is no such bit numbering. If you want to define your own, don't
    present it as being universal, and you have to define what it is first.

    >
    >http://groups-beta.google.com/group/comp.lang.asm.x86/msg/4f29d6567da68957
    >
    >There are very real considerations of how to organize the data within a
    >word where only pieces of the word have meaning, that is, instead of
    >being an integer it is basically a char[4], or char[8] on 64-bit
    >processor words, and on different endian architectures, those arrays
    >will essentially be in reversed order when loaded onto the register by
    >loading that word from memory.


    There's a reason why char arrays and multi-byte integers are
    treated differently.

    >One of those considerations is in taking one of those char values and
    >extracting it from the int value. When that's done, then if the value
    >happens to be on one of those aliased byte registers on the x86, then
    >it is moved off into memory with one instruction, otherwise it has to
    >be copied to the register and then moved. In the high level language
    >C, it's possible to design the organization of the embedded codes
    >within the integer so that they naturally fall upon these boundaries,
    >making it easier for the compiler to not generate what would be
    >unneeded operations in light of design and compile time decisions.


    You're trying to save amounts of CPU time that may take years to
    recover if you have to recompile the code once to get the savings.

    >The same holds true, in an opposite kind of way, for big-endian
    >processors and they're methods of moving part of the contents of the
    >register to a byte location.
    >
    >The point in making these kinds of design decisions and wasting the
    >time to come to these useful conclusions instead of just using getc and
    >putc on a file pointer, which is great, is that they allow some
    >rational accomodation of the processor features without totally
    >specializing for each processor, where whether the processor is Big- or
    >Little-Endian, or not, makes a difference in the correct algorithms in
    >terms of words and sequences of bytes, and specializing the functions
    >for those what are basically high level characteristics, in terms of
    >their immediate consequence the word as an array or vector of smaller
    >sized scalar integers.


    Such micro-optimization is a waste of time, and encourages such
    things as the carefully hand-tuned bubble sort.

    >The data streams are coming in network order, that means big-endian.
    >Data is most efficiently loaded onto the processor register a processor
    >register word at a time.
    >
    >Anyways, I guess I'm asking not to hear "no, program about bytes" but
    >rather, yes, "C is a useful high level programming language meant for
    >generation that closely approximates the machine features and here are
    >some good practices for working with sequences of data organized in
    >bytes."


    Unless you have benchmarks that demonstrate that the section of code
    you are discussing is a bottleneck, I think you should be IGNORING
    the endianness of the machine and write portable code.

    >Thanks again, I appreciate your insight. Warm regards,
    >
    >Ross F.
    >
    >
    >Is this wrong?
    >
    >#ifndef fc_typedefs_h
    >#define fc_typedefs_h
    >
    >#include <stddef.h> /* wchar_t */
    >
    >#include <limits.h>
    >#if CHAR_BIT != 8
    >#error This application is supported only on systems with CHAR_BIT==8
    >#endif
    >
    >#define PLATFORM_BYTE_ORDER be
    >
    >/** This is the type of the generic data pointer. */
    >typedef void* fc_ptr_t;
    >
    >/** This is the scalar unsigned integer type of the generic data
    >pointer. */
    >typedef unsigned int fc_ptr_int_t;
    >
    >/** This is the size of memory blocks in bytes, sizeof's size_t. */
    >typedef unsigned int fc_size_t;
    >
    >/** This should be the unsigned word size of the processor, 32, 64,
    >128, ..., eg register_t, and not subject to right shift sign extension.
    >*/
    >// typedef unsigned int fc_word_t
    >typedef unsigned int fc_word_t;
    >
    >/** This should be the signed word size of the processor, 32, 64, 128,
    >..., eg register_t. */
    >typedef int fc_sword_t;
    >
    >/** This is sizeof(fc_word_t) * 8, the processor register word width in
    >bits. */
    >#define fc_word_width 32
    >
    >/** This macro suffixes the function identifier with the word width
    >type size identifier. */
    >#define fc_word_width_suffix(x) x##_32
    >
    >/** This is the number of bits in a byte, eg CHAR_BIT, and must be 8.
    >*/
    >#define fc_bits_in_byte 8
    >
    >/** This is fc_word_width / fc_bits_in_byte -1, eg 32/8 -1 = 3, 64/8 -
    >1 = 7. */
    >#define fc_align_word_mask (0x00000003)
    >
    >/** This is fc_word_width - 1. */
    >#define fc_word_max_shift 31
    >
    >/** This is the literal suffix for literals 0x12345678 of fc_word_t, eg
    >empty or UL, then ULL, ..., and should probably be unchanged. */
    >#define UL UL
    >
    >/** This is the default unsigned scalar integer type. */
    >typedef unsigned int fc_uint_t;
    >/** This is the default signed scalar integer type. */
    >typedef int fc_int_t;
    >
    >/** This type is returned from functions as error or status code. */
    >typedef fc_int_t fc_ret_t;
    >
    >/** This is the type of the symbol from the coding alphabet, >= 8 bits
    >wide, sizeof == 1. */
    >typedef unsigned char fc_symbol_t;
    >
    >/** This is a character type for use with standard functions. */
    >typedef char fc_char;
    >/** This is a wide character type for use with standard functions. */
    >typedef wchar_t fc_wchar_t;
    >
    >/** This is a fixed-width scalar type. */
    >typedef char fc_int8;
    >/** This is a fixed-width scalar type. */
    >typedef unsigned char fc_uint8;
    >/** This is a fixed-width scalar type. */
    >typedef short fc_int16;
    >/** This is a fixed-width scalar type. */
    >typedef unsigned short fc_uint16;
    >/** This is a fixed-width scalar type. */
    >typedef int fc_int32;
    >/** This is a fixed-width scalar type. */
    >typedef unsigned int fc_uint32;
    >
    >#if fc_word_width == 64
    >
    >/** This is a fixed-width scalar type. */
    >typedef long long fc_int64;
    >/** This is a fixed-width scalar type. */
    >typedef unsigned long long fc_uint64;
    >
    >#endif /* fc_word_width == 64 */
    >
    >/** This is a Boolean type. */
    >typedef fc_word_t fc_bool_t;
    >
    >#ifndef NULL
    >#define NULL ((void*)0)
    >#endif
    >
    >
    >
    >#endif /* fc_typedefs_h */
    >
    Gordon Burditt, Mar 4, 2005
    #9
  10. Hi,

    I was wrong about the deal with the literal 0xAABBCCDD: shifting it
    right 24 bits does leave the result in the low byte of the register.
    The little-endian processor would write the word to memory in order as
    DD CC BB AA, but that is irrelevant, and casting the word to byte is as
    simple as using the aliased byte register. For that I am glad.

    There's still the notion of interpreting vectors of bytes in a data
    stream. Generally they're not byte-swapped in register words, so when
    four bytes of the "Big-Endian" data stream are moved onto a
    Little-Endian register, the low byte of the register contains the
    first of those elements instead of the fourth.

    That is vaguely troubling. In terms of testing each bit of the input
    sequence in order, for example with entropy-coded data (leaving aside
    fill bytes and mixed-mode coding), the order of the bits in the
    sequence isn't 31-0 on the LE register; it's 7-0, 15-8, 23-16, and
    31-24.

    A simple consideration with that is to just swap the order of the bytes
    on the register. In C, that's a function, because it uses temp
    variables, but there are single instructions to accomplish that effect.
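    Such a reverse_bytes function might look like the following sketch,
    written with shifts and masks so it needs no knowledge of the host byte
    order (many compilers recognize the pattern and emit a single byte-swap
    instruction):

```c
/* Swap the four bytes of a 32-bit value: 0xAABBCCDD -> 0xDDCCBBAA. */
unsigned long reverse_bytes_32(unsigned long x)
{
    x &= 0xFFFFFFFFUL;            /* in case unsigned long is wider than 32 bits */
    return ((x & 0x000000FFUL) << 24) |
           ((x & 0x0000FF00UL) <<  8) |
           ((x & 0x00FF0000UL) >>  8) |
           ((x & 0xFF000000UL) >> 24);
}
```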

    Say for example, and I know this is slow, but the idea is to test bit
    31 to bit 0 and process based upon that.

    word_t bitmask = 0x80000000;

    Now, on big or little endian processors, that has bit 31 set, where the
    bits are numbered from the right starting at zero.

    Then, say there's a word aligned pointer to the data.

    word_t* inputdata;

    Then, a variable to hold the current word of input data.

    word_t currentinput;

    Then, get the currentinput from the inputdata.

    currentinput = *inputdata++;

    That dereferences the input pointer and sets the current input to that
    value, and then it increments the inputdata pointer. Incrementing the
    pointer advances the address by sizeof(word_t) bytes, although in C's
    pointer arithmetic that counts as adding one element, ie that's the
    same as

    currentinput = *inputdata;
    inputdata = inputdata + 1;

    Note that C pointer arithmetic is scaled by the pointed-to type:
    adding an integer to a pointer is legal and moves by that many
    elements, comparing pointers into the same array with < and > is OK,
    but adding two pointers together is illegal. Only subtraction of two
    pointers is allowed, and it yields an element count, not a byte count.

    So anyways then currentinput has the first four bytes of data. Let's
    say the data in the stream was

    0x11 0x22 0x44 0x88

    or

    00010001 00100010 01000100 10001000

    So anyways on the little endian processor testing against the bitmask
    yields true, because on the register that data was loaded right to left
    from the increasing memory:

    0x88442211

    And then although the first bit of the data stream was zero, the first
    bit on the register from loading the initial subsequence of the data is
    one.

    So, on little-endian architecture, after loading the data, swap the
    bytes on that register. (For the #if below to actually discriminate,
    be and le need to be defined as distinct numeric macros; identifiers
    left undefined in an #if expression all evaluate to 0, so the
    comparison would otherwise always be true.)

    #if PLATFORM_BYTE_ORDER == le
    reverse_bytes(currentinput);
    #endif

    Otherwise, the bitmask starts as

    unsigned bitmask = 0x00000080;

    and then after every test against current input, the bitmask is tested
    against the constant 0x01010101; if nonzero then shift the bitmask left
    15 bits, else shift right 1 bit. If it's nonzero, then before
    shifting left, test against 0x01000000, and if that's nonzero break the
    loop and move on to the next input word.

    Now, for each of the 32 bits there is a test against the byte boundary
    mask, and so swapping the bytes beforehand leads to the same number of
    tests or slightly fewer, and the branch is more likely to be predicted,
    as it is taken only with P=1/32 instead of P=1/8.
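    The mask walk just described can be sketched as follows; as a
    self-check, this version re-assembles the visited bits msb-first, which
    should reproduce the byte-swapped input (the function name is made up
    here):

```c
/* Visit the 32 bits of w in data-stream order -- bits 7..0 of the
   low byte, then 15..8, 23..16, 31..24 -- as when big-endian stream
   bytes sit unswapped in a little-endian register. Returns the bits
   packed msb-first, i.e. the byte-swapped input, as a self-check. */
unsigned long scan_stream_bits(unsigned long w)
{
    unsigned long mask = 0x00000080UL;   /* msb of the first stream byte */
    unsigned long out = 0;

    for (;;) {
        out = (out << 1) | ((w & mask) != 0);
        if (mask & 0x01010101UL) {       /* just tested a byte's lsb */
            if (mask & 0x01000000UL)     /* lsb of the final byte: done */
                break;
            mask <<= 15;                 /* hop to the next byte's msb */
        } else {
            mask >>= 1;
        }
    }
    return out & 0xFFFFFFFFUL;
}
```

    With the stream bytes 0x11 0x22 0x44 0x88 loaded as 0x88442211, this
    yields 0x11224488, the stream in its written order.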

    An opposite case exists in the rare condition where the bits are filled
    in each byte with lsb-to-msb bit order instead of that msb-to-lsb or
    7-0. Then, start the bitmask on 0x00000001 and shift it left, swapping
    the bytes of input on big-endian instead.

    BE/msb:
    BE/lsb: swap, reverse
    LE/msb: swap
    LE/lsb: reverse

    The reverse is compiled in with different constants and shifts; the
    swap is an extra function.

    Luckily, most data is plain byte organized and msb-to-lsb, but some
    data formats, for example Deflate, have both msb and lsb filled codes
    of varying sizes.

    Anyways I was wrong about the literal in the C and the shift and that.
    That is to say

    int num = 0xAABBCCDD;

    is on the Little-Endian register (there is no "eah"; ax is the low 16
    bits of eax, and ah/al are the high and low bytes of ax)

    |------- eax -------|
              |--- ax --|
              | ah | al |
    | AA | BB | CC | DD |

    So, that concern was unjustified.

    While that is so, processing the data in word width increments at a
    time is more efficient than processing it a byte at a time, except in
    the case of implementations using type indexed lookup tables, where a
    lookup table with 2^32 entries is excessive.

    There are other considerations with the lookup tables' suitability and
    word width. For example, to reverse the bits in a byte, it is
    convenient to have a 256 entry table of bytes and select from the input
    byte its reversed output byte. I think that was the way to do it.
    Yet, say the notion is to reverse all the bits in a word, or all the
    bits in each byte of a word. Then, you don't want a table to reverse
    all the 2^32 possible values, and it comes to something like this,
    which I copied from somebody:

    fc_uint32 fc_reversebits_bytes_32(fc_uint32 x){

        fc_uint32 left;
        fc_uint32 right;

        left = x & 0xAAAAAAAAUL;
        right = x & 0x55555555UL;
        left = left >> 1;
        right = right << 1;
        x = left | right;

        left = x & 0xCCCCCCCCUL;
        right = x & 0x33333333UL;
        left = left >> 2;
        right = right << 2;
        x = left | right;

        left = x & 0xF0F0F0F0UL;
        right = x & 0x0F0F0F0FUL;
        left = left >> 4;
        right = right << 4;
        x = left | right;

        return x;
    }

    is probably faster than something like this, though on some machines
    the table version might win:

    fc_uint32 fc_reversebits_bytes_32_b(fc_uint32 in){

        fc_uint32 offset;
        fc_uint32 reversed = 0;

        offset = in >> 24;
        reversed = reversed | fc_reversebits_bytes_table_32[offset];

        in = in & 0x00FFFFFF;
        in = in | 0x01000000;
        offset = in >> 16;
        reversed = reversed | fc_reversebits_bytes_table_32[offset];

        in = in & 0x0000FFFF;
        in = in | 0x00020000;
        offset = in >> 8;
        reversed = reversed | fc_reversebits_bytes_table_32[offset];

        in = in & 0x000000FF;
        in = in | 0x00000300;
        reversed = reversed | fc_reversebits_bytes_table_32[in];

        return reversed;
    }

    where that's a table of 256*4 = 1024 entries of four bytes each, three
    of them zero. Partly the shift version wins because the implementation
    for the 64 bit word register is:

    fc_uint64 fc_reversebits_bytes_64(fc_uint64 x){

        fc_uint64 left;
        fc_uint64 right;

        left = x & 0xAAAAAAAAAAAAAAAAULL;
        right = x & 0x5555555555555555ULL;
        left = left >> 1;
        right = right << 1;
        x = left | right;

        left = x & 0xCCCCCCCCCCCCCCCCULL;
        right = x & 0x3333333333333333ULL;
        left = left >> 2;
        right = right << 2;
        x = left | right;

        left = x & 0xF0F0F0F0F0F0F0F0ULL;
        right = x & 0x0F0F0F0F0F0F0F0FULL;
        left = left >> 4;
        right = right << 4;
        x = left | right;

        return x;
    }

    with the same number of instructions, perhaps, as the 32-bit
    implementation.
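    As a quick sanity check of the 32-bit shift-and-mask version (restated
    compactly here so the example is self-contained; fc_uint32 is assumed
    to be a 32-bit unsigned type, as in the header above):

```c
typedef unsigned int fc_uint32;   /* assumed 32 bits wide */

/* Reverses the bit order WITHIN each byte of x; byte positions are
   untouched. Compact restatement of fc_reversebits_bytes_32 above. */
static fc_uint32 revbits_in_bytes(fc_uint32 x)
{
    x = ((x & 0xAAAAAAAAUL) >> 1) | ((x & 0x55555555UL) << 1);
    x = ((x & 0xCCCCCCCCUL) >> 2) | ((x & 0x33333333UL) << 2);
    x = ((x & 0xF0F0F0F0UL) >> 4) | ((x & 0x0F0F0F0FUL) << 4);
    return x;
}
```

    For example, 0x01020304 maps to 0x8040C020: within each byte,
    0x01 -> 0x80, 0x02 -> 0x40, 0x03 -> 0xC0, 0x04 -> 0x20.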

    So, as the word width increases, hopefully in powers of two but maybe
    to 24 or 36 or whatever, a rethink of some of these fundamental
    bit-twiddling algorithms ensues.

    I guess that's so for a lot of the vector operations, for example as
    there are with the MMX or SSE or AltiVec or SPARC graphics extensions.
    Does anyone know a cross-platform C library or code and style to access
    some subset of the vector operations on those things? As it is, each
    is quite separate.

    Yeah, so anyways, my bad about the misinterpretation of the C language
    integer literal in terms of the LE processor register.

    You had a good example about the consideration of cycle shaving and the
    fact that an "archaic" computer that costs fifteen dollars at a yard
    sale runs faster than the 40 million dollar supercomputer from only
    twenty-five years ago. While that is so, cutting the cycle count, or
    approximation in terms of processor register renaming and instruction
    pipelining, in half, does double the speed.

    Thank you,

    Ross F.
    Ross A. Finlayson, Mar 4, 2005
    #10
  11. Hi Gordon,

    I think you're pretty much right, except for the use of a word type,
    for example register_t from machine/types.h on many systems.

    About the data not being aligned in memory, here this is just about
    sequences of bytes, and how the memory in which that sequence is
    contained is itself aligned to a word boundary and is an integral
    multiple of the word size in length. So, start loading it from the
    previous aligned address, so that, say, 0xXXAABBCC is loaded onto the
    word, and then shift the bit test mask, forward in this case of a BE
    processor, and proceed. Then, at the back of the loop, shift the word
    end test bit, where the memory locations outside of the data sequence
    are assumed to be reachable and not to cause an access or segment
    violation, although that leads to problems because sometimes they are
    unreachable.

    In that case there is then delicate byte-wise array access to a word
    boundary, then blazing word aligned word access, then again as
    necessary byte-wise manipulation.
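    That byte-wise access up to a word boundary amounts to rounding the
    pointer up; a sketch, again assuming addresses fit in an unsigned long
    (a platform assumption, not a guarantee of standard C):

```c
/* Round a byte pointer up to the next 4-byte word boundary (no-op if
   already aligned). */
static const unsigned char *align_up_4(const unsigned char *p)
{
    unsigned long a = (unsigned long)p;
    return (const unsigned char *)((a + 3UL) & ~3UL);
}
```

    The loop then processes bytes from p up to align_up_4(p), whole words
    from there, and trailing bytes at the end.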

    Then, it is not so difficult to basically change the definition of
    word_t and word_width from 32 to 64 or 128, where that has nothing to
    do with ILP32 or LP64 or LARGE_FILE_OFFSETS or anything of that sort.
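    The byte-wise prologue, word-aligned middle, byte-wise epilogue
    pattern described above can be sketched as follows. This is a
    minimal illustration, not code from the thread: byte_sum is a
    hypothetical function, uint32_t stands in for the word_t type, and
    unlike the read-past-the-end trick mentioned above it never touches
    bytes outside the buffer:

```c
#include <stdint.h>
#include <stddef.h>

/* Sum all bytes of a buffer: byte-wise up to the first word boundary,
 * then word-at-a-time, then byte-wise for the tail.  The per-word sum
 * extracts each byte by shifting, so the total does not depend on the
 * byte order of the machine. */
static unsigned long byte_sum(const unsigned char *p, size_t n)
{
    unsigned long sum = 0;

    /* byte-wise prologue until p reaches a word boundary */
    while (n > 0 && ((uintptr_t)p & (sizeof(uint32_t) - 1)) != 0) {
        sum += *p++;
        n--;
    }
    /* aligned word-at-a-time middle */
    while (n >= sizeof(uint32_t)) {
        uint32_t w = *(const uint32_t *)(const void *)p;
        sum += (w & 0xFF) + ((w >> 8) & 0xFF)
             + ((w >> 16) & 0xFF) + ((w >> 24) & 0xFF);
        p += sizeof(uint32_t);
        n -= sizeof(uint32_t);
    }
    /* byte-wise epilogue for the unaligned tail */
    while (n-- > 0)
        sum += *p++;
    return sum;
}
```

    The word load assumes the implementation tolerates accessing the
    byte buffer through a uint32_t lvalue, which is exactly the kind of
    platform assumption the thread is weighing.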

    Thank you,

    Ross F.
    Ross A. Finlayson, Mar 4, 2005
    #11
  12. Ross A. Finlayson

    Old Wolf Guest

    Ross A. Finlayson wrote:
    > Gordon Burditt wrote:
    >
    > Well, sure, C has no concept of endian-ness, and the code might be
    > compiled on a PDP- or Middle-Endian system, but it's a fact that for
    > all intents and purposes the byte order of the platform is Big-Endian,
    > or Little-Endian.
    >
    > Now, with dealing with an int, it is true that there is no reason to
    > care what the byte order is, unless you have to deal with ints written
    > to file or stream as a sequential sequence of bytes.


    A sequential sequence ? What other sorts are there?

    > the DWORD type from windows.h is the type of the register word
    > on 32-bit systems, or a type defined as project_word_t,
    > or project_sword_t


    I think that type was used in Star Wars.

    > [lots of stuff]


    You seem to be stuck in an assembler mindset.
    In C it is foolish to write endian-dependent code.
    You write value-based code that works on any endianness.

    Your compiler then translates it to the most efficient set
    of instructions for whatever CPU it is targeting. That is the
    compiler's job, not yours.

    >
    > Is this wrong?


    Not /wrong/ as such, but you could just write portable code.

    > #ifndef fc_typedefs_h
    > #define fc_typedefs_h
    >
    > #include <stddef.h> /* wchar_t */
    >
    > #include <limits.h>
    > #if CHAR_BIT != 8
    > #error This application is supported only on systems with CHAR_BIT==8
    > #endif
    >
    > #define PLATFORM_BYTE_ORDER be


    You should write code that doesn't depend on this.
    Then it will be portable without any changes.

    > /** This is the size of memory blocks in bytes, sizeof's size_t. */
    > typedef unsigned int fc_size_t;


    You say it's size_t and then you suddenly change to the possibly
    smaller type 'unsigned int'. What's up with that?

    > /** This is sizeof(fc_word_t) * 8, the processor register word width
    > in bits. */
    > #define fc_word_width 32

    Why not define it as the comment says?

    > /** This is the number of bits in a byte, eg CHAR_BIT, and must be 8.
    > */
    > #define fc_bits_in_byte 8


    Why not define it as CHAR_BIT then?

    >
    > /** This is fc_word_width / fc_bits_in_byte -1, eg 32/8 -1 = 3, 64/8 -
    > 1 = 7. */
    > #define fc_align_word_mask (0x00000003)


    Why not define it as the comment says?

    >
    > /** This is fc_word_width - 1. */
    > #define fc_word_max_shift 31


    Why not define it as the comment says?

    > /** This is the literal suffix for literals 0x12345678 of fc_word_t, eg
    > empty or UL, then ULL, ..., and should probably be unchanged. */
    > #define UL UL


    Now that's a strange one.
    Note that LU is also a valid suffix to indicate 'unsigned long'.
    Both LU and UL designate 'unsigned long' but fc_word_t is
    'unsigned int', so your comment is wrong.

    > /** This is a fixed-width scalar type. */
    > typedef char fc_int8;
    > /** This is a fixed-width scalar type. */
    > typedef unsigned char fc_uint8;
    > /** This is a fixed-width scalar type. */
    > typedef short fc_int16;
    > /** This is a fixed-width scalar type. */
    > typedef unsigned short fc_uint16;
    > /** This is a fixed-width scalar type. */
    > typedef int fc_int32;
    > /** This is a fixed-width scalar type. */
    > typedef unsigned int fc_uint32;


    What if short is 32 bits or int is 64? You should use the values
    from limits.h to verify that the sizes are correct.

    Also, since you are using 'long long' you are obviously
    restricting yourself to a C99 implementation, so you could
    include <stdint.h> and use the pre-existing fixed-width types.
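    The limits.h verification suggested here might look like the
    following sketch; the fc_* typedefs restate the header being
    reviewed, and the checks are illustrative:

```c
#include <limits.h>

/* Fail the build if the underlying types are not the widths the
 * typedef names claim (CHAR_BIT is checked separately in the header). */
#if UCHAR_MAX != 0xFF
#error fc_uint8 requires unsigned char to be 8 bits
#endif
#if USHRT_MAX != 0xFFFF
#error fc_uint16 requires unsigned short to be 16 bits
#endif
#if UINT_MAX != 0xFFFFFFFF
#error fc_uint32 requires unsigned int to be 32 bits
#endif

typedef unsigned char fc_uint8;
typedef unsigned short fc_uint16;
typedef unsigned int fc_uint32;
```

    On a platform where, say, int is 64 bits, the third check stops the
    build instead of letting fc_uint32 silently become 64 bits wide.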

    >
    > /** This is a Boolean type. */
    > typedef fc_word_t fc_bool_t;
    >

    You could use _Bool.


    > #ifndef NULL
    > #define NULL ((void*)0)
    > #endif


    This must be defined by stdlib.h, stdio.h etc.
    It's usually not a great idea to redefine it yourself.
    Old Wolf, Mar 4, 2005
    #12
  13. Old Wolf wrote:
    > Ross A. Finlayson wrote:
    > > Gordon Burditt wrote:
    > >
    > > Well, sure, C has no concept of endian-ness, and the code might be
    > > compiled on a PDP- or Middle-Endian system, but it's a fact that for
    > > all intents and purposes the byte order of the platform is Big-Endian,
    > > or Little-Endian.
    > >
    > > Now, with dealing with an int, it is true that there is no reason to
    > > care what the byte order is, unless you have to deal with ints written
    > > to file or stream as a sequential sequence of bytes.

    >
    > A sequential sequence ? What other sorts are there?
    >
    > > the DWORD type from windows.h is the type of the register word
    > > on 32-bit systems, or a type defined as project_word_t,
    > > or project_sword_t

    >
    > I think that type was used in Star Wars.
    >
    > > [lots of stuff]

    >
    > You seem to be stuck in an assembler mindset.
    > In C it is foolish to write endian-dependent code.
    > You write value-based code that works on any endianness.
    >
    > Your compiler then translates it to the most efficient set
    > of instructions for whatever CPU it is targeting. That is the
    > compiler's job, not yours.
    >
    > >
    > > Is this wrong?

    >
    > Not /wrong/ as such, but you could just write portable code.
    >
    > > #ifndef fc_typedefs_h
    > > #define fc_typedefs_h
    > >
    > > #include <stddef.h> /* wchar_t */
    > >
    > > #include <limits.h>
    > > #if CHAR_BIT != 8
    > > #error This application is supported only on systems with CHAR_BIT==8
    > > #endif
    > >
    > > #define PLATFORM_BYTE_ORDER be

    >
    > You should write code that doesn't depend on this.
    > Then it will be portable without any changes.
    >
    > > /** This is the size of memory blocks in bytes, sizeof's size_t. */
    > > typedef unsigned int fc_size_t;

    >
    > You say it's size_t and then you suddenly change to the possibly
    > smaller type 'unsigned int'. What's up with that?
    >
    > > /** This is sizeof(fc_word_t) * 8, the processor register word width
    > > in bits. */
    > > #define fc_word_width 32
    >
    > Why not define it as the comment says?
    >
    > > /** This is the number of bits in a byte, eg CHAR_BIT, and must be 8.
    > > */
    > > #define fc_bits_in_byte 8

    >
    > Why not define it as CHAR_BIT then?
    >
    > >
    > > /** This is fc_word_width / fc_bits_in_byte -1, eg 32/8 -1 = 3, 64/8 -
    > > 1 = 7. */
    > > #define fc_align_word_mask (0x00000003)

    >
    > Why not define it as the comment says?
    >
    > >
    > > /** This is fc_word_width - 1. */
    > > #define fc_word_max_shift 31

    >
    > Why not define it as the comment says?
    >
    > > /** This is the literal suffix for literals 0x12345678 of fc_word_t,
    > > eg empty or UL, then ULL, ..., and should probably be unchanged. */
    > > #define UL UL

    >
    > Now that's a strange one.
    > Note that LU is also a valid suffix to indicate 'unsigned long'.
    > Both LU and UL designate 'unsigned long' but fc_word_t is
    > 'unsigned int', so your comment is wrong.
    >
    > > /** This is a fixed-width scalar type. */
    > > typedef char fc_int8;
    > > /** This is a fixed-width scalar type. */
    > > typedef unsigned char fc_uint8;
    > > /** This is a fixed-width scalar type. */
    > > typedef short fc_int16;
    > > /** This is a fixed-width scalar type. */
    > > typedef unsigned short fc_uint16;
    > > /** This is a fixed-width scalar type. */
    > > typedef int fc_int32;
    > > /** This is a fixed-width scalar type. */
    > > typedef unsigned int fc_uint32;

    >
    > What if short is 32 bits or int is 64? You should use the values
    > from limits.h to verify that the sizes are correct.
    >
    > Also, since you are using 'long long' you are obviously
    > restricting yourself to a C99 implementation, so you could
    > include <stdint.h> and use the pre-existing fixed-width types.
    >
    > >
    > > /** This is a Boolean type. */
    > > typedef fc_word_t fc_bool_t;
    > >

    > You could use _Bool.
    >
    >
    > > #ifndef NULL
    > > #define NULL ((void*)0)
    > > #endif

    >
    > This must be defined by stdlib.h, stdio.h etc.
    > It's usually not a great idea to redefine it yourself.


    Hi,

    The PLATFORM_BYTE_ORDER isn't actually used for anything yet, it's just
    a reminder that there are platform details with some of the word access
    to arbitrary strings of bytes.

    I compiled with size_t using -pedantic -ansi, and stddef.h or
    sys/types.h or what-have-you brings in long long, which is not a part
    of ANSI C. Also, where there is a ptr_int_t, which is really a
    ptr_uint_t, sometimes I subtract a smaller from a larger pointer, to
    the same memory block, and assume the result is size_t.

    About defining sizeof(fc_word_t)*8, the compiler will generate the
    constant 32 from that where sizeof(fc_word_t) == 4. While that is so,
    I use #if fc_word_width == 64 which is probably not a portable m4
    macro, and I don't know if m4, the C preprocessor, or other C
    preprocessor, would evaluate the expression.
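    For what it's worth, a conforming C preprocessor (no m4 involved) is
    required to evaluate integer constant expressions in #if, so the
    dispatch on fc_word_width is portable; what cannot appear in #if is
    sizeof, since preprocessing happens before any types exist. A sketch,
    with the 64-bit branch's underlying type being an illustrative
    assumption:

```c
#include <limits.h>

/* fc_word_width must stay a literal: the preprocessor evaluates
 * "#if fc_word_width == 64" as an integer constant expression, but it
 * cannot evaluate sizeof(fc_word_t) * 8, because sizeof and types do
 * not exist yet at preprocessing time. */
#define fc_word_width 32

#if fc_word_width == 64
typedef unsigned long fc_word_t;  /* illustrative; assumes 64-bit long */
#elif fc_word_width == 32
typedef unsigned int fc_word_t;   /* illustrative; assumes 32-bit int */
#else
#error unsupported fc_word_width
#endif
```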

    Why not define 8 as CHAR_BIT? Because, I think some C compilers do not
    have limits.h, that being a reason to remove the #include <limits.h>
    and the preprocessor build error example.

    About the fixed-width types, that's what they are on this machine. The
    long long and unsigned long long or __uint64 and __int64, which I
    believe was standard, or u_int64_t or quadword or what have you,
    definitions are only included by the preprocessor when fc_word_width >
    32.

    About the constant or literal suffix, that's interesting about LU.
    Basically I'm uncertain about that, I have an array of constants, and
    they are the word size, and I'm concerned about promotion of the 32 bit
    constant literal to higher word sizes.

    When I compile with -traditional it says "integer constant is unsigned
    in ANSI C, signed with -traditional." Anyways I'm using ISO prototypes
    so am not expecting to define and use the PROTO macros.

    The typedefs.h header file is meant to be "portable" in that all that
    is necessary in terms of portability of the functions that use the
    types in those files is the single point of change in the typedefs.h
    file. That change is including the appropriate standard definitions
    from their usual locations.

    Then, I have some issues with drawing the right objects into
    compilation. They all compile on BE or LE arkies, new word, heh, that
    looks like ankles, but there are varying implementations, where there
    might be something like #define fc_byteorder_suffix(x)
    x##PLATFORM_BYTE_ORDER when the specialized function is called from the
    public function. Then, the user might just want that one function,
    specialized for their processor and data, and only use it, so it looks
    like each function gets its own compilation unit, gearing towards
    either static or dynamic linking.
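    One note on that suffix macro: written as x##PLATFORM_BYTE_ORDER,
    the ## operator pastes the literal token PLATFORM_BYTE_ORDER rather
    than its expansion, because ## suppresses expansion of its operands.
    A second macro level forces the argument to expand first; fc_paste
    and swap_be below are illustrative names, not from the thread:

```c
#define PLATFORM_BYTE_ORDER be

/* Route the paste through a helper so PLATFORM_BYTE_ORDER is expanded
 * to "be" before ## is applied. */
#define fc_paste2(x, y) x##_##y
#define fc_paste(x, y) fc_paste2(x, y)
#define fc_byteorder_suffix(x) fc_paste(x, PLATFORM_BYTE_ORDER)

/* fc_byteorder_suffix(swap) now expands to the identifier swap_be */
static unsigned int swap_be(unsigned int v)
{
    return v; /* stub standing in for a byte-order-specialized function */
}
```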

    Here's something, about the header include guard, it's

    #ifndef h__fc_typedefs_h
    #define h__fc_typedefs_h h__fc_typedefs_h

    ....

    #endif /* h__fc_typedefs_h */

    I think I'm not supposed to use any macro symbol starting with
    underscore or "__", and also, the guard definition should be unlikely
    to be used anywhere else, including in arbitrary generated data, so I
    define it as itself. Yet, I have had problems with that.

    Nothing is counterintuitive. It's also the set of all sets and the
    ur-element.

    About _Bool, I'm concerned that I am writing code that is a subset of
    both ISO C and C++, and C++ users might want to define it as bool, and
    I don't care as long as FC_TRUE is nonzero and FC_FALSE is zero.

    I don't define NULL unless it's not defined. I know that in C++ NULL
    is zero, 0, and that there are various rules about comparing pointers
    to null.

    So, I get some warnings with -ansi, although the large part of those
    went away with not including most standard definition headers that draw
    in long long or another 64 bit type, but there are warnings about const
    char* f(): "functions can not be const or volatile."

    Hey thanks, that's some good advice.

    Ross F.
    Ross A. Finlayson, Mar 4, 2005
    #13
  14. Hi,

    I add some things to the file, about type definitions, I wonder what
    you think. The contents follow. Please excuse this longwindedness.

    Basically, the additions are for writing source code that is both a C,
    and a C++, program. I was reading on
    http://david.tribble.com/text/cdiffs.htm#C99-const-linkage that there
    is different linkage for constants in the global namespace in
    C++. That is, it says

    const int x = 1;

    is by default in C

    extern const int x = 1;

    or rather it has external linkage, and in C++

    static const int x = 1;

    Now, at file scope the static keyword means that the symbol is only
    visible in that compilation unit. The linker is unable to access the
    variable declared static from another object, which seems very
    different from the static keyword on variables or member functions in
    C++ classes or Java.

    Anyways, I use constant tables of data, sitting in their own
    compilation units, eg as so:

    const int table[256] = {0, 1, 2, ...};

    with there actually being 256 literal entries. If I add the extern
    keyword directly to that, then the C compiler complains (warns), but,
    probably misinterpreting the quoted source, I think in C++ it should be
    declared extern, so

    #define fc_cpp_extern /* define to be blank, not integer zero */

    fc_cpp_extern const int table[256] = {0, 1, 2, ...};

    Then, for C++ it could be defined as extern, or extern "C", or not, I
    am not sure. That could be ridiculous or vestigial.
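    As a sketch of one way to settle the extern question, assuming the
    concern is specifically const tables at file scope: C gives such
    objects external linkage by default, while C++ gives them internal
    linkage, so the C++ build wants an explicit extern (the extern "C"
    form also keeps the symbol name unmangled for cross-language
    linking). This restates the fc_cpp_extern macro from the post:

```c
/* In C, fc_cpp_extern expands to nothing and the const table has
 * external linkage by default.  In C++, const objects default to
 * internal linkage, so extern "C" is used to export the symbol with
 * C linkage. */
#ifdef __cplusplus
#define fc_cpp_extern extern "C"
#else
#define fc_cpp_extern
#endif

fc_cpp_extern const int table[4] = { 0, 1, 2, 3 };
```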

    Then, I read about the restrict keyword in C99, which is kind of like
    the register keyword for pointers but much different. I have functions
    that I think would qualify to benefit from declaring their inputs or
    variables as not aliased to write-through for other pointers, so I
    define a macro for the keyword and ignore it for now.

    Then, I get to thinking about C++ and its use of static_cast,
    reinterpret_cast, const_cast, and dynamic_cast, where dynamic_cast is
    meaningless in C, I don't have any reason to use const_cast, but every
    other cast in the code is correctly either a static_cast, or
    reinterpret_cast. So, I define macros for static_cast and
    reinterpret_cast.

    #define s_cast(T) (T)
    #define r_cast(T) (T)

    Then I make sure that the variable cast in the C is parenthesized and
    immediately follows the s_cast or r_cast macro, with no whitespace.

    int* pi;
    void* pv;
    pv = s_cast(void*)(pi);

    Then, for C++ define s_cast:

    #define s_cast(T) static_cast<T>

    OK, then I get to thinking about memory and memory allocation. I got
    some ideas for it from the "CTips" web site,
    http://users.bestweb.net/~ctips/tip044.html . Consider:

    #ifndef h__fc_facility_memory_h
    #define h__fc_facility_memory_h

    #include "fc_typedefs.h"

    /*
    These are macros, use them to call the heap memory functions.
    */

    #define fc_mem_allocate_1(V,T) V = s_cast(T*)(fc_mem_fc_malloc(
    sizeof(T) ) )
    #define fc_mem_allocate_n(V,T,C) V = s_cast(T*)(fc_mem_fc_malloc(
    sizeof(T) * (C) ) )
    #define fc_mem_release_1(V,T) fc_mem_fc_free(V)
    #define fc_mem_release_n(V,T) fc_mem_fc_free(V)
    #define fc_mem_reallocate(V,T,O,N) s_cast(T*)(fc_mem_fc_realloc(V,
    sizeof(T) * (N) ) )

    /*
    These are macros, use them to call the heap memory functions.
    */

    #define fc_mem_fc_malloc(size) fc_mem_malloc(size)
    #define fc_mem_fc_free(memory) fc_mem_free(memory)
    #define fc_mem_fc_realloc(memory, size) fc_mem_realloc(memory, size)

    /*
    These are macros, use them to call the stack allocation function.
    */

    #include <stdlib.h> /* alloca is non-standard: often alloca.h, or malloc.h */

    #define fc_mem_fc_stackalloc(size) alloca(size)
    #define fc_mem_fc_stackfree(memory)

    /*
    These are default heap allocation functions, eg malloc, free, and
    realloc.
    They do not initialize the memory.
    */

    fc_ptr_t fc_mem_malloc(fc_size_t size);
    void fc_mem_free(fc_ptr_t memory);
    fc_ptr_t fc_mem_realloc(fc_ptr_t memory, fc_size_t size);


    fc_ptr_t fc_mem_memset(fc_ptr_t dest, fc_size_t size, fc_uint8 value);
    fc_ptr_t fc_mem_memzero(fc_ptr_t dest, fc_size_t size);
    fc_ptr_t fc_mem_memcpy(fc_ptr_t dest, const fc_ptr_t src, fc_size_t
    size);
    fc_ptr_t fc_mem_memmove(fc_ptr_t dest, const fc_ptr_t src, fc_size_t
    size);
    fc_ptr_t fc_mem_memccpy(fc_ptr_t dest, const fc_ptr_t src, fc_uint8
    value, fc_size_t size);

    #endif /* h__fc_facility_memory_h */


    I'm concerned about a macro, eg fc_mem_allocate_1, calling a macro, eg
    fc_mem_fc_malloc, and wonder about in general what the order of those
    macro definitions should be.
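    On the question of one macro calling another: the order of the
    #defines does not matter, because a macro's body is rescanned for
    further macro names only at the point of use, by which time every
    definition in the header is visible. A minimal sketch, with fc_alloc
    and fc_alloc_raw as illustrative stand-ins for the fc_mem_* macros:

```c
#include <stdlib.h>

/* fc_alloc's body mentions fc_alloc_raw before fc_alloc_raw is
 * defined, and that is fine: the body is not examined for macro names
 * when fc_alloc is defined, only when it is invoked, and by then both
 * definitions exist. */
#define fc_alloc(V, T) ((V) = (T *)fc_alloc_raw(sizeof(T)))
#define fc_alloc_raw(size) malloc(size)
```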

    Anyways the allocate_1 and allocate_n and release_1 and release_n given
    the variable and type are also useful for the C++ compatibility,
    because they can be defined to use new and delete or new[] and delete[]
    and make use of the library user's type allocators, with more macros
    for type-specialized allocations.

    http://www.informit.com/guides/content.asp?g=cplusplus&seqNum=40&rl=1

    That gets into considerations of things like handling NULL return
    values from malloc or handling exceptions in the memory or resource
    allocation. I get to thinking about pointers to data vis-a-vis
    container, and template functions, and having the C library be strictly
    C++ compatible yet standard ISO C.

    For example, for C++,

    #define mem_allocate_1(V,T) V = new T
    #define mem_allocate_n(V,T,C) V = new T[C]
    #define mem_release_1(V,T) delete V
    #define mem_release_n(V,T) delete[] V

    or
    #define mem_allocate_n(V,T,C) try { \
    V = s_cast(T*)(new T[C]); \
    } catch (const std::bad_alloc&){ \
    V = NULL; \
    } (void)0
    #define mem_release_n(V,T) delete[] V
    char* cp;
    mem_allocate_n(cp, char, 100);
    mem_release_n(cp, char);

    and implementing realloc. Then, there's still the problem of handling
    the NULL return value. Those are only good for empty constructors,
    the point being that nothing in C has any concept of a constructor.
    Consider http://www.scs.cs.nyu.edu/~dm/c-new.html .

    Then, just implement the C functions on pointers to memory instead of C
    streams of some sort, that could just be file-backed pointers to
    memory, except for sockets, and define template functions that accept
    generic containers that use the exact same implementation.

    Thank you,

    Ross F.


    #ifndef h__fc_typedefs_h
    #define h__fc_typedefs_h

    #include <stddef.h> /* wchar_t */

    #include <limits.h>
    #if CHAR_BIT != 8
    #error This application is supported only on systems with CHAR_BIT==8
    #endif

    #define fc_restrict restrict

    #define fc_cpp_extern

    #define PLATFORM_BYTE_ORDER be

    /** This is the type of the generic data pointer. */
    typedef void* fc_ptr_t;

    /** This is the scalar unsigned integer type of the generic data
    pointer. */
    typedef unsigned int fc_ptr_int_t;

    /** This is the size of memory blocks in bytes, sizeof's size_t. */
    typedef unsigned int fc_size_t;

    /** This should be the unsigned word size of the processor, 32, 64,
    128, ..., eg register_t, and not subject to right shift sign extension.
    */
    /* typedef unsigned int fc_word_t; */
    typedef unsigned int fc_word_t;

    /** This should be the signed word size of the processor, 32, 64, 128,
    ...., eg register_t. */
    typedef int fc_sword_t;

    /** This is sizeof(fc_word_t) * 8, the processor register word width in
    bits. */
    #define fc_word_width 32

    /** This macro suffixes the function identifier with the word width
    type size identifier. */
    #define fc_word_width_suffix(x) x##_32

    /** This is fc_word_width - 1. */
    #define fc_word_max_shift 31

    /** This is the number of bits in a byte, eg CHAR_BIT, and must be 8.
    */
    #define fc_bits_in_byte 8

    /** This is fc_word_width / fc_bits_in_byte -1, eg 32/8 -1 = 3, 64/8 -
    1 = 7. */
    #define fc_align_word_mask (0x00000003)

    /** This is the literal suffix for literals 0x12345678 of fc_word_t, eg
    empty or UL, then ULL, ..., and should probably be unchanged. */
    #define UL UL

    /** This is the default unsigned scalar integer type. */
    typedef unsigned int fc_uint_t;
    /** This is the default signed scalar integer type. */
    typedef int fc_int_t;

    /** This type is returned from functions as error or status code. */
    typedef fc_int_t fc_ret_t;

    /** This is the type of the symbol from the coding alphabet, >= 8 bits
    wide, sizeof == 1. */
    typedef unsigned char fc_symbol_t;

    /** This is a character type for use with standard functions. */
    typedef char fc_char;
    /** This is a wide character type for use with standard functions. */
    typedef wchar_t fc_wchar_t;

    /** This is a fixed-width scalar type. */
    typedef char fc_int8;
    /** This is a fixed-width scalar type. */
    typedef unsigned char fc_uint8;
    /** This is a fixed-width scalar type. */
    typedef short fc_int16;
    /** This is a fixed-width scalar type. */
    typedef unsigned short fc_uint16;
    /** This is a fixed-width scalar type. */
    typedef int fc_int32;
    /** This is a fixed-width scalar type. */
    typedef unsigned int fc_uint32;

    #if fc_word_width >= 64

    /** This is a fixed-width scalar type. */
    typedef long long fc_int64;
    /** This is a fixed-width scalar type. */
    typedef unsigned long long fc_uint64;

    #endif /* fc_word_width >= 64 */



    #ifndef __cplusplus
    #ifndef NULL
    #define NULL ((void*)0)
    #endif
    /** This is the static cast for void pointer to typed pointer, typed
    pointer to void pointer, or integer to integer. */
    #define s_cast(T) (T)
    /** This is the reinterpret cast for pointer to integer or integer to
    pointer. */
    #define r_cast(T) (T)
    /** This is a Boolean type. */
    typedef fc_word_t fc_bool_t;
    /** This is true. */
    #define FC_TRUE 1
    /** This is false. */
    #define FC_FALSE 0

    #endif

    #ifdef __cplusplus
    #ifndef NULL
    #define NULL 0
    #endif
    /** This is the static cast for void pointer to typed pointer, typed
    pointer to void pointer, or integer to integer. */
    #define s_cast(T) static_cast<T>
    /** This is the reinterpret cast for pointer to integer or integer to
    pointer. */
    #define r_cast(T) reinterpret_cast<T>
    /** This is a Boolean type. */
    typedef bool fc_bool_t;
    /** This is true. */
    #define FC_TRUE true
    /** This is false. */
    #define FC_FALSE false
    #endif

    #endif /* h__fc_typedefs_h */
    Ross A. Finlayson, Mar 5, 2005
    #14
  15. Hi,

    I have some questions about struct tags.

    I think some old compilers require struct tags.

    typedef struct tagX_t{


    } X_t;

    For something like that, a function might be defined, or rather,
    declared, as either:

    f( struct tagX_t x);

    or

    f(X_t x);

    I read a good article on embedded.com,
    http://www.embedded.com/showArticle.jhtml?articleID=9900748 , that
    says that basically the tag and type name are in separate scopes so
    something along the lines of:

    typedef struct X_t X_t;
    struct X_t{

    };

    or

    typedef struct X_t {

    } X_t;

    are OK. For a self-referential struct, the tag has to be declared, at
    least as an incomplete type, before it is used in the definition itself.

    typedef struct X_t{
    struct X_t* prev;
    struct X_t* next;
    struct X_t* parent;
    } X_t;

    Now generally I'm using plain old data structs,

    typedef struct {
    int i_1;
    int i_2;
    } X_t;

    void f(X_t x);

    But I wonder about in the gamut of implementations of structure
    definitions whether that is always going to compile correctly. So, in
    paranoia and as a form of procrastination I define these macros:

    #define fc_struct_tag(x) x##_struct_tag

    and then have something along the lines of

    typedef struct fc_struct_tag(X_t){

    } X_t;

    Then, also something along the lines of

    #define fc_struct(x) x

    But I could replace that with

    #define fc_struct(x) struct x##_struct_tag

    So, that article seems to make clear that

    #define fc_struct_tag(x)
    #define fc_struct(x) x

    typedef struct fc_struct_tag(X_t){
    int i_1;
    int i_2;
    } X_t;

    void f( fc_struct(X_t)* pod);

    would be OK, and the above varying definitions of those things compile
    fine. What I wonder is if any C, or C++, compiler would barf on one of
    the above constructs or would need more elaboration, ie another
    definition for the typedef or pointer type, or whether it's a waste of
    time. Also I wonder about conditions where the tag needs to be the
    same as, or different from, the typedef name, and about nesting
    macros.

    I seem to recall that if you toss a struct to MSVC it wants the tag.

    You're right!

    Heh, Not Con(ZF).

    Thank you,

    Ross F.

    --
    "It's the smallest infinitesimal, Russell,
    there are smaller infinitesimals."
    Ross A. Finlayson, Mar 9, 2005
    #15
  16. Ross A. Finlayson

    Michael Mair Guest

    Structure tags and typedefs (was: Variadic functions calling variadic functions with the argument list, HLL bit shifts on LE processors)

    First off, if you have a new question which has nothing to do
    with an existing thread, please start a new thread.
    If there is a weak dependence on the old, change at least the
    subject line.

    Ross A. Finlayson wrote:
    > Hi,
    >
    > I have some questions about struct tags.
    >
    > I think some old compilers require struct tags.
    >
    > typedef struct tagX_t{
    >
    >
    > } X_t;
    >
    > For something like that, a function might be defined, or rather,
    > declared, as either:
    >
    > f( struct tagX_t x);
    >
    > or
    >
    > f(X_t x);
    >
    > I read a good article on embedded.com,
    > http://www.embedded.com/showArticle.jhtml?articleID=9900748 , that
    > says that basically the tag and type name are in separate scopes


    Structure, union and enumeration tags have one namespace of
    their own (I repeat: one namespace for all three kinds of tags).

    > so
    > something along the lines of:
    >
    > typedef struct X_t X_t;
    > struct X_t{
    >
    > };
    >
    > or
    >
    > typedef struct X_t {
    >
    > } X_t;
    >
    > are OK. For a self-referential struct then it has to be defined, even
    > as a partial type, before use in self-same definition.
    >
    > typedef struct X_t{
    > struct X_t* prev;
    > struct X_t* next;
    > struct X_t* parent;
    > } X_t;
    >
    > Now generally I'm using plain old data structs,
    >
    > typedef struct {
    > int i_1;
    > int i_2;
    > } X_t;
    >
    > void f(X_t x);
    >
    > But I wonder about in the gamut of implementations of structure
    > definitions whether that is always going to compile correctly.


    Any compiler conforming to either C89 or C99 will accept this.
    If you really have to use a compiler which does not accept the
    above then you probably will have trouble with many other things,
    too.

    > So, in
    > paranoia and as a form of procrastination I define these macros:
    >
    > #define fc_struct_tag(x) x##_struct_tag
    >
    > and then have something along the lines of
    >
    > typedef struct fc_struct_tag(X_t){
    >
    > } X_t;
    >
    > Then, also something along the lines of
    >
    > #define fc_struct(x) x


    This is unnecessary and a strange kind of procrastination as you
    always have to type the rather lengthy fc_struct_tag()...
    In addition, it does not help w.r.t. clearness or maintainability.

    >
    > But I could replace that with
    >
    > #define fc_struct(x) struct x##_struct_tag
    >
    > So, that article seems to make clear that
    >
    > #define fc_struct_tag(x)
    > #define fc_struct(x) x
    >
    > typedef struct fc_struct_tag(X_t){
    > int i_1;
    > int i_2;
    > } X_t;
    >
    > void f( fc_struct(X_t)* pod);
    >
    > would be OK, and it compiles fine the above varying definitions of
    > those things.


    Yep, as is to be expected.


    > What I wonder is if any C, or C++, compiler would be
    > going to barf on one of the above constructs or would need more
    > elaboration, ie another definition for the typedef or pointer type, or
    > it's a waste of time.


    It is a complete waste of time.
    <OT>C++ does not need the typedef struct X_t X_t; but accepts it.</OT>

    > Also I wonder about conditions where the tag
    > needs to be the same or different from the typedef name,


    The only thing I can come up with are programming standards imposed
    from without.

    > and nesting
    > macros.


    What do you mean by that?


    > I seem to recall that if you toss a struct to MSVC it wants the tag.


    This would astonish me somewhat. However, compilers are free to
    output any warnings they like.


    Cheers
    Michael
    --
    E-Mail: Mine is an /at/ gmx /dot/ de address.
    Michael Mair, Mar 10, 2005
    #16
  17. "Ross A. Finlayson" <> wrote in message
    news:...
    > Hi,
    >
    > I have some questions about struct tags.


    Originally the Tag field was there so you could reference the structure
    being defined.

    ex:
    typedef struct TagNode{
    int data;
    TagNode * next;
    }

    A lot of compilers required it to be there. This was a bug in Borland
    (6?) that was fixed when 32 bit pmode support came out.

    Most compilers don't require the typedef, it's redundant, and provide
    the tag field as an optional component.

    Look up the doc for your compiler to see how it likes it.

    dan
    DHOLLINGSWORTH2, Mar 10, 2005
    #17
  18. Ross A. Finlayson

    infobahn Guest

    Re: Variadic functions calling variadic functions with the argument list, HLL bit shifts on LE processors

    DHOLLINGSWORTH2 wrote:
    >
    > "Ross A. Finlayson" <> wrote in message
    > news:...
    > > Hi,
    > >
    > > I have some questions about struct tags.

    >
    > Originally the Tag field was there so you could reference the structure
    > being defined.
    >
    > ex:
    > typedef struct TagNode{
    > int data;
    > TagNode * next;
    > }


    A conforming implementation is required to diagnose this code.

    Translation: your definition is broken.
    infobahn, Mar 10, 2005
    #18
  19. Re: Structure tags and typedefs (was: Variadic functions calling variadic functions with the argument list, HLL bit shifts on LE processors)

    Hi Michael.

    I wonder if there are implementation limits of the macros, particularly
    with regards to macros defined in various places, and then used
    together.

    #define a(x) x
    #define b(x) a(x)
    #define c(x) b(x)

    With that, for sure c(x) == x, and the preprocessor replaces each
    instance of c(x) with b(x), and that with a(x), and that with x. I
    wonder if that is so for pretty much any implementation of the C
    preprocessor, for any order of those definitions, and for any finite
    depth of those things, ie with no cycles like #define a(x) c(x), which
    I believe gcc will notice, since macros are not expanded recursively.

    These struct macros are basically library-internal; I wouldn't want to
    inflict them upon others in terms of
    usability/readability/maintainability, and I do wonder if they have any
    meaning. It just seems that somewhere there's a C compiler that has
    that feature broken.

    The concept of struct macros could be expanded somewhat: defining
    objects that under C compilation become a struct plus initializer
    definitions, and under C++ compile as classes whose constructors supply
    the default values, while staying plain old data. That would quickly
    become complicated, since for a C program the C memory allocation
    function would have to call the initializer.

    Procrastination means to some extent overgeneralization. Design can go
    on forever, at some point I need to stop and implement. For example,
    in a month, I only have four or five thousand lines in this program in
    thirty some files, most of that is generated, and it's _still_ not
    done.

    Anyways, I do wonder if anyone else knows a reason to keep the struct
    tag semantic and the varying use of struct tagX_t vis-a-vis X_t. We
    know how it is implemented in modern, standard compilers, but it is
    not easily shown that no broken, soi-disant C compiler that somebody
    still needs gets that feature wrong.

    As an extension of this thread, basically I'm trying to isolate all
    platform variabilities for this set of C library functions into a
    single point-of-change file, along with the abstractions or
    redefinitions needed for the exact same source code to be conformant
    ANSI C, ISO C, and standard C++. At the same time, I think there are
    widespread compiler variabilities, not all of which are easily
    recognized or addressed, and I wonder whether the struct tag and type
    name issue is one of them. So that's why I didn't start a new thread,
    although I should have searched for background on the question before
    posting about it.

    http://groups-beta.google.com/group/comp.lang.c/search?group=comp.lang.c&q=struct tag type name
    http://groups-beta.google.com/group/comp.lang.c/search?group=comp.lang.c&q=struct tag typename

    It adds drag to the code writer, in this case myself, but if it ever
    needed to be modified by any other user, for use of the external
    functions, they would be set to go. As it is I can pretty much do a
    global search-and-replace to remove these vestigial things. I'd like
    it to be one of those "active" components.

    Anyways what I really would like to know are more of these kinds of
    things about dialects of C or C++ compatibility that should be
    considered in the design.

    Warm regards,

    Ross F.
    --
    "That is a description of iota."
    Ross A. Finlayson, Mar 10, 2005
    #19
  20. Re: Variadic functions calling variadic functions with the argument list, HLL bit shifts on LE processors

    "Ross A. Finlayson" <> writes:
    > I have some questions about struct tags.
    >
    > I think some old compilers require struct tags.
    >
    > typedef struct tagX_t{
    >
    >
    > } X_t;
    >
    > For something like that, a function might be defined, or rather,
    > declared, as either:
    >
    > f( struct tagX_t x);
    >
    > or
    >
    > f(X_t x);


    Each of the following is legal:

    typedef struct foo {
        int n;
    } foo_t;

    typedef struct {
        int n;
    } foo_t;

    struct foo {
        int n;
    };

    struct {
        int n;
    };

    In each case, "foo_t" (if present) is a typedef name, and "foo" (if
    present) is a struct tag. The struct declaration creates a structure
    type; the typedef creates an alias for the existing structure type.

    The type may be referred to either as "foo_t" or as "struct foo".
    Within the structure declaration, the name "foo_t" isn't visible,
    since it hasn't been declared yet, but you can still make a pointer to
    the type using "struct foo *" (if the tag is present). The type cannot be
    referred to as just "foo" in C, <OT>though it can in C++</OT>.

    Note the last declaration probably isn't very useful, since it doesn't
    give you a way to refer to the type. It might be used as part of an
    object declaration:

    struct {
        int n;
    } obj;

    but that's fairly unusual.

    You can also create the typedef before declaring the structure type:

    typedef struct foo foo_t;

    struct foo {
        int n;
    };

    As a matter of style, some would argue that the typedef is often a bad
    idea, and that it's better to use the more explicit "struct foo". The
    advantage of this is that it's more explicit about the fact that
    you're dealing with a structure type. The disadvantage of this is
    that it's more explicit about the fact that you're dealing with a
    structure type. If you're writing original code, pick a style and
    stick to it; if you're reading or maintaining code, be prepared to
    cope with either style.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Mar 10, 2005
    #20