Alignment issues -- are they an issue?

Discussion in 'C Programming' started by Tomás Ó hÉilidhe, Sep 22, 2008.

  1. I'm doing low-level networking programming at the moment writing my
    own Ethernet frames, so I start off with the Destination MAC address,
    then Source MAC address, then Protocol ID, then I have the IP packet,
    then the UDP segment, and so forth.

    The networking library I'm using is called Berkeley Sockets; I've
    decided to go with it because I hear it's the most ubiquitous
    networking library across all platforms.

    Anyway, although I want my program to be as portable as possible, I
    realise that it will only be portable to systems which have an
    implementation of Berkeley Sockets, and also which have an exact 8-Bit
    type, a 16-Bit type and a 32-Bit type (all without padding). I get
    these types from stdint.h:

    #include <stdint.h>

    int VerifyPacketChecksum(uint8_t const *packet);

    Throughout my code though, there are a few instances in which I deal
    with taking 16-Bit numbers from an Ethernet frame. I know that one
    possible method of doing this would be:

    (p[0] << 8) | p[1]

    But at the moment I have the following in my code:

    ntohs( *(uint16_t const*)p )

    ("ntohs" is a function which converts from network byte order to host
    byte order)

    It's possible that "p" will not be aligned on a two-byte boundary, but
    I'm wondering if I'll have a problem? I realise that the C Standard
    says outright that the behaviour is undefined if alignment
    requirements are not met... but seeing as how I've already made
    assumptions about there being an 8-Bit, 16-Bit and 32-Bit type, would
    it not be also fair to assume that I can access a uint16_t regardless
    of how it's aligned?

    I suppose in essence what I'm asking is as follows: On the systems
    where Berkeley Sockets is implemented, and where there are exact 8-
    Bit, 16-Bit and 32-Bit types, is it OK to read or write a uint16_t
    from memory regardless of the alignment? The main platforms I have in
    mind are Windows, Linux, Mac, Unix, Solaris, and also possibly XBox360
    and Playstation 3.

    Or should I just go with (p[0] << 8) | p[1] to be safe?
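
    For comparison, here is a minimal sketch of two alignment-safe ways to
    read a big-endian 16-bit field from a byte buffer. The helper names
    read_be16_shift and read_be16_memcpy are made up for illustration, and
    the second one assumes a POSIX system that declares ntohs in
    <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* ntohs (POSIX; an assumption about the platform) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: assemble the value from individual bytes.
   Works for any alignment and any host byte order. */
static uint16_t read_be16_shift(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Hypothetical helper: copy the two bytes into a properly aligned
   uint16_t first, then convert from network to host byte order. */
static uint16_t read_be16_memcpy(const uint8_t *p)
{
    uint16_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return ntohs(v);
}
```

    Neither version dereferences a uint16_t pointer into the frame, so
    neither depends on how "p" happens to be aligned.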
     
    Tomás Ó hÉilidhe, Sep 22, 2008
    #1

  2. On 22 Sep, 09:58, Tomás Ó hÉilidhe <> wrote:

    > I'm doing low-level networking programming at the moment writing my
    > own Ethernet frames, so I start off with the Destination MAC address,
    > then Source MAC address, then Protocol ID, then I have the IP packet,
    > then the UDP segment, and so forth.


    TCP/IP (I'm including UDP in that) headers are arranged so
    that on most sane architectures you don't have alignment issues.

    I think they try to align on 32-bit boundaries. Tricky for 9-bit
    bytes and 36-bit words but there you are. TCP/IP pretty well assumes
    you can generate (and receive) a stream of octets (8-bit bytes)
    somehow. It specifies what it's going to look like on-the-wire.
    How you represent it in your program is your problem (or in the
    real world, the socket library's problem).


    > The networking library I'm using is called Berkeley Sockets; I've
    > decided to go with it because I hear it's the most ubiquitous
    > networking library across all platforms


    yes, probably


    > Anyway, although I want my program to be as portable as possible, I
    > realise that it will only be portable to systems which have an
    > implementation of Berkeley Sockets, and also which have an exact 8-Bit
    > type, a 16-Bit type and a 32-Bit type (all without padding). I get
    > these types from stdint.h:


    I'm not sure that's necessary. Doesn't the socket library
    provide the appropriate struct declarations?

    If you don't have a Berkeley Sockets implementation then you
    have a little more to worry about than alignment. Like writing
    the whole UDP stack...


    >     #include <stdint.h>


    if this doesn't exist you can (usually) hack one together

    >     int VerifyPacketChecksum(uint8_t const *packet);
    >
    > Throughout my code though, there are a few instances in which I deal
    > with taking 16-Bit numbers from an Ethernet frame.


    you *are* writing your own UDP stack?


    > I know that one
    > possible method of doing this would be:
    >
    >     (p[0] << 8) | p[1]
    >
    > But at the moment I have the following in my code:
    >
    >     ntohs(    *(uint16_t const*)p    )
    >
    > ("ntohs" is a function which converts from network byte order to host
    > byte order)
    >
    > It's possible that "p" will not be aligned on a two-byte boundary, but
    > I'm wondering if I'll have a problem?


    I have seen code blow up on this. A well written UDP implementation
    should be ok with this. This is why I'm wondering if you are the
    implementor.

    > I realise that the C Standard
    > says outright that the behaviour is undefined if alignment
    > requirements are not met... but seeing as how I've already made
    > assumptions about there being an 8-Bit, 16-Bit and 32-Bit type,


    which I think is a bad idea

    > would
    > it not be also fair to assume that I can access a uint16_t regardless
    > of how it's aligned?


    no. Really no.


    > I suppose in essence what I'm asking is as follows: On the systems
    > where Berkeley Sockets is implemented, and where there are exact 8-
    > Bit, 16-Bit and 32-Bit types, is it OK to read or write a uint16_t
    > from memory regardless of the alignment?


    the old (ok, stone-age) 68000 used to do this. It had 16 and 32 bit
    types (well the compiler did) but they had to be aligned, or it
    trapped. I suspect some modern RISC chips still do this.


    > The main platforms I have in
    > mind are Windows, Linux, Mac, Unix, Solaris, and also possibly XBox360
    > and Playstation 3.
    >
    > Or should I just go with (p[0] << 8) | p[1] to be safe?


    I go with shift and or. It isn't much more code!

    After we had the alignment trap we put this code in

    p[0] << 8 + p[1]

    which was "interesting"
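
    The reason that line is "interesting": in C, binary + binds more
    tightly than <<, so it parses as p[0] << (8 + p[1]). A small sketch of
    the difference follows; the function names are hypothetical, and the
    parsed form is written with explicit parentheses so that compilers do
    not warn about it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What was meant: high byte shifted up, low byte OR-ed in. */
static uint32_t be16_as_intended(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 8) | p[1];
}

/* How "p[0] << 8 + p[1]" is actually parsed: the shift count becomes
   8 + p[1]. The assert keeps the count below 32, because a larger
   shift of a 32-bit value would be undefined behaviour. */
static uint32_t be16_as_parsed(const uint8_t *p)
{
    assert(p != NULL);
    assert(p[1] < 24);
    return (uint32_t)p[0] << (8 + p[1]);
}
```

    For the bytes 0x01 0x02 the intended form gives 0x0102, while the
    parsed form gives 1 << 10.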


    --
    Nick Keighley

    "I don't skydive; I don't bungee;
    I don't go on rollercoasters, they scare me to death."
    Col. Eileen Collins (Shuttle Pilot)
     
    Nick Keighley, Sep 22, 2008
    #2

  3. On Sep 22, 10:45 am, Nick Keighley <>
    wrote:

    > TCP/IP (I'm including UDP in that) headers are arranged so
    > that on most sane architectures you don't have alignment issues.



    Only problem is that the entire frame in memory might have strange
    alignment.


    > you *are* writing your own UDP stack?



    Yup!


    > I have seen code blow up on this. A well written UDP implementation
    > should be ok with this. This is why I'm wondering if you are the
    > implementor.



    I'm writing a program at the moment which maps a network, telling you
    all the hosts present, and also telling you which routers (if any)
    lead to the internet. I test for an internet connection by sending DNS
    request packets to the MAC address of each of the hosts and seeing if
    I get a reply. I hand-craft these DNS requests by myself, sending them
    to different MAC addresses.


    > I go with shift and or. It isn't much more code!
    >
    > After we had the alignment trap we put this code in
    >
    >     p[0] << 8 + p[1]
    >
    > which was "interesting"



    I wonder if it'd be too "distancing" to do the following for now:

    #define GET_S(p) (p[0] << 8 | p[1])

    But of course then I've the issue of multiple evaluation of the macro
    argument.
     
    Tomás Ó hÉilidhe, Sep 22, 2008
    #3
  4. Guest

    On Sep 22, 3:58 am, Tomás Ó hÉilidhe <> wrote:
    > It's possible that "p" will not be aligned on a two-byte boundary, but
    > I'm wondering if I'll have a problem? I realise that the C Standard
    > says outright that the behaviour is undefined if alignment
    > requirements are not met... but seeing as how I've already made
    > assumptions about there being an 8-Bit, 16-Bit and 32-Bit type, would
    > it not be also fair to assume that I can access a uint16_t regardless
    > of how it's aligned?
    >
    > I suppose in essence what I'm asking is as follows: On the systems
    > where Berkeley Sockets is implemented, and where there are exact 8-
    > Bit, 16-Bit and 32-Bit types, is it OK to read or write a uint16_t
    > from memory regardless of the alignment? The main platforms I have in
    > mind are Windows, Linux, Mac, Unix, Solaris, and also possibly XBox360
    > and Playstation 3.
    >
    > Or should I just go with (p[0] << 8) | p[1] to be safe?



    There are absolutely systems where an unaligned access will, or can,
    fault. PPC (or POWER) and IPF (Itanium) are both examples. In both
    of those cases, there are *some* unaligned accesses that can succeed (in
    the case of PPC, so long as you don't cross a page boundary*, or so
    long as you don't cross a cache line on IPF). Other processors have
    had no direct unaligned support at all (Alpha, for example).

    And all three of those systems happen to support 8, 16 and 32 bit
    types.

    For CPUs that don’t support unaligned accesses, or restrict them,
    there are often some instructions meant for faking it by doing two
    accesses and then pasting the result together. And of course you can
    always fake it with a series of accesses, shifts, and whatnot. You
    can sometimes convince a C compiler, with an implementation-specific
    extension telling it that an item is not aligned, to generate that
    sequence.

    OTOH, many OS's catch the unaligned access traps, and fake it for you,
    but that's at a severe performance penalty - usually on the order of
    100 times what a non-trapping access costs you. You can take
    advantage of that if unaligned accesses are rare, especially on
    systems like PPC, where many unaligned accesses are, in fact, handled
    quickly by the hardware, but you do *not* want it to happen often.

    Of course in at least two OS's I know of, the fixup is *not* available
    to kernel code, so if you're writing something that lives there (which
    you might well be doing, since you're writing a TCP/IP stack - OTOH,
    you're using Sockets, so you might be doing a user level stack), that
    may not be an option.

    I would advise you to use the shift/or approach. You can always
    optimize later. Perhaps you can wrap it in a macro or function that
    you can easily change. And a nicely named macro or function is
    probably clearer than scattering those shift/or sequences all over
    your code. And before you go optimizing this, see what the compiler
    is actually generating - in many cases, if you're on a big-endian CPU,
    and unaligned accesses work naturally, the compiler will optimize that
    down to a single instruction anyway.


    *There are some cases where you can cross a page boundary with an
    unaligned access on PPC, assuming several serious restrictions are
    met, and depending on the implementation.
     
    , Sep 22, 2008
    #4
  5. "" <> writes:
    [...]
    > There are absolutely systems where an unaligned access will, or can,
    > fault. PPC (or POWER) and IPF (Itanium) are both examples. In both
    > of those cases, there are *some* unaligned accesses that can succeed (in
    > the case of PPC, so long as you don't cross a page boundary*, or so
    > long as you don't cross a cache line on IPF). Other processors have
    > had no direct unaligned support at all (Alpha, for example).

    [...]

    I've seen at least one system (I think it was the old RS/6000) where
    an attempted unaligned access appears to succeed, but the low-order
    bit of the address is silently ignored.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Sep 22, 2008
    #5
  6. Flash Gordon (Guest)

    Keith Thompson wrote, On 22/09/08 18:01:
    > "" <> writes:
    > [...]
    >> There are absolutely systems where an unaligned access will, or can,
    >> fault. PPC (or POWER) and IPF (Itanium) are both examples. In both
    >> of those cases, there are *some* unaligned accesses that can succeed (in
    >> the case of PPC, so long as you don't cross a page boundary*, or so
    >> long as you don't cross a cache line on IPF). Other processors have
    >> had no direct unaligned support at all (Alpha, for example).

    > [...]
    >
    > I've seen at least one system (I think it was the old RS/6000) where
    > an attempted unaligned access appears to succeed, but the low-order
    > bit of the address is silently ignored.


    I had an old RS/6000 until recently, and my company still has 4 of them
    (one is live with a 233MHz processor, the other 3 will be a backup at my
    office and DR and backup at a satellite office). All of them running AIX
    4.3 and SW which is still under active maintenance. So such machines are
    not dead yet!
    --
    Flash Gordon
    If spamming me sent it to
    If emailing me use my reply-to address
    See the comp.lang.c Wiki hosted by me at http://clc-wiki.net/
     
    Flash Gordon, Sep 22, 2008
    #6
  7. Tim Prince (Guest)

    Keith Thompson wrote:
    > "" <> writes:
    > [...]
    >> There are absolutely systems where an unaligned access will, or can,
    >> fault. PPC (or POWER) and IPF (Itanium) are both examples. In both
    >> of those cases, there are *some* unaligned accesses that can succeed (in
    >> the case of PPC, so long as you don't cross a page boundary*, or so
    >> long as you don't cross a cache line on IPF). Other processors have
    >> had no direct unaligned support at all (Alpha, for example).

    > [...]
    >
    > I've seen at least one system (I think it was the old RS/6000) where
    > an attempted unaligned access appears to succeed, but the low-order
    > bit of the address is silently ignored.
    >

    Before that, the GeCOS systems ignored enough low order bits to make an
    effectively aligned address, and provided the data from that address.
    As to the Itanium, possibly depending on the OS, there is optional
    support for trapping to a function which fixes up a mis-aligned access,
    but the performance penalty is prohibitive (even when no mis-alignment
    occurs).
    And then, since the introduction of SSE instructions about 10 years ago,
    for "mov" operations, there are both aligned instructions, which fault
    on mis-alignment, and unaligned instructions, which support various
    degrees of mis-alignment, with varying degrees of performance penalties.
     
    Tim Prince, Sep 22, 2008
    #7
  8. Ian Collins (Guest)

    Tomás Ó hÉilidhe wrote:
    >
    > I wonder if it'd be too "distancing" to do the following for now:
    >
    > #define GET_S(p) (p[0] << 8 | p[1])
    >
    > But of course then I've the issue of multiple evaluation of the macro
    > argument.


    Use a function.
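
    A minimal sketch of why the function is the safer choice. GET_S is the
    macro from the post above; get_s, next_field, field_calls and
    field_bytes are hypothetical names introduced only to make the double
    evaluation visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The macro from the thread: its argument is expanded, and therefore
   evaluated, twice. */
#define GET_S(p) (p[0] << 8 | p[1])

/* Function alternative: the argument is evaluated exactly once. */
static uint16_t get_s(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Hypothetical argument expression with a side effect: it counts how
   many times it has been called. */
static int field_calls = 0;
static const uint8_t field_bytes[2] = {0x12, 0x34};

static const uint8_t *next_field(void)
{
    ++field_calls;
    return field_bytes;
}
```

    GET_S(next_field()) calls next_field twice, while get_s(next_field())
    calls it once. The macro as written also lacks parentheses around "p",
    so an argument such as "buf + 2" would expand incorrectly.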

    --
    Ian Collins.
     
    Ian Collins, Sep 22, 2008
    #8
  9. On Mon, 22 Sep 2008 01:58:21 -0700 (PDT), Tomás Ó hÉilidhe
    <> wrote:

    >I'm doing low-level networking programming at the moment writing my
    >own Ethernet frames, so I start off with the Destination MAC address,
    >then Source MAC address, then Protocol ID, then I have the IP packet,
    >then the UDP segment, and so forth.
    >
    >The networking library I'm using is called Berkeley Sockets; I've
    >decided to go with it because I hear it's the most ubiquitous
    >networking library across all platforms.
    >
    >Anyway, although I want my program to be as portable as possible, I
    >realise that it will only be portable to systems which have an
    >implementation of Berkeley Sockets, and also which have an exact 8-Bit
    >type, a 16-Bit type and a 32-Bit type (all without padding). I get
    >these types from stdint.h:
    >
    > #include <stdint.h>
    >
    > int VerifyPacketChecksum(uint8_t const *packet);
    >
    >Throughout my code though, there are a few instances in which I deal
    >with taking 16-Bit numbers from an Ethernet frame. I know that one
    >possible method of doing this would be:
    >
    > (p[0] << 8) | p[1]
    >
    >But at the moment I have the following in my code:
    >
    > ntohs( *(uint16_t const*)p )
    >
    >("ntohs" is a function which converts from network byte order to host
    >byte order)
    >
    >It's possible that "p" will not be aligned on a two-byte boundary, but
    >I'm wondering if I'll have a problem? I realise that the C Standard
    >says outright that the behaviour is undefined if alignment
    >requirements are not met... but seeing as how I've already made
    >assumptions about there being an 8-Bit, 16-Bit and 32-Bit type, would
    >it not be also fair to assume that I can access a uint16_t regardless
    >of how it's aligned?
    >
    >I suppose in essence what I'm asking is as follows: On the systems
    >where Berkeley Sockets is implemented, and where there are exact 8-
    >Bit, 16-Bit and 32-Bit types, is it OK to read or write a uint16_t
    >from memory regardless of the alignment? The main platforms I have in
    >mind are Windows, Linux, Mac, Unix, Solaris, and also possibly XBox360
    >and Playstation 3.
    >
    >Or should I just go with (p[0] << 8) | p[1] to be safe?


    Even with all your assumptions, to which I add that your buffer is
    aligned on a multiple of four, the question is - will any 2 byte field
    in any message you need to process for which you would like to use
    ntohs ever be located on an odd boundary? If you can't guarantee that
    the answer will always for ever be "no", you have a problem just
    waiting to happen.
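
    As an illustration of how easily a 16-bit field can land on an odd
    offset, here is a sketch that computes where the QTYPE field of a DNS
    question sits within a frame. It assumes plain Ethernet framing with a
    14-byte header (no VLAN tag), an IPv4 header without options, and a
    single question; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum {
    ETH_HEADER_LEN  = 14, /* 6 dest MAC + 6 source MAC + 2 protocol ID */
    IPV4_HEADER_LEN = 20, /* IPv4 header without options (assumed) */
    UDP_HEADER_LEN  = 8,
    DNS_HEADER_LEN  = 12
};

/* Length of a host name in DNS wire format: each label is preceded by
   a length byte and a zero byte terminates the name, so "a.com" takes
   1+1 + 1+3 + 1 = 7 bytes. Assumes a non-empty name without a trailing
   dot. */
static size_t dns_name_wire_len(const char *name)
{
    assert(name != NULL && name[0] != '\0');
    return strlen(name) + 2;
}

/* Offset of the 16-bit QTYPE field from the start of the frame. */
static size_t qtype_frame_offset(const char *name)
{
    return ETH_HEADER_LEN + IPV4_HEADER_LEN + UDP_HEADER_LEN
         + DNS_HEADER_LEN + dns_name_wire_len(name);
}
```

    For "a.com" the offset is 14 + 20 + 8 + 12 + 7 = 61, which is odd even
    if the buffer itself starts on a multiple of four. For "ab.com" it is
    62, so the alignment depends on the data.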

    --
    Remove del for email
     
    Barry Schwarz, Sep 23, 2008
    #9
  10. Chris Dollin (Guest)

    Flash Gordon wrote:

    > Keith Thompson wrote, On 22/09/08 18:01:
    >> "" <> writes:
    >> [...]
    >>> There are absolutely systems where an unaligned access will, or can,
    >>> fault. PPC (or POWER) and IPF (Itanium) are both examples. In both
    >>> of those cases, there are *some* unaligned accesses that can succeed (in
    >>> the case of PPC, so long as you don't cross a page boundary*, or so
    >>> long as you don't cross a cache line on IPF). Other processors have
    >>> had no direct unaligned support at all (Alpha, for example).

    >> [...]
    >>
    >> I've seen at least one system (I think it was the old RS/6000) where
    >> an attempted unaligned access appears to succeed, but the low-order
    >> bit of the address is silently ignored.

    >
    > I had an old RS/6000 until recently, and my company still has 4 of them
    > (one is live with a 233MHz processor, the other 3 will be a backup at my
    > office and DR and backup at a satellite office). All of them running AIX
    > 4.3 and SW which is still under active maintenance. So such machines are
    > not dead yet!


    The Archimedes and RISC PCs had (or have, or some had or have) hardware
    so that unaligned access to a word at P loaded the whole word from
    (P & ~3) [1] and then rotated it (P & 3) bytes round. I expect you
    can work out why ...

    [1] Because by this time the pointer really is an integer index into
    (mapped) memory.
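
    The behaviour described above can be simulated in portable C. This is
    only a sketch: the post says the word is "rotated (P & 3) bytes round"
    without giving a direction, so the right rotation and the little-endian
    word layout used here are assumptions, and the function names are
    hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit word right by n bytes (only n & 3 is used). */
static uint32_t rotr_bytes(uint32_t w, unsigned n)
{
    unsigned bits = 8u * (n & 3u);
    return bits == 0 ? w : ((w >> bits) | (w << (32u - bits)));
}

/* Simulated word load at byte offset addr within mem: fetch the
   aligned little-endian word at (addr & ~3), then rotate it by
   (addr & 3) bytes. */
static uint32_t rotated_load(const uint8_t *mem, size_t addr)
{
    size_t base = addr & ~(size_t)3;
    uint32_t w;
    assert(mem != NULL);
    w = (uint32_t)mem[base]
      | ((uint32_t)mem[base + 1] << 8)
      | ((uint32_t)mem[base + 2] << 16)
      | ((uint32_t)mem[base + 3] << 24);
    return rotr_bytes(w, (unsigned)(addr & 3));
}
```

    With the bytes 11 22 33 44 55 ... in memory, a load at offset 1 yields
    0x11443322 under these assumptions. The top byte wraps around to byte 0
    instead of being byte 4, so the result is not the value a true
    unaligned read would produce.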

    --
    'It changed the future .. and it changed us.' /Babylon 5/

    Hewlett-Packard Limited registered office: Cain Road, Bracknell, Berks RG12 1HN
    registered no: 690597 England
     
    Chris Dollin, Sep 23, 2008
    #10
  11. Thanks everybody for your helpful replies. At the moment I'm using the
    following:

    uint16_t Get16(uint8_t const *const p)
    {
        return ((uint16_t)(p[0]) << 8) | p[1];
    }

    void Set16(uint8_t *const p,uint16_t const val)
    {
        p[0] = val >> 8;
        p[1] = val & 0xFF;
    }

    uint32_t Get32(uint8_t const *const p)
    {
        return ((uint32_t)(p[0]) << 24) | ((uint32_t)(p[1]) << 16) |
               ((uint32_t)(p[2]) << 8) | p[3];
    }

    void Set32(uint8_t *const p,uint32_t const val)
    {
        p[0] = val >> 24;
        p[1] = val >> 16;
        p[2] = val >> 8;
        p[3] = val;
    }
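
    A possible way to exercise these accessors at arbitrary offsets,
    including odd ones. The four accessors are repeated from the post so
    that the sketch compiles on its own; roundtrip_ok is a hypothetical
    helper, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint16_t Get16(uint8_t const *const p)
{
    return (uint16_t)(((uint16_t)(p[0]) << 8) | p[1]);
}

static void Set16(uint8_t *const p, uint16_t const val)
{
    p[0] = (uint8_t)(val >> 8);
    p[1] = (uint8_t)(val & 0xFF);
}

static uint32_t Get32(uint8_t const *const p)
{
    return ((uint32_t)(p[0]) << 24) | ((uint32_t)(p[1]) << 16) |
           ((uint32_t)(p[2]) << 8) | p[3];
}

static void Set32(uint8_t *const p, uint32_t const val)
{
    p[0] = (uint8_t)(val >> 24);
    p[1] = (uint8_t)(val >> 16);
    p[2] = (uint8_t)(val >> 8);
    p[3] = (uint8_t)val;
}

/* Hypothetical check: store a 16-bit and a 32-bit value at the given
   offset, verify the big-endian byte layout, and read both back.
   Returns 1 on success, 0 otherwise. */
static int roundtrip_ok(size_t offset)
{
    uint8_t buf[16] = {0};
    assert(offset + 6 <= sizeof buf);
    Set16(buf + offset, 0xABCD);
    Set32(buf + offset + 2, 0x01020304);
    return buf[offset] == 0xAB && buf[offset + 1] == 0xCD
        && buf[offset + 2] == 0x01 && buf[offset + 5] == 0x04
        && Get16(buf + offset) == 0xABCD
        && Get32(buf + offset + 2) == 0x01020304;
}
```

    Since every access goes through uint8_t, roundtrip_ok(1) and
    roundtrip_ok(3) behave the same as roundtrip_ok(0).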
     
    Tomás Ó hÉilidhe, Sep 24, 2008
    #11