padding?

Discussion in 'C Programming' started by Zach, Jun 13, 2009.

  1. Zach

    Zach Guest

    I looked in the index of K&R and couldn't find anything on padding.
    Could someone please explain what padding is in C programming and
    illustrate it with some code. I heard it is often used in constructing
    network packets.

    Zach
    Zach, Jun 13, 2009
    #1

  2. Zach

    Squeamizh Guest

    On Jun 13, 1:34 am, Richard Heathfield wrote:
    > Zach said:
    >
    > > I looked in the index of K&R and couldn't find anything on
    > > padding. Could someone please explain what padding is in C
    > > programming and illustrate it with some code. I heard it is often
    > > used in constructing network packets.

    >
    > Padding is simply a region of data storage that it is convenient to
    > "waste". Precisely who benefits depends on the situation.


    [...]

    > The tm_yday field in a struct tm is essentially padding as far as
    > many people are concerned (but those who /do/ use it would
    > disagree!).


    Then I would strongly suggest that it isn't padding. Correct me if
    I'm wrong; you seem to be saying that padding is any data that you
    don't find useful in a particular situation.
    Squeamizh, Jun 13, 2009
    #2

  3. On 13 June, 12:35, Tor Rustad wrote:
    >
    > In C, padding is often used by compilers between struct members, for
    > example:
    >
    > struct s
    > {
    >     char c;
    >     /* <- compiler might insert 7 padding bytes here */
    >     double d;
    > };
    >
    > This is because the target CPU might have alignment requirements for
    > double access, or because double access is faster when aligned to,
    > e.g., an 8-byte boundary.
    >
    > Hence in C, you cannot assume that
    >
    > struct s mys;
    >
    > memcpy(&mys, ...);
    >
    > will work, due to potential padding bytes. To load some data into a
    > struct, you need to do that struct member by struct member, unless
    > using a non-standard pragma for struct packing.


    What won't work? It's not clear what you have in mind for "...".

    If structures s1 and s2 are of the same type, then you can do s1 = s2
    or memmove(&s1, &s2, sizeof s2). Both will "load some data" into s1
    without explicitly assigning to each member.
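
    A minimal sketch of both points, assuming Tor's struct s from above. The
    helper names (padding_after_c, copy_by_assignment, copy_by_memmove) are
    made up for illustration, and the amount of padding reported is whatever
    this particular implementation chooses, not something the standard fixes:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct s {
    char c;
    /* the compiler may insert padding bytes here */
    double d;
};

/* Bytes of padding between c and d on this implementation. */
static size_t padding_after_c(void)
{
    return offsetof(struct s, d) - sizeof(char);
}

/* Plain assignment copies every member; padding is irrelevant. */
static struct s copy_by_assignment(struct s src)
{
    struct s dst = src;
    return dst;
}

/* memmove between two objects of the same type copies the whole
   representation, padding included, so the layouts always match. */
static struct s copy_by_memmove(const struct s *src)
{
    struct s dst;
    assert(src != NULL);
    memmove(&dst, src, sizeof dst);
    return dst;
}
```

    On a typical implementation with 8-byte doubles aligned to 8 bytes,
    padding_after_c() would be expected to return 7, but a conforming
    compiler may legitimately report another value.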

    --
    Ever real life has plot holes.
    Spiros Bousbouras, Jun 13, 2009
    #3
  4. Zach

    Ian Collins Guest

    Tor Rustad wrote:
    >
    > Hence in C, you cannot assume that
    >
    > struct s mys;
    >
    > memcpy(&mys, ...);
    >
    > will work, due to potential padding bytes. To load some data into a
    > struct, you need to do that struct member by struct member, unless
    > using a non-standard pragma for struct packing.


    Or use a static initialiser, or even just a plain old copy.

    --
    Ian Collins
    Ian Collins, Jun 13, 2009
    #4
  5. Tor Rustad wrote:
    >> What won't work ? It's not clear what you have in mind for "...".

    >
    > I had a buffer in mind, which is the typical case when a network packet
    > arrives or you read some data from disk.


    Put another way, if you have a buffer

    char buf[1024];

    which has some received network packet in it, you can't do

    struct packet_header *header=(struct packet_header *)buf;

    and start reading the members of header and expect everything to be
    fine. The packet has all members aligned according to the network
    protocol, probably with very little padding, and the compiler probably
    aligns for instance ints and chars differently, inserting padding in
    between them or even reordering them inside the struct. There are no
    guarantees (well almost no guarantees, go read the standard).

    So, as an example, say you have a network protocol with a header that
    contains an 8 bit packet type and a 16 bit payload length. It probably
    looks like this:

    offset 0: type [byte]
    offset 1: length [high byte] [low byte]

    Now if you declare a struct for that as

    struct header {
        char type;
        short length;
    };

    on a little-endian 32 bit computer with no native 16 bit data type and a
    32 bit alignment restriction, you'll probably get a struct that looks
    like this in memory:

    offset 0: type [byte]
    offset 1: [padding] [padding] [padding]
    offset 4: length [low byte] [2nd byte] [3rd byte] [high byte]

    If you set a pointer like that to the beginning of the buffer and try to
    read the members, you'll get the right packet type, but the length will
    be outside of the header data and might segfault your program, or it
    might point to the second byte of the payload, or it might point to
    something else entirely.

    Without the padding, the length member of the struct would start at the
    right place but would still read four bytes instead of the two defined
    in the protocol and present in the buffer, and on top of that they would
    be in the wrong order.

    So that was a quick lesson in padding and network byte order.
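
    The 3-octet header above can be decoded without ever overlaying a struct
    on the buffer. This is only a sketch: decode_header is a hypothetical
    helper, and the struct uses unsigned types (rather than the plain char
    and short above) so that the shifts and masks are well defined.

```c
#include <assert.h>
#include <stddef.h>

struct header {
    unsigned char  type;     /* 8-bit packet type */
    unsigned short length;   /* 16-bit payload length */
};

/* Decode the wire format: type, then length as high byte, low byte.
   Returns 0 on success, -1 if the buffer is shorter than 3 octets. */
static int decode_header(const unsigned char *buf, size_t len,
                         struct header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 3)
        return -1;
    out->type = buf[0];
    /* Assemble the value arithmetically, so neither the host's byte
       order nor any struct padding affects the result. */
    out->length = (unsigned short)(((buf[1] & 0xFFu) << 8)
                                   | (buf[2] & 0xFFu));
    return 0;
}
```

    The struct is filled member by member, so whatever padding the compiler
    puts between type and length never comes into contact with the wire
    format.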


    Bjarni
    --

    INFORMATION WANTS TO BE FREE
    Bjarni Juliusson, Jun 14, 2009
    #5
  6. On 14 June, 00:47, Bjarni Juliusson wrote:
    > The packet has all members aligned according to the network
    > protocol, probably with very little padding, and the compiler probably
    > aligns for instance ints and chars differently, inserting padding in
    > between them or even reordering them inside the struct.


    A compiler cannot reorder the fields of a structure. Paragraph 5 of
    6.5.8 says:

    When two pointers are compared, the result depends on the
    relative locations in the address space of the objects
    pointed to.
    [...]
    If the objects pointed to are members of the same aggregate
    object, pointers to structure members declared later compare
    greater than pointers to members declared earlier in the
    structure,

    --
    If kids realised how annoying they can be they would have a lot more
    appreciation for their parents.
    Spiros Bousbouras, Jun 14, 2009
    #6
  7. Spiros Bousbouras wrote:
    > On 14 June, 00:47, Bjarni Juliusson wrote:
    >> The packet has all members aligned according to the network
    >> protocol, probably with very little padding, and the compiler probably
    >> aligns for instance ints and chars differently, inserting padding in
    >> between them or even reordering them inside the struct.

    >
    > A compiler cannot reorder the fields of a structure. Paragraph 5 of
    > 6.5.8 says:
    >
    > When two pointers are compared, the result depends on the
    > relative locations in the address space of the objects
    > pointed to.
    > [...]
    > If the objects pointed to are members of the same aggregate
    > object, pointers to structure members declared later compare
    > greater than pointers to members declared earlier in the
    > structure,


    You are right, I apologise. In fact, it is stated more clearly in
    paragraph 13 of 6.7.2.1:

    Within a structure object, the [...] members [...] have
    addresses that increase in the order in which they are declared.

    I misremembered, and thought the only guarantee was that the first
    element always ended up first in memory.

    Can anyone tell me what the rationale is? It seems to me like it might
    be sensible to, say, take all the char members in a struct and pack them
    together at the end to preserve alignment of any int members without
    lots of padding.


    Bjarni
    --

    INFORMATION WANTS TO BE FREE
    Bjarni Juliusson, Jun 14, 2009
    #7
  8. Bjarni Juliusson wrote:
    > You are right, I apologise. In fact, it is stated more clearly in
    > paragraph 13 of 6.7.2.1:
    >
    > Within a structure object, the [...] members [...] have
    > addresses that increase in the order in which they are declared.
    >
    > I misremembered, and thought the only guarantee was that the first
    > element always ended up first in memory.
    >
    > Can anyone tell me what the rationale is? It seems to me like it might
    > be sensible to, say, take all the char members in a struct and pack them
    > together at the end to preserve alignment of any int members without
    > lots of padding.


    It flows partially from the requirement that any initial member types
    common to two structs be laid out in the same order and at the same
    offsets, which is used widely for crude polymorphism. The compiler
    can't know when compiling unit A what other structs will be in unit B
    (which may not even be written yet) and how many, if any, of their
    initial members may be in common. Therefore, the only reordering of
    members that _could_ potentially be allowed is fitting later members
    into the padding between earlier elements.

    Once you're going to put the above restriction on reordering, you don't
    lose much if you ban reordering entirely and therefore comply with the
    Rule of Least Surprise -- the programmer put the members in a particular
    order, so it would be logical for him to expect them to be laid out that
    way in memory. If he cares about the padding (which is, in most cases,
    a micro-optimization and therefore Evil(tm)), he can reorder them
    himself to minimize it.
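
    To illustrate the do-it-yourself reordering mentioned above, here is a
    small sketch. The struct names loose and tight and the helper
    bytes_saved are invented for the example, and the actual sizes depend
    on the implementation's alignment rules.

```c
#include <assert.h>
#include <stddef.h>

/* Declared order: char, int, char. Padding is likely after each char. */
struct loose {
    char a;
    int  n;
    char b;
};

/* Same members, widest first, so the two chars can sit side by side. */
struct tight {
    int  n;
    char a;
    char b;
};

/* Bytes saved by the manual reordering on this implementation. */
static size_t bytes_saved(void)
{
    assert(sizeof(struct loose) >= sizeof(struct tight));
    return sizeof(struct loose) - sizeof(struct tight);
}
```

    With 4-byte ints aligned to 4 bytes, one would typically expect 12
    bytes for struct loose and 8 for struct tight, but neither figure is
    guaranteed. What is guaranteed is that, within each struct, the members
    appear in memory in the order they were declared.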

    More importantly, though, I suspect that all known compilers at the time
    followed this rule already, so it was probably a matter of C89
    formalizing the behavior to keep future implementations from doing
    something that wouldn't be compatible. Remember, ANSI's primary goal
    was to standardize existing practice, not to create an ideal language.

    S

    --
    Stephen Sprunk "Stupid people surround themselves with smart
    CCIE #3723 people. Smart people surround themselves with
    K5SSS smart people who disagree with them." --Isaac Jaffe
    Stephen Sprunk, Jun 14, 2009
    #8
  9. Richard Heathfield writes:
    > Squeamizh said:
    >
    >> On Jun 13, 1:34 am, Richard Heathfield wrote:

    > <snip>
    >>
    >>> The tm_yday field in a struct tm is essentially padding as far as
    >>> many people are concerned (but those who /do/ use it would
    >>> disagree!).

    >>
    >> Then I would strongly suggest that it isn't padding. Correct me
    >> if I'm wrong; you seem to be saying that padding is any data that
    >> you don't find useful in a particular situation.

    >
    > ...but that /someone/ or /something/ finds useful, and thus we can't
    > just leave the padding out. Yes. That's not a formal definition,
    > obviously, but it seems to me to be a very pragmatic way of looking
    > at padding.


    It is nevertheless wrong: 'padding' is additional storage beyond what
    is necessary to hold, say, the data values of a C struct. It is used
    to achieve some effect beyond what is specified in the C standard,
    typically to conform to ABI requirements regarding the alignment of
    objects of a particular size (for instance, '4-byte integers must
    always be stored at addresses evenly divisible by four'), so that more
    efficient machine code can be generated (for instance, because properly
    aligned 4-byte values can be manipulated with machine instructions
    operating on 'words' of data). This is something different from 'data
    members someone may consider to be useless' (and hence, 'a waste of
    space').
    Rainer Weikusat, Jun 14, 2009
    #9
  10. On Sun, 14 Jun 2009 04:00:08 +0200, Bjarni Juliusson wrote:
    > You are right, I apologise. In fact, it is stated more clearly in
    > paragraph 13 of 6.7.2.1:
    >
    > Within a structure object, the [...] members [...] have
    > addresses that increase in the order in which they are declared.
    >
    > I misremembered, and thought the only guarantee was that the first
    > element always ended up first in memory.
    >
    > Can anyone tell me what the rationale is? It seems to me like it might
    > be sensible to, say, take all the char members in a struct and pack
    > them together at the end to preserve alignment of any int members
    > without lots of padding.


    For a struct with no aggregate members, this may not be quite as useful
    (but then I may also be missing an important detail behind the rule).
    But it seems quite sensible for code that includes structs inside other
    structs, i.e.:

    struct methods;

    struct object {
        long magic;
        size_t size;
        long type;
        struct methods *fptr;
        size_t nfptr;
    };

    struct myobject {
        struct object parent;
        char mydata[10];
    };

    This is commonly used to 'tag' structures of different types. If the
    compiler was allowed to reorder the fields of `myobject', it would be
    quite unpredictable where myobject.parent would end up.
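
    A sketch of how such tagging is typically used, assuming the two
    structs above. The initialiser myobject_init, the accessor object_type,
    and the constants are hypothetical; the part that relies on the
    standard is that a pointer to a structure, suitably converted, points
    to its first member.

```c
#include <assert.h>
#include <stddef.h>

struct methods;

struct object {
    long magic;
    size_t size;
    long type;
    struct methods *fptr;
    size_t nfptr;
};

struct myobject {
    struct object parent;   /* must stay the first member */
    char mydata[10];
};

enum { TYPE_MYOBJECT = 1 };          /* hypothetical type tag */

static void myobject_init(struct myobject *m)
{
    assert(m != NULL);
    m->parent.magic = 0x4F424AL;     /* hypothetical magic number */
    m->parent.size  = sizeof *m;
    m->parent.type  = TYPE_MYOBJECT;
    m->parent.fptr  = NULL;
    m->parent.nfptr = 0;
    m->mydata[0]    = '\0';
}

/* Generic code only ever sees a struct object pointer. */
static long object_type(const struct object *o)
{
    assert(o != NULL);
    return o->type;
}
```

    A caller can then pass (struct object *)&m, or simply &m.parent, to
    object_type() and read the tag without knowing the concrete type.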
    Giorgos Keramidas, Jun 15, 2009
    #10
  11. Zach

    Boon Guest

    Richard Heathfield wrote:

    > Padding in network packets: there's a byte of padding at the end of
    > the TCP header, purely so that the data is four-byte-aligned. For
    > some machines, this is an advantage (although the cynical part of
    > me suspects it was put there purely to make the diagram look
    > neater). The TCP protocol insists that this padding is set to 0.


    You lost me.

    http://en.wikipedia.org/wiki/Transmission_Control_Protocol#TCP_segment_structure

    Assuming no options (i.e. header size = 20 octets), the last two fields of a TCP
    header are Checksum and Urgent pointer.

    "Urgent pointer (16 bits) – if the URG flag is set, then this 16-bit field is an
    offset from the sequence number indicating the last urgent data byte"

    Perhaps you were thinking of headers with options?
    (In that case, padding might be needed.)

    > Note also that six /other/ bits of the header are unused, but
    > reserved for future use. ("Reserved for future use" very often
    > means the same as "padding"!)


    Only 4 bits are reserved for future use and should be set to zero.
    CWR and ECE were defined in 2001.

    http://tools.ietf.org/html/rfc3168

    Regards.
    Boon, Jun 15, 2009
    #11
  12. On Jun 13, 1:07 pm, Zach wrote:
    > I looked in the index of K&R and couldn't find anything on padding.
    > Could someone please explain what padding is in C programming and
    > illustrate it with some code. I heard it is often used in constructing
    > network packets.
    >


    Padding is used for boundary alignment with respect to the processor.
    It is very important to take care of boundary alignment when designing
    databases in C in an embedded environment, as it improves performance
    and also saves memory. So database design is very important.

    Karthik Balaguru
    karthikbalaguru, Jun 15, 2009
    #12
  13. Zach

    Richard Bos Guest

    Rainer Weikusat wrote:

    > Richard Heathfield writes:


    > > ...but that /someone/ or /something/ finds useful, and thus we can't
    > > just leave the padding out. Yes. That's not a formal definition,
    > > obviously, but it seems to me to be a very pragmatic way of looking
    > > at padding.

    >
    > It is nevertheless wrong: 'Padding' is additional storage beyond the
    > one necessary to hold, say, the data values of a C struct, which is
    > used to achieve some effect beyond what is specified in the
    > C-standard, typically, to conform to ABI-requirements regarding
    > alignment of objects of a particular size (for instance, '4-byte
    > integers must always be stored at addresses evenly divisible by four')
    > in order to be able to generate more efficient machine code (for
    > instance, because properly aligned 4-byte values can be manipulated
    > with machine instructions operating on 'words' of data). This is
    > something different than 'data members someone may consider to be
    > useless' (and hence, 'a waste of space').


    What's more, padding is never a data member at all. The defining
    characteristic of padding is that it exists because of the space it
    takes, not because of the value it may or may even never have. Any data
    member, even a data member only few people find useful, should at some
    point in its life have a value that those people want to refer to.
    Padding may change randomly or never at all, and you can blot over it
    without affecting anyone.

    Richard
    Richard Bos, Jun 23, 2009
    #13
  14. Zach

    Eric Sosman Guest

    Richard Bos wrote:
    >
    > What's more, padding is never a data member at all. The defining
    > characteristic of padding is that it exists because of the space it
    > takes, not because of the value it may or may even never have. Any data
    > member, even a data member only few people find useful, should at some
    > point in its life have a value that those people want to refer to.
    > Padding may change randomly or never at all, and you can blot over it
    > without affecting anyone.


    So I guess you'd say "unused; must be zero" bits are
    not padding?

    --
    Eric Sosman
    Eric Sosman, Jun 23, 2009
    #14
  15. Zach

    Tim Rentsch Guest

    (Richard Bos) writes:

    > Rainer Weikusat wrote:
    >
    > > Richard Heathfield writes:

    >
    > > > ...but that /someone/ or /something/ finds useful, and thus we can't
    > > > just leave the padding out. Yes. That's not a formal definition,
    > > > obviously, but it seems to me to be a very pragmatic way of looking
    > > > at padding.

    > >
    > > It is nevertheless wrong: 'Padding' is additional storage beyond the
    > > one necessary to hold, say, the data values of a C struct, which is
    > > used to achieve some effect beyond what is specified in the
    > > C-standard, typically, to conform to ABI-requirements regarding
    > > alignment of objects of a particular size (for instance, '4-byte
    > > integers must always be stored at addresses evenly divisible by four')
    > > in order to be able to generate more efficient machine code (for
    > > instance, because properly aligned 4-byte values can be manipulated
    > > with machine instructions operating on 'words' of data). This is
    > > something different than 'data members someone may consider to be
    > > useless' (and hence, 'a waste of space').

    >
    > What's more, padding is never a data member at all. The defining
    > characteristic of padding is that it exists because of the space it
    > takes, not because of the value it may or may even never have. Any data
    > member, even a data member only few people find useful, should at some
    > point in its life have a value that those people want to refer to.
    > Padding may change randomly or never at all, and you can blot over it
    > without affecting anyone.


    In

    struct s {
        unsigned foo : 4;
        unsigned     : 12;
        unsigned bas : 16;
    };

    would you say the data member between foo and bas is there as
    padding? If we have

    struct s x, y;
    memset( &y, 0, sizeof y );
    x = y;

    is the unnamed bit-field member guaranteed to hold zeroes, or
    not? If it is not guaranteed to hold zeroes (because structure
    assignments are not required to copy padding bits) does that mean
    it's illegal to put an unnamed bit-field member at the start of a
    structure (because structures are not allowed to have padding at
    the beginning)? Or is this a case of a data member that exists
    just because of the space it takes, yet is not padding? But if
    the values of unnamed bit-fields are supposed to be useful (ie,
    and not padding), why are they indeterminate even after
    initialization? (6.7.8 p 9)
    Tim Rentsch, Jun 23, 2009
    #15
  16. Zach

    Eric Sosman Guest

    Mark McIntyre wrote:
    > Eric Sosman wrote:
    >> Richard Bos wrote:
    >>>
    >>> What's more, padding is never a data member at all. The defining
    >>> characteristic of padding is that it exists because of the space it
    >>> takes, not because of the value it may or may even never have. Any data
    >>> member, even a data member only few people find useful, should at some
    >>> point in its life have a value that those people want to refer to.
    >>> Padding may change randomly or never at all, and you can blot over it
    >>> without affecting anyone.

    >>
    >> So I guess you'd say "unused; must be zero" bits are
    >> not padding?

    >
    > I believe the original context was with respect to structs, not unused
    > bits in a bitfield object.


    The original question mentioned padding in connection
    with "constructing network packets." True, it was only
    in an "I heard ..." context, but a structs-only view of
    the discussion seems a bit restrictive.

    --
    Eric Sosman
    Eric Sosman, Jun 23, 2009
    #16
  17. Zach

    Richard Bos Guest

    Eric Sosman wrote:

    > Richard Bos wrote:
    > >
    > > What's more, padding is never a data member at all. The defining
    > > characteristic of padding is that it exists because of the space it
    > > takes, not because of the value it may or may even never have. Any data
    > > member, even a data member only few people find useful, should at some
    > > point in its life have a value that those people want to refer to.
    > > Padding may change randomly or never at all, and you can blot over it
    > > without affecting anyone.

    >
    > So I guess you'd say "unused; must be zero" bits are
    > not padding?


    That depends on whether that is a political or a technical "must". If
    there is reasonable expectation that it may in the future be used, and
    can then get other values, it's not. If it's only required to be zero
    out of a show of future planning, it's padding. I admit that this may,
    for an outsider, be difficult to judge.

    Richard
    Richard Bos, Jun 24, 2009
    #17
  18. Zach

    Richard Bos Guest

    Tim Rentsch wrote:

    > (Richard Bos) writes:
    >
    > > What's more, padding is never a data member at all. The defining
    > > characteristic of padding is that it exists because of the space it
    > > takes, not because of the value it may or may even never have. Any data
    > > member, even a data member only few people find useful, should at some
    > > point in its life have a value that those people want to refer to.
    > > Padding may change randomly or never at all, and you can blot over it
    > > without affecting anyone.

    >
    > In
    >
    > struct s {
    >     unsigned foo : 4;
    >     unsigned     : 12;
    >     unsigned bas : 16;
    > };
    >
    > would you say the data member between foo and bas is there as
    > padding?


    Since you can't access its value at all (stupid tricks with unsigned
    char pointers aside), and that value is therefore irrelevant, yes.

    > If we have
    >
    > struct s x, y;
    > memset( &y, 0, sizeof y );
    > x = y;
    >
    > is the unnamed bit-field member guaranteed to hold zeroes, or
    > not?


    In y, yes, but they're irrelevant; in x, they're not even guaranteed to
    be zero.

    > (because structures are not allowed to have padding at the beginning)?


    Structures are not allowed to have _implementation-inserted_ padding
    _bytes_ at the beginning. You, as the user-programmer, are allowed to
    add as much padding of your own as pleases you. There is nothing in the
    Standard to stop you from declaring

    struct t {
        unsigned char padding[37];
        long int single_data_member;
        unsigned char more_padding[51];
    };

    Do not be surprised to find extra padding added after _your_ member
    called padding, and before or after more_padding.
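
    To see where an implementation actually adds bytes, offsetof and sizeof
    can be compared against the declared sizes. A rough sketch, with
    hypothetical helper names; the values it reports are
    implementation-specific:

```c
#include <assert.h>
#include <stddef.h>

struct t {
    unsigned char padding[37];
    long int single_data_member;
    unsigned char more_padding[51];
};

/* Bytes inserted by the implementation between padding[] and the long. */
static size_t gap_before_member(void)
{
    size_t off = offsetof(struct t, single_data_member);
    assert(off >= 37);   /* members never overlap */
    return off - 37;
}

/* All implementation-inserted bytes in struct t, wherever they sit. */
static size_t total_inserted_padding(void)
{
    return sizeof(struct t) - (37 + sizeof(long int) + 51);
}
```

    The first member is always at offset 0; any difference between the two
    results is padding placed after single_data_member or at the end of the
    structure.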

    Tell me, did you _really_ not know all this, or are you being an awkward
    arsehole just to make the point that you _can_ be an awkward arsehole?

    Richard
    Richard Bos, Jun 24, 2009
    #18
  19. Zach

    James Kanze Guest

    On Jun 15, 7:35 pm, Richard Heathfield wrote:
    > Boon said:


    > > Richard Heathfield wrote:


    > >> Padding in network packets: there's a byte of padding at
    > >> the end of the TCP header, purely so that the data is
    > >> four-byte-aligned. For some machines, this is an advantage
    > >> (although the cynical part of me suspects it was put there
    > >> purely to make the diagram look neater). The TCP protocol
    > >> insists that this padding is set to 0.


    > > You lost me.


    > > <wiki URL>


    > I wrote my article after referring to:


    > <http://www.faqs.org/rfcs/rfc793.html>


    The diagram in this RFC does show three bytes of options and one
    of padding at the end. The text, however, makes it clear that
    the header may end immediately after the urgent pointer field,
    that the size of the options field is variable, and that the amount
    of padding is also variable: the sum of the sizes of the
    options and the padding must be a multiple of 4.
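
    The arithmetic can be sketched as follows. The function names are
    invented here; the constants reflect the 20-octet fixed header and the
    4-bit Data Offset field of RFC 793, which counts the header length in
    32-bit words (so at most 60 octets, leaving 40 for options and padding).

```c
#include <assert.h>
#include <stddef.h>

enum {
    TCP_FIXED_HEADER_OCTETS = 20,   /* header without options */
    TCP_MAX_HEADER_OCTETS   = 60    /* 15 words of 4 octets */
};

/* Zero octets needed after option_octets of options so that the
   header ends on a 32-bit boundary. */
static size_t tcp_option_padding(size_t option_octets)
{
    assert(option_octets <=
           (size_t)(TCP_MAX_HEADER_OCTETS - TCP_FIXED_HEADER_OCTETS));
    return (4 - option_octets % 4) % 4;
}

/* Value of the Data Offset field: header length in 32-bit words. */
static unsigned tcp_data_offset(size_t option_octets)
{
    size_t total = TCP_FIXED_HEADER_OCTETS + option_octets
                 + tcp_option_padding(option_octets);
    return (unsigned)(total / 4);
}
```

    With no options the padding is zero and the Data Offset is 5; with
    three octets of options, as in the RFC's diagram, one octet of padding
    brings the header to 24 octets and the Data Offset to 6.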

    > <snip>


    > >http://tools.ietf.org/html/rfc3168


    > Then it's possible that I'm out of date. (In fact, it's quite
    > probable.)


    RFC 793 is still the basic definition of TCP, although there are
    later RFC's which update it (e.g. by specifying additional
    options or additional control bits).

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
    James Kanze, Jun 24, 2009
    #19
  20. Zach

    James Kanze Guest

    On Jun 14, 1:47 am, Bjarni Juliusson wrote:
    > Tor Rustad wrote:
    > >> What won't work ? It's not clear what you have in mind for
    > >> "...".


    > > I had a buffer in mind, which is the typical case when a
    > > network packet arrives or you read some data from disk.


    > Put another way, if you have a buffer


    > char buf[1024];


    > which has some received network packet in it, you can't do


    > struct packet_header *header=(struct packet_header *)buf;


    > and start reading the members of header and expect everything
    > to be fine. The packet has all members aligned according to
    > the network protocol, probably with very little padding, and
    > the compiler probably aligns for instance ints and chars
    > differently, inserting padding in between them or even
    > reordering them inside the struct.


    And the network may use a different representation for negative
    values, or even a different byte size. The RFC's specify
    octets, the C standard bytes. Octets are eight bits, bytes may
    be any number of bits, depending on the hardware (although I've
    never heard of less than six, the C standard requires at least
    eight, and Posix does require exactly eight). Posix also
    requires 2's complement, so on a Posix compliant system, the
    only difference in representation can be byte order.

    In general, any time you're moving between network data and
    internal data, you need some sort of marshalling code.
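
    As a rough illustration of what such marshalling code can look like,
    the sketch below writes and reads a 32-bit value as four octets in
    network (big-endian) byte order, using only shifts and masks so that it
    does not depend on the host's byte order or struct layout. The function
    names are made up for the example and do not come from any standard
    library.

```c
#include <assert.h>

/* Store the low 32 bits of value as four octets, most significant first. */
static void put_u32_be(unsigned char *out, unsigned long value)
{
    assert(out != 0);
    out[0] = (unsigned char)((value >> 24) & 0xFFu);
    out[1] = (unsigned char)((value >> 16) & 0xFFu);
    out[2] = (unsigned char)((value >> 8) & 0xFFu);
    out[3] = (unsigned char)(value & 0xFFu);
}

/* Rebuild a 32-bit value from four octets, most significant first.
   Each element is masked to 8 bits in case a byte is wider than an octet. */
static unsigned long get_u32_be(const unsigned char *in)
{
    assert(in != 0);
    return ((unsigned long)(in[0] & 0xFFu) << 24)
         | ((unsigned long)(in[1] & 0xFFu) << 16)
         | ((unsigned long)(in[2] & 0xFFu) << 8)
         |  (unsigned long)(in[3] & 0xFFu);
}
```

    Signed values would need an agreed representation on the wire as well;
    this sketch deliberately covers only unsigned integers.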

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
    James Kanze, Jun 24, 2009
    #20