Noob question about the >> operator

Discussion in 'C Programming' started by Michael Barry, Jan 27, 2014.

  1. Hello, everyone. As an inexperienced/rusty user of C, I have come across a question for you all.

    Does C's >> operator shift in zeros or sign-bits in the compilers with which you are familiar? For example:

    Code (Text):

    /* let's assume that unsigned is 32-bits wide here */
    unsigned a = 0x80000000U;
    printf("%X", a >> 3);
    Should I expect 10000000 or F0000000 to be printed?

    I seem to remember reading somewhere that this behavior of >> is compiler-dependent, and I will be using gcc for my 65m32 simulator. I can spend a bit more effort to explicitly deal with the MSBs in my code, should it turn out to be a potential portability issue.


    Michael Barry, Jan 27, 2014

  2. Michael Barry

    Kaz Kylheku Guest

    The C99 standard has this wording, in 6.5.7 Bitwise shift operators

    The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has
    an unsigned type or if E1 has a signed type and a nonnegative value,
    the value of the result is the integral part of the quotient of
    E1 / 2^E2. If E1 has a signed type and a negative value, the resulting
    value is implementation-defined.

    In practice, two's complement machines repeat the sign bit.

    If you don't want to rely on that, consider this: division is now required to
    be truncating toward zero. This means that, for a negative value, we can
    obtain truncation toward negative infinity by subtracting 1 before
    dividing. And that is equivalent to
    shifting with a sign extension.


    For instance -5, which is the two's complement bit pattern ...11011,
    when divided by two, is required to yield -2 (truncation toward zero).
    And that pattern is ...11110: not the pattern we want in terms of
    a right shift.

    However, if we subtract 1 first, then divide, it works out.
    ...11011 (-5) goes to ...11010 (-6), which shifts to ...11101 (-3).

    Thus we can do something like this:

    int sign_extending_right_shift(int x)
    {
        if (x < 0)
            return (x - 1) / 2;
        return x >> 1;
    }
    None of this applies to your question, though: unsigned types have no sign bit.

    They always shift in a zero.

    That's what the "un" means in front of "signed" in "unsigned".
    Kaz Kylheku, Jan 27, 2014

  3. Michael Barry

    David Brown Guest

    Kaz has given quite a complete answer. But the easy answer is to stick
    to unsigned data when shifting (as you are doing here) - then there is
    no sign bit to worry about. For left shifts and other bit manipulation
    (&, |, ~, ^) you should usually use unsigned data - preferably
    length-specific types like uint32_t. Without signs, everything is
    clearly defined in the standards to work as you would expect.

    If you really need to do shifts with signed integers, you can use tricks
    like Kaz posted. Alternatively, you can cast your signed data into
    unsigned, do the shifts unsigned, and cast back. The details of this
    sort of thing are "implementation defined" in the standards - so the
    exact results can vary between compilers and targets. In practice,
    however, if you are sticking to "normal" processors then everything
    works as you might expect using two's complement.

    If you are thinking - as some "noobs" do - that it might be a good idea
    to use shifts for "optimising" multiplies and divides by powers of two,
    then don't do that. If you want to divide "a" by 8, write "a / 8" and
    let the compiler figure out the most efficient implementation.

    Here would be one way of being sure of signed shifts that I believe will
    be correct (at least on "normal" processors with two's complement
    arithmetic, and support for the given integer sizes):

    // Note that when pre-processing, the same rules should apply as
    // during runtime for the target

    #include <stdint.h>

    #if (-1 >> 2) < 0
    // Right-shift duplicates the sign bit ("arithmetic right shift")

    static inline int32_t sign_extending_right_shift(int32_t x) {
        return x >> 1;
    }

    static inline int32_t zero_extending_right_shift(int32_t x) {
        return (int32_t) (((uint32_t) x) >> 1);
    }

    #else
    // Right-shift zeros the sign bit ("logical right shift")

    static inline int32_t sign_extending_right_shift(int32_t x) {
        if (x < 0) return (x - 1) / 2;
        return x >> 1;
    }

    static inline int32_t zero_extending_right_shift(int32_t x) {
        return x >> 1;
    }

    #endif

    David Brown, Jan 27, 2014
  4. Michael Barry

    BartC Guest

    Others have already answered that. But if you're new to C, watch out for the
    strange precedence of the << and >> operators. For example, you might think
    that a<<b+c means (a<<b)+c, but it doesn't!
    BartC, Jan 27, 2014
  5. Thanks to all for your prompt and polite replies.

    @Bart: Thanks for the extra heads-up about operator precedence. I have a feeling that my source is going to contain a lot of unnecessary casts and extra parentheses, which will undoubtedly be optimized away by gcc, but still show my noobness to all who care to read the source.

    A follow-up question, drifting a bit OT: On a hypothetical naked CPU, assuming that there is not enough opcode space for both types of shifts (signed and unsigned), which group is more useful to keep? In other words, is it 'less-expensive' to convert an unsigned shift to a signed, or vice-versa? I know that I could probably figure it out for myself with a few test-cases, but I thought that it wouldn't hurt to ask you nice folks for your opinions.

    Michael Barry, Jan 27, 2014
  6. Michael Barry

    Eric Sosman Guest

    Try to avoid unnecessary casts, or at least keep them down
    to a dull roar. The problem is that a cast operator often tells
    a compiler not to emit a warning (or error!) diagnostic that it
    otherwise would have generated; the compiler will frequently read
    a cast as the programmer's claim that "I know what I'm doing here."
    If the cast is truly unnecessary there won't be any such diagnostic,
    but if you get into the habit of throwing casts around (casting
    them around?), sooner or later you'll find that you've silenced
    some useful advice.

    Parentheses are another matter: Unlike casts they are not
    operators, but separators and groupers. They can certainly be
    overdone, but they can also be underdone -- especially if you
    find yourself working in multiple languages whose precedence rules
    are subtly different. (The inventor of C admitted that there'd
    been some poor choices in C's precedence rules, but that with
    "several hundred kilobytes of source code and maybe 3 installations"
    it was too late to change things.)
    Eric Sosman, Jan 27, 2014
  7. (snip)
    Well, first note that the C standard requires unsigned to shift in
    zeros, but doesn't restrict signed in the same way.

    Partly that is because C allows for ones complement or sign magnitude
    representation for signed integer types, in addition to the more
    common twos complement. The numerical effect of shifts, or the
    proper arithmetic shift, depends on the representation.

    But a CPU so restricted likely only has one bit shifts, where a
    loop is required for multiple bit shifts. Many early CPUs had no
    multiply or divide instruction, but add/subtract and a special shift
    (or sometimes rotate) instruction to make software multiply and
    divide easier. Often there was an accumulator and MQ (multiplicand/
    quotient) registers, the latter convenient for the use it was named after.

    Not so many years ago, I was working with Dynamic-C (the name of
    the compiler, it isn't actually so dynamic) and the Rabbit processor
    (similar to Z280, descendant of the 8080 and Z80). Seems that there
    are no signed 16 bit instructions, so 16 bit compare is done by
    complementing the sign bit of both operands, doing an unsigned compare
    and, if necessary, uncomplementing.

    But there could be processors with signed, but not unsigned, shift
    instructions.

    -- glen
    glen herrmannsfeldt, Jan 27, 2014
  8. Michael Barry

    Jorgen Grahn Guest

    I have no idea -- I avoid using it with signed types! If I had to
    know in order to fix a bug in someone else's code, I'd write a test
    program to find out.

    Jorgen Grahn, Jan 27, 2014
  9. Michael Barry

    Tim Rentsch Guest

    It would be better not to rely on C99 division rules and
    not to risk undefined behavior:

    int sign_extending_right_shift(int x)
    {
        return (x - !!(x % 2)) / 2;
    }
    Tim Rentsch, Feb 7, 2014
  10. Michael Barry

    Tim Rentsch Guest

    complement, not compliment.
    Better (using 'int' rather than the platform-dependent 'int32_t'):

    static inline int sign_extending_right_shift(int x) {
        return (x - !!(x % 2)) / 2;
    }

    static inline int zero_extending_right_shift(int x) {
        typedef union { unsigned u; int i; } ui;
        return (ui){ .u = (ui){ .i = x }.u >> 1 }.i;
    }

    No platform-dependent preprocessor tests needed. Also the
    second function shifts the actual bits in machines that
    use signed magnitude or ones complement.
    Tim Rentsch, Feb 7, 2014
  11. Michael Barry

    David Brown Guest

    Spell chequers can't catch /all/ mistakes :)
    When you are dealing with shifts and other bit manipulation, you almost
    certainly know the sizes of the data you are working with. It is
    (normally) better to be explicit about that sort of thing - then any
    assumptions you make are clear in the text.

    "int32_t" is only "platform-dependent" in that theoretically, there may
    be platforms that don't support it. I say "theoretically", because
    there are no normal modern processors that don't support it. (There are
    a few DSP's that don't support the 8-bit or 16-bit types, but they are
    niche devices.)

    "int", on the other hand, is totally "platform-dependent" and there are
    many real-world systems where "int" is not 32-bit.

    It is your own choice, of course, but I strongly prefer to use
    explicitly sized types when the size matters.
    Yes, that works without needing a platform-dependent preprocessor test -
    but it is bigger, uglier, slower and harder to understand than either of
    the platform-specific versions I gave. If you have some reason to
    dislike having such preprocessor tests and multiple choices of
    implementation (they are arguably a "hack"), then turn the check into a
    static assert and just use the first versions of the functions. They
    will work fine on virtually all useful platforms, and the static assert
    will give you a compile-time error rather than weird run-time behaviour
    in the rare cases that the assumption is not true.
    That is directly equivalent to the version I gave using casting (and I
    agree it will work on either type of platform). But some compilers
    might do messy things when implementing your version, such as allocating
    the union on the stack. Even the most brain-dead compiler with
    optimisations disabled will implement the simple cast version directly.

    I know optimisation is not always vital - and it is /always/ secondary
    to correctness - but code like this often needs to be optimal if it is
    going to be useful. It is particularly common in embedded programming,
    where "small and fast" means "cheap".
    David Brown, Feb 7, 2014
  12. Michael Barry

    David Brown Guest

    As far as I understand it, the rules for "%" are defined in terms of the
    rules for division. So your version still relies on those same rules.

    From N1570 (C11 near-final draft):

    6.5.5 Multiplicative operators
    multiplicative-expression * cast-expression
    multiplicative-expression / cast-expression
    multiplicative-expression % cast-expression
    Each of the operands shall have arithmetic type. The operands of the %
    operator shall have integer type.
    3 The usual arithmetic conversions are performed on the operands.
    4 The result of the binary * operator is the product of the operands.
    5 The result of the / operator is the quotient from the division of the
    first operand by the second; the result of the % operator is the
    remainder. In both operations, if the value of the second operand is
    zero, the behavior is undefined.
    6 When integers are divided, the result of the / operator is the
    algebraic quotient with any fractional part discarded. If the quotient
    a/b is representable, the expression (a/b)*b + a%b shall equal a;
    otherwise, the behavior of both a/b and a%b is undefined.
    David Brown, Feb 7, 2014
  13. Michael Barry

    Tim Rentsch Guest

    Great! But now look at the code again and see if you can
    understand why it works under both the C99 rules and the more
    permissive C90 rules. The code I gave does rely on the rules for
    remainder (and so indirectly division), but it does not rely on
    the more restrictive C99 rules. That's a key distinction.
    Tim Rentsch, Feb 7, 2014
  14. Michael Barry

    David Brown Guest

    Fair enough. But the very fact that this is difficult to spot counts
    against writing the code in this way. When you have the choice, a
    clearer and simpler solution is the best.

    On the other hand, gcc (with -O) on amd64 spots that your version is
    equivalent to "x >> 1", and generates optimal code. The conditional
    version generates conditional code (at least with my somewhat outdated
    gcc). And when you have a choice, something that produces optimal code
    is always better.

    My preference:

    #if (-1 >> 2) < 0
    // Right-shift duplicates the sign bit ("arithmetic right shift")

    static inline int32_t sign_extending_right_shift(int32_t x) {
        return x >> 1;
    }

    #else
    // Right-shift zeros the sign bit ("logical right shift")
    #error You've found one of the most awkward C compilers on the planet.
    #endif

    If you want to substitute your version of the function instead of a
    compile-time error message, that's fine - but I would not consider it
    acceptable code without a hefty comment section explaining how it works.

    (I don't mean to say that there is no place for "clever" code like this
    - just that it should only be used when needed, and it must be
    clarified. It's the kind of thing you might put in a library but not in
    application code.)
    David Brown, Feb 7, 2014
  15. Michael Barry

    Tim Rentsch Guest

    By focusing on this incidental distinction you are missing
    the point. The examples I gave apply to any integer type
    (adjusting various declarations appropriately), including
    int32_t. However, because the point is about writing code
    that will work on all platforms, including those where
    int32_t cannot be provided, the examples use int instead
    of int32_t. How to make these available with explicit
    widths is a separate discussion - possibly an important
    discussion, but nevertheless a separate discussion.
    If you had bothered to try actually compiling it, as I did, I
    believe you would find my version generates code that is just as
    fast as the smaller of your two cases, and faster than the other,
    on platforms of interest (I tried two).

    Minor point: your statement that this way is "bigger" is simply
    false. My version has one line of function body. Your version
    has either one or two, or a total of three, and that is not
    counting the preprocessor statements to choose between the two
    alternatives. If we take into account the other needed overhead
    then your version is much bigger.
    My version has no need of either preprocessor tests or static
    asserts, and works in all conforming implementations. Given
    that it is just as fast or faster than the larger and less
    general version, I see no reason to prefer the latter.
    No, it isn't. The casting version works correctly only
    on platforms that use twos complement. This version
    doesn't have that restriction.
    This is a specious argument. No real work is done under such
    conditions. Your original example uses 'inline', which means the
    compiler was done after 1999. No contemporary compiler is going
    to be lacking in such rudimentary transformations, and anyone
    interested in good performance obviously is going to enable optimization.
    The functions I wrote produced code that is just as fast
    or faster than either of your alternate versions, using
    the lowest levels of optimization available. That includes
    test compiles targeting an embedded processor.
    Tim Rentsch, Feb 7, 2014
  16. Michael Barry

    Tim Rentsch Guest

    Other things being equal, clearer and simpler are better. But
    here they are not equal.
    I'm a bit confused by your statement here - you prefer a larger,
    less general solution to a smaller, optimal, and completely
    general one?

    Incidentally, I recently have been working on code that has lots of
    platform variability under control of #ifdef, etc. This results in
    a heavy cost doing code maintenance. IMO the complexity of having
    different versions far outweighs the complexity of the expression
    in the (one and only) return statement.
    I agree, a comment would be good.
    IMO the code with the preprocessor test is more "clever" than
    the platform-independent version. Also the name is wrong,
    because what you want is an "arithmetic shift" (ie, divide
    rounding down), not a "sign extending shift", which has a
    completely different meaning on signed magnitude platforms.
    So which version is more "clever" is subjective.
    Tim Rentsch, Feb 7, 2014
  17. Michael Barry

    David Brown Guest

    I prefer a simpler, clearer solution that works on all realistic
    platforms, and will be optimal on all compilers and all choice of flag
    settings (not every compiler is as smart as gcc, and some people don't
    enable optimisation). The pre-processor check is just for paranoia, in
    case the code is used on a particularly odd architecture - compile-time
    error messages are much better than run-time mistakes.
    If the complex stuff is well enough hidden from "normal" users, and well
    enough commented, then I agree with you.

    A few #ifdef's for conditional compilation is okay, but there is no
    doubt that it makes the code hard to work with if there is too much of
    it. Sometimes the balance tips in favour of a complex general solution
    rather than multiple simple expressions (especially if you can be sure
    the compiler will optimise it well). Sometimes it is best with general
    solutions even if they are sub-optimal. Sometimes it is best to have
    separate files (headers or even C files) that handle each platform
    individually, so that you only have a single #ifdef at the start of each
    file - that makes it clear what code is actually used, but risks
    duplication and the maintenance issues that follow.

    I don't think there is any good, general solution to such issues. But
    if the job were easy, everyone could do it :)
    I copied the function name from a previous post - I agree that
    "arithmetic_right_shift" would be better.

    I suppose any discussion about what is "clever" code is highly
    subjective - it depends on the programmer's experience, and the style of
    code they typically work with.

    I guess we can agree that both methods work, and both methods should
    have at least some comments to say /how/ they work. And we let the OP
    choose whatever he prefers.
    David Brown, Feb 7, 2014
  18. Michael Barry

    David Brown Guest

    The versions I gave also work for all integer types in the same way.

    When I write C, I aim for the code to work on different platforms
    (unless the code is directly accessing target-specific details, which is
    not uncommon for my embedded code). I try to write in a way that will
    work as expected on realistic platforms, and will produce compile-time
    errors if any of my assumptions do not hold. So if I want to work with
    32-bit values, I use the platform-independent type "int32_t" (or perhaps
    "int_least32_t", depending on the circumstances). I rarely use plain
    "int", precisely because my code often needs to be portable across
    different platforms. ("int" is usually synonymous with "int_fast16_t"
    as far as I am concerned.)

    (You may consider this a separate discussion, but you brought it up
    here. But I will try not to add more on the topic in this thread.)
    I tried the code, and reached the same basic conclusion.

    However, it is only as small and fast when optimisation is enabled (not
    everyone enables optimisation - I think that is generally a mistake on
    their part, but it is still true), and with a good compiler (I have used
    many compilers that would not have a chance of optimising your code
    here). And we only know that it is as fast as the plain "x >> 1"
    version on platforms which implement "x >> 1" as an arithmetic right
    shift - we don't know what code a compiler would generate on a platform
    that implemented "x >> 1" as a logical right shift, because we don't
    have easy access to such a platform (if one even exists).

    So we know your version is never faster than mine on any platform we can
    try, and that it is slower than mine on some platforms (I tried
    compiling for the 8-bit AVR, using avr-gcc 4.5.1), and slower if
    optimisation is disabled. We know nothing about the case for platforms
    which use logical right shift.
    I meant bigger object code, not source code (I can count source lines
    too). But to be fair, your code is not bigger than mine on "typical" platforms.
    See above. If your version were always as optimal, then I would mostly
    agree. (I still think your version is less clear - but as I said in
    another post, that is a subjective opinion.)
    Would that be because a cast /could/ change the bit representation of
    the data when converting from unsigned to signed? I am not sure if I am
    getting the details of the standards correct here, but I believe a cast
    between signed int and unsigned int will count as a "conversion". The
    cast from signed to unsigned is well defined, and will work even on
    one's complement systems (-1 converts to 0xffff ffff for 32-bit ints).
    But conversion from unsigned 0xffff ffff back to -1 is apparently
    implementation dependent. If that interpretation is correct, then
    perhaps my version is implementation-dependent even for two's complement
    systems. I'd appreciate a "ruling" on this point.

    As far as I can see from the standards, your type-punning method should
    be correct.

    However, the casting method will work correctly on platforms that are
    realistic for running new code (how many targets are /not/ two's complement?).

    Testing with gcc on amd64 and the AVR shows the casting version to be
    optimal for all settings, and the union method is optimal when
    optimisation is enabled and very sub-optimal when optimisation is
    disabled. And if anyone is interested (I know this is almost blasphemy
    in this group), the casting version is also valid in C++ while the union
    version is not. (This is the fault of the C++ committee - designated
    initialisers should have been included in C++.)
    Unfortunately, that is not true. In the world of embedded programming,
    there are some very poor compilers around. For many reasons, it is
    sometimes necessary to use very old tools even for new code - and there
    are plenty of "modern" compilers that do very little in the way of optimisation.
    I tested on an 8-bit AVR, which is quite a common embedded processor.
    Admittedly it was not the latest gcc available, but it was approximately
    the same version as for amd64 that I have conveniently on my system. (I
    also tested for a 16-bit msp430, but as the results were the same as for
    amd64, I didn't bother giving the details.)
    David Brown, Feb 7, 2014
