Use 3 bytes to store integer in redis?

Discussion in 'C Programming' started by Haomai Wang, Sep 5, 2012.

  1. Haomai Wang

    Haomai Wang Guest

    In the implement of ziplist in redis, author use a way to reduce space to store different size integer.
    Such as the integer in the range of -8388608~8388607, ziplist use 3 bytes to store it avoid 4bytes.
    The function is below:

    static void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encoding) {
    ....
    } else if (encoding == ZIP_INT_24B) {
    i32 = value<<8;
    memrev32ifbe(&i32);
    memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t));
    }...

    `p` is the memory stored to, `value` is the integer will be encoded, `encoding` is the encoding way to use determining the size 1 byte, 2bytes, 3bytes or 4bytes etc.
    Here we can find if encode integer in 3 bytes, firstly right shift 8 bits. I can't understand why?
    The page below may help understand the problem:
    https://github.com/antirez/redis/issues/469
    This issue shows author concerns the two-complement representation so right shift 8 bits.
    Haomai Wang, Sep 5, 2012
    #1
    1. Advertising

  2. Haomai Wang wrote:
    > In the implement of ziplist in redis, author use a way to reduce space to store different size integer.
    > Such as the integer in the range of -8388608~8388607, ziplist use 3 bytes to store it avoid 4bytes.
    > The function is below:
    >
    > static void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encoding) {
    > ...
    > } else if (encoding == ZIP_INT_24B) {
    > i32 = value<<8;
    > memrev32ifbe(&i32);
    > memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t));
    > }...
    >
    > `p` is the memory stored to, `value` is the integer will be encoded, `encoding` is the encoding way to use determining the size 1 byte, 2bytes, 3bytes or 4bytes etc.
    > Here we can find if encode integer in 3 bytes, firstly right shift 8 bits. I can't understand why?

    It's a left shift.
    > The page below may help understand the problem:
    > https://github.com/antirez/redis/issues/469
    > This issue shows author concerns the two-complement representation so right shift 8 bits.


    I think this is related to those 'arithmetic shifts', which are supposed
    to preserve sign of two's complement numbers. Avoid it if you can. The
    code looks highly suspicious. Especially the memcopy after that memrev.
    The left shift will align the most significant bit of the 24 bit int to
    the most sig. bit of the 32 bit integer. shifting back the 32 bit
    _signed_ integer by 8 bits afterwards will give a signed 32 bit integer
    the same value as the 24 bit signed integer. If they did not do it they
    would get their 24 bits interpreted as unsigned...(0-16.7M)
    Johann Klammer, Sep 5, 2012
    #2
    1. Advertising

  3. Haomai Wang

    Ike Naar Guest

    On 2012-09-05, Haomai Wang <> wrote:
    > static void zipSaveInteger(unsigned char *p, int64_t value, unsigned
    > char encoding) {
    > ...
    > } else if (encoding == ZIP_INT_24B) {
    > i32 = value<<8;
    > memrev32ifbe(&i32);
    > memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t));
    > }...
    >
    > `p` is the memory stored to, `value` is the integer will be encoded,
    > `encoding` is the encoding way to use determining the size 1 byte,
    > 2bytes, 3bytes or 4bytes etc.
    > Here we can find if encode integer in 3 bytes, firstly right shift 8
    > bits. I can't understand why?


    It's not a right shift, it's a left shift.

    The statement 'i32=value<<8' will store the three least significant
    bytes of 'value' into the three most significant bytes of i32,
    e.g. if 'value' contained 0x0000000000123456 then i32 will contain
    0x12345600. If the machine uses big-endian representation for integers,
    the statement 'memrev32ifbe(&i32)' reverts the bytes.
    Now, the bytes in memory are, starting at address &i32:

    0x00, 0x56, 0x34, 0x12

    Then memcpy copies the second, third and fourth byte of the representation
    in memory, in that order, to the three-byte buffer pointed at by p,
    so that the end result is p[0]==0x56, p[1]==0x34, p[2]=0x12.
    Ike Naar, Sep 5, 2012
    #3
  4. Haomai Wang

    Nick Bowler Guest

    On Wed, 05 Sep 2012 07:45:23 +0000, Ike Naar wrote:
    > On 2012-09-05, Haomai Wang <> wrote:
    >> static void zipSaveInteger(unsigned char *p, int64_t value, unsigned
    >> char encoding) {
    >> ...
    >> } else if (encoding == ZIP_INT_24B) {
    >> i32 = value<<8;
    >> memrev32ifbe(&i32);
    >> memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t));
    >> }...

    [...]
    > The statement 'i32=value<<8' will store the three least significant
    > bytes of 'value' into the three most significant bytes of i32,
    > e.g. if 'value' contained 0x0000000000123456 then i32 will contain
    > 0x12345600. If the machine uses big-endian representation for integers,
    > the statement 'memrev32ifbe(&i32)' reverts the bytes.
    > Now, the bytes in memory are, starting at address &i32:
    >
    > 0x00, 0x56, 0x34, 0x12
    >
    > Then memcpy copies the second, third and fourth byte of the representation
    > in memory, in that order, to the three-byte buffer pointed at by p,
    > so that the end result is p[0]==0x56, p[1]==0x34, p[2]=0x12.


    Holy smokes, that seems like an incredibly overcomplicated way of writing

    p[0] = value & 0xff;
    p[1] = (value >> 8) & 0xff;
    p[2] = (value >> 16) & 0xff;
    Nick Bowler, Sep 12, 2012
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Vivek Narayanan
    Replies:
    0
    Views:
    272
    Vivek Narayanan
    Jun 16, 2011
  2. Jabba Laci

    redis beginner question

    Jabba Laci, Nov 16, 2011, in forum: Python
    Replies:
    3
    Views:
    165
    Roy Smith
    Nov 16, 2011
  3. Michel Martens
    Replies:
    3
    Views:
    146
    James Edward Gray II
    Sep 16, 2009
  4. Tony Arcieri
    Replies:
    0
    Views:
    85
    Tony Arcieri
    Dec 7, 2010
  5. Replies:
    0
    Views:
    146
Loading...

Share This Page