Can we replace 8 bits by 2 bits?

Discussion in 'C Programming' started by Umesh, Jan 5, 2007.

  1. Umesh

    Umesh Guest

    This is a basic thing.
    Say A=0100 0001 in ASCII which deals with 256 characters(you know
    better than me!)
    But we deal with only four characters and 2 bits are enough to encode
    them. I want to confirm if we can encode A in 2bits(say 00), B in 2
    bits (01), C in 2 bits(10) and D in 2 bits by some program. I only use
    this four alphabet in my work. Can u pl write a sample program to reach
    my goal?
    Umesh, Jan 5, 2007
    #1

  2. Umesh

    jacob navia Guest

    Umesh wrote:
    > This is a basic thing.
    > Say A=0100 0001 in ASCII which deals with 256 characters(you know
    > better than me!)
    > But we deal with only four characters and 2 bits are enough to encode
    > them. I want to confirm if we can encode A in 2bits(say 00), B in 2
    > bits (01), C in 2 bits(10) and D in 2 bits by some program. I only use
    > this four alphabet in my work. Can u pl write a sample program to reach
    > my goal?
    >


    Dear customer

    We are ready to fulfill your request, and we thank you for the
    confidence you give us by placing your order.

    Please buy products for US$ 200 at our website. When your VISA
    card payments are accepted we will gladly send you the requested
    program.

    Yours sincerely

    J.K. OB

    Customer support.

    P.S.
    Our website address:
    www.DoMyHomework.com
    jacob navia, Jan 5, 2007
    #2

  3. Umesh

    Ondra Holub Guest

    Umesh wrote:
    > This is a basic thing.
    > Say A=0100 0001 in ASCII which deals with 256 characters(you know
    > better than me!)
    > But we deal with only four characters and 2 bits are enough to encode
    > them. I want to confirm if we can encode A in 2bits(say 00), B in 2
    > bits (01), C in 2 bits(10) and D in 2 bits by some program. I only use
    > this four alphabet in my work. Can u pl write a sample program to reach
    > my goal?


    Yes, you can encode it this way, but it would be awkward to work
    with. For example, accessing the 6th character of such an array would
    look something like this (I did not test the following code):

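    char GetNthChar(const char a[], size_t index)
    {
    /* Bit offset of the wanted 2-bit value inside its byte: 0, 2, 4 or 6. */
    const size_t im4 = index % 4 * 2;
    /* Isolate the two bits, then shift them down to the low end. */
    return (a[index / 4] & (3 << im4)) >> im4;
    }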

    However, it may be useful to store your data in such a format. So I would
    recommend encoding it before storing, decoding it after loading, and working
    with an ordinary char array containing only the values 0, 1, 2 and 3.

    If you really need to save a couple of bytes, you should write a
    wrapper class that overloads operator[] and hides all these bit
    operations. It would be something like std::vector<bool>, but for 2-bit
    values instead of 1-bit values.
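
    In plain C, the same hiding of bit operations can be sketched with a pair
    of accessor functions instead of a class. The names get2 and set2 are
    hypothetical, and the layout (four values per byte, value 0 in the low
    bits) is an assumption that matches the GetNthChar example above.

```c
#include <assert.h>
#include <stddef.h>

/* Read the 2-bit value stored at position `index` of a packed array.
   Four values share one byte; the first of them sits in the low bits. */
unsigned get2(const unsigned char *packed, size_t index)
{
    unsigned shift = (unsigned)(index % 4) * 2;

    return (packed[index / 4] >> shift) & 3u;
}

/* Store `code` (0..3) at position `index`, leaving its neighbours untouched. */
void set2(unsigned char *packed, size_t index, unsigned code)
{
    unsigned shift = (unsigned)(index % 4) * 2;
    unsigned char *byte = &packed[index / 4];

    assert(code < 4);
    /* Clear the two target bits, then OR the new code into place. */
    *byte = (unsigned char)((*byte & ~(3u << shift)) | (code << shift));
}
```

    With these two functions the rest of a program can treat the packed
    buffer as an array of small integers and never touch a shift or a mask.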
    Ondra Holub, Jan 5, 2007
    #3
  4. Umesh

    osmium Guest

    "Umesh" wrote:

    > Say A=0100 0001 in ASCII which deals with 256 characters(you know
    > better than me!)
    > But we deal with only four characters and 2 bits are enough to encode
    > them. I want to confirm if we can encode A in 2bits(say 00), B in 2
    > bits (01), C in 2 bits(10) and D in 2 bits by some program. I only use
    > this four alphabet in my work. Can u pl write a sample program to reach
    > my goal?


    The switch statement *may* be germane to your problem.

    BTW, ASCII does not deal with 256 characters, ASCII consists of only 128
    characters. Terminology is a bitch since it is often used improperly by the
    very people who in fact know better. It's "quicker" that way. :-(
    osmium, Jan 5, 2007
    #4
  5. Umesh

    Umesh Guest

    Suppose that I define an array of 2 bits {00,01,10,11} . Now when the
    program finds A in the text file it replaces with 00, B with 01, C with
    10 and D with 11. So the encoded file will take 1/8 of the space of the
    original file.

    During decoding I'll replace 00 by A, 01 by B, 10 by C and 11 by D to
    regain the original file.

    I'm an inexperienced programmer. Pl help.
    Umesh, Jan 5, 2007
    #5
  6. Umesh

    Lew Pitcher Guest

    Umesh wrote:
    > Suppose that I define an array of 2 bits {00,01,10,11} . Now when the
    > program finds A in the text file it replaces with 00, B with 01, C with
    > 10 and D with 11. So the encoded file will take 1/8 of the space of the
    > original file.
    >
    > During decoding I'll replace 00 by A, 01 by B, 10 by C and 11 by D to
    > regain the original file.
    >
    > I'm an inexperienced programmer. Pl help.


    Your question doesn't really belong in comp.lang.c
    You aren't asking about a C feature or problem, you are asking for
    someone to teach you the rudiments of the skill of writing programs.

    Write the code to do this:

    open the input file
    open the output file
    for each character in the input file
    if the character is 'A'
    write binary 00 to the output file
    else if the character is 'B'
    write binary 01 to the output file
    else if the character is 'C'
    write binary 10 to the output file
    else if the character is 'D'
    write binary 11 to the output file
    else
    do nothing
    end-if
    end-if
    end-if
    end-if
    end-for-loop
    close the output file
    close the input file
    terminate the program
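
    The loop in the pseudocode above could be sketched in C roughly as
    follows. This is only one possible reading: the function name
    encode_stream is hypothetical, and two details the pseudocode leaves open
    are assumptions here, namely that four codes are collected into each
    output byte with the first letter in the most significant bits, and that
    a partly filled last byte is padded with zero bits.

```c
#include <stdio.h>

/* Pack each 'A'..'D' read from `in` into 2 bits and write the packed bytes
   to `out`. Any other character is skipped ("do nothing").
   Returns the number of letters that were encoded. */
long encode_stream(FILE *in, FILE *out)
{
    unsigned byte = 0;   /* codes collected so far for the current byte */
    int used = 0;        /* number of 2-bit slots already filled (0..3) */
    long count = 0;
    int ch;

    while ((ch = getc(in)) != EOF) {
        unsigned code;

        switch (ch) {
        case 'A': code = 0; break;
        case 'B': code = 1; break;
        case 'C': code = 2; break;
        case 'D': code = 3; break;
        default:  continue;          /* not in the alphabet: skip it */
        }
        byte = (byte << 2) | code;   /* append the code below earlier ones */
        count++;
        if (++used == 4) {           /* byte is full: write it out */
            putc((int)byte, out);
            byte = 0;
            used = 0;
        }
    }
    if (used > 0)                    /* partial last byte, zero-padded */
        putc((int)(byte << (2 * (4 - used))), out);
    return count;
}
```

    The output file should be opened in binary mode ("wb"). Because the
    padding bits are indistinguishable from the code for 'A', a decoder needs
    the returned letter count (stored somewhere alongside the data) to know
    where the real data ends.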

    HTH
    --
    Lew
    Lew Pitcher, Jan 5, 2007
    #6
  7. Umesh said:

    > This is a basic thing.
    > Say A=0100 0001 in ASCII which deals with 256 characters


    128, actually, but I know what you mean. Certainly 8 bits are sufficient to
    encode 256 characters, which is what you actually care about.

    > But we deal with only four characters and 2 bits are enough to encode
    > them. I want to confirm if we can encode A in 2bits(say 00), B in 2
    > bits (01), C in 2 bits(10) and D in 2 bits by some program. I only use
    > this four alphabet in my work. Can u pl write a sample program to reach
    > my goal?


    Here's some code to split a byte into four:

    void decode(char *letter, int ch)
    {
    const char alphabet[] = "ABCD";
    int mask = 0x11;

    for(i = 0; i < 4; i++)
    {
    letter = alphabet[(ch & mask) >> (i * 2)];
    mask <<= 1;
    }
    }

    letter must point to the first element in an array of at least four chars.
    Note that decode() does not build a string. If you want a string, deal with
    the null terminator yourself.

    If you are decoding, say, 0xAD, this is 10101101 in binary, and at the end
    of the decoding process letter[0] will store 'C', letter[1] will store 'C',
    letter[2] will store 'D', and letter[3] will store 'B'.

    Encoding is quite easy too. Simply reverse the process. For encoding,
    though, you may find it convenient to have an alphabet array of UCHAR_MAX +
    1 bytes, all of which have the value 0, but set alphabet['A'] to 1,
    alphabet['B'] to 2, alphabet['C'] to 3, and alphabet['D'] to 4. Then you
    can say: if(alphabet[letter] == 0) { error - invalid code } else { your
    OR-mask is alphabet[letter] - 1 so you can OR it into your encoding, and
    then shift left ready for the next mask }
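
    The table-driven encoding described in the paragraph above might be
    sketched like this. The function name encode4 and its return convention
    are hypothetical. Shifting the partial result left before each OR is one
    reading of "shift left ready for the next mask", and it puts the first
    letter in the most significant bits.

```c
#include <limits.h>
#include <stddef.h>

/* Pack four letters from "ABCD" into one byte, first letter in the high bits.
   Returns 0 on success, or -1 if any letter is outside the alphabet. */
int encode4(const char letters[4], unsigned char *out)
{
    /* Every entry starts at 0 ("invalid"); valid letters hold code + 1. */
    static unsigned char alphabet[UCHAR_MAX + 1];
    unsigned byte = 0;
    size_t i;

    alphabet['A'] = 1;
    alphabet['B'] = 2;
    alphabet['C'] = 3;
    alphabet['D'] = 4;

    for (i = 0; i < 4; i++) {
        unsigned entry = alphabet[(unsigned char)letters[i]];

        if (entry == 0)
            return -1;                     /* invalid letter */
        byte = (byte << 2) | (entry - 1);  /* make room, then OR the code in */
    }
    *out = (unsigned char)byte;
    return 0;
}
```

    Under these assumptions the letters C, C, D, B pack to 0xAD, the byte
    used in the decoding example above.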

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, Jan 5, 2007
    #7
  8. Umesh said:

    > Suppose that I define an array of 2 bits {00,01,10,11} . Now when the
    > program finds A in the text file it replaces with 00, B with 01, C with
    > 10 and D with 11. So the encoded file will take 1/8 of the space of the
    > original file.


    A quarter.
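
    The arithmetic: each letter shrinks from 8 bits to 2, so four letters fit
    in one 8-bit byte. A small helper makes the size explicit; packed_size is
    a hypothetical name, and 8-bit bytes are assumed.

```c
#include <stddef.h>

/* Bytes needed to hold n 2-bit codes, assuming 8-bit bytes.
   A final, partly filled byte still occupies a whole byte. */
size_t packed_size(size_t n)
{
    return n / 4 + (n % 4 != 0);
}
```

    For example, a 1000-letter file packs into 250 bytes, and a 5-letter
    file needs 2.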

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, Jan 5, 2007
    #8
  9. Umesh

    Umesh Guest


    > Your question doesn't really belong in comp.lang.c
    > You aren't asking about a C feature or problem, you are asking for
    > someone to teach you the rudiments of the skill of writing programs.
    >
    > Write the code to do this:
    >
    > open the input file
    > open the output file
    > for each character in the input file
    > if the character is 'A'
    > write binary 00 to the output file
    > else if the character is 'B'
    > write binary 01 to the output file
    > else if the character is 'C'
    > write binary 10 to the output file
    > else if the character is 'D'
    > write binary 11 to the output file
    > else
    > do nothing
    > end-if
    > end-if
    > end-if
    > end-if
    > end-for-loop
    > close the output file
    > close the input file
    > terminate the program
    >
    > HTH
    > --
    > Lew


    Dear Lew,
    I'm not asking you to put down the algorithm which I already did. I
    want you to write a part of the program. I won't ask you to do that
    once I learn C well. Because I've only learnt to make algorithms and
    determine their time complexity, I need your help.

    I heard that it is easier to implement programs than to devise an effective
    algorithm, which I have already done. Actually my original algorithm is far
    more complex than this, but I want to start from the simple one. Because I
    have no one to teach me, I think a little help sometimes comes in handy.
    That's why I'm here. I hope that expert folks like you won't upset me. Thank
    you. I look forward to hearing from you again. God bless you.
    Umesh, Jan 5, 2007
    #9
  10. Umesh

    jacob navia Guest

    Umesh wrote:
    >>Your question doesn't really belong in comp.lang.c
    >>You aren't asking about a C feature or problem, you are asking for
    >>someone to teach you the rudiments of the skill of writing programs.
    >>
    >>Write the code to do this:
    >>
    >> open the input file
    >> open the output file
    >> for each character in the input file
    >> if the character is 'A'
    >> write binary 00 to the output file
    >> else if the character is 'B'
    >> write binary 01 to the output file
    >> else if the character is 'C'
    >> write binary 10 to the output file
    >> else if the character is 'D'
    >> write binary 11 to the output file
    >> else
    >> do nothing
    >> end-if
    >> end-if
    >> end-if
    >> end-if
    >> end-for-loop
    >> close the output file
    >> close the input file
    >> terminate the program
    >>
    >>HTH
    >>--
    >>Lew

    >
    >
    > Dear Lew,
    > I'm not asking you to put down the algorithm which I already did. I
    > want you to write a part of the program. I won't ask you to do that
    > once I learn C well. Because I've only learnt to make algorithms and
    > determine their time complexity, I need your help.
    >


    Lew is right.

    You will NOT learn until you practice. And you will not practice
    if somebody else does the work for you.

    And besides, why should we work for you for free?

    You must learn the basics first. Buy the book by Kernighan and
    Ritchie, study it, and then ask questions. I learned C that way,
    and there wasn't anybody else there to ask questions. It is
    perfectly doable if you WORK.

    Or pay for a class in computer programming. That is possible too.

    But we can't replace a teacher or a book, and we will not work
    for you for free.
    jacob navia, Jan 5, 2007
    #10
  11. Umesh

    Lew Pitcher Guest

    Umesh wrote:
    [snip]
    > I'm not asking you to put down the algorithm which I already did. I
    > want you to write a part of the program.


    Sorry, but no.

    But, I'll make it simpler for you

    First off, just write the code that frames your program. This is the
    minimum code; just the startup and termination. Something like

    #include <stdlib.h>
    int main(void)
    {
    return EXIT_SUCCESS;
    }

    Compile and run this code, making changes until it works. This
    shouldn't take very long, as this rudimentary code is almost foolproof.

    Next, add in the file open and close functions and retest

    Next, add in the input read logic, and retest

    Next, add in the logic to choose between 'A', 'B', 'C', and 'D', and
    retest

    Next, add in the logic to feed bit pairs to the output (this doesn't
    /yet/ have to write the pairs to the output), and retest

    Next, add in the logic to write out the bit pairs to the output, and
    retest

    Now, you are done


    [snip]
    Lew Pitcher, Jan 5, 2007
    #11
  12. Umesh

    Guest

    Richard Heathfield wrote:
    > Umesh said:
    >
    > > This is a basic thing.
    > > Say A=0100 0001 in ASCII which deals with 256 characters

    >
    > 128, actually, but I know what you mean. Certainly 8 bits are sufficient to
    > encode 256 characters, which is what you actually care about.
    >
    > > But we deal with only four characters and 2 bits are enough to encode
    > > them. I want to confirm if we can encode A in 2bits(say 00), B in 2
    > > bits (01), C in 2 bits(10) and D in 2 bits by some program. I only use
    > > this four alphabet in my work. Can u pl write a sample program to reach
    > > my goal?

    >
    > Here's some code to split a byte into four:
    >
    > void decode(char *letter, int ch)
    > {
    > const char alphabet[] = "ABCD";
    > int mask = 0x11;
    >
    > for(i = 0; i < 4; i++)
    > {
    > letter = alphabet[(ch & mask) >> (i * 2)];
    > mask <<= 1;
    > }
    > }


    mask=0x3;
    Well i am not sure but i think and
    mask <<=2, and then you have to reverse letter
    or, if we know the number of digits,
    construct the mask first (0x110000)
    and then shift ch&mask by the appropriate amount.

    >
    > letter must point to the first element in an array of at least four chars.
    > Note that decode() does not build a string. If you want a string, deal with
    > the null terminator yourself.
    >
    > If you are decoding, say, 0xAD, this is 10101101 in binary, and at the end
    > of the decoding process letter[0] will store 'C', letter[1] will store 'C',
    > letter[2] will store 'D', and letter[3] will store 'B'.
    >

    It is giving strange stuff

    > Encoding is quite easy too. Simply reverse the process. For encoding,
    > though, you may find it convenient to have an alphabet array of UCHAR_MAX +
    > 1 bytes, all of which have the value 0, but set alphabet['A'] to 1,
    > alphabet['B'] to 2, alphabet['C'] to 3, and alphabet['D'] to 4. Then you
    > can say: if(alphabet[letter] == 0) { error - invalid code } else { your
    > OR-mask is alphabet[letter] - 1 so you can OR it into your encoding, and
    > then shift left ready for the next mask }
    >
    > --
    > Richard Heathfield
    > "Usenet is a strange place" - dmr 29/7/1999
    > http://www.cpax.org.uk
    > email: rjh at the above domain, - www.
    , Jan 5, 2007
    #12
  13. said:

    > Richard Heathfield wrote:


    <snip>

    >> Here's some code to split a byte into four:
    >>
    >> void decode(char *letter, int ch)
    >> {
    >> const char alphabet[] = "ABCD";
    >> int mask = 0x11;
    >>
    >> for(i = 0; i < 4; i++)
    >> {
    >> letter = alphabet[(ch & mask) >> (i * 2)];
    >> mask <<= 1;
    >> }
    >> }

    >
    > mask=0x3;


    oops

    > Well i am not sure but i think and
    > mask <<=2,


    oops squared

    Let's just pretend that article didn't happen, shall we? :-(
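
    For reference, a version of decode() with the two corrections noted in
    this exchange applied (a two-bit mask of 0x3, and a step of two bits per
    letter) might look like the sketch below. It also declares i and stores
    into letter[i]. Reading the most significant pair first is an assumption,
    chosen so that 0xAD gives the C, C, D, B order stated in the original
    article without reversing the output.

```c
/* Split one packed byte into four letters, most significant pair first.
   `letter` must point to at least four chars; no terminator is written. */
void decode(char *letter, int ch)
{
    const char alphabet[] = "ABCD";
    int i;

    for (i = 0; i < 4; i++) {
        int shift = 6 - i * 2;   /* bit offsets 6, 4, 2, 0 */

        /* Move the wanted pair to the low end, then keep just two bits. */
        letter[i] = alphabet[((unsigned)ch >> shift) & 0x3u];
    }
}
```

    As in the original, the caller supplies the null terminator if a string
    is wanted.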

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, Jan 5, 2007
    #13
  14. Umesh

    Jim Langston Guest

    "Umesh" <> wrote in message
    news:...
    > This is a basic thing.
    > Say A=0100 0001 in ASCII which deals with 256 characters(you know
    > better than me!)
    > But we deal with only four characters and 2 bits are enough to encode
    > them. I want to confirm if we can encode A in 2bits(say 00), B in 2
    > bits (01), C in 2 bits(10) and D in 2 bits by some program. I only use
    > this four alphabet in my work. Can u pl write a sample program to reach
    > my goal?


    Short answer, yes you can.

    Long answer, it's probably more pain than it's worth.

    The problem is that modern computers usually use 8-bit bytes. A byte is the
    smallest unit the computer will handle as one unit, and it is usually 8 bits
    (some systems use more, but in C and C++ it is never fewer, since CHAR_BIT
    is at least 8). So, if you defined each value as 2 bits (A, B, C, D or
    0, 1, 2, 3 or whatever), it would still need to be stored in a byte. With an
    8-bit byte you could store 4 of these in each byte. But, since most computers
    deal with a minimum of 8 bits at a time, it would be up to you to extract the
    bits yourself.

    This means you couldn't use some simple define or such, you'd need to make a
    class and things get complicated from there.

    Unless you are storing a whole lot of these so size becomes an issue, it is
    easier to just waste the extra 6 bits of the byte and store each value in a
    byte so you can use std::vector and such more easy.
    Jim Langston, Jan 5, 2007
    #14
  15. Jim Langston said:

    <snip>
    >
    > Unless you are storing a whole lot of these so size becomes an issue, it
    > is easier to just waste the extra 6 bits of the byte and store each value
    > in a byte so you can use std::vector and such more easy.


    He can't use std::vector because there's no such thing. Except, of course,
    that you know perfectly well that there *is* such a thing. But I know
    you're wrong - and you know I'm wrong - and that's the trouble with
    cross-posting.

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at the above domain, - www.
    Richard Heathfield, Jan 5, 2007
    #15
  16. Umesh

    Jim Langston Guest

    "Richard Heathfield" <> wrote in message
    news:...
    > Jim Langston said:
    >
    > <snip>
    >>
    >> Unless you are storing a whole lot of these so size becomes an issue, it
    >> is easier to just waste the extra 6 bits of the byte and store each value
    >> in a byte so you can use std::vector and such more easy.

    >
    > He can't use std::vector because there's no such thing. Except, of course,
    > that you know perfectly well that there *is* such a thing. But I know
    > you're wrong - and you know I'm wrong - and that's the trouble with
    > cross-posting.


    Oh, my bad. I missed the fact he cross posted. I even looked and made sure
    I was replying to comp.lang.c++ before I gave a c++ answer. Yes, cross
    posting is evil.
    Jim Langston, Jan 5, 2007
    #16
  17. In article <>,
    Richard Heathfield <> wrote:
    >Umesh said:
    >
    >> Suppose that I define an array of 2 bits {00,01,10,11} . Now when the
    >> program finds A in the text file it replaces with 00, B with 01, C with
    >> 10 and D with 11. So the encoded file will take 1/8 of the space of the
    >> original file.

    >
    >A quarter.


    True. But he's right, too. It does take an eighth of the space of the
    original.

    And then another eighth...
    Kenny McCormack, Jan 6, 2007
    #17
  18. In article <fCynh.59$>,
    Jim Langston <> wrote:
    >"Richard Heathfield" <> wrote in message
    >news:...
    >> Jim Langston said:
    >>
    >> <snip>
    >>>
    >>> Unless you are storing a whole lot of these so size becomes an issue, it
    >>> is easier to just waste the extra 6 bits of the byte and store each value
    >>> in a byte so you can use std::vector and such more easy.

    >>
    >> He can't use std::vector because there's no such thing. Except, of course,
    >> that you know perfectly well that there *is* such a thing. But I know
    >> you're wrong - and you know I'm wrong - and that's the trouble with
    >> cross-posting.

    >
    >Oh, my bad. I missed the fact he cross posted. I even looked and made sure
    >I was replying to comp.lang.c++ before I gave a c++ answer. Yes, cross
    >posting is evil.


    You know, it is interesting. "Everybody knows" that (i.e., the
    conventional wisdom is that) cross-posting is better than multi-posting,
    but I have often argued that that bit of CW 'taint so. For, among other
    reasons, this one. Cross-posting assumes that answers are correct (or,
    more precisely, that answers can be evaluated) regardless of which forum
    they are posted in. A perfectly reasonable layman's position, to be
    sure, but, as we see here, not good enough for us experts.

    Whereas, if the newbie multi-posts (which is his natural inclination,
    given that many [most?] of the commonly available newbie tools - i.e.,
    Google and Microsoft - make proper cross-posting difficult), he could then
    follow the responses independently in each forum and deal accordingly.

    P.S. Specific example. Every once in a while, somebody will post some
    sort of Unix-y/C-y question, cross-posting it to a dozen or so Unix-y/C-y
    groups, including clc, and the clc pedants will do their usual:

    Off topic. Not portable. Cant discuss it here. Blah, blah, blah.

    routine, posting that bit of valuable information to, of course, all
    dozen or so groups - after which the post degenerates into the usual clc
    bickering about topicality, all the while being posted to all dozen or
    so groups. This despite the fact that the post *was* topical in
    probably all but one (or maybe two, if clc++ was included and the
    participants there are as anal as the clc guys) of those groups.

    And the point, of course, is that simple multi-posting would have
    avoided this mess.
    Kenny McCormack, Jan 6, 2007
    #18
  19. Umesh

    Cesar Rabak Guest

    Kenny McCormack wrote:
    > In article <fCynh.59$>,

    [snipped]

    > You know, it is interesting. "Everybody knows" that (i.e., the
    > conventional wisdom is that) cross-posting is better than multi-posting,
    > but I have often argued that that bit of CW 'taint so. For, among other
    > reasons, this one. Cross-posting assumes that answers are correct (or,
    > more precisely, that answers can be evaluated) regardless of which forum
    > they are posted in. A perfectly reasonable layman's position, to be
    > sure, but, as we see here, not good enough for us experts.
    >
    > Whereas, if the newbie multi-posts (which is his natural inclination,
    > given that many [most?] of the commonly available newbie tools - i.e.,
    > Google and Microsoft - make proper cross-posting difficult), he could then
    > follow the responses independently in each forum and deal accordingly.
    >
    > P.S. Specific example. Every once in a while, somebody will post some
    > sort of Unix-y/C-y question, cross-posting it to a dozen or so Unix-y/C-y
    > groups, including clc, and the clc pedants will do their usual:
    >
    > Off topic. Not portable. Cant discuss it here. Blah, blah, blah.
    >
    > routine, posting that bit of valuable information to, of course, all
    > dozen or so groups - after which the post degenerates into the usual clc
    > bickering about topicality, all the while being posted to all dozen or
    > so groups. This despite the fact that the post *was* topical in
    > probably all but one (or maybe two, if clc++ was included and the
    > participants there are as anal as the clc guys) of those groups.
    >
    > And the point, of course, is that simple multi-posting would have
    > avoided this mess.
    >

    Perhaps the point is that the 'experts' and zealots who are so quick
    to post about non-topicality could change their behaviour: either
    not post at all (replacing the 'off topic' reply with silence) or first
    look at the header of the message and see whether it is cross-posted.
    Cesar Rabak, Jan 6, 2007
    #19
  20. Umesh

    CBFalconer Guest

    Cesar Rabak wrote:
    > Kenny McCormack wrote:
    >

    .... snip ...
    >>
    >> And the point, of course, is that simple multi-posting would have
    >> avoided this mess.

    >
    > Perhaps the point is that the 'experts' and zealots who are so quick
    > to post about non-topicality could change their behaviour: either
    > not post at all (replacing the 'off topic' reply with silence) or
    > first look at the header of the message and see whether it is
    > cross-posted.


    McCormack is a troll, and should always be ignored. In addition
    the proper answer is to set follow-ups to a single group when
    cross-posting, which will avoid the ever growing babble. Google
    could do this automatically, by insisting that a cross-posted
    article have a follow-up set.

    --
    Chuck F (cbfalconer at maineline dot net)
    Available for consulting/temporary embedded and systems.
    <http://cbfalconer.home.att.net>
    CBFalconer, Jan 6, 2007
    #20