Memory "plot" and bit packing question

Taylor Howell

Hello all,

I have a dilemma. I have eight 5-bit numbers that I need to pack into one (or
more) variables. They must then be written to a file as exactly 40 bits
(5 bytes), and it must be possible to read them back into memory and get the
5-bit numbers out again. As I am not by any means proficient in C++ (still
learning), I don't know how to go about doing this correctly. The way I was
planning to do it is as follows:

1. Create a character array of 5 chars.
2. Create a long long pointer to the array.
3. Using shift operators, pack my numbers into the array.
4. Write the character data to file. This will lop off any extra
data from the long long.
5. Read the data back into a different character array.
6. Move the pointer to the new array.
7. Use bitwise operators to transfer five bits to a new int.
8. Shift the pointer and do 7 again until done.

.... Please don't laugh ... :)

When step 4 is done I read the file with a hex editor and all I see is
garbage. It's not the bits I'm expecting...

Anyway, if you know of any better way of doing this please point me in the
right direction. I don't want the answer though. I want to learn not just
cut and paste, and any help would be greatly appreciated!

Thanks
Taylor Howell
 
Mark P

Are you trying to interpret the array of characters as a long? In
general this is a bad idea since you don't know how the bytes of a long
are ordered (look up big-endian and little-endian). Is the ones bit
part of the first byte or the last byte? It's platform dependent in
general.

If you're making an array of chars you should be dealing with chars
only. This may mean a little more work in the case that a 5 bit number
spans the boundary between two chars, but with bit manipulation this is
manageable.

Or you could use std::bitset if that's permissible.
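
For what it's worth, the chars-only approach could look something like the sketch below. The names pack5 and unpack5 and the bit layout (value i lives in bits 5*i to 5*i+4, counting from the low bit of the first byte) are assumptions made for illustration, not anything from the original post, and it assumes 8-bit chars.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers. Assumed layout: value i occupies bits 5*i .. 5*i+4
// of the 40-bit stream, where bit 0 is the low bit of out[0].
void pack5(const unsigned char values[8], unsigned char out[5]) {
    for (std::size_t i = 0; i < 5; ++i) out[i] = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        assert(values[i] < 32);                  // must fit in 5 bits
        std::size_t bit = 5 * i;                 // start position in the stream
        std::size_t byte = bit / 8;
        std::size_t shift = bit % 8;
        unsigned v = static_cast<unsigned>(values[i]) << shift;  // at most 12 bits
        out[byte] |= static_cast<unsigned char>(v & 0xFF);
        if (shift > 3)                           // value straddles two chars
            out[byte + 1] |= static_cast<unsigned char>(v >> 8);
    }
}

void unpack5(const unsigned char in[5], unsigned char values[8]) {
    for (std::size_t i = 0; i < 8; ++i) {
        std::size_t bit = 5 * i;
        std::size_t byte = bit / 8;
        std::size_t shift = bit % 8;
        unsigned v = in[byte];
        if (shift > 3)                           // pull in the spilled high bits
            v |= static_cast<unsigned>(in[byte + 1]) << 8;
        values[i] = static_cast<unsigned char>((v >> shift) & 0x1F);
    }
}
```

Because only individual chars are read and written, the 5 bytes come out the same regardless of the machine's byte order.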
 
Victor Bazarov

Taylor said:
I have a dilemma. I have eight 5-bit numbers that I need to pack into one (or
more) variables. They must then be written to a file as exactly 40 bits
(5 bytes), and it must be possible to read them back into memory and get the
5-bit numbers out again. As I am not by any means proficient in C++ (still
learning), I don't know how to go about doing this correctly. The way I was
planning to do it is as follows:

1. create a character array of 5 chars.
2. create a long long pointer to the array.

There is no such thing as "a long long pointer to the array". In C there
is, but we're not in C.

3. using shift operators, pack my numbers into the array.
4. write the character data to file. this will lop off any extra
data from the long long
5. read the data back into a different character array.
6. move the pointer to the new array.
7. use bitwise operators to transfer five bits to new int
8. shift the pointer and do 7 again until done.

... Please don't laugh ... :)

When step 4 is done I read the file with a hex editor and all I see is
garbage. It's not the bits I'm expecting...

Anyway, if you know of any better way of doing this please point me in the
right direction. I don't want the answer though. I want to learn not just
cut and paste, and any help would be greatly appreciated!

I would try using 'std::bitset' or 'std::vector<bool>'.
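
As a rough sketch of the 'std::bitset' route, something along these lines might do. The function names (pack_bits, get5, to_bytes) and the layout (value i in bits 5*i to 5*i+4) are made up for this example. Since bitset offers no byte-level output, the sketch copies the bits into 5 chars by hand before they would be written to a file.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical helper: store value i in bits 5*i .. 5*i+4 of a 40-bit set.
std::bitset<40> pack_bits(const unsigned char values[8]) {
    std::bitset<40> bits;
    for (std::size_t i = 0; i < 8; ++i) {
        assert(values[i] < 32);                  // must fit in 5 bits
        for (std::size_t b = 0; b < 5; ++b)
            bits[5 * i + b] = ((values[i] >> b) & 1u) != 0;
    }
    return bits;
}

// Hypothetical helper: read the i-th 5-bit value back out.
unsigned char get5(const std::bitset<40>& bits, std::size_t i) {
    unsigned v = 0;
    for (std::size_t b = 0; b < 5; ++b)
        if (bits[5 * i + b]) v |= 1u << b;
    return static_cast<unsigned char>(v);
}

// Copy the 40 bits into 5 bytes, bit 0 of the set going to the low bit
// of out[0]. Assumes 8-bit chars.
void to_bytes(const std::bitset<40>& bits, unsigned char out[5]) {
    for (std::size_t i = 0; i < 5; ++i) {
        unsigned byte = 0;
        for (std::size_t b = 0; b < 8; ++b)
            if (bits[8 * i + b]) byte |= 1u << b;
        out[i] = static_cast<unsigned char>(byte);
    }
}
```

The bit-by-bit loops are slower than shifting whole values, but for 40 bits that hardly matters, and the indexing is easy to follow.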

V
 
Xepo

hmm, I don't really think there's a 'good' way to do this. Working
with bits is a pretty annoying task in C/C++. About the only thing I
have that could help is a few directions to point ya:
* vector<bool> (or a bit_vector) could be used for the container, but
there's no easy way to output them to a file (assuming one bit per
entry);
* In order to access each independent bit of a character, you can use
this technique:
-------------------------------
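struct easy_bit_char {
    union {
        unsigned char c;
        struct {
            // unsigned fields, so each 1-bit member holds 0 or 1
            unsigned char b0 : 1; unsigned char b1 : 1; unsigned char b2 : 1; unsigned char b3 : 1;
            unsigned char b4 : 1; unsigned char b5 : 1; unsigned char b6 : 1; unsigned char b7 : 1;
        } a;
    };
};
--------------------------------
Now you can access each bit of the c member of easy_bit_char using a.b0
through a.b7. For example:
--------------------------------
easy_bit_char v;
v.c = 0;       // start with every bit cleared
v.a.b3 = 1;
// Now v.c == 8, if the compiler allocates bit-fields from the low bit up
v.a.b2 = 1;
// Now v.c == 12 under the same assumption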
--------------------------------
Of course, writing a function to do all of the shifting and stuff for
you makes this kind of unnecessary, but it's one way to do it.

You could also, in the struct, use this type: long long c:40;
Then define b to contain 40 of the :1 ints. Then you can access each
bit of the 40-bit long long. You should probably also divide it into
characters so that you can output it easily. Oh yea, and make
everything unsigned (working with signs can hold some pitfalls when
working on the bit level).

Er, you say you're still learning C++, so here's a quick explanation:
union { }; - This indicates that every variable declared within the
union takes the *same* place in memory. This was used a lot in C, and
is used more in operating systems programming. We're using that so
that you can access each bit of the variables easily.

int a:1; - in a structure, following the member name with a :#
declares a bit-field that occupies # bits. The member can only hold
values that fit in that many bits, and this only works with integral
(and enumeration) types. It's *rarely* used, and is also
something that was carried over from C.

Anyway, good luck. You could probably find a library for bit
manipulation, and writing to files, but it's rare to need, so it might
not be too easy to find.
 
Xepo

Err, as the other people who posted said, use bitset. It'd be much
more platform-independent, and better supported than using all of the
union and :1 stuff. Plus more legible. I missed that class on my
look-over of the standard library.
 
Andrew Phillips

I think the way you were going is spot on, but here are a few tips:

1. Use an array of 8 characters not 5, otherwise manipulating the
"long long" (which I assume is a 64-bit integer for your compiler)
will modify extra bytes past the end of the array which is not good.
2. Always use unsigned types - try "unsigned long long". This may be
why you are not getting what you expect.
3. Make sure you know the byte order for integers on your system -
whether they are big- or little-endian - and write the right 5 bytes
of the array. Many UNIX machines are big-endian in which case you
need to save bytes 3 to 7.
4. If the code *and* the file format need to be portable then you can
test the byte order and react accordingly. E.g. for little-endian write
array elements 0-4, but for big-endian write array elements 3-7 in
reverse order (7 down to 3).
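
To make tips 1 to 4 concrete, here is one way they might be put together. Treat it as a sketch: the function names are invented for the example, it assumes an 8-byte unsigned long long with 8-bit chars, and it picks "value i in bits 5*i to 5*i+4" as the layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: pack eight 5-bit values into the low 40 bits.
unsigned long long pack40(const unsigned values[8]) {
    unsigned long long packed = 0;               // tip 2: unsigned type
    for (std::size_t i = 0; i < 8; ++i) {
        assert(values[i] < 32);                  // must fit in 5 bits
        packed |= static_cast<unsigned long long>(values[i]) << (5 * i);
    }
    return packed;
}

// Tips 3 and 4: find out the byte order at run time.
bool is_little_endian() {
    const unsigned one = 1;
    unsigned char first;
    std::memcpy(&first, &one, 1);                // look at the lowest-addressed byte
    return first == 1;
}

// Copy the 5 meaningful bytes to out, least significant byte first, so
// the file looks the same whichever byte order the machine uses.
void low5_bytes(unsigned long long packed, unsigned char out[5]) {
    static_assert(sizeof(unsigned long long) == 8, "sketch assumes 64 bits");
    unsigned char raw[8];                        // tip 1: 8 bytes, not 5
    std::memcpy(raw, &packed, sizeof raw);
    const bool little = is_little_endian();
    for (std::size_t i = 0; i < 5; ++i)
        out[i] = little ? raw[i] : raw[7 - i];   // elements 0-4, or 7 down to 3
}

// Rebuild the integer from the 5 bytes using shifts only.
unsigned long long from5_bytes(const unsigned char in[5]) {
    unsigned long long packed = 0;
    for (std::size_t i = 0; i < 5; ++i)
        packed |= static_cast<unsigned long long>(in[i]) << (8 * i);
    return packed;
}

// Hypothetical helper: extract the i-th 5-bit value.
unsigned field5(unsigned long long packed, std::size_t i) {
    return static_cast<unsigned>((packed >> (5 * i)) & 0x1F);
}
```

Note that from5_bytes never looks at the byte order: building or splitting the integer with shifts sidesteps the endianness test altogether, and the same trick would work for producing the 5 bytes as well.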

A simpler possibility (but even less portable) is to use bitfields.
However, this relies on your compiler either using a bit-field storage
unit of 64-bit integers (or being able to set it to that) or having a
compiler where bit-fields "straddle" storage units. You will still
need to know the byte order of integers but also need to know the
order that bit-fields are generated by your compiler (top to bottom or
bottom to top).

struct a
{
unsigned int b0: 5;
unsigned int b1: 5;
unsigned int b2: 5;
unsigned int b3: 5;
unsigned int b4: 5;
unsigned int b5: 5;
unsigned int b6: 5;
unsigned int b7: 5;
};

Using this makes it very easy to assign the values. However, most
compilers use a bit-field storage unit of 32-bits (or even 16 bits).
So this will put b0-b5 (30 bits) in the first four bytes (with 2
unused bits), then b6 and b7 in the next storage unit.
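
If you want to see what your own compiler does with such a struct, a small check like the following may help. The struct is the same as the one above under the hypothetical name packed5, and the two functions are only there for illustration. The size you get is implementation-dependent, which is exactly the portability problem.

```cpp
#include <cstddef>

// Same fields as the struct above; the name packed5 is hypothetical.
struct packed5 {
    unsigned int b0 : 5; unsigned int b1 : 5; unsigned int b2 : 5; unsigned int b3 : 5;
    unsigned int b4 : 5; unsigned int b5 : 5; unsigned int b6 : 5; unsigned int b7 : 5;
};

// How many bytes the compiler uses beyond the 5 that 40 bits strictly need.
std::size_t padding_bytes() {
    return sizeof(packed5) - 5;
}

// Store a value in one field and read it back. An unsigned 5-bit field
// keeps only the low 5 bits, so anything above 31 wraps modulo 32.
unsigned store_and_read(unsigned value) {
    packed5 p = {};
    p.b6 = value;
    return p.b6;
}
```

Any nonzero result from padding_bytes() means the struct cannot simply be dumped to the file as exactly 5 bytes.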

Re bitset:
I have never used std::bitset or std::vector<bool> but they are really
designed for manipulating individual bits not 5-bit "mini-integers"
and I also believe they are really intended for use when the numbers
of bits is larger than the biggest supported integer size (apparently
64 bits in your case).

Also, bitset exists only in C++, so code that uses it is less portable
in that sense. Doing direct bit manipulation means the code is also
portable to C.
 
