Representing a Sequence of Shorts in Bits

Hahnemann · Jun 25, 2008

Is there a sample in C out there to represent a sequence of numbers
(shorts) using bits? With the intention of saving space, instead of
using the full 16 bits per short. Thanks.

- Hahnemann

viza · Jun 26, 2008

Hi

Is there a sample in C out there to represent a sequence of numbers
(shorts) using bits? With the intention of saving space, instead of
using the full 16 bits per short. Thanks.

So, if the short is 0, you set the bit to 0.
If the short is 1 you set the bit to 1.
If the short is 2... do you see the problem yet?
The bit can only contain 0 or 1.

If you have a large number of shorts, and there is significant
correlation between them, ie: many of them have the same value or they
follow some pattern, then you can compress the data and save some space
at the expense of time, but even then in most cases you will do well to
get 16:1.

Try http://google.com/search?q=compression+algorithms

HTH

Ben Bacarisse · Jun 26, 2008

viza said:
So, if the short is 0, you set the bit to 0.
If the short is 1 you set the bit to 1.
If the short is 2... do you see the problem yet?
The bit can only contain 0 or 1.

I think the OP is looking represent a set of smallish integers as a
bit set. The term sequence is confusing but the OP had written "set"
everywhere where they wrote "sequence" I think you'd have offered a
different answer.

To the OP: search for "bit set in C" or something like that.

Walter Roberson · Jun 26, 2008

I think the OP is looking represent a set of smallish integers as a
bit set. The term sequence is confusing but the OP had written "set"
everywhere where they wrote "sequence" I think you'd have offered a
different answer.

My interpretation was that the original poster was looking for one of
the various "word-oriented" compression techniques (that is, each
stored item has distinct boundaries, in contrast to (e.g.) arithmetic
encoding where a stream of bits has to be decoded in order to discern
the several numbers that were encoded into the stream) -- though I
didn't figure out if they were interested in fixed-length encodings or
variable-length encodings.

There are, of course, a number of different implementations of chunks
of storage being treated as arrays of a fixed number of bits,
and perhaps that is what the original poster is looking for ?

rahul · Jun 26, 2008

Is there a sample in C out there to represent a sequence of numbers
(shorts) using bits? With the intention of saving space, instead of
using the full 16 bits per short. Thanks.

- Hahnemann

You can save some space by using bitwise operators to store and
retrieve the values but remember, you can not actually read and write
in terms of bits ( not even in assembly/machine language). If you are
writing 1, even though it can be represented using 1 bit, you will
have to use 1 byte.

I am not aware how a typical bit set works but that too is not going
to very helpful in saving space.

forkazoo · Jun 26, 2008

I am not aware how a typical bit set works but that too is not going
to very helpful in saving space.

If the platform provides a 16 bit short, and the range of values the
OP is interested only range between, for example, 0-511, then you
could pack each number into 9 bits. This would be nearly half the
storage space.

Obviously, the utility of such a technique depends on the situation.

Hahnemann · Jun 26, 2008

I read the following from a book called "Compression Algorithms for
Real Programmers":

"...Programmers have long known that it is silly to use double
precision values or long integers when regular floating point numbers
or integers will serve the purpose... David Kopf takes this one step
further with an algorithm that carefully adjusts the number of bits in
a sequence. If this is too many bits, then it reduces the size by i
bits for the next value. If it is too few, then it increases the size
by j bits and repeats the value. In some cases, the algorithm uses a
fixed i and j, and, in others, it uses a preset number".

I am wondering how this works in actual C code. I Googled that
algorithm but I can't find it. Does this make sense? Maybe I'm
getting this "packing of bits" idea incorrectly.

- Hahnemann

Dan · Jun 27, 2008

Hahnemann said:
Is there a sample in C out there to represent a sequence of numbers
(shorts) using bits? With the intention of saving space, instead of
using the full 16 bits per short. Thanks.

Yes, you can use the bitwise operations (& | >> <<)
For example 4 values of range 0-15 can fit into a 16bit short:

unsigned short v = 0;
unsigned short val[4] = {3,8,15,6};
//to pack them:
v = (val[0]<<12) | (val[1]<<8) | (val[2] << 4) | (val[3]);

//and to get the values back:
val[0] = (v >> 12);
val[1] = (v >> 8) & 0x0F;
val[2] = (v>>4) & 0x0F;
val[3] = v & 0x0F;

vippstar · Jun 27, 2008

Is there a sample in C out there to represent a sequence of numbers
(shorts) using bits? With the intention of saving space, instead of
using the full 16 bits per short. Thanks.

Click to expand...

Yes, you can use the bitwise operations (& | >> <<)
For example 4 values of range 0-15 can fit into a 16bit short:

unsigned short v = 0;
unsigned short val[4] = {3,8,15,6};
//to pack them:
v = (val[0]<<12) | (val[1]<<8) | (val[2] << 4) | (val[3]);

//and to get the values back:
val[0] = (v >> 12);
val[1] = (v >> 8) & 0x0F;
val[2] = (v>>4) & 0x0F;
val[3] = v & 0x0F;

You somehow assume that 'v' is not/can not result in a trap
representation.
It's possible to write a generic solution such as
void pack(void *dest, size_t bitsize, size_t size, size_t nmemb, const
void *src);
Your example would be equivalent to:
/* ... */
unsigned sort v[4] = { 3, 8, 15, 6 };
unsigned char s[2];
pack(s, 4, sizeof *v, sizeof v / sizeof *v, v);

vippstar · Jun 27, 2008

Yes, you can use the bitwise operations (& | >> <<)
For example 4 values of range 0-15 can fit into a 16bit short:

Click to expand...

unsigned short v = 0;
unsigned short val[4] = {3,8,15,6};
//to pack them:
v = (val[0]<<12) | (val[1]<<8) | (val[2] << 4) | (val[3]);

Click to expand...

//and to get the values back:
val[0] = (v >> 12);
val[1] = (v >> 8) & 0x0F;
val[2] = (v>>4) & 0x0F;
val[3] = v & 0x0F;

Click to expand...

You somehow assume that 'v' is not/can not result in a trap
representation.
It's possible to write a generic solution such as
void pack(void *dest, size_t bitsize, size_t size, size_t nmemb, const
void *src);
Your example would be equivalent to:
/* ... */
unsigned sort v[4] = { 3, 8, 15, 6 };
unsigned char s[2];
pack(s, 4, sizeof *v, sizeof v / sizeof *v, v);

Sorry I just realized that it's not possible to write such function,
because unsigned short might have padding bits (inside the pack()
function 'v' would be treated as a pointer to unsigned char)
You colud restrict yourself to CHAR_BIT bits to pack, in which case
it's possible to write pack().

Data saving in condition of changing reality	0	Apr 29, 2022
double precise enough for unsigned short?	11	May 31, 2010
representing a sequence of bits in Java?	5	Mar 10, 2008
print bits of unsigned value	7	Jun 15, 2010
Representing a hierarchical file system	10	Aug 15, 2011
printing bits ... the right way	2	Apr 1, 2010
Finding a bit sequence in another	11	Oct 22, 2009
So I have (a sketch of) a universal system...	3	Sep 2, 2022

Representing a Sequence of Shorts in Bits

Hahnemann

viza

Ben Bacarisse

Walter Roberson

rahul

forkazoo

Hahnemann

Dan

vippstar

vippstar

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads