# unsigned char [6] numerical representation

Discussion in 'C++' started by Mathieu Malaterre, Jan 21, 2005.

1. ### Mathieu MalaterreGuest

Hello,

I have the following problem. I need to convert an unsigned char[6]
array into a string using only digits (0-9) and '.'. The goal is to
store it in the minimal number of bytes.

The first approach is a representation à la IP address:
255.255.255.255.255.255, which takes 6*3+5 = 23 bytes. I can even
get rid of the dots since the length is fixed, so I can go down to 3*6 =
18 bytes.

I realized that this can also be represented as a single number below 256^6 =
281474976710656, which fits in only 15 bytes. Unfortunately this number
is too big for the usual C types. I tried to reimplement the multiplication
with strings, but I gave up quickly. I then tried the union approach:

union { unsigned char s[6]; unsigned int i[2]; } dual;

dual.s = {0, 0, 0, 0, 0, 0};
char s[15];
std::copy( dual.i[0], dual.i[2], s);

but I am not sure how to copy the unsigned int into a string?

Thanks for any advice (in particular regarding the byte ordering).

Thanks
Mathieu

Mathieu Malaterre, Jan 21, 2005

2. ### Victor BazarovGuest

"Mathieu Malaterre" <> wrote...
> I have the following problem. I need to convert an unsigned char[6] array
> into a string using only digits (0-9) and '.'. The goal is to store it
> in the minimal number of bytes.

What does 'unsigned char[6]' represent? Is that a 6-byte (48-bit)
integer number (up to 2^48-1)? Or is that something else? Converting
is a common task, but how should those 6 bytes be interpreted? What limits
are imposed on their values? Or are those limits natural (0..255)?

> The first approach is a representation à la IP address:
> 255.255.255.255.255.255, which takes 6*3+5 = 23 bytes. I can even
> get rid of the dots since the length is fixed, so I can go down to 3*6 = 18
> bytes.
>
> I realized that this can also be represented as a single number below 256^6 =
> 281474976710656, which fits in only 15 bytes. Unfortunately this number is
> too big for the usual C types.

I am not certain what you consider "usual", but nowadays you can have
a 64-bit integer, which should be enough.

> I tried to reimplement the multiplication with strings, but I gave up
> quickly. I then tried the union approach:
>
> union { unsigned char s[6]; unsigned int i[2]; } dual;
>
> dual.s = {0, 0, 0, 0, 0, 0};
> char s[15];
> std::copy( dual.i[0], dual.i[2], s);
>
> but I am not sure how to copy the unsigned int into a string?
>
> Thanks for any advice (in particular regarding the byte ordering).

If you want to treat those six bytes as one big integer (48 bits, no
sign), then you have to perform arithmetic operations to find out those
digits. Essentially, you need to find all the remainders while dividing
the number by 10 repeatedly.

Of course, the simplest thing would be using a compiler-specific 64-bit
integer, for which arithmetic operators are probably already implemented.
Or, find a library with "large integer" support and see what they offer.
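
A minimal sketch of the 64-bit route, assuming the compiler provides `std::uint64_t` in `<cstdint>`, that bytes are 8 bits wide, and that `bytes[0]` is the most significant byte. The name `to_decimal` is made up for illustration; it collects the remainders of repeated division by 10.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical helper: interprets six bytes as one 48-bit unsigned integer
// (bytes[0] is assumed to be the most significant byte) and returns its
// decimal digits as a string.
std::string to_decimal(const unsigned char (&bytes)[6])
{
    std::uint64_t value = 0;
    for (unsigned char b : bytes)
        value = (value << 8) | (b & 0xFFu);   // assumes 8-bit bytes

    std::string digits;
    do {
        digits += static_cast<char>('0' + value % 10);  // next remainder
        value /= 10;
    } while (value != 0);

    // The remainders come out least significant digit first.
    std::reverse(digits.begin(), digits.end());
    return digits;
}
```

Calling `to_decimal` on six bytes that are all 255 should yield the 15-digit string for 2^48-1.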

Victor

Victor Bazarov, Jan 21, 2005

3. ### Ney André de Mello ZuninoGuest

Mathieu Malaterre wrote:

> I have the following problem. I need to convert an unsigned char[6]
> array into a string using only digits (0-9) and '.'. The goal is to
> store it in the minimal number of bytes.
>
> The first approach is a representation à la IP address:
> 255.255.255.255.255.255, which takes 6*3+5 = 23 bytes. I can even
> get rid of the dots since the length is fixed, so I can go down to 3*6 =
> 18 bytes.

[...]

I may have missed your point but, if the goal were to minimize storage
space and have the ability to read back the big number as a string, why
not encapsulate it in a class? The following (non-optimized) code could
give you an idea:

#include <string>
#include <sstream>
#include <iostream>

class IP
{
public:
    IP(const std::string& ipString);
    std::string asString() const;
private:
    unsigned char mBytes[6];
};

// Parses six dot-separated decimal numbers into mBytes.
IP::IP(const std::string& ipString)
{
    std::istringstream iss(ipString);
    std::istringstream snum;
    unsigned int b;
    std::string s;
    for (unsigned int i = 0; i < 6; ++i)
    {
        std::getline(iss, s, '.');
        snum.str(s);
        snum >> b;
        snum.clear();
        mBytes[i] = static_cast<unsigned char>(b);
    }
}

// Formats the six bytes as a dotted decimal string.
std::string IP::asString() const
{
    std::ostringstream oss;
    unsigned int i;
    for (i = 0; i < 5; ++i)
        oss << static_cast<unsigned int>(mBytes[i]) << '.';
    oss << static_cast<unsigned int>(mBytes[i]);  // i == 5: last byte, no dot
    return oss.str();
}

int main()
{
    IP ip("215.210.14.5.116.198");
    std::cout << ip.asString();
}

Best regards,

--

Ney André de Mello Zunino, Jan 21, 2005
4. ### msaltersGuest

Mathieu Malaterre wrote:
> Hello,
>
> I have the following problem. I need to convert an unsigned char[6]
> array into a string using only digits (0-9) and '.'. The goal is to
> store it in the minimal number of bytes.
>
> The first approach is a representation à la IP address:
> 255.255.255.255.255.255, which takes 6*3+5 = 23 bytes. I can even
> get rid of the dots since the length is fixed, so I can go down to 3*6 =
> 18 bytes.

Actually, chars could also be 16 bits wide, in which case you'd need
6*5 chars. In general, they're CHAR_BIT bits wide.
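
The per-char digit count can be derived instead of hard-coded. A small sketch using `std::numeric_limits`; the function name `decimal_digits_per_uchar` is invented here for illustration.

```cpp
#include <limits>

// Number of decimal digits needed to print any unsigned char value.
// digits10 is the number of decimal digits that always fit in the type,
// so the largest value needs one more digit than that.
int decimal_digits_per_uchar()
{
    return std::numeric_limits<unsigned char>::digits10 + 1;
}
```

With 8-bit chars this gives 3 and with 16-bit chars it gives 5, matching the 6*3 and 6*5 figures.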

> I realized that this can also be represented as a single number below 256^6 =
> 281474976710656, which fits in only 15 bytes. Unfortunately this number
> is too big for the usual C types. I tried to reimplement the multiplication
> with strings, but I gave up quickly.

Don't. It's the correct way. Remember, you don't have to store the
number itself. You're generating a string.
Basically, what you need to do is to write a class:

class Num {
uchar data[6];
public:
Num ( ? );// I don't know how you get those chars
char div10();
};

Num::div10() should return Num%10 and modify Num::data[6].
With that, you can do a recursive div10 until div10 returns 0.

Now, how do you implement (uchar data[6])%10? Simple.
The main rule is (A+B)%X = ((A%X)+(B%X))%X. This reduces
the size of the numbers and avoids overflows.

You basically want (sum over i of data[i]<<CHAR_BIT*i)%10, which therefore
is equal to (sum over i of ((data[i]<<CHAR_BIT*i)%10))%10. The advantage of
this approach is that the sum of the ((data[i]<<CHAR_BIT*i)%10) terms will be
at most 6*9 = 54.

Again, you need to do a similar reduction to calculate
(data[i]<<CHAR_BIT*i)%10. This is equal to
(data[i]*(1<<CHAR_BIT*i))%10, and we have a similar rule
(A*B)%X = ((A%X)*(B%X))%X. You get the idea: do the %
operation early so you know each term stays small.
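
The two modular rules can be sketched as follows. This assumes `data[0]` is the least significant byte, so that `data[i]` carries the weight 2^(CHAR_BIT*i); the name `last_digit` is hypothetical.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: last decimal digit of the number stored in data,
// where data[0] is assumed to be the least significant byte and data[i]
// has weight 2^(CHAR_BIT*i). No intermediate value exceeds 81.
unsigned last_digit(const unsigned char (&data)[6])
{
    unsigned step = 1;
    for (int k = 0; k < CHAR_BIT; ++k)   // step = (2^CHAR_BIT) % 10
        step = (step * 2) % 10;

    unsigned weight_mod = 1;             // (2^(CHAR_BIT*i)) % 10, starting at i = 0
    unsigned sum = 0;
    for (std::size_t i = 0; i < 6; ++i) {
        // (A * B) % X == ((A % X) * (B % X)) % X
        sum += ((data[i] % 10) * weight_mod) % 10;   // each term is at most 9
        weight_mod = (weight_mod * step) % 10;
    }
    // (A + B) % X == ((A % X) + (B % X)) % X; sum is at most 6 * 9 = 54
    return sum % 10;
}
```

This only produces the last digit; getting the remaining digits still requires dividing the stored number by 10.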

BTW, you can get the string even shorter if you use the '.'
as the eleventh digit. In that case, 255 = 2*121+1*11+2 =
"212" in base 11, and 241 = 1*121+10*11+10 is "1.."

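A sketch of the base-11 idea, assuming the six bytes have already been combined into a 64-bit integer. The name `to_base11` is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical helper: writes value in base 11, using '0'..'9' for the
// digits 0..9 and '.' for the digit 10.
std::string to_base11(std::uint64_t value)
{
    static const char symbols[] = "0123456789.";
    std::string out;
    do {
        out += symbols[value % 11];
        value /= 11;
    } while (value != 0);
    std::reverse(out.begin(), out.end());  // digits were produced last first
    return out;
}
```

Since 11^14 is larger than 2^48, any 48-bit value fits in 14 characters this way, one fewer than in base 10.
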
Regards,
Michiel Salters

msalters, Jan 21, 2005
5. ### msaltersGuest

msalters wrote:
> Mathieu Malaterre wrote:
> > Hello,
> >
> > I have the following problem. I need to convert an unsigned char[6]
> > array into a string using only digits (0-9) and '.'. The goal is
> > to store it in the minimal number of bytes.

[ SNIP]
> > I tried to reimplement the multiplication
> > with strings, but I gave up quickly.

>
> Don't. It's the correct way. Remember, you don't have to store the
> number itself. You're generating a string.
> Basically, what you need to do is to write a class:
>
> class Num {
> uchar data[6];
> public:
> Num ( ? );// I don't know how you get those chars
> char div10();
> };
>
> Num::div10() should return Num%10 and modify Num::data[6].
> With that, you can do a recursive div10 until div10 returns 0.

OK, so I implemented it /after/ I posted it, and there's a bug: stopping
when div10 returns 0 is wrong. Just try to print 1024 and you'll see why
it fails (a 0 digit ends the loop early). Stop when all chars are 0
instead (a simple test, really).

Furthermore, I found it a lot easier to just do a repeated divide
by N with carry, i.e.
carry = 0;
for (i = 0; i < 6; ++i) // assuming data[0] is the most significant byte
{
extended = (carry << CHAR_BIT) + data[i];
data[i] = extended / N;
carry = extended % N;
}
Postcondition: carry is the last digit.
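
A self-contained sketch of the repeated division with N = 10, stopping once every byte is zero. The names `Bytes6`, `div10`, `is_zero` and `to_decimal_string` are illustrative, and `b[0]` is assumed to be the most significant byte.

```cpp
#include <algorithm>
#include <climits>
#include <string>

// Hypothetical wrapper so the six bytes can be passed and copied by value.
// b[0] is assumed to be the most significant byte.
struct Bytes6 { unsigned char b[6]; };

// Divides the number in n by 10 in place and returns the remainder.
unsigned div10(Bytes6& n)
{
    unsigned carry = 0;
    for (int i = 0; i < 6; ++i) {
        unsigned extended = (carry << CHAR_BIT) + n.b[i];
        n.b[i] = static_cast<unsigned char>(extended / 10);
        carry = extended % 10;
    }
    return carry;   // the last decimal digit of the original number
}

bool is_zero(const Bytes6& n)
{
    return std::all_of(n.b, n.b + 6, [](unsigned char c) { return c == 0; });
}

// Takes a copy because the division overwrites the bytes.
std::string to_decimal_string(Bytes6 n)
{
    std::string digits;
    do {
        digits += static_cast<char>('0' + div10(n));
    } while (!is_zero(n));   // stop when all chars are 0, not on a 0 digit
    std::reverse(digits.begin(), digits.end());
    return digits;
}
```

No 64-bit type is needed here: `extended` never exceeds 9 * 2^CHAR_BIT + UCHAR_MAX, which fits in an `unsigned` on common platforms.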

Regards,
Michiel Salters

msalters, Jan 21, 2005
6. ### Mathieu MalaterreGuest

Re: unsigned char [6] numerical representation: solution

Hello,

Thanks a lot, all, for your suggestions, but I always prefer the easy
way. Here is what I do (works like a charm):

------------------------------------------------------------
#include <stdio.h>

#ifdef _MSC_VER
typedef unsigned __int64 uint64_t;
#else
#include <stdint.h>
#endif

int main()
{
union dual { unsigned char s[6]; uint64_t i; };
dual d = { 0 };

d.s[0] = 0;
d.s[1] = 0;
d.s[2] = 0;
d.s[3] = 0;
d.s[4] = 1;
d.s[5] = 0;

#ifdef _MSC_VER
printf("%015I64u\n", d.i);
#else

printf("%015llu\n", d.i);
#endif

return 0;
}
------------------------------------------------------------
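
Reading `d.i` after writing `d.s` depends on the machine's byte order, and C++ does not guarantee the result of reading a union member other than the one last written. A possible variant that avoids both, sketched under the assumptions that bytes are 8 bits wide and that `s[0]` is the most significant byte (the names `encode15` and `decode15` are hypothetical):

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: six bytes to a zero-padded 15-digit decimal string.
// s[0] is assumed to be the most significant byte, so the result does not
// depend on the machine's byte order.
std::string encode15(const unsigned char (&s)[6])
{
    std::uint64_t v = 0;
    for (unsigned char c : s)
        v = (v << 8) | (c & 0xFFu);

    std::ostringstream out;
    out << std::setw(15) << std::setfill('0') << v;  // same padding as "%015llu"
    return out.str();
}

// Inverse of encode15: parses the decimal text and splits it into bytes.
void decode15(const std::string& text, unsigned char (&s)[6])
{
    std::uint64_t v = std::stoull(text);
    for (int i = 5; i >= 0; --i) {
        s[i] = static_cast<unsigned char>(v & 0xFFu);
        v >>= 8;
    }
}
```

Because the byte order is fixed by the shifts, the same six bytes produce the same string on any machine, which the union version does not promise.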

Thanks anyway
Mathieu


Mathieu Malaterre, Jan 21, 2005