unsigned char [6] numerical representation

M

Mathieu Malaterre

Hello,

I have the following problem. I need to convert a unsigned char[6]
array into a string using only number (0-9) and '.'. The goal being to
stored it on the minimal number of bytes.

The first approach is a representation ala IP address:
255.255.255.255.255.255 therefore it takes 6*3+5 = 23 bytes. I can even
get rid of the dot since the lenght is fixed. So I can go down to 3*6 =
18 bytes.

I relized that this also can be represented as a number 256^6 =
281474976710656 which fits only on 15 bytes . Unfortunately this number
is too big for usual c types. I tried to reimplement the multiplication
with strings, but I gave up quickly. I then try the union approach:

union { unsigned char s[6]; unsigned int i[2]; } dual;

dual.s = {0, 0, 0, 0, 0, 0};
char s[15];
std::copy( dual.i[0], dual.i[2], s);

but I am not sure how to do copy the unsigned int into a string ?

Thanks for any advice (in particular regarding the bytes ordering).

Thanks
Mathieu
 
V

Victor Bazarov

Mathieu Malaterre said:
I have the following problem. I need to convert a unsigned char[6] array
into a string using only number (0-9) and '.'. The goal being to stored it
on the minimal number of bytes.

What does 'unsigned char[6]' represent? Is that a 6-byte (48-bit)
integer number (up to 2^48-1)? Or is that something else? Converting
is a common task, but how to interpret those 6 bytes? What limits are
there imposed on their values? Or are those limits natural (0..255)?
The first approach is a representation ala IP address:
255.255.255.255.255.255 therefore it takes 6*3+5 = 23 bytes. I can even
get rid of the dot since the lenght is fixed. So I can go down to 3*6 = 18
bytes.

I relized that this also can be represented as a number 256^6 =
281474976710656 which fits only on 15 bytes . Unfortunately this number is
too big for usual c types.

I am not certain what you consider "usual", but nowadays you can have
an integer of 64 bits, should be enough.
I tried to reimplement the multiplication with strings, but I gave up
quickly. I then try the union approach:

union { unsigned char s[6]; unsigned int i[2]; } dual;

dual.s = {0, 0, 0, 0, 0, 0};
char s[15];
std::copy( dual.i[0], dual.i[2], s);

but I am not sure how to do copy the unsigned int into a string ?

Thanks for any advice (in particular regarding the bytes ordering).

If you want to treat those six bytes as one big integer (48 bits, no
sign), then you have to perform arithmetic operations to find out those
digits. Essentially, you need to find all remainders while dividing
your 48-bit number by 10.

Of course, the simplest thing would be using a compiler-specific 64-bit
integer, for which arithmetic operators are probably already implemented.
Or, find a library with "large integer" support and see what they offer.

Victor
 
?

=?ISO-8859-1?Q?Ney_Andr=E9_de_Mello_Zunino?=

Mathieu said:
I have the following problem. I need to convert a unsigned char[6]
array into a string using only number (0-9) and '.'. The goal being to
stored it on the minimal number of bytes.

The first approach is a representation ala IP address:
255.255.255.255.255.255 therefore it takes 6*3+5 = 23 bytes. I can even
get rid of the dot since the lenght is fixed. So I can go down to 3*6 =
18 bytes.

[...]

I may have missed your point but, if the goal were to minimize storage
space and have the ability to read back the big number as a string, why
not encapsulate it in a class? The following (non-optimized) code could
give you an idea:

#include <string>
#include <sstream>
#include <iostream>

class IP
{
public:
IP(const std::string& ipString);
std::string asString();
private:
unsigned char mBytes[6];
};

IP::IP(const std::string& ipString)
{
std::istringstream iss(ipString);
std::istringstream snum;
unsigned int b;
std::string s;
for (unsigned int i = 0; i < 6; ++i)
{
std::getline(iss, s, '.');
snum.str(s);
snum >> b;
snum.clear();
mBytes = b;
}
}

std::string IP::asString()
{
std::eek:stringstream oss;
unsigned int i;
for (i = 0; i < 5; ++i)
oss << static_cast<unsigned int>(mBytes) << '.';
oss << static_cast<unsigned int>(mBytes);
return oss.str();
}

int main()
{
IP ip("215.210.14.5.116.198");
std::cout << ip.asString();
}

Best regards,
 
M

msalters

Mathieu said:
Hello,

I have the following problem. I need to convert a unsigned char[6]
array into a string using only number (0-9) and '.'. The goal being to
stored it on the minimal number of bytes.

The first approach is a representation ala IP address:
255.255.255.255.255.255 therefore it takes 6*3+5 = 23 bytes. I can even
get rid of the dot since the lenght is fixed. So I can go down to 3*6 =
18 bytes.

Actually, char's could also be 16 bits, in which case you'd need
6*5 chars. In general, they're CHAR_BIT wide.
I relized that this also can be represented as a number 256^6 =
281474976710656 which fits only on 15 bytes . Unfortunately this number
is too big for usual c types. I tried to reimplement the multiplication
with strings, but I gave up quickly.

Don't. It's the correct way. Remember, you don't have to store the
number
itself. You're generating a string.
Basically, what you need to do is to write a class:

class Num {
uchar data[6];
public:
Num ( ? );// I don't know how you get those chars
char div10();
};

Num::div10() should return Num%10 and modify Num::data[6].
With that, you can do a recursive div10 until div10 returns 0.

Now, how do you implement (uchar data[6])%10? Simple.
The main rule is (A+B)%X = ((A%X)+(B%X))%X. This reduces
the size of the numbers, and fixes overflows.

You basically want(data<<CHAR_BIT*i)%10, which therefore
is equal to ((data<<CHAR_BIT*i) %10)%10. The advantage of
this approach is that ((data<<CHAR_BIT*i) %10) will be at
most 6*9 = 54.

Again, you need to do a similar reduction to calculate
(data<<CHAR_BIT*i) %10. This is equal to
(data*(1<<CHAR_BIT*i)) % 10, and we have a similar rule
(A * B)%X = ((A%X) * (B%X))%X . You get the idea, do the %
operation early so you know each term stays small.

BTW, you can get the string even shorter if you use the '.'
as the eleventh digit. In that case, 255 = 2*121+1*11+2 =
"212" base 11, and 242 is "1.." ;)

Regards,
Michiel Salters
 
M

msalters

msalters said:
Mathieu said:
Hello,

I have the following problem. I need to convert a unsigned char[6]
array into a string using only number (0-9) and '.'. The goal being
to stored it on the minimal number of bytes. [ SNIP]
I tried to reimplement themultiplication
with strings, but I gave up quickly.

Don't. It's the correct way. Remember, you don't have to store the
number itself. You're generating a string.
Basically, what you need to do is to write a class:

class Num {
uchar data[6];
public:
Num ( ? );// I don't know how you get those chars
char div10();
};

Num::div10() should return Num%10 and modify Num::data[6].
With that, you can do a recursive div10 until div10 returns 0.

Ok, so I implemented it /after/ I posted it, and that's a bug.
Just try to print 1024, and you'll see why it fails. Stop when
all chars are 0 (simple test, really).

Furthermore, I found it a lot easier to just do a repeated divide
by N with carry. I.e.
carry = 0;
for(i)
{
extended = carry << CHAR_BIT + data
data = extended / N;
carry = extended % N
}
postcondition: carry is the last digit.

Regards,
Michiel Salters
 
M

Mathieu Malaterre

Hello,

Thanks a lot all for your suggestions. But I always prefer the easy
way. Therefore here is what I do (works like a charm):


------------------------------------------------------------
#include <stdio.h>



#ifdef _MSC_VER
typedef unsigned __int64 uint64_t;
#else
#include <stdint.h>
#endif

int main()
{
union dual { unsigned char s[6]; uint64_t i; };
dual d = { 0 };



d.s[0] = 0;
d.s[1] = 0;
d.s[2] = 0;
d.s[3] = 0;
d.s[4] = 1;
d.s[5] = 0;

#ifdef _MSC_VER
printf("%015I64u\n", d.i);
#else

printf("%015llu\n", d.i);
#endif



return 0;
}
------------------------------------------------------------

Thanks anyway
Mathieu
msalters said:
Mathieu said:
Hello,

I have the following problem. I need to convert a unsigned char[6]
array into a string using only number (0-9) and '.'. The goal being
to stored it on the minimal number of bytes.

[ SNIP]
I tried to reimplement themultiplication
with strings, but I gave up quickly.

Don't. It's the correct way. Remember, you don't have to store the
number itself. You're generating a string.
Basically, what you need to do is to write a class:

class Num {
uchar data[6];
public:
Num ( ? );// I don't know how you get those chars
char div10();
};

Num::div10() should return Num%10 and modify Num::data[6].
With that, you can do a recursive div10 until div10 returns 0.


Ok, so I implemented it /after/ I posted it, and that's a bug.
Just try to print 1024, and you'll see why it fails. Stop when
all chars are 0 (simple test, really).

Furthermore, I found it a lot easier to just do a repeated divide
by N with carry. I.e.
carry = 0;
for(i)
{
extended = carry << CHAR_BIT + data
data = extended / N;
carry = extended % N
}
postcondition: carry is the last digit.

Regards,
Michiel Salters
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,763
Messages
2,569,563
Members
45,039
Latest member
CasimiraVa

Latest Threads

Top