Base32 Encoding

Marc

Hello,

Does anyone know where I can find some sample code for encoding a string
into Base32? And when I say "code" I'm not talking about some MS VC++
project... I'm talking about real C or C++ code that can be used on any
platform (as C and C++ should be ;-) ).

Thanks in advance!
Marc
 
Suzanne Vogel

Just home-brewed code that does any base encoding. Comments on bad style
are welcome.

My favorite is the base-32 representation of the number 15.333...
It's F.ALALA...
== F*32^0 + A*(1/32) + L*(1/32)^2 + A*(1/32)^3 + L*(1/32)^4 + ...
== 15*32^0 + 10*(1/32) + 21*(1/32)^2 + 10*(1/32)^3 + 21*(1/32)^4 + ...
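
The repeating digits above can be checked with exact integer arithmetic. What follows is only a minimal sketch and not part of the posted program; the name fractionDigits and its parameters are made up for illustration. It expands a fraction num/den (with 0 <= num < den) into digits of the given base by repeated multiplication.

```cpp
#include <string>

// Hypothetical helper (not from the thread): returns the first 'count'
// digits after the point of num/den written in base 'base'.
// Assumes 0 <= num < den, 2 <= base <= 36, and that num * base fits in a long.
std::string fractionDigits(long num, long den, int base, int count) {
    static const char DIGITS[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    std::string s;
    for (int i = 0; i < count; ++i) {
        num *= base;             // shift one digit to the left of the point
        s += DIGITS[num / den];  // the integer part is the next digit
        num %= den;              // keep only the remaining fraction
    }
    return s;
}
```

Under those assumptions, fractionDigits(1, 3, 32, 4) gives "ALAL", which agrees with F.ALAL... for 15 + 1/3.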

---
#include <string>
#include <iostream>
#include <math.h>

/******************************************************************************
* Given a value 'value' in base 10, convert it to a digit in base 'base'.
* PRECONDITION: 0 <= 'base' <= 36 (because of limit of alphabet size == 26)
******************************************************************************/
std::string val2digit(const int value, const int base) {
static const SIZE_OF_ALPHABET = int('Z') - int('A') + 1;
static const int GENERAL_NUMBER_BASE = 10;
static const int CHARS_PER_DIGIT = 2; // 1 is not enough!!!
std::string s = "";
if (value <= 9) {
char* buf = (char*)malloc(sizeof(char)*CHARS_PER_DIGIT);
itoa(value, buf, GENERAL_NUMBER_BASE);
s += buf;
free(buf);
}
else if (value <= 9 + SIZE_OF_ALPHABET) {
s += char((value - 9) + int('A')-1);
}
else {
printf("%s%d%s", "Error: Value must be in range 9 to ", 9 +
SIZE_OF_ALPHABET, "\n");
std::cout << std::flush;
}
return s;
}

/******************************************************************************
* Given a base-10 value 'n', convert it to a string representation in base
* 'base', using a maximum of 'maxNumDecPlaces' number of decimal places.
******************************************************************************/
std::string base10ToBaseN(double n, const int base, int maxNumDecPlaces
= -1) {
static const int MAX_NUM_DEC_PLACES = 8;
maxNumDecPlaces = (maxNumDecPlaces < 0) ? MAX_NUM_DEC_PLACES :
maxNumDecPlaces;
int numDecPlaces = 0;

std::string s = "";
long p = long(log(n)/log(base)); // e.g., p = log(1056)/log(2) == 10
double m = pow(base, p); // e.g., m = pow(2,10) == 1024
long val;
bool hasDecPt = false;

// try all powers m that are <= the original n: m == (base^p, base^(p-1),..., 1)
while ((n > 0 || m >= 1) && numDecPlaces < maxNumDecPlaces) {
if (!hasDecPt && m < 1) {
s += ".";
hasDecPt = true;
}
if (hasDecPt) {
numDecPlaces++;
}
val = long(n/m);
s += val2digit(val, base);
if (m <= n) {
n = n - val*m;
}
m = m/base; // e.g., m = 1024/2 == 512; m = 0.5/2 == 0.25
}
if (s[0]=='.') {
s = "0" + s;
}
return s;
}

/******************************************************************************
* Print the base 'base' representations of values in the base-10 range from
* 'min' to 'max', inclusive, using 'numIncsPerOne' increments between each
* base-10 step size of one. e.g., 'numIncsPerOne'==4 gives steps of 1/4 incs
*
* e.g., printBase(16, 0, 32, 4); // Print base-16 reps of 0..32, in 1/4 incs
******************************************************************************/
void printBase(const int base, const long min, const long max, const int
numIncsPerOne = 1) {
std::cout << "--- BASE " << base << "---\n";
double n;
for (long i=min; i<=max; i++) {
for (int j=0; j<numIncsPerOne; j++) {
n = double(i + double(j)/numIncsPerOne);
std::cout << n << ":\t\t" << base10ToBaseN(n, base) << "\n";
}
}
}

/******************************************************************************
* Test drive
******************************************************************************/
void main() {
printBase(32, 0, 32, 6); // Print base-16 reps of 0..32, in 1/6 incs
}
---
 
MiniDisc_2k2

Marc said:
Hello,

Does anyone know where I can find some sample code for encoding a string
into Base32? And when I say "code" I'm not talking about some MS VC++
project... I'm talking about real C or C++ code that can be used on any
platform (as C and C++ should be ;-) ).

Thanks in advance!
Marc

Some people complain that this is not standard. I still haven't gotten a
definite answer of whether or not this is standard. If you can find this
function, good for you. I've found it in every compiler I've used.

int in10; // number in decimal
char* in32 = new char[31]; // soon to be number in 32-base
// fill in10 here.
itoa(in10, in32, 32); // convert and store

Okay, now, what this function does: it takes the integer and converts it to
any base you like (2 <= base <= 36). The first parameter is the number, the
second is the char* buffer the result is stored in, and the third is the
base you want. There's also atoi(), which I have found very useful for
converting a const char* to an int, along with atol() and atof() for long
and double. Some compilers also ship ltoa() and similar conversions for
other built-in types. Find them if you can.
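
Note that itoa() converts a single integer, whereas the original question asks about encoding a string. If that means the byte-oriented Base32 of RFC 4648 (an assumption; the thread never says which Base32 is meant), a portable sketch could look like the following. The name base32Encode is hypothetical, and the code uses only the standard library.

```cpp
#include <string>

// Sketch of RFC 4648 Base32 encoding (assumed to be what the question means):
// input bits are consumed 5 at a time, and the output is padded with '='
// to a multiple of 8 characters.
std::string base32Encode(const std::string& in) {
    static const char ALPHABET[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    std::string out;
    unsigned long buffer = 0;  // bits read from the input but not yet emitted
    int bits = 0;              // number of valid bits in 'buffer'
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        buffer = (buffer << 8) | static_cast<unsigned char>(in[i]);
        bits += 8;
        while (bits >= 5) {
            out += ALPHABET[(buffer >> (bits - 5)) & 0x1F];
            bits -= 5;
        }
        buffer &= (1UL << bits) - 1;  // drop the bits already emitted
    }
    if (bits > 0) {
        // left-align the leftover bits in a final 5-bit group
        out += ALPHABET[(buffer << (5 - bits)) & 0x1F];
    }
    while (out.size() % 8 != 0) {
        out += '=';
    }
    return out;
}
```

With this sketch, base32Encode("foobar") returns "MZXW6YTBOI======", which matches the test vector given in RFC 4648.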
 
Victor Bazarov

Suzanne Vogel said:
Just home-brewed code that does any base encoding. Comments on bad style
are welcome.

Well, since you've asked for it... Your program is not supposed
to compile. See below.
My favorite is the base-32 representation of the number 15.333...
It's F.ALALA...
== F*32^0 + A*(1/32) + L*(1/32)^2 + A*(1/32)^3 + L*(1/32)^4 + ...
== 15*32^0 + 10*(1/32) + 21*(1/32)^2 + 10*(1/32)^3 + 21*(1/32)^4 + ...

---
#include <string>
#include <iostream>
#include <math.h>

/******************************************************************************
* Given a value 'value' in base 10, convert it to a digit in base 'base'.
* PRECONDITION: 0 <= 'base' <= 36 (because of limit of alphabet size == 26)
******************************************************************************/
std::string val2digit(const int value, const int base) {
static const SIZE_OF_ALPHABET = int('Z') - int('A') + 1;

Do you really think that this formula is better than 26? It isn't.
And it's not going to work on any character representation where
the letters are not contiguous (like EBCDIC).
static const int GENERAL_NUMBER_BASE = 10;
static const int CHARS_PER_DIGIT = 2; // 1 is not enough!!!
std::string s = "";
if (value <= 9) {
char* buf = (char*)malloc(sizeof(char)*CHARS_PER_DIGIT);

'malloc' returns void*, so C++ does need a cast here, but a static_cast
states the intent better than a C-style cast.

sizeof(char) is always 1, there is no need to mention it.

If you're using C++, it is better to apply 'new' to allocate
memory, not malloc.
itoa(value, buf, GENERAL_NUMBER_BASE);

There is no such function in standard C++, so your program is not
guaranteed to compile.
s += buf;
free(buf);

I wonder why you couldn't simply do

s += char(value + '0');

without all that dance with dynamic memory allocation?
}
else if (value <= 9 + SIZE_OF_ALPHABET) {
s += char((value - 9) + int('A')-1);

There is no need to cast 'A' to 'int', you can use it directly.
Also, 'value - 9 - 1' could be simply replaced with 'value - 10',
there is no need to make excessive arithmetic here.

However... There is no guarantee that the letters are contiguous and in
order in the character set. ASCII has them that way, but EBCDIC has gaps
between some of them. You have to use a table.
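
A table-based lookup along those lines might look like this. It is only a sketch: the names digitFromTable and toBase are invented for illustration and do not appear in the posted program.

```cpp
#include <string>

// Hypothetical table-based replacement for val2digit (illustration only).
// Indexing a literal table does not depend on how the character set
// arranges its letters. Returns '?' if 'value' is not a digit of 'base'.
char digitFromTable(int value, int base) {
    static const char DIGITS[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    if (base < 2 || base > 36 || value < 0 || value >= base) {
        return '?';
    }
    return DIGITS[value];
}

// Converts a non-negative integer to its representation in 'base' (2..36).
// Returns an empty string if the base is out of range.
std::string toBase(unsigned long n, int base) {
    if (base < 2 || base > 36) {
        return std::string();
    }
    std::string s;
    do {
        // the remainder is the least significant digit; prepend it
        s.insert(s.begin(), digitFromTable(int(n % base), base));
        n /= base;
    } while (n != 0);
    return s;
}
```

For example, toBase(1056, 2) gives "10000100000" and toBase(495, 32) gives "FF".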
}
else {
printf("%s%d%s", "Error: Value must be in range 9 to ", 9 +
SIZE_OF_ALPHABET, "\n");
std::cout << std::flush;

'printf' is undeclared here, since <cstdio> is not included. Why don't
you use std::cerr to report an error?
}
return s;
}

/******************************************************************************
* Given a base-10 value 'n', convert it to a string representation in base
* 'base', using a maximum of 'maxNumDecPlaces' number of decimal places.
******************************************************************************/
std::string base10ToBaseN(double n, const int base, int maxNumDecPlaces
= -1) {
static const int MAX_NUM_DEC_PLACES = 8;
maxNumDecPlaces = (maxNumDecPlaces < 0) ? MAX_NUM_DEC_PLACES :
maxNumDecPlaces;
int numDecPlaces = 0;

std::string s = "";

When defining a string there is no need to use "". It will be empty
on construction.
long p = long(log(n)/log(base)); // e.g., p = log(1056)/log(2) == 10
double m = pow(base, p); // e.g., m = pow(2,10) == 1024
long val;
bool hasDecPt = false;

// try all powers m that are <= the original n: m == (base^p, base^(p-1),..., 1)
while ((n > 0 || m >= 1) && numDecPlaces < maxNumDecPlaces) {
if (!hasDecPt && m < 1) {
s += ".";
hasDecPt = true;
}
if (hasDecPt) {
numDecPlaces++;
}
val = long(n/m);
s += val2digit(val, base);
if (m <= n) {
n = n - val*m;
}
m = m/base; // e.g., m = 1024/2 == 512; m = 0.5/2 == 0.25
}
if (s[0]=='.') {
s = "0" + s;
}
return s;
}

/******************************************************************************
* Print the base 'base' representations of values in the base-10 range from
* 'min' to 'max', inclusive, using 'numIncsPerOne' increments between each
* base-10 step size of one. e.g., 'numIncsPerOne'==4 gives steps of 1/4 incs
*
* e.g., printBase(16, 0, 32, 4); // Print base-16 reps of 0..32, in 1/4 incs
******************************************************************************/
void printBase(const int base, const long min, const long max, const int
numIncsPerOne = 1) {
std::cout << "--- BASE " << base << "---\n";
double n;

Why do you declare 'n' outside of the loop? You're not using it
outside of the loop...
for (long i=min; i<=max; i++) {
for (int j=0; j<numIncsPerOne; j++) {
n = double(i + double(j)/numIncsPerOne);
std::cout << n << ":\t\t" << base10ToBaseN(n, base) << "\n";
}
}
}

/******************************************************************************
* Test drive
******************************************************************************/
void main() {

'main' always returns 'int'.
printBase(32, 0, 32, 6); // Print base-16 reps of 0..32, in 1/6 incs

The comment is misleading. It prints base-32 reps.
 
Victor Bazarov

Suzanne Vogel said:
Thanks, Victor. I looked at your suggestions and made some of the
changes. I had overlooked that I wasn't using C++ headers, that my
character representation was machine-dependent, and that my comments
were outdated. I didn't know that itoa()/atoi() are not standard C++,

'atoi' is standard, 'itoa' isn't.
or
that std::cerr existed. In certain cases, I did typecasts (e.g.,
(char*)malloc(...)) and initializations (e.g., std::string foo("")) just
to be explicit, not because they were really necessary.

So, if I write

int i = (int)42;

, you would call that "explicit"? I'd call it excessively verbose.
I tried that, and that doesn't work.

How didn't it work?

#include <iostream>
#include <string>

using namespace std;

int main()
{
string s;
for (int i = 0; i < 10; ++i)
s += char(i + '0'); // <<<<<<<<<<<<<<<<<,

cout << s << endl;

return 0;
}
I think it keeps the bits of the
int 'value + '0'' the same but simply reinterprets it as a char (i.e., I
think it does reinterpret_cast instead of dynamic_cast). Hence, when I
print it out, I get a bunch of symbols and the occasional "beep" for '\a'.

Are you sure you didn't do

s += char(value + 0);

instead of

s += char(value + '0');

??? You might want to pay attention next time...
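
The difference between the two expressions can be shown side by side. This is a small sketch with made-up function names; it relies on the language guarantee that the characters '0' through '9' have consecutive codes, and the remark about the bell character assumes an ASCII-based character set.

```cpp
// Adds the code of the character '0', so 7 becomes the character '7'.
char asDigit(int value) {
    return char(value + '0');
}

// Adds nothing, so 7 becomes the character whose code is 7. On an
// ASCII-based system that is '\a', the bell, which explains the beeps.
char asRawCode(int value) {
    return char(value + 0);
}
```

No reinterpretation of bits is involved in either case: both are ordinary integer-to-char conversions, and only the value being converted differs.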
Anyway, I switched to using a symbol table (i.e., a char array).

Not a bad idea.
FWIW:
http://www.cs.unc.edu/~vogel/playpen/c++/std/BaseAndBitReps/Main.cpp
A URL that will die eventually, but oh well, 'cause it's not like anyone
will care to save it. ;)

Still plenty of errors.

Try this:
------------------------------------------------------------------
#include <iostream>
#include <string>
#include <sstream>
#include <cstddef>
#include <cstring>
#include <climits>
#include <bitset>

using namespace std;

bool little_endian()
{
    union {
        int a;
        char c[sizeof(int)];
    } u;
    u.a = 1;
    return u.c[0] == 1;
}

// Defined before the template so that the call inside the template
// picks this overload for a single char instead of recursing.
string toBinary(char c)
{
    bitset<CHAR_BIT> bs(c);
    ostringstream os;
    os << bs;
    return os.str();
}

// Returns the bits of t's object representation, most significant byte first.
template<class T> string toBinary(T t)
{
    char bytes[sizeof(T)];
    memcpy(bytes, &t, sizeof(T)); // memcpy is declared in <cstring>
    const bool little = little_endian();
    string s;

    for (size_t j = 0; j < sizeof(T); ++j)
    {
        // on a little-endian machine the most significant byte comes last
        size_t k = little ? sizeof(T) - 1 - j : j;
        s += toBinary(bytes[k]);
    }

    return s;
}

int main()
{
    cout << toBinary(32.0) << endl;
    cout << toBinary(32.0f) << endl;
    cout << toBinary(32) << endl;
    cout << toBinary(little_endian) << endl; // bits of the function pointer, not of its result

    return 0;
}
------------------------------------------------------------------

There are some that would make a point that 'little_endian()' is
not correctly written, but I don't really care too much here.
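
One possible objection, which the post does not spell out, is that reading u.c after writing u.a reads a union member that is not the active one. Standard C++ does not define that, although many compilers support it. A sketch that avoids the union by copying the first byte instead (the name isLittleEndian is made up for illustration):

```cpp
#include <cstring>

// Sketch: detects byte order by copying the first byte of an unsigned int
// rather than reading an inactive union member.
bool isLittleEndian() {
    const unsigned int one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);  // copy the lowest-addressed byte
    return first == 1;             // that byte holds the 1 only on little-endian
}
```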

Victor
 
