Convert HEX string to bin

GIR

Hi,

I've been trying to tackle the following problem but haven't been
successful so far.

I have a string

char buffer[64];

which is filled with hexadecimal values, like this.

buffer[0] = A;
buffer[1] = 1;

buffer[0..1] combined make one value, 0xA1, which should be converted
into 10100001.

I know I could use sscanf to get the values out but sscanf is not an
option because it's too big.

Does anybody know an efficient way of converting these values?

TIA,
GIR
 
GIR

Of course this:
buffer[0] = A;
buffer[1] = 1;

should be this:
buffer[0] = 'A';
buffer[1] = '1';

Just to avoid confusion.
 
pete

GIR said:
Hi,

I've been trying to tackle the following problem but haven't been
successful so far.

I have a string

char buffer[64];

which is filled with hexadecimal values, like this.

buffer[0] = A;
buffer[1] = 1;

buffer[0..1] combined make one value, 0xA1, which should be converted
into 10100001.

I know I could use sscanf to get the values out but sscanf is not an
option because it's too big.

Does anybody know an efficient way of converting these values?

hex2bin.c output:

The binary representation of 0xA1 is 10100001.

/* BEGIN hex2bin.c */

#include <stdio.h>
#include <limits.h>

char *bitstr(char *, void const *, size_t);

int main(void)
{
    char buffer[64] = "A1";
    unsigned char value;
    char *pointer;
    char ebits[CHAR_BIT * sizeof value + 1];

    puts("hex2bin.c output:\n");
    value = 0;
    for (pointer = buffer; *pointer; ++pointer) {
        value *= 16;
        switch (*pointer) {
        case 'a':
        case 'A':
            value += 0xa;
            break;
        case 'b':
        case 'B':
            value += 0xb;
            break;
        case 'c':
        case 'C':
            value += 0xc;
            break;
        case 'd':
        case 'D':
            value += 0xd;
            break;
        case 'e':
        case 'E':
            value += 0xe;
            break;
        case 'f':
        case 'F':
            value += 0xf;
            break;
        default:
            value += *pointer - '0';
            break;
        }
    }
    bitstr(ebits, &value, sizeof value);
    printf("The binary representation of 0x%s is %s.\n",
           buffer, ebits);
    return 0;
}

char *bitstr(char *str, const void *obj, size_t n)
{
    unsigned char mask;
    const unsigned char *byte = obj;
    char *const ptr = str;

    while (n--) {
        mask = ((unsigned char)-1 >> 1) + 1;
        do {
            *str++ = (char)(mask & byte[n] ? '1' : '0');
            mask >>= 1;
        } while (mask);
    }
    *str = '\0';
    return ptr;
}

/* END hex2bin.c */
 
Richard Heathfield

GIR said:
Hi,

I've been trying to tackle the following problem but haven't been
successful so far.

I have a string

char buffer[64];

which is filled with hexadecimal values, like this.

buffer[0] = A;
buffer[1] = 1;

buffer[0..1] combined make one value, 0xA1, which should be converted
into 10100001.

I know I could use sscanf to get the values out but sscanf is not an
option because it's too big.

Does anybody know an efficient way of converting these values?

Assuming it really is a string (i.e. null-terminated), this ought to work,
modulo bugs...

#include <ctype.h>
#include <string.h>

void printbinary(const char *hex)
{
    char *p;
    int i, j;
    while(*hex)
    {
        p = strchr("0123456789abcdef", tolower(*hex));
        if(p != NULL)
        {
            for(i = p - hex, j = 8; j > 0; j /= 2)
            {
                if(i >= j)
                {
                    putchar('1');
                    i -= j;
                }
                else
                {
                    putchar('0');
                }
            }
        }
        ++hex;
    }
}
 
Sidney Cadot

GIR said:
I have a string

char buffer[64];

which is filled with hexadecimal values, like this.

buffer[0] = A;
buffer[1] = 1;

"buffer[0] = A;" doesn't mean anything; I'm assuming that you mean

buffer[0] = 'A';
buffer[1] = '1';
buffer[0..1] combined make one value, 0xA1, which should be converted
into 10100001.
I know I could use sscanf to get the values out but sscanf is not an
option because it's too big.

I'd write something like this (which doesn't depend on ASCII per se):

for(i=0;i<64;i++)
{
    switch(tolower(buffer[i]))
    {
    case '0': emit("0000"); break;
    case '1': emit("0001"); break;
    case '2': emit("0010"); break;
    case '3': emit("0011"); break;
    case '4': emit("0100"); break;
    case '5': emit("0101"); break;
    case '6': emit("0110"); break;
    case '7': emit("0111"); break;
    case '8': emit("1000"); break;
    case '9': emit("1001"); break;
    case 'a': emit("1010"); break;
    case 'b': emit("1011"); break;
    case 'c': emit("1100"); break;
    case 'd': emit("1101"); break;
    case 'e': emit("1110"); break;
    case 'f': emit("1111"); break;
    default:
        fprintf(stderr, "oops!\n");
        exit(EXIT_FAILURE);
    }
}
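
The same digit-to-pattern mapping could also be kept in a 16-entry table instead of a switch. This is only a sketch: hex_value() and hex_to_bitstring() are hypothetical names, and writing into a caller-supplied buffer (rather than calling emit()) is an assumption made here so the result can be checked.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 0..15 for a hex digit, -1 for anything else.
   Does not depend on the letters being contiguous in the character set. */
static int hex_value(char c)
{
    static const char digits[] = "0123456789abcdefABCDEF";
    const char *p;

    if (c == '\0')
        return -1;                  /* strchr would match the terminator */
    p = strchr(digits, c);
    if (p == NULL)
        return -1;
    /* indexes 16..21 are 'A'..'F', which map back to 10..15 */
    return (p - digits > 15) ? (int)(p - digits) - 6 : (int)(p - digits);
}

/* Writes four '0'/'1' characters per hex digit into out, which must hold
   4 * strlen(hex) + 1 bytes. Returns 0 on success, -1 on a non-hex char. */
int hex_to_bitstring(const char *hex, char *out)
{
    static const char *const nibble[16] = {
        "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
        "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"
    };

    for (; *hex != '\0'; ++hex) {
        int v = hex_value(*hex);
        if (v < 0)
            return -1;
        memcpy(out, nibble[v], 4);
        out += 4;
    }
    *out = '\0';
    return 0;
}
```

Called with "A1" and a 9-byte buffer, this should leave "10100001" in the buffer.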
 
Nejat AYDIN

Richard said:
Hi,

I've been trying to tackle the following problem but haven't been
successful so far.
[...]

Assuming it really is a string (i.e. null-terminated), this ought to work,
modulo bugs...
^^^^^^^^^^^

Some bug fixes below.
#include <ctype.h>
#include <string.h>

void printbinary(const char *hex)
{
char *p;
int i, j;
while(*hex)
{

char *hexdigits = "0123456789abcdef";
p = strchr("0123456789abcdef", tolower(*hex));

p = strchr(hexdigits, tolower(*hex));
if(p != NULL)
{
for(i = p - hex, j = 8; j > 0; j /= 2)
^^^

for(i = p - hexdigits, j = 8; j > 0; j /= 2)

/* I would also do while(isxdigit(*hex)) in the while loop
 * instead of while(*hex), so I would not have to test
 * the p for NULL, and the function would return as soon as
 * it encounters a non-hexadecimal digit.
 */
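
Putting the fixes above together, the whole function might look roughly like this. It is a sketch rather than the original poster's final code: the name hex_to_binary() and the output buffer (in place of putchar()) are assumptions made here so the result can be inspected, and the (unsigned char) casts are an extra precaution for the <ctype.h> calls.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Converts leading hex digits of hex into '0'/'1' characters in out.
   out must hold 4 characters per hex digit plus a terminator.
   Returns the number of hex digits consumed; stops at the first
   character that is not a hex digit. */
size_t hex_to_binary(const char *hex, char *out)
{
    static const char hexdigits[] = "0123456789abcdef";
    size_t count = 0;

    while (isxdigit((unsigned char)*hex)) {
        /* p cannot be NULL here because isxdigit() already accepted *hex */
        const char *p = strchr(hexdigits, tolower((unsigned char)*hex));
        int i = (int)(p - hexdigits);   /* offset into the digit table */
        int j;

        for (j = 8; j > 0; j /= 2) {
            if (i >= j) {
                *out++ = '1';
                i -= j;
            } else {
                *out++ = '0';
            }
        }
        ++hex;
        ++count;
    }
    *out = '\0';
    return count;
}
```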
 
Richard Heathfield

Nejat said:
Richard said:
Hi,

I've been trying to tackle the following problem but haven't been
successful so far.
[...]

Assuming it really is a string (i.e. null-terminated), this ought to
work, modulo bugs...
^^^^^^^^^^^

Some bug fixes below.
#include <ctype.h>
#include <string.h>

void printbinary(const char *hex)
{
char *p;
int i, j;
while(*hex)
{

char *hexdigits = "0123456789abcdef";
p = strchr("0123456789abcdef", tolower(*hex));

Stupid stupid stupid of me.

Yes, your corrections are spot-on.

That's why I put these damn things into libraries.
/* I would also do while(isxdigit(*hex)) in the while loop

Nice. A definite improvement.
 
GIR

GIR said:
I have a string

char buffer[64];

which is filled with hexadecimal values, like this.

buffer[0] = A;
buffer[1] = 1;

"buffer[0] = A;" doesn't mean anything; I'm assuming that you mean

buffer[0] = 'A';
buffer[1] = '1';

Yes, I made a little post below this one with the subject: "Re: Small
correction: Convert HEX string to bin". I meant 'A' and not A.
buffer[0..1] combined make one value, 0xA1, which should be converted
into 10100001.
I know I could use sscanf to get the values out but sscanf is not an
option because it's too big.

I'd write something like this (which doesn't depend on ASCII per se):

for(i=0;i<64;i++)
{
switch(tolower(buffer[i]))
{
case '0': emit("0000"); break;
case '1': emit("0001"); break;
case '2': emit("0010"); break;
case '3': emit("0011"); break;
case '4': emit("0100"); break;
case '5': emit("0101"); break;
case '6': emit("0110"); break;
case '7': emit("0111"); break;
case '8': emit("1000"); break;
case '9': emit("1001"); break;
case 'a': emit("1010"); break;
case 'b': emit("1011"); break;
case 'c': emit("1100"); break;
case 'd': emit("1101"); break;
case 'e': emit("1110"); break;
case 'f': emit("1111"); break;
default:
fprintf(stderr, "oops!\n");
exit(EXIT_FAILURE);
}
}


The thing is that I cannot use any of the standard functions, they're
too big and too slow. This is my solution:

bit sec_char; // holds whether processing first or second char
unsigned char memory[32768]; // holds the final bit pattern
/* SBUF is the Serial Buffer which holds the current char to be
processed */

if(!sec_char) {

    memory[i] = 0x00;

    if (SBUF >= 'A' && SBUF <= 'F') {
        memory[i] |= (SBUF - '0') - 7;
    } else {
        memory[i] |= (SBUF - '0');
    }

    memory[i] <<= 4;
    sec_char = 1;

} else {

    if (SBUF >= 'A' && SBUF <= 'F') {
        memory[i] |= (SBUF - '0') - 7;
    } else {
        memory[i] |= (SBUF - '0');
    }

    sec_char = 0;
    i++;
}

The thing is that I need something even faster.
 
Sidney Cadot

GIR said:
Yes, I made a little post below this one with the subject: "Re: Small
correction: Convert HEX string to bin". I meant 'A' and not A.

Ok, missed that.
The thing is that I cannot use any of the standard functions, they're
too big and too slow. This is my solution:

bit sec_char; // holds whether processing first or second char
unsigned char memory[32768]; // holds the final bit pattern
/* SBUF is the Serial Buffer which holds the current char to be
processed */

if(!sec_char) {

    memory[i] = 0x00;

    if (SBUF >= 'A' && SBUF <= 'F') {
        memory[i] |= (SBUF - '0') - 7;
    } else {
        memory[i] |= (SBUF - '0');
    }

    memory[i] <<= 4;
    sec_char = 1;

} else {

    if (SBUF >= 'A' && SBUF <= 'F') {
        memory[i] |= (SBUF - '0') - 7;
    } else {
        memory[i] |= (SBUF - '0');
    }

    sec_char = 0;
    i++;
}

The thing is that I need something even faster.


Ok, this does something different than I understood you needed; you are
not converting to a binary external representation but you basically
need to parse a hexadecimal ascii representation to an array of unsigned
bytes.

Judging from your implementation, I'd say that roughly a factor 2--3
gain is still possible (that's without a lookup table, which I think
will be prohibitively big). Before we go down that road, please explain
where you get SBUF from, is it directly coming from a serial controller?
Is its value reliable (i.e., what should happen if SBUF contains a
character other than '0'..'9', 'A'..'F')?

And: what happened to the 64-byte length of buffer? This could yield 32
bytes of information, but not 32768.

In short, in order to squeeze every last bit of performance out of the
compiler for your problem, more problem-specific context is needed.

Regards, Sidney
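
For comparison, a table indexed by a single character (rather than by a two-character pair) needs only UCHAR_MAX + 1 entries. The sketch below is an illustration under assumptions, not something proposed in the thread: the names nibble_table, init_nibble_table() and hex_pair_to_byte() are hypothetical, only upper-case digits are accepted (as in the code quoted above), and whether it is actually faster on a given target would have to be measured.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* One entry per possible character value; 0xFF marks "not a hex digit". */
static unsigned char nibble_table[UCHAR_MAX + 1];

/* Fill the table once at start-up. */
void init_nibble_table(void)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t k;

    memset(nibble_table, 0xFF, sizeof nibble_table);
    for (k = 0; k < 16; ++k)
        nibble_table[(unsigned char)digits[k]] = (unsigned char)k;
}

/* Combine two received characters into one byte.
   Returns 0..255, or -1 if either character is not an upper-case hex digit. */
int hex_pair_to_byte(unsigned char hi, unsigned char lo)
{
    unsigned char h = nibble_table[hi];
    unsigned char l = nibble_table[lo];

    if (h > 15 || l > 15)
        return -1;
    return (h << 4) | l;
}
```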
 
CBFalconer

GIR said:
I've been trying to tackle the following problem but haven't
been successful so far.

I have a string

char buffer[64];

which is filled with hexadecimal values, like this.

buffer[0] = A;
buffer[1] = 1;

buffer[0..1] combined make one value, 0xA1, which should be
converted into 10100001.

I know I could use sscanf to get the values out but sscanf
is not an option because it's too big.

Does anybody know an efficient way of converting these values?

Of course this:
buffer[0] = A;
buffer[1] = 1;

should be this:
buffer[0] = 'A';
buffer[1] = '1';

If you simply impose a condition "buffer[2] = non-hex-character",
which includes '\0', blank, etc. you have an extremely simple
solution:

#include <stdlib.h>

...
<some integral type> val; /* suitable to hold the values */
...
val = strtoul(buffer, NULL, 16);
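
If the buffer holds many two-character values back to back, one possible way to apply the same idea is to copy each pair into a small terminated scratch string first. This is a sketch with a hypothetical function name; the isxdigit() check is an addition here, because strtoul() on its own would also accept leading white space and a sign.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Converts consecutive pairs of hex characters into bytes.
   Stops at the end of the string, at the first pair that is not two hex
   digits, or when max_out bytes have been written.
   Returns the number of bytes stored in out. */
size_t hex_pairs_to_bytes(const char *hex, unsigned char *out, size_t max_out)
{
    size_t n = 0;

    while (n < max_out && hex[0] != '\0' && hex[1] != '\0') {
        char pair[3];

        if (!isxdigit((unsigned char)hex[0]) ||
            !isxdigit((unsigned char)hex[1]))
            break;
        pair[0] = hex[0];
        pair[1] = hex[1];
        pair[2] = '\0';             /* the non-hex character after the pair */
        out[n++] = (unsigned char)strtoul(pair, NULL, 16);
        hex += 2;
    }
    return n;
}
```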
 
Andy

The thing is that I cannot use any of the standard functions, they're
too big and too slow. This is my solution:

bit sec_char; // holds whether processing first or second char
unsigned char memory[32768]; // holds the final bit pattern
/* SBUF is the Serial Buffer which holds the current char to be
processed */

if(!sec_char) {

memory[i] = 0x00;


Make memory a register variable. After assembling a complete byte,
then store the register contents in memory.
if (SBUF >= 'A' && SBUF <= 'F') {

If SBUF is guaranteed to be only hex characters,
if (SBUF > '9') { is sufficient.
memory[i] |= (SBUF - '0') - 7;
} else {
memory[i] |= (SBUF - '0');
}

memory[i] <<= 4;
sec_char = 1;

} else {

if (SBUF >= 'A' && SBUF <= 'F') {
memory[i] |= (SBUF - '0') - 7;
} else {
memory[i] |= (SBUF - '0');
}

sec_char = 0;
i++;
}

The thing is that I need something even faster.
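
A minimal sketch of the two suggestions above, written as portable C rather than for the 8051 compiler. Everything named here (struct hex_rx, nibble_ascii(), hex_rx_feed()) is hypothetical, and the code assumes ASCII and input restricted to '0'..'9' and 'A'..'F'. The byte is built in a small accumulator and the large array is written only once per completed byte.

```c
#include <stddef.h>

/* Hypothetical receiver state, for illustration only. */
struct hex_rx {
    unsigned char *dest;        /* where completed bytes are stored */
    size_t count;               /* number of bytes stored so far */
    unsigned char acc;          /* byte currently being assembled */
    unsigned char have_high;    /* nonzero once the high nibble arrived */
};

/* ASCII-only: '0'..'9' give 0..9, 'A'..'F' give 10..15.
   Input is assumed to be pre-validated, so one comparison is enough. */
static unsigned char nibble_ascii(unsigned char ch)
{
    ch -= '0';
    if (ch > 9)
        ch -= 7;
    return ch;
}

/* Feed one received character; stores a byte after every second call. */
void hex_rx_feed(struct hex_rx *rx, unsigned char ch)
{
    unsigned char nib = nibble_ascii(ch);

    if (!rx->have_high) {
        rx->acc = (unsigned char)(nib << 4);
        rx->have_high = 1;
    } else {
        rx->dest[rx->count++] = (unsigned char)(rx->acc | nib);
        rx->have_high = 0;
    }
}
```

In an interrupt handler the call would presumably be hex_rx_feed(&rx, SBUF), with rx.dest pointing at the memory array.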
 
Richard Heathfield

Andy said:
If SBUF is guaranteed to be only hex characters,
if (SBUF > '9') { is sufficient.

No, because the C language does not guarantee that 'A' has a coding point
higher than '9'.
 
The Real OS/2 Guy

This works when
1. buffer always has an even size
2. buffer is always filled completely:


size_t i, j;
unsigned char c = 0;
for (i = j = 0; j < sizeof(buffer); ) {
    if (buffer[j] >= '0' && buffer[j] <= '9')
        c = buffer[j++] - '0';
    else {
        switch (buffer[j++]) {
        case 'a':
            c = 10;
            break;
        case 'A':
            c = 10;
            break;
        ......other hex digits go here.....
        case 'F':
            c = 15;
            break;
        }
    }
    c <<= 4; /* move the first digit into the high nibble */
    if (buffer[j] >= '0' && buffer[j] <= '9')
        c |= buffer[j++] - '0';
    else {
        .......like the else above, but instead of = use |=
    }
    buffer[i++] = c;
}

The standard guarantees that '0' to '9' are continuous, but there is no
guarantee that 'a' - 'f' or 'A' - 'F' have the same continuity, so using
a switch for them makes it portable.
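
Written out in full, the portable approach described above might look something like the following sketch. The function names are hypothetical, and stopping at the first invalid pair is a choice made here, not something the thread specifies.

```c
#include <stddef.h>

/* Returns 0..15 for a hex digit, -1 otherwise. Only '0'..'9' are
   assumed to be consecutive; the letters are handled one by one. */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    switch (c) {
    case 'a': case 'A': return 10;
    case 'b': case 'B': return 11;
    case 'c': case 'C': return 12;
    case 'd': case 'D': return 13;
    case 'e': case 'E': return 14;
    case 'f': case 'F': return 15;
    default:            return -1;
    }
}

/* Packs pairs of hex characters into bytes, reusing the same buffer.
   Safe in place because the write index never passes the read index.
   Returns the number of bytes produced. */
size_t pack_hex_in_place(unsigned char *buffer, size_t len)
{
    size_t i = 0, j;

    for (j = 0; j + 1 < len; j += 2) {
        int hi = hex_digit_value((char)buffer[j]);
        int lo = hex_digit_value((char)buffer[j + 1]);

        if (hi < 0 || lo < 0)
            break;                  /* stop at the first invalid pair */
        buffer[i++] = (unsigned char)(hi * 16 + lo);
    }
    return i;
}
```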
 
Peter Nilsson

Richard Heathfield said:
#include <ctype.h> ....
void printbinary(const char *hex)
{ ....
while(*hex)
{
...tolower(*hex)...

I believe your book has some explanation on why an (unsigned char) cast is
reasonably appropriate here. :)
 
pete

Richard said:
No, because the C language does not guarantee
that 'A' has a coding point higher than '9'.

The C language does not guarantee
that 'F' has a coding point higher than 'A'.
Which character constants are between 'A' and 'F'
is implementation-defined also.
 
CBFalconer

The Real OS/2 Guy said:
.... snip ...

The standard guarantees that '0' to '9' are continuous, but there is
no guarantee that 'a' - 'f' or 'A' - 'F' have the same continuity,
so using a switch for them makes it portable.

Instead of all the complexity, and assuming that the OP will not
use the strto*() family for some reason, the digit conversions can
be done by:

#include <string.h>

/* Convert hex char to value, -1 for non-hex char */
int unhexify(char c)
{
    static const char hexchars[] = "0123456789abcdefABCDEF";
    const char *p;
    int val;

    val = -1; /* default assume non-hex */
    /* exclude '\0': strchr would otherwise match the terminator */
    if (c != '\0' && (p = strchr(hexchars, c))) {
        val = (int)(p - hexchars);
        if (val > 15) val = val - 6;
    }
    return val;
} /* unhexify, untested */

Which I believe to be fully portable, including char coding. Now
the OP can convert his (terminated) input string with:

char *p;
unsigned int v;
int d;

.....
p = &instring[0];
v = 0;
while ((d = unhexify(*p)) >= 0) {
v = 16 * v + d;
++p;
}
/* desired result in v */
 
Richard Heathfield

Peter said:
I believe your book has some explanation on why an (unsigned char) cast is
reasonably appropriate here. :)

Believe it or not, I put it in, and then took it out again. :)

That whole reply was a heap of crud, which I wish I'd never posted.

Incidentally, it may be worth pointing out that, if I'd used
while(isxdigit((unsigned char)*hex)) instead of while(*hex), the cast to
unsigned char in the call to tolower() would be unnecessary (since all the
characters '0' through '9' and ABCDEFabcdef are guaranteed to be
representable as unsigned char).
 
Richard Heathfield

(I meant "code point".)
The C language does not guarantee
that 'F' has a coding point higher than 'A'.

That's another reason, yes. :)
Which character constants are between 'A' and 'F'
is implementation-defined also.

And that!
 
GIR

Ok, this does something different than I understood you needed; you are
not converting to a binary external representation but you basically
need to parse a hexadecimal ascii representation to an array of unsigned
bytes.

Judging from your implementation, I'd say that roughly a factor 2--3
gain is still possible (that's without a lookup table, which I think
will be prohibitively big). Before we go down that road, please explain
where you get SBUF from, is it directly coming from a serial controller?
Is its value reliable (i.e., what should happen if SBUF contains a
character other than '0'..'9', 'A'..'F')?

And: what happened to the 64-byte length of buffer? This could yield 32
bytes of information, but not 32768.

In short, in order to squeeze every last bit of performance out of the
compiler for your problem, more problem-specific context is needed.

Regards, Sidney

Okay here's a little explanation and background info.

This is for a little project which somebody dumped on my desk. The
idea is to have an 8051 derivative (Infineon 80c515A) process audio at
8-bit mono 8 kHz.

There is a preprocessor which converts the binary wave file to hex and
strips the header/footer. Don't ask me why, it wasn't my idea. In my
opinion it would be much simpler if the preprocessor would just dump
everything to the serial port; now we go from bin=>hex - serial -
hex=>bin. The preprocessor is out of my range. In the docs it's
specified that the preprocessor will check the stream for errors and
such, so I don't have to worry about that.

What happens when the microcontroller receives the data over the
serial port? First of all an interrupt is generated, the data on the
serial line is placed in SBUF (a special function register) and the
processor jumps to the interrupt vector depending on its priority.

Why the 32k length of the buffer? I don't know why... ;) The thing has
64k of memory so I just took half of that, remember this is just
version 0.000000001a. I was thinking of building a dynamic list, but
that seriously cuts in on the available memory. For instance:

struct mem_byte {
    char mem;
    char *next_byte;
};

So I need a byte for storing the info and I need a byte for storing
the address of the next byte. That's double...

Anywayz, I'll just run over to the guyz who did the preprocessor
and "ask" (read yell and order) them if it isn't better for them to
just dump binary data on the line.

Anywayz, tnx for your help. You guyz got a nice little group going
here :)

grtz,
GIR
 
Sidney Cadot

Richard said:
No, because the C language does not guarantee that 'A' has a coding point
higher than '9'.

The OP is of course presuming ascii... c-'0'-7 to get 'A' to 10 is the clue.

Best regards,

Sidney
 
