newbie: type conversion

  • Thread starter Andrew V. Tkachenko

Andrew V. Tkachenko

Hello.
I'm a newbie in C programming, so I'm sorry for this (probably stupid)
question, but I'm stuck on it.

I use the 'iconv' function to convert from UTF-8 to UCS-4

iconv(iconv_t cd, char *in, char *out ....)

so I get 'out' as the result of the operation. But I need the result as an
array of uint32_t, not chars. How can I do such a conversion?

Is it legal to 'pack' an array of chars into a uint32_t array using the
method below?

uint32_t inta[50];
char string[] = "123456789";

char *p = (char *)inta;
char *p1 = (char *)string;

for(;*p1;p1++, p++)
{
*p = *p1 & 0xFF;
}


Thanks in advance.
 

Kevin Goodsell

Andrew said:
Hello.
I'm a newbie in C programming, so I'm sorry for this (probably stupid)
question, but I'm stuck on it.

I use the 'iconv' function to convert from UTF-8 to UCS-4

iconv(iconv_t cd, char *in, char *out ....)

so I get 'out' as the result of the operation. But I need the result as an
array of uint32_t, not chars. How can I do such a conversion?

Is it legal to 'pack' an array of chars into a uint32_t array using the
method below?

It's perfectly legal to access any object as an array of char. However,
while you can use this method to write arbitrary bit-patterns into the
object, there is no guarantee that the result will be the value you
expected or even a legal representation of a value for the type in
question. It's possible that you could set the object to a trap
representation.
uint32_t inta[50];

uint32_t is an optional type for C99 implementations. It's a little
strange to see you using it, because C99 implementations are extremely
rare at the moment. It's also less portable than, say, uint_least32_t
(which is required in C99) or unsigned long (which is at least 32 bits,
and available on all C implementations).
char string[] = "123456789";

char *p = (char *)inta;
char *p1 = (char *)string;

This cast is completely useless, since the type of the expression
'string' is already char *.
for(;*p1;p1++, p++)
{
*p = *p1 & 0xFF;
}

The safe way to do this would be to use bit-wise operators to act on the
uint32_t objects directly, something like this:

/* CHAR_BIT comes from <limits.h> */
uint32_t val = 0;
const char *p = "123456789";
size_t i;

for (i=0; i<sizeof(val); ++i)
{
val = (val << CHAR_BIT) | (p[i] & 0xFF);
}

A loop built around something like this should work (unless I've made a
mistake or misunderstood the intent of your code), and doesn't rely on a
particular byte-order or the lack of padding bits.
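For reference, here is one way the loop above could be extended to fill a whole array. It is only a sketch under a few assumptions: chars are 8 bits wide, the first character goes into the most significant byte, and a short final group is padded with zero bytes. The name pack_be is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack the bytes of s (up to, but not including,
 * the terminating '\0') into out[], four bytes per element, with the
 * first byte of each group in the most significant position.
 * A short final group is padded with zero bytes on the right.
 * Returns the number of elements written (at most max).
 * Assumes 8-bit chars. */
size_t pack_be(const char *s, uint32_t *out, size_t max)
{
    size_t n = 0;

    while (*s != '\0' && n < max) {
        uint32_t val = 0;
        int i;

        for (i = 0; i < 4; ++i) {
            val <<= 8;                      /* make room for the next byte */
            if (*s != '\0')
                val |= (uint32_t)(unsigned char)*s++;
        }
        out[n++] = val;
    }
    return n;
}
```

Because the values are built with shifts, pack_be("123456789", inta, 50) stores 0x31323334 in inta[0] on any host, whatever its byte order.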

-Kevin
 

Peter Nilsson

Kevin Goodsell said:
It's perfectly legal to access any object as an array of char. However,
while you can use this method to write arbitrary bit-patterns into the
object, there is no guarantee that the result will be the value you
expected or even a legal representation of a value for the type in
question. It's possible that you could set the object to a trap
representation.

But not for uint32_t. At least, not in C99. Note that C99 differs from
N869 in that it states that the exact width integer types cannot have
padding bits.
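Applied to the original iconv question, this means the output buffer handed to iconv can be a uint32_t array accessed through a char pointer. The following is a rough sketch, not a definitive implementation: the function name utf8_to_ucs4 is invented, the encoding name "UCS-4BE" is assumed to be accepted by the local iconv, and 8-bit chars are assumed. The output is requested in big-endian order and then turned into host values with shifts, so the result does not depend on the host's byte order.

```c
#include <iconv.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string into an
 * array of code points. Returns the number of elements written, or
 * (size_t)-1 on any error (including an output buffer that is too small). */
size_t utf8_to_ucs4(const char *utf8, uint32_t *out, size_t max)
{
    iconv_t cd = iconv_open("UCS-4BE", "UTF-8"); /* assumed encoding names */
    char *in = (char *)utf8;        /* iconv takes char **; input is not modified */
    size_t in_left = strlen(utf8);
    char *dst = (char *)out;        /* any object may be accessed as chars */
    size_t out_left = max * sizeof *out;
    size_t rc, n, i;

    if (cd == (iconv_t)-1)
        return (size_t)-1;

    rc = iconv(cd, &in, &in_left, &dst, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;

    n = max - out_left / sizeof *out;   /* elements actually filled */

    /* The bytes were written most-significant first; rebuild each
     * element as a host value so the result is byte-order independent. */
    for (i = 0; i < n; ++i) {
        unsigned char b[4];
        memcpy(b, &out[i], sizeof b);
        out[i] = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
               | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    }
    return n;
}
```

For example, converting "A" followed by the euro sign (UTF-8 bytes E2 82 AC) should yield the two elements 0x41 and 0x20AC.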
uint32_t inta[50];

uint32_t is an optional type for C99 implementations. It's a little
strange to see you using it, because C99 implementations are extremely
rare at the moment.

But 32-bit integers certainly aren't! :)

Both C standards have a class of portability that is weaker than
strict conformance, but nonetheless correct in the sense that programs
are consistent and well behaved (i.e. won't invoke undefined
behaviour).

Use of exact width types does limit portability, but if the program
specification is to target implementations with those specific width
types, then C99 accommodates such programs in a standardised manner.
It's also less portable than, say, uint_least32_t
(which is required in C99) or unsigned long (which is at least 32 bits,
and available on all C implementations).
char string[] = "123456789";

char *p = (char *)inta;
char *p1 = (char *)string;

This cast is completely useless, since the type of the expression
'string' is already char *.
for(;*p1;p1++, p++)
{
*p = *p1 & 0xFF;
}

uint32_t inta[50];
char string[] = "123456789";

unsigned char *p = (void *) inta;
const unsigned char *p1 = (const void *) string;

while (*p1)
{
*p++ = *p1++;
}

Or just...

memcpy(inta, string, strlen(string));
The safe way to do this would be to use bit-wise operators to act on the
uint32_t objects directly,

A straight byte copy is sufficient and safe. Note that if uint32_t
exists, then CHAR_BIT must divide 32 exactly, so there's no point
worrying about bit size mismatches between char and uint32_t.
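To make the trade-off concrete, here is a small sketch of the byte-copy approach. The helper names copy_bytes and host_is_little_endian are invented for illustration, and 8-bit chars are assumed. The copy itself is well defined everywhere, but the numeric value each element ends up holding depends on the host's byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: zero dst, then copy the bytes of s (without the
 * terminating '\0') into it. Returns the number of bytes copied, or 0
 * if the string does not fit in dst_elems elements. */
size_t copy_bytes(uint32_t *dst, size_t dst_elems, const char *s)
{
    size_t len = strlen(s);

    if (len > dst_elems * sizeof *dst)
        return 0;
    memset(dst, 0, dst_elems * sizeof *dst);
    memcpy(dst, s, len);
    return len;
}

/* Hypothetical helper: returns 1 if the lowest-addressed byte of a
 * uint32_t is its least significant byte, 0 otherwise. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}
```

After copy_bytes(inta, 50, "123456789"), the bytes of inta match the string exactly, but inta[0] reads as 0x34333231 on a little-endian host and 0x31323334 on a big-endian one. That is fine if the array is only ever treated as bytes again, and a problem if the element values themselves matter.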
 
