Converting from Windows wchar_t to Linux wchar_t

yakir22

Hello experts,
I am now porting our server from Windows to Linux; our client runs
only on Windows machines.
To work around the difference in wchar_t size (2 bytes on Windows,
4 bytes on Linux) we defined

#ifdef WIN32
#define t_wchar_t wchar_t
#else // LINUX
#define t_wchar_t short
#endif

On the server I receive a buffer that contains a Windows t_wchar_t
string, something like

struct user_data
{
t_wchar_t name[32];
.....
.....
};

All the data transfer works fine as long as the server doesn't care
what's in the string.
My problem starts when I want to print some logs on the server using
the content of the buffer.

My question is: is there a simple way to convert a 2-byte wchar_t
(Windows version) to a 4-byte wchar_t (Linux version)?

Thanks
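
A minimal sketch of one way to do the widening by hand (not from the
thread): it assumes the client sends valid little-endian UTF-16 in the
16-bit fields, that a field such as name holds at most 32 code units and
is NUL-terminated when shorter, and that the server is a typical
Linux/glibc system where wchar_t is a 32-bit UTF-32 value. The name
utf16_to_wstring is only illustrative.

#include <cstddef>
#include <cstdint>
#include <string>

// Widen a fixed-size UTF-16 field (as sent by the Windows client) into the
// 4-byte wchar_t used on Linux, combining surrogate pairs into one code point.
std::wstring utf16_to_wstring(const std::uint16_t* src, std::size_t max_units)
{
    std::wstring out;
    for (std::size_t i = 0; i < max_units && src[i] != 0; ++i) {
        std::uint32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF            // high surrogate...
            && i + 1 < max_units
            && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {  // ...then low
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            ++i;                                    // consumed two code units
        }
        out.push_back(static_cast<wchar_t>(cp));    // UTF-32 code point
    }
    return out;
}

// usage, given the struct from the question:
//   std::wstring name = utf16_to_wstring(
//       reinterpret_cast<const std::uint16_t*>(data.name), 32);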
 
James Kanze

> You might be better off with a typedef, although it's not a
> very significant difference.

It would be if the second were unsigned short. Something like
"t_wchar_t( something )" would be legal if it were a typedef,
not if it were a #define.
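
To make the functional-cast point concrete, here is a toy example (mine,
not from the post), using two illustrative names rather than the original
t_wchar_t:

typedef unsigned short t_wchar_t_td;     // a single type name
#define t_wchar_t_def unsigned short     // expands to two tokens

int main()
{
    t_wchar_t_td a = t_wchar_t_td(0x41);       // OK: functional cast on a typedef
    // t_wchar_t_def b = t_wchar_t_def(0x41);  // ill-formed: expands to
    //                                         // "unsigned short(0x41)"
    t_wchar_t_def b = (t_wchar_t_def)0x41;     // a C-style cast still compiles
    (void)a; (void)b;
    return 0;
}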
> Also, for some reason I seem to remember that wchar_t is an
> unsigned type. Since 'char' is often signed (though different
> from 'signed char', of course), perhaps I remember
> incorrectly...

Both are very implementation defined. In practice, you
generally shouldn't be using wchar_t in portable code :-(.
 
James Kanze

> wchar_t is a particularly useless type: because it's
> implementation defined, it doesn't have (in portable code) any
> kind of assurance of what type of character encoding it may be
> using or capable of using.

That's partially true of char as well; in addition, the
character encoding can depend on the source of the data. But at
least, char is guaranteed to be at least 8 bits, so you know
that it can hold all useful external encodings. (For better or
for worse, the external world is 8 bits, and any attempt to do
otherwise is bound to fail in the long run.)
> The next point is that *Unicode* characters are unsigned.

I'm not sure what that's supposed to mean. ALL character
encodings I've ever seen use only non-negative values: ASCII
doesn't define any negative encodings, nor do any of the ISO
8859 encodings. The fact that char can be (and often is) a
signed 8 bit value causes no end of problems because of this.
The character value isn't really signed or unsigned: it's just a
value (that happens never to be negative).

What is true is that the Unicode encoding formats UTF-16 and
UTF-8 require values in the range of 0-0xFFFF and 0-0xFF,
respectively, and that if your short is 16 bits or your char 8
(both relatively frequent cases), those values won't fit in the
corresponding signed types. (For historical reasons, we still
manage to make do putting UTF-8, and other 8 bit encodings, in
an 8 bit signed char. It's a hack, and it's not, at least in
theory, guaranteed to work, but in practice, it's often the
least bad choice available.)
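
As a small illustration of the "no end of problems" remark, here is an
example (mine, not from the thread) of a UTF-8 lead byte stored in a
plain char coming out negative on the common signed-char platforms:

#include <cstdio>

int main()
{
    const char utf8[] = "\xC3\xA9";       // "é" encoded as UTF-8
    char c = utf8[0];                     // typically -61 where char is signed

    if (c >= 0x80) {                      // never true when char is a signed
                                          // 8-bit type (some compilers warn)
        std::printf("lead byte of a multi-byte sequence\n");
    }

    unsigned char u = static_cast<unsigned char>(c);   // 0xC3, i.e. 195
    if (u >= 0x80) {                      // behaves as intended
        std::printf("lead byte, value 0x%02X\n", u);
    }
    return 0;
}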
 
