brad
Does standard C++ have any methods to do this? I'd like to convert raw
bytes to utf-8. Thanks for any tips.
Victor said: What is the difference between "raw bytes" and "UTF-8"?
brad said: Raw bytes are not character streams. They do not conform to the
concept of a char. grep a binary file for a string, then grep a text
file for a string to gain a better understanding of this difference.
brad said: Does standard C++ have any methods to do this? I'd like to convert raw
bytes to utf-8. Thanks for any tips.
I think you are confusing two different (although related)
concepts.
A "Unicode string" and a "UTF-8 string" are two different
things.
The former is a string where each element holds one whole
Unicode code point. That means every element must be 4 bytes
wide in order to be able to store any Unicode value. This
fixed-width form is called UTF-32, and since C++11 the type
char32_t is intended for it. The width of wchar_t is not
fixed by the standard: it is typically 2 bytes on Windows
and 4 bytes on most Unix-like systems.
A "UTF-8 string" is a string which has been UTF-8-encoded.
This means that each character in the string is of variable
length: it takes between 1 and 4 bytes.
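
To make the difference concrete, here is a minimal sketch of going from the fixed-width form to the variable-width one by hand. The function names append_utf8 and to_utf8 are my own, not part of the standard library, and the sketch assumes the input is a std::u32string of Unicode code points (C++11 or later). It does not answer how to interpret arbitrary raw bytes; you would first have to know what encoding those bytes are in.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: append the UTF-8 encoding of one code point to `out`.
void append_utf8(std::string& out, char32_t cp) {
    // Narrow a computed byte value to char without sign surprises.
    auto put = [&out](char32_t byte) {
        out += static_cast<char>(static_cast<unsigned char>(byte));
    };

    if (cp <= 0x7F) {                       // 1 byte: 0xxxxxxx
        put(cp);
    } else if (cp <= 0x7FF) {               // 2 bytes: 110xxxxx 10xxxxxx
        put(0xC0 | (cp >> 6));
        put(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {              // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            throw std::invalid_argument("surrogates are not valid code points");
        }
        put(0xE0 | (cp >> 12));
        put(0x80 | ((cp >> 6) & 0x3F));
        put(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        put(0xF0 | (cp >> 18));
        put(0x80 | ((cp >> 12) & 0x3F));
        put(0x80 | ((cp >> 6) & 0x3F));
        put(0x80 | (cp & 0x3F));
    } else {
        throw std::invalid_argument("code point out of Unicode range");
    }
}

// Hypothetical helper: encode a whole UTF-32 string as UTF-8.
std::string to_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        append_utf8(out, cp);
    }
    return out;
}
```

With this, to_utf8(U"A") is one byte long, while a string holding only U+20AC (the euro sign) comes out as three bytes, even though both inputs are a single 4-byte element in the std::u32string.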