jeanlutrin
Let's take the Euro symbol. In Unicode, it's represented:
U+20AC
It's represented in UTF-8 in memory as:
E2 82
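Impossible... only code points from U+0080 to U+07FF (inclusive) can
be represented in two bytes in UTF-8. U+20AC is above that range, so
it needs three bytes.
Moreover, the last hex digit of the final UTF-8 byte is always the same
as the last hex digit of the Unicode code point (U+20AC), because that
byte carries the code point's lowest six bits. So in this example, the
last UTF-8 byte has to end with "C".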
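If you want to check both claims yourself, here is a minimal Python sketch. The helper name utf8_length is hypothetical (it is not a standard library function); it just encodes the usual UTF-8 length ranges, and the result is compared against Python's built-in encoder.

```python
def utf8_length(code_point: int) -> int:
    """Hypothetical helper: number of bytes UTF-8 needs for a code point."""
    if code_point < 0x80:        # U+0000..U+007F
        return 1
    if code_point < 0x800:       # U+0080..U+07FF
        return 2
    if code_point < 0x10000:     # U+0800..U+FFFF
        return 3
    return 4                     # U+10000..U+10FFFF

euro = 0x20AC
encoded = chr(euro).encode("utf-8")

# The range table and the real encoder agree on the length.
print(utf8_length(euro), len(encoded))             # 3 3

# The last hex digit of the final byte matches the code point's.
print(hex(encoded[-1] & 0x0F), hex(euro & 0x0F))   # 0xc 0xc
```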
The correct UTF-8 representation for U+20AC is:
E2 82 AC
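To see where those three bytes come from, here is a rough sketch of the three-byte packing done by hand. The function name encode_three_bytes is made up for illustration, and it only handles the U+0800 to U+FFFF range (surrogates excluded).

```python
def encode_three_bytes(code_point: int) -> bytes:
    """Hypothetical helper: hand-rolled UTF-8 for U+0800..U+FFFF only."""
    if not 0x800 <= code_point <= 0xFFFF:
        raise ValueError("not in the three-byte range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates cannot be encoded in UTF-8")
    b1 = 0xE0 | (code_point >> 12)           # 1110xxxx: top 4 bits
    b2 = 0x80 | ((code_point >> 6) & 0x3F)   # 10xxxxxx: middle 6 bits
    b3 = 0x80 | (code_point & 0x3F)          # 10xxxxxx: low 6 bits
    return bytes([b1, b2, b3])

result = encode_three_bytes(0x20AC)
print(" ".join(f"{b:02X}" for b in result))  # E2 82 AC
```

The third byte is where the "C" comes from: the low six bits of the code point land there untouched.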
Now I know that you just forgot to paste the "AC", but still...
it needed to be corrected.
Peace,
Jean