Dan Pop said:
[...]
It's fortune that you know it.
Could you please be a little more careful when writing English text?
Yes, it's just your opinion, not what C90 says, which is what I said.
So what?
I am perfectly entitled to my opinion. Just like anyone else.
I don't think so. It's very restrictive rather than broken at that
time; read Larry's posting on this.
I have: it didn't sound very convincing to someone inclined to use his
own judgement instead of blindly believing everything said by a committee
member.
A standard that prevents mixing, say, EBCDIC (characters) and UCS (wide
characters), for NO good reason, is downright broken in my book. And both
C89 and C99 do that.
You said it's broken. I said it's not broken, just very restrictive.
But what C90 says doesn't change regardless of whatever we think about
it. The standards, C90 and C99 as the current state, explicitly
guarantee that 'a' == L'a'. What's the problem with this? What
justifies you to say:
The fact that A belongs to the basic character set has
no relevance on the value of L'A'
I have already explained what. And I agree that the standard provides
this guarantee. What's the problem with this? ;-)
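For illustration, the guarantee both sides accept here (a member of the basic character set has the same value as a narrow and as a wide character constant) can be spot-checked with a minimal sketch. The name count_mismatches and the sampled characters are made up for this example, and a clean result says nothing beyond the characters sampled on the implementation at hand.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: counts how many of the sampled basic-character-set
 * members have a narrow value that differs from their wide value. */
int count_mismatches(void)
{
    /* The same characters written as a narrow and as a wide string literal. */
    static const char    narrow[] = "aA%0 ";
    static const wchar_t wide[]   = L"aA%0 ";
    int mismatches = 0;
    size_t i;

    for (i = 0; narrow[i] != '\0'; i++) {
        /* Basic character set members are nonnegative as char,
         * so the conversion to wchar_t does not change the value. */
        if ((wchar_t)narrow[i] != wide[i])
            mismatches++;
    }
    return mismatches;
}
```

A return value of 0 is what the guarantee predicts for these five characters; it does not say anything about characters outside the basic set.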
Read the book, "The Standard C Library" by PJ Plauger, <locale.h>
section, IIRC.
Quote the relevant paragraphs.
Read the underlined wording.
Does it change the fact that both standards say the same thing? If not,
the underlined text doesn't prove anything at all.
The multibyte character sequence given to printf() by the user can have
redundant shift characters which can make the resulting mb characters
from the wide characters differ from the original.
Differ in what sense? Are the semantics of the text preserved or not?
The guarantee that
'%' == L'%' can make it easy to write code to scan the conversion
specifier from the mb character sequence,
Nope, it cannot: you cannot process multibyte characters *before*
converting them to wide characters, because the standard does NOT
specify the encoding mechanism. Keep in mind that characters from the
base character set preserve their single byte values *only* in the initial
shift state (whatever that is):
While in the
initial shift state, all single-byte characters retain their usual
interpretation and do not alter the shift state. The interpretation
^^^^^^^^^^^^^^^^^^
for subsequent bytes in the sequence is a function of the current
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
shift state.
^^^^^^^^^^^^
despite lack of support for
conversion between characters; of course, there was a more complicated
way to do it not depending on the fact.
There is no other way, without making assumptions about how mb characters
are encoded (see the quote above). And if you make such assumptions,
your code is no longer portable. There is no easy way to tell whether
a byte you read from the string corresponds to a single byte character
or is a shift state changer or is the first character of a multibyte
character.
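A minimal sketch of the portable order of operations described above: decode each multibyte character first, starting from the initial shift state, and only then compare the resulting wide character against L'%'. The function name count_percent is hypothetical, and the sketch assumes mbrtowc() is available (it comes from Amendment 1 and C99, not from plain C90, where mbtowc() would play the same role with hidden state).

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts '%' characters in a multibyte string without
 * assuming anything about the encoding. Returns -1 if the string contains
 * an invalid or incomplete multibyte sequence. */
long count_percent(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);   /* bytes left, terminator excluded */
    long count = 0;

    /* A zeroed mbstate_t describes the initial shift state. */
    memset(&state, 0, sizeof state);

    while (remaining > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, remaining, &state);

        if (n == (size_t)-1 || n == (size_t)-2)
            return -1;              /* invalid or truncated sequence */
        if (n == 0)
            break;                  /* decoded a null wide character */

        /* The comparison happens on the decoded wide character, never on
         * a raw byte, so shift sequences cannot be mistaken for '%'. */
        if (wc == L'%')
            count++;

        s += n;
        remaining -= n;
    }
    return count;
}
```

Called as count_percent("a%db%%") in the default "C" locale, this counts three '%' characters. With a stateful encoding selected through setlocale(), the same loop still works, because mbrtowc() tracks the shift state instead of the caller guessing at it.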
Dan