How do you check the utf8 flag on a given string?

~greg

One case of the problem I'm having is that I'm reading strings from a file
that contain Ns with tildes on them:
Ñ
(or at least they're something that I know should look like
character 209 in ISO-8859-1, I guess; 209 is outside the ASCII range)

And I'm trying to sort strings so that these get sorted like 'n's.

At the moment I'm trying to do this by checking for ordinals 209
and then converting them to 'n's for the comparisons.

I'm also checking, e.g., for ordinal 195 ('A's with tildes on them, Ã).

Problem is that the 'n's are getting sorted as if they were 'a's.

And the Ñ (N with tilde) gets printed to an HTML file as Ã'
(A with tilde followed by a back-tick),
which does appear in the rendered HTML as an Ñ (N with tilde).

OBVIOUSLY - I don't know what I'm doing!

It seems to me that the sorting is being done on bytes,
whereas it should be being done on utf8-wide characters.
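
To illustrate the bytes-versus-characters suspicion, here is a minimal sketch. It assumes the language is Perl (the post does not say so, but the "utf8 flag" wording suggests it) and that the file is UTF-8 encoded. The sample names and the `sort_key` helper are hypothetical, made up for the example. It shows that an undecoded Ñ is the two bytes 195 and 145, which would explain a check for ordinal 195 matching where 209 was expected, and that after decoding it is a single character with ordinal 209.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

binmode(STDOUT, ':encoding(UTF-8)');

# "Ñ" as it sits in a UTF-8 encoded file: two bytes, 0xC3 0x91.
my $raw = "\xC3\x91";
my @byte_ords = map { ord } split //, $raw;     # 195, 145

# After decoding, the same data is one character, U+00D1.
my $char = decode('UTF-8', $raw);
my @char_ords = map { ord } split //, $char;    # 209

print "undecoded ordinals: @byte_ords\n";
print "decoded ordinals:   @char_ords\n";

# Hypothetical helper: build a sort key by decomposing accented letters
# (NFD), dropping the combining marks, and lowercasing the result.
sub sort_key {
    my ($s) = @_;
    my $key = NFD($s);
    $key =~ s/\p{Mn}//g;
    return lc $key;
}

# Made-up sample data, given as raw UTF-8 bytes and decoded before sorting.
my @raw_names = ("Nu\xC3\xB1ez", "Nova", "\xC3\x91andu", "Mora", "Ortiz");
my @names     = map { decode('UTF-8', $_) } @raw_names;
my @sorted    = sort { sort_key($a) cmp sort_key($b) } @names;

print "$_\n" for @sorted;
```

In this sketch the accent-stripping happens only in the sort key, so the strings themselves keep their Ñ. Decoding on input (or opening the file with an `:encoding(UTF-8)` layer) is what makes `ord` and `sort` see characters rather than bytes.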

So I think the first step for me would be to check the utf8-flags
of the program's strings at each point in the processing,
to see how the program is regarding them.

But I don't know how to check that.
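
For what it's worth, assuming this is Perl, the core function `utf8::is_utf8` reports the flag being asked about. The sketch below is only an illustration with made-up variable names: it compares an undecoded byte string with its decoded counterpart. The flag is an internal detail, so `length` and `ord` are often the more telling checks of how the program regards a string.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $raw     = "\xC3\x91";              # undecoded bytes, as read without an I/O layer
my $decoded = decode('UTF-8', $raw);   # one character, U+00D1

# utf8::is_utf8 is always available; "use utf8" is not needed to call it.
my $raw_flag     = utf8::is_utf8($raw)     ? 1 : 0;
my $decoded_flag = utf8::is_utf8($decoded) ? 1 : 0;

printf "raw:     flag=%d length=%d\n", $raw_flag,     length $raw;
printf "decoded: flag=%d length=%d\n", $decoded_flag, length $decoded;
```

Calling the same check at each stage (after reading, before sorting, before printing) would show where a string is still being treated as bytes.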

~greg
 
