compare unicode to non-unicode strings

Asterix

how could I test that those 2 strings are the same:

'séd' (repr is 's\\xc3\\xa9d')

u'séd' (repr is u's\\xe9d')
 
John Machin

Asterix said:
how could I test that those 2 strings are the same:

'séd' (repr is 's\\xc3\\xa9d')

u'séd' (repr is u's\\xe9d')

[note: your reprs are wrong; change the \\ to \]

You need to decode the non-unicode string and compare the result with
the unicode string. You need to know the encoding used for the non-
unicode string. In the example that you gave, it's about 99.99% likely
that it's UTF-8.
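
The decode-then-compare step described above can be sketched roughly as follows. This is only an illustration under the assumption that the byte string really is UTF-8. The variable names are made up for the example, and the b'' and u'' prefixes are an assumption chosen so the snippet runs on Python 2.6+ and Python 3.3+ alike.

```python
# Byte string as it might arrive from a file or socket (assumed to be UTF-8).
raw = b's\xc3\xa9d'
# Unicode string to compare against.
text = u's\xe9d'

# Decode the bytes with the assumed encoding, then compare Unicode to Unicode.
decoded = raw.decode('utf-8')
same = (decoded == text)  # True

# Decoding the same bytes as Latin-1 succeeds but yields u's\xc3\xa9d',
# so a wrong guess about the encoding makes the comparison fail.
wrong = (raw.decode('latin-1') == text)  # False
```

The comparison is only meaningful once both sides are Unicode strings, which is why the encoding has to be known (or guessed) first.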

HTH,
John
 
Fredrik Lundh

Asterix said:
how could I test that those 2 strings are the same:

'séd' (repr is 's\\xc3\\xa9d')

u'séd' (repr is u's\\xe9d')

Determine what encoding the former string is using (looks like UTF-8),
and convert it to Unicode before doing the comparison. With the bytes
decoded as UTF-8, comparing the result to u's\xe9d' gives True.
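
If the encoding is not known in advance, one rough way to "determine" it is to try candidate encodings in order and keep the first that decodes without error. The sketch below is a heuristic, not a reliable detector: the helper name decode_with_guess and the candidate list are hypothetical, and a clean decode does not prove the guess is right (Latin-1 in particular accepts any byte sequence).

```python
def decode_with_guess(raw, candidates=('utf-8', 'latin-1')):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper for illustration; the candidate order is an assumption.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Only reachable if no catch-all encoding such as latin-1 is listed.
    raise ValueError('none of the candidate encodings fit')


text, encoding = decode_with_guess(b's\xc3\xa9d')
matches = (text == u's\xe9d')
```

Strict UTF-8 decoding rejects most byte sequences produced by other 8-bit encodings, which is why it is tried first here.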

</F>
 
Méta-MCI (MVP)

By Toutatis!
If you had put the question to Ordralphabétix, or to one of the French
newsgroups devoted to Python, instead of redoing "La grande Traversée",
the answer might have come faster.

@-salutations
 
Matt Nordhoff

Asterix said:
how could I test that those 2 strings are the same:

'séd' (repr is 's\\xc3\\xa9d')

u'séd' (repr is u's\\xe9d')

You may also want to look at unicodedata.normalize(). For example, é can
be represented in multiple ways, such as u'\xe9' and u'e\u0301', and
comparing those two forms directly gives False.

The first form is "composed", just being U+00E9 (LATIN SMALL LETTER E
WITH ACUTE). The second form is "decomposed", being made up of U+0065
(LATIN SMALL LETTER E) and U+0301 (COMBINING ACUTE ACCENT).

Even though they represent the same thing to a human, they don't compare
as equal. But if you normalize them to the same form, they will.
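
The effect of normalization described above can be sketched like this. The choice of NFC is an assumption made for the example (NFD would work equally well, as long as both sides use the same form), and the variable names are illustrative only.

```python
import unicodedata

composed = u's\xe9d'        # U+00E9 (precomposed e with acute)
decomposed = u'se\u0301d'   # U+0065 followed by U+0301 (combining acute)

# The raw strings differ code point by code point.
raw_equal = (composed == decomposed)  # False

# After normalizing both to the same form, they compare equal.
nfc_equal = (unicodedata.normalize('NFC', composed)
             == unicodedata.normalize('NFC', decomposed))  # True
```

In practice both steps may be needed: decode the byte string first, then normalize both Unicode strings before comparing.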

For more information, look at the unicodedata module's documentation:
<http://docs.python.org/lib/module-unicodedata.html>