Convert a Unicode character to its uppercase unaccented form

JBJ

Hi,
I'm very new to Python.
At the moment I'm trying to convert a Unicode character to its uppercase
unaccented equivalent.
For example, with the fr_FR locale:
a,A,à,À should return A
o,O,ô,Ô should return O
½,¼ should return ¼
i,I,î,Î should return I

Do you have any suggestions?

Thanks.
 
timaranz

Unicode strings have an upper() method - try that. I think it
should work properly with your locale - it doesn't give the expected
result for me with an English locale.
 
Duncan Booth

No, that will uppercase the string, but it doesn't (and shouldn't) strip
the accents:
>>> print u'''By example with locale fr_FR:
a,A,à,À should return A
o,O,ô,Ô should return O
½,¼ should return ¼
i,I,î,Î should return I'''.upper()
BY EXAMPLE WITH LOCALE FR_FR:
A,A,À,À SHOULD RETURN A
O,O,Ô,Ô SHOULD RETURN O
½,¼ SHOULD RETURN ¼
I,I,Î,Î SHOULD RETURN I
I guess maybe my newsreader corrupted the third line. It will probably
corrupt all the others when I send this.
 
John Machin

Google in this newsgroup for a thread started by "bussiere" on or
about 2006-03-25. The code snippet provided by Fredrik Lundh should
help you.
 
Steve Holden

Duncan said:
No, that will uppercase the string, but it doesn't (and shouldn't) strip
the accents:
I can agree that it doesn't (though I am taking your word for it), but a
French person will definitely feel it's doing the wrong thing. Upper
case letters aren't accented in written French.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline
 
Duncan Booth

Steve Holden said:
I can agree that it doesn't (though I am taking your word for it), but
a French person will definitely feel it's doing the wrong thing. Upper
case letters aren't accented in written French.
I didn't know that, and I'm not sure I believe it: but then the French
tend to have conventions honoured more in the breach than the observance. I
just hit a few French websites, and the first one that I found which had
any capital letters that might be accented had four accented capital
letters on its front page (two capitalized words and two words in block
capitals).
 
John Machin

The usual rationale for such treatment of accented characters is for
fuzzy matching:
if upshiftedunaccented(text1) == upshiftedunaccented(text2):
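
A minimal sketch of such a helper, for illustration only: it is not the Fredrik Lundh snippet mentioned earlier in the thread, it assumes Python 3 (where every str is Unicode), and the name upshifted_unaccented is made up here to mirror the line above. It decomposes each character with unicodedata.normalize("NFD", ...), drops the combining marks, and uppercases what is left.

```python
import unicodedata


def upshifted_unaccented(text):
    """Return text uppercased, with combining accents removed.

    Hypothetical helper for illustration; assumes Python 3 str input.
    """
    # NFD splits an accented letter into its base letter followed by
    # one or more combining marks (e.g. "à" -> "a" + U+0300).
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks,
    # so this keeps only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.upper()


# The groups from the original question each collapse to one letter.
for group in ("aAàÀ", "oOôÔ", "iIîÎ"):
    print(group, "->", sorted({upshifted_unaccented(ch) for ch in group}))

# Fuzzy matching in the sense described above.
text1, text2 = "Élève", "ELEVE"
if upshifted_unaccented(text1) == upshifted_unaccented(text2):
    print("match")
```

Characters with no canonical decomposition, such as the ligature œ, are left alone by the accent-stripping step and are only uppercased (to Œ), so whether that is the desired result depends on the application.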
 
Lawrence D'Oliveiro

Duncan Booth said:
I didn't know that, and I'm not sure I believe it: but then the French
tend to have conventions honoured more in the breach than the observance.

Second most diabolical spelling system in the world ... after English.
 
JBJ

John said:
Google in this newsgroup for a thread started by "bussiere" on or
about 2006-03-25. The code snippet provided by Fredrik Lundh should
help you.
Thanks
 
Steve Holden

JBJ said:

Malheureusement, I see that absence of accented capitals is a modern
phenomenon that is regarded as an impediment to the language mostly
stemming from laziness of individual authors and inadequacy of low-end
typesetting software. I hadn't realised I was so up-to-date ;-)

So I will have to stop propagating this misinformation.

regards
Steve
 
Wildemar Wildenburger

Steve said:
Malheureusement, I see that absence of accented capitals is a modern
phenomenon that is regarded as an impediment to the language mostly
stemming from laziness of individual authors and inadequacy of low-end
typesetting software. I hadn't realised I was so up-to-date ;-)

So I will have to stop propagating this misinformation.

That's really weird, because I was taught in school that caps are not to
be accented. In school! Big Brother is an idiot.

I'm equally amused by the part of JBJ's link where it says that a
missing accent "fait hésiter sur la prononciation". Yeah, AS IF written
French had anything to do with the way it is pronounced. Not that I
don't like French, mind you. Everywhere outside action movies it's pretty
cool.

/W
 
Ricardo Aráoz

Wildemar said:
That's really weird, because I was taught in school that caps are not to
be accented. In school! Big Brother is an idiot.

I'm equally amused by the part of JBJ's link where it says that a
missing accent "fait hésiter sur la prononciation". Yeah, AS IF written
French had anything to do with the way it is pronounced. Not that I
don't like French, mind you. Everywhere outside action movies it's pretty
cool.

/W

Then you never saw Taxi. Or some of Depardieu's ones. Or Les
adventuriers (sp?) (that's a reeeeally old one), or I comme Icarus....
 
