hg
Hi,
I'm bringing over a thread that's going on at f.c.l.python.
The point was to strip French accents from words.
We noticed that len('à') != len('a'), and I found the hack below to work
around the "problem". Yet I do not understand why it happens, especially
since 'à' is included in the extended ASCII table and thus can be stored
in one byte.
Any clue?
hg
# -*- coding: utf-8 -*-
import string

def convert(mot):
    print len(mot)
    print mot[0]
    print '%x' % ord(mot[1])
    table = string.maketrans('àâäéèêëîïôöùüû', '\x00a\x00a\x00a\x00e\x00e\x00e\x00e\x00i\x00i\x00o\x00o\x00u\x00u\x00u')
    return mot.translate(table).replace('\x00', '')

c = 'àbôö a '
print convert(c)
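
For comparison, here is a minimal sketch of what is probably going on, written for Python 3 (an assumption; the code above is Python 2, where 'à' in a UTF-8 source file is a byte string). In UTF-8, 'à' is encoded as the two bytes 0xC3 0xA0, while in Latin-1 ("extended ASCII") it is the single byte 0xE0, which would explain the length difference. The strip_accents helper is hypothetical and uses a different technique from the maketrans hack: it decomposes each character with unicodedata and drops the combining marks.

```python
import unicodedata

def strip_accents(text):
    """Return text with combining accent marks removed (hypothetical helper)."""
    # NFD splits 'à' into 'a' followed by U+0300 (combining grave accent).
    decomposed = unicodedata.normalize('NFD', text)
    # Drop the combining marks and keep the base letters.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

a_grave = '\u00e0'  # the character 'à'
utf8_len = len(a_grave.encode('utf-8'))      # byte count in UTF-8
latin1_len = len(a_grave.encode('latin-1'))  # byte count in Latin-1
print(len(a_grave), utf8_len, latin1_len)    # prints: 1 2 1
print(repr(strip_accents('\u00e0b\u00f4\u00f6 a ')))  # prints: 'aboo a '
```

This approach only handles letters that decompose into a base letter plus a combining mark, which covers the French accents listed above; ligatures such as 'œ' would need separate treatment.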