unicode text file

Junaid

I want to do replacements in a UTF-8 text file. Example:

f=open("test.txt","r") #this file is utf-8 encoded

raw = f.read()
txt = raw.decode("utf-8")

txt.replace{'English', ur'ഇംഗ്ലീഷ്') #replacing raw unicode string, but not working

f.write(txt)
f.close()
f.flush()


please, help me

thanks
 
Vlastimil Brom

2009/9/27 Junaid said:
I want to do replacements in a UTF-8 text file. Example:

f=open("test.txt","r") #this file is utf-8 encoded

raw = f.read()
txt = raw.decode("utf-8")

txt.replace{'English', ur'ഇംഗ്ലീഷ്') #replacing raw unicode string, but not working

f.write(txt)
f.close()
f.flush()


please, help me

thanks

Does
txt.replace('English', ur'ഇംഗ്ലീഷ്')
instead of
txt.replace{'English', ur'ഇംഗ്ലീഷ്')

fix the problem?

hth
vbr
 
MRAB

Junaid said:
I want to do replacements in a UTF-8 text file. Example:

f=open("test.txt","r") #this file is utf-8 encoded

raw = f.read()
txt = raw.decode("utf-8")

txt.replace{'English', ur'ഇംഗ്ലീഷ്') #replacing raw unicode string, but not working

replace() returns a new string, so the result has to be assigned back:

txt = txt.replace('English', ur'ഇംഗ്ലീഷ്')

f.write(txt)
f.close()
f.flush()

The file will be flushed when it's closed, and flushing it after closing
is meaningless.
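
A minimal sketch of that point, written for Python 3 (an assumption; the thread itself uses Python 2) and using a throwaway temporary file instead of test.txt. It suggests that the data is on disk once close() returns, and that calling flush() on the closed file raises ValueError:

```python
import os
import tempfile

# Hypothetical scratch file, so the example does not touch test.txt.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

f = open(path, "w", encoding="utf-8")
f.write("English")
f.close()  # close() flushes any buffered data first

# The text is already on disk, with no explicit flush().
with open(path, encoding="utf-8") as check:
    contents = check.read()

# flush() on a closed file is an error (ValueError in Python 3).
try:
    f.flush()
    flush_failed = False
except ValueError:
    flush_failed = True

os.remove(path)
```

So the order in the original snippet (close, then flush) is not just redundant, it fails outright.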
 
Mark Tolonen

Junaid said:
I want to do replacements in a UTF-8 text file. Example:

f=open("test.txt","r") #this file is utf-8 encoded
raw = f.read()
txt = raw.decode("utf-8")

You can use the codecs module to open and decode the file in one step.

txt.replace{'English', ur'ഇംഗ്ലീഷ്') #replacing raw unicode string, but not working

The replace method returns the altered string. It does not modify it in
place. You also should use Unicode strings for both the arguments (although
it doesn't matter in this case). Using a raw Unicode string is also
unnecessary in this case.
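
To illustrate the first point, here is a small sketch in Python 3 syntax (an assumption; in Python 3 every str is already Unicode, so no u prefix is needed). The sample sentence is made up for the demonstration:

```python
txt = "I speak English"

# Calling replace() and discarding the result leaves txt untouched,
# because strings are immutable.
txt.replace("English", "ഇംഗ്ലീഷ്")
unchanged = txt

# Rebinding the name to the return value keeps the new string.
txt = txt.replace("English", "ഇംഗ്ലീഷ്")
```

After the first call `unchanged` still holds the original sentence; only the reassignment makes the replacement stick.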

txt = txt.replace(u'English', u'ഇംഗ്ലീഷ്')
f.write(txt)

You opened the file for reading. You'll need to close the file and reopen
it for writing.
f.close()
f.flush()

Flush isn't required. close() will flush.

Also, to have text like ഇംഗ്ലീഷ് in a Python source file, you'll need to
declare the encoding of the file at the top and be sure to actually save the
file in that encoding.
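
If saving the source in a particular encoding is inconvenient, one alternative (not something proposed in this thread) is to spell the replacement with \u escapes so the source stays pure ASCII. A sketch in Python 3 syntax; the code points are my reading of the Malayalam word used in the question and should be treated as an assumption:

```python
# The Malayalam word written with escapes, so the source file is pure ASCII.
word = "\u0d07\u0d02\u0d17\u0d4d\u0d32\u0d40\u0d37\u0d4d"

# Each of these code points takes three bytes in UTF-8.
encoded = word.encode("utf-8")
print(len(word), len(encoded))
```

Note that Python 3 treats source files as UTF-8 by default, so the coding declaration matters mainly for Python 2 code like the above.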

In summary:

# coding: utf-8
import codecs
f = codecs.open('test.txt','r','utf-8')
txt = f.read()
txt = txt.replace(u'English', u'ഇംഗ്ലീഷ്')
f.close()
f = codecs.open('test.txt','w','utf-8')
f.write(txt)
f.close()
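
For readers on Python 3, roughly the same steps can be sketched with the built-in open(), which accepts an encoding argument. The helper name replace_in_file and the temporary file are hypothetical, introduced only for this illustration:

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding="utf-8"):
    """Read path, replace old with new, and write the result back."""
    with open(path, "r", encoding=encoding) as f:
        txt = f.read()  # already decoded to str
    txt = txt.replace(old, new)  # replace() returns a new string
    with open(path, "w", encoding=encoding) as f:
        f.write(txt)  # the with block closes (and flushes) the file
    return txt


# Demo on a temporary file standing in for test.txt.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w", encoding="utf-8") as f:
    f.write("I speak English\n")

replace_in_file(path, "English", "ഇംഗ്ലീഷ്")

with open(path, encoding="utf-8") as f:
    result = f.read()
os.remove(path)
```

The with blocks take care of closing, so neither close() nor flush() has to be called by hand.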

-Mark
 
Junaid

Thanks everyone for replying,

I did as Mark suggested, and it worked :)

Thanks once more
 
