unicode text file

Discussion in 'Python' started by Junaid, Sep 27, 2009.

  1. Junaid

    Junaid Guest

    I want to do replacements in a UTF-8 text file. Example:

    f=open("test.txt","r") #this file is utf-8 encoded

    raw = f.read()
    txt = raw.decode("utf-8")

    txt.replace{'English', ur'ഇംഗ്ലീഷ്') #replacing raw unicode string, but not working

    f.write(txt)
    f.close()
    f.flush()


    please, help me

    thanks
     
    Junaid, Sep 27, 2009
    #1

  2. 2009/9/27 Junaid <>:
    > I want to do replacements in a utf-8 text file. example
    >
    > f=open("test.txt","r") #this file is utf-8 encoded
    >
    > raw = f.read()
    > txt = raw.decode("utf-8")
    >
    > txt.replace{'English', ur'ഇംഗ്ലീഷ്') #replacing raw unicode string,
    > but not working
    >
    > f.write(txt)
    > f.close()
    > f.flush()
    >
    >
    > please, help me
    >
    > thanks


    Does
    txt.replace('English', ur'ഇംഗ്ലീഷ്')
    instead of
    txt.replace{'English', ur'ഇംഗ്ലീഷ്')

    fix the problem?

    hth
    vbr
     
    Vlastimil Brom, Sep 27, 2009
    #2

  3. MRAB

    MRAB Guest

    Junaid wrote:
    > I want to do replacements in a utf-8 text file. example
    >
    > f=open("test.txt","r") #this file is utf-8 encoded
    >
    > raw = f.read()
    > txt = raw.decode("utf-8")
    >
    > txt.replace{'English', ur'ഇംഗ്ലീഷ്') #replacing raw unicode string,
    > but not working
    >

    txt = txt.replace('English', ur'ഇംഗ്ലീഷ്')

    > f.write(txt)
    > f.close()
    > f.flush()


    The file will be flushed when it's closed, so the flush() call is
    unnecessary. Calling it after close() raises ValueError, because the file
    is no longer open.
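
The point about close() and flush() can be checked with a short sketch. The scratch file created through tempfile is an assumption made for the demonstration and is not part of the original question.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

f = open(path, "w")
f.write("some text")
f.close()              # close() flushes any buffered data before closing

try:
    f.flush()          # the file object is already closed at this point
    flush_error = None
except ValueError as exc:
    flush_error = exc  # "I/O operation on closed file"

os.remove(path)
```

So the order close() then flush() is not just redundant; dropping the flush() line is the fix.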
    >
    >
    > please, help me
    >
    > thanks
     
    MRAB, Sep 27, 2009
    #3
  4. Mark Tolonen

    Mark Tolonen Guest

    "Junaid" <> wrote in message
    news:...
    >I want to do replacements in a utf-8 text file. example
    >
    > f=open("test.txt","r") #this file is utf-8 encoded
    > raw = f.read()
    > txt = raw.decode("utf-8")


    You can use the codecs module to open and decode the file in one step:
    codecs.open() takes the encoding as its third argument.

    >
    > txt.replace{'English', ur'ഇംഗ്ലീഷ്') #replacing raw unicode string,
    > but not working


    The replace method returns the altered string. It does not modify it in
    place. You also should use Unicode strings for both the arguments (although
    it doesn't matter in this case). Using a raw Unicode string is also
    unnecessary in this case.

    txt = txt.replace(u'English', u'ഇംഗ്ലീഷ്')
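
A minimal sketch of that behaviour, using a made-up sample string (the sentence "Language: English" is an assumption for illustration): calling replace() without assigning the result leaves the original string untouched.

```python
# -*- coding: utf-8 -*-
txt = u"Language: English"

# The return value is discarded here, so txt keeps its old contents.
txt.replace(u"English", u"ഇംഗ്ലീഷ്")
unchanged = txt

# Rebinding the name to the returned string is what makes the change stick.
txt = txt.replace(u"English", u"ഇംഗ്ലീഷ്")
```

Strings in Python are immutable, so every string method that "changes" text works this way.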

    > f.write(txt)


    You opened the file for reading. You'll need to close the file and reopen
    it for writing.
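
A small sketch of why the reopen is needed, assuming io.open (available in Python 2.6+ and 3) and a hypothetical scratch file: a file object opened with mode "r" rejects writes.

```python
import io
import os
import tempfile

# Hypothetical scratch file with some initial content.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"English")

f = io.open(path, "r", encoding="utf-8")   # read-only, as in the question
txt = f.read()
try:
    f.write(txt)                           # not allowed in mode "r"
    write_error = None
except IOError as exc:                     # io.UnsupportedOperation is a subclass
    write_error = exc
f.close()

os.remove(path)
```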

    > f.close()
    > f.flush()


    Flush isn't required. close() will flush.

    Also, to have text like ഇംഗ്ലീഷ് in a source file, you'll need to declare
    the encoding at the top of the file and be sure to actually save the file
    in that encoding.

    In summary:

    # coding: utf-8
    import codecs
    f = codecs.open('test.txt','r','utf-8')
    txt = f.read()
    txt = txt.replace(u'English', u'ഇംഗ്ലീഷ്')
    f.close()
    f = codecs.open('test.txt','w','utf-8')
    f.write(txt)
    f.close()
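
The same read, replace, write sequence can also be sketched with io.open and with blocks, which close the file even if an error occurs. This is a variant under assumptions, not the code posted in the thread: the helper name replace_in_file and the scratch file are hypothetical, and io.open is used because it behaves the same on Python 2.6+ and Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def replace_in_file(path, old, new, encoding="utf-8"):
    """Rewrite the file at `path` with every `old` replaced by `new`."""
    # Read and decode in one step.
    with io.open(path, "r", encoding=encoding) as f:
        txt = f.read()
    # replace() returns a new string, so the result must be kept.
    txt = txt.replace(old, new)
    # Reopen for writing; the text is encoded again on the way out.
    with io.open(path, "w", encoding=encoding) as f:
        f.write(txt)
    return txt


# Demonstration on a hypothetical scratch file.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(demo_path, "w", encoding="utf-8") as f:
    f.write(u"I speak English.\n")

replace_in_file(demo_path, u"English", u"ഇംഗ്ലീഷ്")

with io.open(demo_path, "r", encoding="utf-8") as f:
    result = f.read()
os.remove(demo_path)
```

Note that the whole file is read into memory first, which is fine for small text files like the one in the question.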

    -Mark
     
    Mark Tolonen, Sep 27, 2009
    #4
  5. Junaid

    Junaid Guest

    thanx everyone for replying,

    I did as Mark suggested, and it worked :)

    thanx once more
     
    Junaid, Oct 3, 2009
    #5