universal newlines and utf-16

Discussion in 'Python' started by Baz Walter, Apr 11, 2010.

  1. Baz Walter

    I am using Python 2.6 on a Linux box, and I have some UTF-16 encoded
    files with CRLF line endings that I would like to open with universal
    newlines.

    So far, I have been unable to get this to work correctly.

    For example:

    >>> # write the encoded bytes in binary mode so nothing is translated on output
    >>> open('test.txt', 'wb').write(u'a\r\nb\r\n'.encode('utf-16'))
    >>> repr(open('test.txt', 'rbU').read().decode('utf-16'))
    "u'a\\n\\nb\\n\\n'"
    >>> import codecs
    >>> repr(codecs.open('test.txt', 'rbU', 'utf-16').read())
    "u'a\\n\\nb\\n\\n'"

    Of course, the output I want is:

    "u'a\\nb\\n'"

    I suppose it's not too surprising that the built-in open translates the
    line endings on the raw bytes before decoding, but it surprised me that
    codecs.open does this as well.
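
    To make the doubled newlines concrete, here is a minimal sketch that emulates byte-level newline translation by hand. It is an illustration only, not how the file object is implemented: the two replace() calls are a rough stand-in for universal newlines, and 'utf-16-le' is an assumption used here to leave out the BOM and fix the byte order.

```python
# Assumption: 'utf-16-le' instead of 'utf-16', so there is no BOM and the
# byte order does not depend on the platform.
raw = u'a\r\nb\r\n'.encode('utf-16-le')

# Every character takes two bytes, so CR and LF are separated by a NUL
# byte and never appear as the adjacent pair b'\r\n'.
crlf_adjacent = b'\r\n' in raw

# Rough emulation of universal newlines applied to bytes:
# CRLF pairs first (there are none), then each lone CR becomes LF.
translated = raw.replace(b'\r\n', b'\n').replace(b'\r', b'\n')

# Decoding afterwards yields two newline characters per original CRLF.
decoded = translated.decode('utf-16-le')
print(repr(decoded))
```

    Each CR is turned into a newline on its own, and the original LF is still there, which gives two newlines per line ending after decoding.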

    Is there a way to get universal newlines to work properly with UTF-16 files?

    (NB: I'm not interested in other methods of converting line endings,
    just whether universal newlines can be made to work correctly.)
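
    One possible approach, sketched under the assumption that the io module (available from Python 2.6 onwards) is acceptable: io.open decodes the bytes first and translates newlines on the decoded text, so newline=None gives universal newlines after decoding. The helper name read_utf16_universal and the temporary file path are made up for this illustration.

```python
import io
import os
import tempfile


def read_utf16_universal(path):
    """Read a UTF-16 file with universal newlines applied after decoding.

    Hypothetical helper for illustration. newline=None asks the text
    layer to translate '\r\n' and '\r' to '\n' on the decoded characters.
    """
    with io.open(path, 'r', encoding='utf-16', newline=None) as f:
        return f.read()


# Write the same sample data as in the post, in binary mode so the
# encoded bytes reach the disk unchanged.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
with io.open(path, 'wb') as f:
    f.write(u'a\r\nb\r\n'.encode('utf-16'))

text = read_utf16_universal(path)
print(repr(text))
```

    With this sample file the result contains one newline per original CRLF. On Python 2.6 the io module is reported to be noticeably slower than the built-in file object, so this may matter for large files.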
     
    Baz Walter, Apr 11, 2010