Unicode string handling problem (revised)

  • Thread starter Richard Schulman

Richard Schulman

The appended program fragment works correctly with an ASCII input
file. But the file I actually want to process is Unicode (UTF-16
encoding). This file must be Unicode rather than ASCII or Latin-1
because it contains mixed Chinese and English characters.

When I run the program I get an attribute_count of zero. This
is incorrect for the input file, which should give a value of fifteen
or sixteen. In other words, the count function isn't recognizing the
'",' characters to be counted in the line read.
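A minimal sketch of why a count like this can come back as zero, assuming the line is being searched as raw, undecoded UTF-16 bytes. The sample text here is invented for illustration and is not taken from the actual input file:

```python
# Illustrative only: this sample line is made up, not the real file contents.
sample = u'"name","\u4e2d\u6587","value",\n'

# The bytes as they would sit on disk in a UTF-16 file (with a BOM).
raw = sample.encode("utf-16")

# Searching the undecoded bytes for the two adjacent ASCII bytes finds
# nothing, because UTF-16 stores a zero byte between '"' and ','.
raw_count = raw.count(b'",')

# After decoding, the same search works on characters rather than bytes.
text_count = raw.decode("utf-16").count(u'",')
```

In this sketch raw_count is 0 while text_count is 3, which matches the symptom described above: the search is well-formed, but it is being applied to bytes rather than to decoded text.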

Here's the program:

in_file = open("c:\\pythonapps\\in-graf1.my","rU")
try:
    # Skip the first line; make the second available for processing
    in_file.readline()
    in_line = in_file.readline()
    attribute_count = in_line.count('",')
    print attribute_count
finally:
    in_file.close()
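For comparison, here is a hedged sketch of the same steps with the file opened as decoded text. It assumes the file is UTF-16 with a byte-order mark. The helper name count_attributes and the throwaway demo file are hypothetical, and io.open is used because it accepts an encoding argument:

```python
import io
import os
import tempfile


def count_attributes(path, encoding="utf-16"):
    """Skip the first line, then count '",' in the second, decoded line.

    Hypothetical helper: the name and the default encoding are assumptions.
    """
    with io.open(path, "r", encoding=encoding) as in_file:
        in_file.readline()            # skip the first line
        in_line = in_file.readline()  # second line, already decoded
    return in_line.count(u'",')


# Self-contained demo on a temporary file with invented contents.
fd, demo_path = tempfile.mkstemp(suffix=".my")
os.close(fd)
try:
    with io.open(demo_path, "w", encoding="utf-16") as out:
        out.write(u'header line\n"a","\u4e2d\u6587","b",\n')
    demo_count = count_attributes(demo_path)
finally:
    os.remove(demo_path)
```

With the invented two-line file above, demo_count comes out as 3. Whether the same approach gives fifteen or sixteen on the real file depends on its actual encoding and contents.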

Any suggestions?

Richard Schulman
(delete 'xx' characters for email reply)
 

John Machin

Richard Schulman wrote:
[snip]
in_line = in_file.readline()
[snip]

We'd already deduced that that line was incorrectly published.
Please don't start new threads like this; if you want to make a
correction, do a couple-of-lines reply to your original message.
Now please leave this new thread alone, and reply to the
much-more-meaningful questions in the original thread.
 
