How to read space separated file in python?

ganesh gajre · Nov 21, 2008

Hi all,

I want to read a file that is a mapping file. It is used to map characters
from a TTF font to Unicode.
For example:

The map file contains data in the following form:

0 ०
1 १
2 २
3 ३
4 ४
5 ५
6 ६
7 ७
8 ८
9 ९

Like this. Please use a Unicode-aware editor to view the text if it is not
displayed properly.

Now I want to read the two characters on each line separately, like:

str[0]=0 and str2[0]=०

How can I do this?

Please give me a solution.

Regards,
Ginovation

Steven D'Aprano · Nov 21, 2008

Well, because you said please...

I assume the encoding of the second column is utf-8. You need something
like this:

# Untested.
column0 = []
column1 = []
for line in open('somefile', 'r'):
    a, b = line.split()
    column0.append(a)
    column1.append(b.decode('utf-8'))
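
The snippet above relies on Python 2 byte strings. On Python 3, `str` has no `decode` method, so the decoding would instead happen when the file is opened. A minimal sketch of that variant, assuming UTF-8 as above; the function name `read_columns` and the throwaway demo file are made up for illustration:

```python
import os
import tempfile


def read_columns(path, encoding="utf-8"):
    """Return two parallel lists: the first and second column of each line."""
    column0, column1 = [], []
    # open() decodes the bytes, so every line is already Unicode text.
    with open(path, encoding=encoding) as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            a, b = line.split()
            column0.append(a)
            column1.append(b)
    return column0, column1


# Demo with a temporary file so the sketch runs on its own.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write("0 \u0966\n1 \u0967\n2 \u0968\n")
    demo_path = tmp.name

digits, devanagari = read_columns(demo_path)
os.remove(demo_path)
print(digits)
print(devanagari)
```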

Peter Otten · Nov 21, 2008

Read the file:

>>> import codecs
>>> pairs = [line.split() for line in codecs.open("ganesh.txt", encoding="utf-8")]
>>> pairs[0]
[u'0', u'\u0966']

Create the conversion dictionary:

Do the translation:
०११०९८७६

You may have to use int(s) instead of ord(s) in your actual conversion code:
०१९
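
A minimal sketch of how a conversion dictionary and the translation step could look on Python 3. The `pairs` data is built inline rather than read from the file, and the input strings are assumptions chosen for illustration, not the original session:

```python
# Pairs as they might come out of the file-reading step (assumed data:
# ASCII digits 0-9 paired with Devanagari digits U+0966..U+096F).
pairs = [[str(i), chr(0x0966 + i)] for i in range(10)]

# str.translate expects a dict keyed by code point (an int), hence ord().
by_char = {ord(src): dst for src, dst in pairs}
translated = "01109876".translate(by_char)

# If the source text stores the numbers 0..9 as raw code points rather
# than the characters "0".."9", key the table by int(src) instead.
by_number = {int(src): dst for src, dst in pairs}
translated_raw = "\x00\x01\x09".translate(by_number)

print(translated)
print(translated_raw)
```

Which keying is right depends on how the font encodes its glyphs, which is why the choice between `ord(s)` and `int(s)` is left open.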

Peter

Joe Strout · Nov 21, 2008

Steven D'Aprano wrote:
a, b = line.split()

Note that in a case like this, you may want to consider using
partition instead of split:

a, sep, b = line.partition(' ')

This way, if there happens to be more than one space (for example,
because the Unicode character you're mapping to happens to be a
space), it'll still work. It also better encodes the intention, which
is to split only on the first space in the line, rather than on every
space.
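
As a quick illustration of what partition returns, here is a minimal sketch. The sample lines are made up for the example and are not taken from the mapping file:

```python
# Hypothetical sample lines; the second one maps "2" to a space character.
lines = ["1 x\n", "2  \n"]

for line in lines:
    # Strip only the trailing newline so that a space value survives.
    key, sep, value = line.rstrip("\n").partition(" ")
    print(repr(key), repr(sep), repr(value))

first = lines[0].rstrip("\n").partition(" ")
second = lines[1].rstrip("\n").partition(" ")
```

Note that `rstrip("\n")` is used instead of a bare `strip()`, which would also remove the space that is the value.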

(It so happens I ran into exactly this issue yesterday, though my
delimiter was a colon.)

Cheers,
- Joe

Steve Holden · Nov 21, 2008

Joe:

In the special case of the None first argument (the default for the
str.split() method) runs of whitespace *are* treated as single
delimiters. So line.split() is not the same as line.split(' ').
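
The difference is easy to see on a string with two consecutive spaces (a small sketch; the sample string is arbitrary):

```python
text = "a  b"  # two spaces between the letters

default_split = text.split()      # any run of whitespace is one delimiter
explicit_split = text.split(" ")  # every single space is a delimiter

print(default_split)
print(explicit_split)
```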

regards
Steve

Joe Strout · Nov 21, 2008

Steve Holden wrote:
In the special case of the None first argument (the default for the
str.split() method) runs of whitespace *are* treated as single
delimiters. So line.split() is not the same as line.split(' ').

Right -- so using split() gives you the wrong answer for two different
reasons. Try these:
[...]
ValueError: need more than 1 value to unpack
>>> line = "3 x and here is some extra stuff"
>>> a, b = line.split()
ValueError: too many values to unpack

Partition handles these cases correctly (at least, within the OP's
specification that the value of "b" should be whatever comes after the
first space).
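
A minimal sketch of the two failure modes, for anyone who wants to reproduce them. The first sample line (a key mapped to a space) is an assumed input, the helper `try_unpack` is made up for the example, and the exact wording of the error messages differs between Python versions:

```python
def try_unpack(line):
    """Attempt `a, b = line.split()` and report what happened."""
    try:
        a, b = line.split()
    except ValueError as exc:
        return "ValueError: %s" % exc
    return (a, b)


space_value = "2  "  # hypothetical line: key "2" mapped to a space
extra_words = "3 x and here is some extra stuff"

# split() yields one item for the first line and eight for the second,
# so unpacking into exactly two names fails both times.
results = [try_unpack(space_value), try_unpack(extra_words)]
for outcome in results:
    print(outcome)

# partition always returns exactly three items.
partitioned = [space_value.partition(" "), extra_words.partition(" ")]
for parts in partitioned:
    print(parts)
```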

Cheers,
- Joe

Gabriel Genellina · Nov 21, 2008

split takes an additional argument too:

py> line = "3 x and here is some extra stuff"
py> a, b = line.split(None, 1)
py> a
'3'
py> b
'x and here is some extra stuff'

But it still fails if the line contains no spaces. partition is more
robust in those cases.
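
A small sketch of the no-space case. The helper `parse_line` is a made-up name, and returning None for a malformed line is just one possible policy:

```python
def parse_line(line):
    """Split a mapping line at the first space; None if there is no space."""
    key, sep, value = line.rstrip("\n").partition(" ")
    if not sep:
        return None  # no space found: treat the line as malformed
    return key, value


no_space = "3"

# With maxsplit=1 the list still has only one item, so unpacking fails.
try:
    a, b = no_space.split(None, 1)
    maxsplit_outcome = (a, b)
except ValueError:
    maxsplit_outcome = "ValueError"

print(maxsplit_outcome)
print(no_space.partition(" "))
print(parse_line("3 x and here is some extra stuff\n"))
```

Because partition reports a missing separator as an empty string, the caller can decide how to handle the line instead of catching an exception.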

Steve Holden · Nov 21, 2008

Joe Strout wrote:
[...]

Partition handles these cases correctly (at least, within the OP's
specification that the value of "b" should be whatever comes after the
first space).

I believe if you read the OP's post again you will see that he specified
two non-space items per line.

You really *love* being right, don't you? ;-) You say partition "...
better encodes the intention, which is to split only on the first space
in the line, rather than on every space". Your mind-reading abilities
are clearly superior to mine.

Anyway, sorry to have told you something you already knew. It's true
that partition has its place, and is too often overlooked. Particularly
by me.

regards
Steve
