How to use the method loadtxt() of numpy neatly?

chao dong

Hi, everybody. When I try to use numpy to process my dataset, which is in CSV format, I run into a small problem.

In my CSV file, some columns are strings that cannot easily be converted to float. Some of them can be ignored, but for other columns I need to convert the data to an enum-like encoding.

For example, one column contains just three kinds of values: S, Q, C. Each of them has its own meaning, so I must convert them to a dict just like {1,2,3}

Now the question is: when I use numpy.loadtxt, I must do all of the above in just one line and one function. As a new numpy user, I don't know how to solve it.

Thank you.
 
rusi

chao dong said:
Hi, everybody. When I try to use numpy to process my dataset, which is in CSV format, I run into a small problem.
In my CSV file, some columns are strings that cannot easily be converted to float. Some of them can be ignored, but for other columns I need to convert the data to an enum-like encoding.
For example, one column contains just three kinds of values: S, Q, C. Each of them has its own meaning, so I must convert them to a dict just like {1,2,3}

What does "dict like {1,2,3}" mean?

On recent Python that's a set.
On older ones it's probably an error.
So you could mean one of:
1. Set: set([1,2,3])
2. List: [1,2,3]
3. Tuple: (1,2,3)
4. Dict: {"S":1, "Q":2, "C":3}
5. An enumeration (on very recent Pythons)
6. A simulation of an enum using classes (or some such)
7. Something else
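
To make options 4 and 5 concrete, here is a minimal sketch. The names CODES and Kind are made up for illustration, and the numbers 1, 2, 3 are simply taken from the question; the thread does not say what S, Q and C actually stand for.

```python
from enum import IntEnum

# Option 4: a plain dict from the letter to an integer code.
CODES = {"S": 1, "Q": 2, "C": 3}  # hypothetical name


# Option 5: an enumeration (the enum module, Python 3.4+).
class Kind(IntEnum):  # hypothetical name
    S = 1
    Q = 2
    C = 3


print(CODES["Q"])        # dict lookup by letter
print(Kind["Q"].value)   # enum lookup by member name
print(type({1, 2, 3}))   # the literal {1,2,3} is a set, not a dict
```

Either form gives a letter-to-number mapping; the dict is probably the simpler choice if the numbers only need to end up in a numeric array.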

chao dong said:
Now the question is: when I use numpy.loadtxt, I must do all of the above in just one line and one function. As a new numpy user, I don't know how to solve it.

I suggest you supply a couple of rows of your input
and the corresponding Python data structures you want.
Someone should then be able to suggest how to go about it.
 
Peter Otten

chao said:
Hi, everybody. When I try to use numpy to process my dataset, which is
in CSV format, I run into a small problem.

In my CSV file, some columns are strings that cannot easily be
converted to float. Some of them can be ignored, but for other columns I
need to convert the data to an enum-like encoding.

For example, one column contains just three kinds of values: S, Q, C.
Each of them has its own meaning, so I must convert them to a dict just
like {1,2,3}

Now the question is: when I use numpy.loadtxt, I must do all of the
above in just one line and one function. As a new numpy user, I
don't know how to solve it.

Thank you.

Here's a standalone demo:

import numpy

_lookup = {"A": 1, "B": 2}


def convert(x):
    # Older NumPy versions may pass bytes to converters; normalize to str.
    if isinstance(x, bytes):
        x = x.decode("ascii")
    return _lookup.get(x, -1)


converters = {
    0: convert,  # in column 0 convert "A" --> 1, "B" --> 2,
                 # anything else to -1
}


if __name__ == "__main__":
    # generate csv (text mode, so this also runs on Python 3)
    with open("tmp_sample.csv", "w") as f:
        f.write("""\
A,1,this,67.8
B,2,should,56.7
C,3,be,34.5
A,4,skipped,12.3
""")

    # load csv
    a = numpy.loadtxt(
        "tmp_sample.csv",
        converters=converters,
        delimiter=",",
        usecols=(0, 1, 3),  # skip third column
    )
    print(a)

Does that help?
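
For the S, Q, C column from the question, the same idea might look like the sketch below. The sample rows, the column layout and the names SAMPLE, CODES and to_code are assumptions made for illustration, since the real input was never posted; the data is read from an in-memory string so that no file is needed.

```python
import io

import numpy as np

# Hypothetical sample rows; the asker's real columns are not shown in the thread.
SAMPLE = "S,1,foo,67.8\nQ,2,bar,56.7\nC,3,baz,34.5\nX,4,qux,12.3\n"

# Assumed mapping, following the {1,2,3} hint in the question.
CODES = {"S": 1, "Q": 2, "C": 3}


def to_code(field):
    # Older NumPy versions may hand converters bytes rather than str.
    if isinstance(field, bytes):
        field = field.decode("ascii")
    # Unknown letters map to -1 instead of raising an error.
    return CODES.get(field.strip(), -1)


data = np.loadtxt(
    io.StringIO(SAMPLE),
    delimiter=",",
    converters={0: to_code},  # translate the letter column
    usecols=(0, 1, 3),        # skip the free-text third column
)
print(data)
```

The result is a single float array, so the codes come back as 1.0, 2.0, 3.0 and -1.0 rather than as integers.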
 
