Text Processing

Yigit Turgut

Hi all,

I have a text file containing data like this:

A B C
-------------------------------------------------------
-2.0100e-01 8.000e-02 8.000e-05
-2.0000e-01 0.000e+00 4.800e-04
-1.9900e-01 4.000e-02 1.600e-04

But I only need column B, and I need to change the notation as follows:

8.000e-02 = 0.08
0.000e+00 = 0.00
4.000e-02 = 0.04

The text file is approximately 10 MB in size. I looked around to see if
there is a quick and dirty workaround, but there are lots of modules and
lots of options. I am confused.

Which module is most suitable for this task?
 
Dave Angel

You probably don't need anything but sys (to parse the command options)
and os (maybe).

open the file
for each line
    if one of the header lines, continue
    separate out the part you want
    print it, formatted as you like

Then just run the script with its stdout redirected, and you've got your
new file.

The details depend on what your experience with Python is, and what
version of Python you're running.
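
A minimal sketch of the outline above, assuming two header lines and whitespace-separated columns as in the sample data. The function name format_column_b, the header_lines parameter and the script name in the usage comment are hypothetical, chosen only for illustration.

```python
import sys


def format_column_b(lines, header_lines=2):
    """Yield column B of each data line, formatted with two decimals."""
    for number, line in enumerate(lines):
        if number < header_lines:
            continue  # one of the header lines
        fields = line.split()  # split on any run of spaces or tabs
        if len(fields) < 2:
            continue  # blank or malformed line
        yield "%.2f" % float(fields[1])


# Usage (hypothetical script name):
#     python extract_b.py input.txt > output.txt
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as f:
        for value in format_column_b(f):
            print(value)
```

Iterating over the file object reads one line at a time, so a 10 MB file should not be a problem for this approach.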
 
Jérôme

On Tue, 20 Dec 2011 11:17:15 -0800 (PST), Yigit Turgut wrote:
Which module is most suitable for this task?

You could try to do it yourself.

You'd need to know what separates the data: tab characters or spaces?

Example:

Input file
----------

A B C
-------------------------------------------------------
-2.0100e-01 8.000e-02 8.000e-05
-2.0000e-01 0.000e+00 4.800e-04
-1.9900e-01 4.000e-02 1.600e-04


Python code
-----------

# Open file
with open('test1.plt', 'r') as f:

    b_values = []

    # skip as many lines as needed (here, the two header lines)
    line = f.readline()
    line = f.readline()
    line = f.readline()  # first data line

    # stop at end of file or at the first blank line
    while line.strip():
        # assumes columns are separated by exactly 4 spaces
        #start = line.find(u"\u0009", 0) + 1 #seek Tab
        start = line.find("    ", 0) + 4 #seek 4 spaces
        #end = line.find(u"\u0009", start)
        end = line.find("    ", start)
        b_values.append(float(line[start:end].strip()))
        line = f.readline()

print(b_values)

It gets trickier if the number of spaces is not constant. I would then try
regular expressions. Perhaps a regexp would be more efficient in any case.
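
A rough sketch of the regular-expression variant, assuming each data line holds exactly three numeric columns separated by any amount of spaces or tabs. The names NUMBER, LINE_RE and extract_b are made up for this example.

```python
import re

# A signed decimal number with an optional exponent, e.g. -2.0100e-01
NUMBER = r"[-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?"

# Three numbers separated by one or more whitespace characters
LINE_RE = re.compile(r"^\s*(%s)\s+(%s)\s+(%s)\s*$" % (NUMBER, NUMBER, NUMBER))


def extract_b(lines):
    """Return column B as floats, ignoring lines that are not data."""
    b_values = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # header, separator and blank lines do not match
            b_values.append(float(match.group(2)))
    return b_values
```

Because non-matching lines are simply skipped, there is no need to count header lines by hand; extract_b(open('test1.plt')) would be one way to call it.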
 
Nick Dokos

Jérôme said:
You could try to do it yourself.

Does it have to be python? If not, I'd go with something similar to

sed 1,2d foo.data | awk '{printf("%.2f\n", $2);}'

Nick
 
Alexander Kapps

Does it have to be python? If not, I'd go with something similar to

sed 1,2d foo.data | awk '{printf("%.2f\n", $2);}'

Why sed and awk? awk alone can do it:

awk 'NR>2 {printf("%.2f\n", $2);}' data.txt

And in Python:

f = open("data.txt")
f.readline()    # skip header
f.readline()    # skip header
for line in f:
    # second whitespace-separated field, printed with two decimals
    print("%.2f" % float(line.split()[1]))
f.close()
 
Yigit Turgut

@Jérôme: Your suggestion gave a floating point error; it might need
some slight modification.

@Nick: Sorry mate, it needs to be in Python. But I noted your solution in
case I need it another time.

@Alexander: Works as expected.

Thank you all for the replies.
 
