extracting numbers from a file, excluding fixed words

dawenliu

Hi, I have a file with this content:
xxxxxxx xxxxxxxxxx xxxxx xxxxxxx
1
0
0
0
1
1
0
(many more 1's and 0's to follow)
yyyyy yyyyyy yyy yyyyyy yyyyy yyy

The x's and y's are FIXED, known words that I want to ignore, such as
"This is the start of the file" and "This is the end of the file". The
number of 1 and 0 lines is UNKNOWN. I want to extract the digits and
store them in a file. Any suggestions would be appreciated.
 
Daniel Bowett

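Open the file and read it one line at a time. If the line equals neither
x nor y, add it to a list:

x = "xxxxxxx xxxxxxxxxx xxxxx xxxxxxx"   # the known first line
y = "yyyyy yyyyyy yyy yyyyyy yyyyy yyy"  # the known last line

numbers = []

with open("file.txt") as f:
    for eachline in f:
        # strip the newline before comparing; skip the line if it is x or y
        if eachline.strip() not in (x, y):
            numbers.append(eachline)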
 
Kent Johnson

Off the top of my head (not tested):

inf = open('input.txt')
out = open('output.txt', 'w')

skips = [
    'xxxxxxx xxxxxxxxxx xxxxx xxxxxxx',
    'yyyyy yyyyyy yyy yyyyyy yyyyy yyy',
]

for line in inf:
    for skip in skips:
        if skip in line:
            continue
    out.write(line)

inf.close()
out.close()

Kent
 
dawenliu

Thanks Kent. The code looks reasonable, but the output file comes out
identical to the input file, with all the xxxx and yyyy lines still
in it.
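
A likely explanation, sketched here with in-memory lines instead of real files: the `continue` sits inside the inner `for skip in skips` loop, so it only moves on to the next skip string, and the write after the inner loop still runs for every line. The function name and the sample data are made up for illustration.

```python
def filter_with_inner_continue(lines, skips):
    """Mimic the posted loop, collecting output in a list instead of a file."""
    out = []
    for line in lines:
        for skip in skips:
            if skip in line:
                continue  # only advances the inner loop
        out.append(line)  # runs for every line, matched or not
    return out


sample = ["start of file\n", "1\n", "0\n", "end of file\n"]
skips = ["start of file", "end of file"]
result = filter_with_inner_continue(sample, skips)
print(result == sample)  # True: nothing was filtered out
```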
 
dawenliu

I've changed the code a little bit and it works fine now:

inf = open('input.txt')
out = open('output.txt', 'w')

skips = [
    'xxxxxxx xxxxxxxxxx xxxxx xxxxxxx',
    'yyyyy yyyyyy yyy yyyyyy yyyyy yyy']

for line in inf:
    flag = 0
    for skip in skips:
        if skip in line:
            flag = 1
            continue
    if flag == 0:
        out.write(line)

inf.close()
out.close()
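
As a possible simplification, the flag can be folded into a single `any()` call. This is only a sketch: `keep_line` is a hypothetical helper name, and the lines are held in a list here rather than read from `input.txt`.

```python
def keep_line(line, skips):
    """Return True when none of the skip strings occurs in the line."""
    return not any(skip in line for skip in skips)


skips = [
    'xxxxxxx xxxxxxxxxx xxxxx xxxxxxx',
    'yyyyy yyyyyy yyy yyyyyy yyyyy yyy',
]
lines = [
    'xxxxxxx xxxxxxxxxx xxxxx xxxxxxx\n',
    '1\n',
    '0\n',
    'yyyyy yyyyyy yyy yyyyyy yyyyy yyy\n',
]

# Keep only the lines that match no skip string.
kept = [line for line in lines if keep_line(line, skips)]
print(kept)
```

With real files, the same test would go inside the `for line in inf:` loop, writing the line only when `keep_line(line, skips)` is true.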
 
Alex Martelli

[[warning, untested code...]]

infile = open('infile.txt')
oufile = open('oufile.txt', 'w')
for line in infile:
    if line.strip().isdigit(): oufile.write(line)
oufile.close()
infile.close()


Alex
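
One thing to keep in mind with this approach: `str.isdigit()` is true for any non-empty string made up only of digits, so a line such as `42` would also be copied. If only single `0` and `1` lines should pass, a stricter check might look like the sketch below. `is_bit_line` is a hypothetical name and the sample lines are invented.

```python
def is_bit_line(line):
    """Return True only when the stripped line is exactly '0' or '1'."""
    return line.strip() in ("0", "1")


lines = ["This is the start of the file\n", "1\n", "0\n", "42\n", "\n"]

# Compare the loose isdigit() test with the stricter one.
loose = [line for line in lines if line.strip().isdigit()]
strict = [line for line in lines if is_bit_line(line)]
print(loose)
print(strict)
```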
 
