if / else, how to include next line

Marc Hoeppner

Hi everyone,

absolute beginner here, so bear with me ;)

The following issue came up and I can't figure out how to solve it.
Probably nothing very complicated, though.

I want to load a text file via ARGV[0] and then search it for a keyword.
If this keyword is found, it should be printed (puts) and, and this is
where I am stuck, the next 8 lines should be included as well. The
file structure looks like this:

KOG0003
Sc 000000000
Dr 000000000
Ar 001010011
Ca 000100100
Ho 001010010
Sa 010000000
An 100000100
Pl 000001000
KOG0009
Sc 000100
Dr 100000
Ar 001001
Ca 101000
Ho 101010
Sa 010000
An 100000
Pl 001000

So let's say I want to select a specific KOG (or a list of KOGs) and
write it, together with the associated lines, into a new file. I guess
the issue at the moment is that when I open ARGV[0], it reads one
line at a time, and I can't quite figure out how to include the "go to
the next line and puts it, too" part.

Any help pointing me in the right direction would be greatly appreciated.

/Marc
 
Robert Dober

You can read all files named in the arguments, or stdin, with ARGF.
ARGF.readlines will slurp in the whole file, but if it is large that
might be a problem.

Let us try a robust solution first:

count = 0
ARGF.each do |line|   # one line at a time
  puts line unless count.zero?
  count = [0, count-1].max
  if line =~ /your keyword or something/ then
    puts line
    count = 8
  end
end

Now that is one Ruby solution. Maybe you want to use grep instead; sorry if
this is noise, but maybe it helps too.
grep -A8 youre_keywoarde_blease ...
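
The countdown idea in the loop above can be sketched as a small method that is easy to try on sample data. This is only an illustration, not Robert's exact code: the name `lines_with_context` and its parameters are made up here, and unlike the loop above it takes care not to output a matching line twice when it falls inside the countdown of an earlier match.

```ruby
# Return each line that matches `pattern` plus the `after` lines following it.
# `lines` can be anything that yields lines via #each (a File, ARGF, an Array).
# The method name and parameters are hypothetical, chosen for this sketch.
def lines_with_context(lines, pattern, after = 8)
  remaining = 0            # how many follow-up lines are still owed
  selected = []
  lines.each do |line|
    if line =~ pattern
      selected << line
      remaining = after    # restart the countdown on every match
    elsif remaining > 0
      selected << line
      remaining -= 1
    end
  end
  selected
end

sample = ["KOG0003\n", "Sc 000\n", "Dr 000\n", "KOG0009\n", "Sc 111\n", "Dr 100\n"]
puts lines_with_context(sample, /KOG0009/, 2)
```

With real input, `lines_with_context(ARGF, /KOG0009/)` should behave the same way, reading one line at a time.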

HTH
Robert
 
Stefano Crocco


You can try this:

require 'enumerator'
text.each_slice(9) do |a|
  puts a.join("\n") if a.first == key
end

This will pass the block the lines grouped in arrays of 9 lines each (i.e.,
in the first iteration the array will contain the lines from KOG0003 to
Pl 000001000, the second time from KOG0009 to Pl 001000, and so on). The line
with the keyword will be the first of each array, so you print the array
only if its first element is equal to the key you chose.
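
For readers on a current Ruby, the same grouping idea might look like the sketch below. It assumes, as the explanation above does, that every record is exactly the same number of lines long. In current Ruby `each_slice` is available without `require 'enumerator'`, and a String is split with `lines` instead of being iterated directly. The method name `records_starting_with` is invented for this example.

```ruby
# Group the lines into fixed-size records and keep those whose first line
# equals `key`. Assumes every record is exactly `size` lines long.
# `records_starting_with` is a hypothetical helper name.
def records_starting_with(text, key, size = 9)
  text.lines.map(&:chomp).each_slice(size).select { |record| record.first == key }
end

data = <<~DATA
  KOG0003
  Sc 000
  Dr 000
  KOG0009
  Sc 100
  Dr 101
DATA

# The sample records are 3 lines long, so size is 3 here instead of 9.
records_starting_with(data, "KOG0009", 3).each { |record| puts record }
```

The `chomp` matters: without it each line keeps its trailing newline and would never compare equal to a key such as "KOG0009".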

I hope this helps

Stefano
 
John Joyce


You can use a heredoc to maintain all the whitespace.
Here's some info:
http://blog.jayfields.com/2006/12/ruby-multiline-strings-here-doc-or.html
Then just assign the heredoc to a variable
last_8_lines = <<-heredoc_ender
KOG0003
Sc 000000000
Dr 000000000
Ar 001010011
Ca 000100100
Ho 001010010
Sa 010000000
An 100000100
Pl 000001000
KOG0009
Sc 000100
Dr 100000
Ar 001001
Ca 101000
Ho 101010
Sa 010000
An 100000
Pl 001000
heredoc_ender

When you want to use it, just:
puts last_8_lines

You can also use string interpolation with heredocs


#{keyword_string}
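
A minimal sketch of interpolation inside a heredoc, assuming Ruby 2.3 or later for the `<<~` form (it strips the common leading indentation, which the `<<-` form used above does not). The method name `sample_entry` is invented for illustration.

```ruby
# Build one entry as a heredoc; #{...} is filled in when the string is created.
# `sample_entry` and its argument are hypothetical names for this sketch.
def sample_entry(keyword_string)
  <<~ENTRY
    #{keyword_string}
    Sc 000000000
    Dr 000000000
  ENTRY
end

puts sample_entry("KOG0003")
```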

John Joyce
 
John Joyce

Oops, I misunderstood the OP!
You might also use ri to look up "readlines".
Your loop mechanism could be done many ways, but Robert gave one way.
Definitely read one line at a time, searching for the string.
If found, start counting 8 lines.

Assign the search line to a variable and keep reusing that variable
until you get a match;
after the match, the next 8 lines are each just concatenated onto the
end of the variable.
 
Robert Dober


You can try this:

require 'enumerator'
text.each_slice(9) do |a|
  puts a.join("\n") if a.first == key
end

That is nice code. I did not want to rely on the exact format of the
input, as I have learned to distrust specifications ;). But you do what
the OP asked for.
Allow me to recall a small hint David Black gave some time ago:

puts a
is equivalent to
puts a.join("\n")
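
The equivalence can be checked with a small experiment. The sketch below captures the output in a StringIO so the two forms can be compared; the helper name `puts_output` is made up for this example. It also shows one caveat: if the array elements still carry their trailing newlines (as lines read from a file do), `join("\n")` doubles the line breaks while plain `puts a` does not.

```ruby
require 'stringio'

# Capture what `puts` writes for a given argument.
# `puts_output` is a hypothetical helper for this sketch.
def puts_output(arg)
  out = StringIO.new
  out.puts(arg)
  out.string
end

record = ["KOG0003", "Sc 000000000", "Dr 000000000"]
p puts_output(record) == puts_output(record.join("\n"))   # same text either way

# Lines that still end in "\n" behave differently once joined:
raw = ["KOG0003\n", "Sc 000000000\n"]
p puts_output(raw)
p puts_output(raw.join("\n"))
```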
Cheers
Robert

--
I always knew that one day Smalltalk would replace Java.
I just didn't know it would be called Ruby
-- Kent Beck
 
Stefano Crocco

On Wednesday 11 July 2007, Robert Dober wrote:
Allow me to recall a small hint David Black has given some time ago

puts a
is equivalent to
puts a.join("\n")

You're right. I didn't think to try it.

Stefano
 
Marc Hoeppner

count = 0
ARGF.each do |line|   # one line at a time
  puts line unless count.zero?
  count = [0, count-1].max
  if line =~ /your keyword or something/ then
    puts line
    count = 8
  end
end

OK, I started with this one, but ran into one problem, again probably
simple...
The way this is written, it asks for a regexp as the keyword. Now, I do
have a text file in which I have a number of keywords, one on each
line. What I wasn't able to figure out so far was how to change the
above code so that it would expect a variable instead of a regexp.

When I defined a variable within the code, var1 = 'KOG0019', and tried to
use it as an argument à la if = var1 then, it did something..., but
instead of finding the corresponding line and copying the following 8
lines, it copies some 12000 lines and puts the keyword in every second
line...

/Marc
 
Robert Dober

I had no idea what you wanted exactly; here are some possibilities:
(a) always the same string (chomp removes the trailing newline first)
  if line.chomp == "KOG0019"
(b) containing the same string
  if line =~ /KOG0019/
or my favorite idiom
  if /KOG0019/ === line
(c) KOG followed by some digits
  if /KOG\d+/ === line
(d) KOG and exactly 4 digits (\b keeps a fifth digit from being accepted)
  if /KOG\d{4}\b/ === line
(e) KOG and some digits with no more text thereafter
  if /KOG\d+\s*$/ === line

This should get you going ;) If not, try to learn more about regexps, by
scanning this list or reading the Perl man page about regexps (the
basic stuff is the same and gets you a long way already). The ultimate
guide seems to be Friedl's "Mastering Regular Expressions".
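
For the case where the keyword lives in a variable rather than in a literal regexp, one possible sketch follows. The helper names are invented for this example. Note that a single `=` is assignment, not comparison, so a condition such as `if line = var1` (an assumption about what the earlier attempt looked like) is always true and also overwrites `line`, which would be consistent with the keyword showing up in every second line.

```ruby
# Does `line` hold exactly `keyword`, ignoring the trailing newline?
# Both helper names are hypothetical, chosen for this sketch.
def exact_match?(line, keyword)
  line.chomp == keyword
end

# Does `line` contain `keyword` anywhere? Regexp.escape keeps characters such
# as "." or "+" in the keyword from being read as regexp syntax.
def contains_keyword?(line, keyword)
  !!(line =~ /#{Regexp.escape(keyword)}/)
end

var1 = 'KOG0019'
p exact_match?("KOG0019\n", var1)
p contains_keyword?("see KOG0019 here\n", var1)
p exact_match?("KOG00190\n", var1)
```

For plain substring checks, `line.include?(var1)` would do the same job as the second helper without any regexp.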

Cheers
Robert
 
Marc Hoeppner

Hm, maybe I was in fact a bit vague here, sorry.

I think grep will do the job (sometimes it's just simpler than one
expected...), but now just out of curiosity:

I have one file, structured as shown above. It contains some 600
"entries" (that is, a KOG number followed by 8 lines of information). Out
of these 600-something entries, I only need a particular subset. The
information on which KOG entries I need is in a second text file
(basically the KOG numbers, one on each line), like so:

KOG0019
KOG0101
KOG0245
etc

Ideally, the program would read this keyword file, store the information
in a variable (an array, I guess) and then use this array to identify
the entries in the "main" file and puts them to stdout.

*scratches head*
 
Robert Dober


Let us assume you call your program like this:

ruby extractor.rb <keynames> <data>

keynames = File.readlines(ARGV.shift)
key_hash = {}  # used as a set
keynames.each do |line| key_hash[line.strip] = true end

File.open(ARGV.first) { |f|  # the block form closes the file automatically
  count = 0
  f.each { |line|
    # ...
    if key_hash[line.strip] then
      puts line
      count = 8
    end
  }
}

And maybe stick to either "{...}" or "do ... end";
I just wanted to show you both.
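
Putting the pieces together, a complete version might look like the sketch below. It is one possible way to fill in the part elided above, not necessarily what was intended: it uses a Set from the standard library in place of the hash, and the name `extract_entries` and its parameters are invented for this example.

```ruby
require 'set'

# Select every entry whose header line is listed in `keyword_lines`, together
# with the `following` lines after it. `extract_entries` is a hypothetical name.
def extract_entries(keyword_lines, data_lines, following = 8)
  wanted = Set.new(keyword_lines.map(&:strip))
  wanted.delete("")        # ignore blank lines in the keyword file
  remaining = 0
  selected = []
  data_lines.each do |line|
    if wanted.include?(line.strip)
      selected << line
      remaining = following
    elsif remaining > 0
      selected << line
      remaining -= 1
    end
  end
  selected
end

# Command line as assumed above: ruby extractor.rb <keynames> <data>
if __FILE__ == $0 && ARGV.size == 2
  keyword_file, data_file = ARGV
  File.open(data_file) do |f|   # the block form closes the file automatically
    puts extract_entries(File.readlines(keyword_file), f)
  end
end
```

Looking up a line in a Set (or a hash) stays fast however many keywords there are, which is why it is preferred here over scanning an array for every line of the data file.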

Cheers
Robert
 
Sebastian Hungerecker

Marc said:
Ideally,the program would read this keyword-file, store the information
in a variable (an array, I guess) and then use this array to identify
the entries in the "main" file and puts them into stdout.

Something like this should work:

# Store the contents of keywords.txt in an array:
keywords=File.readlines("keywords.txt")

# Read main.txt, then create an array of arrays where the first element
# contains the KOG-line and the second contains the text up to the next
# KOG-line (or the end of the file). Then convert the whole thing into a hash:
hash=Hash[*File.read("main.txt").scan(/(KOG\d+\n)(.+?)(?=KOG|\Z)/m).flatten]

# For each keyword puts the corresponding text from the hash:
keywords.each { |keyword| puts keyword, hash[keyword] }
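
The same scan-into-a-hash idea can be tried on in-memory data with the sketch below. It differs from the code above in two assumed details: the header is stored without its newline, so keywords need to be chomped before the lookup, and keywords without an entry are skipped instead of printed. The method name `entries_by_header` is invented for this example, and `to_h` on an array of pairs needs Ruby 2.1 or later.

```ruby
# Map each KOG header (without its newline) to the block of text that follows
# it, up to the next header or the end of the string.
# `entries_by_header` is a hypothetical name for this sketch.
def entries_by_header(text)
  text.scan(/^(KOG\d+)\n(.*?)(?=^KOG|\z)/m).to_h
end

main = "KOG0003\nSc 0\nDr 0\nKOG0009\nSc 1\nDr 1\n"
entries = entries_by_header(main)

%w[KOG0009 KOG0404].each do |keyword|
  body = entries[keyword]
  puts keyword, body if body   # skip keywords that have no entry
end
```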


HTH
 
