Split a multi-sequence file into individual files

ela

I found this on Google. There is no need to reinvent the wheel, but this
one-liner is too difficult for me to understand:

perl -ne 'BEGIN{ $/=">"; } if(/^\s*(\S+)/){ open(F,">$1.fsa")||warn"$1 write failed:$!\n";chomp;print F ">", $_ }' fastafile

Can anybody help?
 
Tad J McClellan

ela said:
I found this on Google. There is no need to reinvent the wheel, but this
one-liner is too difficult for me to understand:

perl -ne 'BEGIN{ $/=">"; } if(/^\s*(\S+)/){ open(F,">$1.fsa")||warn"$1 write failed:$!\n";chomp;print F ">", $_ }' fastafile

Can anybody help?


BEGIN { $/ = ">"; }  # set the Input Record Separator (see perlvar)
while ( <> ) {  # -n wraps the code in a while-diamond loop
    if ( /^\s*(\S+)/ ) {  # grab the first run of non-whitespace characters
        open(F, ">$1.fsa") || warn "$1 write failed:$!\n";  # open a file
        chomp;  # remove ">" from end of string
        print F ">", $_;  # print ">" at beginning of string
    }
}
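
For anyone who prefers a script to a one-liner, the same steps can be sketched as below. This is only a sketch under some assumptions: `split_fasta` and the demo file name `demo.fa` are hypothetical names, the first word of each header is assumed to be safe to use as a file name, and the chomp is done before the match so that the empty first record is skipped.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: write each record of the FASTA file at $path to its
# own NAME.fsa file under $outdir and return the file names written.
sub split_fasta {
    my ($path, $outdir) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    local $/ = '>';                          # a record ends at the next ">"
    my @written;
    while (my $record = <$in>) {
        chomp $record;                       # chomp removes $/, here the trailing ">"
        next unless $record =~ /^\s*(\S+)/;  # the first record is empty, so skip it
        my $name     = "$1.fsa";
        my $out_path = File::Spec->catfile($outdir, $name);
        my $out;
        unless (open $out, '>', $out_path) { # three-argument open, lexical handle
            warn "$out_path write failed: $!\n";
            next;
        }
        print {$out} '>', $record;           # put the leading ">" back
        close $out or warn "$out_path close failed: $!\n";
        push @written, $name;
    }
    close $in;
    return @written;
}

# Demo on a throwaway two-record file in a temporary directory.
my $dir   = tempdir(CLEANUP => 1);
my $fasta = File::Spec->catfile($dir, 'demo.fa');
open my $fh, '>', $fasta or die "Cannot write $fasta: $!";
print {$fh} ">seq1 first\nACGT\n>seq2 second\nGGCC\n";
close $fh or die "Cannot close $fasta: $!";

my @files = split_fasta($fasta, $dir);
print "@files\n";
```

The three-argument open keeps a header that happens to start with a mode character such as ">" from being read as part of the open mode.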
 
Mirco Wahab

Tad said:
BEGIN { $/ = ">"; }  # set the Input Record Separator (see perlvar)
while ( <> ) {  # -n wraps the code in a while-diamond loop
    if ( /^\s*(\S+)/ ) {  # grab the first run of non-whitespace characters
        open(F, ">$1.fsa") || warn "$1 write failed:$!\n";  # open a file
        chomp;  # remove ">" from end of string
        print F ">", $_;  # print ">" at beginning of string
    }
}

I don't understand the purpose of the chomp,
maybe it needs to be in front of the if():

...
local $/ = '>';
while (<>) {
    chomp;
    if ( /\s*(\S+)/ ) {
        open my $fh, '>', "$1.fsa" or warn "$1 $!";
        print $fh '>' . $_;
    }
}
...

Regards

M.
 
Tim Greer

Mirco said:
I don't understand the purpose of the chomp,
maybe it needs to be in front of the if():

...
local $/ = '>';
while (<>) {
    chomp;
    if ( /\s*(\S+)/ ) {
        open my $fh, '>', "$1.fsa" or warn "$1 $!";
        print $fh '>' . $_;
    }
}
...

Regards

M.

perldoc -f chomp

Chomp removes any newline, if one exists (which it probably would on
<>).

It's the difference between trying to open:
$1.fsa

and

$1
.fsa
 
Tim Greer

Tim said:
perldoc -f chomp

Chomp removes any newline, if one exists

Pardon... to be clear, it removes the new line at the end of the string
(not just any new line).
 
Mirco Wahab

Tim said:
perldoc -f chomp

Chomp removes any newline, if one exists (which it probably would on
<>).

No, it doesn't. It removes the $/, which is
here the '>'.
It's the difference between trying to open:
$1.fsa

and

$1
.fsa

No way. In the above problem, it would on the
first record get the '>' in $1, which leads
to an open argument of ">>.fsa" which
creates a file '.fsa' that contains nothing.
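
A minimal sketch of that first-record case, run in a temporary directory so nothing is left behind. The variable names are only illustrative; the point is that without a chomp the match captures ">" and the two-argument open then sees ">>.fsa".

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start_dir = getcwd();
my $dir       = tempdir(CLEANUP => 1);
chdir $dir or die "Cannot chdir to $dir: $!";

my $record = '>';                      # first "record" when the file starts with ">"
my ($name) = $record =~ /^\s*(\S+)/;   # no chomp yet, so this captures ">"
my $target = ">$name.fsa";             # same string the one-liner builds: ">>.fsa"

# Two-argument open reads the leading ">>" as append mode on the file ".fsa".
open(my $fh, $target) or die "Cannot open $target: $!";
close $fh or die "Cannot close $target: $!";

my $stray_exists = -e '.fsa' ? 1 : 0;
chdir $start_dir or die "Cannot chdir back to $start_dir: $!";

print "open argument: $target\n";
print "stray .fsa created: ", ($stray_exists ? "yes" : "no"), "\n";
```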

Regards

M.
 
Tim Greer

Mirco said:
No, it doesn't. It removes the $/, which is
here the '>'.

My newsreader is interpreting / / and <> for some reason (and I'm not
seeing what I should be seeing), so I didn't see all of the code for
what it was, I guess. I saw while (<>) { chomp; ... } and hence my
reply. Disregard if it wasn't relevant after all.
 
xhoster

Mirco Wahab said:
I don't understand the purpose of the chomp,

It is to remove the trailing ">", which is not wanted. In FASTA sequence
files, ">" is the start of the next record, not the end of the current one.

maybe it needs to be in front of the if():

I don't see how that would make a difference. If the if fails, nothing
happens anyway. If the if succeeds, it makes no difference if the chomp
is done before or after.

Ah, but if the file starts with ">" as its first character (which it
probably does), then the first record contains nothing but $/. If you do not
chomp first, the conditional is true and you litter your file system with
invisible (on Linux) empty files named .fsa. If you do chomp first, the
conditional is false and nothing happens, which is what one wants. So yes,
the chomp should be before the if.
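
The difference can be checked with a small sketch that reads a two-record sample from an in-memory filehandle. The helper name `captured_names` and the sample data are made up for illustration.

```perl
use strict;
use warnings;

my $fasta = ">seq1\nACGT\n>seq2\nGGCC\n";

# Return what the regex captures for each record, with the chomp done
# either before the match ($chomp_first true) or not at all before it.
sub captured_names {
    my ($text, $chomp_first) = @_;
    open my $in, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = '>';                    # same record separator as the one-liner
    my @names;
    while (my $record = <$in>) {
        chomp $record if $chomp_first; # strips the trailing ">" when asked to
        push @names, $1 if $record =~ /^\s*(\S+)/;
    }
    close $in;
    return @names;
}

my @after  = captured_names($fasta, 0);
my @before = captured_names($fasta, 1);
print "match before chomp: @after\n";
print "chomp before match: @before\n";
```

With the match done first, the list starts with a bare ">" captured from the first record; with the chomp done first, only seq1 and seq2 are captured.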


Xho

 
