Help: Print lines

Amy Lee · Apr 24, 2008

Hello,

I'm a newbie in Perl, and I've run into a problem processing the data from a
file. My file looks like this:

>CT1
XY0002658-96
0000222541
>CT2
XY0002688-55
0000254147
>CT5
ZZ0004854-00
0000475568
............

When a condition matches 'CT1', I'd like to print its contents, 'XY0002658-96
0000222541'; if it matches 'CT2', print 'XY0002688-55
0000254147'. However, when I use

if ( /CT1/ ) {
    print;
}

it just prints the label. What should I change to print the contents instead?

Thank you very much~

Regards,

Amy Lee

Gunnar Hjalmarsson · Apr 24, 2008

Amy said:
My file looks like this:

ZZ0004854-00
0000475568
...........

When a condition matches 'CT1', I'd like to print its contents, 'XY0002658-96
0000222541'; if it matches 'CT2', print 'XY0002688-55 0000254147'.

C:\home>type test.pl
while ( <DATA> ) {
    if ( /CT2/ ) {
        print scalar <DATA>;
        print scalar <DATA>;
    }
}

__DATA__
>CT1
XY0002658-96
0000222541
>CT2
XY0002688-55
0000254147
>CT5
ZZ0004854-00
0000475568

C:\home>test.pl
XY0002688-55
0000254147

C:\home>

Amy Lee · Apr 24, 2008

C:\home>type test.pl
while ( <DATA> ) {
    if ( /CT2/ ) {
        print scalar <DATA>;
        print scalar <DATA>;
    }
}

__DATA__
ZZ0004854-00
0000475568

C:\home>test.pl
XY0002688-55
0000254147

C:\home>

Thank you very much. But the only book I have is Learning Perl, and I couldn't
find out what "print scalar" is. And if the content does not contain just
2 lines but multiple lines, what should I do?

Thank you again.

Amy

Gunnar Hjalmarsson · Apr 24, 2008

Amy said:
Thank you very much. But the only book I have is Learning Perl, and I couldn't
find out what "print scalar" is.

Assuming you know what print() is, please check out

perldoc -f scalar

And if the content does not contain just
2 lines but multiple lines, what should I do?

Then the above approach isn't sufficient. Something like this might do:

while ( <DATA> ) {
    if ( /CT2/ ) {
        while ( <DATA> ) {
            last if /^>/;
            print;
        }
    }
}
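
A single-loop variation on the same idea, offered only as a sketch: it assumes every record starts with a ">LABEL" line, and the helper name record_lines and the in-memory sample are made up for illustration. It collects the lines under one label and stops at the next header or at end of file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines that follow the ">$label" header,
# stopping at the next header line or at end of file.
sub record_lines {
    my ( $fh, $label ) = @_;
    my @lines;
    my $in_record = 0;
    while ( my $line = <$fh> ) {
        if ( $line =~ /^>/ ) {
            last if $in_record;    # the next header ends our record
            $in_record = 1 if $line =~ /^>\Q$label\E\s*$/;    # exact label
            next;
        }
        push @lines, $line if $in_record;
    }
    return @lines;
}

# Sample data in the layout assumed above, read via an in-memory filehandle.
my $sample = ">CT1\nXY0002658-96\n0000222541\n"
           . ">CT2\nXY0002688-55\n0000254147\n"
           . ">CT5\nZZ0004854-00\n0000475568\n";

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print record_lines( $fh, 'CT2' );
close $fh;
```

Matching the whole header line rather than just /CT2/ means a label such as CT20 would not be picked up by accident.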

RedGrittyBrick · Apr 24, 2008

Amy said:
Thank you very much. But the only book I have is Learning Perl, and I couldn't
find out what "print scalar" is.

It isn't "print scalar"; it is "print X", where X is "scalar <DATA>".

perldoc -f scalar
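
To see why scalar matters here, a small sketch may help. print gives its arguments list context, and in list context <FH> returns every remaining line, while scalar forces a single-line read. The helper names and the sample string below are invented for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data in the ">LABEL" layout assumed from this thread.
my $sample = ">CT1\nXY0002658-96\n0000222541\n"
           . ">CT2\nXY0002688-55\n0000254147\n";

# Hypothetical helper: read in list context and count what came back.
sub count_list_read {
    my ($fh) = @_;
    my @lines = ( <$fh> );           # list context: all remaining lines
    return scalar @lines;
}

# Hypothetical helper: same read, but forced into scalar context.
sub count_scalar_read {
    my ($fh) = @_;
    my @lines = ( scalar <$fh> );    # scalar() limits the read to one line
    return scalar @lines;
}

open my $fh1, '<', \$sample or die "Cannot open in-memory file: $!";
print "list context read:   ", count_list_read($fh1), " line(s)\n";
close $fh1;

open my $fh2, '<', \$sample or die "Cannot open in-memory file: $!";
print "scalar context read: ", count_scalar_read($fh2), " line(s)\n";
close $fh2;
```

So "print <DATA>;" inside the if block would dump the rest of the file, whereas "print scalar <DATA>;" prints just the next line.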

And if the content does not contain just
2 lines but multiple lines, what should I do?

perldoc -q paragraph

------------------ 8< ------------------
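#!/usr/bin/perl
#
use strict;
use warnings;

# Read one record per iteration: records are separated by newline, space, ">".
$/ = "\n >";
while (my $record = <DATA>) {
    # /s lets . match newlines, so $1 can span several lines.
    if ($record =~ /CT2\n(.*)\n/s) { print $1 }
}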

__DATA__

 >CT1
XY0002658-96
0000222541
 >CT2
XY0002688-55
0000254147
 >CT5
ZZ0004854-00
0000475568
------------------ 8< ------------------

RedGrittyBrick · Apr 24, 2008

RedGrittyBrick said:
#!/usr/bin/perl
#
use strict;
use warnings;

$/ = "\n >";
while (my $record = <DATA>) {
    if ($record =~ /CT2\n(.*)\n/s) { print $1 }
}

Or
$/ = "\n >";
while (<DATA>) {
    print $1 if /CT5\n(.*)\n/s;
}

January Weiner · Apr 24, 2008

[snip]

Use bioperl to parse FASTA files

j.

Amy Lee · Apr 25, 2008

Or
$/ = "\n >";
while (<DATA>) {
    print $1 if /CT5\n(.*)\n/s;
}

Thank you. But there's one thing I can't work out. What if I want to create
one file per label, so that a file named CT1 contains the CT1 entry (label
included), a file named CT2 contains the CT2 entry, and so on? I think I
would need to read the label first.

How can I accomplish that?

Thank you very much~

Regards,

Amy Lee

RedGrittyBrick · Apr 25, 2008

Amy said:
Thank you. But there's one thing I can't work out. What if I want to create
one file per label, so that a file named CT1 contains the CT1 entry (label
included), a file named CT2 contains the CT2 entry, and so on? I think I
would need to read the label first.

How can I accomplish that?

I'm not sure I understand what you mean - it would be clearer if you
give an example of the data.

Did you mean

 >CT1
XY0002658-96
0000222541
CT1
4444444444
5555555555
 >CT2
XY0002688-55
0000254147
CT1
CT2
5555555555
6666666666
7777777777
 >CT5
ZZ0004854-00
0000475568
CT2
CT5
5555555555
6666666666

If so, my suggested script would split the records OK because it uses
newline space greater-than as the record separator. However, it would
select the wrong records because the selector is now insufficiently
precise. We want to match CT2 (say) only when it occurs at the start of a
record. You can use the ^ character to anchor an expression to the
start. "/CT2.../" becomes "/^CT2.../"

Do read the documentation - you will be able to work a lot of this out
yourself.

perldoc perlre
perldoc perlop (look for "m/PATTERN")
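
To make the anchoring point concrete, here is a rough sketch. It assumes records have already been split on "\n >" so each one begins with its label; the two helper names and the sample records are invented for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the original, unanchored selector.
sub unanchored_body {
    my ( $record, $label ) = @_;
    return $record =~ /\Q$label\E\n(.*)\n/s ? $1 : undef;
}

# Hypothetical helper: same selector, anchored to the start of the record.
# Without /m, ^ matches only at the very beginning of the string.
sub anchored_body {
    my ( $record, $label ) = @_;
    return $record =~ /^\Q$label\E\n(.*)\n/s ? $1 : undef;
}

# A CT1 record whose contents happen to include a line reading "CT2",
# followed by a genuine CT2 record. Both end with the separator.
my $ct1_record = "CT1\nXY0002658-96\n0000222541\nCT2\n5555555555\n >";
my $ct2_record = "CT2\nXY0002688-55\n0000254147\n >";

for my $record ( $ct1_record, $ct2_record ) {
    my $loose  = unanchored_body( $record, 'CT2' );
    my $strict = anchored_body( $record, 'CT2' );
    printf "unanchored: %-8s | anchored: %s\n",
        defined $loose  ? 'match' : 'no match',
        defined $strict ? 'match' : 'no match';
}
```

The unanchored pattern is fooled by the CT1 record because "CT2" appears in its contents; the anchored one only accepts the record that really starts with CT2.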

Amy Lee · Apr 25, 2008

I'm not sure I understand what you mean - it would be clearer if you
give an example of the data.

Did you mean
ZZ0004854-00
0000475568
CT2
CT5
5555555555
6666666666

If so, my suggested script would split the records OK because it uses
newline space greater-than as the record separator. However, it would
select the wrong records because the selector is now insufficiently
precise. We want to match CT2 (say) only when it occurs at the start of a
record. You can use the ^ character to anchor an expression to the
start. "/CT2.../" becomes "/^CT2.../"

Do read the documentation - you will be able to work a lot of this out
yourself.

perldoc perlre
perldoc perlop (look for "m/PATTERN")

Thanks for your reply. I suppose I can write some of it myself. What I
mean is: if my file contains the three entries CT1, CT2 and CT5, then I
want to create 3 files called CT1, CT2 and CT5, where the CT1 file contains
the contents of the CT1 entry, the CT2 file contains the contents of the
CT2 entry, and so on.

I will post my code if I run into questions.

Thank you again.

Regards,

Amy
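
For what it's worth, the one-file-per-label idea could be sketched roughly like this. It is only an illustration under some assumptions: every entry starts with a ">LABEL" line, labels are plain word characters, and the helper name split_by_label plus the temporary output directory are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: copy the contents of each entry read from $in into
# a file named after its label inside $dir. Returns the labels seen.
sub split_by_label {
    my ( $in, $dir ) = @_;
    my ( $out, @labels );
    while ( my $line = <$in> ) {
        if ( $line =~ /^>(\w+)/ ) {
            my $label = $1;
            if ($out) { close $out or die "Cannot close output: $!" }
            # '>' truncates, so a repeated label would overwrite its file.
            open $out, '>', "$dir/$label"
                or die "Cannot write $dir/$label: $!";
            push @labels, $label;
            next;    # the label becomes the file name, not file content
        }
        print {$out} $line if $out;    # skip anything before the first label
    }
    if ($out) { close $out or die "Cannot close output: $!" }
    return @labels;
}

my $sample = ">CT1\nXY0002658-96\n0000222541\n"
           . ">CT2\nXY0002688-55\n0000254147\n"
           . ">CT5\nZZ0004854-00\n0000475568\n";

my $dir = tempdir( CLEANUP => 1 );    # scratch directory for the demo
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
my @labels = split_by_label( $in, $dir );
close $in;
print "wrote files: @labels\n";
```

If the label line should be kept inside each file as well, printing $line to the new handle just before the "next" would do it. For a real input file, open it by name instead of using the in-memory string.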

A. Sinan Unur · Apr 25, 2008

Amy Lee wrote:
....

I'm not sure I understand what you mean - it would be clearer if you
give an example of the data.

....

Do read the documentation - you will be able to work a lot of this out
yourself.

She seems to be here to get fish.

As January Weiner noted, apparently her data is domain specific. I do
not know what FASTA files are but if the OP really is working with FASTA
files, using BioPerl would indeed be the right thing to do instead of
trying to re-write that functionality through piecemeal questions posted
on clpmisc.

Sinan

--
A. Sinan Unur

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/
