Help: Print lines

Discussion in 'Perl Misc' started by Amy Lee, Apr 24, 2008.

  1. Amy Lee

    Amy Lee Guest

    Hello,

    I'm a newbie in Perl, and I've run into a problem processing the data from a
    file. My file looks like this:

    >CT1
    XY0002658-96
    0000222541
    >CT2
    XY0002688-55
    0000254147
    >CT5
    ZZ0004854-00
    0000475568
    ............

    When a condition matches 'CT1', I'd like to print its contents,
    'XY0002658-96 0000222541'; when it matches 'CT2', print
    'XY0002688-55 0000254147'. However, when I use

    if ( /CT1/ ) {
        print;
    }

    it just prints the label. What should I do if I want to print the contents?

    Thank you very much~

    Regards,

    Amy Lee
    Amy Lee, Apr 24, 2008
    #1

  2. Amy Lee wrote:
    > My file looks like this:
    >
    > >CT1
    > XY0002658-96
    > 0000222541
    > >CT2
    > XY0002688-55
    > 0000254147
    > >CT5
    > ZZ0004854-00
    > 0000475568
    > ............
    >
    > When a condition matches 'CT1', I'd like to print its contents,
    > 'XY0002658-96 0000222541'; when it matches 'CT2', print
    > 'XY0002688-55 0000254147'.


    C:\home>type test.pl
    while ( <DATA> ) {
        if ( /CT2/ ) {
            print scalar <DATA>;
            print scalar <DATA>;
        }
    }

    __DATA__
    >CT1
    XY0002658-96
    0000222541
    >CT2
    XY0002688-55
    0000254147
    >CT5
    ZZ0004854-00
    0000475568

    C:\home>test.pl
    XY0002688-55
    0000254147

    C:\home>

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Apr 24, 2008
    #2

  3. Amy Lee

    Amy Lee Guest

    On Thu, 24 Apr 2008 09:44:37 +0200, Gunnar Hjalmarsson wrote:

    > C:\home>type test.pl
    > while ( <DATA> ) {
    >     if ( /CT2/ ) {
    >         print scalar <DATA>;
    >         print scalar <DATA>;
    >     }
    > }
    >
    > __DATA__
    > >CT1
    > XY0002658-96
    > 0000222541
    > >CT2
    > XY0002688-55
    > 0000254147
    > >CT5
    > ZZ0004854-00
    > 0000475568
    >
    > C:\home>test.pl
    > XY0002688-55
    > 0000254147
    >
    > C:\home>


    Thank you very much. But the only book I have is Learning Perl, and I
    couldn't find out what "print scalar" is. And if the contents are not just
    2 lines but many lines, what should I do?

    Thank you again.

    Amy
    Amy Lee, Apr 24, 2008
    #3
  4. Amy Lee wrote:
    > Thank you very much. But the only book I have is Learning Perl, and I
    > couldn't find out what "print scalar" is.


    Assuming you know what print() is, please check out

    perldoc -f scalar

    > And if the contents are not just
    > 2 lines but many lines, what should I do?


    Then the above approach isn't sufficient. Something like this might do:

    while ( <DATA> ) {
        if ( /CT2/ ) {
            while ( <DATA> ) {
                last if /^>/;
                print;
            }
        }
    }
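
    The same idea can also be written as a single loop with a flag, which some
    readers find easier to extend. This is only a sketch under assumptions: the
    helper name section_lines and the in-memory sample are made up for
    illustration, and labels are assumed to start with ">" in the first column.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the lines that follow
# ">$label", stopping at the next line that starts with ">".
sub section_lines {
    my ($fh, $label) = @_;
    my $inside = 0;
    my @lines;
    while ( my $line = <$fh> ) {
        if ( $line =~ /^>(\S+)/ ) {
            last if $inside;                      # next label ends the wanted section
            $inside = ( $1 eq $label ) ? 1 : 0;   # exact comparison, not a substring match
            next;
        }
        push @lines, $line if $inside;
    }
    return @lines;
}

# Sample data held in a string so the sketch runs on its own.
my $data = join "\n", '>CT1', 'XY0002658-96', '0000222541',
                      '>CT2', 'XY0002688-55', '0000254147',
                      '>CT5', 'ZZ0004854-00', '0000475568', '';
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
print section_lines($fh, 'CT2');   # prints the two CT2 lines
close $fh;
```

    Comparing the captured label with eq avoids accidentally matching a label
    such as CT20 when looking for CT2.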

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Apr 24, 2008
    #4
  5. Amy Lee wrote:
    > Thank you very much. But the only book I have is Learning Perl, and I
    > couldn't find out what "print scalar" is.


    It isn't "print scalar"; it is "print X", where X is "scalar <DATA>".

    perldoc -f scalar
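
    To illustrate the point about context (a minimal sketch, not part of the
    original reply): print gives its arguments list context, so a bare <DATA>
    there would return every remaining line, while scalar forces a single-line
    read. The in-memory filehandle, the sample text and both helper names are
    assumptions made so that the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical sample text, read through an in-memory filehandle.
my $text = "first\nsecond\nthird\n";

# scalar() forces the readline into scalar context: exactly one line.
sub read_with_scalar {
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my @got = ( scalar <$fh> );
    close $fh;
    return @got;
}

# In list context the same readline returns all remaining lines.
sub read_in_list_context {
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my @got = <$fh>;
    close $fh;
    return @got;
}

my @one = read_with_scalar();
my @all = read_in_list_context();
printf "with scalar:  %d line(s)\n", scalar @one;   # 1
printf "list context: %d line(s)\n", scalar @all;   # 3
```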

    > And if the contents are not just
    > 2 lines but many lines, what should I do?


    perldoc -q paragraph

    ------------------ 8< ------------------
    #!/usr/bin/perl
    #
    use strict;
    use warnings;

    $/ = "\n >";
    while (my $record = <DATA>) {
        if ($record=~/CT2\n(.*)\n/s) { print $1 }
    }

    __DATA__
    >CT1
    XY0002658-96
    0000222541
    >CT2
    XY0002688-55
    0000254147
    >CT5
    ZZ0004854-00
    0000475568
    ------------------ 8< ------------------
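
    A variation on the record-separator technique, offered only as a sketch. It
    assumes the label lines start with ">" in the first column, so it sets $/
    to "\n>" (no space), whereas the script above uses "\n >"; whichever
    matches the real file is the one to use. The helper name read_records and
    the in-memory sample are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read ">LABEL" records and return a label => body hash.
sub read_records {
    my ($fh) = @_;
    local $/ = "\n>";   # assumption: every label line starts with ">" in column 0
    my %body_for;
    while ( my $record = <$fh> ) {
        chomp $record;           # chomp strips the current $/, i.e. "\n>"
        $record =~ s/\A>//;      # the first record still has its leading ">"
        $record =~ s/\s+\z//;    # the last record still has its final newline
        my ($label, $body) = split /\n/, $record, 2;
        next unless defined $label && length $label;
        $body_for{$label} = defined $body ? $body : '';
    }
    return %body_for;
}

# Sample data held in a string so the sketch runs on its own.
my $sample = join "\n", '>CT1', 'XY0002658-96', '0000222541',
                        '>CT2', 'XY0002688-55', '0000254147',
                        '>CT5', 'ZZ0004854-00', '0000475568', '';
open my $fh, '<', \$sample or die "cannot open in-memory handle: $!";
my %body_for = read_records($fh);
close $fh;
print "$body_for{CT2}\n";   # prints the two CT2 lines
```

    Using local keeps the change to $/ confined to the helper, so other reads
    in the program still work line by line.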





    --
    RGB
    RedGrittyBrick, Apr 24, 2008
    #5
  6. RedGrittyBrick wrote:
    > #!/usr/bin/perl
    > #
    > use strict;
    > use warnings;
    >
    > $/ = "\n >";
    > while (my $record = <DATA>) {
    > if ($record=~/CT2\n(.*)\n/s) { print $1 }
    > }


    Or

    $/ = "\n >";
    while (<DATA>) {
        print $1 if /CT5\n(.*)\n/s;
    }

    > __DATA__
    > >CT1
    > XY0002658-96
    > 0000222541
    > >CT2
    > XY0002688-55
    > 0000254147
    3333333333
    4444444444
    > >CT5
    > ZZ0004854-00
    > 0000475568




    --
    RGB
    RedGrittyBrick, Apr 24, 2008
    #6
  7. Amy Lee wrote:
    [snip]

    Use bioperl to parse FASTA files :)

    j.
    January Weiner, Apr 24, 2008
    #7
  8. Amy Lee

    Amy Lee Guest

    On Thu, 24 Apr 2008 11:52:24 +0100, RedGrittyBrick wrote:

    > Or
    > $/ = "\n >";
    > while (<DATA>) {
    > print $1 if /CT5\n(.*)\n/s;
    > }
    >

    Thank you. But there's a problem I can't work out. What if I want to create
    files, so that a file CT1 contains the CT1 entry including its label, a file
    CT2 contains the CT2 entry including its label, and so on? I think I would
    need to read the label.

    How can I accomplish that?

    Thank you very much~

    Regards,

    Amy Lee
    Amy Lee, Apr 25, 2008
    #8
  9. Amy Lee wrote:
    > Thank you. But there's a problem I can't work out. What if I want to create
    > files, so that a file CT1 contains the CT1 entry including its label, a file
    > CT2 contains the CT2 entry including its label, and so on? I think I would
    > need to read the label.
    >
    > How can I accomplish that?
    >


    I'm not sure I understand what you mean - it would be clearer if you
    give an example of the data.

    Did you mean

    >CT1
    XY0002658-96
    0000222541
    CT1
    4444444444
    5555555555
    >CT2
    XY0002688-55
    0000254147
    CT1
    CT2
    5555555555
    6666666666
    7777777777
    >CT5
    ZZ0004854-00
    0000475568
    CT2
    CT5
    5555555555
    6666666666

    If so, my suggested script would split the records OK because it uses
    newline space greater-than as the record separator. However, it would
    select the wrong records because the selector is now insufficiently
    precise. We want to match CT2 (say) only when it occurs at the start of a
    record. You can use the ^ character to anchor an expression to the
    start: "/CT2.../" becomes "/^CT2.../".
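
    A small sketch of the difference the anchor makes. The records below are
    written as they would look after splitting (label first, then the body
    lines), and the helper name labels_matching is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the label (first token) of each record
# that matches $pattern.
sub labels_matching {
    my ($pattern, @records) = @_;
    my @labels;
    for my $record (@records) {
        next unless $record =~ $pattern;
        my ($label) = $record =~ /\A(\S+)/;
        push @labels, $label;
    }
    return @labels;
}

# Records whose bodies also mention other labels.
my @records = (
    "CT1\nXY0002658-96\n0000222541\nCT1\n4444444444\n",
    "CT2\nXY0002688-55\n0000254147\nCT1\nCT2\n5555555555\n",
    "CT5\nZZ0004854-00\n0000475568\nCT2\nCT5\n5555555555\n",
);

my @loose  = labels_matching( qr/CT2\n/,  @records );
my @strict = labels_matching( qr/^CT2\n/, @records );
print "unanchored: @loose\n";    # unanchored: CT2 CT5
print "anchored:   @strict\n";   # anchored:   CT2
```

    Without the /m modifier, ^ matches only at the very start of the string,
    which here is the start of the record.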

    Do read the documentation - you will be able to work a lot of this out
    yourself.

    perldoc perlre
    perldoc perlop (look for "m/PATTERN")

    --
    RGB
    RedGrittyBrick, Apr 25, 2008
    #9
  10. Amy Lee

    Amy Lee Guest

    On Fri, 25 Apr 2008 14:59:17 +0100, RedGrittyBrick wrote:

    > I'm not sure I understand what you mean - it would be clearer if you
    > give an example of the data.

    Thanks for your reply. I suppose I can explain a bit more. What I mean is:
    if my file contains three entries, CT1, CT2 and CT5, then I want to make 3
    files called CT1, CT2 and CT5, where the CT1 file contains the lines of the
    CT1 entry, the CT2 file contains the lines of the CT2 entry, and so on.

    I will paste my code if I run into questions.

    Thank you again.

    Regards,

    Amy
    Amy Lee, Apr 25, 2008
    #10
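
    The task described in post #10 (one output file per entry, named after its
    label) could be sketched roughly as follows. Nothing here comes from the
    thread: the helper name split_into_files, the temporary output directory
    and the sample data are all assumptions, and labels are assumed to start
    with ">" in the first column and to consist of word characters only.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: write each ">LABEL" entry to a file named LABEL in $dir
# and return the labels in the order they were seen.
sub split_into_files {
    my ($in, $dir) = @_;
    my ($out, @written);
    while ( my $line = <$in> ) {
        if ( $line =~ /^>(\w+)/ ) {       # a label line starts a new entry
            close $out if $out;
            my $path = File::Spec->catfile($dir, $1);
            open $out, '>', $path or die "cannot write $path: $!";
            push @written, $1;
            next;                         # remove this line to keep the label in the file
        }
        print {$out} $line if $out;       # lines before the first label are skipped
    }
    close $out if $out;
    return @written;
}

# Sample data held in a string so the sketch runs on its own.
my $sample = join "\n", '>CT1', 'XY0002658-96', '0000222541',
                        '>CT2', 'XY0002688-55', '0000254147',
                        '>CT5', 'ZZ0004854-00', '0000475568', '';
open my $in, '<', \$sample or die "cannot open in-memory handle: $!";
my $dir = tempdir( CLEANUP => 1 );        # scratch directory, deleted at exit
my @labels = split_into_files($in, $dir);
close $in;
print "wrote: @labels\n";                 # wrote: CT1 CT2 CT5
```

    For real FASTA data, the BioPerl suggestion made earlier in the thread is
    likely the more robust route; this sketch only covers the simple layout
    shown in the sample.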
  11. RedGrittyBrick wrote:

    > Amy Lee wrote:


    ....

    >> Thank you. But there's a problem I can't understand. What if I hope
    >> create files like CT1 contains the CT1 label including; CT2 contains
    >> the CT2 label including and so on. However, I think I should read the
    >> label.
    >>
    >> How to accomplish that?
    >>

    >
    > I'm not sure I understand what you mean - it would be clearer if you
    > give an example of the data.
    >


    ....

    > Do read the documentation - you will be able to work a lot of this out
    > yourself.


    She seems to be here to get fish.

    As January Weiner noted, apparently her data is domain specific. I do
    not know what FASTA files are but if the OP really is working with FASTA
    files, using BioPerl would indeed be the right thing to do instead of
    trying to re-write that functionality through piecemeal questions posted
    on clpmisc.

    Sinan

    --
    A. Sinan Unur <>
    (remove .invalid and reverse each component for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://www.rehabitation.com/clpmisc/
    A. Sinan Unur, Apr 25, 2008
    #11