Loop through a text file line by line

Discussion in 'Perl Misc' started by toomanyjoes@mail.utexas.edu, Jan 13, 2005.

  1. Guest

    Hi,

    I'm trying to loop through a .txt file and grab a number at the
    beginning of the line. I want to save this number in a variable and
    then compare it with the number at the beginning of the next line. For
    instance my text would look like this:

    1sampletextsampletextsamplet22tsampletextsampletextsampletext
    2sampletextsampletextsamplet22tsampletextsampletextsampletext
    3sampletextsampletextsamplet22tsampletextsampletextsampletext

    The lines are distinct from one another; each one ends with a newline
    character. How do I loop through a text file line by line?

    # Here's my code so far

    # Easy way out of opening the text file (I'm not sure how)
    $myvar = "1sampletextsampletextsamplet22tsampletext";
    if ($myvar =~ /^(\d+)/){
        $curNum = "$1"; #curNum should equal 1
    }
    print "$curNum";

    This only works for an individual line. I need to move to the next line
    and do the same thing, then compare $curNum with $newNum.
     
    , Jan 13, 2005
    #1

  2. Matt Garrish Guest

    Matt Garrish, Jan 13, 2005
    #2

  3. <> wrote:

    > I'm trying to loop through a .txt file and grab a number at the
    > beginning of the line. I want to save this number in a variable and
    > then compare it with the number at the beginning of the next line.



    > The lines are distinct from one another; each one ends with a newline
    > character.



    You should speak Perl rather than English, when possible,
    (as suggested in the Posting Guidelines) then you won't need
    to try and explain it in English.

    foreach ( "1sampletextsampletextsamplet22tsampletextsampletextsampletext\n",
    "2sampletextsampletextsamplet22tsampletextsampletextsampletext\n",
    "3sampletextsampletextsamplet22tsampletextsampletextsampletext\n"
    ) {


    > How do I loop through a text file line by line?



    By using a while() loop (see the "Compound statements"
    section in perlsyn.pod), and the Input operator (see the
    "I/O Operators" section in perlop.pod).


    > $myvar = "1sampletextsampletextsamplet22tsampletext"; #easy way out of



    Errr, I thought you said the strings had newlines on the end.

    That string does NOT have a newline at the end.


    There should be a "my " at the beginning of that line.

    You _are_ enabling strict (as suggested in the Posting Guidelines),
    aren't you?


    > opening the text file (I'm not sure how)



    You use the open() function to open a file.

    perldoc -f open


    If you use the __DATA__ token, then you can post file contents
    as part of your program, as suggested in the Posting Guidelines
    (that you have read so many times).


    > if ($myvar =~ /^(\d+)/){
    > $curNum = "$1"; #curNum should equal 1
    > }
    > print "$curNum";
    >
    > This only works for an individual line. I need to move to the next line
    > and do the same thing, then compare $curNum with $newNum.



    Errr, what is $newNum? It does not appear in your code anywhere...

    What kind of comparison do you need?

    Equal/not equal to the previous?
    one more than the previous?
    larger than the previous?
    smaller than the previous?
    ....


    ---------------------------------
    #!/usr/bin/perl
    use warnings;
    use strict;

    my $curNum;
    while ( my $myvar = <DATA> ) {
        $curNum = $1 if $myvar =~ /^(\d+)/;
    }
    print "$curNum\n";

    __DATA__
    1sampletextsampletextsamplet22tsampletextsampletextsampletext
    2sampletextsampletextsamplet22tsampletextsampletextsampletext
    3sampletextsampletextsamplet22tsampletextsampletextsampletext
    ---------------------------------
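
    A minimal sketch extending the loop above so that each line's leading
    number is compared with the one from the previous line. It assumes that
    "compare" means checking that each number is exactly one more than the
    previous one. The sub name check_sequence and the in-memory filehandle
    are illustrative choices, not something from the original question.

    ```perl
    use warnings;
    use strict;

    # Hypothetical helper: read lines from a filehandle and report every line
    # whose leading number is not exactly one more than the previous line's.
    sub check_sequence {
        my ($fh) = @_;
        my ( $prevNum, @problems );
        while ( my $line = <$fh> ) {
            next unless $line =~ /^(\d+)/;    # skip lines without a leading number
            my $curNum = $1;
            if ( defined $prevNum and $curNum != $prevNum + 1 ) {
                push @problems,
                    "line $.: expected " . ( $prevNum + 1 ) . ", got $curNum";
            }
            $prevNum = $curNum;               # remember it for the next line
        }
        return @problems;
    }

    # An in-memory filehandle stands in for the real file (demo assumption).
    my $text = "1sampletext\n2sampletext\n4sampletext\n";
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    print "$_\n" for check_sequence($fh);
    close $fh;
    ```

    Swapping the != test for ==, < or > gives the other kinds of comparison
    listed above.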


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Jan 13, 2005
    #3
  4. wrote in news:1105583533.605587.261600@z14g2000cwz.googlegroups.com:

    > The lines are distinct from one another; each one ends with a newline
    > character. How do I loop through a text file line by line?


    perldoc -f readline

    Sinan.


    --
    A. Sinan Unur
    (remove '.invalid' and reverse each component for email address)
     
    A. Sinan Unur, Jan 13, 2005
    #4
  5. wrote :
    >
    > I'm trying to loop through a .txt file and grab a number at the
    > beginning of the line. I want to save this number in a variable and
    > ...


    Have you read my posting in your other thread?
    Take a look at the "sub printline", which does exactly what you are
    asking for.

    >
    > The lines are distinct from one another; each one ends with a newline
    > character. How do I loop through a text file line by line?
    >


    This is done within the while loop in the other posting.
    If you replace <DATA> with <> you can read from STDIN.
    You could also open a file with the open function (perldoc -f open).

    HTH

    --
    Epur Si Muove (Gallileo Gallilei)
     
    Martin Kissner, Jan 13, 2005
    #5
  6. On 13 Jan 2005 10:20:57 GMT, Martin Kissner <> wrote:

    >This is done within the while loop in the other posting.
    >If you replace <DATA> with <> you can read from STDIN.


    Not exactly. From 'perldoc perlop':

    | The null filehandle <> is special: it can be used to emulate the
    | behavior of sed and awk. Input from <> comes either from standard input,
    | or from each file listed on the command line. Here's how it works: the
    | first time <> is evaluated, the @ARGV array is checked, and if it is
    | empty, $ARGV[0] is set to "-", which when opened gives you standard
    | input. The @ARGV array is then processed as a list of filenames. The
    | loop


    HTH,
    Michele
    --
    {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    ..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
     
    Michele Dondi, Jan 14, 2005
    #6
  7. Michele Dondi wrote :
    > On 13 Jan 2005 10:20:57 GMT, Martin Kissner <> wrote:
    >
    >>This is done within the while loop in the other posting.
    >>If you replace <DATA> with <> you can read from STDIN.

    >
    > Not exactly. From 'perldoc perlop':
    > [...]


    In <> Arndt wrote to me:
    | The "diamond" operator <> is described in perlop and perlopentut.
    | You read from STDIN by using the normal Unix syntax:
    |
    | ./myperlscript.pl < file

    This worked for me.

    --
    Epur Si Muove (Gallileo Gallilei)
     
    Martin Kissner, Jan 14, 2005
    #7
  8. On 14 Jan 2005 10:24:09 GMT, Martin Kissner <> wrote:

    >> Not exactly. From 'perldoc perlop':
    >> [...]

    >
    >In <> Arndt wrote to me:
    >| The "diamond" operator <> is described in perlop and perlopentut.
    >| You read from STDIN by using the normal Unix syntax:
    >|
    >| ./myperlscript.pl < file
    >
    >This worked for me.


    So what?!?


    Michele
    --
    {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    ..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
     
    Michele Dondi, Jan 14, 2005
    #8
  9. Michele Dondi wrote :
    > On 14 Jan 2005 10:24:09 GMT, Martin Kissner <> wrote:
    >
    >>> Not exactly. From 'perldoc perlop':
    >>> [...]

    >>
    >>In <> Arndt wrote to me:
    >>| The "diamond" operator <> is described in perlop and perlopentut.
    >>| You read from STDIN by using the normal Unix syntax:
    >>|
    >>| ./myperlscript.pl < file
    >>
    >>This worked for me.

    >
    > So what?!?


    I said: "If you replace <DATA> with <> you can read from STDIN."
    You said: "Not exactly. ..."
    I was not sure if I have expressed myself correctly so tried to clear
    up what I meant.

    --
    Epur Si Muove (Gallileo Gallilei)
     
    Martin Kissner, Jan 14, 2005
    #9
  10. Michele Dondi wrote :
    > On 14 Jan 2005 10:24:09 GMT, Martin Kissner <> wrote:
    >
    >>> Not exactly. From 'perldoc perlop':
    >>> [...]

    >>
    >>In <> Arndt wrote to me:
    >>| The "diamond" operator <> is described in perlop and perlopentut.
    >>| You read from STDIN by using the normal Unix syntax:
    >>|
    >>| ./myperlscript.pl < file
    >>
    >>This worked for me.

    >
    > So what?!?


    I was not sure if I have expressed myself correctly so tried to clear
    up what I meant.

    --
    Epur Si Muove (Gallileo Gallilei)
     
    Martin Kissner, Jan 14, 2005
    #10
  11. Guest

    I'm still having problems opening a file. I looked at the docs and this
    is what I came up with.

    open (FILE, "> C:\notes\gen.txt") || die ("error $!\n"); #See note 1

    while (<FILE>) { #See note 2

    s/\<[NR][a\d]\>/ /g #note 3

    if (^\w\D\D\s\d+:\d+) #note 4

    Note1: In all the examples I've looked at, the actual file reference is
    always very vague, and I have never seen a full path with the file's
    extension in the quotes like I have done here.

    Note2: Here I want to loop through the file, is this correct?

    Note3: Here I want to find every instance (In the File) of a "<Ra> or a
    <N1>(2,3,etc.)" and replace it with nothing (basically remove it)

    Note4: Here I'm looking for a specific arrangement of characters at the
    beginning of the line: any character followed by 2 letters, followed by
    a space, followed by two runs of digits separated by a colon (e.g.
    1RA 12321:123214 would match, as would ADD 4:2). If a match occurs, I
    want to remove the colon and separate these items with a delimiter
    (e.g. 1RA/ 12321/123214).

    And that's as far as I've been able to get because I get an error
    opening the file. It looks to me like I've done it by the book. But I
    must be missing something. I'd appreciate any comments about my code.

    Thanks,
    Joe

    Martin Kissner wrote:
    > Michele Dondi wrote :
    > > On 14 Jan 2005 10:24:09 GMT, Martin Kissner <>

    wrote:
    > >
    > >>> Not exactly. From 'perldoc perlop':
    > >>> [...]
    > >>
    > >>In <> Arndt wrote to me:
    > >>| The "diamond" operator <> is described in perlop and perlopentut.
    > >>| You read from STDIN by using the normal Unix syntax:
    > >>|
    > >>| ./myperlscript.pl < file
    > >>
    > >>This worked for me.

    > >
    > > So what?!?

    >
    > I was not sure if I have expressed myself correctly so tried to clear
    > up what I meant.
    >
    > --
    > Epur Si Muove (Gallileo Gallilei)
     
    , Jan 14, 2005
    #11
  12. Guest

    I'm still having problems opening a file. I looked at the docs and this
    is what I came up with.

    open (FILE, "> C:\notes\gen.txt") || die ("error $!\n"); #See note 1

    while (<FILE>) { #See note 2

    s/\<[NR][a\d]\>/ /g #note 3

    if (^\w\D\D\s\d+:\d+) #note 4

    Note1: In all the examples I've looked at, the actual file reference is
    always very vague, and I have never seen a full path with the file's
    extension in the quotes like I have done here.

    Note2: Here I want to loop through the file, is this correct?

    Note3: Here I want to find every instance (In the File) of a "<Ra> or a
    <N1>(2,3,etc.)" and replace it with nothing (basically remove it)

    Note4: Here I'm looking for a specific arrangement of characters at the
    beginning of the line: any character followed by 2 letters, followed by
    a space, followed by two runs of digits separated by a colon (e.g.
    1RA 12321:123214 would match, as would ADD 4:2). If a match occurs, I
    want to remove the colon and separate these items with a delimiter
    (e.g. 1RA/ 12321/123214).

    And that's as far as I've been able to get because I get an error
    opening the file. It looks to me like I've done it by the book. But I
    must be missing something. I'd appreciate any comments about my code.

    Thanks,
    Joe

    Martin Kissner wrote:
    > Michele Dondi wrote :
    > > On 14 Jan 2005 10:24:09 GMT, Martin Kissner <>

    wrote:
    > >
    > >>> Not exactly. From 'perldoc perlop':
    > >>> [...]
    > >>
    > >>In <> Arndt wrote to me:
    > >>| The "diamond" operator <> is described in perlop and perlopentut.
    > >>| You read from STDIN by using the normal Unix syntax:
    > >>|
    > >>| ./myperlscript.pl < file
    > >>
    > >>This worked for me.

    > >
    > > So what?!?

    >
    > I was not sure if I have expressed myself correctly so tried to clear
    > up what I meant.
    >
    > --
    > Epur Si Muove (Gallileo Gallilei)
     
    , Jan 14, 2005
    #12
  13. wrote:

    > I'm still having problems opening a file. I looked at the docs and this
    > is what I came up with.
    >
    > open (FILE, "> C:\notes\gen.txt") || die ("error $!\n"); #See note 1


    First, backslashes have special meaning in double-quoted strings. You're on
    Windows obviously, so you have options. You could "escape" the backslashes
    by doubling them like this:

    "C:\\notes\\gen.txt"

    Or, you could use standard forward slashes - Windows understands them just
    fine:

    "C:/notes/gen.txt"

    And last but *certainly* not least, why use double-quotes at all? You don't
    need to. In die() you *do* need them, because you're interpolating the
    value of $! into a string - but in open() you're not doing that, so you
    should be using single quotes:

    'C:\notes\gen.txt'

    Have a look at "perldoc perlop" for more.
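
    A small sketch of the quoting difference described above: the same path
    written with double quotes, single quotes, and forward slashes. The
    variable names are illustrative only, and the "misc" warning category is
    switched off for one statement purely to keep the demo quiet.

    ```perl
    use strict;
    use warnings;

    # Double quotes: \n becomes a newline and \g collapses to a plain "g".
    my $double = do {
        no warnings 'misc';    # silences "Unrecognized escape \g passed through"
        "C:\notes\gen.txt";
    };
    my $single = 'C:\notes\gen.txt';    # backslashes are kept literally
    my $slash  = 'C:/notes/gen.txt';    # forward slashes need no escaping at all

    printf "double-quoted: %d characters, contains a newline: %s\n",
        length($double), ( $double =~ /\n/ ? 'yes' : 'no' );
    printf "single-quoted: %d characters, contains a newline: %s\n",
        length($single), ( $single =~ /\n/ ? 'yes' : 'no' );
    print "the doubled-backslash form equals the single-quoted form\n"
        if "C:\\notes\\gen.txt" eq $single;
    print "forward-slash form: $slash\n";
    ```

    The double-quoted version comes out two characters shorter because both
    backslashes are consumed as escapes.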

    You should be using the three-argument form of open, where the mode and file
    name are separate:

    open (FILE, '>', 'C:\notes\gen.txt') or die(...);

    Have a look at "perldoc perlopentut" for more. And note that '>' is used to
    open a file for *output*. You want to open the file for input, so you
    should be using '<'.

    > while (<FILE>) { #See note 2
    > Note2: Here I want to loop through the file, is this correct?


    Well, it *would* be correct if FILE had been opened for input. With each
    iteration of the while() loop, a line is read from the file and assigned to
    $_. If you wanted to use an explicit variable, rather than the implicit $_,
    you could write it like this:

    while (my $line = <FILE>) {

    > And that's as far as I've been able to get because I get an error
    > opening the file.


    I'm guessing the error looks something like this:

    File C:
    otesgen.txt does not exist

    In a double-quoted string, "\n" is a newline. "\g" has no special meaning I
    can think of offhand, so it's interpreted as just a "g". Obviously, the
    filename doesn't have a newline in it - which is why you need to escape the
    backslashes, or better still don't use them at all, or use single quotes.

    (BTW, I shouldn't have to guess at the error message. As the posting
    guidelines point out, you should post the full text of any error
    messages...)

    sherm--

    --
    Cocoa programming in Perl: http://camelbones.sourceforge.net
    Hire me! My resume: http://www.dot-app.org
     
    Sherm Pendley, Jan 14, 2005
    #13
  14. wrote:

    > I'm still having problems opening a file. I looked at the docs and this
    > is what I came up with.


    Yeah, yeah, we get it. We heard you the FIRST TWO times you posted this.
    Once is enough, really.

    sherm--

    --
    Cocoa programming in Perl: http://camelbones.sourceforge.net
    Hire me! My resume: http://www.dot-app.org
     
    Sherm Pendley, Jan 14, 2005
    #14
  15. On 14 Jan 2005 16:16:09 GMT, Martin Kissner <> wrote:

    >>>This worked for me.

    >>
    >> So what?!?

    >
    >I was not sure if I have expressed myself correctly so tried to clear
    >up what I meant.


    The point is you said more or less "use

    while (<>) { #...

    if you want to read from _STDIN_". Well, this is plainly false, as
    explained in detail in the document page you have been referred to.
    Specifically you will be reading from STDIN iff at that point @ARGV is
    empty (or $ARGV[0] eq '-').

    Hence the fact that "this worked for me" (whatever you were referring
    to) was irrelevant.


    Michele
    --
    {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    ..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
     
    Michele Dondi, Jan 15, 2005
    #15
  16. Michele Dondi wrote :
    > On 14 Jan 2005 16:16:09 GMT, Martin Kissner <> wrote:
    >
    >>>>This worked for me.
    >>>
    >>> So what?!?

    >>
    >>I was not sure if I have expressed myself correctly so tried to clear
    >>up what I meant.

    >
    > The point is you said more or less "use
    >
    > while (<>) { #...
    >
    > if you want to read from _STDIN_". Well, this is plainly false, as
    > explained in detail in the document page you have been referred to.
    > Specifically you will be reading from STDIN iff at that point @ARGV is
    > empty (or $ARGV[0] eq '-').


    Thanks for clarifying that.
    My intention was to share advice given by a user which was useful to
    me.
    Sorry for the inaccurate explanation.

    --
    Epur Si Muove (Gallileo Gallilei)
     
    Martin Kissner, Jan 15, 2005
    #16
  17. On 14 Jan 2005 11:24:03 -0800, wrote:

    >I'm still having problems opening a file. I looked at the docs and this
    >is what I came up with.
    >
    >open (FILE, "> C:\notes\gen.txt") || die ("error $!\n"); #See note 1


    I hope not to repeat (too much) what others have already said in reply to
    this:

    (1) better use lexical FHs nowadays,

    (2) better use the three args form of open() nowadays[1],

    (3) better use mode '<' when you want to _read_ from a file[2],

    (4) better use single quotes when no interpolation or escaping is
    required[3],

    (4') better _not_ try to use filenames containing "exotic" chars like
    "\n" under Windows, because they cannot exist there[4],

    (5) better use low precedence logical <or> for flow control and then

    (6) better (IMHO) avoid unnecessary parentheses, especially since Perl
    is kind enough to let us do so in the first place.
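
    The recommendations above can be sketched roughly as follows. A
    temporary file created with File::Temp stands in for the real
    'C:/notes/gen.txt' so that the example runs anywhere; the sample
    contents and the variable names are assumptions for illustration.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Create a small throwaway file; in real use $path would be the
    # single-quoted name of the file to read.
    my ( $tmp, $path ) = tempfile( UNLINK => 1 );
    print {$tmp} "1RA 12321:123214\nADD 4:2\n";
    close $tmp or die "Cannot close '$path': $!";

    # Lexical filehandle, three-argument open, explicit '<' mode,
    # low-precedence "or", and no parentheses around the arguments.
    open my $fh, '<', $path or die "Cannot open '$path' for reading: $!";
    my @lines;
    while ( my $line = <$fh> ) {
        chomp $line;           # drop the trailing newline
        push @lines, $line;
    }
    close $fh;

    print scalar(@lines), " lines read\n";
    ```

    Because "or" binds more loosely than the comma, the die only runs when
    open itself fails.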

    >while (<FILE>) { #See note 2
    >
    >s/\<[NR][a\d]\>/ /g #note 3
    >
    >if (^\w\D\D\s\d+:\d+) #note 4
    >
    >Note1: In all the examples I've looked at, the actual file reference is
    >always very vague, and I have never seen a full path with the file's
    >extension in the quotes like I have done here.


    See above!

    >Note2: Here I want to loop through the file, is this correct?


    Yes it is.

    >Note3: Here I want to find every instance (In the File) of a "<Ra> or a
    ><N1>(2,3,etc.)" and replace it with nothing (basically remove it)


    This is not _exactly_ what your regexp does (it does more!). I'd do
    either

    s/<(?:Ra|N\d)>//g;

    or

    s/<Ra>//g;
    s/<N\d>//g;

    I _suspect_ that the latter, notwithstanding the fact that it takes
    two statements, may be slightly faster. I'm not benchmarking them
    anyway.

    Also note that you don't need to escape '<' and '>'.
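
    To illustrate how the original pattern "does more", here is a small
    side-by-side sketch. The sub names strip_original and strip_narrow and
    the sample strings are made up for the comparison.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper wrapping the pattern from the question. The
    # character classes also accept <Na> and <R1>, and each match is
    # replaced with a space rather than with nothing.
    sub strip_original {
        my ($s) = @_;
        $s =~ s/\<[NR][a\d]\>/ /g;
        return $s;
    }

    # Hypothetical helper wrapping the narrower alternation: only <Ra>, or
    # <N> followed by a single digit, is removed.
    sub strip_narrow {
        my ($s) = @_;
        $s =~ s/<(?:Ra|N\d)>//g;
        return $s;
    }

    for my $in ( 'a<Ra>b', 'a<N1>b', 'a<Na>b', 'a<R2>b' ) {
        printf "%-8s original: %-8s narrow: '%s'\n",
            $in, "'" . strip_original($in) . "'", strip_narrow($in);
    }
    ```

    The last two inputs are left untouched by the narrow pattern but are
    altered by the original one.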

    [snip rest]

    >Thanks,
    >Joe
    >
    >Martin Kissner wrote:
    >> Michele Dondi wrote :
    >> > On 14 Jan 2005 10:24:09 GMT, Martin Kissner <>

    >wrote:


    Please do NOT top-post, it's considered rude and is likely to have a
    negative effect -for you, that is- in the long run.


    [1] We've recently discovered that at least one respectable and
    esteemed Perl hacker doesn't agree on this point, and with reasons
    that eventually turned out to be far from trivial. As a general rule,
    especially to a newbie like you I'd still recommend what I've written
    above.

    [2] Or nothing at all if using the two-args form of open().

    [3] But please do not take this as a "law". Common sense and practice
    should tell you when to use one or the other (or an alternative
    delimiter for either of them). And even then others' mileage may vary.

    [4] Well, I'm not sure about NTFS (don't have it!), but I suspect so.


    Michele
    --
    {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    ..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
     
    Michele Dondi, Jan 15, 2005
    #17
  18. On 14 Jan 2005 11:25:23 -0800, wrote:

    >I'm still having problems opening a file. I looked at the docs and this
    >is what I came up with.

    [snip rest]

    Please do not post the same article twice. It's considered rude.


    Michele
    --
    {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    ..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
     
    Michele Dondi, Jan 15, 2005
    #18
  19. Joe Smith Guest

    Martin Kissner wrote:

    > I said: "If you replace <DATA> with <> you can read from STDIN."
    > You said: "Not exactly. ..."
    > I was not sure if I have expressed myself correctly so tried to clear
    > up what I meant.


    If you replace <DATA> with <> you can read from STDIN, provided that
    @ARGV is empty. If there is anything in @ARGV, the items will be
    interpreted as file names and <> will read from them instead.

    Without qualifiers, your statement implied that <> always reads from
    STDIN, and that is not exactly true.
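
    A minimal sketch of that behaviour: with a file name placed in @ARGV,
    <> reads from that file and never touches STDIN. The temporary file and
    the variable names are assumptions made for the demonstration.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Write a throwaway file to stand in for a name given on the command line.
    my ( $tmp, $path ) = tempfile( UNLINK => 1 );
    print {$tmp} "from the file\n";
    close $tmp or die "Cannot close '$path': $!";

    # Simulate running the script with $path as its only argument.
    # With @ARGV non-empty, <> opens each listed file instead of STDIN.
    local @ARGV = ($path);
    my @seen;
    while ( my $line = <> ) {
        chomp $line;
        push @seen, "$ARGV: $line";    # $ARGV holds the current file's name
    }
    print "$_\n" for @seen;
    ```

    If @ARGV were left empty, the same loop would wait for input on STDIN.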
    -Joe
     
    Joe Smith, Jan 16, 2005
    #19
  20. Joe Smith wrote :
    >
    > If you replace <DATA> with <> you can read from STDIN, provided that
    > @ARGV is empty. If there is anything in @ARGV, the items will be
    > interpreted as file names and <> will read from them instead.
    >
    > Without qualifiers, your statement implied that <> always reads from
    > STDIN, and that is not exactly true.


    Thanks for that additional explanation.

    --
    Epur Si Muove (Gallileo Gallilei)
     
    Martin Kissner, Jan 16, 2005
    #20