Need help understanding an Array push

Discussion in 'Perl Misc' started by Paul, Apr 10, 2007.

  1. Paul

    Paul Guest

    Hello there. I'm new to Perl and am trying to maintain a large
    complex script that generates some reports based on some text input
    files. I'm having difficulty understanding a section of the code that
    reads in from a text file with 3 columns of data to generate a new
    output file with 5 columns of data. I was hoping that someone might
    be able to help me understand a few lines that I find a bit complex
    right now. Here are the details..

    Sample Text input file:
    ---
    "Filename" "Field" "Content"
    "file0001.txt" "DESCRIPTION" "blah blah blah"
    "file0001.txt" "NAME" "FOO"
    "file0002.txt" "NAME" "BAR"
    ---

    Need help understanding what this section of Perl code does:

    1: foreach(@list) {
    2: ($filenm,$fld_type,$content) = (split /\"/)[1,3,5];
    3: $name =~ s/^\s+//;
    4: $name =~ s/\s+$//;
    5: push(@{$listcontent{"$filenm\{\{\{$fld_type"}},$content) if
    ($fld_type);
    6: }
    [snip]
    7: foreach(sort keys %listcontent) {
    8: ($newfilename, $fld_type) = split /\{\{\{/;
    [...]


    To start, @list is an array that holds the contents of the input file
    (sans the header) like so:
    ["\"file0001.txt\"\t\"DESCRIPTION\"\t\"blah blah blah\"\n",
    "\"file0001.txt\"\t\"NAME\"\t\"FOO\"\n",
    "\"file0002.txt\"\t\"NAME\"\t\"BAR\"\n"]

    What I know so far...

    - line 1 - 6 iterates through each row in the @list array. ok.
    - line 2 pulls out the content from each row into 3 variables. ok.
    - lines 3 and 4 strip off the leading and trailing white space. ok.

    - line 5. No idea. =(
    --> Guess: it looks like "listcontent" is an array that now holds some
    new arrangement of the data pulled out of the other array. I don't
    get what the open curly braces are doing. I tried to "print" this
    array after it ran but I didn't understand what I was seeing. It
    looked like some kind of convoluted programming mess so I thought I
    must have done something wrong.
    -> Can anyone please tell me what this new array looks like, or help
    me understand this line of code?

    - line 7 starts a new loop iterating through the new array...
    -> What does "sort keys" do? Well, okay, I know what 'sort' is, but I
    don't get the 'keys' part. It's not a subroutine in this script, so
    what's it sorting on/by here?

    - line 8 pulls out the content from each row in "listcontent" into 2
    new variables. I recognize the "\{\{\{" from line 5, but I don't get
    how this is getting separated here. i.e. I don't know what
    $newfilename looks like.
    - I suppose if I understand line 5 I'll have a better chance of
    understanding line 8.


    Can anyone please help? Thanks in advance.

    Paul.
    Paul, Apr 10, 2007
    #1

  2. J. Gleixner

    J. Gleixner Guest

    Paul wrote:
    > Hello there. I'm new to Perl and am trying to maintain a large
    > complex script that generates some reports based on some text input
    > files. I'm having difficulty understanding a section of the code that
    > reads in from a text file with 3 columns of data to generate a new
    > output file with 5 columns of data. I was hoping that someone might
    > be able to help me understand a few lines that I find a bit complex
    > right now. Here are the details..
    >
    > Sample Text input file:
    > ---
    > "Filename" "Field" "Content"
    > "file0001.txt" "DESCRIPTION" "blah blah blah"
    > "file0001.txt" "NAME" "FOO"
    > "file0002.txt" "NAME" "BAR"
    > ---
    >
    > Need help understanding what this section of Perl code does:
    >
    > 1: foreach(@list) {
    > 2: ($filenm,$fld_type,$content) = (split /\"/)[1,3,5];

    No need to escape it.

    > 3: $name =~ s/^\s+//;
    > 4: $name =~ s/\s+$//;
    > 5: push(@{$listcontent{"$filenm\{\{\{$fld_type"}},$content) if
    > ($fld_type);
    > 6: }
    > [snip]
    > 7: foreach(sort keys %listcontent) {
    > 8: ($newfilename, $fld_type) = split /\{\{\{/;
    > [...]
    >
    >
    > To start, @list is an array that holds the contents of the input file
    > (sans the header) like so:
    > ["\"file0001.txt\"\t\"DESCRIPTION\"\t\"blah blah blah\"\n",
    > "\"file0001.txt\"\t\"NAME\"\t\"FOO\"\n",
    > "\"file0002.txt\"\t\"NAME\"\t\"BAR\"\n"]
    >
    > What I know so far...
    >
    > - line 1 - 6 iterates through each row in the @list array. ok.
    > - line 2 pulls out the content from each row into 3 variables. ok.
    > - lines 3 and 4 strip off the leading and trailing white space. ok.
    >
    > - line 5. No idea. =(
    > --> Guess: it looks like "listcontent" is an array that now holds some


    It's a Hash of Lists (HoL).

    perldoc perldsc

    > new arrangement of the data pulled out of the other array. I don't
    > get what the open curly braces are doing.

    No idea. It's the key to the hash, but why someone chose that format
    is known only to you and the person who wrote it. Based on this
    small piece of code, it's not needed.

    push(@{$listcontent{"$filenm:fld_type"}},$content) if $fld_type;

    then later..

    my ($newfilename, $fld_type) = split /:/;


    > -> Can anyone please tell me what this new array looks like, or help
    > me understand this line of code?


    Yes, Data::Dumper can.

    use Data::Dumper;
    print Dumper( \%listcontent );

    >
    > - line 7 starts a new loop iterating through the new array...


    No.. it iterates over the keys of a hash. Note, you'll
    only see unique keys, so the push is a waste, or you need
    to also iterate over the elements in the array, which might
    be what's happening in the rest of the code.
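
    A minimal sketch of that nested iteration, assuming the rest of the
    script walks each key's array. The hash contents below are made up to
    mirror the sample input, and the second NAME value ('FOO2') is purely
    hypothetical, added to show a key that holds more than one element:

```perl
use strict;
use warnings;

# Hypothetical data shaped like the poster's %listcontent.
my %listcontent = (
    'file0001.txt{{{NAME'        => [ 'FOO', 'FOO2' ],
    'file0001.txt{{{DESCRIPTION' => [ 'blah blah blah' ],
    'file0002.txt{{{NAME'        => [ 'BAR' ],
);

my @rows;
foreach my $key (sort keys %listcontent) {
    # Outer loop: one pass per unique key, in sorted order.
    my ($newfilename, $fld_type) = split /\{\{\{/, $key;

    # Inner loop: one pass per value pushed under that key.
    foreach my $content (@{ $listcontent{$key} }) {
        push @rows, "$newfilename | $fld_type | $content";
    }
}

print "$_\n" for @rows;
```

    Without the inner loop, only the three keys would be visited and the
    second value under 'file0001.txt{{{NAME' would never be seen.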


    > -> What does "sort keys" do? Well, okay, I know what 'sort' is, but I
    > don't get the 'keys' part. It's not a subroutine in this script, so
    > what's it sorting on/by here?


    perldoc -f keys

    > - line 8 pulls out the content from each row in "listcontent" into 2
    > new variables.


    It splits the key found in the listcontent hash.


    >I recognize the "\{\{\{" from line 5, but I don't get
    > how this is getting separated here. i.e. I don't know what
    > $newfilename looks like.


    perldoc -f split

    No idea why you'd have "\{\{\{" in there either. Poor choice of
    separator, IMHO.

    You could simply print it out..

    print "$newfilename\n";

    > - I suppose if I understand line 5 I'll have a better chance of
    > understanding line 8.


    use Data::Dumper;
    my %data;
    push( @{ $data{ 'key1' } }, 'value 1' );
    push( @{ $data{ 'key1' } }, 'value 2' );
    print Dumper \%data;
    $VAR1 = {
    'key1' => [
    'value 1',
    'value 2'
    ]
    };

    Read through perldoc perldsc, perldoc -f keys, perldoc -f split,
    and use Data::Dumper to examine the structure, and you should be
    able to answer all of your questions.
    J. Gleixner, Apr 10, 2007
    #2

  3. Paul wrote:
    > Hello there. I'm new to Perl and am trying to maintain a large
    > complex script that generates some reports based on some text input
    > files.


    Are warnings and strict enabled in this script?


    > I'm having difficulty understanding a section of the code that
    > reads in from a text file with 3 columns of data to generate a new
    > output file with 5 columns of data. I was hoping that someone might
    > be able to help me understand a few lines that I find a bit complex
    > right now. Here are the details..
    >
    > Sample Text input file:
    > ---
    > "Filename" "Field" "Content"
    > "file0001.txt" "DESCRIPTION" "blah blah blah"
    > "file0001.txt" "NAME" "FOO"
    > "file0002.txt" "NAME" "BAR"
    > ---
    >
    > Need help understanding what this section of Perl code does:
    >
    > 1: foreach(@list) {


    Why are you reading in the whole file into an array? Is it really required?


    > 2: ($filenm,$fld_type,$content) = (split /\"/)[1,3,5];


    Your variables should be lexically scoped inside the foreach loop:

    my ( $filenm, $fld_type, $content ) = ...

    You are assuming that the indices 1, 3 and 5 of the list returned from split
    will always contain valid data. You could do that without using "magic" numbers:

    my ( $filenm, $fld_type, $content ) = /"([^"]+)"/g;
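
    To compare the two approaches side by side, here is a small sketch
    run against one line of the sample input. The variable names @by_split
    and @by_match are only for illustration and are not in the original
    script:

```perl
use strict;
use warnings;

my $line = qq{"file0001.txt"\t"DESCRIPTION"\t"blah blah blah"\n};

# Original approach: split on the quote character and pick fields 1, 3, 5.
my @by_split = (split /"/, $line)[1, 3, 5];

# Suggested approach: capture every double-quoted run in list context.
my @by_match = $line =~ /"([^"]+)"/g;

print join('|', @by_split), "\n";
print join('|', @by_match), "\n";
```

    On well-formed lines both give the same three values. Note that
    [^"]+ requires at least one character, so a line containing an empty
    field ("") would not yield an empty string for it, and the two
    approaches could then disagree.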


    > 3: $name =~ s/^\s+//;
    > 4: $name =~ s/\s+$//;


    You are modifying the $name variable. Where did this variable come from and
    what does it contain and why are you modifying it here?


    > 5: push(@{$listcontent{"$filenm\{\{\{$fld_type"}},$content) if
    > ($fld_type);


    perldoc perldata
    perldoc perldsc
    perldoc perllol

    You are pushing $content onto the array @{ $listcontent{
    "$filenm\{\{\{$fld_type" } }. The array is stored in the hash %listcontent
    using the hash key "$filenm\{\{\{$fld_type". It is known in Perl as a HoA
    (hash of arrays).
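
    As a rough sketch of what lines 1 to 6 build, here is the same push
    run over the three sample rows. It assumes @list holds exactly the
    lines shown in the question, and it uses lexical variables, which the
    original script may not:

```perl
use strict;
use warnings;

my @list = (
    qq{"file0001.txt"\t"DESCRIPTION"\t"blah blah blah"\n},
    qq{"file0001.txt"\t"NAME"\t"FOO"\n},
    qq{"file0002.txt"\t"NAME"\t"BAR"\n},
);

my %listcontent;
foreach my $line (@list) {
    my ($filenm, $fld_type, $content) = (split /"/, $line)[1, 3, 5];
    next unless $fld_type;

    # On the first push for a given key, the hash value springs into
    # existence as a reference to a new array (autovivification).
    push @{ $listcontent{"$filenm\{\{\{$fld_type"} }, $content;
}

foreach my $key (sort keys %listcontent) {
    print "$key => [", join(', ', @{ $listcontent{$key} }), "]\n";
}
```

    Each hash key is the file name and field type glued together with
    "{{{", and each hash value is a reference to an array of the content
    strings seen for that pair.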


    > 6: }
    > [snip]
    > 7: foreach(sort keys %listcontent) {
    > 8: ($newfilename, $fld_type) = split /\{\{\{/;
    > [...]




    John
    --
    Perl isn't a toolbox, but a small machine shop where you can special-order
    certain sorts of tools at low cost and in short order. -- Larry Wall
    John W. Krahn, Apr 10, 2007
    #3
  4. J. Gleixner wrote:
    > Paul wrote:
    >>
    >> - line 5. No idea. =(
    >> --> Guess: it looks like "listcontent" is an array that now holds some

    >
    > It's a Hash of Lists (HoL).


    perldoc -q "What is the difference between a list and an array"


    John
    --
    Perl isn't a toolbox, but a small machine shop where you can special-order
    certain sorts of tools at low cost and in short order. -- Larry Wall
    John W. Krahn, Apr 10, 2007
    #4
  5. On 2007-04-10 16:14, J. Gleixner <> wrote:
    > Paul wrote:
    >> Need help understanding what this section of Perl code does:
    >>
    >> 1: foreach(@list) {
    >> 2: ($filenm,$fld_type,$content) = (split /\"/)[1,3,5];
    >> 5: push(@{$listcontent{"$filenm\{\{\{$fld_type"}},$content) if
    >> ($fld_type);
    >> 6: }
    >> [snip]
    >> 7: foreach(sort keys %listcontent) {
    >> 8: ($newfilename, $fld_type) = split /\{\{\{/;
    >> [...]
    >>
    >> new arrangement of the data pulled out of the other array. I don't
    >> get what the open curly braces are doing.

    > No idea. It's the key to the hash, but why someone chose that format
    > is only known to you and the person who wrote it, based on this
    > small piece of code, it's not needed.
    >
    > push(@{$listcontent{"$filenm:fld_type"}},$content) if $fld_type;


    I think you omitted a $ before fld_type here.


    > then later..
    >
    > my ($newfilename, $fld_type) = split /:/;


    What if there is a ":" in a file name?
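
    A small sketch of that failure mode, using a made-up Windows-style
    path (the file name here is hypothetical, not from the original
    input):

```perl
use strict;
use warnings;

# Hypothetical file name that itself contains a colon.
my $key = join ':', 'C:/reports/file0001.txt', 'NAME';

# split cuts at every colon, so the drive letter becomes the "file name"
# and the rest of the path lands in the field type.
my ($newfilename, $fld_type) = split /:/, $key;

print "filename=$newfilename field=$fld_type\n";
# prints: filename=C field=/reports/file0001.txt
```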

    > No idea why you'd have "\{\{\{" in there either. Poor choice of
    > separator, IMHO.


    The programmer of the script probably thought that "{{{" was very
    unlikely to appear in a filename, whereas simpler separators like ":"
    might appear. He also seems not to have been aware of real or pseudo
    multidimensional hashes. I would have written that as:

    push(@{ $listcontent{$filenm}{$fld_type} }, $content) if $fld_type;

    and if I had a good reason to "flatten" the hash (e.g., memory
    consumption) then I'd have used a pseudo multidimensional hash:

    push(@{ $listcontent{$filenm, $fld_type} }, $content) if $fld_type;

    This also just joins $filenm and $fld_type with a separator, but
    that separator ($; (or $SUBSEP if you "use English;"), by default
    "\034") is even less probable in a filename and it's already there for
    that purpose so the reader doesn't have to think why "{{{" was chosen.
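
    For comparison, a minimal sketch of both variants over the sample
    data. The @records array and the %nested and %flat names are
    assumptions made for this illustration only:

```perl
use strict;
use warnings;

# Sample rows, already split into (file name, field type, content).
my @records = (
    [ 'file0001.txt', 'DESCRIPTION', 'blah blah blah' ],
    [ 'file0001.txt', 'NAME',        'FOO' ],
    [ 'file0002.txt', 'NAME',        'BAR' ],
);

my (%nested, %flat);
for my $rec (@records) {
    my ($filenm, $fld_type, $content) = @$rec;
    push @{ $nested{$filenm}{$fld_type} }, $content;  # real two-level hash
    push @{ $flat{$filenm, $fld_type} },   $content;  # key joined with $;
}

# Walking the nested structure needs no split at all.
my @nested_rows;
for my $filenm (sort keys %nested) {
    for my $fld_type (sort keys %{ $nested{$filenm} }) {
        push @nested_rows,
            join ' / ', $filenm, $fld_type, @{ $nested{$filenm}{$fld_type} };
    }
}

# The flattened keys are taken apart again by splitting on $;.
my $sep = $;;
my @flat_rows;
for my $key (sort keys %flat) {
    my ($filenm, $fld_type) = split /\Q$sep\E/, $key;
    push @flat_rows, join ' / ', $filenm, $fld_type, @{ $flat{$key} };
}

print "$_\n" for @nested_rows, @flat_rows;
```

    Both loops produce the same three rows here; the nested form simply
    avoids having to choose or parse a separator.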

    hp

    --
    _ | Peter J. Holzer | I know I'd be respectful of a pirate
    |_|_) | Sysadmin WSR | with an emu on his shoulder.
    | | | |
    __/ | http://www.hjp.at/ | -- Sam in "Freefall"
    Peter J. Holzer, Apr 15, 2007
    #5
  6. On 2007-04-10 16:30, John W. Krahn <> wrote:
    > Paul wrote:
    >> Hello there. I'm new to Perl and am trying to maintain a large
    >> complex script that generates some reports based on some text input
    >> files.

    [...]
    >> Need help understanding what this section of Perl code does:
    >>
    >> 1: foreach(@list) {

    >
    > Why are you reading in the whole file into an array? Is it really
    > required?


    Just a bit of stylistic nitpicking:

    I think Paul made it rather clear that he didn't write the script but is
    trying to understand a script written by someone else. So asking "Why
    are YOU reading ..." seems a bit inappropriate here.

    hp

    --
    _ | Peter J. Holzer | I know I'd be respectful of a pirate
    |_|_) | Sysadmin WSR | with an emu on his shoulder.
    | | | |
    __/ | http://www.hjp.at/ | -- Sam in "Freefall"
    Peter J. Holzer, Apr 15, 2007
    #6
