Arrays instead of files into hashes

Discussion in 'Perl Misc' started by Francois Massion, Jan 12, 2009.

  1. I have code which compares 2 lists of words and churns out the
    difference. So far I have done this by reading the content of files
    into a hash, but I would like to use 2 arrays, says @array1 and
    @array2 instead of files a.txt and b.txt.

    I haven't managed to do this successfuly. Any suggestions? Her the
    functioning code with the files:

    #!/usr/bin/perl -w
    use warnings; use strict;
    use utf8; #
    binmode STDIN, ":utf8"; # input
    binmode STDOUT, ":utf8"; # output

    my %list2;
    open(WORDLIST2,'C:\temp\b.txt') or
    die("cannot open 'C:\temp\b.txt' because $! !\n");
    while (<WORDLIST2>) {
    $list2{$_}=1; #or any other value
    }
    open(WORDLIST1,'C:\temp\a.txt') or
    die("cannot open file1 because $! !\n");
    while (<WORDLIST1>){
    print unless exists($list2{$_});
    }
     
    Francois Massion, Jan 12, 2009
    #1
    1. Advertising

  2. Francois Massion

    Tim Greer Guest

    Francois Massion wrote:

    > I have code which compares 2 lists of words and churns out the
    > difference. So far I have done this by reading the content of files
    > into a hash, but I would like to use 2 arrays, says @array1 and
    > @array2 instead of files a.txt and b.txt.
    >
    > I haven't managed to do this successfuly. Any suggestions? Her the
    > functioning code with the files:



    The idea of using hashes instead of arrays, is better, why do you want
    to change it to arrays? You mean arrays to hold the data that is
    usually located in the {a,b}.txt files? Where is that information
    being generated from where you could have the data saved into an array,
    or are you asking if you can take the data from the files and put them
    into arrays? Either way, you shou;dn't want to do that. If you are
    using data generated from somewhere, then add that new value to a hash
    and see if it exists are you receive the data.

    There's no reason to use an array first. If you want to take the data
    from the files, just do what you originally were wanting to do, and use
    a hash, too. Or do you mean that you want to save the final, unique
    values into an array, after you've compared using a hash solution?
    Also, you mentioned you wanted to use arrays, but it's not working, so
    can you show the code you're trying to use with arrays and what you
    expect and what's not working? Sorry, I'm a little confused by what
    you're trying to accomplish, so please elaborate.
    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
     
    Tim Greer, Jan 12, 2009
    #2
    1. Advertising

  3. Francois Massion

    Tim Greer Guest

    Tim Greer wrote:

    > and see if it exists are you receive the data.

    ^^^^^^^^^^

    "as you receive the data"
    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
     
    Tim Greer, Jan 12, 2009
    #3
  4. Francois Massion

    J. Gleixner Guest

    Francois Massion wrote:
    > I have code which compares 2 lists of words and churns out the
    > difference. So far I have done this by reading the content of files
    > into a hash, but I would like to use 2 arrays, says @array1 and
    > @array2 instead of files a.txt and b.txt.
    >
    > I haven't managed to do this successfuly. Any suggestions? Her the
    > functioning code with the files:
    >
    > #!/usr/bin/perl -w
    > use warnings; use strict;
    > use utf8; #
    > binmode STDIN, ":utf8"; # input
    > binmode STDOUT, ":utf8"; # output
    >
    > my %list2;
    > open(WORDLIST2,'C:\temp\b.txt') or
    > die("cannot open 'C:\temp\b.txt' because $! !\n");
    > while (<WORDLIST2>) {
    > $list2{$_}=1; #or any other value
    > }
    > open(WORDLIST1,'C:\temp\a.txt') or
    > die("cannot open file1 because $! !\n");
    > while (<WORDLIST1>){
    > print unless exists($list2{$_});
    > }


    If you already have @array1, you could do the following to
    create a hash that has the unique values from @array1.

    my %list2 = map { $_ => 1 } @array1;

    See also: perldoc -q "How do I compute the difference of two arrays"
     
    J. Gleixner, Jan 12, 2009
    #4
  5. Francois Massion

    Guest

    On Mon, 12 Jan 2009 10:53:23 -0800 (PST), Francois Massion <> wrote:

    >I have code which compares 2 lists of words and churns out the
    >difference. So far I have done this by reading the content of files
    >into a hash, but I would like to use 2 arrays, says @array1 and
    >@array2 instead of files a.txt and b.txt.
    >
    >I haven't managed to do this successfuly. Any suggestions? Her the
    >functioning code with the files:
    >
    >#!/usr/bin/perl -w
    >use warnings; use strict;
    >use utf8; #
    >binmode STDIN, ":utf8"; # input
    >binmode STDOUT, ":utf8"; # output
    >
    >my %list2;
    >open(WORDLIST2,'C:\temp\b.txt') or
    > die("cannot open 'C:\temp\b.txt' because $! !\n");
    >while (<WORDLIST2>) {
    > $list2{$_}=1; #or any other value
    >}
    >open(WORDLIST1,'C:\temp\a.txt') or
    > die("cannot open file1 because $! !\n");
    >while (<WORDLIST1>){
    > print unless exists($list2{$_});
    >}


    This some real old code. True beginner code.
    It resolves into the common elements of two arrays.
    I know your looking for the difference but you could
    extrapolate difference from common. It works for what I
    use it for, merging directories, renaming dups.

    If you want the whole program for usage exampl just say so.
    Good luck!

    sln

    #######################################################
    # Get Common Elements (from two N-dimensioned Array's)
    # IN - Refs to the NxN arrays to compare,
    # sort flag and the compare field.
    # OUT - Ndx's into Right_Array of matching elements
    # ---------------------------------------------------
    # Notes -
    # 1. Elements are assumed textual and case insensitive
    # 2. Ignores in-array duplicates
    # 3. Sort will be done if sort flag > 0
    #
    sub GetCommonElementsNxM($$$$)
    {
    my ($A_Left,$A_Right,$Srtflg,$Fld) = @_;
    $Srtflg = 0 unless defined $Srtflg;
    $Fld = 0 unless defined $Fld;
    # my @Dup = ();
    my @Ndx = ();

    if ($Srtflg > 0) {
    @{$A_Left} = sort {uc($a->[$Fld]) cmp uc($b->[$Fld])} @{$A_Left};
    @{$A_Right} = sort {uc($a->[$Fld]) cmp uc($b->[$Fld])} @{$A_Right};
    } else {print "==> Common Elements : Not sorting arrays\n";}

    my $rpos = 0;
    my $rend = @{$A_Right};
    my $cnt = 0;
    my $llast = undef;
    my $rlast = undef;
    foreach my $left_element (@{$A_Left})
    {
    next if (uc($left_element->[$Fld]) eq uc($llast->[$Fld]));

    $rpos += $cnt;
    $cnt = 0;
    foreach my $right_element (@{$A_Right}[$rpos..($rend-1)])
    {
    last if (uc($left_element->[$Fld]) lt uc($right_element->[$Fld]));
    $cnt++;
    next if (uc($right_element->[$Fld]) eq uc($rlast->[$Fld]));
    if (uc($left_element->[$Fld]) eq uc($right_element->[$Fld]))
    {
    # push (@Dup, $right_element->[$Fld]); # the string
    push (@Ndx, $rpos+$cnt-1); # the index into R_Array
    last;
    }
    $rlast = $right_element;
    }
    $llast = $left_element;
    last if ($rpos >= $rend);
    }
    # return (@Dup);
    return (@Ndx);
    }


    #######################################################
    # Get Common Elements from single Array's
    # IN - Refs to the Nx1 arrays to compare, sort flag
    # OUT - Ndx's into Right_Array of matching elements
    # ---------------------------------------------------
    # Notes -
    # 1. Elements are assumed textual and case insensitive
    # 2. Ignores in-array duplicates
    # 3. Sort will be done if sort flag > 0
    #######################################################
    sub GetCommonElements($$$)
    {
    my ($A_Left,$A_Right,$Srtflg) = @_;
    $Srtflg = 0 unless defined $Srtflg;
    # my @Dup = ();
    my @Ndx = ();

    if ($Srtflg > 0) {
    @{$A_Left} = sort {uc($a) cmp uc($b)} @{$A_Left};
    @{$A_Right} = sort {uc($a) cmp uc($b)} @{$A_Right};
    } else {print "==> Common Elements : Not sorting arrays\n";}

    my $rpos = 0;
    my $rend = @{$A_Right};
    my $cnt = 0;
    my $llast = '';
    my $rlast = '';
    foreach my $left_element (@{$A_Left})
    {
    next if (uc($left_element) eq uc($llast));

    $rpos += $cnt;
    $cnt = 0;
    foreach my $right_element (@{$A_Right}[$rpos..($rend-1)])
    {
    last if (uc($left_element) lt uc($right_element));
    $cnt++;
    next if (uc($right_element) eq uc($rlast));
    if (uc($left_element) eq uc($right_element))
    {
    # push (@Dup, $right_element); # the string
    push (@Ndx, $rpos+$cnt-1); # the index into R_Array
    last;
    }
    $rlast = $right_element;
    }
    $llast = $left_element;
    last if ($rpos >= $rend);
    }
    # return (@Dup);
    return (@Ndx);
    }
     
    , Jan 12, 2009
    #5
  6. Francois Massion <> wrote:

    > I would like to use 2 arrays, says @array1 and
    > @array2 instead of files a.txt and b.txt.



    > while (<WORDLIST2>) {
    > $list2{$_}=1; #or any other value
    > }



    my @b_txt = <WORDLIST2>;
    close WORDLIST2;
    foreach ( @b_txt ) {
    $list2{$_} = 1;
    }


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Jan 12, 2009
    #6
  7. Francois Massion

    Guest

    On Mon, 12 Jan 2009 17:55:36 -0600, Tad J McClellan <> wrote:

    >Francois Massion <> wrote:
    >
    >> I would like to use 2 arrays, says @array1 and
    >> @array2 instead of files a.txt and b.txt.

    >
    >
    >> while (<WORDLIST2>) {
    >> $list2{$_}=1; #or any other value
    >> }

    >
    >
    > my @b_txt = <WORDLIST2>;

    Isin't slurp a bit rich?
    [snip]

    sln
     
    , Jan 13, 2009
    #7
  8. Francois Massion

    Tim Greer Guest

    wrote:

    > On Mon, 12 Jan 2009 17:55:36 -0600, Tad J McClellan
    > <> wrote:
    >
    >>Francois Massion <> wrote:
    >>
    >>> I would like to use 2 arrays, says @array1 and
    >>> @array2 instead of files a.txt and b.txt.

    >>
    >>
    >>> while (<WORDLIST2>) {
    >>> $list2{$_}=1; #or any other value
    >>> }

    >>
    >>
    >> my @b_txt = <WORDLIST2>;

    > Isin't slurp a bit rich?
    > [snip]
    >
    > sln


    Probably, but that's what the OP asked for. I can't imagine why, but
    perhaps they just need to elaborate on their reasons to get a good
    answer. Personally, I didn't answer them, because I had to ask what
    they were wanting to do, since it didn't seem clear to me (or didn't
    seem to have a purpose and would just waste good processing).
    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
     
    Tim Greer, Jan 13, 2009
    #8
  9. Francois Massion

    Guest

    On Mon, 12 Jan 2009 16:41:12 -0800, Tim Greer <> wrote:

    > wrote:
    >
    >> On Mon, 12 Jan 2009 17:55:36 -0600, Tad J McClellan
    >> <> wrote:
    >>
    >>>Francois Massion <> wrote:
    >>>
    >>>> I would like to use 2 arrays, says @array1 and
    >>>> @array2 instead of files a.txt and b.txt.
    >>>
    >>>
    >>>> while (<WORDLIST2>) {
    >>>> $list2{$_}=1; #or any other value
    >>>> }
    >>>
    >>>
    >>> my @b_txt = <WORDLIST2>;

    >> Isin't slurp a bit rich?
    >> [snip]
    >>
    >> sln

    >
    >Probably, but that's what the OP asked for. I can't imagine why, but
    >perhaps they just need to elaborate on their reasons to get a good
    >answer. Personally, I didn't answer them, because I had to ask what
    >they were wanting to do, since it didn't seem clear to me (or didn't
    >seem to have a purpose and would just waste good processing).


    There's a thin line between yes and no, but if its not yes....

    sln
     
    , Jan 13, 2009
    #9
  10. Francois Massion

    Tim Greer Guest

    wrote:

    > On Mon, 12 Jan 2009 16:41:12 -0800, Tim Greer <>
    > wrote:
    >
    >> wrote:
    >>
    >>> On Mon, 12 Jan 2009 17:55:36 -0600, Tad J McClellan
    >>> <> wrote:
    >>>
    >>>>Francois Massion <> wrote:
    >>>>
    >>>>> I would like to use 2 arrays, says @array1 and
    >>>>> @array2 instead of files a.txt and b.txt.
    >>>>
    >>>>
    >>>>> while (<WORDLIST2>) {
    >>>>> $list2{$_}=1; #or any other value
    >>>>> }
    >>>>
    >>>>
    >>>> my @b_txt = <WORDLIST2>;
    >>> Isin't slurp a bit rich?
    >>> [snip]
    >>>
    >>> sln

    >>
    >>Probably, but that's what the OP asked for. I can't imagine why, but
    >>perhaps they just need to elaborate on their reasons to get a good
    >>answer. Personally, I didn't answer them, because I had to ask what
    >>they were wanting to do, since it didn't seem clear to me (or didn't
    >>seem to have a purpose and would just waste good processing).

    >
    > There's a thin line between yes and no, but if its not yes....
    >
    > sln


    Its (its) or it's (it is) not yes, and what isn't? I'd say it is yes
    (being it's "probably" going to be more resource intensive to grab all
    of the contents of WORLDLIST2 into an array), only to compare then,
    comparing it to a while loop to see if a hash value exists before
    adding it (if the OP wants to add it). I absolutely don't understand
    your cryptic response. I'm certain you were being sarcastic, but I
    don't know if you mean to have a question in there.
    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
     
    Tim Greer, Jan 13, 2009
    #10
  11. Francois Massion

    Guest

    On Mon, 12 Jan 2009 19:15:37 -0800, Tim Greer <> wrote:

    > wrote:
    >
    >> On Mon, 12 Jan 2009 16:41:12 -0800, Tim Greer <>
    >> wrote:
    >>
    >>> wrote:
    >>>
    >>>> On Mon, 12 Jan 2009 17:55:36 -0600, Tad J McClellan
    >>>> <> wrote:
    >>>>
    >>>>>Francois Massion <> wrote:
    >>>>>
    >>>>>> I would like to use 2 arrays, says @array1 and
    >>>>>> @array2 instead of files a.txt and b.txt.
    >>>>>
    >>>>>
    >>>>>> while (<WORDLIST2>) {
    >>>>>> $list2{$_}=1; #or any other value
    >>>>>> }
    >>>>>
    >>>>>
    >>>>> my @b_txt = <WORDLIST2>;
    >>>> Isin't slurp a bit rich?
    >>>> [snip]
    >>>>
    >>>> sln
    >>>
    >>>Probably, but that's what the OP asked for. I can't imagine why, but
    >>>perhaps they just need to elaborate on their reasons to get a good
    >>>answer. Personally, I didn't answer them, because I had to ask what
    >>>they were wanting to do, since it didn't seem clear to me (or didn't
    >>>seem to have a purpose and would just waste good processing).

    >>
    >> There's a thin line between yes and no, but if its not yes....
    >>
    >> sln

    >
    >Its (its) or it's (it is) not yes, and what isn't? I'd say it is yes
    >(being it's "probably" going to be more resource intensive to grab all
    >of the contents of WORLDLIST2 into an array), only to compare then,
    >comparing it to a while loop to see if a hash value exists before
    >adding it (if the OP wants to add it). I absolutely don't understand
    >your cryptic response. I'm certain you were being sarcastic, but I
    >don't know if you mean to have a question in there.


    This no joke! I don't write anything unles I need to!
    I said before this is beginner code. No sarcasm inteneded.
    I do not want to go through this. The prototypes are wrong along
    with alot of other stuff. Amazingly, this heards all the dup porn out
    there into safely stored data. Go figure... it works like a champ!!

    sln

    mrg_test1.pl
    ----------

    use strict;
    use warnings;
    require SMG;

    use File::stat;
    use File::Spec;
    use File::path;
    use File::Copy;
    use File::Find;
    use sort 'stable';

    my $VERSION = 1.00;

    my $current = sort::current();
    #print "\n==> sort : $current\n\n";

    # Test smerge alone
    #-----------------------
    if (1)
    {
    #
    my $ToDir = 'D:\agent\t1';
    my $FromDir = 'D:\agent\t2';
    my $PrefixName = time()."_";
    my $PrefixLevel = 5;
    my @Exclude = ('\.jbf$', '\.alb$'); # if all goes ok, these will be rmdir'd with the from, but not moved
    my $Duponly = 0;
    my $RmvFrom = 0;

    if (SMG::SafeMerge ($ToDir,$FromDir,$PrefixName,$PrefixLevel,\@Exclude,$Duponly,$RmvFrom)) {
    print "\nSafe Merge exited ok.\n";
    } else { print "\nSafe Merge had a headache.\n"; }

    }

    ---------------------
    smg.pm
    ----------
    package SMG;
    use strict;
    use warnings;
    use File::stat;
    use File::path;
    use File::Copy;
    use File::Find;
    use File::Spec;
    use sort 'stable';


    require Exporter;
    our @ISA = qw(Exporter);
    our @EXPORT = qw(SafeMerge GetCommonElementsNxM GetCommonElements);
    my $VERSION = 1.00;


    #my $current = sort::current();



    ####################################################################################
    # SAFE MERGE
    # The two major options -
    # 1. Merge 2 directories into 1, preserving/renaming duplicate named files.
    # 2. Compare 2 directories and just rename the duplicates of 1 directory.
    # Also -
    # - The prefix level will be on a per-file monitored basis.
    # - File/Dir names are cached and renamed in the "From" array,
    # - Move is done via os-rename function into the "ToDir".
    # - Remove "From" directory option (after rename).
    # - Duplicate's-renaming-only is done in-place in the "FromDir".
    # - Report output is created in the current directory (for now).
    # - Itteration scheme are within-directory, then cross-directory until no more dups.
    #-----------------------------------------------------------------------------------
    sub SafeMerge ($$$$$$$)
    {
    return 0 if (@_ < 2 || @_ > 7);
    my ($ToDir,$FromDir,$PrefixName,$PrefixLevel,$Exclude,$Duponly,$RmvFrom) = @_;

    $PrefixName = '' unless defined $PrefixName; # Renaming Prefix
    $PrefixLevel = 0 unless defined $PrefixLevel; # Number of times to += prefix

    $Exclude = [] unless defined $Exclude; # Exclude list (must be re)
    $Duponly = 0 unless defined $Duponly; # Flag to just rename dups in "from"
    $RmvFrom = 0 unless defined $RmvFrom; # Flag to remove "from" after operation

    my @To = ();
    my @From = ();
    my @ndx = ();
    my @Xfound = ();
    my $dupsize = @ndx;
    my $passes = 0;

    print "\n\n *******************************\n";
    print "********[ Safe Merge ]*********\n";
    print " *******************************\n";
    print "\nPrefix\t$PrefixName\nLevel\t$PrefixLevel\nChecks ... ";

    $ToDir = File::Spec->canonpath( "$ToDir" );
    $FromDir = File::Spec->canonpath( "$FromDir" );
    #print "$ToDir\n$FromDir";
    if ($ToDir eq $FromDir) { print "\rFrom/To are identical: $ToDir\n"; return 0; }
    if (!-e $ToDir) { print "\rTo directory does not exist: $ToDir\n"; return 0; }
    if (!-e $FromDir) { print "\rFrom directory does not exist: $FromDir\n"; return 0;}

    # Glob seems to be bothered by spaces in dir path names
    my $Sep = "\\".File::Spec->canonpath( "/" );
    {
    my $td = File::Spec->canonpath( "$ToDir/*" );
    my $fd = File::Spec->canonpath( "$FromDir/*" );
    if ($td =~ / /) {
    $td = "'$td'";
    }
    if ($fd =~ / /) {
    $fd = "'$fd'";
    }
    foreach (glob ("$td")) {
    /.+$Sep(.+)$/;
    # /.+\/(.+)$/;
    push (@To, [$1, 0]);
    }
    foreach (glob ("$fd")) {
    /.+$Sep(.+)$/;
    # /.+\/(.+)$/;
    my $ftmp = $1;
    for (@{$Exclude}) {
    if ($ftmp =~ /$_/) {
    # save in @Xfound if $RmvFrom is set (see below)
    print "\rExclude - $ftmp\n";
    push (@Xfound, $ftmp) if ($RmvFrom);
    $ftmp = undef;
    last;
    }
    }
    push (@From, [$ftmp, 0, '']) if defined ($ftmp);
    }
    }

    # Will return if "From" empty. Technically not an error.
    # If it is empty through exclusions (regexp filter), "From" could be rmdir'd below.
    if (@From == 0) { print "\rEmpty dir: $FromDir\n"; return 0; }
    print "OK\n";


    ###**************************************************************************
    ## Iterate until there are no more dups across directories after (D2) rename
    #
    do {
    @ndx = GetCommonElementsNxM(\@To,\@From, 1, 0);
    $dupsize = @ndx;
    $passes++;

    print "\nP A S S \# $passes\n---------------------\n";
    print "Found $dupsize dups:\n";

    #**********************************
    # Check for cross-directory dups
    # -------------
    if ($dupsize)
    {
    my $errflg = 0;
    print "\n<DIR> $FromDir\n\n";
    for (my $i = 0; $i < $dupsize; $i++)
    {
    my $Fromref = $From[$ndx[$i]];
    my $pname = $PrefixName;

    print "\t$Fromref->[0] ... $PrefixName";

    #*****************************************
    # Rename and check for in-directory dups
    # ----------
    while (1)
    {
    # Level check
    if ($Fromref->[1] >= $PrefixLevel)
    {
    print "$Fromref->[0] <<level $Fromref->[1] exceeded>>\n";
    # Clear the name in From so this file doesn't get moved
    $Fromref->[0] = "";
    $errflg = 1;
    last;
    }
    my $stmp = uc($pname.$Fromref->[0]);
    my $found = 0;
    for (@From) { if (uc($_->[0]) eq $stmp) { $found = 1; last; } }

    if (!$found) {
    # OK to rename! Dup not found in "From"
    # Change the name in "From" to this new one
    # so it gets moved later. Increment its level of prefix
    # ---------------------
    print "$Fromref->[0]";
    print "\n";
    if ($Fromref->[1] == 0) { $Fromref->[2] = $Fromref->[0]; }
    $Fromref->[0] = $pname.$Fromref->[0];
    $Fromref->[1] += 1;
    last;
    }
    else {
    # File exist with that name, add another level of prefix.
    # ---------------------
    print "$PrefixName";
    $pname = $pname.$PrefixName;
    if ($Fromref->[1] == 0) { $Fromref->[2] = $Fromref->[0]; }
    $Fromref->[1] += 1;
    }
    }
    }
    if ($errflg) {
    print "!!! Can't rename some files, raise the level or change the prefix.\n";
    #return 0;
    }
    }
    } while ($dupsize > 0);
    #
    ## End cross-dir iteration
    ###

    #*************************************************************
    # Check for rename of duplicates in D2 (only) without moving
    # ------------
    if ($Duponly)
    {
    my $rentried = 0;
    my $renok = 0;
    print "\nRename only ... ";
    for (@From) {
    my $res = 0;
    if ($_->[1] > 0 && $_->[1] < $PrefixLevel) {
    $rentried++;
    my $fr = File::Spec->canonpath( "$FromDir/$_->[2]" );
    my $fr2 = File::Spec->canonpath( "$FromDir/$_->[0]" );
    if (!($res = rename ("$fr", "$fr2"))) {
    print "Rename error:\t$_->[2] ... $_->[0]\n";
    } else {
    $renok++; }
    }
    }
    if ($rentried) { print "$renok out of $rentried files OK\n"; }
    else { print "No duplicates found\n"; }
    return 1;
    }

    #**********************
    # Move "From" into "To"
    # -------------------
    #print @From."@From\n";
    my $sep = File::Spec->canonpath( "/" );
    print "\nMoving:\t$FromDir$sep* to $ToDir$sep\n";
    my $movcnt = 0;

    for (@From) {
    my $res = 0;
    if ($_->[1] > 0 && $_->[1] < $PrefixLevel) {
    my $fr = File::Spec->canonpath( "$FromDir/$_->[2]" );
    my $to = File::Spec->canonpath( "$ToDir/$_->[0]" );
    $res = File::Copy::move ("$fr", "$to");
    if ($res == 0) { print "Move error:\t$_->[2] ... $to\n"; }
    }
    if ($_->[1] == 0 && $_->[0] gt '') {
    my $fr = File::Spec->canonpath( "$FromDir/$_->[0]" );
    my $to = File::Spec->canonpath( "$ToDir/$_->[0]" );
    $res = File::Copy::move ("$fr", "$to");
    if ($res == 0) { print "Move error:\t$fr\n" }
    }
    $movcnt++ if ($res);
    }
    my $notmoved = @From - $movcnt;
    print "Moved:\t$movcnt out of ".@From." files.\n";

    #***************************************
    # Check if "FromDir" is to be deleted
    # --------------
    if ($RmvFrom) {
    print "Remove:\t$FromDir ... ";
    if ($notmoved > 0) {
    print "not deleted, contains $notmoved file(s) that couldn't be moved!\n";
    }
    else {
    # check if @Xfound has values (and their not directories)
    # if so, delete these first before trying to remove the From directory
    # for now, don't want to "unlink" a directory, and we're not doing a tree here
    # ----------------
    for (@Xfound) {
    my $fr = File::Spec->canonpath( "$FromDir/$_" );
    unlink ("$fr") if (!-d $fr) ; }
    my $fr = File::Spec->canonpath( "$FromDir" );
    if (!rmdir "$fr") {print "$!\n" } else {print "OK\n"; }
    }
    }
    return 1;
    }


    #######################################################
    # Get Common Elements (from two N-dimensioned Array's)
    # IN - Refs to the NxN arrays to compare,
    # sort flag and the compare field.
    # OUT - Ndx's into Right_Array of matching elements
    # ---------------------------------------------------
    # Notes -
    # 1. Elements are assumed textual and case insensitive
    # 2. Ignores in-array duplicates
    # 3. Sort will be done if sort flag > 0
    #
    sub GetCommonElementsNxM($$$$)
    {
    my ($A_Left,$A_Right,$Srtflg,$Fld) = @_;
    $Srtflg = 0 unless defined $Srtflg;
    $Fld = 0 unless defined $Fld;
    # my @Dup = ();
    my @Ndx = ();

    if ($Srtflg > 0) {
    @{$A_Left} = sort {uc($a->[$Fld]) cmp uc($b->[$Fld])} @{$A_Left};
    @{$A_Right} = sort {uc($a->[$Fld]) cmp uc($b->[$Fld])} @{$A_Right};
    } else {print "==> Common Elements : Not sorting arrays\n";}

    my $rpos = 0;
    my $rend = @{$A_Right};
    my $cnt = 0;
    my $llast = undef;
    my $rlast = undef;
    foreach my $left_element (@{$A_Left})
    {
    next if (uc($left_element->[$Fld]) eq uc($llast->[$Fld]));

    $rpos += $cnt;
    $cnt = 0;
    foreach my $right_element (@{$A_Right}[$rpos..($rend-1)])
    {
    last if (uc($left_element->[$Fld]) lt uc($right_element->[$Fld]));
    $cnt++;
    next if (uc($right_element->[$Fld]) eq uc($rlast->[$Fld]));
    if (uc($left_element->[$Fld]) eq uc($right_element->[$Fld]))
    {
    # push (@Dup, $right_element->[$Fld]); # the string
    push (@Ndx, $rpos+$cnt-1); # the index into R_Array
    last;
    }
    $rlast = $right_element;
    }
    $llast = $left_element;
    last if ($rpos >= $rend);
    }
    # return (@Dup);
    return (@Ndx);
    }


    #######################################################
    # Get Common Elements from single Array's
    # IN - Refs to the Nx1 arrays to compare, sort flag
    # OUT - Ndx's into Right_Array of matching elements
    # ---------------------------------------------------
    # Notes -
    # 1. Elements are assumed textual and case insensitive
    # 2. Ignores in-array duplicates
    # 3. Sort will be done if sort flag > 0
    #######################################################
    sub GetCommonElements($$$)
    {
    my ($A_Left,$A_Right,$Srtflg) = @_;
    $Srtflg = 0 unless defined $Srtflg;
    # my @Dup = ();
    my @Ndx = ();

    if ($Srtflg > 0) {
    @{$A_Left} = sort {uc($a) cmp uc($b)} @{$A_Left};
    @{$A_Right} = sort {uc($a) cmp uc($b)} @{$A_Right};
    } else {print "==> Common Elements : Not sorting arrays\n";}

    my $rpos = 0;
    my $rend = @{$A_Right};
    my $cnt = 0;
    my $llast = '';
    my $rlast = '';
    foreach my $left_element (@{$A_Left})
    {
    next if (uc($left_element) eq uc($llast));

    $rpos += $cnt;
    $cnt = 0;
    foreach my $right_element (@{$A_Right}[$rpos..($rend-1)])
    {
    last if (uc($left_element) lt uc($right_element));
    $cnt++;
    next if (uc($right_element) eq uc($rlast));
    if (uc($left_element) eq uc($right_element))
    {
    # push (@Dup, $right_element); # the string
    push (@Ndx, $rpos+$cnt-1); # the index into R_Array
    last;
    }
    $rlast = $right_element;
    }
    $llast = $left_element;
    last if ($rpos >= $rend);
    }
    # return (@Dup);
    return (@Ndx);
    }


    1;
     
    , Jan 13, 2009
    #11
  12. Francois Massion

    Tim Greer Guest

    wrote:

    > On Mon, 12 Jan 2009 19:15:37 -0800, Tim Greer <>
    > wrote:
    >
    >> wrote:
    >>
    >>> On Mon, 12 Jan 2009 16:41:12 -0800, Tim Greer <>
    >>> wrote:
    >>>
    >>>> wrote:
    >>>>
    >>>>> On Mon, 12 Jan 2009 17:55:36 -0600, Tad J McClellan
    >>>>> <> wrote:
    >>>>>
    >>>>>>Francois Massion <> wrote:
    >>>>>>
    >>>>>>> I would like to use 2 arrays, says @array1 and
    >>>>>>> @array2 instead of files a.txt and b.txt.
    >>>>>>
    >>>>>>
    >>>>>>> while (<WORDLIST2>) {
    >>>>>>> $list2{$_}=1; #or any other value
    >>>>>>> }
    >>>>>>
    >>>>>>
    >>>>>> my @b_txt = <WORDLIST2>;
    >>>>> Isin't slurp a bit rich?
    >>>>> [snip]
    >>>>>
    >>>>> sln
    >>>>
    >>>>Probably, but that's what the OP asked for. I can't imagine why,
    >>>>but perhaps they just need to elaborate on their reasons to get a
    >>>>good
    >>>>answer. Personally, I didn't answer them, because I had to ask what
    >>>>they were wanting to do, since it didn't seem clear to me (or didn't
    >>>>seem to have a purpose and would just waste good processing).
    >>>
    >>> There's a thin line between yes and no, but if its not yes....
    >>>
    >>> sln

    >>
    >>Its (its) or it's (it is) not yes, and what isn't? I'd say it is yes
    >>(being it's "probably" going to be more resource intensive to grab all
    >>of the contents of WORLDLIST2 into an array), only to compare then,
    >>comparing it to a while loop to see if a hash value exists before
    >>adding it (if the OP wants to add it). I absolutely don't understand
    >>your cryptic response. I'm certain you were being sarcastic, but I
    >>don't know if you mean to have a question in there.

    >
    > This no joke! I don't write anything unles I need to!
    > I said before this is beginner code. No sarcasm inteneded.
    > I do not want to go through this. The prototypes are wrong along
    > with alot of other stuff. Amazingly, this heards all the dup porn out
    > there into safely stored data. Go figure... it works like a champ!!
    >
    > sln
    >
    > mrg_test1.pl

    <snip irrelevant code>

    What are you yammering on about? Why did you paste a long portion of
    irrelevant code to this OP's question? Regardless, that code could be
    summarized and be much shorter, but since it's not relevant to the OPs
    problem, there's no reason to discuss it. You were doing okay for a
    couple for days, and now you're posting crazy nonsense again. And...
    talk about bloated code, geez!
    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
     
    Tim Greer, Jan 13, 2009
    #12
  13. On 13 Jan., 01:41, Tim Greer <> wrote:
    > wrote:
    > > On Mon, 12 Jan 2009 17:55:36 -0600, Tad J McClellan
    > > <> wrote:

    >
    > >>Francois Massion <> wrote:

    >
    > >>> I would like to use 2 arrays, says @array1 and
    > >>> @array2 instead of files a.txt and b.txt.

    >
    > >>> while (<WORDLIST2>) {
    > >>>     $list2{$_}=1; #or any other value
    > >>> }

    >
    > >>    my @b_txt = <WORDLIST2>;

    > > Isin't slurp a bit rich?
    > > [snip]

    >
    > > sln

    >
    > Probably, but that's what the OP asked for.  I can't imagine why, but
    > perhaps they just need to elaborate on their reasons to get a good
    > answer.  Personally, I didn't answer them, because I had to ask what
    > they were wanting to do, since it didn't seem clear to me (or didn't
    > seem to have a purpose and would just waste good processing).
    > --
    > Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    > Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    > and Custom Hosting.  24/7 support, 30 day guarantee, secure servers.
    > Industry's most experienced staff! -- Web Hosting With Muscle!- ZitiertenText ausblenden -
    >
    > - Zitierten Text anzeigen -


    Thanks to all for your contributions. I'll have to try them out. The
    background for my asking (as a non-pro) is the following. I am doing
    terminology extraction for linguistic purposes. Thus I take a text,
    split it up in words or expressions and perform various "refining"
    operations in order to get only the clean interesting terms as they
    appear in a dictionary. Each operation is currently a small amateurish
    little script and the output is an array which I can display in a file
    or on screen.

    Now I want to automate all these single steps into one operation which
    means that instead of reading word lists from text files I would like
    to use the arrays generated by the previous step. This is the reason
    for the question above. I have 2 different files as the result of 2
    previous steps and the difference are the words which are interesting
    for my terminology work. I hope this helps.
     
    Francois Massion, Jan 13, 2009
    #13
  14. On 13 Jan., 19:09, Francois Massion <> wrote:
    > On 13 Jan., 01:41, Tim Greer <> wrote:
    >
    >
    >
    >
    >
    > > wrote:
    > > > On Mon, 12 Jan 2009 17:55:36 -0600, Tad J McClellan
    > > > <> wrote:

    >
    > > >>Francois Massion <> wrote:

    >
    > > >>> I would like to use 2 arrays, says @array1 and
    > > >>> @array2 instead of files a.txt and b.txt.

    >
    > > >>> while (<WORDLIST2>) {
    > > >>>     $list2{$_}=1; #or any other value
    > > >>> }

    >
    > > >>    my @b_txt = <WORDLIST2>;
    > > > Isin't slurp a bit rich?
    > > > [snip]

    >
    > > > sln

    >
    > > Probably, but that's what the OP asked for.  I can't imagine why, but
    > > perhaps they just need to elaborate on their reasons to get a good
    > > answer.  Personally, I didn't answer them, because I had to ask what
    > > they were wanting to do, since it didn't seem clear to me (or didn't
    > > seem to have a purpose and would just waste good processing).
    > > --
    > > Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    > > Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    > > and Custom Hosting.  24/7 support, 30 day guarantee, secure servers.
    > > Industry's most experienced staff! -- Web Hosting With Muscle!- Zitierten Text ausblenden -

    >
    > > - Zitierten Text anzeigen -

    >
    > Thanks to all for your contributions. I'll have to try them out. The
    > background for my asking (as a non-pro) is the following. I am doing
    > terminology extraction for linguistic purposes. Thus I take a text,
    > split it up in words or expressions and perform various "refining"
    > operations in order to get only the clean interesting terms as they
    > appear in a dictionary. Each operation is currently a small amateurish
    > little script and the output is anarraywhich I can display in a file
    > or on screen.
    >
    > Now I want to automate all these single steps into one operation which
    > means that instead of reading word lists from text files I would like
    > to use the arrays generated by the previous step. This is the reason
    > for the question above. I have 2 different files as the result of 2
    > previous steps and the difference are the words which are interesting
    > for my terminology work. I hope this helps.- Zitierten Text ausblenden -
    >
    > - Zitierten Text anzeigen -


    Thanks to all. Basically the solution to my problem seems to be:

    my %in_array2 = map { $_ => 1 } @array2;
    my @array3 = grep { !$in_array2{$_} } @array1;

    It works as expected. I just need to solve 2 problems not directly
    related to my question:
    i) When I output @array3 I get a space before each entry
    ii) no matter how I encode the files (ascii, unicode, "use utf8" etc)
    special characters (like the German Umlaut) are corrupt and the output
    file is in UTF8).

    I'll find a solution but maybe someone knows already the answer...
     
    Francois Massion, Jan 15, 2009
    #14
  15. Francois Massion <> wrote:


    > i) When I output @array3 I get a space before each entry



    If you are having trouble with output, it is easier to help if
    we can actually see the statement that generates the output...

    Are you maybe getting a space before each entry *except the first one*?

    If so, then remove the double quotes.


    perldoc -q space

    Why do I get weird spaces when I print an array of lines?


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Jan 15, 2009
    #15
  16. Francois Massion

    Guest

    On Thu, 15 Jan 2009 01:03:49 -0800 (PST), Francois Massion <> wrote:

    >On 13 Jan., 19:09, Francois Massion <> wrote:
    >> On 13 Jan., 01:41, Tim Greer <> wrote:
    >>
    >>
    >>
    >>
    >>
    >> > wrote:
    >> > > On Mon, 12 Jan 2009 17:55:36 -0600, Tad J McClellan
    >> > > <> wrote:

    >>
    >> > >>Francois Massion <> wrote:

    >>
    >> > >>> I would like to use 2 arrays, says @array1 and
    >> > >>> @array2 instead of files a.txt and b.txt.

    >>
    >> > >>> while (<WORDLIST2>) {
    >> > >>>     $list2{$_}=1; #or any other value
    >> > >>> }

    >>
    >> > >>    my @b_txt = <WORDLIST2>;
    >> > > Isin't slurp a bit rich?
    >> > > [snip]

    >>
    >> > > sln

    >>
    >> > Probably, but that's what the OP asked for.  I can't imagine why, but
    >> > perhaps they just need to elaborate on their reasons to get a good
    >> > answer.  Personally, I didn't answer them, because I had to ask what
    >> > they were wanting to do, since it didn't seem clear to me (or didn't
    >> > seem to have a purpose and would just waste good processing).
    >> > --
    >> > Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    >> > Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    >> > and Custom Hosting.  24/7 support, 30 day guarantee, secure servers.
    >> > Industry's most experienced staff! -- Web Hosting With Muscle!- Zitierten Text ausblenden -

    >>
    >> > - Zitierten Text anzeigen -

    >>
    >> Thanks to all for your contributions. I'll have to try them out. The
    >> background for my asking (as a non-pro) is the following. I am doing
    >> terminology extraction for linguistic purposes. Thus I take a text,
    >> split it up in words or expressions and perform various "refining"
    >> operations in order to get only the clean interesting terms as they
    >> appear in a dictionary. Each operation is currently a small amateurish
    >> little script and the output is anarraywhich I can display in a file
    >> or on screen.
    >>
    >> Now I want to automate all these single steps into one operation which
    >> means that instead of reading word lists from text files I would like
    >> to use the arrays generated by the previous step. This is the reason
    >> for the question above. I have 2 different files as the result of 2
    >> previous steps and the difference are the words which are interesting
    >> for my terminology work. I hope this helps.- Zitierten Text ausblenden -
    >>
    >> - Zitierten Text anzeigen -

    >
    >Thanks to all. Basically the solution to my problem seems to be:
    >
    >my %in_array2 = map { $_ => 1 } @array2;
    >my @array3 = grep { !$in_array2{$_} } @array1;

    [snip]

    There are many solutions. The one you picked will take the longest.
    I just don't like this. I don't line grep to begin with, there is not a functional
    way to terminate it, let alone the hash lookup and map overhead.

    If speed isin't a concern, then any solution will do.

    sln
     
    , Jan 15, 2009
    #16
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Ben Holness

    Hashes of Hashes via subs

    Ben Holness, Oct 5, 2003, in forum: Perl
    Replies:
    8
    Views:
    591
    Ben Holness
    Oct 8, 2003
  2. Jeff
    Replies:
    5
    Views:
    110
  3. Edward Wijaya

    Joining 2 arrays into hashes

    Edward Wijaya, Jun 1, 2004, in forum: Perl Misc
    Replies:
    20
    Views:
    251
    Ben Morrow
    Jun 4, 2004
  4. Tim O'Donovan

    Hash of hashes, of hashes, of arrays of hashes

    Tim O'Donovan, Oct 27, 2005, in forum: Perl Misc
    Replies:
    5
    Views:
    235
  5. Keith Lee
    Replies:
    8
    Views:
    126
    robic0
    Feb 22, 2006
Loading...

Share This Page