Dirty Arrays and how to clean them up!

Discussion in 'Perl' started by spamm@realconsultants.com, Mar 1, 2005.

  1. Guest

    I have this array with duplicate entries. Hundreds, in fact.

    For example: Array = 17177 9661 9661 9535 9533 9533 9533 9533 9533 9533
    9533 9533 9533 9533 9533 9533 9533 9533 9532 9532 9532 9532 9531 9096
    9096 9096 9095 9095 9095 8345 8345 8344 8344 8226 8226 8225 8225 8198
    8198 8198 8198 8198 8198 8198 8198 8198 8198 8198 8198 8198 8198 8198
    8198 8198 8198 8198 8198 8198 8198 8198 8198 8198..............

    This is what I am doing. I am reading file1, which contains these
    entries multiple times, and comparing it against file2. If I do not
    find a match, I write the record to file3 and then to an array. When I
    read file1 again and compare it against file2, I check whether the
    record has already been written by rescanning the array and looking
    for it. If the same entry is already in the array, I read the next
    record and do not write to file3. If not, I write the record to file3
    and to the array.

    I want either to write each record to the array only once, or to
    remove the duplicates before I check.

    Here is some pseudocode:


    while file1
        while file2
            if record1 != record2
                if record2 is not in array
                    write to file3
                    write new record to array
                    last
                else
                    next
            else
                next



    Here is another example. It is a poor one, but you should get the
    idea. At least I hope so!

    @SORTED = ();
    @CLEANED = ();
    $X = 0;
    $R = 0;
    while ($R < 4)
    {
        while ($X <= 100)
        {
            $X++;
            # print "I am in X\n";
            # # I really do not want to write to the array if
            # # it already exists in the array.
            # # If the array already has X then do not write to the
            # # array.
            push @CHECKED, $X;
            @SORTED = sort {$a <=> $b} @CHECKED;
        }
        # print "*********I am in R\n";
        $R++;
        $X = 0;
        next;
    }

    print "Array = @SORTED\n";

    foreach $Row (@SORTED)
    {
        $Hold = $Row;
        if ($Hold == $Row)
        {
            @CLEANED = shift @SORTED;
        }
        else
        {
            next;
        }
        # # How do I remove duplicates from the array?
        # # I know this is wrong but here is my dilemma!
    }

    print "Array = @CLEANED\n";
     
    , Mar 1, 2005
    #1

  2. Bob Guest

    I have been working on the solution myself, but I am still having
    problems. Here is where I am so far!

    Still lost!

    @SORTED = ();
    @CLEANED = ();
    $X = 0;
    $R = 0;
    while ($R < 4)
    {
        while ($X <= 100)
        {
            $X++;
            # print "I am in X\n";
            # # I really do not want to write to the array if
            # # it already exists in the array.
            # # If the array already has X then do not write to the
            # # array.
            push @CHECKED, $X;
            @SORTED = sort {$a <=> $b} @CHECKED;
        }
        # print "*********I am in R\n";
        $R++;
        $X = 0;
        next;
    }

    print "Array = @SORTED\n";

    $Element2 = 0;

    for ($Element1 = 0; $Element1 < @SORTED; $Element++)
    {
        $Element2 = ($Element1 + 1);
        if ($SORTED[$Element1] == $SORTED[$Element2])
        {
            print "$SORTED[$Element1] == $SORTED[$Element2]\n";
            delete $SORTED[$Element2];
            $Element1--;
            next;
        }
        else
        {
            next;
        }
    }
    # # How do I remove duplicates from the array?
    # # I know this is wrong but here is my dilemma!

    print "Array = @CLEANED\n";
     
    Bob, Mar 2, 2005
    #2

  3. Bob Guest

    Sorry, I cut and pasted only half of the code!

    @SORTED = ();
    @CLEANED = ();
    $X = 0;
    $R = 0;
    while ($R < 4)
    {
        while ($X <= 100)
        {
            $X++;
            # print "I am in X\n";
            # # I really do not want to write to the array if
            # # it already exists in the array.
            # # If the array already has X then do not write to the
            # # array.
            push @CHECKED, $X;
            @SORTED = sort {$a <=> $b} @CHECKED;
        }
        # print "*********I am in R\n";
        $R++;
        $X = 0;
        next;
    }

    print "Array = @SORTED\n";

    $Element2 = 0;

    for ($Element1 = 0; $Element1 < @SORTED; $Element++)
    {
        $Element2 = ($Element1 + 1);
        if ($SORTED[$Element1] == $SORTED[$Element2])
        {
            print "$SORTED[$Element1] == $SORTED[$Element2]\n";
            delete $SORTED[$Element2];
            $Element1--;
            next;
        }
        else
        {
            next;
        }
    }
    # # How do I remove duplicates from the array?
    # # I know this is wrong but here is my dilemma!

    print "Array = @CLEANED\n";
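
A note on the loop above: it increments `$Element` rather than `$Element1`, `delete` on an array element leaves an undefined slot instead of shrinking the array, and `@CLEANED` is never filled before it is printed. One way to make the compare-with-neighbour idea work on a sorted list is to copy each value into a new array only when it differs from the last value kept. This is a minimal sketch, not the poster's code; `@sorted` and `@cleaned` are illustrative names, and the test data is assumed to be 1..101 repeated four times, which is what the nested `while` loops generate.

```perl
use strict;
use warnings;

# Stand-in for the sorted test data: 1..101 repeated four times.
my @sorted = sort { $a <=> $b } ((1 .. 101) x 4);

my @cleaned;
for my $value (@sorted) {
    # In a sorted list duplicates sit next to each other, so it is
    # enough to compare against the last value that was kept.
    next if @cleaned && $cleaned[-1] == $value;
    push @cleaned, $value;
}

print scalar(@cleaned), " unique values\n";    # prints "101 unique values"
```

This only works because the input is sorted first; for unsorted data a hash lookup is the usual choice.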
     
    Bob, Mar 2, 2005
    #3
  4. ray d Guest

    Bob,

    The easiest way to do this is to use all the elements of your dirty
    array as keys to a hash. There are shorter forms of this solution, but
    I'll leave it verbose for clarity.

    Hope this helps.

    -r

    ###############################################################
    # see
    # http://perlmonks.com/index.pl?node=609
    # for reference
    ###############################################################

    use strict;

    my @dirty = ( 1, 2, 3, 2, 3, 4, 9, 8, 7, 6, 5, 4, 3, 2, 1);


    #
    # the hash we will use to track unique elements in @dirty
    #
    my %occurred;
    foreach my $dirtyElement ( @dirty )
    {
        $occurred{$dirtyElement} = 1;
    }


    #
    # we can now retrieve the unique elements from %occurred
    # (keys returns them in no particular order)
    #
    my @unique = keys %occurred;

    print "@unique\n";

    ###############################################################
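
One of the shorter forms mentioned above might look like the following. It is only a sketch: the hash slice and the names `@ascending` and `@descending` are illustrative additions, not part of the original answer. Because `keys` returns the keys in no particular order, the sketch sorts them explicitly; the descending sort matches the order of the sample data in the first post.

```perl
use strict;
use warnings;

my @dirty = (1, 2, 3, 2, 3, 4, 9, 8, 7, 6, 5, 4, 3, 2, 1);

# Same idea as the loop, written as a hash slice: every element
# of @dirty becomes a key of %occurred.
my %occurred;
@occurred{@dirty} = ();

# keys() makes no promise about order, so sort explicitly.
my @ascending  = sort { $a <=> $b } keys %occurred;
my @descending = sort { $b <=> $a } keys %occurred;

print "@ascending\n";     # prints "1 2 3 4 5 6 7 8 9"
print "@descending\n";    # prints "9 8 7 6 5 4 3 2 1"
```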
     
    ray d, Mar 9, 2005
    #4
  5. Guest

    Here is my attempt:

    my %temp = ();
    print grep !$temp{$_}++, ((1..10) x 5);

    ...which is essentially the same concept as you described, just a
    different approach.
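
The `!$temp{$_}++` test is true only the first time a value is seen, because the count is read before it is incremented. Applied to the file problem from the first post, the same idea means the array never has to be rescanned. The sketch below is an assumption about how that could look, not code from the thread: `unmatched_once` is a hypothetical helper, and the records are plain in-memory lists standing in for the lines of file1 and file2.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the records from the first list that do
# not appear in the second list, each one only once, in first-seen order.
sub unmatched_once {
    my ($file1_records, $file2_records) = @_;

    # Lookup of everything in file2, so each check is one hash probe.
    my %in_file2 = map { $_ => 1 } @$file2_records;

    my %seen;    # records already emitted
    return grep { !$in_file2{$_} && !$seen{$_}++ } @$file1_records;
}

my @file1 = (17177, 9661, 9661, 9535, 9533, 9533, 9532, 9532);
my @file2 = (9661, 9532);

my @file3 = unmatched_once(\@file1, \@file2);
print "@file3\n";    # prints "17177 9535 9533"
```

With real files, the two lists would be filled by reading each file line by line and calling `chomp` on every record, and `@file3` would be printed to the output filehandle.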
     
    , Mar 12, 2005
    #5
