differences between hashes and arrays ?

Discussion in 'Perl Misc' started by Jack, Sep 22, 2006.

  1. Jack

    Jack Guest

    Hi folks I am not an expert in perl but correct me if I am wrong -

    is it true you can use a hash and work with it just as you would an
    array - what are the differences between them (besides in an array you
    can have a multidimensional array) ?

    why arent folks using hashes instead of arrays since (I believe) they
    are faster to access and take up the same or less memory than arrays..

    Thanks,

    Jack
     
    Jack, Sep 22, 2006
    #1
    1. Advertising

  2. Jack

    Mirco Wahab Guest

    Thus spoke Jack (on 2006-09-22 16:55):

    > is it true you can use a hash and work with it just as you would an
    > array - what are the differences between them (besides in an array you
    > can have a multidimensional array) ?


    There is a similar interface, but thats almost all.

    ==> http://en.wikipedia.org/wiki/Hash_table

    In short, a 'hash table' is a complicated organism
    which allows you to find an item (say: key, stored
    in there) in very very short time, which is the point.

    An "array" means otherwise "access by an index number",
    so array elements must be searched in order to
    find a particular item.

    compare:

    my $item = $hash{ 'ITEM_OF_INTEREST' }; # gets the item

    vs.

    SEARCH: foreach $element (@array) {
    if($element eq 'ITEM_OF_INTEREST') {
    $item = $element; # gets the item
    last SEARCH;
    }
    }

    > why arent folks using hashes instead of arrays since (I believe) they
    > are faster to access and take up the same or less memory than arrays..


    No, folks using hashes if hash tables are appropriate
    and arrays if arrays are ...

    Say, you have ten million 3D coordinate points
    which are in some order already (an order that
    means something to you). Why would you bother to
    put them in a hash table, if all you need is sub-
    sequent access to the 13244'st or 10'th to 1923331'th?

    Regards

    Mirco
     
    Mirco Wahab, Sep 22, 2006
    #2
    1. Advertising

  3. Jack <> wrote:

    > is it true you can use a hash and work with it just as you would an
    > array -



    No.


    > what are the differences between them



    Arrays are ordered and indexed with numbers.

    Hashes are UNordered and indexed with strings.


    > (besides in an array you
    > can have a multidimensional array) ?



    You can also have a multidimensional hash, so that is not
    one of the differences.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Sep 22, 2006
    #3
  4. Jack

    Paul Lalli Guest

    Jack wrote:
    > Hi folks I am not an expert in perl but correct me if I am wrong -


    You're wrong. :)

    > is it true you can use a hash and work with it just as you would an
    > array


    No.

    > - what are the differences between them


    A hash is by definition unordered. There is no ordering of any kind to
    the key/value pairs in a hash. An array is ordered - there is a very
    well defined sequence of which item follows which item. Therefore, you
    can treat an array like a stack or a queue (using the push/pop
    shift/unshift set of functions), while this is not possible with a
    hash. With a hash you can associate to distinct pieces of data in one
    structure, and can retrieve a specific value in very efficient time.

    An array is indexed by positive integers. A hash is keyed by any
    string, whether its contents are numeric or not.

    > (besides in an array you
    > can have a multidimensional array) ?


    Ironically, that is *not* one of the differences. You can have
    multidimensional hashes as well. In fact, you can have hashes of
    arrays, arrays of hashes, arrays of hashes of hashes, hashes of arrays
    of hashes, or any other combination you'd like.

    > why arent folks using hashes instead of arrays since (I believe) they
    > are faster to access and take up the same or less memory than arrays..


    Faster to access, yes. Same or less memory, no.

    These are not apples to apples. You use a hash and an array for
    completely different purposes.

    If you have an example of a program or algorithm you're working on, we
    could help you get rid of some of your misperceptions. . . .

    Paul Lalli
     
    Paul Lalli, Sep 22, 2006
    #4
  5. Jack

    Guest

    "Jack" <> wrote:
    > Hi folks I am not an expert in perl but correct me if I am wrong -
    >
    > is it true you can use a hash and work with it just as you would an
    > array


    That would depend on what you are doing. For some things, you could easily
    use them interchangably. For most things, you can't. If youre indices are
    not integers, you can't use arrays. When your indices are integers but are
    not densely packed but need to accessed as though they were, you cannot
    readily use hashes.


    > - what are the differences between them (besides in an array you
    > can have a multidimensional array) ?


    That is not a difference between them (in Perl).

    > why arent folks using hashes instead of arrays since (I believe) they
    > are faster to access and take up the same or less memory than arrays..


    Wrong on both counts. When arrays and hashes are interchangable, hashes
    are much slower and take substantially more memory.


    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Sep 22, 2006
    #5
  6. Jack

    Jack Guest

    Michele Dondi wrote:
    > On 22 Sep 2006 07:55:56 -0700, "Jack" <> wrote:
    >
    > >Hi folks I am not an expert in perl but correct me if I am wrong -

    >
    > Seeing your question and your considerations below, indeed you seem
    > not to be an expert in perl. So I guess you're not wrong.
    >
    > >is it true you can use a hash and work with it just as you would an
    > >array - what are the differences between them (besides in an array you
    > >can have a multidimensional array) ?

    >
    > You can also have multidimensional hashes. But both are not "real"
    > multidimentional beasts. However for the moment don't bother...
    >
    > Really hashes are associative arrays, i.e. they are a finite mapping
    > between a set of _keys_ and a set _values_. The keys are generic
    > strings. An array is a mapping between a finite set of natural numbers
    > of the form {0,... , n} (if you don't set $[ to something other than
    > 0, but again, don't bother!) and a set of values.
    >
    > >why arent folks using hashes instead of arrays since (I believe) they
    > >are faster to access and take up the same or less memory than arrays..

    >
    > You will certainly know that in perl conversions between numbers and
    > strings happen automagically all the time. So indeed you may use a
    > hash to implement an array. However hashes are called like that
    > because their implementation is based on hash tables that provide
    > means to quickly search through the keys. There's no particular
    > general advantage when the keys are simple numbers instead.
    >
    > So use an array when you need a mapping from a finite set of numbers
    > to some values, except possibly when the mapping is very sparse. How
    > much is "very", may depend on many factors, and I don't know the lower
    > limit exactly. Surely I would use a hash if I need to *explicitly* set
    > the mapping amongst say 100 numbers uniformely distribuited in
    > 0..100_000_000 and some values. I don't have the vaguest idea how well
    > the hashing algorithm would hash such keys, but the thing can be
    > investigated...
    >
    >
    > Michele
    > --
    > {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    > (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    > .'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    > 256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,


    Thanks for the comments all. For these applications, which is better
    suited, array or hash, and assume a very large volume of data and
    consider memory :
    1- iterate through a huge list (hash or array?) of values and compare
    each value in the list, 1 by 1, to a match condition test like
    m/^\d.*?\d$/ (all digits), and then keep the count of the values that
    pass the match test
    2- compare all values in column 1, file 1 (one value at a time ; assume
    10M records ) to determine if the value exists in the target list (hash
    or array?) from column 2 file 2 (assume column 2 is 20M records)

    Thank you,
    Jack
     
    Jack, Sep 22, 2006
    #6
  7. Jack

    Paul Lalli Guest

    Jack wrote:
    > Thanks for the comments all. For these applications, which is better
    > suited, array or hash, and assume a very large volume of data and
    > consider memory :
    > 1- iterate through a huge list (hash or array?) of values and compare
    > each value in the list, 1 by 1, to a match condition test like
    > m/^\d.*?\d$/ (all digits), and then keep the count of the values that
    > pass the match test


    array. You just want to iterate over all of them, in sequence. No
    reason for the overhead of a hash.

    my $count = grep /^\d.*?\d$/ @great_big_array;

    (by the way, you do know that your pattern does not match "all digits",
    right? It matches anything that starts with and ends with a digit, of
    at least length two. The "inside" of the value can be literally
    anything.)

    > 2- compare all values in column 1, file 1 (one value at a time ; assume
    > 10M records ) to determine if the value exists in the target list (hash
    > or array?) from column 2 file 2 (assume column 2 is 20M records)


    column1, file1 - array. You want to iterate over all of them, one at a
    time, in sequence.

    column2, file2 - hash. You want to access a specific element of your
    set, without knowing or caring where in the set that element falls.

    for my $value (@column1){
    if (exists $column2{$value}) {
    print "$value exists in column2 file2\n";
    } else {
    print "$vlaue does not exist in column2 file2\n";
    }
    }

    Paul Lalli
     
    Paul Lalli, Sep 22, 2006
    #7
  8. wrote:
    > "Jack" <> wrote:
    > > why arent folks using hashes instead of arrays since (I believe) they
    > > are faster to access and take up the same or less memory than arrays..

    >
    > Wrong on both counts. When arrays and hashes are interchangable, hashes
    > are much slower and take substantially more memory.


    Except in the rare and special case of a very sparsely populated array.
     
    Brian McCauley, Sep 22, 2006
    #8
  9. Jack

    Henry Law Guest

    Paul Lalli wrote:
    > Jack wrote:
    >> Hi folks I am not an expert in perl but correct me if I am wrong -

    >
    > You're wrong. :)
    >
    >> is it true you can use a hash and work with it just as you would an
    >> array

    >
    > No.


    Of course all the material about the difference in the ways that hashes
    and arrays is perfectly correct, but there is one sense in which the OP
    is correct - or at least has some method in his madness. Look:

    # -------------------------------------------
    #! /usr/bin/perl

    use strict; use warnings;

    use Data::Dumper;

    my @array = ("one",111,"two",22);
    # Or we could even have coded my %hash = ("one",111,"two",22);
    my %hash = @array;

    print Dumper(\%hash);
    # --------------------------------------------
    F:\>tryit.pl
    $VAR1 = {
    'one' => 111,
    'two' => 22
    };


    --

    Henry Law <>< Manchester, England
     
    Henry Law, Sep 22, 2006
    #9
  10. Jack

    Paul Lalli Guest

    Henry Law wrote:
    > Paul Lalli wrote:
    > > Jack wrote:
    > >> Hi folks I am not an expert in perl but correct me if I am wrong -

    > >
    > > You're wrong. :)
    > >
    > >> is it true you can use a hash and work with it just as you would an
    > >> array

    > >
    > > No.

    >
    > Of course all the material about the difference in the ways that hashes
    > and arrays is perfectly correct, but there is one sense in which the OP
    > is correct - or at least has some method in his madness. Look:
    >
    > # -------------------------------------------
    > #! /usr/bin/perl
    >
    > use strict; use warnings;
    >
    > use Data::Dumper;
    >
    > my @array = ("one",111,"two",22);
    > # Or we could even have coded my %hash = ("one",111,"two",22);
    > my %hash = @array;


    None of this shows "using a hash and array in the same way". This is
    populating an array with a list of values, and then populating a hash
    with the list of values that are currently contained in an array.
    %hash and @array have nothing to do with each other before or after
    this statement.

    Make sure you read and understand
    perldoc -q difference
    "What is the difference between a list and an array"

    Paul Lalli
     
    Paul Lalli, Sep 22, 2006
    #10
  11. Henry Law <> wrote:
    > Paul Lalli wrote:
    >> Jack wrote:
    >>> Hi folks I am not an expert in perl but correct me if I am wrong -

    >>
    >> You're wrong. :)
    >>
    >>> is it true you can use a hash and work with it just as you would an
    >>> array

    >>
    >> No.

    >
    > Of course all the material about the difference in the ways that hashes
    > and arrays is perfectly correct, but there is one sense in which the OP
    > is correct - or at least has some method in his madness. Look:
    >
    > # -------------------------------------------
    > #! /usr/bin/perl
    >
    > use strict; use warnings;
    >
    > use Data::Dumper;
    >
    > my @array = ("one",111,"two",22);



    Change that line to:

    my @array = ("two",22,"one",111);

    and run the program again.


    > # Or we could even have coded my %hash = ("one",111,"two",22);
    > my %hash = @array;
    >
    > print Dumper(\%hash);
    > # --------------------------------------------
    > F:\>tryit.pl
    > $VAR1 = {
    > 'one' => 111,
    > 'two' => 22
    > };



    You get the _same_ output!


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Sep 22, 2006
    #11
  12. Jack

    Uri Guttman Guest

    >>>>> "PL" == Paul Lalli <> writes:


    >> why arent folks using hashes instead of arrays since (I believe) they
    >> are faster to access and take up the same or less memory than arrays..


    PL> Faster to access, yes. Same or less memory, no.

    hmm, in general arrays have faster lookups than hashes. me thinks you
    confused the OP's statement. and arrays definitely use less storage
    (even excluding the key itself) for each element.

    PL> These are not apples to apples. You use a hash and an array for
    PL> completely different purposes.

    they are apples and bananas. :)

    PL> If you have an example of a program or algorithm you're working on, we
    PL> could help you get rid of some of your misperceptions. . . .

    or he should learn more perl before asking such deep questions. :)

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Sep 22, 2006
    #12
  13. Jack

    Paul Lalli Guest

    Uri Guttman wrote:
    > >>>>> "PL" == Paul Lalli <> writes:

    >
    >
    > >> why arent folks using hashes instead of arrays since (I believe) they
    > >> are faster to access and take up the same or less memory than arrays..

    >
    > PL> Faster to access, yes. Same or less memory, no.
    >
    > hmm, in general arrays have faster lookups than hashes. me thinks you
    > confused the OP's statement.


    Or I confused my answer. :) What I meant was that:

    my $there = $hash{$key} ? 1 : 0;
    is faster than:
    my $there = 0;
    for my $elem (@array) {
    if ($elem eq $value) {
    $there = 1;
    next;
    }
    }

    Paul Lalli
     
    Paul Lalli, Sep 22, 2006
    #13
  14. Jack wrote:
    > is it true you can use a hash and work with it just as you would an
    > array - what are the differences between them (besides in an array you
    > can have a multidimensional array) ?


    Both are mappings. Arrays map from a contiguous list of natural numbers,
    starting with 0 (assuming you didn't fool around with $[) into scalars.
    Hashes are a generalization and map arbitrary strings to scalars.

    > why arent folks using hashes instead of arrays since (I believe) they
    > are faster to access and take up the same or less memory than arrays..


    Very often you need organize your data in a sequence. This sequencing is
    naturally achieved by the natural order of number when they are used as
    domain of your mapping. For hashes it is possible to sort the keys of a
    hash and then access the values in sequence, but this operation is much more
    expensive.

    jue
     
    Jürgen Exner, Sep 23, 2006
    #14
  15. Mirco Wahab wrote:
    > An "array" means otherwise "access by an index number",
    > so array elements must be searched in order to
    > find a particular item.


    Not really. To access the 23rd element of an array in a typical
    implementation you would simply access
    StartOfArray + 23 * SizeOfArrayElement
    which is just adding a constant that is independant of the size of the
    array. No need to search (which would be O(n log n)).

    jue
     
    Jürgen Exner, Sep 23, 2006
    #15
  16. Jack

    Uri Guttman Guest

    >>>>> "PL" == Paul Lalli <> writes:

    PL> Uri Guttman wrote:
    >> >>>>> "PL" == Paul Lalli <> writes:

    >>
    >>
    >> >> why arent folks using hashes instead of arrays since (I believe) they
    >> >> are faster to access and take up the same or less memory than arrays..

    >>

    PL> Faster to access, yes. Same or less memory, no.
    >>
    >> hmm, in general arrays have faster lookups than hashes. me thinks you
    >> confused the OP's statement.


    PL> Or I confused my answer. :) What I meant was that:

    PL> my $there = $hash{$key} ? 1 : 0;
    PL> is faster than:
    PL> my $there = 0;
    PL> for my $elem (@array) {
    PL> if ($elem eq $value) {
    PL> $there = 1;
    PL> next;
    PL> }
    PL> }

    well then, why didn't you say that!! string key lookup is a very diff
    beast than element lookup. :)

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Sep 23, 2006
    #16
  17. Jack

    Uri Guttman Guest

    >>>>> "CM" == Chris Mattern <> writes:

    CM> In article <>, Uri Guttman wrote:
    >>
    >> hmm, in general arrays have faster lookups than hashes. me thinks you
    >> confused the OP's statement. and arrays definitely use less storage
    >> (even excluding the key itself) for each element.
    >>


    CM> That's true in most languages because in most languages arrays
    CM> are required to have all elements be the same type, making
    CM> looking up an element a simple matter of pointer arithmetic
    CM> (you see this in its most naked form in a higher-level language
    CM> in C, where arrays and pointers are the same thing, and array
    CM> lookups are defined in the language as pointer arithmetic.
    CM> This results in a lot of C's less cuddly quirks, but that's
    CM> another subject...). But in Perl, each element can be
    CM> entirely different and each element has its own size. There
    CM> are several different ways Perl can handle this problem; I
    CM> don't know Perl internals so I don't know how it does it.
    CM> But however it's done, it's going to take more cycles than
    CM> a simple homogenous array (and probably more space, too).

    perl's arrays ARE homogeneous but at a different level than scalar
    values. the SV's of an array are in a real c array and are indexed just
    like you say. otherwise how would perl lookup an array element? do you
    think it scans the array in some linked list format for the nth element?

    but my quoted statement is still true - perl arrays are faster and use
    less ram than hashes. this is easily tested with benchmark.pm and
    the devel::size (iirc the name correctly) module that shows ram
    usage. so you don't need to know perl internals to learn about array vs
    hash resource usage.

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Sep 23, 2006
    #17
  18. On 2006-09-23 00:39, Chris Mattern <> wrote:
    > In article <>, Uri Guttman wrote:
    >>hmm, in general arrays have faster lookups than hashes. me thinks you
    >>confused the OP's statement. and arrays definitely use less storage
    >>(even excluding the key itself) for each element.

    >
    > That's true in most languages because in most languages arrays
    > are required to have all elements be the same type, making
    > looking up an element a simple matter of pointer arithmetic
    > (you see this in its most naked form in a higher-level language
    > in C, where arrays and pointers are the same thing,


    It seems people are just as confused about the difference between arrays
    and pointers in C as they are about the difference between arrays and
    lists in Perl :)

    Arrays and pointers in C are definitely not the same thing, An array is
    a group of elements of the same type, while a pointer, well, points to
    an element of a certain type. You notice the difference when you try to
    assign to an array (doesn't work) or when you use sizeof on a pointer
    (probably not what you wanted), and very spectacularly if you define a
    variable as "int a[10]" in one source file and declare it as "extern int
    *a" in another. However:

    > and array lookups are defined in the language as pointer arithmetic.


    This is true. (Actually it's the other way round: Pointer arithmetic is
    defined in terms as accesses to array elements)

    When you use an array as an rvalue in an expression, it is converted to
    a pointer to its first element. Thus you can use an array as an rvalue
    whereever you can use a pointer.

    (There is also the syntactic quirk that you cannot use arrays as
    parameters, but it's not an error to try - the compiler will just
    silently convert the array declaration to a pointer declaration)

    hp


    --
    _ | Peter J. Holzer | > Wieso sollte man etwas erfinden was nicht
    |_|_) | Sysadmin WSR | > ist?
    | | | | Was sonst wäre der Sinn des Erfindens?
    __/ | http://www.hjp.at/ | -- P. Einstein u. V. Gringmuth in desd
     
    Peter J. Holzer, Sep 23, 2006
    #18
  19. Jack

    Dr.Ruud Guest

    Chris Mattern schreef:
    > Jack:


    >> why arent folks using hashes instead of arrays since (I believe) they
    >> are faster to access and take up the same or less memory than
    >> arrays..

    >
    > Um, no, arrays are much more efficient than hashes.


    No, they are and they aren't. It all depends on what you use them for.

    <example type="bad">

    my %hary = (
    0 => q{foo},
    1 => q{bar},
    ) ;

    my @ash = (
    [q{dgd vggw ggg tgregdtg trgregfdsggt gtdeg trg}, q{foo}] ,
    [q{wgdeg ergrewg rewgrew grewg ewg ewgew gewrgewg}, q{bar}] ,
    ) ;

    </example>

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Sep 23, 2006
    #19
  20. Jürgen Exner wrote:
    > Mirco Wahab wrote:
    >
    >>An "array" means otherwise "access by an index number",
    >>so array elements must be searched in order to
    >>find a particular item.

    >
    >
    > Not really. To access the 23rd element of an array in a typical
    > implementation you would simply access
    > StartOfArray + 23 * SizeOfArrayElement
    > which is just adding a constant that is independant of the size of the
    > array. No need to search (which would be O(n log n)).


    How then do you know it's item #23?

    --
    Josef Möllers (Pinguinpfleger bei FSC)
    If failure had no penalty success would not be a prize
    -- T. Pratchett
     
    Josef Moellers, Sep 25, 2006
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Paul Morrison
    Replies:
    1
    Views:
    1,754
    Joseph Dionne
    May 1, 2005
  2. Paul Morrison
    Replies:
    15
    Views:
    572
    Tim Rentsch
    May 6, 2005
  3. Nanime Puloski
    Replies:
    1
    Views:
    247
    Nobody
    Jul 29, 2009
  4. Rick Tan
    Replies:
    1
    Views:
    85
  5. Tim O'Donovan

    Hash of hashes, of hashes, of arrays of hashes

    Tim O'Donovan, Oct 27, 2005, in forum: Perl Misc
    Replies:
    5
    Views:
    230
Loading...

Share This Page