Hashes of hashes or just one hash?

Discussion in 'Perl Misc' started by Perl Learner, Jun 8, 2005.

  1. Perl Learner

    Perl Learner Guest

    I am storing all the data from a HUGE file into 1 hash with long key
    names.
    for ex. my key would be something like

    NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE

    I can also make 7 hashes, one inside the other:
    HASH{NAMES}{PROPERTIES}{RELATIONS}{SETS}{CHARACTERS}{CONDITIONS}{TABLES}

    which one would be more efficient ?

    I was using 1 big hash instead of cascaded hashes as it would be a lot
    simpler. And also, sometimes, my data would stop at some random
    point..
    for example, for some NAMEs, I might only have

    NAMEK__PROPERTYA__TABLE or even NAME__PROPERTYJ, basically it might or
    might not have all the possible "sections"

    That's why I am using a single hash as it would take care of all
    conditions.
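For concreteness, the two layouts can be sketched like this (the values here are made up):

```perl
use strict;
use warnings;

# Layout 1: one flat hash, sections joined into a single long key.
my %flat;
$flat{'NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE'} = 42;
$flat{'NAMEK__PROPERTYA__TABLE'} = 7;    # "short" entries coexist naturally

# Layout 2: nested hashes; Perl autovivifies the intermediate levels,
# so partial paths are not a problem here either.
my %nested;
$nested{NAMEK}{PROPERTYA}{RELATIONB}{SET3}{CHARACTER4}{CONDITION}{TABLE} = 42;

print $flat{'NAMEK__PROPERTYA__TABLE'}, "\n";    # 7
print $nested{NAMEK}{PROPERTYA}{RELATIONB}{SET3}{CHARACTER4}{CONDITION}{TABLE}, "\n";    # 42
```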

    I just wanted to ask you guys which one would be more efficient.

    thanks.
    Perl Learner, Jun 8, 2005
    #1

  2. Gunnar Hjalmarsson

    Gunnar Hjalmarsson Guest

    Perl Learner wrote:
    > I am storing all the data from a HUGE file into 1 hash with long key
    > names.
    > for ex. my key would be something like
    >
    > NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE
    >
    > I can also make 7 hashes, one inside the other:
    > HASH{NAMES}{PROPERTIES}{RELATIONS}{SETS}{CHARACTERS}{CONDITIONS}{TABLES}


    <snip>

    > I just wanted to ask you guys which one would be more efficient.


    Creating one hash consumes fewer resources than creating seven hashes,
    of course. Which data structure is the most suitable in this case
    depends on how you are going to make use of the hash data.

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jun 8, 2005
    #2

  3. Perl Learner

    Perl Learner Guest

    thanks for the quick response. using a single hash takes fewer
    resources? that makes me happy.

    oh by the way, i am using it to compare two HUGE files.

    so

    NAMEK__PROPERTYA of file A against NAMEK__PROPERTYA of file B

    and...

    NAMEK__PROPERTYA__TABLE of fileA against the same in file B

    and so on..
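A minimal sketch of that comparison, assuming both files have already been parsed into flat hashes with the same key convention (the data here is invented):

```perl
use strict;
use warnings;

# Hypothetical parsed data from the two files, keyed as in the post.
my %a = ( 'NAMEK__PROPERTYA' => 1.0, 'NAMEK__PROPERTYA__TABLE' => 3.5 );
my %b = ( 'NAMEK__PROPERTYA' => 1.0, 'NAMEK__PROPERTYA__TABLE' => 3.7 );

# Collect every key of A that is missing from B or whose value differs.
my @diffs = grep { !exists $b{$_} or $a{$_} != $b{$_} } sort keys %a;

print "$_: A=$a{$_} B=", ( exists $b{$_} ? $b{$_} : 'missing' ), "\n" for @diffs;
```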
    Perl Learner, Jun 8, 2005
    #3
  4. Arne Ruhnau

    Arne Ruhnau Guest

    Perl Learner wrote:
    > I am storing all the data from a HUGE file into 1 hash with long key
    > names.
    > for ex. my key would be something like
    >
    > NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE
    >
    > I can also make 7 hashes, one inside the other:
    > HASH{NAMES}{PROPERTIES}{RELATIONS}{SETS}{CHARACTERS}{CONDITIONS}{TABLES}

    <snip>
    > I was using 1 big hash instead of cascaded hashes as it would be a lot
    > simpler. And also, sometimes, my data would stop at some random
    > point..
    > for example, for some NAMEs, I might only have
    >
    > NAMEK__PROPERTYA__TABLE or even NAME__PROPERTYJ, basically it might or
    > might not have all the possible "sections"


    Although it depends on the way you will use your data, as Gunnar already
    pointed out, you could alternatively use a hash of arrays and map your
    former hash keys to array indices. That way you can cope with the
    mentioned "gaps" in your data, but you have to be prepared to get undef
    back. You keep as many leading parts as hash keys as you can guarantee
    (it seems NAME is always present) and then simply use a list of lists
    (LoL), like so:

    $hash->{name} = [
        [ $property, $relation, $set, $character, $condition, $table ],
        [ $property, $relation, $set, $character, $condition, $table ],
    ];

    Of course, to get something that has name A and Relation C, you need

    grep { $_->[1] } @{ $hash->{A} }

    To make it more readable, you could "use constant" and give names to
    your array indices.
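A sketch of that idea (the constant names and sample rows are made up):

```perl
use strict;
use warnings;
use constant {    # name the array slots instead of remembering 0..5
    PROPERTY  => 0,
    RELATION  => 1,
    SET       => 2,
    CHARACTER => 3,
    CONDITION => 4,
    TABLE     => 5,
};

# Hypothetical hash-of-LoL with two rows under name 'A'.
my %hash = (
    A => [
        [ 'p1', 'C', 's1', 'c1', 'cond1', 't1' ],
        [ 'p2', 'D', 's2', 'c2', 'cond2', 't2' ],
    ],
);

# The same grep as before, but readable:
my @with_relation_c = grep { $_->[RELATION] eq 'C' } @{ $hash{A} };
print scalar(@with_relation_c), "\n";    # 1
```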

    But, again, it depends on the way you want to use your data. And I cannot
    tell you if this would be more efficient...

    Arne Ruhnau
    Arne Ruhnau, Jun 8, 2005
    #4
  5. Anno Siegel

    Anno Siegel Guest

    Perl Learner <> wrote in comp.lang.perl.misc:
    > I am storing all the data from a HUGE file into 1 hash with long key
    > names.
    > for ex. my key would be something like
    >
    > NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE
    >
    > I can also make 7 hashes, one inside the other:
    > HASH{NAMES}{PROPERTIES}{RELATIONS}{SETS}{CHARACTERS}{CONDITIONS}{TABLES}
    >
    > which one would be more efficient ?
    >
    > I was using 1 big hash instead of cascaded hashes as it would be a lot
    > simpler. And also, sometimes, my data would stop at some random
    > point..
    > for example, for some NAMEs, I might only have
    >
    > NAMEK__PROPERTYA__TABLE or even NAME__PROPERTYJ, basically it might or
    > might not have all the possible "sections"
    >
    > That's why I am using a single hash as it would take care of all
    > conditions.


    Look up "multidimensional array emulation" in perlvar, it may be
    what you are looking for.
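A quick illustration of what perlvar describes under $; (SUBSEP): a comma-separated hash subscript is silently joined into one flat key, so you get multidimensional-looking syntax on top of a single hash:

```perl
use strict;
use warnings;

# The comma inside the subscript joins the parts with $; (default "\x1c").
my %h;
$h{ 'NAMEK', 'PROPERTYA', 'TABLE' } = 42;

print $h{ 'NAMEK', 'PROPERTYA', 'TABLE' }, "\n";    # 42

# The stored key really is one flat string:
my ($key) = keys %h;
print $key eq join( $;, 'NAMEK', 'PROPERTYA', 'TABLE' ) ? "flat\n" : "nested\n";    # flat
```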

    Anno
    Anno Siegel, Jun 8, 2005
    #5
  6. Arne Ruhnau

    Arne Ruhnau Guest

    Arne Ruhnau wrote:
    > you could alternatively use a hash of arrays and bind your
    > former hash-keys to array-indices. Thereby, you can overcome the mentioned
    > "gaps" in your data, but have to be prepared to get undef back. You take as
    > many keys as hash-keys as you can guarantee (seems as if NAME is always
    > present) and then simply take LOL, like so:
    >
    > $hash->{name}[
    > [Property, Relation, Set, Character, Condition, Table],
    > [Property, Relation, Set, Character, Condition, Table],
    > ];
    >
    > Of course, to get something that has name A and Relation C, you need
    >
    > grep { $_->[1] } @{ $hash->{A} }


    grep { $_->[1] eq 'C' } @{ $hash->{A} }

    *grmbl*

    Arne Ruhnau
    Arne Ruhnau, Jun 8, 2005
    #6
  7. Xho

    Guest

    "Perl Learner" <> wrote:
    > I am storing all the data from a HUGE file into 1 hash with long key
    > names.
    > for ex. my key would be something like
    >
    > NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE
    >
    > I can also make 7 hashes, one inside the other:
    > HASH{NAMES}{PROPERTIES}{RELATIONS}{SETS}{CHARACTERS}{CONDITIONS}{TABLES}
    >
    > which one would be more efficient ?


    You didn't tell us what you are using these hashes for. If you don't
    actually use the data, then it would be more efficient to simply forgo
    both methods and not have any hashes at all.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Jun 8, 2005
    #7
  8. Xho

    Guest

    "Perl Learner" <> wrote:
    > thanks for the quick response. using a single hash takes less
    > resources ? that makes me happy.
    >
    > oh by the way, i am using it to compare two HUGE files.


    How HUGE are they? To me, huge files are at least the size of
    main memory, if not more. Which means that even the more efficient
    of your two hash methods won't work.

    I'd use system tools to sort each file into a canonical order, and then
    use Perl (or even other system tools) to do the comparison on the
    canonicalized files in a memory efficient way.
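A sketch of that approach, assuming each file has been reduced to sorted "key value" lines (the file names and contents here are invented for the demonstration):

```perl
use strict;
use warnings;

# Create two small, already-sorted sample files (made-up data).
open my $w, '>', 'a.sorted' or die $!;
print $w "k1 1\nk2 2\n";
close $w;
open $w, '>', 'b.sorted' or die $!;
print $w "k1 1\nk2 3\nk4 4\n";
close $w;

open my $fa, '<', 'a.sorted' or die $!;
open my $fb, '<', 'b.sorted' or die $!;

# Classic sorted-merge comparison: only one line per file in memory.
my @report;
my ( $la, $lb ) = ( scalar <$fa>, scalar <$fb> );
while ( defined $la and defined $lb ) {
    my ( $ka, $va ) = split ' ', $la;
    my ( $kb, $vb ) = split ' ', $lb;
    if    ( $ka lt $kb ) { push @report, "only in A: $ka"; $la = <$fa>; }
    elsif ( $ka gt $kb ) { push @report, "only in B: $kb"; $lb = <$fb>; }
    else {
        push @report, "differs: $ka A=$va B=$vb" if $va ne $vb;
        ( $la, $lb ) = ( scalar <$fa>, scalar <$fb> );
    }
}
while ( defined $la ) { push @report, "only in A: " . ( split ' ', $la )[0]; $la = <$fa>; }
while ( defined $lb ) { push @report, "only in B: " . ( split ' ', $lb )[0]; $lb = <$fb>; }

print "$_\n" for @report;
unlink 'a.sorted', 'b.sorted';
```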

    Xho

    , Jun 8, 2005
    #8
  9. Bob

    Bob Guest

    Perl Learner wrote:
    > I am storing all the data from a HUGE file into 1 hash with long key


    Sounds to me like you need to load this _HUGE_ data into a database.
    That would be much better, and much quicker. You could use Perl's DBI
    interface to massage the data, clean it up, and get it in, and then
    use the DB. Perl is not really ideal for what you are describing, and
    those 'keys' that you are generating sound rather shaky to me. You
    could do so much more from the database, and just use Perl to issue
    SQL statements held in scalars.

    There are plenty of 'free' databases, and MS have just released a
    'free' version of SQL Server 2005 called SQL Server Express. You can
    create databases of up to 4 GB for nothing on a win32 machine.
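A minimal sketch of the DBI route, using an in-memory SQLite database rather than SQL Server (assumes the DBD::SQLite driver is installed; the table layout and numbers are invented):

```perl
use strict;
use warnings;
use DBI;    # assumes DBD::SQLite is available

# Load parsed values into a table once, then query them with SQL.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 } );
$dbh->do('CREATE TABLE cells (cell TEXT, pin TEXT, property TEXT, value REAL)');

my $ins = $dbh->prepare('INSERT INTO cells (cell, pin, property, value) VALUES (?, ?, ?, ?)');
$ins->execute( 'ADDER', 'A', 'capacitance', 0.012 );    # made-up numbers

my ($cap) = $dbh->selectrow_array(
    'SELECT value FROM cells WHERE cell = ? AND property = ?',
    undef, 'ADDER', 'capacitance',
);
print "ADDER capacitance: $cap\n";
$dbh->disconnect;
```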
    Bob, Jun 8, 2005
    #9
  10. Perl Learner

    Perl Learner Guest

    Thanks for the detailed replies folks.

    Here're some of my responses to the questions some of you had for me:

    1. what will i be using these hashes for?
    a. a quick answer is .. to easily compare corresponding values in two
    different "databases" as they have the same "key" (or address, if you
    will)

    the file i am reading in will be something like

    Cell (CELLNAME)
    {
        area : value
        capacitance : value
        pin (PIN)
        {
            capacitance :
            power
            {
                blah blah
            }
            timing
            {
                blah blah
            }
            blah blah
        }
    }

    note that in these "blah blah"s i have skipped over a lot of things
    that i need. i have a lot of conditional information, 2D and 3D
    tables, or just simple values.

    these tables have values at certain "whens" and at certain "paths" (and
    there are a few more other details)

    but, basically, that's the basic structure of the file..

    now i have 2 files like this that i want to compare. although both the
    files are "pretty much" of the same format, there are some minute
    differences.

    by "compare" i am talking about comparing the values (numbers) at the
    same "when"s and "paths" for the same "pins" and the same "cells" etc
    etc

    i have to extrapolate values from one file and project them to the same
    conditions as the other file, and then compare. basically, a lot of
    math involved.

    and then i want to graph certain things, etc etc.

    (i wanted to save you the long story. but in the process, i might have
    given too little info. sorry about that).


    2. how huge are these files?
    a. each file is about 20 MB. i was saying _HUGE_ because i haven't
    edited files this big before. also, the structure of the data in these
    files is very complicated, which was overwhelming for me, so i said
    HUGE. :)


    i managed to get the parser done and working (took about a week). i am
    using a single hash with a long key name.. something like
    CELL:ADDER__PIN:A__RELATEDPIN:CI__TIMING__TIMINGTYPE:POSITIVE_UNATE__TIMINGSENSE:RISING__WHEN:!CI__RISETRANSITION

    ...err.. something like that.

    now if i use that key, i get a table back from the hash.

    if i just use

    CELL:ADDER__CAPACITANCE

    i get a single value (of capacitance) back from the hash

    since i have a lot of these things to deal with, i figured single hash
    would be the simplest.

    i am able to get it do its job in about 3 mins using a 64bit linux
    machine... and about 6-7 mins using a 32bit linux machine.

    although this is not a big deal... i was thinking it could maybe be
    done a little faster :)

    3. Perl is not really ideal for what you are describing ........
    a. you may be right. i am not that big of a programmer (i learned
    perl in 21 days :) sam's way). i haven't done any SQL or database
    stuff before. back in the day, i remember fiddling with dBase III
    Plus.. but that was it.

    file parsing seemed to be a little easy in perl and it can do my
    extrapolation math (basic + - * /, modulus, etc) so i figured perl was
    the deal. i mean.. it is working fine now and doing its job.

    my question was just subjective; i just wanted to know if it could be
    done a bit more efficiently in perl.

    i mean, i can just forget optimizing this. it's only a 5 min wait for
    the results, right ;-)


    thanks for all your comments folks.
    Perl Learner, Jun 9, 2005
    #10
  11. Guest

    Guest

    What's that smell? I know that smell from somewhere... Oh, I remember
    - it's the smell of an application that is just begging for a database
    on the back-end.

    Just a thought. The hash looks very database-ish. NULLs don't bother
    databases, and you have the power of SQL queries to retrieve
    information.
    , Jun 9, 2005
    #11
  12. Bob

    Bob Guest

    Perl Learner wrote:
    > perl in 21 days :) sam's way). i havent done any SQL, database stuff

    You could learn enough SQL in two days :) - basic SELECT stuff.
    Valuable for the rest of your programming life. Try:

    http://www.w3schools.com/sql/default.asp

    > before. back in the day, i remember fiddling with DBase 3 plus.. but
    > that was it.

    Yeah, me too. And Clipper, and FoxPro. They are pants in comparison to
    what a real database could do for you.

    Databases are complex, and there is a lot behind them, but that does
    not mean you should avoid them. No! Don't delay, start today! Not
    only that, but databases complement Perl nicely, as I mentioned in the
    posting above.
    Bob, Jun 9, 2005
    #12
