Hashes of hashes, or just one hash?

Perl Learner

I am storing all the data from a HUGE file into one hash with long key
names. For example, my key would be something like

NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE

I can also make 7 hashes, one inside the other:
HASH{NAMES}{PROPERTIES}{RELATIONS}{SETS}{CHARACTERS}{CONDITIONS}{TABLES}

Which one would be more efficient?

I was using one big hash instead of cascaded hashes as it is a lot
simpler. Also, sometimes my data stops at some random point; for
example, for some NAMEs, I might only have

NAMEK__PROPERTYA__TABLE or even NAME__PROPERTYJ. Basically, an entry
might or might not have all the possible "sections".

That's why I am using a single hash: it takes care of all these
conditions.
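A minimal sketch of the two layouts (the keys and values here are made up):

```perl
use strict;
use warnings;

# Flat layout: one hash, one long key per entry. A missing "section"
# is simply an absent key, so no special handling is needed.
my %flat;
$flat{'NAMEK__PROPERTYA__TABLE'} = 42;
print "not there\n" unless exists $flat{'NAMEK__PROPERTYB'};

# Nested layout: Perl autovivifies the intermediate hashes on store,
# but note that a deep lookup on a missing branch can create empty
# intermediate hashes as a side effect.
my %nested;
$nested{NAMEK}{PROPERTYA}{TABLE} = 42;
print $nested{NAMEK}{PROPERTYA}{TABLE}, "\n";
```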

I just wanted to ask you guys which one would be more efficient.

thanks.
 
Gunnar Hjalmarsson

Perl Learner said:
I am storing all the data from a HUGE file into one hash with long key
names. For example, my key would be something like

NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE

I can also make 7 hashes, one inside the other:
HASH{NAMES}{PROPERTIES}{RELATIONS}{SETS}{CHARACTERS}{CONDITIONS}{TABLES}

I just wanted to ask you guys which one would be more efficient.

Creating one hash consumes fewer resources than creating seven hashes, of
course. Which data structure is the most suitable in this case depends
on how you are going to make use of the hash data.
 
Perl Learner

Thanks for the quick response. Using a single hash takes fewer
resources? That makes me happy.

Oh, by the way, I am using it to compare two HUGE files.

so

NAMEK__PROPERTYA of file A against NAMEK__PROPERTYA of file B

and...

NAMEK__PROPERTYA__TABLE of file A against the same in file B

and so on..
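With two such hashes (say %a for file A and %b for file B), the comparison loop might be sketched like this; the keys and values are invented for the example:

```perl
use strict;
use warnings;

my %a = (
    'NAMEK__PROPERTYA'        => 1.00,
    'NAMEK__PROPERTYA__TABLE' => 2.50,
);
my %b = (
    'NAMEK__PROPERTYA'        => 1.10,
);

# Walk file A's keys; print the matching value from file B, or note
# that the entry exists only in A.
for my $key (sort keys %a) {
    if (exists $b{$key}) {
        printf "%s: %.2f vs %.2f\n", $key, $a{$key}, $b{$key};
    }
    else {
        print "$key: only in file A\n";
    }
}
```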
 
Arne Ruhnau

Perl Learner said:
I am storing all the data from a HUGE file into one hash with long key
names. For example, my key would be something like
for ex. my key would be something like

NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE

I can also make 7 hashes, one inside the other:
HASH{NAMES}{PROPERTIES}{RELATIONS}{SETS}{CHARACTERS}{CONDITIONS}{TABLES}
I was using one big hash instead of cascaded hashes as it is a lot
simpler. Also, sometimes my data stops at some random point; for
example, for some NAMEs, I might only have

NAMEK__PROPERTYA__TABLE or even NAME__PROPERTYJ. Basically, an entry
might or might not have all the possible "sections".

Although it depends on the way you will use your data, as Gunnar already
pointed out, you could alternatively use a hash of arrays and map your
former hash keys to array indices. That way you can handle the mentioned
"gaps" in your data, but you have to be prepared to get undef back. You
keep as many hash keys as you can guarantee to be present (it seems NAME
always is) and then use a list of lists (LoL), like so:

$hash->{name} = [
    [ $property, $relation, $set, $character, $condition, $table ],
    [ $property, $relation, $set, $character, $condition, $table ],
];

Of course, to get the entries that have name A and relation C, you need

grep { $_->[1] } @{ $hash->{A} }

To make it more readable, you could use constant and give names to your
array indices.

But, again, it depends on how you want to use your data. And I cannot
tell you whether this would be more efficient...
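The use constant idea might look like this (all the names and data are invented for the example):

```perl
use strict;
use warnings;
use constant {
    PROPERTY  => 0,
    RELATION  => 1,
    SET       => 2,
    CHARACTER => 3,
    CONDITION => 4,
    TABLE     => 5,
};

# One record per inner array; undef marks a missing "section".
my $hash = {
    A => [
        [ 'propX', 'C', 'set3', 'char4', undef, 'tbl1' ],
        [ 'propY', 'D', undef,  undef,   undef, undef  ],
    ],
};

# The index constants read much better than bare numbers:
my @with_c = grep { defined $_->[RELATION] && $_->[RELATION] eq 'C' }
             @{ $hash->{A} };
print scalar(@with_c), " record(s) with relation C\n";
```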

Arne Ruhnau
 
Anno Siegel

Perl Learner said:
I am storing all the data from a HUGE file into one hash with long key
names. For example, my key would be something like

NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE

I can also make 7 hashes, one inside the other:
HASH{NAMES}{PROPERTIES}{RELATIONS}{SETS}{CHARACTERS}{CONDITIONS}{TABLES}

Which one would be more efficient?

I was using one big hash instead of cascaded hashes as it is a lot
simpler. Also, sometimes my data stops at some random point; for
example, for some NAMEs, I might only have

NAMEK__PROPERTYA__TABLE or even NAME__PROPERTYJ. Basically, an entry
might or might not have all the possible "sections".

That's why I am using a single hash: it takes care of all these
conditions.

Look up "multidimensional array emulation" (the $; variable) in perlvar;
it may be what you are looking for.
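For reference, that feature lets you write several keys in one hash subscript; under the hood Perl joins them with $; into a single flat key:

```perl
use strict;
use warnings;

my %h;
# A comma-separated key list in a hash subscript...
$h{'NAMEK', 'PROPERTYA', 'TABLE'} = 42;

# ...is really a single key joined with $; (default "\034"):
my $flat_key = join $;, 'NAMEK', 'PROPERTYA', 'TABLE';
print "same entry\n" if $h{$flat_key} == 42;
```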

Anno
 
Arne Ruhnau

Arne said:
you could alternatively use a hash of arrays and map your
former hash keys to array indices. That way you can handle the mentioned
"gaps" in your data, but you have to be prepared to get undef back. You
keep as many hash keys as you can guarantee to be present (it seems NAME
always is) and then use a list of lists (LoL), like so:

$hash->{name} = [
    [ $property, $relation, $set, $character, $condition, $table ],
    [ $property, $relation, $set, $character, $condition, $table ],
];

Of course, to get the entries that have name A and relation C, you need

grep { $_->[1] } @{ $hash->{A} }

grep { $_->[1] eq 'C' } @{ $hash->{A} }

*grmbl*

Arne Ruhnau
 
xhoster

Perl Learner said:
I am storing all the data from a HUGE file into one hash with long key
names. For example, my key would be something like

NAMEK__PROPERTYA__RELATIONB__SET3__CHARACTER4__CONDITION__TABLE

I can also make 7 hashes, one inside the other:
HASH{NAMES}{PROPERTIES}{RELATIONS}{SETS}{CHARACTERS}{CONDITIONS}{TABLES}

Which one would be more efficient?

You didn't tell us what you are using these hashes for. If you don't
actually use the data, then it would be more efficient to simply forgo
both methods and not have any hashes at all.

Xho
 
xhoster

Perl Learner said:
Thanks for the quick response. Using a single hash takes fewer
resources? That makes me happy.

Oh, by the way, I am using it to compare two HUGE files.

How HUGE are they? To me, huge files are at least the size of
main memory, if not more, which means that even the more efficient
of your hash methods won't work.

I'd use system tools to sort each file into a canonical order, and then
use Perl (or even other system tools) to do the comparison on the
canonicalized files in a memory-efficient way.
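A sketch of that approach, assuming each file has first been canonicalized to one sorted "KEY&lt;TAB&gt;VALUE" line per record (e.g. with the system sort(1)); only one line per file is ever held in memory:

```perl
use strict;
use warnings;

# Lockstep merge-compare of two sorted KEY\tVALUE files.
# Returns a list of human-readable difference lines.
sub compare_sorted {
    my ($file_a, $file_b) = @_;
    open my $fa, '<', $file_a or die "$file_a: $!";
    open my $fb, '<', $file_b or die "$file_b: $!";
    my @diffs;
    my $la = <$fa>;
    my $lb = <$fb>;
    while (defined $la || defined $lb) {
        my ($ka, $va) = defined $la ? split(/\t/, $la, 2) : ();
        my ($kb, $vb) = defined $lb ? split(/\t/, $lb, 2) : ();
        if (!defined $lb or (defined $la and $ka lt $kb)) {
            push @diffs, "only in A: $ka";
            $la = <$fa>;
        }
        elsif (!defined $la or $ka gt $kb) {
            push @diffs, "only in B: $kb";
            $lb = <$fb>;
        }
        else {    # same key in both files: compare the values
            chomp($va, $vb);
            push @diffs, "differs at $ka: $va vs $vb" if $va ne $vb;
            $la = <$fa>;
            $lb = <$fb>;
        }
    }
    return @diffs;
}
```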

Xho
 
Bob

Perl Learner said:
I am storing all the data from a HUGE file into 1 hash with long key

Sounds to me like you need to load this _HUGE_ data into a database.
That is much better, and much quicker. You could use the Perl DBI
interface to massage the data, clean it up, and get it in, and then use
the DB. Perl is not really ideal for what you are describing, and those
'keys' that you are generating sound rather shaky to me. You could do so
much more from the database, and just use Perl to issue SQL statements
held in scalars.

There are plenty of 'free' databases, and MS have just released a
'free' version of SQL Server 2005 called SQL Server Express. You can
create up to 4 GB databases for nothing on a win32 machine.
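If you go the DBI route, the loading side might be sketched like this; it assumes the DBD::SQLite driver is installed, and the table and column names are invented for the example:

```perl
use strict;
use warnings;
use DBI;

# Hypothetical sketch: load parsed records into SQLite via DBI.
# An in-memory DB is used here; a file path would persist it.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });
$dbh->do(q{
    CREATE TABLE cell_data (
        cell    TEXT,
        pin     TEXT,
        section TEXT,
        value   REAL
    )
});

my $sth = $dbh->prepare(
    'INSERT INTO cell_data (cell, pin, section, value) VALUES (?, ?, ?, ?)');
$sth->execute('ADDER', 'A', 'CAPACITANCE', 0.013);

# Retrieval is then a plain SQL query instead of a long hash key:
my ($val) = $dbh->selectrow_array(
    'SELECT value FROM cell_data WHERE cell = ? AND section = ?',
    undef, 'ADDER', 'CAPACITANCE');
print "capacitance: $val\n";
```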
 
Perl Learner

Thanks for the detailed replies, folks.

Here are some of my responses to the questions some of you had for me:

1. What will I be using these hashes for?
a. A quick answer is: to easily compare corresponding values in two
different "databases", as they have the same "key" (or address, if you
will).

The file I am reading in will be something like

Cell (CELLNAME)
{
    area : value
    capacitance : value
    pin (PIN)
    {
        capacitance :
        power
        {
            blah blah
        }
        timing
        {
            blah blah
        }
        blah blah
    }
}

Note that in these "blah blah"s I have skipped over a lot of things
that I need. I have a lot of conditional information: 2-D and 3-D
tables, or just simple values.

These tables have values at certain "when"s and at certain "path"s
(and there are a few more details).

But, basically, that's the basic structure of the file.

Now I have two files like this that I want to compare. Although both
files are "pretty much" the same format, there are some minute
differences.

By "compare" I mean comparing the values (numbers) at the same "when"s
and "path"s for the same "pins", the same "cells", etc.

I have to extrapolate values from one file and project them to the same
conditions as the other file, and then compare. Basically, there is a
lot of math involved.

And then I want to graph certain things, etc.

(I wanted to save you the long story, but in the process I might have
given too little info. Sorry about that.)


2. How huge are these files?
a. Each file is about 20 MB. I was saying _HUGE_ because I haven't
edited files this big before. Also, the structure of the data in these
files is very complicated, which was overwhelming for me, so I said
HUGE. :)


I managed to get the parser done and working (it took about a week). I
am using a single hash with a long key name, something like

CELL:ADDER__PIN:A__RELATEDPIN:CI__TIMING__TIMINGTYPE:POSITIVE_UNATE__TIMINGSENSE:RISING__WHEN:!CI__RISETRANSITION

...err, something like that.

Now if I use that key, I get a table back from the hash.

If I just use

CELL:ADDER__CAPACITANCE

I get a single value (of capacitance) back from the hash.

Since I have a lot of these things to deal with, I figured a single
hash would be the simplest.
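That mixed shape is easy to hold in one hash, since a value can be either a plain number or a reference to a table; everything below is invented for illustration:

```perl
use strict;
use warnings;

my %db;
# A shallow key maps to a plain number...
$db{'CELL:ADDER__CAPACITANCE'} = 0.021;
# ...while a deep key maps to a table (array-of-arrays reference):
$db{'CELL:ADDER__PIN:A__RISETRANSITION'} =
    [ [ 0.1, 0.2 ],
      [ 0.3, 0.4 ] ];

my $cap   = $db{'CELL:ADDER__CAPACITANCE'};
my $table = $db{'CELL:ADDER__PIN:A__RISETRANSITION'};
print "cap = $cap, table cell [1][0] = $table->[1][0]\n";
```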

I am able to get it to do its job in about 3 minutes using a 64-bit
Linux machine, and about 6-7 minutes using a 32-bit Linux machine.

Although this is not a big deal, I was thinking it could maybe be done
a little faster. :)

3. Perl is not really ideal for what you are describing...
a. You may be right. I am not that big of a programmer (I learned Perl
in 21 days :) Sams' way). I haven't done any SQL or database stuff
before. Back in the day I remember fiddling with dBase III Plus, but
that was it.

File parsing seemed to be fairly easy in Perl, and it can do my
extrapolation math (basic + - * /, modulus, etc.), so I figured Perl
was the deal. I mean, it is working fine now and doing its job.

My question was just subjective; I just wanted to know whether it
could be done a bit more efficiently in Perl.

I mean, I can just forget optimizing this. It's only a 5-minute wait
for the results, right? ;-)


Thanks for all your comments, folks.
 
davidfilmer

What's that smell? I know that smell from somewhere... Oh, I remember:
it's the smell of an application that is just begging for a database on
the back-end.

Just a thought. The hash looks very database-ish. NULLs don't bother
databases, and you have the power of SQL queries to retrieve
information.
 
Bob

Perl Learner said:
perl in 21 days :) sam's way). i havent done any SQL, database stuff
before.

You could learn enough SQL Express in two days :) - basic SELECT
stuff. Valuable for the rest of your programming life. Try:

http://www.w3schools.com/sql/default.asp

Perl Learner said:
back in the day, i remember fiddling with DBase 3 plus.. but that was
it.

Yeah, me too. And Clipper, and FoxPro. They are pants in comparison to
what a real database could do for you.

Databases are complex, and there is a lot behind them, but that does
not mean that you should avoid them. No! Don't delay, start today! Not
only that, but databases complement Perl nicely, as I mentioned in the
posting above.
 
