Record Hash Data Structure (Newbie)

  • Thread starter rajasekaran.natarajan

rajasekaran.natarajan

Hi all,

I am trying to write a program which extracts the data from a text file
and stores it in a data structure. But there is some mistake that I
have not been able to get past.

I am a newbie (my 25th program or so) in Perl. I have checked the
O'Reilly books and perldoc, but to be frank I have trouble understanding
them (I mean I cannot get it right even after reading).

Below I have given my script. If anybody can throw some light on it I will
be grateful. Thanks in advance.

#!/usr/sbin/perl -w
#extracts grid data from test.dat and stores in dataStructure
$file1 = "test.dat";
open(CLIST,"$file1") || die ("Cant open input file \"$file1\", Unable
to Find, Check the file name and path");
while($line= <CLIST>){
if ($line =~ m/^GRID/) {
($gridNo,$cp,$xcor,$ycor,$coor,$cd)= unpack("x8 A8 A8 A8 A8
A8",$line);
$record = {
GRIDNO => $gridNo,
cp => $cp,
xcor => $xcor,
ycor => $ycoor,
zcor => $zcoor,
cd => $cd,
};
$grid{ $record->{GRIDNO} } = $record;
}
while( $gridNo = each %grid) {
printf "%s is grid no\n",$grid->{$gridNo};
}
}
close(CLIST);
 

chris-usenet

I am trying to write a program which extracts the data from a text file
and stores it in a data structure. But there is some mistake that I
have not been able to get past.

* Have you tried with "use strict", and explicitly declaring the
variables? If not, why not?
* What does your sample data look like?
* What are you expecting to happen?
* What actually happens?
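
On the first point, a minimal sketch of what "use strict" would have reported for the posted script, which declares `$ycor` but then uses `$ycoor`. The names `$snippet`, `$ok` and `$error` are made up for this illustration; the offending line is compiled from a string with eval only so that the demo can print the error instead of dying.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A cut-down copy of the posted code, kept in a string so the
# compile-time error can be captured rather than stop the demo.
my $snippet = <<'END_SNIPPET';
use strict;
my ($xcor, $ycor) = ('780.0', '-422.625');
my $record = { xcor => $xcor, ycor => $ycoor };   # typo: $ycoor
1;
END_SNIPPET

my $ok    = eval $snippet;   # undef, because the snippet does not compile
my $error = $@;              # the message strict produced

print "compiled cleanly\n" if $ok;
print "strict says: $error" unless $ok;
```

Without strict the same line compiles silently and stores undef under ycor, which is much harder to spot.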

Chris
 

ioneabu

I have no idea what your program is supposed to do and I think you have
some typos, but here is a version of your program that compiles with
perl -c program.pl. Clean it up and provide some test data. Learning
Perl is a great book to start with. I keep re-reading the first few
chapters and I learn something new every time.

#!/usr/bin/perl

use strict;
use warnings;

#extracts grid data from test.dat and stores in dataStructure
my $file1 = "test.dat";
open(CLIST,$file1) || die ("Cant open input file \"$file1\", Unable
to Find, Check the file name and path");
my $line;
my ($gridNo,$cp,$xcor,$ycor,$zcor,$cd);
my %grid;
while($line= <CLIST>){
if ($line =~ m/^GRID/) {
($gridNo,$cp,$xcor,$ycor,$zcor,$cd)= unpack("x8 A8 A8 A8 A8
A8",$line);
my $record = {
GRIDNO => $gridNo,
cp => $cp,
xcor => $xcor,
ycor => $ycor,
zcor => $zcor,
cd => $cd,
};
$grid{ $record->{GRIDNO} } = $record;
}
while( $gridNo = each %grid) {
printf "%s is grid no\n",$grid{$gridNo};
}

}

close(CLIST);

good luck!

wana
 

Brian McCauley

my ($gridNo,$cp,$xcor,$ycor,$zcor,$cd);
while($line= <CLIST>){
if ($line =~ m/^GRID/) {
($gridNo,$cp,$xcor,$ycor,$zcor,$cd)= unpack("x8 A8 A8 A8 A8
A8",$line);

You are suffering from a nasty case of premature declaration there. You
probably should get that seen to before it causes you too much
embarrassment.

Anyhow, since you immediately put this into a hash, a hash slice
assignment would be more natural.

@{my $record}{qw( GRIDNO cp xcor ycor zcor cd)} =
unpack("x8 A8 A8 A8 A8 A8",$line);
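
A minimal sketch of the hash slice idea on a made-up line, under two stated assumptions. First, the sample line has seven 8-character fields (name, grid number, cp, x, y, z, cd), as in the layout given later in the thread, so the template uses six A8 fields to give all six keys a value. Second, `$record` is declared on its own line before the slice, which sidesteps any question about the scope of a `my` placed inside the `@{ ... }` block.

```perl
use strict;
use warnings;

# Hypothetical sample line: seven left-justified 8-character
# fields; cp and cd are left blank.
my $line = join '', map { sprintf '%-8s', $_ }
    'GRID', '130421', '', '780.0', '-422.625', '267.99', '';

my @names = qw(GRIDNO cp xcor ycor zcor cd);

# Declare the record, then fill all six keys in one hash-slice
# assignment. x8 skips the leading "GRID" field.
my $record = {};
@{$record}{@names} = unpack 'x8 A8 A8 A8 A8 A8 A8', $line;

# Store the record under its grid number, as in the posted code.
my %grid;
$grid{ $record->{GRIDNO} } = $record;

printf "%s: x=%s y=%s z=%s\n",
    $record->{GRIDNO}, @{$record}{qw(xcor ycor zcor)};
```

The slice keeps the key list and the unpack template side by side, so a mismatch between the number of names and the number of fields is easier to see.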
 

ioneabu

Brian said:
You are suffering from a nasty case of premature declaration there. You
probably should get that seen to before it causes you too much
embarrassment.

Anyhow, since you immediately put this into a hash, a hash slice
assignment would be more natural.

@{my $record}{qw( GRIDNO cp xcor ycor zcor cd)} =
unpack("x8 A8 A8 A8 A8 A8",$line);

I agree. I was not trying to fix the structure of the OP's code. I
just corrected typos and did the minimal amount of changes possible to
make it compile. I should not have touched it without more information.
 

rajasekaran.natarajan

Hi all

Thanks a lot for your comments and help.
I just can't forgive myself for my typos; I am sorry for that.
Here is the corrected version and it is working.

I am using `Learning Perl`, but it doesn't say much about hashes of
hashes etc.


I have also given the input data format. I need to link the generated
hash data to the element hash, which I am going to do later in the second
part. Basically every element has four grids and each grid has x, y, z
coordinates plus 2 more fields. Which kind of format is better for storing
this and retrieving it for almost 300,000 elements?

Any help will be appreciated.

PS: I tried to use `use strict` but it makes life harder for a newbie
like me. Is there any better, detailed source about the use strict stuff?

#!/usr/sbin/perl -w
#extracts grid data from test.dat and stores in hash of hash
# raj - 15/Feb/2005

$file1 = "test.dat";
($identifier,$gridNo,$cp,$xcor,$ycor,$zcor,$cd);
open(DECK,"$file1") || die ("Cant open input file \"$file1\", Unable to
Find, Check the file name and path");
while($line= <DECK>){
if ($line =~ m/^GRID/) {
($identifier,$gridNo,$cp,$xcor,$ycor,$zcor,$cd)= unpack("A8 A8 A8 A8
A8 A8 A8",$line);
%grid = (
$gridNo => {cp => $cp, xcor => $xcor, ycor => $ycor, zcor => $zcor,},
);
}
foreach $gridNo (keys %grid) {
printf
"$gridNo,$grid{$gridNo}->{xcor},$grid{$gridNo}->{ycor},$grid{$gridNo}->{zcor}\n";
}
}
close(DECK);

#Input File Format for test.dat <this is a MSC/NASTRAN INPUT DECK>
#each input field has 8 columns/fields
#COR is the coordinate value
#<name> <NUMBER> <CPVALUE> <XCOR> <YCOR> <ZCOR> <CPVALUE>, each 8 columns
#GRID 130421 780.0 -422.625267.99
#GRID 130422 780.0 -380.25 267.99
#GRID 130423 780.0 380.25 10.73996
#GRID 130424 780.0 423.562510.73996
#GRID 130425 780.0 466.5 10.73996
#------------------------------------------------------
#NEXTPART
#<ElementName> <ElementNo> <Grid1> <Grid2> <Grid3> <Grid4>
#CQUAD4 115236 119524 119525 119528 119527
#CQUAD4 115237 119527 119528 119522 119521
#CQUAD4 115238 119521 119522 119519 119518
 

rajasekaran.natarajan

Hi all!
Yes, I included another hash for the element data for the second part, but
I do not know how to access the hash %grid using the values I get from
the %element hash (check the print loop).

my $file1 = "test.dat";
my ($identifier,$gridNo,$cp,$xcor,$ycor,$zcor,$cd);
open(DECK,"$file1") || die ("Cant open input file \"$file1\", Unable to
Find, Check the file name and path");
while($line= <DECK>){
if ($line =~ m/^GRID/) {
($identifier,$gridNo,$cp,$xcor,$ycor,$zcor,$cd)= unpack("A8 A8 A8 A8
A8 A8 A8",$line);
%grid = (
$gridNo => {cp => $cp, xcor => $xcor, ycor => $ycor, zcor => $zcor,},
);
}
elsif($line =~ m/^CQUAD4/) {
($eId,$elementNo,$grid1,$grid2,$grid3,$grid4) = unpack("A8 A8 x8 A8
A8 A8 A8",$line);
%element = (
$elementNo => {ETYPE=> $eId, con1 => $grid1, con2 => $grid2, con3=>
$grid3, con4 => $grid4},
);
}
foreach $elementNo (keys %element) {
print
"\n$element{$elementNo}->{ETYPE},$elementNo,$element{$elementNo}->{con1},$element{$elementNo}->{con2},$element{$elementNo}->{con3},$element{$elementNo}->{con4}
\n";
#here I wanted to print grid1-grid4 (from `con1`-`con4` value of
#%element) and its x,y,z coordinates (x,y,z need to be picked from like
#%grid{con1}->{xcor}); is it possible?
#I don't know how to do that; I tried many ways.
}
}
close(DECK);

My Present Goal is to print like this

Element1 grid1 grid2 grid3 grid4 (-> this line is ok)
grid1 xcor ycor zcor -> these four lines I could not get.
grid2 xcor ycor zcor
grid3 xcor ycor zcor
grid4 xcor ycor zcor
Element2 grid1 grid2 grid3 grid4
grid1 xcor ycor zcor
grid2 xcor ycor zcor
grid3 xcor ycor zcor
grid4 xcor ycor zcor

I tried something like this (don't laugh), but it did not work:

foreach $elementNo (keys %element) {
print
"\n$element{$elementNo}->{ETYPE},$elementNo,$element{$elementNo}->{con1},$element{$elementNo}->{con2},$element{$elementNo}->{con3},$element{$elementNo}->{con4}
\n";
@g = ($element{$elementNo}->{con1},$element{$elementNo}->{con2},
$element{$elementNo}->{con3}, $element{$elementNo}->{con4});
foreach $g (@g) {
print "$g,$grid{$g}->{xcor},$grid{$g}->{ycor},$grid{$g}->{zcor}\n";

}

Is there any other way to do this stuff more efficiently?
Any help will be appreciated.
 

xhoster

I have also given the input data format. I need to link the generated
hash data to the element hash, which I am going to do later in the second
part. Basically every element has four grids and each grid has x, y, z
coordinates plus 2 more fields. Which kind of format is better for storing
this and retrieving it for almost 300,000 elements?

If you try to store this as 300,000 hashes, you may run into memory
problems due to the large amount of overhead with hashes. It depends on
how much memory you have, of course. I might be tempted to switch to
parallel hashes for each field, each with 300,000 entries. More annoying
to work with in some ways, but more memory efficient.

PS: I tried to use `use strict` but it makes life harder for a newbie
like me. Is there any better, detailed source about the use strict stuff?

We were all newbies at one point, and trust us, "use strict" does not make
your life harder. It may seem that way at first, but remember that you
came here for our wisdom, and this is the centerpiece of that wisdom. Use
strict.

For detailed info, see perldoc strict, and the section on "my" in
perldoc perlfunc. And perldoc perlsub.

Xho
 

rajasekaran.natarajan

Thanks a lot to Gibson and Xho.
This is a lot of info; let me go back to my desk and do the
homework to implement your suggestions. It will take some time, I expect.
Thanks again to Gibson for the minute details and the pains he has
taken over them.

Before I do, let me clarify the problem and the input data better.

Regarding the relationship between element and grid: imagine the
element to be a square surface/plate in space; then its corners are
GRIDs. So the corners define the geometry of the element. (If the
surface is triangular then it has three GRIDs, loosely speaking corners.)
Hope this explains the relationship.

What I am trying to do is read the element and grid data and
store them along with the relationship. So when I encounter element(i)
I should be able to pick out its corners from an array/hash/anything,
like saying
GRID(j) of (element(i)) #j is 1-4 for square element.

In my next post I will implement use strict and also limit my line width to
60 characters.
Sorry for the inconvenience.

Thanks again


Jim said:
Hi all!
Yes, I included another hash for the element data for the second part, but
I do not know how to access the hash %grid using the values I get from
the %element hash (check the print loop).

As Xho pointed out, a hash is probably not the best data structure for
300,000 records. In any case, you are not using hashes properly.

Where is 'use strict;' ?
my $file1 = "test.dat";
my ($identifier,$gridNo,$cp,$xcor,$ycor,$zcor,$cd);

Do not declare variables outside of a block that are used only within
that block.
open(DECK,"$file1") || die ("Cant open input file \"$file1\", Unable to
Find, Check the file name and path");

Don't quote unnecessarily; use variable file handles, use 3-argument
open, let Perl and the system tell you what is wrong:

open(my $fh,'<',$file1) or die("Can't open $file1: $!");
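
A small sketch of that style, assuming a made-up helper named `open_for_reading` and a made-up path that does not exist, so the failure branch can be seen. The helper returns the system's own error text from `$!` instead of a hand-written guess about the cause.

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading with a lexical
# handle and 3-argument open. Returns (handle, undef) on success
# or (undef, message) on failure.
sub open_for_reading {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or return (undef, "Can't open $path: $!");
    return ($fh, undef);
}

# Hypothetical file name, chosen so that the open fails.
my ($fh, $error) = open_for_reading('no-such-dir/test.dat');
print defined $fh ? "opened\n" : "$error\n";
```

Because the handle is an ordinary lexical variable, it can be passed to subs and is closed automatically when it goes out of scope.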

Here, declare some global arrays (not hashes) to hold your data:

my( @grid, @element );
while($line= <DECK>){

while( my $line = <$fh> ) {
if ($line =~ m/^GRID/) {
($identifier,$gridNo,$cp,$xcor,$ycor,$zcor,$cd)= unpack("A8 A8 A8 A8
A8 A8 A8",$line);

my( $identifier, ...
%grid = (
$gridNo => {cp => $cp, xcor => $xcor, ycor => $ycor, zcor => $zcor,},
);

You are assigning to %grid each time through the loop, thereby
overwriting your previous data. If you really want to use a hash, you
should be doing this:

$grid{$gridNo} = { id => $gridNo, cp => $cp, ... };

but you might want to use an array of hashes instead:

push(@grid, { id => $gridNo, cp => $cp, ... });

or even an array of arrays (which I would recommend for simplicity and
efficiency):

push(@grid, [ $gridNo, $cp, $xcor, $ycor, $zcor ] );
}
elsif($line =~ m/^CQUAD4/) {
($eId,$elementNo,$grid1,$grid2,$grid3,$grid4) = unpack("A8 A8 x8 A8
A8 A8 A8",$line);

my( $eId, ... ) = ...;
%element = (
$elementNo => {ETYPE=> $eId, con1 => $grid1, con2 => $grid2, con3=>
$grid3, con4 => $grid4},
);

Either
$element{$elementNo} =
or
push(@element, { ... } );
}
foreach $elementNo (keys %element) {
print
"\n$element{$elementNo}->{ETYPE},$elementNo,$element{$elementNo}->{con1},$element{$elementNo}->{con2},$element{$elementNo}->{con3},$element{$elementNo}->{con4}
\n";
#here I wanted to print grid1-grid4 (from `con1`-`con4` value of
#%element) and its x,y,z coordinates (x,y,z need to be picked from like
#%grid{con1}->{xcor}); is it possible?
#I don't know how to do that; I tried many ways.
}
}
close(DECK);

In future posts, please try to make all of your lines no longer than
60-70 characters for readability.

My Present Goal is to print like this

Element1 grid1 grid2 grid3 grid4 (-> this line is ok)
grid1 xcor ycor zcor -> these four lines I could not get.
grid2 xcor ycor zcor
grid3 xcor ycor zcor
grid4 xcor ycor zcor
Element2 grid1 grid2 grid3 grid4
grid1 xcor ycor zcor
grid2 xcor ycor zcor
grid3 xcor ycor zcor
grid4 xcor ycor zcor

You seem to imply some sort of relation between the GRID lines and the
CQUAD4 lines, but you haven't explained what it is. That would
definitely influence the choice of data structure to use and the method
for fetching and printing. Try showing some relevant, sample data next
time. Use the special <DATA> file handle and include the data at the
end of your program after a '__DATA__' line.
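
A minimal sketch of the element-to-grid lookup asked about above, under several assumptions that are not from the thread: the deck lines are made up, values are left-justified in 8-character fields (a real deck may need leading blanks trimmed from the keys), and `card` is a hypothetical helper that builds such lines. An in-memory string stands in for test.dat or a `__DATA__` section. Hashes keyed by number are used here, not arrays, because corners are looked up by grid number.

```perl
use strict;
use warnings;

# Hypothetical helper: build one fixed-width line of 8-char fields.
sub card { return join('', map { sprintf '%-8s', $_ } @_) . "\n" }

# Made-up deck: GRID id, cp, x, y, z; CQUAD4 id, one skipped
# field, then the four corner grid ids.
my $deck = card('GRID', 1, '', '0.0', '0.0', '0.0')
         . card('GRID', 2, '', '1.0', '0.0', '0.0')
         . card('GRID', 3, '', '1.0', '1.0', '0.0')
         . card('GRID', 4, '', '0.0', '1.0', '0.0')
         . card('CQUAD4', 10, 7, 1, 2, 3, 4);

my (%grid, %element);

open(my $fh, '<', \$deck) or die "Can't open in-memory deck: $!";
while (my $line = <$fh>) {
    if ($line =~ /^GRID/) {
        my ($id, $cp, $x, $y, $z) = unpack 'x8 A8 A8 A8 A8 A8', $line;
        # Add one entry; assigning to %grid as a whole would
        # throw away every grid read so far.
        $grid{$id} = { cp => $cp, xcor => $x, ycor => $y, zcor => $z };
    }
    elsif ($line =~ /^CQUAD4/) {
        my ($id, @corners) = unpack 'x8 A8 x8 A8 A8 A8 A8', $line;
        $element{$id} = { ETYPE => 'CQUAD4', corners => \@corners };
    }
}
close $fh;

# Report only after the whole deck is read, so every grid is known.
my $report = '';
for my $eid (sort { $a <=> $b } keys %element) {
    my @corners = @{ $element{$eid}{corners} };
    $report .= join(' ', $element{$eid}{ETYPE}, $eid, @corners) . "\n";
    for my $gid (@corners) {
        my $g = $grid{$gid} or next;    # skip ids with no GRID line
        $report .= join(' ', $gid, @{$g}{qw(xcor ycor zcor)}) . "\n";
    }
}
print $report;
```

Keeping the corner ids in one array per element means the inner loop does not need to spell out con1 to con4, and a triangular element would simply carry three ids.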


 

rajasekaran.natarajan

use strict
I have implemented it after reading the man pages; I think I now have some
preliminary idea about how to use my. (Thanks to Gibson too.)
parallel hash
By parallel hash you mean using references? I have not yet caught up on
references
(especially to arrays and hashes); maybe I will try that after this iteration
with hashes. Actually I am doing this problem to learn Perl.
I also thought so, after reading about hash performance on some web page.
I don't know much about this kind of stuff, but I have a decent
machine (HP VISUALISE3750 or SGI OCTANE) with a good config; I hope these
will do.
 

xhoster

(e-mail address removed) wrote:

By parallel hash you mean using references? I have not yet caught up on
references
(especially to arrays and hashes); maybe I will try that after this iteration
with hashes. Actually I am doing this problem to learn Perl.

Kind of the opposite: by parallel hashes I mean *not* using references,
or at least using one fewer level of them.

Assuming you have a million 3-D points, you could store them
something like:

$point{$name_of_point}{X}=34.589;
$point{$name_of_point}{Y}=4.998;
$point{$name_of_point}{Z}=9.876;

Here, %point is a hash with one million entries, and each
entry is a (reference to an) anonymous hash (with three entries each).
So you have 1,000,001 hashes.

Or, you could do:

$point_x{$name_of_point}=34.589;
$point_y{$name_of_point}=4.998;
$point_z{$name_of_point}=9.876;

Here, you have only 3 hashes (and none of them references). A lot less
overhead. But, the other way is usually better if you are not worried
about the memory overhead. For example, if you need to pass the whole
structure into a sub, you need to pass 3 references for this case rather
than 1 for the previous case. And if you need to change to 4D points
rather than 3D, there would be much more code to change in this case.

Other than memory, the one nice thing about the second way is that "strict"
will catch it if you commit a typo like $point_w{$name_of_point}, while it
won't catch it if you typo like $point{$name_of_point}{W}.
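
A minimal sketch of the two layouts side by side, with the typo behaviour described above. The point names and the second point's coordinates are invented for illustration, and the mistyped hash name is compiled from a string with eval only so that the demo can show the error and carry on.

```perl
use strict;
use warnings;

# Hypothetical points: name followed by x, y, z.
my @points = ( [ 'p1', 34.589, 4.998, 9.876 ], [ 'p2', 1, 2, 3 ] );

# Layout 1: one outer hash plus one anonymous hash per point.
my %point;
# Layout 2: three parallel hashes sharing the same keys.
my (%point_x, %point_y, %point_z);

for my $p (@points) {
    my ($name, $x, $y, $z) = @$p;
    $point{$name} = { X => $x, Y => $y, Z => $z };
    ($point_x{$name}, $point_y{$name}, $point_z{$name}) = ($x, $y, $z);
}

# A mistyped inner key is only a run-time undef ...
my $nested_typo = $point{p1}{W};

# ... while a mistyped hash name fails to compile under strict.
my $compiled     = eval 'my $v = $point_w{p1}; 1';
my $strict_error = $@;

print "nested typo gives undef\n" unless defined $nested_typo;
print "parallel typo rejected: $strict_error" unless $compiled;
```

The nested layout needs one hash per point in addition to the outer one, while the parallel layout needs three hashes in total however many points there are.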

Xho
 

rajasekaran.natarajan

Wow, it is marvellous, thanks Xho.
I like the second approach; it is great, like point_x{name_of_point} = x;
this will be damn efficient compared with the one I am
employing at present.
I have gone too far already with the old stuff, so I will rewrite the
program
from scratch using this. (It is OK, I have plenty of idle time ;-)

I never seriously thought about the various ways of storing data. Till now I
have used scripts (I would rather call them macros) just for simple tasks
which do not exceed 10 lines of code. This is the first time I am
implementing something a little bigger that deals with huge data. In our
application (mechanical/aerospace analysis) we process millions of records,
so I think this knowledge is unavoidable if I am going to do something
with them.

PS: I have implemented the old one without any problem and it is working
nicely, though as the data increases it takes quite some time. But I
have run into another strange problem, which I am posting in a new
thread.

Thanks to all who helped me.
 
