Confused by hashes/data structures

Ian Petts

I have a series of Squid logs I need to extract last-used dates from for
a set of users.

I want to read in a list of usercodes and user names from a file, then
go through the squid logs and for each user found, add a 'last used'
date to the user list.

I am struggling to get my head around Perl's data structures. I am not
sure whether I need a hash of arrays, array of hashes or a hash of
hashes.

I have been fumbling around trying to work out how to [best] create
and maintain the proper data structure for this situation, but my head
is spinning. The following seems to work to a point, but I can't work
out how to change the date value after it is initially created. Here
is what I've come up with so far:

#!/usr/bin/perl -w

# Temporarily disabled 'strict' as it increased my confusion while
# coming to terms with this.
#use strict;
use diagnostics;

# Create some base data. (Not required, but all part of the
# learning process).

@userdata = (
{
user => "jsmith",
desc => "John Smith",
date => 0,
},

{
user => "gjones",
desc => "Greg Jones",
date => 0,
},
);


# I can add users to the list, like this:
push @userdata, { user => "suser", desc => "Steve User", date => 0 };
push @userdata, { user => "ipetts", desc => "Ian Petts", date => 0 };
push @userdata, { user => "pcitizen", desc => "Pete Citizen", date => 0 };

# Let's have a look and see if I have what I think I do.
# This should print the lot:
for $href (@userdata) {
print "{ ";
for $thing ( keys %$href ) {
print "$thing=$href->{$thing} ";
}
print "}\n";
}

print "\n\n";

# And this should print out whatever field I want.
# But how do I get a specific user?

for $href (@userdata) {
print $href->{"user"} . " ";
print $href->{"desc"} . " ";
print $href->{"date"} . "\n";
}

print "\n\n";

# Now let's assume I've found this user in the log and want to update
# their last-used date in the list.
# This is wrong :-(
$userdata->{"user"}->{"ipetts"}->{"date"} = 12345 ;

# And print it out for a look.
for $href (@userdata) {
print "{ ";
for $thing ( keys %$href ) {
print "$thing=$href->{$thing} ";
}
print "}\n";
}


Could someone please point me in the right direction?

Thanks,
Ian.
 
Bob Walton

Ian Petts wrote:

....

I want to read in a list of usercodes and user names from a file, then
go through the squid logs and for each user found, add a 'last used'
date to the user list.

I am struggling to get my head around Perl's data structures. I am not
sure whether I need a hash of arrays, array of hashes or a hash of
hashes.
....


@userdata = (
{
user => "jsmith",
desc => "John Smith",
date => 0,
},

{
user => "gjones",
desc => "Greg Jones",
date => 0,
},
);
....


# Now let's assume I've found this user in the log and want to update
# their last-used date in the list.
# This is wrong :-(
$userdata->{"user"}->{"ipetts"}->{"date"} = 12345 ;


Well, @userdata is an array, right? So it must be accessed using a
subscript, as in something like:

$userdata[3]

for the fourth user, for example. That value can continue being accessed:

$userdata[3]->{date}=12345;

Note also that there is no need for quotes around barewords used in hash
keys.

Now, of course, this brings up the question of how do you know what
subscript to use for a given user. If you want to look up stuff by
userID, you have the wrong data structure. You could build yourself a
data structure that would provide the subscript for a given userID as
follows:

my $i = 0;
my %subs;
for (@userdata) { $subs{ $$_{user} } = $i++; }

Then you can do:

$userdata[$subs{gjones}]->{date}=12345;

for example. But note that in the event of multiple occurrences of the
same user, only the last one will be stored in %subs, so only the last
record for a given user will show up.
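
The index idea above, spelled out as a small self-contained sketch. The sample records and the 12345 placeholder come from the thread; the $rec variable is only there for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample records in the same array-of-hashes shape as the original post.
my @userdata = (
    { user => 'jsmith', desc => 'John Smith', date => 0 },
    { user => 'gjones', desc => 'Greg Jones', date => 0 },
    { user => 'ipetts', desc => 'Ian Petts',  date => 0 },
);

# Map each usercode to its subscript in @userdata.
# A repeated usercode would overwrite the earlier subscript.
my %subs;
for my $i ( 0 .. $#userdata ) {
    $subs{ $userdata[$i]{user} } = $i;
}

# Update one user's date through the index.
$userdata[ $subs{ipetts} ]{date} = 12345;

# $rec is a reference to the same hash, so it sees the new date.
my $rec = $userdata[ $subs{ipetts} ];
print "$rec->{desc}: $rec->{date}\n";
```

This keeps the array (and its order) and adds a lookup on the side, at the cost of keeping the two structures in step whenever a user is added or removed.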

# And print it out for a look.
for $href (@userdata) {
print "{ ";
for $thing ( keys %$href ) {
print "$thing=$href->{$thing} ";
}
print "}\n";
}


You can save yourself a lot of grief when looking at data structures for
debugging purposes by using Data::Dumper:

use Data::Dumper;
print Dumper(\@userdata);

for example.
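
A slightly fuller sketch of the same thing, in case it helps. $Data::Dumper::Sortkeys and $Data::Dumper::Indent are standard Data::Dumper settings; the sample data is copied from the thread, and storing the output in $dump is just a convenience for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # print hash keys in sorted order, so runs are comparable
$Data::Dumper::Indent   = 1;    # simpler, more compact indentation

my @userdata = (
    { user => 'jsmith', desc => 'John Smith', date => 0 },
    { user => 'gjones', desc => 'Greg Jones', date => 0 },
);

# Dumper takes a reference and returns the structure as a string of Perl code.
my $dump = Dumper( \@userdata );
print $dump;
```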
 
Ben Morrow

I am struggling to get my head around Perl's data structures. I am not
sure whether I need a hash of arrays, array of hashes or a hash of
hashes.

#!/usr/bin/perl -w

'use warnings' is better than -w.

# Temporarily disabled 'strict' as it increased my confusion while
# coming to terms with this.
#use strict;

The only thing you need to do is put 'my' before the first use of each
variable. I've put these in below, so you can see.

use diagnostics;

# Create some base data. (Not required, but all part of the
# learning process).

@userdata = (

my @userdata = (
{
user => "jsmith",
desc => "John Smith",
date => 0,

It's probably better to leave the date undefined if you haven't found
one for this user: a date of 0 means 00:00 on 1970-01-01, which isn't
very useful. Also, if you later print out the undefined value, you'll
get a warning, which is handy if someone's date failed to get set for
some reason.
},

{
user => "gjones",
desc => "Greg Jones",
date => 0,
},
);


# I can add users to the list, like this:
push @userdata, { user => "suser", desc => "Steve User", date => 0 };
push @userdata, { user => "ipetts", desc => "Ian Petts", date => 0 };
push @userdata, { user => "pcitizen", desc => "Pete Citizen", date => 0 };

# Let's have a look and see if I have what I think I do.
# This should print the lot:
for $href (@userdata) {

for my $href (@userdata) {

Or, in fact, you'd be *much* better off using the Data::Dumper module:

use Data::Dumper;

print Dumper \@userdata;

print "{ ";
for $thing ( keys %$href ) {
print "$thing=$href->{$thing} ";
}
print "}\n";
}

print "\n\n";

# And this should print out whatever field I want.
# But how do I get a specific user?

Whenever you ask yourself 'how do I find a specific X', the answer is a
hash, and X is the key to the hash. So in this case, your data structure
should look like:

my %userdata = (
jsmith => {
desc => 'John Smith',
date => undef,
},
gjones => {
desc => 'Greg Jones',
date => undef,
},
);

I just put the 'date' entries in to make things clearer: you can leave
them out, and they'll default to undef. Now you can add a user like
this:

$userdata{suser} = { desc => 'Steve User', date => undef };

and find it like:

print "jsmith is called $userdata{jsmith}{desc}.\n";

To iterate over them all, use:

for my $user (keys %userdata) {
print "$user is called $userdata{$user}{desc}\n";
}

for $href (@userdata) {
print $href->{"user"} . " ";
print $href->{"desc"} . " ";
print $href->{"date"} . "\n";
}

print "\n\n";

# Now let's assume I've found this user in the log and want to update
# their last-used date in the list.
# This is wrong :-(
$userdata->{"user"}->{"ipetts"}->{"date"} = 12345 ;

# say you have the user in $user and the date in $date:

$userdata{$user}{date} = $date;

# And print it out for a look.
for $href (@userdata) {
print "{ ";
for $thing ( keys %$href ) {
print "$thing=$href->{$thing} ";
}
print "}\n";
}

Again, use Data::Dumper;

A rule-of-thumb I use when building data structures is 'if I'm using an
array, that keeps things in order: is the order important here?'. If it
isn't, chances are you should be using a hash instead.
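
Putting the hash-of-hashes pieces together for the original task, here is a rough sketch rather than a finished script. It assumes Squid's native access.log layout, where the first whitespace-separated field is the epoch timestamp and the eighth is the usercode; that layout is configurable, so check your own logs. The log lines in @log_lines are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hash of hashes keyed by usercode; 'date' starts out undefined.
my %userdata = (
    jsmith => { desc => 'John Smith' },
    gjones => { desc => 'Greg Jones' },
);

# Made-up sample lines. ASSUMPTION: field 1 is the epoch timestamp and
# field 8 is the usercode, as in Squid's native log format.
my @log_lines = (
    '1066036250.129 110 10.0.0.5 TCP_MISS/200 1024 GET http://example.com/ jsmith DIRECT/192.0.2.1 text/html',
    '1066036300.500 95 10.0.0.6 TCP_HIT/200 2048 GET http://example.com/a gjones NONE/- text/html',
    '1066039999.000 80 10.0.0.5 TCP_MISS/200 512 GET http://example.com/b jsmith DIRECT/192.0.2.1 text/html',
    '1066040000.000 70 10.0.0.9 TCP_MISS/200 512 GET http://example.com/c nobody DIRECT/192.0.2.1 text/html',
);

for my $line (@log_lines) {
    my @field = split ' ', $line;
    my ( $time, $user ) = @field[ 0, 7 ];

    # Skip malformed lines and users who are not on the list.
    next unless defined $user && exists $userdata{$user};

    # Keep the latest timestamp seen for this user.
    my $seen = $userdata{$user}{date};
    $userdata{$user}{date} = $time if !defined $seen || $time > $seen;
}

for my $user ( sort keys %userdata ) {
    my $date = $userdata{$user}{date};
    printf "%-8s %-12s %s\n", $user, $userdata{$user}{desc},
        defined $date ? scalar localtime($date) : 'never seen';
}
```

In a real run the loop would read from the log file handle instead of @log_lines; the update logic stays the same.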

Ben
 
David K. Wall

Since user names are unique (usually) I think I'd use a hash
of hashes instead, like:

my %userdata = ( jsmith =>{ desc => "John Smith",
date => 0,
},
gjones =>{ desc => "Greg Jones",
date => 0,
},
);

[snip]

[print out data]
while( sort( keys %userdata ) ){
# error trapping ignored
print <<END;
User: $_ Desc: $userdata{$_}{'desc} Date: $userdata{$_}{'date}
END

}

Below are a couple of alternate ways to code this, just in case they'll
help. (Besides, I'd already altered this from the original array of hashes
form before I saw Steve May's post :)


for my $user (keys %userdata) {
print "{ ";
my $href = $userdata{$user};
for my $attribute ( keys %$href ) {
print "$attribute=$href->{$attribute} ";
}
print "}\n";
}
print "\n\n";

# another way to write it... maybe this will make it a bit clearer
for my $user (keys %userdata) {
print "{ ";
for my $attribute ( keys %{$userdata{$user}} ) {
print "$attribute=$userdata{$user}{$attribute} ";
}
print "}\n";
}
print "\n\n";

Data::Dumper might interest you (Ian) as well, if you get tired of writing
code to do screen dumps of data structures. The Data Structures Cookbook is
highly recommended if you haven't already read it. (perldoc perldsc)
 
Ian Petts

Steve May said:
Since user names are unique (usually) I think I'd use a hash
of hashes instead, like:

That's fine. Like I said, I was confused by all the options and I
wasn't sure which angle to tackle it from.

This is really good stuff, Steve. Thank you very much. Easy to read
and I think I actually understand it :)

The only hitch I have now (I think) is when printing out the hash:
while( sort( keys %userdata ) ){
# error trapping ignored
print <<END;
User: $_ Desc: $userdata{$_}{'desc} Date: $userdata{$_}{'date}
END

}

Perl complains with the following:

--- 8< ---
Useless use of sort in scalar context at ./try2.pl line 28 (#1)
(W void) You used sort in scalar context, as in :

my $x = sort @y;

This is not very useful, and perl currently optimizes this away.
--- >8 ---

What's going on here?

Thanks to everyone who has replied to my original post. ALL of the
suggestions and help are very much appreciated.

Regards,
Ian.
 
Eric Bohlman

(e-mail address removed) (Ian Petts) wrote in

You want for (syn. foreach) rather than while there. Right now the loop
says "keep running (without setting $_ to anything) as long as the return
value of sort(...) in a scalar context is nonzero." I think you got thrown

The second-level key here is:

desc} Date: $userdata{$_}{

because the single quotes try to match.
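
With both points applied (for instead of while, and no stray quote in the hash keys), the print loop might look something like the sketch below. The sample data is from earlier in the thread; collecting the text in $report before printing is only a convenience for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %userdata = (
    jsmith => { desc => 'John Smith', date => 0 },
    gjones => { desc => 'Greg Jones', date => 0 },
);

# for sets $_ to each sorted key in turn; while would only test a condition.
my $report = '';
for ( sort keys %userdata ) {
    $report .= <<END;
User: $_ Desc: $userdata{$_}{desc} Date: $userdata{$_}{date}
END
}
print $report;
```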
 
