DB_File (hash of array) problem

James

I am trying to write to a database a hash of array (as seen by
__DATA__ in the code).
But somehow the first element of the array is missing. Any idea why?
The second time, it is working correctly.

$ cat run.pl
use DB_File;
use vars qw($db $x %h $k $v $i $key $val);

($db) = @ARGV;

%h = ();
$x = tie %h, "DB_File", $db, O_RDWR|O_CREAT, 0640, $DB_HASH;
write_db();
read_db();
untie %h;
undef $x;

sub write_db {
print "=== write $db ===\n";
for (<DATA>)
{
($k, @v) = split;
print "$k -> @v\n";
$r = \@v;
for $i (0..$#v) {
$h{$k}->[$i] = $r->[$i];
}
}
}

sub read_db {
print "=== read $db ===\n";
for ( $status = $x->seq($key, $val, R_FIRST); $status == 0; $status =
$x->seq($key, $val, R_NEXT) )
{
print "$key -> @{$val}\n";
}
}

__DATA__
aa 1 2 3
cc 678 99
zz foo fee fuu fun

$ ./run.pl mydb
=== write mydb ===
aa -> 1 2 3
cc -> 678 99
zz -> foo fee fuu fun
=== read mydb ===
aa -> 2 3
cc -> 99
zz -> fee fuu fun



TIA
JL
 
Bart Lateur

James said:
> I am trying to write to a database a hash of array (as seen by
> __DATA__ in the code).
> But somehow the first element of the array is missing. Any idea why?
> for $i (0..$#v) {
> $h{$k}->[$i] = $r->[$i];
> }

Try a 1-based array for the DB.

for $i (0..$#v) {
$h{$k}->[$i+1] = $r->[$i];
}

> The second time, it is working correctly.

Now that is just plain weird.
 
Uri Guttman

J> I am trying to write to a database a hash of array (as seen by
J> __DATA__ in the code).
J> But somehow the first element of the array is missing. Any idea why?
J> The second time, it is working correctly.

as bart says, that is weird. but your code is weird too.

and what do you mean by first element of which array? you have several
instances of @v. all of them miss the first element? is this before or
after you write the hash? you need to be more specific about errors like
this.

J> $ cat run.pl
J> use DB_File;
J> use vars qw($db $x %h $k $v $i $key $val);

use vars is very old and mostly obsolete. lexicals are used now and you
should declare them when first used.

J> ($db) = @ARGV;

and what if @ARGV is empty? check for this. also your names are very
short and not informative. also below i remove several unneeded vars as
well.


my $db_name = shift @ARGV or die "must pass in a db name" ;

J> %h = ();

no need to assign () as hashes are always empty when declared or first
used.

J> $x = tie %h, "DB_File", $db, O_RDWR|O_CREAT, 0640, $DB_HASH;

you don't check that for an error either.

J> write_db();

you are using globals all over. sure, this is a short program but it is
a bad habit to get into.

J> read_db();

J> untie %h;
J> undef $x;

no need for those as they will be cleared upon program exit

J> sub write_db {
J> print "=== write $db ===\n";

ever heard of indenting code?

J> for (<DATA>)

it is poor style to use $_ as much as you do here. named vars are better
and safer too ($_ is a global and can be modified elsewhere).

J> {
J> ($k, @v) = split;

you know indenting. just oddly done.

my( $key, @values ) = split;

J> print "$k -> @v\n";
J> $r = \@v;
J> for $i (0..$#v) {
J> $h{$k}->[$i] = $r->[$i];
J> }

no need for any of that code:

$h{$k} = [@v] ;

that is all you did there. if @v were declared in the loop with my, then
you could just do:

$h{$k} = \@values ;

J> }
J> }

J> sub read_db {
J> print "=== read $db ===\n";
J> for ( $status = $x->seq($key, $val, R_FIRST); $status == 0; $status =
J> $x->seq($key, $val, R_NEXT) )

you tied the hash so why not use the hash interface, keys, values,
each. no one uses the db interface as it is noisy

while( my( $key, $val ) = each( %h ) ) {

isn't that a bit easier to read? tie is nice in allowing a cleaner hash
api vs some clunky db api.

also it might eliminate your bug. i don't know the dbfile api so i can't
tell why you have an off by one error only on the first run.

J> {
J> print "$key -> @{$val}\n";
J> }

J> $ ./run.pl mydb
J> === write mydb ===
J> aa -> 1 2 3
J> cc -> 678 99
J> zz -> foo fee fuu fun
J> === read mydb ===
J> aa -> 2 3
J> cc -> 99
J> zz -> fee fuu fun

and the second run works fine? show that output. does it still work if
you delete the dbfile between runs?

uri
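
Pulling uri's style points together, here is a minimal sketch of what the script might look like with lexicals, an error-checked tie, and each() in place of seq(). The scratch path from File::Temp and the helper names open_db, write_db and read_db are assumptions made so the sketch runs on its own; the original takes the db name from @ARGV. Values are joined into plain strings here, since later replies in the thread point out that DB_File stores only strings.

```perl
use strict;
use warnings;
use DB_File;
use File::Temp qw(tempdir);

# Tie a lexical hash to the named file and die with a reason on failure.
sub open_db {
    my ($db_name) = @_;
    defined $db_name or die "must pass in a db name\n";
    tie my %h, 'DB_File', $db_name, O_RDWR|O_CREAT, 0640, $DB_HASH
        or die "cannot tie '$db_name': $!\n";
    return \%h;
}

# Store each "key v1 v2 ..." line as key => "v1 v2 ..." (a flat string).
sub write_db {
    my ($h, @lines) = @_;
    for my $line (@lines) {
        my ($key, @values) = split ' ', $line;
        $h->{$key} = "@values";
    }
}

# Walk the tied hash through the ordinary hash interface.
sub read_db {
    my ($h) = @_;
    my @out;
    while ( my ($key, $val) = each %$h ) {
        push @out, "$key -> $val";
    }
    return sort @out;
}

# Hypothetical scratch path standing in for the command-line argument.
my $db_name = tempdir(CLEANUP => 1) . '/sketch.db';
my $h = open_db($db_name);
write_db($h, 'aa 1 2 3', 'cc 678 99', 'zz foo fee fuu fun');
my @report = read_db($h);
print "$_\n" for @report;
untie %$h;
```

Passing the hash reference into each sub is one way to avoid the globals uri objects to; it is a design choice for this sketch, not something the thread prescribes.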
 
Xho Jingleheimerschmidt

James said:
> I am trying to write to a database a hash of array (as seen by
> __DATA__ in the code).
> But somehow the first element of the array is missing. Any idea why?

I don't believe DB_File supports nested data structures.

> The second time, it is working correctly.
>
> $ cat run.pl
> use DB_File;
> use vars qw($db $x %h $k $v $i $key $val);

Ugg. Scope variables to the smallest scope you can.

You should use strict.

You seem to be accidentally using symbolic references.

The inner data never got stored in DB_File in the first place; it is
only stored in Perl's memory.

> for $i (0..$#v) {
> $h{$k}->[$i] = $r->[$i];
> }

The first time through, an array is auto-vivified, and it contains
$r->[0]. When a reference to this array is stuffed into $h{$k}, it gets
stringified to something like 'ARRAY(0x825c3dc)' because the tied hash
only accepts strings, not array references. At that point, the array
and the string become disconnected from each other, and the value of
that auto-vivified array, the copy of $r->[0], is lost.

On the second and subsequent iterations, you are using a symbolic
reference to a variable with the peculiar name 'ARRAY(0x825c3dc)', into
which you stuff the remaining values.

> sub read_db {
> print "=== read $db ===\n";
> for ( $status = $x->seq($key, $val, R_FIRST); $status == 0; $status =
> $x->seq($key, $val, R_NEXT) )
> {
> print "$key -> @{$val}\n";

At this point, you are pulling the values out of the peculiarly named
variable using symbolic references.

If you separate your program so the perl instance that reads the DB is
not the same one that created it, you will find the values never got
stored to the DB in the first place.

Xho
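
The mechanism Xho describes can be checked in isolation. The sketch below is an illustration, not code from the thread: it assumes DB_File is installed and uses an in-memory database (undef as the filename) so nothing is left on disk. It stores an array reference, shows that only the string form comes back, and then shows what that string does with and without strict refs.

```perl
use strict;
use warnings;
use DB_File;

# undef as the filename asks DB_File for an in-memory database.
tie my %h, 'DB_File', undef, O_RDWR|O_CREAT, 0640, $DB_HASH
    or die "cannot create in-memory database: $!\n";

# Storing a reference stores only its string form.
$h{aa} = [1, 2, 3];
my $stored = $h{aa};
print "stored: $stored\n";
print "still a reference? ", (ref $stored ? "yes" : "no"), "\n";

# Under strict refs, using that string as an array is a fatal error.
my @back  = eval { @{ $h{aa} } };
my $error = $@;
print "strict refs says: $error";

# Without strict refs the same string silently names a package array.
my @symbolic;
{
    no strict 'refs';
    $stored->[1] = 2;          # writes to the global array named by the string
    @symbolic = @{$stored};    # element 0 was never set, so it is undef
}
print "symbolic array has ", scalar(@symbolic), " elements\n";
```

The 1, 2, 3 in the original anonymous array are gone by the time the value is fetched; only a string of the form ARRAY(0x...) survives the round trip through the tie.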
 
James


Thanks for your reply. Where can I find a reference to your statement,

"because the tied hash only accepts strings, not array references."

Anyway, I've rewritten the run.pl script and run it twice; see below.
The second time, it looks as though the reference to an array is
working, but I may be wrong.


$ cat run.pl
use strict vars;
use DB_File;
my ($db) = @ARGV;
my %h = ();
my $x = tie %h, "DB_File", $db, O_RDWR|O_CREAT, 0640, $DB_HASH;

print "=== write $db ===\n";
my $k;
my @v;
my $r;
for (<DATA>)
{
@v = ();
($k, @v) = split;
$r = \@v;
for my $i (0..$#v)
{
$h{$k}->[$i] = $r->[$i];
print "($k -> $i -> ", $h{$k}->[$i], ") ";
}
print "\n";
}

print "=== read $db ===\n";
my $st;
my $key;
my $val;
my @val;
for ( $st = $x->seq($key, $val, R_FIRST); $st == 0; $st =
$x->seq($key, $val, R_NEXT) )
{
@val = @{$val};
for my $i (0..$#val) {
print "($key -> $i -> $val[$i]) ";
}
print "\n";
}

untie %h;
undef $x;

__DATA__
aa 1 2 3
cc 678 99
zz foo fee fuu fun


$ rm testdb
$ ./run.pl testdb
=== write testdb ===
(aa -> 0 -> ) (aa -> 1 -> 2) (aa -> 2 -> 3)
(cc -> 0 -> ) (cc -> 1 -> 99)
(zz -> 0 -> ) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)
=== read testdb ===
(aa -> 0 -> ) (aa -> 1 -> 2) (aa -> 2 -> 3)
(cc -> 0 -> ) (cc -> 1 -> 99)
(zz -> 0 -> ) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)

$ ./run.pl testdb
=== write testdb ===
(aa -> 0 -> 1) (aa -> 1 -> 2) (aa -> 2 -> 3)
(cc -> 0 -> 678) (cc -> 1 -> 99)
(zz -> 0 -> foo) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)
=== read testdb ===
(aa -> 0 -> 1) (aa -> 1 -> 2) (aa -> 2 -> 3)
(cc -> 0 -> 678) (cc -> 1 -> 99)
(zz -> 0 -> foo) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)



JL
 
Xho Jingleheimerschmidt

James said:
> Thanks for your reply. Where can I find a reference to your statement,
>
> "because the tied hash only accepts strings, not array references."

I don't have a reference, just empirical evidence. (Use Data::Dumper
to dump your hash, and it shows stringified used-to-be-references.)

Also, DB_File is a Perl wrapper around a C library. I would not expect
a C library to support Perl nested structures, and if the Perl wrapper
took great pains to emulate such support, it probably would have been
mentioned.

> Anyway, I've rewritten the run.pl script and run it twice; see below.
> The second time, it looks as though the reference to an array is
> working, but I may be wrong.

What you need to do is prevent the hash from getting created during one
execution. An easy way to do that is to wrap
if (@ARGV<=1) { ...}
around the part that populates (writes) the hash. Then you can suppress
the population of the hash by supplying an extra argument.

Or just break the populating and the reading into different scripts.


What you will find is that the nested parts of the hash are being
written into Perl's memory, not onto disk.

> $ cat run.pl
> use strict vars;

The issue you are having is not covered by strict vars, but rather is
covered by strict refs.

> $ rm testdb
> $ ./run.pl testdb
> === write testdb ===
> (aa -> 0 -> ) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> ) (cc -> 1 -> 99)
> (zz -> 0 -> ) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)
> === read testdb ===
> (aa -> 0 -> ) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> ) (cc -> 1 -> 99)
> (zz -> 0 -> ) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)

At this point, the strings that look like references but aren't have
been written to disk under DB_File.

> $ ./run.pl testdb
> === write testdb ===
> (aa -> 0 -> 1) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> 678) (cc -> 1 -> 99)
> (zz -> 0 -> foo) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)

Here, the array is not autovivified, because the hash (being tied to a
previously populated disk file) already has entries, strings that look
like references, but aren't. So now the first value of each set is
picked up and stuffed into the funnily named variables using symbolic
references, rather than being lost as before.

> === read testdb ===
> (aa -> 0 -> 1) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> 678) (cc -> 1 -> 99)
> (zz -> 0 -> foo) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)

Now you are printing out the things you just stuffed into memory.

If you change the code so it just ties the hash and skips the "writing"
part and goes right to read, you will find you get no output.

Xho
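
Nobody in the thread spells out a fix, so the following is only one possible workaround, sketched under assumptions: since the tied hash keeps strings, each array is flattened to a string on the way in and split again on the way out. Joining on a space is safe only because the __DATA__ values contain no whitespace; the scratch path is hypothetical. The reader uses a fresh tie after the writer has untied, which stands in for the separate process Xho suggests.

```perl
use strict;
use warnings;
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical scratch file; the original script takes the name from @ARGV.
my $db_name = tempdir(CLEANUP => 1) . '/flat.db';

# Writer: flatten each array into one string before storing it.
{
    tie my %h, 'DB_File', $db_name, O_RDWR|O_CREAT, 0640, $DB_HASH
        or die "cannot tie '$db_name' for writing: $!\n";
    $h{aa} = join ' ', 1, 2, 3;
    $h{cc} = join ' ', 678, 99;
    untie %h;
}

# Reader: a fresh tie sees only what actually reached the file.
my %arrays;
{
    tie my %h, 'DB_File', $db_name, O_RDONLY, 0640, $DB_HASH
        or die "cannot tie '$db_name' for reading: $!\n";
    while ( my ($key, $flat) = each %h ) {
        $arrays{$key} = [ split ' ', $flat ];
    }
    untie %h;
}

print "$_ -> @{ $arrays{$_} }\n" for sort keys %arrays;
```

If the values could contain spaces, a different separator or a real serializer would be needed; that choice is outside what the thread covers.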
 
