D
dalyea
I am running into a memory fault error when processing the
contents of a database table into a lookup hash.
The problem is that I store item equivalencies (IE) in a db table IE
with primary key (a, b). a and b are integers and correspond
to productId values.
For IE products a and b, I insert as primary key (a, b) and, notably,
leave out (b, a). This obviously reduces the IE table size by 50%.
To build a single lookup data structure prior to writing to a lookup
file,
I try to create hash %ie which for each row has:
$ie{a}{b}=1;
$ie{b}{a}=1;
Then to get all the IE for a product a, one just looks at the slice
%ie{a} and gets all the values b1, b2, b3, ... for it.
The IE table has 1.4 million rows for 80,000 products, and sometimes
(b, a) is written [which is not a problem really]. When I create %ie,
it must
be upwards of 2.7 million rows, and hence, I am getting a memory fault
error.
The problem is, can I use a better (meaning smaller) data structure
to capture the IE data?
I could lookup each product's IE data as needed, but that would be
80,000+ single queries to the database. I don't even want to try
that.
Also, my algorithm for matching products as IE pretty much puts
all combinations of (a, b, c, d) in the table for products a b c d.
There are 4c2 or 6 combinations, not counting order, but up to 12
combinations if pairs such as (a, b) and (b, a) are entered. But it
is
possible that (a, b) and (b, c) exist, but not (a, c). Therefore, my
%ie
structure is technically incomplete as it is today. (Though I know
from the way I entered IE data that is probably 99% complete.)
The point is, a single query won't necessarily get all the IE data
for a single product. The whole data structure should be pre-loaded
all at once, as I'm trying to do.
Any ideas?
Thanks,
David
contents of a database table into a lookup hash.
The problem is that I store item equivalencies (IE) in a db table IE
with primary key (a, b). a and b are integers and correspond
to productId values.
For IE products a and b, I insert as primary key (a, b) and, notably,
leave out (b, a). This obviously reduces the IE table size by 50%.
To build a single lookup data structure prior to writing to a lookup
file,
I try to create hash %ie which for each row has:
$ie{a}{b}=1;
$ie{b}{a}=1;
Then to get all the IE for a product a, one just looks at the slice
%ie{a} and gets all the values b1, b2, b3, ... for it.
The IE table has 1.4 million rows for 80,000 products, and sometimes
(b, a) is written [which is not a problem really]. When I create %ie,
it must
be upwards of 2.7 million rows, and hence, I am getting a memory fault
error.
The problem is, can I use a better (meaning smaller) data structure
to capture the IE data?
I could lookup each product's IE data as needed, but that would be
80,000+ single queries to the database. I don't even want to try
that.
Also, my algorithm for matching products as IE pretty much puts
all combinations of (a, b, c, d) in the table for products a b c d.
There are 4c2 or 6 combinations, not counting order, but up to 12
combinations if pairs such as (a, b) and (b, a) are entered. But it
is
possible that (a, b) and (b, c) exist, but not (a, c). Therefore, my
%ie
structure is technically incomplete as it is today. (Though I know
from the way I entered IE data that is probably 99% complete.)
The point is, a single query won't necessarily get all the IE data
for a single product. The whole data structure should be pre-loaded
all at once, as I'm trying to do.
Any ideas?
Thanks,
David