Marc Espie
I'm looking at a script that handles a huge amount of data: basically,
the filenames from 4000+ packages, in order to recognize conflicts.
Currently, it builds a big hash through a loop that constructs
$all_conflict like this:
my $file = File::Spec->canonpath($self->fullname());
push @{ $all_conflict->{$file} }, $pkgname;
I end up with a hash of 250M.
I can't really use Devel::Size with any benefit, since all the data
is one single structure (all the other data in the program amounts to 2-3M).
I expect the $pkgname strings to be shared. In fact, I tried replacing
$pkgname with \$pkgname, which actually made things worse (+20M).
I've tried splitting the path along `frequent' directories, with inconclusive
gains (most of the files live under /usr/local, /etc, or /var/www):
-20M at most.
I'm looking for bright ideas to try and reduce the size used... without
being too detrimental in terms of speed...
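One direction worth trying (a sketch only, not the actual pkg_add code; the %pkg_id, @pkg_name and note_file names are made up for illustration) is to intern each package name as a small integer and keep one plain scalar of space-separated ids per filename, instead of one array ref per filename. With 4000 packages the ids are short, and a single string per key avoids the ~100 bytes of per-array overhead that a hash of array refs pays for every file:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

my %pkg_id;        # package name -> small integer id
my @pkg_name;      # id -> package name (for decoding later)
my %all_conflict;  # filename -> space-separated ids, one scalar per file

sub note_file {
    my ($file, $pkgname) = @_;
    my $id = $pkg_id{$pkgname} //= do {
        push @pkg_name, $pkgname;
        $#pkg_name;
    };
    if (exists $all_conflict{$file}) {
        $all_conflict{$file} .= " $id";   # append to one scalar
    } else {
        $all_conflict{$file} = "$id";     # no array ref allocated
    }
}

# made-up sample data
note_file('/usr/local/bin/foo', 'foo-1.0');
note_file('/usr/local/bin/foo', 'bar-2.1');

for my $file (sort keys %all_conflict) {
    my @pkgs = map { $pkg_name[$_] } split ' ', $all_conflict{$file};
    say "$file: @pkgs";
}
```

The decode step (split plus an array lookup) costs a little CPU at conflict-report time, but the hot insertion path stays a single string append, so the speed hit should be small relative to the memory saved.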