: I think "plover"'s comments are referring to Unix systems. Recent AS
: installs come with everything needed for DB_File. I'm not sure how
: Tie::File is implemented -- since it deals with an ordinary text file,
: it needs to rewrite the entire file if any given line changed length
: during a modification. For large files (where tied hashes to DBM-type
: files are useful), rewriting the entire file has to be time-consuming.
: I have found tied hashes are very advantageous for typical CGI programs
: which need to open/access/modify/close quickly on very large sets of
: data, but where the complexity of a full-blown external database and its
: connection overhead is not required (and yes, I know about mod_perl to
: avoid that overhead).
I wrote about the performance issues in the other thread. Hashes risk
being memory-intensive. I cannot analyse the problem if the program
crashes, because my colleague is not a computer specialist; I have to
program as conservatively as possible.
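For reference, the tied-hash approach described above looks roughly like
this (a minimal sketch; the filename "data.db" and the key are
placeholders, not anything from my actual program):

```perl
use strict;
use warnings;
use Fcntl;
use DB_File;

# Tie a hash to a Berkeley DB file. Records are fetched from disk
# on access, so memory use stays proportional to the records
# touched, not to the size of the whole data set.
tie my %db, 'DB_File', 'data.db', O_RDWR|O_CREAT, 0644, $DB_HASH
    or die "Cannot tie data.db: $!";

$db{user42} = 'some value';   # written through to the DB file
my $v = $db{user42};          # read back from disk

untie %db;                    # flush buffers and close the file
```

Open/store/close is cheap enough for a CGI request, which is exactly the
use case mentioned above.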
: >
: > As far as the performance is concerned, I think that for all MY concerns,
: > Tie::File is much more practical than DB_File as it only keeps in memory
: > the line/record I am working on, so things should actually speed up at
: > start and end of program runs.
: Also note that Tie::File (in its docs) states that by default the
: *entire file* following that record is written every time a record is
: changed. Deferred writing can be turned on, but even then the entire
: file following the first modified record is written when the connection
: is shut down. So if record 0 of a 1000000 record file is changed, the
: entire file is rewritten.
Not quite: it rewrites everything _after_ the record in question.
However, even seek operations over 10 files, each with 19,000 records,
can take up to 12 seconds on a reasonable machine (1000 MHz, 256 MB).
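For anyone following along, the deferred-writing mode mentioned above is
enabled per the Tie::File docs like this (a sketch; "big.txt" is a
placeholder file):

```perl
use strict;
use warnings;
use Tie::File;

# Each element of @lines maps to one line of the file; only the
# records actually accessed are cached in memory.
tie my @lines, 'Tie::File', 'big.txt'
    or die "Cannot tie big.txt: $!";

# Batch several modifications so the tail of the file is rewritten
# once on flush/untie instead of once per assignment. The rewrite
# of everything after the first changed record still happens.
(tied @lines)->defer;
$lines[0] = 'new first record';
$lines[1] = 'new second record';
(tied @lines)->flush;

untie @lines;
```

So deferral reduces how often the tail is rewritten, not whether it is
rewritten at all.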
: Yet another alternative is to use fixed-length records in a file. Then
: the offset into the file is computable, and one may use seek() and
: tell() to hop around as one pleases. And one need not even waste space
: on record separator characters. Of course, that only works if the
: length of the biggest record to be stored has a reasonable upper bound
: which isn't much bigger than the typical record length.
Not in my case: there is a ratio of 1 to 50 between the shortest and
longest record, so padding every record to the maximum length would
waste far too much space.
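For completeness, the fixed-length scheme described above would look
roughly like this (a sketch; $RECLEN and "records.dat" are assumptions,
not values from this thread):

```perl
use strict;
use warnings;

my $RECLEN = 64;    # assumed upper bound on record length

open my $fh, '+>', 'records.dat' or die "open: $!";

sub write_record {
    my ($n, $data) = @_;
    # record N starts at byte N * $RECLEN, so no scanning is needed
    seek $fh, $n * $RECLEN, 0 or die "seek: $!";
    # pack pads (or truncates) to exactly $RECLEN bytes
    print $fh pack("A$RECLEN", $data);
}

sub read_record {
    my ($n) = @_;
    seek $fh, $n * $RECLEN, 0 or die "seek: $!";
    read $fh, my $buf, $RECLEN;
    $buf =~ s/\s+\z//;    # strip the space padding again
    return $buf;
}

write_record(2, 'third record');
my $r = read_record(2);   # direct access by record number
close $fh;
```

With my 1:50 length ratio, padding every record to $RECLEN would bloat
the files by roughly that factor, which is why it does not fit here.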
Thanks again for the discussion,
Oliver.