Deleting records from DB_File doesn't decrease file size?

Discussion in 'Perl Misc' started by botfood, Sep 16, 2006.

  1. botfood

    botfood Guest

    I am working on a DB cleanup tool that purges records from a tie()ed
    file created with DB_File. I have a first pass done on the tool, and it
    seems to be functioning. On a test DB it correctly reports that it
    found about 2500 'old' records according to my criteria, and deleted
    them...

    Problem is that the file itself, which was about 10.3MB for 14k
    records, did not change size after 2500 records were deleted. I sort of
    expected some reduction in file size?!

    Is there some kind of 'compact' or cleanup utility that I need to run
    on the database to squeeze out the empty holes or something?


    snippets shown... not working code
    =======
    ....
    use DB_File;

    tie( %tempHash, 'DB_File', "${cfgRelPath_cgi2DB}/${dbfile}" )
        or die "Cannot tie ${cfgRelPath_cgi2DB}/${dbfile}: $!";


    then later in a loop I delete a specific record with something like
    this:

    delete $tempHash{$tempKey};

    =================
    botfood, Sep 16, 2006
    #1

  2. Paddy

    Paddy Guest

    botfood wrote:
    > Problem is that the file itself, which was about 10.3MB for 14k
    > records, did not change size after 2500 records were deleted. I sort of
    > expected some reduction in file size?!
    >
    > Is there some kind of 'compact' or cleanup utility that I need to run
    > on the database to squeeze out the empty holes or something?

    Sometimes the DB's optimisations mean that record 'deletions' only mark
    the record as unused. This is rather like some file systems: when you
    delete a file, although it is no longer listed in the directory, other
    programs that look at the underlying disk structure may well be able to
    pick up data that was in the deleted file.

    One way to recover the space may be to just copy active records to
    another DB file, double check the copy, then delete the original. (But
    not if the DB can be accessed concurrently etc).
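
    A minimal sketch of that idea (the file names and the keep_record()
    test below are placeholders, not code from this thread):

    use strict;
    use warnings;
    use DB_File;
    use Fcntl;

    my ( %old, %new );
    tie %old, 'DB_File', 'records.db', O_RDONLY, 0644
        or die "Cannot open records.db: $!";
    tie %new, 'DB_File', 'records.compact.db', O_RDWR | O_CREAT, 0644
        or die "Cannot create records.compact.db: $!";

    # Copy only the live records; the dead space stays behind in the old
    # file, so the new one comes out smaller.
    while ( my ( $key, $value ) = each %old ) {
        $new{$key} = $value if keep_record( $key, $value );
    }

    untie %old;
    untie %new;

    sub keep_record { return 1 }    # placeholder for the real 'active record' test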

    - Paddy.
    Paddy, Sep 16, 2006
    #2

  3. botfood

    botfood Guest

    Paddy wrote:
    > One way to recover the space may be to just copy active records to
    > another DB file, double check the copy, then delete the original. (But
    > not if the DB can be accessed concurrently etc).
    > ----------------


    Seems to work!
    I changed my utility to write the 'keepers' to a new DB and the 'old'
    records to another, then deleted the original, renamed the purged one, and
    archived the old-record DB. This procedure DID result in a smaller file
    when I was done.

    thanks.
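
    For the record, the purge pass boils down to roughly this (paths and
    the is_old() test are simplified stand-ins for what the tool really
    does):

    use strict;
    use warnings;
    use DB_File;
    use Fcntl;

    my ( %live, %keep, %old );
    tie %live, 'DB_File', 'data.db', O_RDONLY, 0644
        or die "Cannot open data.db: $!";
    tie %keep, 'DB_File', 'data.keep.db', O_RDWR | O_CREAT, 0644
        or die "Cannot create data.keep.db: $!";
    tie %old, 'DB_File', 'data.archive.db', O_RDWR | O_CREAT, 0644
        or die "Cannot create data.archive.db: $!";

    while ( my ( $key, $value ) = each %live ) {
        if   ( is_old( $key, $value ) ) { $old{$key}  = $value }   # archive it
        else                            { $keep{$key} = $value }   # keeper
    }

    untie %live;
    untie %keep;
    untie %old;

    # Swap the compact copy into place and keep the archive for reference.
    unlink 'data.db' or die "unlink: $!";
    rename 'data.keep.db', 'data.db' or die "rename: $!";

    sub is_old { return 0 }    # placeholder for the real 'old record' criterion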

    The REASON this was required hasn't gone away, and I'm not quite sure
    what to do about it long term. It's not a Perl issue, but a web
    server memory issue... I have a site running on Apache with this
    pretty big database, which grew to about 20k records and a total file
    size of around 12MB. It recently acted very strangely and would
    not write any new records. My best guess at this point is that settings
    like the Apache::SizeLimit have something to do with it, but it remains
    to be seen whether the Host is willing to alter config files for me.

    The Perl part of this is that I'm open to what people may suggest as
    ways to reduce the memory used by any single process accessing the DB.
    Especially if I have a report that needs to go through the whole DB, how
    can I reduce the hit on the server? Would it work to split out record
    contents into a couple of different 'tables' and pull them in if required?

    I had thought that by tie()ing to a file on disk, I'd avoid eating up
    RAM and process memory, but I don't really understand memory and paging
    and all that...

    comments? ideas?

    d
    botfood, Sep 16, 2006
    #3
  4. J. Gleixner

    J. Gleixner Guest

    botfood wrote:

    > The REASON this was required hasn't gone away, and I'm not quite sure
    > what to do about it long term. It's not a Perl issue, but a web
    > server memory issue... I have a site running on Apache with this
    > pretty big database, which grew to about 20k records and a total file
    > size of around 12MB.


    That sounds more like a pretty small database.

    > It recently acted very strangely and would
    > not write any new records. My best guess at this point is that settings
    > like the Apache::SizeLimit have something to do with it, but it remains
    > to be seen whether the Host is willing to alter config files for me.

    They/You should be able to determine if it's a memory issue, before
    blindly changing things.

    >
    > The Perl part of this is that I'm open to what people may suggest as
    > ways to reduce the memory used by any single process accessing the DB.
    > Especially if I have a report that needs to go through the whole DB, how
    > can I reduce the hit on the server? Would it work to split out record
    > contents into a couple of different 'tables' and pull them in if required?
    >
    > I had thought that by tie()ing to a file on disk, I'd avoid eating up
    > RAM and process memory, but I don't really understand memory and paging
    > and all that...
    >
    > comments? ideas?


    Only a small part of the DBM is in memory at one time, so I'd doubt it's
    a memory issue with DBM access. The file size of the DBM really doesn't
    matter. More than likely, if it is a memory issue, it's in how you are
    using that data (e.g. looping through the records and storing them in a
    data structure) that is causing a problem. Since you don't provide any
    code, that's only a guess.
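
    For example (a generic sketch, not based on your code), the difference
    between pulling the whole DBM into memory and walking it a record at a
    time is roughly this:

    use strict;
    use warnings;
    use DB_File;

    my %db;
    tie %db, 'DB_File', 'data.db' or die "Cannot tie data.db: $!";

    # Memory-hungry: this copies every key and value into plain Perl
    # structures, duplicating the whole database in RAM.
    # my %copy = %db;

    # Lighter: each() walks the tied file one record at a time, so only
    # the current key/value pair is held in memory.
    my $count = 0;
    while ( my ( $key, $value ) = each %db ) {
        $count++;    # accumulate whatever the report needs as you go
    }
    print "records: $count\n";

    untie %db;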

    Since it wouldn't write new records, it might have been corrupted, so be
    sure to lock the DBM appropriately. See the documentation for "Locking:
    The Trouble with fd" in perldoc DB_File for possible solutions.
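
    One common pattern (a generic sketch, not taken verbatim from the
    DB_File docs) is to flock() a separate sentinel file before the tie and
    release it only after untie:

    use strict;
    use warnings;
    use DB_File;
    use Fcntl qw(:flock O_RDWR O_CREAT);

    # Take an exclusive lock *before* tying, so no other process has the
    # database open while this one writes to it.
    open my $lock, '>', 'data.db.lock' or die "Cannot open lock file: $!";
    flock $lock, LOCK_EX or die "Cannot lock: $!";

    my %db;
    tie %db, 'DB_File', 'data.db', O_RDWR | O_CREAT, 0644
        or die "Cannot tie data.db: $!";

    $db{some_key} = 'some value';    # writes happen while the lock is held

    untie %db;      # flush and close the DBM first...
    close $lock;    # ...then release the lock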
    J. Gleixner, Sep 18, 2006
    #4
