removing data in files

blubzouf

I am looking for information about accessing files with the stdio functions.

I am able to open a file, read from it with fread, write to it with
fwrite, and modify its data in "r+" mode (without truncating or
appending), but I have never found out how to remove a piece of data
inside a file. For example, given the sequences "aaaa" "xxxx" "bbbb", I need
to remove the second sequence and end up with "aaaa" "bbbb", WITHOUT rewriting
the whole file...

Is that really possible?
For example, how do databases like SQLite delete rows so quickly in
big files? Maybe they don't rewrite all the data, do they?

Somebody told me about "external data structures", but I couldn't find out
what he meant.
 
Richard Bos

> I am able to open a file, read from it with fread, write to it with
> fwrite, and modify its data in "r+" mode (without truncating or
> appending), but I have never found out how to remove a piece of data
> inside a file. For example, given the sequences "aaaa" "xxxx" "bbbb", I need
> to remove the second sequence and end up with "aaaa" "bbbb", WITHOUT rewriting
> the whole file...

Ah, ye antient and extincte art of faqqe-readinge...
> For example, how do databases like SQLite delete rows so quickly in
> big files? Maybe they don't rewrite all the data, do they?

Nope. Generally, they just mark the row "deleted", and only really
delete the marked rows in one great periodical purging operation, or
they re-use marked rows.

Richard
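
The mark-as-deleted idea described above could be sketched roughly as follows. This is only an illustration and not how SQLite or any particular database stores rows: the on-disk layout (one flag byte followed by a 4-byte payload), the names put_record, mark_deleted and read_live, and the constants are all assumptions made up for the example. Deleting a record overwrites a single byte in place, so nothing after it has to move.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout: each record is 1 flag byte + PAYLOAD_SIZE data bytes. */
enum { PAYLOAD_SIZE = 4, RECORD_SIZE = 1 + PAYLOAD_SIZE };

/* Write a live record into slot `index`. Returns 0 on success, -1 on error. */
static int put_record(FILE *fp, long index, const char *payload)
{
    unsigned char buf[RECORD_SIZE];

    assert(index >= 0);
    buf[0] = 0;                               /* 0 = live */
    memcpy(buf + 1, payload, PAYLOAD_SIZE);
    if (fseek(fp, index * RECORD_SIZE, SEEK_SET) != 0)
        return -1;
    return fwrite(buf, 1, RECORD_SIZE, fp) == RECORD_SIZE ? 0 : -1;
}

/* "Delete" slot `index` by overwriting only its flag byte. */
static int mark_deleted(FILE *fp, long index)
{
    unsigned char flag = 1;                   /* 1 = marked deleted */

    assert(index >= 0);
    if (fseek(fp, index * RECORD_SIZE, SEEK_SET) != 0)
        return -1;
    if (fwrite(&flag, 1, 1, fp) != 1)
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}

/* Concatenate the payloads of all live records into `out` as a string.
   Returns the number of live records, or -1 on error. */
static long read_live(FILE *fp, char *out, size_t out_size)
{
    unsigned char buf[RECORD_SIZE];
    size_t used = 0;
    long count = 0;

    if (out_size == 0 || fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    while (fread(buf, 1, RECORD_SIZE, fp) == RECORD_SIZE) {
        if (buf[0] != 0)
            continue;                         /* skip marked records */
        if (used + PAYLOAD_SIZE >= out_size)
            return -1;                        /* no room for data + '\0' */
        memcpy(out + used, buf + 1, PAYLOAD_SIZE);
        used += PAYLOAD_SIZE;
        count++;
    }
    out[used] = '\0';
    return count;
}
```

With records "aaaa", "xxxx" and "bbbb" in slots 0 to 2, calling mark_deleted(fp, 1) makes read_live return only "aaaa" and "bbbb", while the file keeps its size. Reusing a marked slot is then just another put_record on the same index; the periodic purge would be a separate pass that copies the live records to a new file.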
 
Rod Pemberton

> I am looking for information about accessing files with the stdio functions.
>
> I am able to open a file, read from it with fread, write to it with
> fwrite, and modify its data in "r+" mode (without truncating or
> appending), but I have never found out how to remove a piece of data
> inside a file. For example, given the sequences "aaaa" "xxxx" "bbbb", I need
> to remove the second sequence and end up with "aaaa" "bbbb", WITHOUT rewriting
> the whole file...
>
> Is that really possible?

No. At some point, the entire file has to be rewritten with the changes.
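
A minimal sketch of that rewrite, using only standard stdio calls: the data is copied to a second stream and the unwanted byte range is left out. The function name copy_without_range and its parameters are made up for this example. Standard C has no function that shortens an existing file in place, so the portable route is a new file that then replaces the old one (for instance with remove and rename); platform-specific calls such as POSIX ftruncate are outside what stdio offers and are not used here.

```c
#include <assert.h>
#include <stdio.h>

/* Copy everything from `in` to `out` except the `len` bytes that start at
   offset `start`. Both streams should be opened in binary mode.
   Returns 0 on success, -1 on error. */
static int copy_without_range(FILE *in, FILE *out, long start, long len)
{
    char buf[4096];
    long pos = 0;
    size_t n;

    assert(start >= 0 && len >= 0);
    if (fseek(in, 0L, SEEK_SET) != 0)
        return -1;

    /* Copy the bytes that come before the removed range. */
    while (pos < start) {
        size_t want = sizeof buf;
        if ((long)want > start - pos)
            want = (size_t)(start - pos);
        n = fread(buf, 1, want, in);
        if (n == 0)
            break;                       /* input is shorter than `start` */
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        pos += (long)n;
    }

    /* If the whole prefix was copied, skip the range and copy the tail. */
    if (pos == start) {
        if (fseek(in, start + len, SEEK_SET) != 0)
            return -1;
        while ((n = fread(buf, 1, sizeof buf, in)) > 0)
            if (fwrite(buf, 1, n, out) != n)
                return -1;
    }
    return (ferror(in) || fflush(out) != 0) ? -1 : 0;
}
```

For a file containing "aaaaxxxxbbbb", copy_without_range(in, out, 4, 4) leaves "aaaabbbb" in the output stream.
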
> For example, how do databases like SQLite delete rows so quickly in
> big files? Maybe they don't rewrite all the data, do they?

The answer to how quickly they do it is: linked-lists.

Text editors and spreadsheets usually break the data up into smaller pieces,
such as a "line" or "cell". These smaller pieces are then inserted into a
linked-list of data structures, called "nodes", where one of the elements of
the structure is a "line" or a "cell". Each node also contains other
elements, such as pointers to the other data structures, to be able to
create the linked-list. To delete a "line" or "cell", they delete the node
from the linked list (by changing the pointers in two nodes to "skip" the
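deleted node). To delete text within a single "line" or "cell", they use
functions like memmove or memset to directly manipulate the contents of the
"line" or "cell" (memmove rather than memcpy, because the source and
destination regions overlap when text is shifted within one buffer).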


Rod Pemberton
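
The node deletion described above might look like the following in C. This is an in-memory sketch only, under the assumption of a doubly linked list of lines; the struct and function names (node, list, list_append, list_delete) are invented for the example, and how such a list would be laid out inside a file is not shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One "line" per node, linked in both directions. */
struct node {
    struct node *prev;
    struct node *next;
    char *line;
};

struct list {
    struct node *head;
    struct node *tail;
};

/* Append a copy of `text` as a new last node. Returns the node or NULL. */
static struct node *list_append(struct list *l, const char *text)
{
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        return NULL;
    n->line = malloc(strlen(text) + 1);
    if (n->line == NULL) {
        free(n);
        return NULL;
    }
    strcpy(n->line, text);
    n->next = NULL;
    n->prev = l->tail;
    if (l->tail != NULL)
        l->tail->next = n;
    else
        l->head = n;
    l->tail = n;
    return n;
}

/* Unlink `n` by repointing its two neighbours (or the list ends) around it,
   then free it. No other node is touched or moved. */
static void list_delete(struct list *l, struct node *n)
{
    assert(l != NULL && n != NULL);
    if (n->prev != NULL)
        n->prev->next = n->next;
    else
        l->head = n->next;
    if (n->next != NULL)
        n->next->prev = n->prev;
    else
        l->tail = n->prev;
    free(n->line);
    free(n);
}
```

After appending "aaaa", "xxxx" and "bbbb" and deleting the middle node, the "aaaa" node points directly at the "bbbb" node. The cost of the deletion does not depend on how many lines the list holds.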
 
Al Balmer

> No. At some point, the entire file has to be rewritten with the changes.
>
> The answer to how quickly they do it is: linked-lists.

<OT> An even quicker method used by some databases is to simply mark
the row "deleted". The slot may then be reused, or cleaned up at some
 
