Stripping specific bytes of a file without opening the entire file...

balgach

Greetings all,

I have a group of rather large files (by "group" I mean close to 2x10^7
files, each 12-15 MB), and I need information that is stored in just
the last 512 bytes of each file. I was wondering if there is a way to
extract this information without loading the entire file into memory.
Right now I'm doing it with fopen() and fread(), and it takes several
hours to process all the files. I know exactly where the information
is, so I'm just wondering if I can somehow open only the trailing
512-1024 bytes, or even physically strip this data off the file on
disk before opening it. Thanks for the help. P.S. I'm working on a
Linux box running Fedora Core 2, using gcc v3.3.3.

Cheers,
Adam.
 
Shezan Baig

Try 'istream::seekg'

Hope this helps,
-shez-
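
For readers who want to try the seekg suggestion, here is a minimal sketch. The helper name `read_tail` and its parameters are hypothetical and not from the thread; it assumes the trailer is simply the last `n` bytes of the file. The stream is positioned at the end to find the size, then moved back `n` bytes before a single read, so only the tail is requested from the file.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns the last `n` bytes of `path` (fewer if the
// file is shorter), or an empty string if the file cannot be opened.
std::string read_tail(const std::string& path, std::size_t n) {
    assert(n > 0);
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return std::string();

    in.seekg(0, std::ios::end);              // jump to the end to learn the size
    const std::streamoff size = in.tellg();
    if (size <= 0) return std::string();     // empty file or tellg() failure

    const std::streamoff want = static_cast<std::streamoff>(n);
    const std::streamoff len = size < want ? size : want;
    in.seekg(size - len, std::ios::beg);     // position at the start of the tail

    std::string buf(static_cast<std::size_t>(len), '\0');
    in.read(&buf[0], len);
    buf.resize(static_cast<std::size_t>(in.gcount()));  // keep only what was read
    return buf;
}
```

Calling `read_tail(name, 512)` for each file name would return the 512-byte trailer as a string.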
 
JH Trauntvein

Your best bet is fseek(), which lets you position the file pointer
at a specified offset in the file. I don't think you can avoid
opening the file in any event.

Regards,

Jon Trauntvein
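
A minimal sketch of the fseek() approach, under the assumption that the data of interest is the last `n` bytes of the file. The function name `read_tail_stdio` is hypothetical. It seeks relative to the end of the file with SEEK_END and then issues one fread(); on a typical stdio implementation this should only pull in a buffer's worth of data, not the whole file, though the exact buffering is implementation-dependent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: copies up to `n` trailing bytes of `path` into `buf`
// and returns how many bytes were read (0 on any error).
std::size_t read_tail_stdio(const char* path, char* buf, std::size_t n) {
    assert(path && buf);
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return 0;

    std::size_t got = 0;
    // Seek relative to the end of the file. On POSIX systems this fails
    // when the file is shorter than n bytes, so fall back to the start.
    if (std::fseek(fp, -static_cast<long>(n), SEEK_END) == 0 ||
        std::fseek(fp, 0, SEEK_SET) == 0) {
        got = std::fread(buf, 1, n, fp);
    }
    std::fclose(fp);
    return got;
}
```

With a 512-byte buffer, `read_tail_stdio(name, buf, 512)` would fill `buf` with the trailer and return 512 for any file at least that long.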
 
ffld

Yeah, I'm trying not to load the entire file into memory. I'm calling
fseek() on the stream returned by fopen(), but it still loads the entire
file into memory, which is a nuisance. Any other ideas?

Cheers,
Adam.
 
Thomas Matthews

Try using platform-specific functions to avoid loading the file
into memory. Just keep the portability problems in mind.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book
http://www.sgi.com/tech/stl -- Standard Template Library
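
The post does not name specific functions. One possibility on Linux, offered here as an assumption rather than as what the poster had in mind, is to bypass stdio and use the POSIX calls open(), fstat(), and pread(). The sketch below uses a hypothetical helper `read_tail_posix` that asks the kernel for the file size and then reads only the trailing bytes at an absolute offset, with no user-space buffering in between.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: reads up to `n` trailing bytes of `path` into `buf`
// using unbuffered POSIX calls. Returns bytes read, or -1 on error.
ssize_t read_tail_posix(const char* path, char* buf, std::size_t n) {
    assert(path && buf);
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }

    const off_t size = st.st_size;
    const off_t want = static_cast<off_t>(n);
    const off_t start = size > want ? size - want : 0;

    // pread() reads at an absolute offset without moving the file position.
    // Loop because a single call may return fewer bytes than requested.
    std::size_t total = 0;
    while (total < n) {
        const ssize_t r = pread(fd, buf + total, n - total,
                                start + static_cast<off_t>(total));
        if (r < 0) { close(fd); return -1; }
        if (r == 0) break;  // reached end of file
        total += static_cast<std::size_t>(r);
    }
    close(fd);
    return static_cast<ssize_t>(total);
}
```

This is not portable beyond POSIX systems, which matches the portability caveat in the post.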
 
Shezan Baig

Why would seek functions (either fseek() or istream::seekg()) load the
entire file into memory? What kind of filesystem are you using? I
would think that if the filesystem provides enough information about
the physical layout of files on disk (e.g., inodes in Unix), it
shouldn't need to load the entire file into memory.

Am I wrong?

-shez-
 
ffld

Which platform-specific functions are you thinking of? I'm running
Fedora Core 2 (Red Hat Linux) on an ext3 filesystem. Portability is
not a concern of mine; I just need it to run on ext3 filesystems.

cheers,
adam.
 
mango_maniac

This may not help you, but I just did some tests. On my system
(WinXP), using fstream to open a file and seekg() to move around does
not cause the file to load into memory. I tested by opening a 30MB
file, seeking around, then seeking to the beginning and loading the
whole thing into memory, and checking how long each step took and how
much disk activity it required. The only operation that took any
significant time was loading the file. I then ran a test where I
loaded just a small fraction of the file, and it took much less time
than loading the full file.

HTH
 
