Stripping specific bytes of a file without opening the entire file...

Discussion in 'C++' started by balgach@gmail.com, Feb 5, 2005.

  1. Guest

    Greetings all,

    I have a group of rather large files (by group I mean close to 2x10^7
    files, each 12-15 MB), and I need information that is stored in just
    the last 512 bytes of each file. I was wondering if there is a way to
    strip out this information without loading the entire file into memory.
    Right now I'm doing it with fopen() and fread(), and it takes several
    hours to process all the files. Obviously I know exactly where the
    information is; I'm just wondering if I can somehow open up only the
    trailing 512-1024 bytes, or even physically strip this info off the
    file on the disk before opening it. Thanks for the help. PS: I'm
    working on a Linux box running Fedora Core 2, using gcc v3.3.3.

    Cheers,
    Adam.
     
    , Feb 5, 2005
    #1

  2. Shezan Baig Guest

    wrote:
    > Greetings all,
    >
    > I have a group of rather large files (by group I mean close to 2x10^7
    > files, each 12-15 MB), and I need information that is stored in just
    > the last 512 bytes of each file. I was wondering if there is a way to
    > strip out this information without loading the entire file into memory.
    > Right now I'm doing it with fopen() and fread(), and it takes several
    > hours to process all the files. Obviously I know exactly where the
    > information is; I'm just wondering if I can somehow open up only the
    > trailing 512-1024 bytes, or even physically strip this info off the
    > file on the disk before opening it. Thanks for the help. PS: I'm
    > working on a Linux box running Fedora Core 2, using gcc v3.3.3.
    >
    > Cheers,
    > Adam.


    Try 'istream::seekg'
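
    A minimal sketch of what that suggestion might look like, assuming the
    data of interest is simply the final N bytes of the file. The helper
    name read_tail_stream is hypothetical and not from this thread; only
    std::ifstream, seekg(), tellg(), read() and gcount() are standard.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): returns the last `count` bytes
// of `path`, or the whole file if it is shorter than `count`.
// Returns an empty vector if the file cannot be opened or is empty.
std::vector<char> read_tail_stream(const std::string& path, std::size_t count)
{
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) return std::vector<char>();

    // Seek to the end to learn the file size; no file data is read here.
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0 || count == 0) return std::vector<char>();

    // Clamp the request to the file size.
    const std::streamoff want =
        static_cast<std::uintmax_t>(size) <= count
            ? size
            : static_cast<std::streamoff>(count);

    // Position the read pointer `want` bytes before end-of-file.
    in.seekg(-want, std::ios::end);

    std::vector<char> buf(static_cast<std::size_t>(want));
    in.read(&buf[0], want);
    buf.resize(static_cast<std::size_t>(in.gcount()));
    return buf;
}
```

    For the files described above, a call such as
    read_tail_stream(name, 512) would be made once per file; only the
    requested bytes end up in the returned buffer.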

    Hope this helps,
    -shez-
     
    Shezan Baig, Feb 5, 2005
    #2

  3. wrote:
    > Greetings all,
    >
    > I have a group of rather large files (by group I mean close to 2x10^7
    > files, each 12-15 MB), and I need information that is stored in just
    > the last 512 bytes of each file. I was wondering if there is a way to
    > strip out this information without loading the entire file into memory.
    > Right now I'm doing it with fopen() and fread(), and it takes several
    > hours to process all the files. Obviously I know exactly where the
    > information is; I'm just wondering if I can somehow open up only the
    > trailing 512-1024 bytes, or even physically strip this info off the
    > file on the disk before opening it. Thanks for the help. PS: I'm
    > working on a Linux box running Fedora Core 2, using gcc v3.3.3.



    Your best bet is fseek() which allows you to position the file pointer
    at a specified offset in the file. I don't think that you can avoid
    opening the file in any event.
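
    As a rough illustration of the fseek() approach, here is a sketch
    using the C stdio calls already mentioned in this thread. The function
    name read_tail_stdio is made up for the example. It assumes the file
    is at least `count` bytes long, and that the platform supports
    SEEK_END on binary streams (Linux does; the C standard does not
    require it).

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not from the thread): copies the last `count` bytes
// of `path` into `out`. Returns the number of bytes read, or 0 if the file
// cannot be opened or the seek fails (for example, when the file is
// shorter than `count`).
std::size_t read_tail_stdio(const char* path, char* out, std::size_t count)
{
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return 0;

    std::size_t got = 0;
    // A negative offset with SEEK_END positions the stream before the end.
    if (std::fseek(fp, -static_cast<long>(count), SEEK_END) == 0) {
        got = std::fread(out, 1, count, fp);
    }
    std::fclose(fp);
    return got;
}
```

    Note that stdio may read a little more than `count` bytes from disk to
    fill its internal buffer, but that buffer is typically a few kilobytes,
    not the size of the file.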

    Regards,

    Jon Trauntvein
     
    JH Trauntvein, Feb 5, 2005
    #3
  4. Guest

    Yeah, I'm trying not to load the entire file into memory. I'm using
    fseek() after opening the file with fopen(), but it still loads the
    entire file into memory, which is a nuisance. Any other ideas?

    Cheers,
    Adam.
     
    , Feb 5, 2005
    #4
  5. Re: Stripping specific bytes of a file without opening the entire file...

    wrote:
    > Yeah, I'm trying not to load the entire file into memory. I'm using
    > fseek() after opening the file with fopen(), but it still loads the
    > entire file into memory, which is a nuisance. Any other ideas?
    >
    > Cheers,
    > Adam.
    >


    Try using platform specific functions to prevent the loading
    of the file into memory. Just remember about portability
    problems.
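
    The post does not name specific functions. One possibility on Linux,
    offered here only as an assumption about what might be meant, is the
    POSIX file-descriptor interface: open(), fstat() and pread(). The
    helper name read_tail_posix is hypothetical, and the sketch does not
    retry on EINTR.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not from the thread): reads up to `count` bytes from
// the end of `path` into `out` using POSIX calls, bypassing stdio buffering.
// Returns the number of bytes read, or -1 on error.
ssize_t read_tail_posix(const char* path, char* out, std::size_t count)
{
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    // fstat() reports the file size from metadata, without reading contents.
    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }

    const off_t size = st.st_size;
    const off_t want =
        size < static_cast<off_t>(count) ? size : static_cast<off_t>(count);

    // pread() reads at an explicit offset; loop in case of short reads.
    ssize_t total = 0;
    while (total < want) {
        const ssize_t n = pread(fd, out + total,
                                static_cast<std::size_t>(want - total),
                                size - want + total);
        if (n < 0) { close(fd); return -1; }
        if (n == 0) break;  // unexpected end-of-file
        total += n;
    }
    close(fd);
    return total;
}
```

    This is not portable beyond POSIX systems, which matches the
    portability caveat above.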

    --
    Thomas Matthews

    C++ newsgroup welcome message:
    http://www.slack.net/~shiva/welcome.txt
    C++ Faq: http://www.parashift.com/c++-faq-lite
    C Faq: http://www.eskimo.com/~scs/c-faq/top.html
    alt.comp.lang.learn.c-c++ faq:
    http://www.comeaucomputing.com/learn/faq/
    Other sites:
    http://www.josuttis.com -- C++ STL Library book
    http://www.sgi.com/tech/stl -- Standard Template Library
     
    Thomas Matthews, Feb 5, 2005
    #5
  6. Shezan Baig Guest

    wrote:
    > Yeah, I'm trying not to load the entire file into memory. I'm using
    > fseek() after opening the file with fopen(), but it still loads the
    > entire file into memory, which is a nuisance. Any other ideas?
    >
    > Cheers,
    > Adam.


    Why will seek functions (either fseek() or istream::seekg()) load the
    entire file into memory? What kind of filesystem are you using? I
    would think that if the filesystem provides enough information about
    the physical layout of files on disk (e.g., inodes in unix), it
    shouldn't need to load the entire file into memory.

    Am I wrong?

    -shez-
     
    Shezan Baig, Feb 5, 2005
    #6
  7. Guest

    Which platform-specific functions are you thinking of? I'm running
    Fedora Core 2 (Red Hat Linux) on an ext3 file system. Portability is
    not a concern of mine; I just need it to run on ext3 filesystems.

    cheers,
    adam.
     
    , Feb 5, 2005
    #7
  8. Guest

    wrote:
    > Which platform-specific functions are you thinking of? I'm running
    > Fedora Core 2 (Red Hat Linux) on an ext3 file system. Portability is
    > not a concern of mine; I just need it to run on ext3 filesystems.
    >
    > cheers,
    > adam.


    This may not help you, but I just did some tests...on my system
    (winXP), using fstream to open and seekg() to move around does not
    cause the file to load into memory. I tested by opening a 30MB file,
    seeking around, then seeking to the beginning and loading the whole
    thing into memory, and seeing how long it took/how much disk usage was
    required between each step. The only operation that took any
    significant time was loading the file. I then tested where I just
    loaded a small fraction of the file, and it took much less time than
    loading the full file.
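
    The test code was not posted, so the following is only a guess at how
    such a comparison could be written. The names ReadTiming and
    time_tail_read are invented for this sketch. It times an open, seek and
    read of the last `count` bytes; passing a `count` larger than the file
    reads the whole file, which gives the other side of the comparison.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical result type for the timing sketch.
struct ReadTiming {
    std::size_t bytes;   // bytes actually read
    double seconds;      // wall-clock time for open + seek + read
};

// Hypothetical helper: times reading the last `count` bytes of `path`
// (or the whole file, if it is shorter than `count`).
ReadTiming time_tail_read(const std::string& path, std::size_t count)
{
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    ReadTiming result = {0, 0.0};

    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (in) {
        in.seekg(0, std::ios::end);
        const std::streamoff size = in.tellg();
        if (size > 0 && count > 0) {
            // Clamp the request to the file size.
            const std::streamoff want =
                static_cast<std::uintmax_t>(size) <= count
                    ? size
                    : static_cast<std::streamoff>(count);
            in.seekg(-want, std::ios::end);
            std::vector<char> buf(static_cast<std::size_t>(want));
            in.read(&buf[0], want);
            result.bytes = static_cast<std::size_t>(in.gcount());
        }
    }

    result.seconds = std::chrono::duration<double>(
        std::chrono::steady_clock::now() - start).count();
    return result;
}
```

    Timings from a sketch like this depend heavily on the operating
    system's file cache: a file that was read recently may be served from
    memory, so repeated runs on the same file can look much faster than a
    first read from disk.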

    HTH
     
    , Feb 6, 2005
    #8