Efficiently Read File Headers

ts8807385

Hey guys,

I have a process that discards files based on their file header. This
works fine, but when I have a lot of files (millions) it's slow. What I
do now is open each file and push the first ten bytes into a vector I
call 'header_bytes'. I basically call fd.get() ten times, incrementing
an int counter and calling push_back on the vector each time.
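
To make the read step concrete, here is a minimal sketch of what is described above. Only 'header_bytes' and fd come from my description; the function name read_header_bytes and the exact loop shape are made up for illustration, and the real code may differ in detail.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper illustrating the read step described above:
// open the file and call get() until ten bytes have been collected.
std::vector<unsigned char> read_header_bytes(const std::string& path) {
    std::vector<unsigned char> header_bytes;
    std::ifstream fd(path, std::ios::binary);
    int count = 0;
    while (count < 10) {
        int c = fd.get();  // yields EOF if the file is missing or too short
        if (c == std::ifstream::traits_type::eof()) break;
        header_bytes.push_back(static_cast<unsigned char>(c));
        ++count;
    }
    return header_bytes;
}
```

Calling read_header_bytes on a file shorter than ten bytes (or one that cannot be opened) returns fewer than ten bytes, so the checks that follow have to allow for a short vector.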

I then have a chain of if statements, similar to the code below, that
covers about 12 common file headers (JPEG, PNG, WAV, RIFF, etc.) that I
want to exclude from further processing:

if (byte1 == 10 and byte2 == 14 and byte3 == 12)
    return false;
else if ()
    return false;
else if ()
    return false;
else
    // process the file further
    return true;
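
The same check could probably also be written as a table of signatures instead of a long if/else chain. This is only a sketch under assumptions: the three entries are the well-known PNG, JPEG, and RIFF magic numbers rather than the placeholder values above, the list is a subset of the roughly 12 types, and the names kExcluded and should_process are made up.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative subset of signatures to exclude (assumed, not the full list).
static const std::vector<std::vector<unsigned char>> kExcluded = {
    {0x89, 'P', 'N', 'G'},  // PNG
    {0xFF, 0xD8, 0xFF},     // JPEG
    {'R', 'I', 'F', 'F'},   // RIFF container (WAV and others)
};

// Hypothetical function: returns false when header_bytes starts with an
// excluded signature, true when the file should be processed further.
bool should_process(const std::vector<unsigned char>& header_bytes) {
    for (const auto& sig : kExcluded) {
        if (header_bytes.size() >= sig.size() &&
            std::equal(sig.begin(), sig.end(), header_bytes.begin())) {
            return false;  // header matches an excluded type
        }
    }
    return true;  // no signature matched
}
```

The size check keeps a short header from being compared past its end. Adding a file type is then one more line in the table rather than another else-if branch.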

As I said, this works fine: when I only have to process a few thousand
files, I'm done quickly. How can I speed it up for millions of files?

Thanks,
Tom
 
