WinXP: Ostream Operator Corrupting MFT(?)


TungstenCoil

Hello,

Has anyone encountered a problem with, or have any ideas about, the
$MFT on NTFS becoming corrupted while using std::ofstream? I had
some code that would run for a few days and suddenly crash the system,
and Windows XP would report the MFT was corrupt. The only thing to do
was reinstall Windows (it won't boot, even in safe mode).

I've pared my code down to the offending portion... my code writes many
millions of small text files. When I reach approximately 25+ million files,
the corruption happens. I am nowhere near filling the disk. I've
eliminated short file names and increased the size of the MFT. This
problem happens on many different machine hardware configurations,
single/hyper/multi-processor machines with single/multiple/RAID drives
as well as flat directory and nested subdirectory situations (I've
tried a lot of stuff to narrow this down...).

A few more points:

- I am well under the NTFS limit of >4 billion files. I've also disabled short file names and expanded the size of the MFT, as suggested by MSDN and http://www.ntfs.com
- I know it seems unusual, but the project dictates things be handled this way. In a nutshell, my module manages the placement of these small files within the directory structure; the requirements are such that many millions of small files will be placed within the structure.
- During testing, the hardware was crashing, not the software (though it ceased doing anything useful). I slowly whittled the problem down to the point where I was simply outputting 25+ million simple ASCII text files of the appropriate size (4-16K).
- It's been tested on many hardware combinations; all crash. I agree that it should not happen, but it does, repeatedly and reproducibly.


#include <cstdlib>   // srand, rand
#include <ctime>     // time
#include <direct.h>  // _mkdir (MSVC)
#include <fstream>
#include <iostream>
#include <string>

void WriteFlatFileOnly(const std::string &test_directory, const int iterations)
{
    srand((unsigned)time(NULL));
    std::ofstream dummy_file;
    std::string file_name;

    _mkdir(test_directory.c_str()); // error ignored if it already exists

    for (int i = 0; i < iterations; i++){ // create a grand total of `iterations` files
        char buffer[100];
        file_name = test_directory + "/";
        file_name += _itoa(i, buffer, 10);
        file_name += "DummyFile.txt";

        // START WRITE

        // Note: an ofstream does not throw by default, so the original
        // try/catch blocks could never fire; check the stream state instead.
        dummy_file.open(file_name.c_str(), std::ios::out | std::ios::binary);
        if (!dummy_file){
            std::cerr << "open failed: " << file_name << std::endl;
            return;
        }

        // generate random file size between 4K and 16K
        int file_size = (rand() % 12000) + 4000;
        for (int j = 0; j < file_size; j++){ // j, so as not to shadow the outer i
            dummy_file << 'A'; // output a byte
        }
        if (!dummy_file){
            std::cerr << "write failed: " << file_name << std::endl;
            return;
        }

        dummy_file.close();
        dummy_file.clear(); // reset state flags before reusing the stream object

        // END WRITE
    }
}

// END CODE
This example outputs them all to one flat directory. This was the eventual "simplest example" that I worked down to; both this and the original code, which creates a robust directory/subdirectory structure, eventually corrupt the MFT, and both do so at approximately the same point.
 

David Lindauer

TungstenCoil said:
Hello,

Has anyone encountered a problem with, or have any ideas about, the
$MFT on NTFS becoming corrupted while using std::ofstream? I had
some code that would run for a few days and suddenly crash the system,
and Windows XP would report the MFT was corrupt. The only thing to do
was reinstall Windows (it won't boot, even in safe mode).

I read through your description... and the first thing that jumped out is
that you are really taxing the file system in a way that probably hasn't
been tested before.

I don't know enough about C++ to know what kind of things are happening in the paging file while this is going on. e.g. are there a lot of implicit new() and delete() calls in the streaming file ops that could be thrashing and/or extending the paging file? Could some other conflict with the paging file be a problem here?

The way I would proceed is to back out of C++ entirely and write some
equivalent code using, probably, the Unix-style I/O calls. That is raw enough
that if it still happens you've ruled out anything happening in the C runtime
or the C++ standard library. If it doesn't happen, you know the OS is capable
of handling the load.

If that works, you might see if there is some way to unsync the C++ and C file I/O buffers (std::ios_base::sync_with_stdio(false)); maybe it has something to do with the syncing. Another possibility is to download the free Borland C++ compiler (or gcc) and see if your test code compiled there still exhibits the problem, e.g. is there a problem specifically in the Visual Studio STL?

David
 

Ben Pope

TungstenCoil said:
Hello,

Has anyone encountered a problem with, or have any ideas about, the
$MFT on NTFS becoming corrupted while using std::ofstream? I had
some code that would run for a few days and suddenly crash the system,
and Windows XP would report the MFT was corrupt. The only thing to do
was reinstall Windows (it won't boot, even in safe mode).

I've pared my code down to the offending portion... my code writes many
millions of small text files. When I reach approximately 25+ million files,

You have a design error. Requiring the use of millions of files is
always going to be wrong.*

What makes you think you need so many files?

*All generalisations are false.

Ben Pope
 

TungstenCoil

LOL I agree; however, I am (truly) the low man on the totem pole here
and my cries are going unheeded....

So, to those interested, the code has now been tested both with the
insertion (<<) operator as well as with a pointer and the write()
function... crash away....

I think my next step is to code something similar up in either Java or
VB and see if it crashes machines... or wait until my PM decides we
should proceed down an alternate route :)
 

TB

TungstenCoil said:
Hello,

Has anyone encountered a problem with, or have any ideas about, the
$MFT on NTFS becoming corrupted while using std::ofstream? I had
some code that would run for a few days and suddenly crash the system,
and Windows XP would report the MFT was corrupt. The only thing to do
was reinstall Windows (it won't boot, even in safe mode).

I've pared my code down to the offending portion... my code writes many
millions of small text files. When I reach approximately 25+ million files,
the corruption happens. I am nowhere near filling the disk. I've
eliminated short file names and increased the size of the MFT. This
problem happens on many different machine hardware configurations,
single/hyper/multi-processor machines with single/multiple/RAID drives
as well as flat directory and nested subdirectory situations (I've
tried a lot of stuff to narrow this down...).

A few more points:

Are you sure? There is a documented bug that limits the number of files
to "only" 4 million. If you are on such a machine, then weird things might
happen if you try to breach it.
 

Tom

TB said:

Are you sure? There is a documented bug that limits the number of files
to "only" 4 million. If you are on such a machine, then weird things might
happen if you try to breach it.

Million not billion?

Either way ... wow!! That's a lot of files!! Try scrolling through
that in explorer. HA! I am curious. How is your program used? My guess
is password cracking or some type of spamming algorithm. I hope I am
wrong!

Perhaps the algorithm could be used to "test" the string and only
store successful results? Seeding a random number generator would
allow you to define the ranges of what had been tested already so that
repeats are minimized. Retesting might be faster than sorting and
proving uniqueness.

Millions of files .. I am still dizzy.

- Tom
 

red floyd

TB said:
Are you sure? There is a documented bug that limits the number of files
to "only" 4 million. If you are on such a machine, then weird things might
happen if you try to breach it.

What's the KB on this? I couldn't find it.
 
