Opening large text file makes program unhappy.

hazmaz

I am writing a program that will have a function which opens a text file
and fills a list box line by line from it. It works fine,
except that when I tried to load a dictionary which is ~ 50mb, it
appears to freeze. When I press Ctrl+Alt+Del I can see that it is still
loading the file: the memory usage is shooting up. It's been going for a
while and the memory usage is currently verging on 600,000K. If I
click on the program it stops responding.

Is there a way to stop it from behaving like this? (I think I remember
reading something about DoEvents but I'm not sure.) Also, is there a way
to avoid loading the entire text file into RAM, or to process it bit
by bit? Eventually it will process the loaded data so I guess there will
be no real need to load it into the text box.

Thanks, and sorry for the ignorance
 
inmatarian

Check your documentation for a function that yields control back to the
OS. When doing memory- and time-intensive work, you should periodically
(read: every few thousand lines) call this yielding function.
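
As a rough illustration of the "yield every few thousand lines" idea, here is a minimal sketch in standard C++. The names load_with_yield and demo_yield_count, and the yield callback itself, are made up for this example. On Windows the callback would typically pump pending window messages, but that part is platform-specific and is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads every line from `in`, calling `yield` once per `batch` lines.
// `yield` is a hypothetical hook supplied by the caller; what it does
// (pump messages, update a progress bar, ...) depends on the platform.
std::vector<std::string> load_with_yield(std::istream& in,
                                         std::size_t batch,
                                         const std::function<void()>& yield)
{
    assert(batch > 0);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
        if (lines.size() % batch == 0) {
            yield();  // give the rest of the program a chance to run
        }
    }
    return lines;
}

// In-memory demonstration: with 5 lines and a batch size of 2 the hook
// fires after line 2 and after line 4. A real program would pass a
// std::ifstream instead of a std::istringstream.
int demo_yield_count()
{
    std::istringstream in("a\nb\nc\nd\ne\n");
    int yields = 0;
    load_with_yield(in, 2, [&yields] { ++yields; });
    return yields;
}
```

The batch size is a tuning knob: a smaller value keeps the program more responsive at the cost of more calls to the hook.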

Inmatarian
2993 Forever.
 
mlimber

I am writing a program that will have a function which opens a text file
and fills a list box line by line from it. It works fine,
except that when I tried to load a dictionary which is ~ 50mb, it

Millibits? How is that even possible? ;-)

appears to freeze. When I press Ctrl+Alt+Del I can see that it is still
loading the file: the memory usage is shooting up. It's been going for a
while and the memory usage is currently verging on 600,000K. If I
click on the program it stops responding.

Can you post the function here so we can see it? See this FAQ:

http://www.parashift.com/c++-faq-lite/how-to-post.html#faq-5.8

Is there a way to stop it from behaving like this?

Certainly. There's also a good chance your program has a bug in it.

(I think I remember
reading something about DoEvents but I'm not sure).

Not in standard C++, which is the topic of this group. You might ask in
a group dedicated to your platform. See this FAQ:

http://www.parashift.com/c++-faq-lite/how-to-post.html#faq-5.9

Also, is there a way
to avoid loading the entire text file into RAM, or to process it bit
by bit? Eventually it will process the loaded data so I guess there will
be no real need to load it into the text box.

Of course there are alternate strategies. Try some out and ask here
again if you have a C++ *language* question, not an algorithmic one
(for that, try comp.programming or similar).
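
One such alternate strategy, sketched in standard C++ under the assumption that the final processing can be done one line at a time: keep only the current line and a small running result in memory instead of storing the whole file. The Summary struct and the function names are invented for this illustration, and "find the longest line" is just a stand-in for whatever processing is really needed.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Running result of the scan; only this and one line are held in memory.
struct Summary {
    std::size_t line_count = 0;
    std::string longest;
};

// Processes the stream one line at a time without storing the lines.
Summary summarize(std::istream& in)
{
    Summary s;
    std::string line;  // reused for every line
    while (std::getline(in, line)) {
        ++s.line_count;
        if (line.size() > s.longest.size()) {
            s.longest = line;
        }
    }
    return s;
}

// In-memory demonstration; a real program would pass a std::ifstream.
void demo_summary()
{
    std::istringstream in("ant\nbumblebee\ncat\n");
    const Summary s = summarize(in);
    assert(s.line_count == 3);
    assert(s.longest == "bumblebee");
}
```

With this shape the memory use stays roughly constant no matter how large the file is.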

Cheers! --M
 
hazmaz

Thanks a lot for the replies, they're really helpful. That makes sense
about giving control back to the OS every few thousand lines. Oops,
sorry, not mb, MB :)
Here is the code:

int ReadFile( HWND *hWnd )
{

fstream file( "words.txt" ,ios::in);
if( file.fail() )
{
MessageBox(NULL,"Error finding file. Make sure words.txt is in the
same directory as this program","Status",MB_OK);
PostQuitMessage(0);
}
vector<string> lines;
string line;
int n = 0;

while( !file.eof() )
{
getline( file, line );
lines.push_back( line );
n++;
}
file.close();

int lno = n;
int x = 0;

for( x; x < lno; x++ )
{
SendDlgItemMessage(*hWnd, IDC_WORDLIST, LB_ADDSTRING, 0,
(LPARAM)lines[x].c_str());
}

return 0;
}


IDC_WORDLIST is a list box. So would I tell it to take a break every
time x is divisible by 3000? Also, how would I tell it to take a break?
Is there something like sleep( n ) in C++? Sorry I'm so ignorant, I
only started with C++ a few days ago; before that I'd only used PHP,
a tiny bit of Perl, JavaScript, and VB.

Would this question count as language or algorithmic? Sorry if it's in
the wrong place, this is my first post in a newsgroup ever :(
 
mlimber

Thanks a lot for the replys, they're really helpful. That makes sense
about giving control back to the os every few thousand lines. Oops
sorry not mb, MB :)
Here is the code:

int ReadFile( HWND *hWnd )
{

fstream file( "words.txt" ,ios::in);

How about just ifstream?
if( file.fail() )

Prefer:

if( !file )
{
MessageBox(NULL,"Error finding file. Make sure words.txt is in the
same directory as this program","Status",MB_OK);
PostQuitMessage(0);
}
vector<string> lines;
string line;
int n = 0;

while( !file.eof() )
{
getline( file, line );

Bad form (cf.
http://www.parashift.com/c++-faq-lite/input-output.html#faq-15.2). What
if there is some other failure? Prefer this:

while( getline( file, line ) )
{
lines.push_back( line );

n++;
}
file.close();

Unnecessary. fstreams are closed automatically by their destructors, and
unless you have a good reason for closing one before the end of its scope,
it's standard practice to omit this.

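
Pulling the suggestions so far together (ifstream, testing the stream itself, looping on getline, and no explicit close), a minimal sketch might look like the following. The function names and the `ok` out-parameter are chosen only for this example, and the Windows-specific error reporting is left out.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads all lines from a stream. The getline call is the loop condition,
// so a failed read never appends a phantom line.
std::vector<std::string> read_lines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Opens `path` and reads it; `ok` reports whether the open succeeded.
std::vector<std::string> read_lines_from_file(const std::string& path, bool& ok)
{
    std::ifstream file(path.c_str());
    if (!file) {
        ok = false;
        return std::vector<std::string>();
    }
    ok = true;
    return read_lines(file);  // the destructor closes the file
}

// In-memory demonstration. For this input an eof()-controlled loop
// would have stored a third, empty line.
void demo_read_lines()
{
    std::istringstream in("alpha\nbeta\n");
    const std::vector<std::string> lines = read_lines(in);
    assert(lines.size() == 2);
    assert(lines[1] == "beta");
}
```

The vector's size already gives the line count, so the separate counter n is not needed.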
int lno = n;
int x = 0;

for( x; x < lno; x++ )
{
SendDlgItemMessage(*hWnd, IDC_WORDLIST, LB_ADDSTRING, 0,
(LPARAM)lines[x].c_str());
}

This part is off-topic here, but there may well be ways to handle this
better. Ask in a Microsoft newsgroup, some of which are listed here:

http://www.parashift.com/c++-faq-lite/how-to-post.html#faq-5.9
return 0;
}


IDC_WORDLIST is a list box. So would I tell it to take a break every
time x is divisible by 3000? Also, how would I tell it to take a break
- is there something like sleep( n ) in c++. Sorry I'm so ignorant, I
only started with C++ a few days ago, before that I'd only used PHP,
tiny bit of perl, javascript, vb.

Try asking in a group dedicated to your platform. In standard C++,
sleeping only wastes time. <OT>On many platforms, sleeping lets other
threads/processes run, which may or may not do what you want. If you
created a "worker thread" (hint: search the docs for that), it
might.</OT>

Would this question count as language or algorithmic?

Half and half?

Cheers! --M
 
Alf P. Steinbach

* (e-mail address removed):
Thanks a lot for the replys, they're really helpful. That makes sense
about giving control back to the os every few thousand lines.

Generally it can, but in this specific case it probably won't.

In addition to the comments from M.Limber:

* Reading line by line is generally extremely inefficient. In theory
the std::ifstream abstraction will fix some of that for you, but in
practice it adds to the mess instead. For a 50 MiB file I'd read
the contents raw into a suitably large pre-allocated buffer, or I'd
use OS-specific functionality to map the file to memory.

* Many GUI things, regardless of OS, have strict limits on how much
data they can handle. The limit for a Windows listbox is about
one thousandth of what you're trying to push into it.

* It's not a good idea to /combine/ responsibilities, as you have
done here (both reading data and filling a listbox), unless by that
combination you can capitalize on some huge advantage. Try instead
to /separate/ responsibilities. I.e., one function for reading, and
some other function that uses the result.
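
A minimal sketch of the first and third points, in portable C++ only (memory mapping is OS-specific and not shown): one function reads the whole stream into a buffer sized up front, and a separate function consumes the result. The names slurp, count_lines and demo_slurp are invented for this illustration, and counting newlines is just a placeholder for the real processing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Reads the whole stream into one pre-sized buffer with a single read().
// For a file, open the std::ifstream with std::ios::binary so that the
// size reported by tellg matches the number of bytes read.
std::string slurp(std::istream& in)
{
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    in.seekg(0, std::ios::beg);
    if (size <= 0) {
        return std::string();
    }
    std::string buffer(static_cast<std::size_t>(size), '\0');
    in.read(&buffer[0], size);
    buffer.resize(static_cast<std::size_t>(in.gcount()));  // keep only what was read
    return buffer;
}

// Separate responsibility: works on the data and knows nothing about files.
// Counts newline characters in the text.
std::size_t count_lines(const std::string& text)
{
    return static_cast<std::size_t>(std::count(text.begin(), text.end(), '\n'));
}

// In-memory demonstration of the two functions used together.
void demo_slurp()
{
    std::istringstream in("one\ntwo\nthree\n");
    const std::string text = slurp(in);
    assert(text.size() == 14);
    assert(count_lines(text) == 3);
}
```

Because the reader and the consumer are separate, either one can be replaced (for example by a memory-mapped reader) without touching the other.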
 
benben

* Many GUI things, regardless of OS, have strict limits on how much
data they can handle. The limit for a Windows listbox is about
one thousandth of what you're trying to push into it.

From another point of view, the OP is simply making a UI mistake. Even
if a list box can handle that much information, a user would probably
not want to scroll through 50 lines.

Regards,
Ben
 
I V

Thanks a lot for the replys, they're really helpful. That makes sense
about giving control back to the os every few thousand lines.

Rather than periodically returning control to the OS, it might be simpler
to change your design so that your function does a small amount of work
each time it's called, rather than all the work at once. Like:

#include <fstream>
#include <string>
#include <vector>

class FileReader
{
    std::vector<std::string> lines_;
    std::ifstream file_;

    void on_complete();
    void on_error();

    bool read_line();
public:
    FileReader(const char* name)
        : file_(name)
    { }

    static bool callback(void* user_data);
};


// Read a line, return true if the function
// should be called again to read more lines,
// false otherwise. (You probably really
// want to read more than one line at a time,
// but this is easier for the purposes of the
// example).
bool FileReader::read_line()
{
    std::string line;
    if( std::getline(file_, line) )
    {
        lines_.push_back(line);
        return true;
    }
    // getline failed: end of file means we are done,
    // anything else is a real error.
    if( file_.eof() )
        on_complete();
    else
        on_error();
    return false;
}

// This is a static function so that you can pass
// a pointer to it to whatever API your system
// offers to do delayed execution.
bool FileReader::callback(void* user_data)
{
    FileReader* r = static_cast<FileReader*>(user_data);

    return r->read_line();
}

And implement on_complete and on_error to do whatever you need to do to
handle the completion of the read, or the error condition. Depending on
how your system works, you may have to have the callback function request
that it be invoked again, something like:

bool FileReader::callback(void* user_data)
{
    FileReader* r = static_cast<FileReader*>(user_data);

    if( r->read_line() )
    {
        // Ask to be invoked again for the next line.
        call_me_later(FileReader::callback, user_data);
        return true;
    }
    else
    {
        return false;
    }
}

Replacing "call_me_later" with whatever is the appropriate function on
your platform.
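
To see the re-queueing pattern run without any particular platform, here is a self-contained sketch in which a tiny TaskQueue class stands in for the system's event loop. TaskQueue, read_chunk and demo_chunked are all invented for this illustration; a real program would use whatever timer or idle-callback facility its GUI toolkit provides.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a platform event loop.
class TaskQueue {
    std::deque<std::function<void()>> tasks_;
public:
    void call_me_later(std::function<void()> task) {
        tasks_.push_back(std::move(task));
    }
    // Runs queued tasks until none remain; returns how many ran.
    std::size_t run() {
        std::size_t ran = 0;
        while (!tasks_.empty()) {
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop_front();
            task();  // may queue further tasks
            ++ran;
        }
        return ran;
    }
};

// Reads up to `chunk` lines per call and re-queues itself while the
// stream is still readable.
void read_chunk(TaskQueue& queue, std::istream& in,
                std::vector<std::string>& lines, std::size_t chunk)
{
    assert(chunk > 0);
    std::string line;
    std::size_t n = 0;
    while (n < chunk && std::getline(in, line)) {
        lines.push_back(line);
        ++n;
    }
    if (in) {  // no failed read yet, so more input may remain
        queue.call_me_later([&queue, &in, &lines, chunk] {
            read_chunk(queue, in, lines, chunk);
        });
    }
}

// Five lines in chunks of two: three tasks run (2 + 2 + 1 lines).
std::size_t demo_chunked(std::vector<std::string>& lines)
{
    std::istringstream in("a\nb\nc\nd\ne\n");
    TaskQueue queue;
    queue.call_me_later([&] { read_chunk(queue, in, lines, 2); });
    return queue.run();
}
```

Between two chunks a real event loop would get to handle paint and input events, which is what keeps the window responsive.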

BTW,
for( x; x < lno; x++ )
{
SendDlgItemMessage(*hWnd, IDC_WORDLIST, LB_ADDSTRING, 0,
(LPARAM)lines[x].c_str());
}

Do you need to read into the vector and _then_ add to the listbox? Or
could you just add the data to the listbox as you read it in?

People on windows programming groups will probably be able to help you
more - they'll be familiar with event-based programming, and with the
specific API functions you'll need to call.
 
Ian Collins

Thanks a lot for the replys, they're really helpful. That makes sense
about giving control back to the os every few thousand lines. Oops
sorry not mb, MB :)

You shouldn't have to; the OS should be able to take care of its own
scheduling, unless you are running at real-time or some other very high
priority.
 
