I/O getline and small buffer

Alex Vinokur

What is wrong with small_buffer in the program below?
I/O getline doesn't read data from the file into a buffer that is small relative to the file's line length.

====== foo.cpp ======
#include <cassert>
#include <iostream>
#include <fstream>
using namespace std;

void read_using_io_getline ()
{
char small_buffer[4];
char big_buffer[512];

ifstream infile ("foo.in");
assert (infile.is_open());

cout << infile.rdbuf() << endl;

// --------------------------
infile.clear ();
infile.seekg (0, ios::beg);

cout << endl;
cout << "[small_buffer] Start: sizeof(small_buffer) = " << sizeof(small_buffer) << endl;
cout << "[small_buffer] Start: rdstate = " << infile.rdstate() << endl;
while (infile.getline (small_buffer, sizeof(small_buffer)))
{
cout << "[small_buffer] Read: " << small_buffer << '\n';
}
cout << "[small_buffer] Finish: rdstate = " << infile.rdstate() << endl;


// --------------------------
infile.clear ();
infile.seekg (0, ios::beg);

cout << endl;
cout << "[big_buffer] Start: sizeof(big_buffer) = " << sizeof(big_buffer) << endl;
cout << "[big_buffer] Start: rdstate = " << infile.rdstate() << endl;
while (infile.getline (big_buffer, sizeof(big_buffer)))
{
cout << "[big_buffer] Read: " << big_buffer << '\n';
}
cout << "[big_buffer] Finish: rdstate = " << infile.rdstate() << endl;
}


int main()
{
read_using_io_getline ();
return 0;

}
=====================



====== Compilation & Run ======

// gpp.exe (GCC) 3.4.1

$ gpp foo.cpp
// No errors/warnings

$ ./a

1234567890
ABCDEGHIJKL
XYZUWT


[small_buffer] Start: sizeof(small_buffer) = 4
[small_buffer] Start: rdstate = 0
[small_buffer] Finish: rdstate = 4

[big_buffer] Start: sizeof(big_buffer) = 512
[big_buffer] Start: rdstate = 0
[big_buffer] Read: 1234567890
[big_buffer] Read: ABCDEGHIJKL
[big_buffer] Read: XYZUWT
[big_buffer] Finish: rdstate = 6

================================
 
Victor Bazarov

Alex said:
What is wrong with small_buffer in the program below?
I/O getline doesn't read data from the file into a buffer that is small relative to the file's line length.

It does. If you RTFM carefully on 'getline', you'll probably see that it
sets the error condition in the stream if the buffer is too small to read
the entire line. That's the only indication it has for you to know that
there is still some stuff in the same line to be read. I guess you need
to improve your code to work on that error condition. Essentially, if the
stream is not good after 'getline' it doesn't necessarily mean error in
reading *in general*. It could be just a flag that tells you "hey, you
asked for a line, your buffer isn't big enough for the whole line, just to
let you know".
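
To make that concrete, here is a minimal sketch (not code from the original program) of what a single call to the member getline leaves behind when the line does not fit. The names GetlineResult and try_getline are hypothetical, and an in-memory std::istringstream stands in for foo.in. With a 4-char buffer the first call stores three characters and sets failbit, which would explain why the while loop in the question never prints anything: the loop condition is already false after the first read.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>
#include <vector>

// What one call to istream::getline left behind.
struct GetlineResult {
    std::string stored;  // characters placed in the buffer
    bool fail;           // failbit after the call
    bool eof;            // eofbit after the call
};

// Hypothetical helper: runs a single member getline on `text` with a
// buffer of `buf_size` chars and reports buffer contents and stream flags.
GetlineResult try_getline(const std::string& text, std::size_t buf_size) {
    assert(buf_size > 0);
    std::istringstream in(text);
    std::vector<char> buf(buf_size);  // zero-initialized
    in.getline(buf.data(), static_cast<std::streamsize>(buf.size()));
    return GetlineResult{std::string(buf.data()), in.fail(), in.eof()};
}
```

Calling try_getline("1234567890\nABC\n", 4) reports stored == "123" with fail set and eof clear, while the same text with a 512-char buffer reports the whole first line and no failbit.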

V
 
Alex Vinokur

Victor Bazarov said:
It does. If you RTFM carefully on 'getline', you'll probably see that it
sets the error condition in the stream if the buffer is too small to read
the entire line. That's the only indication it has for you to know that
there is still some stuff in the same line to be read. I guess you need
to improve your code to work on that error condition. Essentially, if the
stream is not good after 'getline' it doesn't necessarily mean error in
reading *in general*. It could be just a flag that tells you "hey, you
asked for a line, your buffer isn't big enough for the whole line, just to
let you know".

V

OK.

So, must we use a big enough buffer when using I/O getline?

What should we do if we don't know the size of the longest line in the input file?
 
Victor Bazarov

Alex said:
OK.

So, must we use a big enough buffer when using I/O getline?

What should we do if we don't know the size of the longest line in the input file?

No, we don't have to use large enough buffer. Just be a bit more specific
when checking the status of the stream after 'getline'. If 'failbit' is
set, clear it and try reading again. You can probably find out how many
chars were actually read by interrogating the stream buffer about its
current position before and after the read operation.
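
One way this "clear and retry" advice might look in code is sketched below. It is only an illustration under stated assumptions: read_line_chunked and split_lines_chunked are hypothetical names, the buffer must hold at least 2 chars, and gcount() is used (rather than stream positions) to recognize the "buffer filled" case.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one logical line of any length through a
// fixed-size buffer. Returns false once nothing more can be extracted.
bool read_line_chunked(std::istream& in, std::string& line, std::size_t buf_size) {
    assert(buf_size >= 2);  // a 1-char buffer could never make progress
    std::vector<char> buf(buf_size);
    line.clear();
    bool extracted_any = false;
    for (;;) {
        buf[0] = '\0';  // keep the buffer a valid empty string if the read fails
        in.getline(buf.data(), static_cast<std::streamsize>(buf.size()));
        if (in.gcount() > 0) extracted_any = true;
        line += buf.data();
        // failbit together with a full buffer means "more of this line remains"
        const bool buffer_full =
            in.fail() && !in.bad() &&
            static_cast<std::size_t>(in.gcount()) == buf_size - 1;
        if (!buffer_full) break;
        in.clear(in.rdstate() & ~std::ios_base::failbit);  // drop failbit only
    }
    return extracted_any;
}

// Convenience wrapper for trying the helper on an in-memory text.
std::vector<std::string> split_lines_chunked(const std::string& text, std::size_t buf_size) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (read_line_chunked(in, line, buf_size)) lines.push_back(line);
    return lines;
}
```

With a 4-char buffer, split_lines_chunked("1234567890\n\nXY", 4) yields the three lines "1234567890", "" and "XY", the same result a large buffer gives.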

V
 
Duane Hebert

So, must we use a big enough buffer when using I/O getline?

What should we do if we don't know the size of the longest line in the input
file?

Use std::string.

Something like (it probably needs more error checking, but...):

std::ifstream infile("spoo.txt");
if(!infile.is_open()) { std::cerr << "open failed" << std::endl; return ERROR; }
std::string buffer;
while(std::getline(infile,buffer,'\n')) {
// use the buffer...
}
// the loop also ends at end of file, so test for a real I/O error afterwards
if(infile.bad()) { std::cerr << "read failed" << std::endl; return ERROR; }
 
Dietmar Kuehl

Victor said:
No, we don't have to use large enough buffer. Just be a bit more specific
when checking the status of the stream after 'getline'. If 'failbit' is
set, clear it and try reading again. You can probably find out how many
chars were actually read by interrogating the stream buffer about its
current position before and after the read operation.

You can use 'std::istream's member 'gcount()' to find out how many
'char's were read by the last unformatted input operation (and
the fixes in TC1 make it pretty specific what unformatted input
is). However, it is much simpler to use an 'std::string' for
reading lines:

/**/ for (std::string line; std::getline(in, line); )
/**/ ...;
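
Spelled out a little further, the idiom above could be wrapped as in the following sketch. The function name read_lines is hypothetical, and the assert only mirrors the style of checking used elsewhere in this thread; real code would presumably report the error instead.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every line of a stream. std::string grows
// as needed, so no buffer size has to be guessed in advance.
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); )
        lines.push_back(line);
    assert(!in.bad());  // the loop ends at EOF or on error; rule out the latter
    return lines;
}
```

For example, passing a std::istringstream holding "1234567890\nABCDEGHIJKL\nXYZUWT\n" returns those three lines without their newline characters.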
 
Alex Vinokur

Dietmar Kuehl said:
You can use 'std::istream's member 'gcount()' to find out how many
'char's were read by the last unformatted input operation (and
the fixes in TC1 make it pretty specific what unformatted input is).

Something like:

====== foo.cpp ======
#include <cassert>
#include <iostream>
#include <fstream>
using namespace std;

void read_using_io_getline (ifstream& infile_io, int buf_size)
{
char buffer[buf_size];

// --------------------------
infile_io.clear ();
infile_io.seekg (0, ios::beg);

cout << endl;
cout << "Start: sizeof(buffer) = " << sizeof(buffer) << endl;
cout << "Start: rdstate = " << infile_io.rdstate() << endl;

while (infile_io.getline (buffer, sizeof(buffer)).gcount())
{
assert (!infile_io.bad());
cout << buffer;

if (infile_io.fail())
{
infile_io.clear (~(ios_base::failbit | ~infile_io.rdstate ()));
}
else
{
cout << '\n';
}

}
cout << "Finish: rdstate = " << infile_io.rdstate() << endl;

}


int main()
{
ifstream infile ("foo.in");
assert (infile.is_open());

cout << infile.rdbuf() << endl;

read_using_io_getline (infile, 4);
read_using_io_getline (infile, 512);

infile.close();
assert (!infile.is_open());

return 0;

}
=====================


====== Run ======

9 8 7 6 5 4
1234567890
ABCDEGHIJ
XYZUWTPR
MNKSQVU
abcdeg
xyzuw
mnks
opr
st
k

goodbye


Start: sizeof(buffer) = 4
Start: rdstate = 0
9 8 7 6 5 4
1234567890
ABCDEGHIJ
XYZUWTPR
MNKSQVU
abcdeg
xyzuw
mnks
opr
st
k

goodbye
Finish: rdstate = 6

Start: sizeof(buffer) = 512
Start: rdstate = 0
9 8 7 6 5 4
1234567890
ABCDEGHIJ
XYZUWTPR
MNKSQVU
abcdeg
xyzuw
mnks
opr
st
k

goodbye
Finish: rdstate = 6

=================

However, it is much simpler to use an 'std::string' for
reading lines:

/**/ for (std::string line; std::getline(in, line); )
/**/ ...;

Of course.
But I am comparing various methods of copying files.
So, I need both the getline() function and the I/O getline method.
 
Victor Bazarov

Alex said:
[..]
Something like:

====== foo.cpp ======
#include <cassert>
#include <iostream>
#include <fstream>
using namespace std;

void read_using_io_getline (ifstream& infile_io, int buf_size)
{
char buffer[buf_size];
^^^^^^^^^^^^^^^^^^^^^^
This is not C++.
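
An array whose size is only known at run time is a compiler extension, not standard C++. A possible standard replacement is std::vector<char>, sketched here; the function name first_chunk is hypothetical and the function only performs a single read to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>  // only needed for the in-memory usage mentioned below
#include <string>
#include <vector>

// Hypothetical helper: same parameter as the quoted function, but the
// buffer is a std::vector<char> sized at run time instead of an array.
std::string first_chunk(std::istream& in, int buf_size) {
    assert(buf_size > 0);
    std::vector<char> buffer(static_cast<std::size_t>(buf_size));  // zero-initialized
    in.getline(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    return std::string(buffer.data());
}
```

The vector frees its storage automatically, and buffer.size() takes the place of sizeof(buffer). Given a std::istringstream holding "hello\n", a buf_size of 16 returns "hello" and a buf_size of 4 returns "hel".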

V
 
Alex Vinokur

Alex Vinokur said:
Something like:
[snip]

Here is an updated version that handles a final line with or without a trailing '\n'.

------ read_using_io_getline ------
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <fstream>
using namespace std;

string read_using_io_getline (ifstream& infile_io, int buf_size)
{
char* buffer = new (nothrow) char [buf_size];
assert (!(buffer == NULL));

const ios::iostate prev_state (infile_io.rdstate());
const ios::pos_type prev_pos (infile_io.tellg());

// --------------------------
infile_io.clear ();
infile_io.seekg (0, ios::beg);

ostringstream oss;
while (infile_io.getline (buffer, buf_size).gcount())
{
assert (!infile_io.bad());
oss << buffer;

if (infile_io.fail()) infile_io.clear (~(ios_base::failbit | ~infile_io.rdstate ()));
else oss << '\n';
}

string ret_str(oss.str());
if (ret_str.size() > 1)
{
infile_io.rdbuf()->sungetc ();
if (infile_io.rdbuf()->sgetc() != '\n') ret_str.erase(ret_str.size() - 1);
}

// ---------------------------
infile_io.clear(prev_state);
infile_io.seekg(prev_pos, ios::beg);

assert (prev_state == infile_io.rdstate());
assert (prev_pos == infile_io.tellg());
// ---------------------------

return ret_str;

}


int main(int argc, char** argv)
{
cout << "YOUR COMMAND LINE: ";
for (int i = 0; i < argc; i++)
{
cout << argv[i] << " ";
}
cout << endl;
cout << endl;

for (int i = 1; i < argc; i++)
{
cout << endl;
cout << "--- File-" << i << " : " << argv[i] << " ---" << endl;
cout << endl;


ifstream infile (argv[i]);
assert (infile.is_open());

cout << "Source data file: " << endl;
cout << "<" << infile.rdbuf() << ">" << endl;
cout << endl;

int cur_buf_size;

cur_buf_size = 4;
cout << endl;
cout << "Start : buf_size = " << cur_buf_size << endl;
cout << "<" << read_using_io_getline (infile, cur_buf_size) << ">" << endl;
cout << "Finish: buf_size = " << cur_buf_size << endl << endl;

cur_buf_size = 256;
cout << endl;
cout << "Start : buf_size = " << cur_buf_size << endl;
cout << "<" << read_using_io_getline (infile, cur_buf_size) << ">" << endl;
cout << "Finish: buf_size = " << cur_buf_size << endl << endl;

infile.close();
assert (!infile.is_open());

cout << endl;
}

return 0;

}
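
As an alternative to stepping back in the stream buffer, the final-newline question can probably be answered from the stream state alone: when a read stops at end of file before meeting '\n', eofbit is set and failbit is not. The sketch below relies on that. The names copy_with_getline and round_trips are hypothetical, the buffer is a std::vector<char> rather than new[], and the buffer size is assumed to be at least 2.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: copies a stream through a fixed-size buffer using
// the member getline, reproducing the input including a missing final '\n'.
std::string copy_with_getline(std::istream& in, std::size_t buf_size) {
    assert(buf_size >= 2);
    std::string out;
    std::vector<char> buf(buf_size);
    for (;;) {
        buf[0] = '\0';
        in.getline(buf.data(), static_cast<std::streamsize>(buf.size()));
        if (in.gcount() == 0) break;  // nothing extracted: EOF or a real error
        out += buf.data();
        if (in.fail()) {
            // Buffer filled before the delimiter: keep going on the same line.
            in.clear(in.rdstate() & ~std::ios_base::failbit);
        } else if (!in.eof()) {
            out += '\n';  // the delimiter was extracted, so put it back in the copy
        }
        // else: the read ended at EOF without '\n', so none is appended
    }
    return out;
}

// Checks that copying an in-memory text gives back exactly the same text.
bool round_trips(const std::string& text, std::size_t buf_size) {
    std::istringstream in(text);
    return copy_with_getline(in, buf_size) == text;
}
```

Unlike the version above, this sketch leaves the stream at end of file; saving and restoring the state and position would work the same way as in read_using_io_getline.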
 
Alex Vinokur

Alex Vinokur said:
Here is an updated version that handles a final line with or without a trailing '\n'.

------ read_using_io_getline ------
[snip]
string ret_str(oss.str());
-------------------------------
if (ret_str.size() > 1)
// Should be
if (!ret_str.empty())
-------------------------------
{
infile_io.rdbuf()->sungetc ();
if (infile_io.rdbuf()->sgetc() != '\n') ret_str.erase(ret_str.size() - 1);
}

[snip]
delete[] buffer;
return ret_str;

[snip]
 
