reading a file into std::string

arnuld

I want to read a file into a std::string. I am basically a C programmer, so
it was quite hard for me to work out how to do it in C++. I did C++ a
long time back (I don't know if you guys remember my name, but I do remember
Shiva, Victor Bazarov and others).

I googled for it and this is the best I could come up with. Do you
have any suggestions for improvement? And is this really a
correct C++ program? (Compiled with "g++ -ansi -pedantic -Wall -Wextra")



#include <iostream>
#include <fstream>
#include <string>


int main()
{
    std::string my_contents, tmp_contents;
    std::ifstream my_file("reference.cpp");
    if(!my_file)
    {
        std::cerr << "Error Opening file" << std::endl;
        exit(EXIT_FAILURE);
    }

    while(my_file)
    {
        std::getline(my_file, tmp_contents);
        my_contents += tmp_contents;
        my_contents += "\n";
    }

    std::cout << "String contents are: "<< "\n"
              << my_contents << std::endl;

    my_file.close();

    return 0;
}





-- arnuld
www.LispMachine.Wordpress.com
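
A note on the loop above: `while(my_file)` tests the stream before each read rather than after it, so the body still runs once after the final std::getline has failed. For a file that ends in a newline, this leaves one spare "\n" at the end of my_contents. A minimal sketch of the usual fix, with the read itself as the loop condition (read_all_lines is a hypothetical helper name, not something from the post):

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads every line from `in` and joins them with '\n'.
// std::getline returns the stream, so the loop body only runs when a line
// was actually extracted.
std::string read_all_lines(std::istream& in)
{
    std::string contents, line;
    while (std::getline(in, line))
    {
        contents += line;
        contents += '\n';
    }
    return contents;
}
```

It takes any std::istream, so an opened std::ifstream can be passed in directly.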
 
Marc

arnuld said:
I want to read a file into a std::string. I am basically a C programmer, so
it was quite hard for me to work out how to do it in C++. I did C++ a
long time back (I don't know if you guys remember my name, but I do remember
Shiva, Victor Bazarov and others). [...]
    std::string my_contents, tmp_contents;
    std::ifstream my_file("reference.cpp");
    if(!my_file)
    {
        std::cerr << "Error Opening file" << std::endl;
        exit(EXIT_FAILURE);
    }

    while(my_file)
    {
        std::getline(my_file, tmp_contents);
        my_contents += tmp_contents;
        my_contents += "\n";
    }

You might as well use std::getline(my_file, my_contents, '\0'),
assuming there is no null character in your file. Not that it is a
good solution, but at least you don't have a loop.
 
red floyd

… Or, here's a slightly more efficient version.

#include <iostream>
#include <fstream>
#include <string>
#include <iterator>
#include <algorithm>

int main()
{
    std::string my_contents;

    // Copy every character from the file's stream buffer onto the end of the string.
    std::copy(std::istreambuf_iterator<char>(std::ifstream("reference.cpp").rdbuf()),
              std::istreambuf_iterator<char>(),
              std::back_insert_iterator<std::string>(my_contents));

    return 0;
}


Avoid the copy.


#include <iostream>
#include <fstream>
#include <string>
#include <iterator>

int main()
{
    // note: extra parens on the constructor args to
    // avoid potential "Most Vexing Parse" issues
    std::string my_contents(
        std::istreambuf_iterator<char>(
            (std::ifstream("reference.cpp").rdbuf())),
        (std::istreambuf_iterator<char>()));
}
 
arnuld

Avoid the copy.


#include <iostream>
#include <fstream>
#include <string>
#include <iterator>

int main()
{
    // note: extra parens on the constructor args to
    // avoid potential "Most Vexing Parse" issues
    std::string my_contents(
        std::istreambuf_iterator<char>(
            (std::ifstream("reference.cpp").rdbuf())),
        (std::istreambuf_iterator<char>()));
}


Can you please tell me in brief how it works? Second, you said "Avoid the
copy", so your program does not copy?

All I can see is that it's a template. What about error checking, i.e. whether
the file was opened successfully or not? We are not even closing the file
after we finish. I have Stroustrup 3/e in my hands, checking out what rdbuf()
does in section 21.6.3, page 644.
 
red floyd

Can you please tell me in brief how it works. 2nd, you said "Avoid the
copy", so your program does not copy ?

The example I was referring to had constructed an empty string, and then
copied the file into it.

My example creates the string with the contents of the file, by using
a constructor that takes two iterators.

In actual practice, there won't be much difference, but one-upmanship
sometimes comes into play on this newsgroup!

arnuld said:
All I can see is that it's a template. What about error checking, i.e. whether
the file was opened successfully or not? We are not even closing the file
after we finish. I have Stroustrup 3/e in my hands, checking out what rdbuf()
does in section 21.6.3, page 644.

It is "template" code. Error checking, etc... is left as an exercise
for the reader.

rdbuf() returns the streambuf underlying the fstream.
 
arnuld

My example creates the string with the contents of the file, by using a
constructor that takes two iterators.

In the Strings chapter, section 20.3.4, I don't see any constructor which
takes 2 arguments. If I go to section 16.3.4 then it does have a constructor
taking two iterators, but that is std::vector, not std::string.

It is "template" code. Error checking, etc... is left as an exercise
for the reader.

you mean this code will be real-life code based on the ideas you have
given me:

#include <iostream>
#include <fstream>
#include <string>


int main()
{
    std::ifstream my_file("reference.cpp");
    if(!my_file)
    {
        std::cerr << "Error Opening file" << std::endl;
        exit(EXIT_FAILURE);
    }

    std::string my_contents(std::istreambuf_iterator<char>(my_file.rdbuf()),
                            (std::istreambuf_iterator<char>()));

    std::cout << "String contents are: "<< "\n"
              << my_contents << std::endl;

    my_file.close();

    return 0;
}

as usual compiled with "gcc -ansi -pedantic -Wall -Wextra" and it
compiles and runs fine.
 
Juha Nieminen

Sam said:
Your program is basically a C program. Here's a C++ program.

I think the std::getline() with a null line terminator is a much simpler
and better solution.
 
Marc

Juha said:
I think the std::getline() with a null line terminator is a much simpler
and better solution.

ostream::operator<<(streambuf*) has the advantage that it doesn't need
to check characters one by one. Of all the solutions in the thread, it
looks like the one most likely to perform batch copies.

(of course anything C++ will be much slower than a version with mmap)
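
For reference, a minimal sketch of the operator<<(streambuf*) approach mentioned above. The helper name slurp is hypothetical, and whether this really performs batch copies depends on the library implementation:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: streams the whole underlying buffer of `in` into an
// ostringstream and returns the accumulated text.
std::string slurp(std::istream& in)
{
    std::ostringstream out;
    out << in.rdbuf();   // inserts characters until the buffer is exhausted
    return out.str();
}
```

With a file, this would be called on an opened std::ifstream after checking that the open succeeded.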
 
arnuld

I think the std::getline() with a null line terminator is a much
simpler and better solution.

Stroustrup, section 20.3.15, page 598

The getline() function reads a line terminated by eol into its
string, expanding the string as needed to hold the line. If no eol
argument is provided, a newline '\n' is used as the delimiter. The line
terminator is removed from the stream but not entered into string.

where string = 2nd argument, eol = 3rd argument

Now my point is: when you give '\0' (is it called null or NULL?) as the 3rd
argument, and it is not present in the input stream, what will the
behavior be?

(1) Will getline() keep looking for it until it reaches EOF (End of
File) and read the whole file into the string?

(2) If I give anything else which is not present in the input, e.g. '#', will
it behave the same?
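
Both questions can be checked with a small experiment. std::getline stops at the delimiter or at end of input, whichever comes first, so a delimiter that never occurs ('\0', '#', or anything else) makes it read everything. A sketch, where read_until is a hypothetical helper name:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads from `in` up to (not including) `delim`,
// or up to end of input if `delim` never appears.
std::string read_until(std::istream& in, char delim)
{
    std::string result;
    std::getline(in, result, delim);
    return result;
}
```

When the delimiter is absent, the stream ends up with eofbit set; failbit stays clear as long as at least one character was read.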
 
red floyd

In the Strings chapter, section 20.3.4, I don't see any constructor which
takes 2 arguments. If I go to section 16.3.4 then it does have a constructor
taking two iterators, but that is std::vector, not std::string.

ISO/IEC 14882:2003, sections 21.3/6 [lib.basic.string] and
21.3.1/14 [lib.string.cons]

template<class InputIterator>
basic_string(InputIterator begin, InputIterator end,
const Allocator& a = Allocator());
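
A short sketch of that constructor in use. The third parameter has a default argument, so passing just the two iterators is enough. The function names here are hypothetical, chosen only for illustration:

```cpp
#include <istream>
#include <iterator>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Two arguments: the allocator parameter takes its default value.
std::string from_vector(const std::vector<char>& v)
{
    return std::string(v.begin(), v.end());
}

// The same call with the default allocator spelled out explicitly.
std::string from_vector_explicit(const std::vector<char>& v)
{
    return std::string(v.begin(), v.end(), std::allocator<char>());
}

// Any input iterator pair works, including stream buffer iterators.
// Named iterator variables also sidestep the "Most Vexing Parse".
std::string from_stream(std::istream& in)
{
    std::istreambuf_iterator<char> first(in), last;
    return std::string(first, last);
}
```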
 
red floyd

you mean this code will be real-life code based on the ideas you have
given me:

#include<iostream>
#include<fstream>
#include<string>


int main()
{
    std::ifstream my_file("reference.cpp");
    if(!my_file)
    {
        std::cerr<< "Error Opening file"<< std::endl;
        exit(EXIT_FAILURE);
    }

    std::string my_contents(std::istreambuf_iterator<char>(my_file.rdbuf()),
                            (std::istreambuf_iterator<char>()));

    std::cout<< "String contents are:"<< "\n"
             << my_contents<< std::endl;

    my_file.close();

    return 0;
}

Just an FYI, the my_file.close() is not required, as the ifstream
destructor will close it for you.
 
red floyd

you mean this code will be real-life code based on the ideas you have
given me:

#include<iostream>
#include<fstream>
#include<string>


int main()
{
    std::ifstream my_file("reference.cpp");
    if(!my_file)
    {
        std::cerr<< "Error Opening file"<< std::endl;
        exit(EXIT_FAILURE);
    }

    std::string my_contents(std::istreambuf_iterator<char>(my_file.rdbuf()),
                            (std::istreambuf_iterator<char>()));

    std::cout<< "String contents are:"<< "\n"
             << my_contents<< std::endl;

    my_file.close();

    return 0;
}

as usual compiled with "gcc -ansi -pedantic -Wall -Wextra" and it
compiles and runs fine.

Technically, you need to #include <cstdlib> to define EXIT_FAILURE.

In actual practice, obviously one of <iostream>, <fstream>, or <string>
is including it.

Also, I believe std::endl is defined in <ostream>. In C++11 <iostream>
covers that, but in C++03, <iostream> does not officially include
<ostream>, though as far as I know, all known C++ compilers do the
nested include.
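
Pulling the header remarks together, a sketch of the same program with every header named explicitly, including <iterator> for std::istreambuf_iterator, and with the explicit close() dropped. The split into read_file and print_file is my own arrangement for illustration, not code from the thread:

```cpp
#include <cstdlib>    // EXIT_FAILURE, EXIT_SUCCESS
#include <fstream>    // std::ifstream
#include <iostream>   // std::cout, std::cerr
#include <iterator>   // std::istreambuf_iterator
#include <ostream>    // std::endl (needed explicitly in C++03)
#include <string>

// Hypothetical helper: returns false if the file cannot be opened.
bool read_file(const char* path, std::string& out)
{
    std::ifstream in(path);
    if (!in)
        return false;
    out.assign(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
    return true;
}   // `in` is closed here by the ifstream destructor

// Hypothetical helper: prints the file, returning a value usable from main().
int print_file(const char* path)
{
    std::string contents;
    if (!read_file(path, contents))
    {
        std::cerr << "Error opening file " << path << std::endl;
        return EXIT_FAILURE;
    }
    std::cout << "String contents are:\n" << contents << std::endl;
    return EXIT_SUCCESS;
}
```

main() can then be a single line: return print_file("reference.cpp");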
 
Juha Nieminen

Marc said:
(of course anything C++ will be much slower than a version with mmap)

For a more portable solution, the fastest method would probably be to
resolve the size of the input file (I think using fseek and ftell is
portable enough for this, although I don't know if it's technically a
100% portable method, but it does work in most systems I know of), then
allocate a string of that size and dump the entire file into it using
fread (which is faster than std::istream::read in most systems).

OTOH I don't remember now if the standard guarantees that std::string
will allocate a contiguous block of memory. Probably not. That would be
a problem. You could use std::vector<char> which is guaranteed to be
contiguous, but then you don't have the std::string functions to operate
on it. However, std::vector<char> is often enough for most purposes.
(Assigning a std::string from the std::vector<char> would be inefficient
because it would temporarily double the memory requirement, and there
would be basically useless copying of all the data.)
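
A rough sketch of the fseek/ftell/fread idea described above, reading into a std::vector<char>. The function name is hypothetical. As noted, seeking to the end of a binary stream to find its size is not strictly guaranteed by the C standard, so treat this as an assumption that holds on common systems rather than a portable guarantee:

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: returns true on success; `out` then holds the bytes.
bool read_file_fread(const char* path, std::vector<char>& out)
{
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp)
        return false;

    bool ok = false;
    if (std::fseek(fp, 0, SEEK_END) == 0)
    {
        const long size = std::ftell(fp);   // file size in bytes, or -1 on error
        if (size >= 0 && std::fseek(fp, 0, SEEK_SET) == 0)
        {
            out.resize(static_cast<std::vector<char>::size_type>(size));
            std::size_t got = 0;
            if (size > 0)                   // &out[0] is only valid when non-empty
                got = std::fread(&out[0], 1, out.size(), fp);
            ok = (got == out.size());
        }
    }
    std::fclose(fp);
    return ok;
}
```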
 
Marc

Juha said:
OTOH I don't remember now if the standard guarantees that std::string
will allocate a contiguous block of memory. Probably not. That would be
a problem.

C++0X does. I can't find such a guarantee in C++03, but it is probably
ok to assume it is there...
 
red floyd

C++0X does. I can't find such a guarantee in C++03, but it is probably
ok to assume it is there...

It's not guaranteed in C++03, but I believe all current implementations
do so.
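
If contiguity is assumed (guaranteed from C++11 on, and per the above true of implementations in practice before that), the file can be read straight into the string's own storage. A sketch under that assumption, with a hypothetical function name, using the stream's seekg/tellg to find the size:

```cpp
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper. ASSUMPTION: std::string storage is contiguous, so
// &out[0] can be used as a buffer of out.size() characters.
bool read_file_sized(const char* path, std::string& out)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in)
        return false;

    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();   // -1 if the position is unavailable
    if (!in || size < 0)
        return false;
    in.seekg(0, std::ios::beg);

    out.resize(static_cast<std::string::size_type>(size));
    if (size > 0)
        in.read(&out[0], static_cast<std::streamsize>(size));
    return !in.fail();   // read() sets failbit if fewer characters arrived
}
```

The file is opened in binary mode so that the size reported by tellg matches the number of characters read.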
 
arnuld

ISO/IEC 14882:2003, sections 21.3/6 [lib.basic.string] and
21.3.1/14 [lib.string.cons]
template<class InputIterator>
basic_string(InputIterator begin, InputIterator end,
const Allocator& a = Allocator());

I have n3242.pdf and in section 21.3 (page 624) I can't find this (maybe
template syntax is too alien for me). Second, even if it's there, we used 2
arguments, unlike the 3 mentioned here. I assume the third will have some
default value.
 
arnuld

Technically, you need to #include <cstdlib> to define EXIT_FAILURE.

I wanted to do it the C++ way, i.e. however C++ defines exit for failure/success.

In actual practice, obviously one of <iostream>, <fstream>, or <string>
is including it.

Also, I believe std::endl is defined in <ostream>. In C++11 <iostream>
covers that, but in C++03, <iostream> does not officially include
<ostream>, though as far as I know, all known C++ compilers do the
nested include.

That is strange. I thought both <istream> and <ostream> were children of
parent class <iostream> and they will inherit std::endl from <iostream>.
 
Bo Persson

arnuld said:
ISO/IEC 14882:2003, sections 21.3/6 [lib.basic.string] and
21.3.1/14 [lib.string.cons]
template<class InputIterator>
basic_string(InputIterator begin, InputIterator end,
const Allocator& a = Allocator());

I have n3242.pdf and in section 21.3 (page 624) I can't find this
(maybe template syntax is too alien for me). Second, even if it's there,
we used 2 arguments, unlike the 3 mentioned here. I assume the third will
have some default value.

It is a member (constructor). See page 629 and bottom of page 634.



Bo Persson
 
red floyd

ISO/IEC 14882:2003, sections 21.3/6 [lib.basic.string] and
21.3.1/14 [lib.string.cons]
template<class InputIterator>
basic_string(InputIterator begin, InputIterator end,
const Allocator& a = Allocator());

I have n3242.pdf and in section 21.3 (page 624) I can't find this (maybe
template syntax is too alien for me). Second, even if it's there, we used 2
arguments, unlike the 3 mentioned here. I assume the third will have some
default value.

The section numbering of n3242 (C++11 draft) differs from the official
2003 Standard.
 
Nobody

That is strange. I thought both <istream> and <ostream> were children of
parent class <iostream> and they will inherit std::endl from <iostream>.

<istream>, <ostream> and <iostream> are headers, not classes, although
classes with those names exist.

Also, std::endl() is a function, not a method.
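
To illustrate the last point: since std::endl is a free function template taking the stream as its argument, it can be called directly as well as inserted with <<. A small sketch (the two function names are hypothetical):

```cpp
#include <ostream>
#include <sstream>
#include <string>

// The usual spelling: operator<< calls the manipulator on the stream.
std::string endl_via_operator()
{
    std::ostringstream out;
    out << "abc" << std::endl;
    return out.str();
}

// The same thing written as an ordinary function call.
std::string endl_via_call()
{
    std::ostringstream out;
    out << "abc";
    std::endl(out);   // writes '\n' and flushes, exactly as above
    return out.str();
}
```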
 
