How can C++ be slower than Perl?

jeffrey.bigham

Hello,

I'm writing a program that needs to read input line by line and analyze
each line, and it needs to be as efficient as possible. I wrote the
following sample program, which works but is 10 times slower than a Perl
equivalent on a large input (~212 MB). There has to be something wrong
- can anyone help me fix it? I've included both at the bottom of my
post, and I compile with g++ -O3 cin_test.C

Thanks!
Jeff

#include <string>
#include <iostream>

using namespace std;

int main() {
    string my_string;
    while(cin) {
        getline(cin, my_string);
    }
}

The Perl "equivalent" that I used (verbatim) is:

while (<>) { print if /someregex/; }
 
Alf P. Steinbach

Your result is not surprising. C++ iostreams are notoriously
inefficient in most (all?) implementations. But the Perl interpreter
is written in C, which is mostly a subset of C++: that means you can,
in principle, do at least as well as the Perl interpreter.

Things to improve performance even with C++ iostreams:

* Use binary i/o (turn off that darned newline conversion).

* Read a larger chunk of the file at a time (larger buffer).

* Read into a statically allocated buffer (like Perl probably does)
  with a limit on line length, instead of std::string -- this
  trades some safety and functionality for efficiency.

However, you can't force std::cin and std::cout to binary mode in a
portable way (that's a design error with iostreams); with C++ iostreams
you'll have to require a named file as input, or use some non-portable
mechanism.
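
A minimal sketch of the "named file, binary mode, larger buffer" combination from the list above, staying within iostreams. The helper name count_lines_ifstream and the 1 MiB default are assumptions made for illustration, and whether pubsetbuf actually honors the request is implementation-defined, so treat this as something to measure rather than a guaranteed speed-up:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: counts the lines of a named file through a
// binary-mode ifstream that has been offered a larger stream buffer.
// Returns 0 if the file cannot be opened.
std::size_t count_lines_ifstream(const std::string& path,
                                 std::size_t buffer_size = 1 << 20) {
    assert(buffer_size > 0);

    // Declared before the stream so that it outlives the stream.
    std::vector<char> buffer(buffer_size);

    std::ifstream in;
    // The effect of pubsetbuf is implementation-defined; with libstdc++
    // it has to be called before the file is opened to take effect.
    in.rdbuf()->pubsetbuf(buffer.data(),
                          static_cast<std::streamsize>(buffer.size()));
    // Binary mode: no newline conversion is performed on input.
    in.open(path, std::ios::in | std::ios::binary);
    if (!in) return 0;

    std::size_t lines = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++lines;  // a real program would analyze `line` here
    }
    return lines;
}
```

Because the file is opened in binary mode, a file with CRLF line endings will leave a trailing '\r' on each line, which the analysis code would have to strip itself.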

So the upshot is that if you want to stay within the standard library,
and have performance and guaranteed results, and have portable code, use
low-level C FILE* (I'm not sure if the Boost library offers something).
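
One way the FILE* approach could look, as a sketch only: the helper name count_lines_stdio and the 64 KiB line limit are assumptions, not anything from the original program. It reads with fgets into a statically allocated buffer and treats an over-long line as a single line delivered in several chunks:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: counts lines on an already-open C stream using
// fgets and a fixed, statically allocated buffer (no std::string).
std::size_t count_lines_stdio(std::FILE* in) {
    assert(in != nullptr);

    static char line[65536];    // assumed upper bound on a "normal" line
    std::size_t lines = 0;
    bool at_line_start = true;  // false while inside an over-long line

    while (std::fgets(line, sizeof line, in)) {
        if (at_line_start) ++lines;
        // A chunk that ends in '\n' completes the current line; otherwise
        // the line was longer than the buffer (or the input ended).
        const std::size_t len = std::strlen(line);
        at_line_start = (len > 0 && line[len - 1] == '\n');
        // A real program would analyze `line` here.
    }
    return lines;
}
```

To read standard input, pass stdin. Calling std::setvbuf(stdin, nullptr, _IOFBF, 1 << 20) before the first read asks the C library for a larger buffer. Note that strlen stops at an embedded NUL byte, so this sketch assumes text input.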
 
Greg Buchholz

// Try this program instead...
#include <string>
#include <iostream>

using namespace std;

int main()
{
    // Don't sync C++ and C I/O...
    ios_base::sync_with_stdio(false);
    string my_string;
    // Loop on getline itself so a failed read ends the loop right away.
    while (getline(cin, my_string)) {
        // analyze my_string here
    }
    return 0;
}
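
How much sync_with_stdio(false) helps depends on the library implementation, so it is worth measuring on the actual input. A rough timing sketch follows; the ReadStats struct and the time_line_reads function are hypothetical names introduced here, not part of any library:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this on in-memory text
#include <string>

// Hypothetical result type for one timed pass over a stream.
struct ReadStats {
    std::size_t lines;    // number of lines read
    std::size_t bytes;    // characters read, not counting newlines
    double      seconds;  // wall-clock time spent in the read loop
};

// Reads `in` line by line with std::getline and reports how long it took.
ReadStats time_line_reads(std::istream& in) {
    using clock = std::chrono::steady_clock;

    ReadStats stats{0, 0, 0.0};
    std::string line;
    const auto start = clock::now();
    while (std::getline(in, line)) {
        ++stats.lines;
        stats.bytes += line.size();
    }
    stats.seconds =
        std::chrono::duration<double>(clock::now() - start).count();
    assert(stats.seconds >= 0.0);  // steady_clock does not go backwards
    return stats;
}
```

Calling time_line_reads(std::cin) from main, once with and once without the sync_with_stdio(false) line, and printing the seconds field to std::cerr gives a direct comparison on the same input file.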
 
