making an istream from a char array

J

John Salmon

I'm working with two libraries, one written
in old school C, that returns a very large
chunk of data in the form of a C-style,
NUL-terminated string.

The other written in a more modern C++
is a parser for the chunk of bytes returned by
the first. It expects a reference to a
std::istream as its argument.

The chunk of data is very large.
I'd like to feed the output of the first to
the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

My attempts to create an istringstream from the
chunk of data all seem to at least double the
amount of VM used. Here's a short program demonstrating
what I've tried. Is there any way to get "inside"
the istringstream and tell it to use the 'chunk'
directly, rather than insisting on making a copy?

Thanks,
John Salmon

[jsalmon@river c++]$ cat chararraytostream.cpp
#include <string>
#include <sstream>
#include <cstdlib>
#include <cstring>
#include <cstdio>
using namespace std;

char *getLotsOfBytes();
istream& streamParser(istream &s);
void linuxChkMem(const char *msg);

void withImplicitString(){
linuxChkMem("Before getLotsOfBytes: ");
char *chunk = getLotsOfBytes();
linuxChkMem("After getLotsOfBytes():");
{
istringstream iss(chunk);
linuxChkMem("After iss(p): ");
streamParser(iss);
linuxChkMem("After streamParser(iss): ");
}
linuxChkMem("After iss goes out of scope: ");
free(chunk);
linuxChkMem("After free(p): ");
}

void withExplicitString(){
linuxChkMem("Before getLotsOfBytes: ");
char *chunk = getLotsOfBytes();
linuxChkMem("After getLotsOfBytes():");
{
string s(chunk);
linuxChkMem("After s(chunk): ");
free(chunk);
linuxChkMem("After free(p): ");
istringstream iss(s);
linuxChkMem("After iss(s): ");
streamParser(iss);
linuxChkMem("After streamParser(iss): ");
}
linuxChkMem("After iss goes out of scope: ");
}

int main(int argc, char **argv){
printf("with an implicit string constructor\n");
withImplicitString();
printf("\nwith an explicit string constructor\n");
withExplicitString();
return 0;
}

// On linux, tell us how much data space we're using
// in the VM.
void linuxChkMem(const char *msg){
printf("%s", msg);
fflush(stdout);
char cmd[50];
sprintf(cmd, "grep VmData /proc/%d/status", getpid());
system(cmd);
}

static const int SZ = 100*1024*1024;
// A rough approximation to getLotsOfBytes. In the
// real application, getLotsOfBytes has these characteristics:
// - it returns a malloced pointer to a NUL-terminated array of chars.
// - it is out of my control. E.g., I can't rewrite it in a way
// that might be more friendly to C++ streams.
char *getLotsOfBytes(){
char *p = (char *)malloc(SZ);
memset(p, ' ', SZ);
strcpy(p+SZ-50, "3.1415 2.718 1.414");
return p;
}

// A rough approximation to streamParser. In the real
// application, streamParser takes a ref to an istream
// and does what it does. Again, I can't easily redefine
// the interface.
istream& streamParser(istream& s){
double x, y, z;
s >> x >> y >> z;
printf("x: %f y: %f z: %f\n", x, y, z);
return s;
}

[jsalmon@river c++]$ g++ -O3 chararraytostream.cpp
[jsalmon@river c++]$ a.out
with an implicit string constructor
Before getLotsOfBytes: VmData: 40 kB
After getLotsOfBytes():VmData: 102444 kB
After iss(p): VmData: 204848 kB
x: 3.141500 y: 2.718000 z: 1.414000
After streamParser(iss): VmData: 204980 kB
After iss goes out of scope: VmData: 102576 kB
After free(p): VmData: 172 kB

with an explicit string constructor
Before getLotsOfBytes: VmData: 172 kB
After getLotsOfBytes():VmData: 102576 kB
After s(chunk): VmData: 204980 kB
After free(p): VmData: 102576 kB
After iss(s): VmData: 204980 kB
x: 3.141500 y: 2.718000 z: 1.414000
After streamParser(iss): VmData: 204980 kB
After iss goes out of scope: VmData: 172 kB
[jsalmon@river c++]$
 
D

Denise Kleingeist

Hello John!
John said:
My attempts to create an istringstream from the
chunk of data all seem to at least double the
amount of VM used.

std::istringstream takes a std::string. For creating this
std::string from a char array, a copy is created. This copy
is then copied into the std::istringstream. For this purpose,
you probably don't want to use an std::istringstream. Instead,
you could use a simple homegrown stream buffer (code see
below).

Good luck, Denise!
--- CUT HERE ---
#include <istream>
#include <iostream>
#include <streambuf>
#include <string>
#include <string.h>

struct membuf:
std::streambuf
{
membuf(char* b, char* e) { this->setg(b, b, e); }
};

int main()
{
char* buffer = get_huge_buffer_with_data();
membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
std::istream in(&sbuf);
for (std::string line; std::getline(in, line); )
std::cout << "line: " << line << "\n";
}
 
G

Gianni Mariani

John said:
I'm working with two libraries, one written
in old school C, that returns a very large
chunk of data in the form of a C-style,
NUL-terminated string.

The other written in a more modern C++
is a parser for the chunk of bytes returned by
the first. It expects a reference to a
std::istream as its argument.

The chunk of data is very large.
I'd like to feed the output of the first to
the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

The "without making a copy" might be a little tricky with istringstream.

I'm no expert on c++ streams but something like this might work.

#include <istream>

class Xistream
: public std::istream,
public std::streambuf
{
public:
Xistream( const char * begin, const char * end )
: std::istream( this )
{
setg( const_cast<char *>(begin), const_cast<char *>(begin),
const_cast<char *>(end) );
}
};

#include <iostream>

int main()
{
const char xx[] = "1 22 33";

Xistream xi( xx, xx + sizeof(xx) -1);

int i;
xi >> i;

std::cout << i << "\n";

xi >> i;

std::cout << i << "\n";

}
 
J

John Salmon

Denise> Hello John!

Denise> std::istringstream takes a std::string. For creating this
Denise> std::string from a char array, a copy is created. This copy
Denise> is then copied into the std::istringstream. For this purpose,
Denise> you probably don't want to use an std::istringstream. Instead,
Denise> you could use a simple homegrown stream buffer (code see
Denise> below).

Denise> Good luck, Denise!
Denise> --- CUT HERE ---
Denise> #include <istream>
Denise> #include <iostream>
Denise> #include <streambuf>
Denise> #include <string>
Denise> #include <string.h>

Denise> struct membuf:
Denise> std::streambuf
Denise> {
Denise> membuf(char* b, char* e) { this->setg(b, b, e); }
Denise> };

Denise> int main()
Denise> {
Denise> char* buffer = get_huge_buffer_with_data();
Denise> membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
Denise> std::istream in(&sbuf);
Denise> for (std::string line; std::getline(in, line); )
Denise> std::cout << "line: " << line << "\n";
Denise> }

Thanks! This is exactly what I needed.

One question - what's the point of the std::find()?

I don't see how std::find(buffer, buffer+strlen(buffer), 0);
could ever be different from buffer+strlen(buffer)??

Cheers,
John Salmon
 
D

Denise Kleingeist

Hello John!
John said:
Denise> membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
One question - what's the point of the std::find()?

I don't see how std::find(buffer, buffer+strlen(buffer), 0);
could ever be different from buffer+strlen(buffer)??

You are right: it is a left over from a discarded attempt to use
std::find() instead of strlen()! Just use buffer + strlen(buffer)
instead.

Sorry for any confusion caused, Denise!
 
P

P.J. Plauger

I'm working with two libraries, one written
in old school C, that returns a very large
chunk of data in the form of a C-style,
NUL-terminated string.

The other written in a more modern C++
is a parser for the chunk of bytes returned by
the first. It expects a reference to a
std::istream as its argument.

The chunk of data is very large.
I'd like to feed the output of the first to
the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

My attempts to create an istringstream from the
chunk of data all seem to at least double the
amount of VM used. Here's a short program demonstrating
what I've tried. Is there any way to get "inside"
the istringstream and tell it to use the 'chunk'
directly, rather than insisting on making a copy?

See the header <strstream>. It does exactly what you want,
and it's part of the C++ Standard (albeit a bit old
fashioned).

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
J

John Salmon

PJ> See the header <strstream>. It does exactly what you want,
PJ> and it's part of the C++ Standard (albeit a bit old
PJ> fashioned).

Thanks to Usenet, I now have two workable solutions.

Googling for strstream turns up lots of warnings that "strstream is
deprecated", with dire warnings that it may be removed from future
versions of the standard. OTOH, an istrstream does exactly what I
want, without any extra custom machinery ( struct membuf : public
streambuf ).

Other than simplicity and possible compatibility with future
standards, is there any reason to prefer one approach over the
other?

Cheers,
John Salmon
 
P

P.J. Plauger

PJ> See the header <strstream>. It does exactly what you want,
PJ> and it's part of the C++ Standard (albeit a bit old
PJ> fashioned).

Thanks to Usenet, I now have two workable solutions.

Googling for strstream turns up lots of warnings that "strstream is
deprecated", with dire warnings that it may be removed from future
versions of the standard. OTOH, an istrstream does exactly what I
want, without any extra custom machinery ( struct membuf : public
streambuf ).

Other than simplicity and possible compatibility with future
standards, is there any reason to prefer one approach over the
other?

You should prefer strstream because:

1) it's exactly what you need

2) it's still part of the C++ Standard

3) there's no reason to believe it'll become nonstandard anytime
soon, despite the dire warnings

4) even if it does officially go away, there's not a sane vendor
who'll stop supporting it for the next decade

So what the hell.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,882
Messages
2,569,948
Members
46,267
Latest member
TECHSCORE

Latest Threads

Top