making an istream from a char array

John Salmon · Dec 30, 2006

I'm working with two libraries. One, written
in old-school C, returns a very large
chunk of data in the form of a C-style,
NUL-terminated string.

The other, written in more modern C++,
is a parser for the chunk of bytes returned by
the first. It expects a reference to a
std::istream as its argument.

The chunk of data is very large.
I'd like to feed the output of the first to
the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

My attempts to create an istringstream from the
chunk of data all seem to at least double the
amount of VM used. Here's a short program demonstrating
what I've tried. Is there any way to get "inside"
the istringstream and tell it to use the 'chunk'
directly, rather than insisting on making a copy?

Thanks,
John Salmon

[jsalmon@river c++]$ cat chararraytostream.cpp
#include <string>
#include <sstream>
#include <cstdlib>
#include <cstring>
#include <cstdio>
#include <unistd.h>  // getpid()
using namespace std;

char *getLotsOfBytes();
istream& streamParser(istream &s);
void linuxChkMem(const char *msg);

void withImplicitString(){
    linuxChkMem("Before getLotsOfBytes: ");
    char *chunk = getLotsOfBytes();
    linuxChkMem("After getLotsOfBytes():");
    {
        istringstream iss(chunk);
        linuxChkMem("After iss(p): ");
        streamParser(iss);
        linuxChkMem("After streamParser(iss): ");
    }
    linuxChkMem("After iss goes out of scope: ");
    free(chunk);
    linuxChkMem("After free(p): ");
}

void withExplicitString(){
    linuxChkMem("Before getLotsOfBytes: ");
    char *chunk = getLotsOfBytes();
    linuxChkMem("After getLotsOfBytes():");
    {
        string s(chunk);
        linuxChkMem("After s(chunk): ");
        free(chunk);
        linuxChkMem("After free(p): ");
        istringstream iss(s);
        linuxChkMem("After iss(s): ");
        streamParser(iss);
        linuxChkMem("After streamParser(iss): ");
    }
    linuxChkMem("After iss goes out of scope: ");
}

int main(int argc, char **argv){
    printf("with an implicit string constructor\n");
    withImplicitString();
    printf("\nwith an explicit string constructor\n");
    withExplicitString();
    return 0;
}

// On linux, tell us how much data space we're using
// in the VM.
void linuxChkMem(const char *msg){
    printf("%s", msg);
    fflush(stdout);
    char cmd[50];
    sprintf(cmd, "grep VmData /proc/%d/status", getpid());
    system(cmd);
}

static const int SZ = 100*1024*1024;
// A rough approximation to getLotsOfBytes. In the
// real application, getLotsOfBytes has these characteristics:
// - it returns a malloced pointer to a NUL-terminated array of chars.
// - it is out of my control. E.g., I can't rewrite it in a way
// that might be more friendly to C++ streams.
char *getLotsOfBytes(){
    char *p = (char *)malloc(SZ);
    memset(p, ' ', SZ);
    strcpy(p+SZ-50, "3.1415 2.718 1.414");
    return p;
}

// A rough approximation to streamParser. In the real
// application, streamParser takes a ref to an istream
// and does what it does. Again, I can't easily redefine
// the interface.
istream& streamParser(istream& s){
    double x, y, z;
    s >> x >> y >> z;
    printf("x: %f y: %f z: %f\n", x, y, z);
    return s;
}

[jsalmon@river c++]$ g++ -O3 chararraytostream.cpp
[jsalmon@river c++]$ a.out
with an implicit string constructor
Before getLotsOfBytes: VmData: 40 kB
After getLotsOfBytes():VmData: 102444 kB
After iss(p): VmData: 204848 kB
x: 3.141500 y: 2.718000 z: 1.414000
After streamParser(iss): VmData: 204980 kB
After iss goes out of scope: VmData: 102576 kB
After free(p): VmData: 172 kB

with an explicit string constructor
Before getLotsOfBytes: VmData: 172 kB
After getLotsOfBytes():VmData: 102576 kB
After s(chunk): VmData: 204980 kB
After free(p): VmData: 102576 kB
After iss(s): VmData: 204980 kB
x: 3.141500 y: 2.718000 z: 1.414000
After streamParser(iss): VmData: 204980 kB
After iss goes out of scope: VmData: 172 kB
[jsalmon@river c++]$

Denise Kleingeist · Dec 30, 2006

Hello John!

John said:
My attempts to create an istringstream from the
chunk of data all seem to at least double the
amount of VM used.

std::istringstream takes a std::string. Creating that
std::string from a char array makes one copy, and the
std::istringstream then copies it again into its own buffer.
For this purpose, you probably don't want a std::istringstream.
Instead, you could use a simple homegrown stream buffer (see
the code below).

Good luck, Denise!
--- CUT HERE ---
#include <algorithm>
#include <istream>
#include <iostream>
#include <streambuf>
#include <string>
#include <string.h>

// Supplied elsewhere: returns the NUL-terminated data.
char* get_huge_buffer_with_data();

// Read-only stream buffer over [b, e); the data is not copied.
struct membuf:
    std::streambuf
{
    membuf(char* b, char* e) { this->setg(b, b, e); }
};

int main()
{
    char* buffer = get_huge_buffer_with_data();
    membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
    std::istream in(&sbuf);
    for (std::string line; std::getline(in, line); )
        std::cout << "line: " << line << "\n";
}

Gianni Mariani · Dec 30, 2006

John said:
I'm working with two libraries. One, written
in old-school C, returns a very large
chunk of data in the form of a C-style,
NUL-terminated string.

The other, written in more modern C++,
is a parser for the chunk of bytes returned by
the first. It expects a reference to a
std::istream as its argument.

The chunk of data is very large.
I'd like to feed the output of the first to
the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

The "without making a copy" might be a little tricky with istringstream.

I'm no expert on C++ streams, but something like this might work.

#include <istream>
#include <streambuf>

class Xistream
    : public std::istream,
      public std::streambuf
{
public:
    Xistream( const char * begin, const char * end )
        : std::istream( this )
    {
        // The get area is typed char*, but the data is only read.
        setg( const_cast<char *>(begin), const_cast<char *>(begin),
              const_cast<char *>(end) );
    }
};

#include <iostream>

int main()
{
    const char xx[] = "1 22 33";

    Xistream xi( xx, xx + sizeof(xx) -1);

    int i;
    xi >> i;

    std::cout << i << "\n";

    xi >> i;

    std::cout << i << "\n";
}

John Salmon · Dec 30, 2006

Denise> Hello John!

Denise> std::istringstream takes a std::string. For creating this
Denise> std::string from a char array, a copy is created. This copy
Denise> is then copied into the std::istringstream. For this purpose,
Denise> you probably don't want to use an std::istringstream. Instead,
Denise> you could use a simple homegrown stream buffer (code see
Denise> below).

Denise> Good luck, Denise!
Denise> --- CUT HERE ---
Denise> #include <istream>
Denise> #include <iostream>
Denise> #include <streambuf>
Denise> #include <string>
Denise> #include <string.h>

Denise> struct membuf:
Denise> std::streambuf
Denise> {
Denise> membuf(char* b, char* e) { this->setg(b, b, e); }
Denise> };

Denise> int main()
Denise> {
Denise> char* buffer = get_huge_buffer_with_data();
Denise> membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
Denise> std::istream in(&sbuf);
Denise> for (std::string line; std::getline(in, line); )
Denise> std::cout << "line: " << line << "\n";
Denise> }

Thanks! This is exactly what I needed.

One question - what's the point of the std::find()?

I don't see how std::find(buffer, buffer+strlen(buffer), 0);
could ever be different from buffer+strlen(buffer)??

Cheers,
John Salmon

Denise Kleingeist · Dec 30, 2006

Hello John!

John said:
Denise> membuf sbuf(buffer, std::find(buffer, buffer + strlen(buffer), 0));
One question - what's the point of the std::find()?

I don't see how std::find(buffer, buffer+strlen(buffer), 0);
could ever be different from buffer+strlen(buffer)??

You are right: it is a leftover from a discarded attempt to use
std::find() instead of strlen()! Just use buffer + strlen(buffer)
instead.

Sorry for any confusion caused, Denise!
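
With that correction applied, the stream-buffer approach might look like
the sketch below. The names sumThree and parseInPlace are hypothetical
stand-ins (they are not from the posts above); sumThree plays the role of
the istream-based parser, and the buffer is taken as const char* on the
assumption that nothing writes through the stream.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <streambuf>

// Read-only stream buffer over an existing character range.
// setg() only records three pointers; no characters are copied.
struct membuf : std::streambuf {
    membuf(char* b, char* e) { setg(b, b, e); }
};

// Hypothetical stand-in for a parser that takes a std::istream&.
double sumThree(std::istream& in) {
    double x = 0, y = 0, z = 0;
    in >> x >> y >> z;
    return x + y + z;
}

// Hypothetical helper: parse a NUL-terminated buffer in place.
double parseInPlace(const char* buffer) {
    // The get area is typed char*, but this sketch never writes to it.
    char* begin = const_cast<char*>(buffer);
    membuf sbuf(begin, begin + std::strlen(begin));
    std::istream in(&sbuf);
    return sumThree(in);
}
```

The buffer must outlive both the membuf and the istream, so in the
original program free(chunk) would have to come after the parser returns.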

P.J. Plauger · Dec 30, 2006

John said:
I'm working with two libraries. One, written
in old-school C, returns a very large
chunk of data in the form of a C-style,
NUL-terminated string.

The other, written in more modern C++,
is a parser for the chunk of bytes returned by
the first. It expects a reference to a
std::istream as its argument.

The chunk of data is very large.
I'd like to feed the output of the first to
the second WITHOUT MAKING AN EXTRA IN-MEMORY COPY.

My attempts to create an istringstream from the
chunk of data all seem to at least double the
amount of VM used. Here's a short program demonstrating
what I've tried. Is there any way to get "inside"
the istringstream and tell it to use the 'chunk'
directly, rather than insisting on making a copy?

See the header <strstream>. It does exactly what you want,
and it's part of the C++ Standard (albeit a bit old
fashioned).

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
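
A minimal sketch of the <strstream> route, assuming a toolchain that still
ships the header (it has been deprecated since C++98 and, as far as I
know, is removed in C++26, so expect at least a compiler warning). The
helper name firstNumber is hypothetical; std::istrstream and its
const char* constructor are the standard parts.

```cpp
#include <cassert>
#include <strstream>  // deprecated header; compilers may warn about it

// Hypothetical helper: read the first number from a NUL-terminated
// buffer. istrstream(const char*) reads the array in place, up to the
// terminating NUL, without copying it.
double firstNumber(const char* buffer) {
    std::istrstream in(buffer);
    double x = 0;
    in >> x;
    return x;
}
```

Because std::istrstream is a std::istream, it can be passed directly to a
parser that expects a std::istream&.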

John Salmon · Dec 30, 2006

PJ> See the header <strstream>. It does exactly what you want,
PJ> and it's part of the C++ Standard (albeit a bit old
PJ> fashioned).

Thanks to Usenet, I now have two workable solutions.

Googling for strstream turns up lots of warnings that "strstream is
deprecated", with dire warnings that it may be removed from future
versions of the standard. OTOH, an istrstream does exactly what I
want, without any extra custom machinery ( struct membuf : public
streambuf ).

Other than simplicity and possible compatibility with future
standards, is there any reason to prefer one approach over the
other?

Cheers,
John Salmon

P.J. Plauger · Dec 30, 2006

John said:
PJ> See the header <strstream>. It does exactly what you want,
PJ> and it's part of the C++ Standard (albeit a bit old
PJ> fashioned).

Thanks to Usenet, I now have two workable solutions.

Googling for strstream turns up lots of warnings that "strstream is
deprecated", with dire warnings that it may be removed from future
versions of the standard. OTOH, an istrstream does exactly what I
want, without any extra custom machinery ( struct membuf : public
streambuf ).

Other than simplicity and possible compatibility with future
standards, is there any reason to prefer one approach over the
other?

You should prefer strstream because:

1) it's exactly what you need

2) it's still part of the C++ Standard

3) there's no reason to believe it'll become nonstandard anytime
soon, despite the dire warnings

4) even if it does officially go away, there's not a sane vendor
who'll stop supporting it for the next decade

So what the hell.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
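
One practical difference between the two approaches, offered as a sketch
rather than as anything stated in the thread: a stream buffer that only
calls setg() inherits the default seekoff() and seekpos(), which report
failure, so tellg() and seekg() do not work on it, whereas the
<strstream> buffer provides its own. If the parser needs to seek, the
homegrown buffer could be extended roughly as follows. The names
seekable_membuf and rereadsAfterRewind are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <ios>
#include <istream>
#include <streambuf>

// Read-only buffer over [b, e) that also supports tellg()/seekg().
struct seekable_membuf : std::streambuf {
    seekable_membuf(char* b, char* e) { setg(b, b, e); }

protected:
    pos_type seekoff(off_type off, std::ios_base::seekdir dir,
                     std::ios_base::openmode which =
                         std::ios_base::in | std::ios_base::out) override {
        // Only the input sequence exists in this buffer.
        if ((which & std::ios_base::in) != std::ios_base::in)
            return pos_type(off_type(-1));
        const off_type size = egptr() - eback();
        off_type from = 0;  // std::ios_base::beg
        if (dir == std::ios_base::cur) from = gptr() - eback();
        else if (dir == std::ios_base::end) from = size;
        const off_type target = from + off;
        if (target < 0 || target > size) return pos_type(off_type(-1));
        setg(eback(), eback() + target, egptr());
        return pos_type(target);
    }

    pos_type seekpos(pos_type pos,
                     std::ios_base::openmode which =
                         std::ios_base::in | std::ios_base::out) override {
        return seekoff(off_type(pos), std::ios_base::beg, which);
    }
};

// Hypothetical demo: read a number, rewind, and read it again.
bool rereadsAfterRewind(const char* text) {
    char* b = const_cast<char*>(text);  // read-only use
    seekable_membuf sbuf(b, b + std::strlen(b));
    std::istream in(&sbuf);
    int first = 0, again = 0;
    in >> first;
    in.seekg(0);
    in >> again;
    return in && first == again;
}
```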
