Own filters(?) to streams

Guest

Hi, I wrote a few filters that work on streams of bytes, for example
encryption, UTF-8 decoding and such. Now I wonder how I can turn them
into classes derived from std::stream(?) or otherwise use them
with code that works on std::stream, std::ifstream and such.

The keyword codecvt is probably related.

As for "connecting to std:: streams/strings", I have so far succeeded with
std::string: I wrote my own class UnicodeChar (stored as a full 32
bits), made a class UnicodeString derived from
std::basic_string<UnicodeChar>, added an operator<< for ostream that
encodes to UTF-8, and so on...

But how do I do a similar thing for streams and stringstreams? Should I
only add
operator<<(ostringstream
and such? Is that a good way to do it?

Or does the standard library offer a better way?
 
Dietmar Kuehl

Rafał Maj Raf256 wrote:
[character encoding/decoding by fiddling with streams]
Or does the standard library offer a better way?

Certainly! The first thing to note is that character encoding and
decoding moves between different things: encoding turns characters
into bytes and decoding turns bytes into characters. It is important
to distinguish between bytes and characters: characters do not care
about their encoding and you can investigate characters to determine
their semantics. Bytes on the other hand are encoded characters (in
this context; bytes might represent other stuff, too) which are
useless when taken out of context. The only reasonable thing to do
with them, apart from passing the whole sequence around, of course, is
to decode them and use the resulting characters.

OK, this sets the stage for the 'std::codecvt' facets: these turn
bytes into characters or vice versa. Each file stream, well, actually
the 'std::filebuf' stream buffer, internally uses the code conversion
facets to blockwise encode characters or decode bytes. Unfortunately,
the same mechanism is not readily available for other streams although
Dinkumware's standard library ships with a class which can be used to
do the conversions. It isn't too hard to implement a simple filtering
stream buffer which converts bytes into characters using an
appropriate 'std::codecvt' facet (the aspect which makes the code
conversion stuff pretty complex e.g. for 'std::basic_filebuf' is
support for positioning which is rarely necessary on streams
representing characters). This would be the way to go: create a
filtering stream buffer which internally uses code conversion facets,
e.g. the ones you provide to implement the Unicode encodings. This
filtering stream buffer is then used with stream classes to actually
use the encoding.
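
As a rough illustration of the filtering stream buffer idea, here is a
minimal read-only sketch. The names decoding_buf and decode_all are made
up for this example, the buffer size is arbitrary, and it assumes the
facet is the 'std::codecvt<wchar_t, char, std::mbstate_t>' taken from the
given locale (the classic locale by default). It does not support
writing or seeking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <istream>
#include <iterator>
#include <locale>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical name. Read-only filtering stream buffer: pulls bytes from
// another stream buffer and decodes them with a std::codecvt facet.
class decoding_buf : public std::wstreambuf {
    using facet_type = std::codecvt<wchar_t, char, std::mbstate_t>;
    static constexpr std::size_t buf_size = 64;  // arbitrary

    std::streambuf*   src_;      // underlying byte source, not owned
    std::locale       loc_;      // keeps the facet alive
    const facet_type& cvt_;      // facet taken from loc_
    std::mbstate_t    state_;    // conversion state kept between calls
    char              bytes_[buf_size];
    std::size_t       pending_;  // bytes read but not yet decoded
    wchar_t           chars_[buf_size];

public:
    explicit decoding_buf(std::streambuf* src,
                          const std::locale& loc = std::locale::classic())
        : src_(src), loc_(loc), cvt_(std::use_facet<facet_type>(loc_)),
          state_(), pending_(0) {
        assert(src_ != nullptr);
    }

protected:
    int_type underflow() override {
        if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
        for (;;) {
            // Top up the byte buffer from the underlying source.
            std::streamsize got = src_->sgetn(
                bytes_ + pending_,
                static_cast<std::streamsize>(buf_size - pending_));
            pending_ += static_cast<std::size_t>(got);
            if (pending_ == 0) return traits_type::eof();

            // Decode as many complete characters as possible.
            const char* from_next = bytes_;
            wchar_t*    to_next   = chars_;
            std::codecvt_base::result r =
                cvt_.in(state_, bytes_, bytes_ + pending_, from_next,
                        chars_, chars_ + buf_size, to_next);
            if (r == std::codecvt_base::error && to_next == chars_)
                return traits_type::eof();

            // Keep bytes of an incomplete sequence for the next round.
            std::size_t used = static_cast<std::size_t>(from_next - bytes_);
            std::memmove(bytes_, from_next, pending_ - used);
            pending_ -= used;

            if (to_next != chars_) {
                setg(chars_, chars_, to_next);
                return traits_type::to_int_type(*gptr());
            }
            if (got == 0) return traits_type::eof();  // truncated input
        }
    }
};

// Hypothetical helper: decode everything available from a byte source.
std::wstring decode_all(std::streambuf* bytes) {
    decoding_buf filter(bytes);
    std::wistream in(&filter);
    return std::wstring(std::istreambuf_iterator<wchar_t>(in),
                        std::istreambuf_iterator<wchar_t>());
}
```

The buffer can be handed to any 'std::wistream', so code written against
the stream classes does not need to know where the bytes come from.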
 
Guest

Dietmar said:
[explanation of 'std::codecvt' facets and filtering stream buffers]

Hmm yes, I more or less understand the theory, but I can't find any good
examples or documentation... Could you perhaps write a tiny example of a
filter, one that just reads two bytes A and B into a single char (discards
A and keeps B), and on write first writes 'x' and then the given character?
Or any other example.

Like: "xaxbxc" is read as "abc" and vice versa.
 
Dietmar Kuehl

Rafał Maj Raf256 said:
Hmm yes, I more or less understand the theory, but I can't find any good
examples or documentation...

Concerning examples, this is indeed a little bit tricky. In my standard
library implementation I have a converting buffer implemented (you can
get it following the CXXRT link on my homepage). I'm not sure whether
it is complete, however, and I know that it does not support seeking.
You might also want to have a look at STLport's and/or libstdc++'s
implementations of 'std::basic_filebuf' (I haven't looked at them
myself, though).

With respect to documentation, "The C++ Standard Library" (N.Josuttis;
Addison-Wesley) has some documentation on all facets and "Standard
C++ IOStreams and Locales" (A.Langer, K.Kreft; Addison-Wesley) should
also document them.

Could you perhaps write a tiny example of a
filter, one that just reads two bytes A and B into a single char (discards
A and keeps B), and on write first writes 'x' and then the given character?
Or any other example.

Like: "xaxbxc" is read as "abc" and vice versa.

I thought I could do it on the fly but it turns out to be, well, a
little bit more involved than I remembered...
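
For the 'x' prefix example, one possible starting point is a code
conversion facet derived from 'std::codecvt<char, char, std::mbstate_t>'.
This is only a sketch: xprefix_codecvt, encode and decode are
hypothetical names, and the two helpers call the facet's in() and out()
members directly instead of going through a stream buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical facet: external form is 'x' followed by the character,
// so "abc" is written as "xaxbxc" and "xaxbxc" is read back as "abc".
class xprefix_codecvt : public std::codecvt<char, char, std::mbstate_t> {
protected:
    result do_out(state_type&, const intern_type* from,
                  const intern_type* from_end, const intern_type*& from_next,
                  extern_type* to, extern_type* to_end,
                  extern_type*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_end - to_next >= 2) {
            *to_next++ = 'x';           // prefix byte
            *to_next++ = *from_next++;  // the character itself
        }
        return from_next == from_end ? ok : partial;
    }

    result do_in(state_type&, const extern_type* from,
                 const extern_type* from_end, const extern_type*& from_next,
                 intern_type* to, intern_type* to_end,
                 intern_type*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_end - from_next >= 2 && to_next != to_end) {
            *to_next++ = from_next[1];  // discard first byte, keep second
            from_next += 2;
        }
        // A trailing single byte is reported as an incomplete sequence.
        return from_next == from_end ? ok : partial;
    }

    result do_unshift(state_type&, extern_type* to, extern_type*,
                      extern_type*& to_next) const override {
        to_next = to;  // stateless encoding: nothing to write
        return noconv;
    }

    int do_encoding() const noexcept override { return 2; }  // fixed width
    bool do_always_noconv() const noexcept override { return false; }
    int do_max_length() const noexcept override { return 2; }

    int do_length(state_type&, const extern_type* from,
                  const extern_type* from_end,
                  std::size_t max) const override {
        std::size_t pairs = static_cast<std::size_t>(from_end - from) / 2;
        return static_cast<int>(std::min(pairs, max) * 2);
    }
};

// Hypothetical helper: characters -> bytes.
std::string encode(const std::string& chars) {
    xprefix_codecvt cvt;
    std::mbstate_t state = std::mbstate_t();
    std::string bytes(chars.size() * 2, '\0');
    const char* from_next = nullptr;
    char* to_next = nullptr;
    cvt.out(state, chars.data(), chars.data() + chars.size(), from_next,
            &bytes[0], &bytes[0] + bytes.size(), to_next);
    bytes.resize(static_cast<std::size_t>(to_next - &bytes[0]));
    return bytes;
}

// Hypothetical helper: bytes -> characters (a trailing odd byte is dropped).
std::string decode(const std::string& bytes) {
    xprefix_codecvt cvt;
    std::mbstate_t state = std::mbstate_t();
    std::string chars(bytes.size(), '\0');
    const char* from_next = nullptr;
    char* to_next = nullptr;
    cvt.in(state, bytes.data(), bytes.data() + bytes.size(), from_next,
           &chars[0], &chars[0] + chars.size(), to_next);
    chars.resize(static_cast<std::size_t>(to_next - &chars[0]));
    return chars;
}
```

Whether a given 'std::filebuf' implementation actually routes
char-to-char conversions through such a facet once it is imbued is
something to check against your library; the facet itself can be tested
in isolation as above.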
 
