Changing delimiters for std::istream

shea martin

Is there a way to change the delimiters used by >> in an istream? I am
parsing a markup file, and want <tag>word</tag> to be broken into 3
strings. I would prefer not to use a 3rd party lib, though I am sure
there are lots out there. I am hoping that the std::streams can provide
this functionality.

~S
 
John Ericson

I suppose you could write a ctype facet that defines '<' and
'>' (and '/'?) as whitespace. I'd go with a 3rd party lib,
myself.

Best regards, JE
 
Mike Wahler

Read about 'facets'. But it does take some effort to get
things right. If it were me, I'd use an already written
library for this.

-Mike
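
For context on why facets are the relevant hook: operator>> decides where a token ends by asking the ctype facet of the stream's locale whether each character is whitespace. A minimal sketch of that query follows; is_delimiter is a hypothetical helper name used only for illustration, and the default argument assumes the classic "C" locale.

```cpp
#include <locale>

// Hypothetical helper: reports whether `loc` classifies `c` as whitespace.
// This is the same classification operator>> consults when it skips
// leading whitespace and when it decides where a token ends.
bool is_delimiter(char c, const std::locale& loc = std::locale::classic())
{
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    return ct.is(std::ctype_base::space, c);
}
```

In the classic locale, is_delimiter(' ') is true and is_delimiter('<') is false, which is why <tag>word</tag> comes out of >> as a single string unless the facet is replaced.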
 
Jerry Coffin

I haven't tested it, but this should at least be sort of close. For
the moment it assumes that you want all the usual white space to
remain as white space, and just add '<', '>' and '/' to the usual.

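#include <algorithm>  // std::copy
#include <climits>    // UCHAR_MAX
#include <cstddef>    // std::size_t
#include <locale>     // std::ctype

// ctype facet that classifies '<', '>' and '/' as white space,
// in addition to the usual white space characters.
class my_ctype : public std::ctype<char>
{
    mask my_table[UCHAR_MAX + 1];  // one entry per unsigned char value
public:
    my_ctype(std::size_t refs = 0)
        : std::ctype<char>(my_table, false, refs)
    {
        // Start from the classic "C" classification table.
        std::copy(classic_table(), classic_table() + UCHAR_MAX + 1,
                  my_table);
        my_table[static_cast<unsigned char>('<')] = space;
        my_table[static_cast<unsigned char>('>')] = space;
        my_table[static_cast<unsigned char>('/')] = space;
    }
};

Theoretically, you want to use table_size rather than UCHAR_MAX + 1 for
the size of the table, but on at least a few standard library
implementations that won't work, so I've cheated and used UCHAR_MAX + 1
instead.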
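
To have any effect, a facet like this has to be installed in the stream's locale with imbue(). The sketch below shows one way that might look, assuming the input is already in a std::string. The names tag_ctype and split_markup are hypothetical, and tag_ctype is only a restatement of the facet idea above so that the example compiles on its own.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical restatement of the facet idea: '<', '>' and '/' are
// classified as whitespace on top of the classic "C" table.
class tag_ctype : public std::ctype<char>
{
    mask table_[UCHAR_MAX + 1];  // one slot per unsigned char value
public:
    explicit tag_ctype(std::size_t refs = 0)
        : std::ctype<char>(table_, false, refs)
    {
        std::copy(classic_table(), classic_table() + UCHAR_MAX + 1, table_);
        table_[static_cast<unsigned char>('<')] = space;
        table_[static_cast<unsigned char>('>')] = space;
        table_[static_cast<unsigned char>('/')] = space;
    }
};

// Hypothetical helper: splits `text` with operator>> after imbuing the
// stream with a locale that carries the custom facet.
std::vector<std::string> split_markup(const std::string& text)
{
    std::istringstream in(text);
    // The new locale takes ownership of the facet (refs == 0), so no
    // explicit delete is needed.
    in.imbue(std::locale(in.getloc(), new tag_ctype));

    std::vector<std::string> words;
    std::string word;
    while (in >> word)
        words.push_back(word);
    return words;
}
```

Calling split_markup("<tag>word</tag>") yields the three strings "tag", "word" and "tag". Note that the angle brackets and the slash are discarded, so an opening tag can no longer be told apart from a closing one; that limitation is one reason a real parser or library may still be the better choice.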
 
Bob Hairgrove

Don't reinvent the wheel. Check out the Boost "tokenizer" library:

http://www.boost.org/libs/tokenizer/index.html

Several Boost libraries have been proposed for the next C++ standard
library, so those that make it in won't be "third party" anymore.
 
