elementary string processing question

tonywh00t

Hi everyone,

I have a "simple" question, especially for people familiar with regex.
I need to parse strings that have the form:

1:3::5:9

which indicates the set of integers {1 3 4 5 9}. In other words, I have
a set of numbers separated by ":", where "::" indicates a range from
lo to hi inclusive. It is desirable to error check this string (i.e. it
should start and end with a number, and be composed only of numbers,
"::", and ":"). I'm currently using the Boost C++ library, and I've
worked out some pretty ugly solutions. If anyone has a suggestion, I'd
very much appreciate it. Thanks!
 
James Kanze

tonywh00t said:
I have a "simple" question, especially for people familiar
with regex. I need to parse strings that have the form:

1:3::5:9

which indicates the set of integers {1 3 4 5 9}. In other
words, I have a set of numbers separated by ":", where "::"
indicates a range from lo to hi inclusive. It is desirable to
error check this string (i.e. it should start and end with a
number, and be composed only of numbers, "::", and ":"). I'm
currently using the Boost C++ library, and I've worked out
some pretty ugly solutions. If anyone has a suggestion, I'd
very much appreciate it. Thanks!

I presume that the number of entries in the string may vary;
otherwise, of course, you said it yourself: regex. I'd still
use regex to validate the string; something like
"^\\d+(:\\d+|::\\d+)*$" would, I think, do the trick. (It would
be really elegant if you could use capture, but capture doesn't
work well within closures---only the last match is captured.)
Then I'd simply break the string up into substrings at each ':':

    #include <algorithm>
    #include <string>
    #include <vector>

    // Split source at each ':'. A "::" yields an empty field.
    std::vector< std::string >
    parse( std::string const& source )
    {
        typedef std::string::const_iterator
                            TextIter ;
        std::vector< std::string >
                            result ;
        TextIter            current = source.begin() ;
        TextIter const      end = source.end() ;
        while ( current != end ) {
            TextIter            fieldBegin = current ;
            current = std::find( current, end, ':' ) ;
            result.push_back( std::string( fieldBegin, current ) ) ;
            if ( current != end ) {
                ++ current ;    // skip the separator
            }
        }
        return result ;
    }

This gives you an array of strings, with an empty string for each
"::" (so when you see an empty string, you know you have a range).
So you could do something like:

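    #include <sstream>
    #include <string>

    // Convert a field of digits to int. No error checking here:
    // the regex precheck guarantees the field is numeric.
    int
    toInt( std::string const& text )
    {
        std::istringstream  cvt( text ) ;
        int                 result = 0 ;
        cvt >> result ;
        return result ;
    }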

    #include <algorithm>
    #include <stdexcept>
    #include <string>
    #include <vector>

    std::vector< int >
    convert( std::vector< std::string > const& source )
    {
        typedef std::vector< std::string >::const_iterator
                            FieldIter ;
        std::vector< int >  result ;
        FieldIter           current = source.begin() ;
        FieldIter const     end = source.end() ;
        while ( current != end ) {
            result.push_back( toInt( *current ) ) ;
            ++ current ;
            if ( current != end && *current == "" ) {
                // Empty field: the previous value starts a range.
                int                 bottom = result.back() ;
                ++ current ;
                int                 top = toInt( *current ) ;
                if ( top <= bottom ) {
                    // Substitute whatever exception type you prefer.
                    throw std::runtime_error( "range bounds out of order" ) ;
                }
                while ( ++ bottom <= top ) {
                    result.push_back( bottom ) ;
                }
                ++ current ;
            }
        }
        std::sort( result.begin(), result.end() ) ;
        // Or you might want to track the last seen to ensure
        // that the input was correctly sorted.
        return result ;
    }

Note that all of the above code supposes the precheck on the
format using regex. Otherwise, you'll need a lot more error
handling and special cases.
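
For what it's worth, the precheck itself might look roughly like the sketch below. It assumes C++11's std::regex rather than Boost (Boost.Regex has very similar calls), and the function name isValidRangeList is made up for the example. Since regex_match only succeeds when the whole string matches, the "^" and "$" anchors are not needed, and "::?" is just a shorter way of writing the ":" or "::" alternative.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (name invented for this sketch): returns true
// if text is one or more runs of digits separated by ":" or "::".
bool isValidRangeList( std::string const& text )
{
    // Built once; regex_match requires the entire string to match.
    static std::regex const pattern( "\\d+(::?\\d+)*" ) ;
    return std::regex_match( text, pattern ) ;
}
```

You would call this before splitting and converting, and reject the input if it returns false. It checks only the format; whether a range's bounds are in order still has to be checked during conversion.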
 
Juha Nieminen

tonywh00t said:
I'm currently using the Boost C++ library, and I've
worked out some pretty ugly solutions. If anyone has a suggestion, I'd
very much appreciate it. Thanks!

My experience is that whenever you need to parse input data which is
more complicated than fixed-format whitespace-separated elements, the
parsing code always becomes very complicated in C++ (as well as in C).
Neither language was designed for writing parsers of complicated
formats as one-liners. Often not even as 100-liners (especially if you
want full error checking).

Of course, libraries have been developed over the decades to try to
help with this, but they often help more with abstraction than with
the verbosity and complexity of the code.
 
