Design Question

F. GEIGER

I often have to parse files containing lines in formats like the following:

ItemName::1::100

ItemName is a name, 1 and 100 are the min and max values of a range.

When I parse such lines I split them into substrings, where "::" is the
delimiter. Then I feed them into static methods of the classes that are
concerned with them:

CItem::FromString(line); // Eats "ItemName", leaves "1::100" in line
CRange::FromString(line); // Eats "1::100"; line empty now.
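
A minimal sketch of what this consuming style might look like, for illustration only. The helper eat_field, the data members, and the use of std::atoi are assumptions, not the actual code behind CItem and CRange.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: removes and returns the text before the first "::".
// If no delimiter is left, the whole remaining line is consumed.
std::string eat_field(std::string& line) {
    const std::string delim = "::";
    const std::string::size_type pos = line.find(delim);
    std::string field = line.substr(0, pos);
    line.erase(0, pos == std::string::npos ? std::string::npos
                                           : pos + delim.size());
    return field;
}

struct CItem {
    std::string name;
    // Eats the name field from the front of line.
    static CItem FromString(std::string& line) {
        CItem item;
        item.name = eat_field(line);
        return item;
    }
};

struct CRange {
    int min;
    int max;
    // Eats two numeric fields; std::atoi does no error checking.
    static CRange FromString(std::string& line) {
        CRange range;
        range.min = std::atoi(eat_field(line).c_str());
        range.max = std::atoi(eat_field(line).c_str());
        return range;
    }
};
```

In this sketch every class depends on the delimiter and on the order of the fields, and that is where the ripple comes from when the format changes.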

While this seems okay in terms of "each class knows best by itself how to do
it", it is not so good if the format changes somehow. The sample above is a
simplified one, and usually many more classes are involved. Think of the
ripple going through this part of the program when the format changes.

How is this done in a better way? When I say better, I think this has to
mean it should follow "from the many to the few" [C++ FAQ].

Many thanks in advance and kind regards
Franz GEIGER
 
Bob Hairgrove

If you have any influence over the format, you might consider using
XML instead of proprietary text formats. Then you have consistency
over the format and can more easily adapt to change.

If XML isn't an option, you could create a hierarchy of objects
instead of providing input to each individual stand-alone object as
raw text data. Only the topmost object gets the line of text and is
responsible for massaging the format in a way so that it is acceptable
to the next lower level. The output of each step is contained in an
object, not as text.

For example, I would have a top-level class whose only responsibility
would be to split the string at the delimiter and store the results
in a vector of strings. According to your example, this might be
something like this:

#include <string>
#include <vector>
/* other headers ... */

struct StringSplitter {
    // Splits the line at the delimiter and fills items.
    StringSplitter(std::string const &);
    std::vector<std::string> items;
    bool valid() const {
        return (items.size()==3 /* or whatever the format requires */);
    }
};

struct HeaderParser {
    HeaderParser(StringSplitter const &);
    std::string header; // receives "ItemName"
    bool valid() const {
        /* do validation on header here */
        return !header.empty(); // placeholder check
    }
};

struct MinMaxParser {
    MinMaxParser(StringSplitter const &);
    std::vector<int> min_max_values;
    bool valid() const {
        /* do validation on min-max data here */
        return min_max_values.size() == 2; // placeholder check
    }
};
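
The constructors above are only declared. As a rough sketch of how they might be filled in, assuming the "::" delimiter and exactly three fields as in your sample line (the field indices, the checks in valid() and the use of std::atoi are my assumptions for illustration):

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// The only class that knows the delimiter and the number of fields.
struct StringSplitter {
    std::vector<std::string> items;

    explicit StringSplitter(std::string const& line) {
        const std::string delim = "::";
        std::string::size_type start = 0;
        std::string::size_type pos;
        while ((pos = line.find(delim, start)) != std::string::npos) {
            items.push_back(line.substr(start, pos - start));
            start = pos + delim.size();
        }
        items.push_back(line.substr(start));  // field after the last delimiter
    }

    bool valid() const { return items.size() == 3; }  // assumed field count
};

struct HeaderParser {
    std::string header;  // receives "ItemName"

    explicit HeaderParser(StringSplitter const& s) {
        if (s.valid()) header = s.items[0];
    }
    bool valid() const { return !header.empty(); }
};

struct MinMaxParser {
    std::vector<int> min_max_values;

    explicit MinMaxParser(StringSplitter const& s) {
        if (!s.valid()) return;
        // std::atoi does no error checking; a real parser would want more.
        min_max_values.push_back(std::atoi(s.items[1].c_str()));
        min_max_values.push_back(std::atoi(s.items[2].c_str()));
    }
    bool valid() const {
        return min_max_values.size() == 2 &&
               min_max_values[0] <= min_max_values[1];
    }
};
```

With this layout, a line such as "ItemName::1::100" is split once, and the two parsers only ever see the already separated fields.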

Perhaps "header" would be represented by a different class instead of
std::string if it had a complex structure. At any rate, classes
concerned with processing data in the header need not worry about the
min-max data, and vice-versa. Each class has a very well-defined
responsibility. If the text delimiter or the number of items should
ever change, for example, only the implementation of the
StringSplitter class would have to adapt. Validation should also be
implemented separately as an abstract class; the different parsers,
for example, could all be derived from such a class.
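
To make the last point a bit more concrete, here is one way the abstract validation class could be sketched. The name Validatable and the helper all_valid are hypothetical, and the parsers are simplified to take already separated values so that the example stays short:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical abstract interface: anything that can check itself.
class Validatable {
public:
    virtual ~Validatable() {}
    virtual bool valid() const = 0;
};

class HeaderParser : public Validatable {
public:
    explicit HeaderParser(std::string const& field) : header(field) {}
    virtual bool valid() const { return !header.empty(); }
    std::string header;
};

class MinMaxParser : public Validatable {
public:
    MinMaxParser(int lo, int hi) : min_value(lo), max_value(hi) {}
    virtual bool valid() const { return min_value <= max_value; }
    int min_value;
    int max_value;
};

// Checks a whole parsed line without knowing the concrete parser types.
bool all_valid(std::vector<Validatable const*> const& parts) {
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (!parts[i]->valid()) return false;
    }
    return true;
}
```

The calling code can then validate every part of a line through the common interface, so adding a new parser class does not change the validation loop.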
 
F. GEIGER

Thank you, Bob, for this comprehensive answer!

Kind regards
Franz GEIGER


