Advice on input organisation

SophistiCat

Hi,

I am working on a computational program that has to read a number of
parameters (~50) from an input file. The program contains a single class
hierarchy with about a dozen member-classes or inherited classes, each of
which needs some subset of those input parameters. The classes may
individually perform some input validation, and even determine which
parameters are to be read next. Currently, each class performs its own
file input. This doesn't satisfy me, mainly because the program is hard
to read and maintain when the input is dispersed among many different
classes.

What would be a good strategy for organising input in this case?

My first impulse was to at least "outsource" the actual file input into a
separate class, which also keeps track of the current position and
context within the input file and can do meaningful error reporting. The
other classes utilise the input class interface to get the parameters
that they need. But that doesn't quite solve the problem, in my view.

Should I have a class that reads the input file in its entirety, and then
makes the results available to other classes? I see a few problems(?) with
that:

- The parameter sequence is not entirely linear. Depending on context,
the input file may contain some parameters and not others. The logic is
currently determined by the classes that read their subsets of input
parameters. I would have to transfer that logic to the input-reading
class, which doesn't sound right.

- If a change in some class places a different requirement on input data,
this change would also have to be implemented in the input-reading class.
This seems to go against the encapsulation principle. Trouble is likely
if the input-class is not synchronised with all the other classes that
require input.

- Individual classes only need access to a subset of the input
parameters, not all of them. Is there some elegant way of giving each
class access only to those parameters that it needs, and not the entire
"catalogue"?

Any ideas?

Thanks.

P.S. I realise that this is not specifically a C++ question, but some
solutions may involve features that are specific to C++.
 
Kanenas

SophistiCat said:
Hi,

I am working on a computational program that has to read a number of
parameters (~50) from an input file. The program contains a single class
hierarchy with about a dozen member-classes or inherited classes, each of
which needs some subset of those input parameters. The classes may
individually perform some input validation, and even determine which
parameters are to be read next. Currently, each class performs its own
file input. This doesn't satisfy me, mainly because the program is hard
to read and maintain when the input is dispersed among many different
classes.

What would be a good strategy for organising input in this case?

I would go with your idea of a class which reads the entirety of the
input file but additionally try to divorce the input format from the
input requirements of the computational classes. If you can achieve
this, changes in the computational classes won't affect the parser.
An advantage of parsing the input file and storing it in the
appropriate data structure (which may be the same object as the
parser) is that you could simplify the input-access logic within the
computational classes. The input container could be a map, a sequence
of some type or of a custom type (which may have attributes of maps
and sequences), depending on your exact needs. Without knowing more
about the pattern of access, I can't make precise recommendations.
Could you provide a model of the non-linear access and examples of
input requirements you later mention? You've aroused my curiosity and
problem-solving drive.
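
As a minimal sketch of the map option only: the class below reads the whole stream once and stores each value as text under its name. It assumes a hypothetical "name value" line format (not your current one), and `ParameterStore`, `Has` and `Get` are names invented for the illustration.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical container: parameter name -> raw text value.
// Assumes "name value" lines, which is NOT the poster's actual format.
class ParameterStore {
public:
    // Parse the whole stream up front; lines starting with '#' are comments.
    explicit ParameterStore(std::istream& in) {
        std::string line;
        while (std::getline(in, line)) {
            std::istringstream fields(line);
            std::string name, value;
            if (!(fields >> name)) continue;   // blank line
            assert(!name.empty());             // operator>> never yields an empty token
            if (name[0] == '#') continue;      // comment line
            if (!(fields >> value))
                throw std::runtime_error("missing value for '" + name + "'");
            values_[name] = value;
        }
    }

    bool Has(const std::string& name) const { return values_.count(name) != 0; }

    // Convert the stored text to the type the caller asks for.
    template <typename T>
    T Get(const std::string& name) const {
        auto it = values_.find(name);
        if (it == values_.end())
            throw std::out_of_range("unknown parameter '" + name + "'");
        std::istringstream text(it->second);
        T result;
        if (!(text >> result))
            throw std::runtime_error("bad value for '" + name + "'");
        return result;
    }

private:
    std::map<std::string, std::string> values_;
};
```

A computational class would then ask for what it needs, for example `store.Get<int>("nx")` (with `nx` a made-up parameter name), without touching the file itself.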

If you have control over the format of the file, you could
additionally define the format so that parsing is easier.

[...]
Should I have a class that reads the input file in its entirety, and then
makes the results available to other classes? I see a few problems(?) with
that:

- The parameter sequence is not entirely linear. Depending on context,
the input file may contain some parameters and not others. The logic is
currently determined by the classes that read their subsets of input
parameters. I would have to transfer that logic to the input-reading
class, which doesn't sound right.

With an appropriate format for input, it sounds more right, for the
parser is creating a structure implied by the format rather than
trying to emulate the input behavior of the computational classes.

- If a change in some class places a different requirement on input data,
this change would also have to be implemented in the input-reading class.
This seems to go against the encapsulation principle. Trouble is likely
if the input-class is not synchronised with all the other classes that
require input.

By defining a format for the input file which is general enough, a
data structure which supports the format could accommodate changes in
the classes' input requirements.

If you can't make the input format/parser class general enough, you
could define methods of the computational classes (or define helper
classes for each computational class) to parse sections of the input
file.


You could also support some form of input requirement registration,
perhaps an interface for the parser which lets instances of the
computational classes tell the parser what their input requirements
are, or an interface for the computational classes which lets the
parser get the input requirements of the computational classes (or
their instances). (On a tangent, what would you call an "instance of
a computational class" within the context of your program? A
computational object?) With this approach, you define a meta-format
and the classes (or instances) define the input format. As a partial
example, parameters could be given different types. Classes (or
rather, instances) register parameters and their types with the
parser. The type of a parameter determines the parser's behavior when
it encounters it.
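
To make the registration idea concrete, here is a rough sketch, not a finished design. It assumes a hypothetical "name value" line format, and `RegisteringParser`, `Register`, `Parse` and the `Grid` class with its `nx`/`ny` members are all invented for the example. Each object binds parameter names to its own typed members, and the type of the member decides how the parser reads the value.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical registry-driven parser. Assumes each input line is
// "name value [comment]", an invented format used only for illustration.
class RegisteringParser {
public:
    // Bind a parameter name to a typed destination owned by the caller.
    template <typename T>
    void Register(const std::string& name, T& destination) {
        assert(!name.empty() && "parameter names must be non-empty");
        handlers_[name] = [&destination](std::istream& in) {
            return static_cast<bool>(in >> destination);
        };
    }

    // Read every line and dispatch on the parameter name.
    void Parse(std::istream& in) {
        std::string line;
        for (int lineNo = 1; std::getline(in, line); ++lineNo) {
            std::istringstream fields(line);
            std::string name;
            if (!(fields >> name) || name[0] == '#') continue;  // blank or comment
            auto it = handlers_.find(name);
            if (it == handlers_.end() || !it->second(fields))
                throw std::runtime_error("line " + std::to_string(lineNo) +
                                         ": cannot read '" + name + "'");
        }
    }

private:
    std::map<std::string, std::function<bool(std::istream&)>> handlers_;
};

// Hypothetical computational object that declares its own requirements.
struct Grid {
    int nx = 0;
    int ny = 0;
    void RegisterInputs(RegisteringParser& parser) {
        parser.Register("nx", nx);
        parser.Register("ny", ny);
    }
};
```

Note that the registered objects have to outlive the call to `Parse`, since the parser only stores references to their members.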

If you control the input format, you could include tags in the input
which define the type of the parameter; at its most extreme, the input
format could define a programming language and your program becomes an
interpreter. Whether or not you take this approach (parameter markup)
depends on whether you want the input or the computational objects to
drive the computation.

- Individual classes only need access to a subset of the input
parameters, not all of them. Is there some elegant way of giving each
class access only to those parameters that it needs, and not the entire
"catalogue"?

If you want to restrict the input accessible to the computational
objects, you could subclass the input container class and pass an
instance of the appropriate subclass to each computational object.
Unfortunately, this may have the problem you mentioned above wherein
changes in input requirements affect the input container [subclass].

If the input format is determined at runtime by requirement
registration, you could use the same information to give an instance
of a computational object the data it wants in the format it wants
(perhaps using subclasses of the input container class).
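
One way to sketch the restricted access, using a wrapper around the container rather than the subclassing mentioned above: each object receives a read-only view that answers only for the names it was given. `Catalogue`, `ParameterView` and `Get` are hypothetical names, and storing every value as a `double` is a simplifying assumption.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical full catalogue: parameter name -> numeric value.
using Catalogue = std::map<std::string, double>;

// Read-only window onto the catalogue that exposes only the allowed names.
class ParameterView {
public:
    ParameterView(const Catalogue& all, std::set<std::string> allowed)
        : all_(all), allowed_(std::move(allowed)) {}

    double Get(const std::string& name) const {
        assert(!name.empty() && "parameter names must be non-empty");
        if (allowed_.count(name) == 0)
            throw std::logic_error("parameter '" + name + "' is not visible here");
        return all_.at(name);  // throws std::out_of_range if absent from the input
    }

private:
    const Catalogue& all_;            // the catalogue must outlive the view
    std::set<std::string> allowed_;
};
```

The set of allowed names could come from the same registration information, so the container class itself would not need to change when a class changes its requirements.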

What seems most likely to me is that you don't want computational
objects to have to handle parameters which don't affect their
behavior. With the appropriate input container, you won't need to
worry about this.

Am I hitting anywhere near the mark?

Kanenas
 
SophistiCat

Hi,

Thanks for your reply!

Kanenas said:
Without knowing more
about the pattern of access, I can't make precise recommendations.
Could you provide a model of the non-linear access and examples of
input requirements you later mention? You've aroused my curiosity and
problem-solving drive.

Input data consists of ints, floats, single-word strings, or letters. One
or more values per line, followed by an optional comment (lines starting
with # are ignored):

asdfgf Comment
128 64 Comment
2 2 blocks of data to follow:
# Data block 1
1E6
x
# Data block 2
1.5E6
y
# Data blocks end
0 1 0
1 This flag indicates that an extra data block will follow
# Data block begins
9.8
200E6
# Data block ends
dfgh Some more data...
1.0

etc.

As you can see, the code that performs input has to be aware of the format
in which the data is written. Yes, I suppose that a rigorous markup
language, like XML for instance, could probably divorce data parsing and
storage from the eventual consumers of that data (although validation would
have to be postponed until later).

You could also support some form of input requirement registration,
perhaps an interface for the parser which lets instances of the
computational classes tell the parser what their input requirements
are, or an interface for the computational classes which lets the
parser get the input requirements of the computational classes (or
their instances). (On a tangent, what would you call an "instance of
a computational class" within the context of your program? A
computational object?) With this approach, you define a meta-format
and the classes (or instances) define the input format. As a partial
example, parameters could be given different types. Classes (or
rather, instances) register parameters and their types with the
parser. The type of a parameter determines the parser's behavior when
it encounters it.

If I understand you correctly, that's basically what I did: I have a
parser-class that includes template functions for reading values of
different basic types:

// Read the first parameter from the next line.
template<typename T> bool ReadFirst(T& param, std::string description = "")
{ return NewLine() && ReadNext(param, description); }

// Read the next parameter from the current line.
template<typename T> bool ReadNext(T& param, std::string description = "");

Computational or storage classes use the parser class interface to get the
next value from the file. They only need to tell the parser class whether
to read from the current line or start from a new line. The reason I don't
really like this is that input is spread over about a dozen different
classes, and it is rather hard to follow.
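
For illustration, a parser class with this interface might be sketched as follows. Only `ReadFirst`, `ReadNext` and `NewLine` come from the code above; the class name `LineReader`, the `LastError` member and the line counting are assumptions made for the sketch, and the real class surely differs.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical line-oriented reader with the ReadFirst/ReadNext interface.
class LineReader {
public:
    explicit LineReader(std::istream& in) : in_(in) {}

    // Advance to the next line that does not start with '#'; false at end of input.
    bool NewLine() {
        std::string line;
        while (std::getline(in_, line)) {
            ++lineNo_;
            if (!line.empty() && line[0] == '#') continue;  // comment line
            current_.clear();     // reset any failure state from the previous line
            current_.str(line);
            return true;
        }
        return false;
    }

    // Read the first parameter from the next line.
    template <typename T>
    bool ReadFirst(T& param, const std::string& description = "") {
        return NewLine() && ReadNext(param, description);
    }

    // Read the next parameter from the current line; record an error on failure.
    template <typename T>
    bool ReadNext(T& param, const std::string& description = "") {
        assert(lineNo_ > 0 && "call NewLine() or ReadFirst() before ReadNext()");
        if (current_ >> param) return true;
        lastError_ = "line " + std::to_string(lineNo_) + ": cannot read " + description;
        return false;
    }

    const std::string& LastError() const { return lastError_; }

private:
    std::istream& in_;
    std::istringstream current_;  // remaining text of the current line
    int lineNo_ = 0;              // for error reporting
    std::string lastError_;
};
```

Anything left on a line after the last value that was read (such as the trailing comments in the sample above) is simply skipped by the next call to `NewLine`.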

(On a tangent, what would you call an "instance of
a computational class" within the context of your program? A
computational object?)

It can be a computational object, a storage object, or a combination of
both. Some of the class instances spring up from the context prompted by
input, which means that it is not possible to instantiate the entire class
structure before all the input is read.

Am I hitting anywhere near the mark?

Kanenas

Yes, thank you very much for your interest and your advice.
 
