refactoring

Roedy Green

I have always felt that if a method can have multiple inputs, it
should be able to have multiple outputs, but very few language
designers (Forth and PostScript being exceptions) have agreed.

Places where multiple outputs would be useful:

1. 2D and 3D co-ordinates, Cartesian to polar.

2. returning a value with a status indicator about how good the value
is.

3. categorising routines. Using a separate method for each category
must repeat the binning logic.

4. division return quotient and remainder.

5. font/colour pair

6. find min, max, and average of a set.
 
Stefan Ram

Roedy Green said:
Places where multiple outputs would be useful:
1. 2D and 3D co-ordinates, Cartesian to polar.
2. returning a value with a status indicator about how good the value is.
4. division return quotient and remainder.
5. font/colour pair
6. find min, max, and average of a set.

Yes, but I am not aware of any reason
not to encapsulate these compounds in a class.

The only argument I remember was »overhead«.

This breaks down to

- Notational overhead in the source code

This indeed is a problem. Java is too
verbose in several regards already.
But if one really despises this, one might
not have chosen Java in the first place.

- Execution time overhead

One can imagine how an optimizer might
remove nearly all of it, and some
implementations might be good in this regard.
3. categorising routines. Using a separate method for each
category must repeat the binning logic.

I do not understand what this is, but never mind.
 
Eric Sosman

Roedy Green wrote on 08/08/07 13:10:
Places where multiple outputs would be useful:

Most of these seem better suited to using single
objects that wrap multiple values.
1. 2D and 3D co-ordinates, Cartesian to polar.

You'd ordinarily want the point or vector to be an
object in its own right, not something that had its
various components scattered in unrelated free-standing
variables.
2. returning a value with a status indicator about how good the value
is.

I imagine the status indicator would be something like
an error bar or a confidence interval. Again, it seems you
would want the indicator to accompany the "indicatee" rather
than floating around independently, and you'd achieve this
by putting both in a single object.
3. categorising routines. Using a separate method for each category
must repeat the binning logic.

I'm not sure what you mean by this.
4. division return quotient and remainder.

Here's a case where I agree: The hardware produces both
results, and it's a shame that we can't get them both in
one operation. (I can still recall being shocked to learn
that Java compilers would not optimize a paired /-and-% to
a single division, the way C and Fortran and so on have done
from time immemorial. It seems, in fact, that the JLS forbids
the optimization! Of course, the JIT can play fast and loose
with the JLS whenever it can tell it won't get caught.)
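
For what it is worth, the standard library does have one paired operation: BigInteger.divideAndRemainder returns both results in a two-element array. A minimal sketch follows; the class name DivRemDemo and the long-based helper divRem are made up for illustration, and whether the JIT folds the helper's / and % into one machine division is not something this code can show.

```java
import java.math.BigInteger;

class DivRemDemo {
    // Hypothetical helper: packs quotient and remainder of two longs
    // into an array, mirroring the shape of BigInteger's method.
    static long[] divRem(long dividend, long divisor) {
        return new long[] { dividend / divisor, dividend % divisor };
    }

    public static void main(String[] args) {
        // Library call: index 0 is the quotient, index 1 the remainder.
        BigInteger[] qr =
            BigInteger.valueOf(17).divideAndRemainder(BigInteger.valueOf(5));
        System.out.println(qr[0] + " remainder " + qr[1]);

        long[] r = divRem(-17, 5);
        System.out.println(r[0] + " remainder " + r[1]);
    }
}
```

Either way the caller receives a single wrapper object (here an array) and has to unpack it by index, which is exactly the trade-off under discussion.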
5. font/colour pair

The utility of multiple-valued expressions eludes me in
cases like this one. If it makes sense to keep the font and
the color (and the baseline angle/path, and the background
color, and ...) together, then it makes sense to create an
object to hold all these things. If it makes sense to treat
them separately, then why/how is one expression evaluating
both?
6. find min, max, and average of a set.

A SetStats object doesn't seem inconvenient to me.
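
A rough sketch of what such a SetStats object might look like. Only the name SetStats comes from the post; the fields, the factory method of, and the choice of double[] as input are assumptions for illustration.

```java
class SetStats {
    final double min;
    final double max;
    final double average;

    private SetStats(double min, double max, double average) {
        this.min = min;
        this.max = max;
        this.average = average;
    }

    // Computes all three statistics in a single pass over a non-empty array.
    static SetStats of(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("empty set");
        }
        double min = values[0];
        double max = values[0];
        double sum = 0.0;
        for (double v : values) {
            if (v < min) min = v;
            if (v > max) max = v;
            sum += v;
        }
        return new SetStats(min, max, sum / values.length);
    }
}
```

A caller would write SetStats s = SetStats.of(data); and then read s.min, s.max and s.average. Java 8 and later ship java.util.DoubleSummaryStatistics, which takes much the same shape.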

If all we're doing is

double a;
long b;
char c;
(a,b,c) = method(x, y, z);
// use a, use b, use c

... then I don't see a significant advantage over

Triplet t = method(x, y, z);
// use t.a, use t.b, use t.c

It seems to me that multiple-valued expressions (as opposed
to expressions yielding single objects that wrap multiple
values) aren't a whole lot of use unless there's a way to
do more with them than just pass them around. That quotient-
and-remainder thing, for instance: if the integer / operator
produced both, we'd need an expression syntax that could
plug both results into different parts of a formula. Go
too far along *that* path, and you'll come upon the smoking
ruins of APL ;-)
 
Roedy Green

3. categorising routines. Using a separate method for each
I do not understand what this is, but never mind.
Imagine a method that took the name of an animal and categorised it by
size, colour, and country of origin.
 
Michael Jung

Yes, but actually I am not aware of a reason,
why not to encapsulate these compounds in a class.
The only argument I remember was »overhead«.
This breaks down to
- Notational overhead in the source code
- Execution time overhead

- Garbage overhead.
This may turn out significant.

Michael
 
Lew

Michael said:
- Garbage overhead.
This may turn out significant.

Nonsense.

Individual values would have to be collected; there's virtually no difference
in collection of a container for those values.

Weigh against the bug risk induced by separation of variables that belong
together.
 
Eric Sosman

Roedy Green wrote on 08/09/07 01:04:
Imagine a method that took the name of an animal and categorised it by
size, colour, and country of origin.

Thanks for the example. It doesn't seem to me to make
a compelling case for multiple-valued expressions, though.
If the computations are independent (what has size to do
with color?), there's no reason to try to bundle them all
into one method. If they're not (all the results come from
consulting a database), then putting the descriptions in an
AnimalAttributes class or even in the Animal class itself
seems attractive. Writing

Animal beast = Animal.instanceOf(
"Ravenous Bugblatter Beast");
Size size;
Color color;
Origin origin;
(size, color, origin) = beast.getAttributes();

does not seem much of an improvement over

Animal beast = ...;
AnimalAttributes attr = beast.getAttributes();

or over

Animal beast = ...;
Size size = beast.getSize(); // 1st query caches all
Color color = beast.getColor(); // use cached value
assert beast.getOrigin() == Planet.TRAAL; // cached

Also, what if you were only interested in the color?
You might invent a syntax along the lines of

(,color,) = beast.getAttributes();

to indicate that the first and third values are to be
discarded, but this probably means that getAttributes()
must compute all three values even if only one is needed,
whether that computation is cheap or not. Using getColor()
allows the implementor to make -- and change -- decisions
about things like caching; if size and color come from
one database and origin from another, there's no need to
consult the second database for a value that's not used.
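
The per-attribute getter approach can be sketched roughly as follows. Everything here is hypothetical: the class name CachedAnimal, the use of plain strings instead of Color and Origin types, the placeholder return values, and the lookup counters, which exist only to make the laziness visible.

```java
class CachedAnimal {
    private final String name;
    private String color;   // null until first requested
    private String origin;  // null until first requested
    int colorLookups = 0;   // demo counters, not part of any real design
    int originLookups = 0;

    CachedAnimal(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Consults the (hypothetical) first data source at most once.
    String getColor() {
        if (color == null) {
            color = lookUpColor();
        }
        return color;
    }

    // Consults the (hypothetical) second data source at most once.
    String getOrigin() {
        if (origin == null) {
            origin = lookUpOrigin();
        }
        return origin;
    }

    // Stand-in for a real database query; the value is a placeholder.
    private String lookUpColor() {
        colorLookups++;
        return "green";
    }

    // Stand-in for a query against a different database.
    private String lookUpOrigin() {
        originLookups++;
        return "Traal";
    }
}
```

A caller that only ever asks for getColor() never triggers lookUpOrigin(), and the implementor stays free to change the caching policy without touching callers.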

I've used a language (Common Lisp) that could return
multiple values from an expression. The only situation
where it seemed useful was when an expression produced
a "primary" value and "ancillary" information -- somewhat
like the "value and trust level" situation you mentioned
earlier. Usually, the ancillary information was an error
code of some kind: The "primary" value nil (typically)
indicated that the computation could not be carried out,
and additional values could carry information about the
reasons (if the caller cared to receive them). I never
saw multiple values used for multiple "essential" pieces
of information -- they could have been so used, I guess,
but nobody seemed to think it a good idea.

My $0.02 worth (not a colossal sum, given the state
of the US dollar these days).
 
Stefan Ram

Eric Sosman said:
The only situation where it seemed useful was when an
expression produced a "primary" value and "ancillary"
information -- somewhat like the "value and trust level"
situation you mentioned earlier.

Recently I tentatively introduced the following type
declarations into the ram.jar library:

public interface Value /* de.dclj.ram.Value */
  < Type >
{ Type value(); }

public interface PossibleValue /* de.dclj.ram.PossibleValue */
  < Range >
  extends de.dclj.ram.Value< Range >
{
  /** Returns whether the value is valid.
      This value only is meaningful, if it can be proven
      that a preceding call to {@code valid()} would have
      returned {@code true}, or if such a call indeed has
      returned {@code true}.
      @return whether the value is valid. */
  boolean valid();
}
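
One way such a pair of interfaces might be used, as a sketch only. The two interfaces are redeclared locally without the de.dclj.ram package so the example stands alone; the class ParsedInt and its behaviour of throwing from value() when invalid are assumptions, not part of ram.jar.

```java
// Local stand-ins for the interfaces quoted above.
interface Value<T> {
    T value();
}

interface PossibleValue<T> extends Value<T> {
    boolean valid();
}

// Hypothetical implementation: the result of trying to parse an int.
class ParsedInt implements PossibleValue<Integer> {
    private final Integer parsed; // null when the text was not a number

    ParsedInt(String text) {
        Integer p;
        try {
            p = Integer.valueOf(text.trim());
        } catch (NumberFormatException e) {
            p = null;
        }
        this.parsed = p;
    }

    public boolean valid() {
        return parsed != null;
    }

    // Only meaningful when valid() returns true; this sketch chooses to
    // throw otherwise rather than hand back a meaningless value.
    public Integer value() {
        if (parsed == null) {
            throw new IllegalStateException("no valid value");
        }
        return parsed;
    }
}
```

The caller checks valid() first and reads value() only on success, so the primary result and its status travel together in one object.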
 
Michael Jung

Individual values would have to be collected; there's virtually no difference
in collection of a container for those values.

No, they don't. Imagine a method that is rerun very often, passing its values
to a static result each time. Even if not, the overhead for another object may
be 1:2 (i.e. two ints vs. two ints in a container).
Weigh against the bug risk induced by separation of variables that belong
together.

You would not need to have them separated, either logically or in the code;
they would never be more than a line apart.

Michael
 
Lew

Michael said:
No they don't. Imagine a method rerun passing values to a static result very often.

When did the result start having to be "static"? I understand neither what
you mean by the term in this context, nor how that influences GC overhead.
Even if not, the overhead for another object may be 1:2 (I.e. two ints
vs. two ints in a container.

How are you getting this ratio?

The cost of garbage collection is proportionate to the number of live objects,
not to the number being collected.
You would not need to have them separated logically or on code basis, never
more than a line apart.

Separated is separated. "Close" is only good for horseshoes.
 
Michael Jung

When did the result start having to be "static"? I understand neither what
you mean by the term in this context, nor how that influences GC overhead.

They don't have to be, but they can be. We are talking about examples where
multiple return values are superior to wrapper objects.

static int x;
static int y;

Compare

f(int x, int y) = { return x+y, x-y; }
x,y = f(x,y);

with

static int[] f(int x, int y) { return new int[] {x+y, x-y}; }
int[] xy = f(x,y);
x = xy[0]; y = xy[1];

where the wrapper object is a (probably non-elegant) array.
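
Spelled out as a complete class, the array variant might look like the sketch below. The class name SumDiff, the starting values 5 and 3, and the run helper are assumptions added so the fragment compiles on its own.

```java
class SumDiff {
    static int x = 5; // starting values chosen arbitrarily for the demo
    static int y = 3;

    // Returns {x + y, x - y}; every call allocates a fresh two-element array.
    static int[] f(int x, int y) {
        return new int[] { x + y, x - y };
    }

    // Applies f to the static fields n times, unpacking the array each time.
    static void run(int n) {
        for (int i = 0; i < n; i++) {
            int[] xy = f(x, y);
            x = xy[0];
            y = xy[1];
        }
    }

    public static void main(String[] args) {
        run(2);
        System.out.println(x + ", " + y);
    }
}
```

Each iteration creates one short-lived array that becomes garbage as soon as it is unpacked, which is the overhead being argued about. How much that costs in practice depends on the JVM and collector.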

Run this a "couple" of times. (Of course, the first example can't be run:)
How are you getting this ratio?

It's stated. The memory ratio is 2:3 on a 32-bit machine, so the overhead ratio
is 1:2 (one part overhead compared to two parts necessity).
The cost of garbage collection is proportionate to the number of live
objects, not to the number being collected.

That's a different issue. I assume the number of live objects is equal in the
above examples.
Separated is separated. "Close" is only good for horseshoes.

(Tell that to a horse.)

Michael
 
