Modifying the arguments to a function is a bad idea


J

jacob navia

This discussion started in the thread:
"Help a beginner - simple lowercase to uppercase and so on function"
but it has actually nothing to do with the subject of that thread so I
start a new one.

My arguments for justifying the thesis in the subject line are:

o It makes the function not restartable
o Under the debugger you can't know what was the parameter you
received when the function misbehaves. You can't debug it.
o The small gain in storage space is not worth the associated risks.

To my surprise, the eternal heathfield doesn't agree with this obvious
stuff.

He says to the first point:
> You lost me. I am sure it is possible to construct a function in
> which modifying its parameter makes it non-restartable, but you
> don't /have/ to construct functions that way.

Well, apparently he doesn't get it. Suppose a function

int fn(int a)
{
// some code
a = 56;
// more code
}

After that assignment it is impossible to go to the first line of the
function and restart it, since the original value of a is lost.

This is completely obvious but somehow not obvious to some.

To the second point, he argues ( I cite again the second point
for clarity)

> o Under the debugger you can't know what was the parameter
> you received when the function misbehaves. You can't debug it.
<end of jn quote>

<heathfield>
Maybe you can't, but I can - by setting a breakpoint at an
appropriate place and making a note of the value before the mod.
<end of heathfield>

The obvious situation where the function is called 4765 times
before it crashes doesn't occur to him. Maybe because he has never been
in the frustrating situation of trying to know what is the specific
parameter that makes the function crash. :)

He continues:
<heathfield>
What risk do you see in:

void intcopy(int *s, int *t, int term)
{
while((*s++ = *t++) != term) {}
}
< end of heathfield>

I see MANY risks in that construct, the first and foremost
is that "t" is a vector of integers that is MISSING the
"term" terminator!

Of course THAT "can't happen", and heathfield doesn't even consider
that case.

BUT, let's suppose that this happens, since in the real world,
bugs DO appear.

How you will see the effects of that?

(1) The program as written will crash when it arrives at the end of
available memory . Since you do not have the original data, you
can't figure out what parameters were passed in.

(2) The program will find a random integer in memory that
corresponds to "term", after having overwritten a lot of memory.
That function will return as if nothing had happened leaving
the rest of the software in an unknown state. Since this depends
on the contents previously stored in unknown data areas, the
program will (or will not) go on, maybe crashing sometimes
or (worst) never crashing but giving *sometimes* strange results or
crashes hours later.

(3)
Another obvious problem with the function above is that no SIZE
information is passed to limit the copy, so that if "s" points to
an area too small to receive "t" a buffer overflow will happen.

That has in principle nothing to do with the modification of the
parameters but having the parameters still available it would
allow you to see the corresponding address and you can figure out
its size and location within the debugger.

Actually, I wanted to highlite this problem because I consider that
functions like "intcopy" are the ideal example of how NOT to program
in C.

o Function modifies its parameters. Not restartable, not debuggable.
o No error analysis at all (return value void)
o The slightest error in the calling function when passing the
parameters leads to debugging nightmares.

It is funny that another "regular" (Mr Ben Pfaff) says:

<quote Mr Pfaff>
I would say that the latter is riskier, because there are more
entities for me to keep track of as I read it. It is easier to
understand the behavior of three variables than five.
<end of quote>

The brain of Mr Pfaff has difficulties reading:
> void intcopy(int *s, int *t, int term)
> {
> int *u = s;
> int *v = t;
> while((*u++ = *v++) != term) {}
> }


Imagine the problems Mr Pfaff would have if we pass size information
to make better designed software!

I can only recommend to him to stop programming. Reading NO variables
at all is surely even easier to do.

:)

Conclusion:

a) Do not modify the parameters of a function. It is a bad idea.

b) Think about the debugging of the functions when you write them.
If you do that, you will avoid monstrosities like the above
"intcopy".
c) Do not follow the advise of heathfield (and co)...
 
Ad

Advertisements

M

Morris Keesan

Suppose a function

int fn(int a)
{
// some code
a = 56;
// more code
}

After that assignment it is impossible to go to the first line of the
function and restart it, since the original value of a is lost.

This is completely obvious but somehow not obvious to some.

I agree with your basic position, which is that modifying the values
of function parameters is generally a bad idea, primarily because
unless the function is extremely simple, like Richard Heathfield's
string copy example, it's too easy to confuse a maintainer of the
code, who is likely to lose track of what the values of the parameters
are.

But you've lost me with the argument cited above. What exactly
do you mean by "restarting" a function? Under what circumstances,
and how, would you go to the first line of the function and restart it?
You're not really suggesting that a goto statement is an argument
against anything, are you?
 
J

jacob navia

Morris said:
I agree with your basic position, which is that modifying the values
of function parameters is generally a bad idea, primarily because
unless the function is extremely simple, like Richard Heathfield's
string copy example, it's too easy to confuse a maintainer of the
code, who is likely to lose track of what the values of the parameters
are.

But you've lost me with the argument cited above. What exactly
do you mean by "restarting" a function? Under what circumstances,
and how, would you go to the first line of the function and restart it?
You're not really suggesting that a goto statement is an argument
against anything, are you?

Sorry, I am always thinking of debugging in a debugger.

When a function crashes, in the debugger you can go to the
first line and restart the function to see how the crash happened

If you have modified the arguments that's impossible.
 
M

Morris Keesan

Now, the posit is that modifying function
parameters (or "arguments" as you rather confusingly call them in
the Subject line) is a bad idea. But to make that case, you have to
show that there are no circumstances in which it is a good idea.

No, only if you want to make the case that it's ALWAYS a bad idea.
I maintain that driving above legal speed limits is a bad idea,
but that doesn't mean that I believe that there are no circumstances
where it's okay.
 
B

bartc

Richard Heathfield said:
jacob navia said:

No, I can't say I have. Now, the posit is that modifying function
parameters (or "arguments" as you rather confusingly call them in
the Subject line) is a bad idea. But to make that case, you have to
show that there are no circumstances in which it is a good idea. In
so doing, you have to explain why Kernighan does precisely that, in
several code fragments in K&R2. Is it because Kernighan is a bad
programmer?

K&R2 says "Because arguments are passed by value, strcpy can use the
parameters s and t any way it pleases". Which is true, and in their example
it's really using the arguments to initialise local variables, instead of
copying them and wasting a bit of time. That's fine.

However their's was a 4-line function optimised further down the page to a
2-line function; not exactly complex.

Just today I came across one of my functions that had a parameter q which
was a linked list root. The list had to be traversed twice; as written, the
first loop used a copy of q, the second just used q (and at the end q was
null).

This looked awkward and would have been better if both traversals used a
copy (so the iteration code was identical), and at the end q would have been
preserved in case of a third traversal or I needed to do something else with
it. Even better would be to rename q as qroot or something so that modifying
it would look wrong.

Anyway the subject just says modifying parameters is a bad idea, that's all.
Not that it must never be done.
 
B

Beej Jorgensen

Richard Heathfield said:
jacob navia said:

If you need to do that, you obviously preserve the value of a. But
it's quite a rare thing to have to do, surely?

I think he might be talking about restarting the function in a debugger.

-Beej
 
Ad

Advertisements

O

Old Wolf

void intcopy(int *s, int *t, int term)
{
   while((*s++ = *t++) != term) {}}


I have to say that I would be quite capable of debugging
this function without the need to manually change the
program flow, like Mr. Navia seems to be suggesting.
 
K

Keith Thompson

jacob navia said:
Conclusion:

a) Do not modify the parameters of a function. It is a bad idea.

I agree that it's usually a bad idea. I don't agree that it's
*always* a bad idea. On the other hand, I suggest that it this
may be a special case of a more general rule of thumb.

Let's step back and look at the bigger picture. Parameters are
essentially local variables; the only special thing about them is
that their initial values are taken from the arguments in the call.

Consider:

int square(int n) {
n = n * n; /* poor style */
return n;
}

Here n, on entry to the function, hold the number whose square
we want to compute. The code then re-uses the same variable for
a different purpose, to hold the result of squaring the original
value. The problem, for a human reader, is that "n" has, not just
a different value, but a different *meaning*, depending on where
you are in the function.

Instead, I might write:

int square(int n) {
int result = n * n;
return result;
}

Or even:

int square(const int n) {
const int result = n * n;
return result;
}

And if the compiler is sufficiently clever, it might store both in
the same location or elimate result altogether.

(Of course in real life I'd just write "return n * n;", but let's
ignore that; I chose this example for its simplicity.)

But what's so special about parameters? Why shouldn't the same
apply to other variables?

int v;
/* ... */
v = some_value;
v = v * v;
/* ... */

Again, "v" has a different meaning depending on where you are in
the code, but now v is just a local variable, not a parameter.

Functions are the main unit of encapsulation in a C program, and
that does make parameters special. But they're not *that* special.

And conversely, there are times when it makes sense to modify a
variable after it's been initialized; the same applies to parameters,
particularly for very short functions. In particular, if modifying
the value doesn't change the *meaning*, I don't see much wrong
with it.

There are even languages that don't allow objects
to be modified once they've been initialized; see
<http://en.wikipedia.org/wiki/Single_assignment>. I won't
necessarily argue that such a strict rule is a good idea, but I
do suggest that minimizing re-assignment might make code easier
to analyze. And in fact, I'd suggest declaring objects with
"const" *unless* there's a specific requirement to modify them
after initialization.

Doing the same thing with parameters is slightly problematic in C.
The problem is that the constness of a parameter is of no concern to
the caller, so a "const" keyword in a prototype is just confusing.
But if you want to enforce the constness of a parameter within the
function, you need the "const" keyword in the definition.

You can legally write something like this:

void func(int param);

/* ... */

void func(const int param) {
/* implementation of func */
}

But in all other cases it's possible, and IMHO good practice, for
a function's declaration (prototype) and the declaration part of
its definition to be identical.

If I were designing a new language, "const" (or "readonly") would
be the default for all declared objects; if you want something you
can modify later (a variable), you'd have to add a keyword to the
declaration. (Apparently Scala does something like this, though
without the default; you have to specify either "val" or "var".)

[snip]
 
R

robertwessel2

Sorry, I am always thinking of debugging in a debugger.

When a function crashes, in the debugger you can go to the
first line and restart the function to see how the crash happened

If you have modified the arguments that's impossible.


We've disagreed in the past about the use of debuggers, and I guess
we'll continue to disagree on this one - I can't bring myself to be
moved by that particular side effect of modifying parameters. And
there are all sorts of reasons a function might not be restartable in
the way you suggest, any modification of persistent state (alter a
global, alter something referenced via a passed in pointer, open a
file, etc.) will do the same.

If there's an argument for not doing that, it's that it can be
confusing in larger routines, especially to beginners. In a large
routine it can be easy to miss the modification, and then incorrectly
assume that the parameter is unchanged later on (simply because it's
so rare that it *is* changed). For a small routine it can be a semi-
useful shorthand. And I'd expect any semi-competent compiler to
generate the same code in most cases if I copied the values to "real"
locals first or not.

Personally I usually avoid the form, simply because it’s unusual, and
converting to the more common form is trivial.
 
M

Morris Keesan

Just as a side note, in some sense the value passed in and the modified
value are separate things. That is, the argument holds the value of a
parameter for the current execution instance; when we modify it we are
changing it into a local variable which is a subtly different kind of
thing.

Whether one should concern one's self with the difference is another
matter. If you never use a debugger I doubt that it matters.

I think the debugger issue is a red herring.
I would say if you're never unlucky enough to try
to use the (original) value of the parameter after another programmer
has modified it, then it doesn't matter.
 
P

Phil Carmody

Richard Heathfield said:
The way I see it, this is (largely) what const is for - to protect
your parameters when you know you *don't* want to modify them.

I've seen code from [tirade deleted] someone where all parameters
were const. Including the strings. Those being strings which weren't
to be modified.

Yet his prototypes were things like:
void printCrapToAFile(char *const pCrap, char *const pFilename);

I laughed.

Then I cried.

Yes, David, I'm looking at you. Expect a flamewar on bugzilla later on today...

Phil
 
Ad

Advertisements

R

Richard Tobin

Morris Keesan said:
I think the debugger issue is a red herring.

If it's common in debuggers to want to see the original value of
an argument, then debuggers should record that value for you.

I believe that it's easy enough to simulate this in some debuggers
by telling them to store the variable somewhere each time the function
is entered.

I have most often encountered this problem when debugging a program
compiled with optimisation (obviously not an ideal situation) and the
value is unavailable because the implementation has reused the
location, even though the original code does not.

-- Richard
 
R

Richard Tobin

When a function crashes, in the debugger you can go to the
first line and restart the function to see how the crash happened

If you have modified the arguments that's impossible.

It's also likely to be impossible if you have modified any other
data the function uses, such as objects pointed to by the arguments.
Presumably you don't argue that you should never do that?

-- Richard
 
J

jacob navia

Richard said:
If it's common in debuggers to want to see the original value of
an argument, then debuggers should record that value for you.

I believe that it's easy enough to simulate this in some debuggers
by telling them to store the variable somewhere each time the function
is entered.

I have most often encountered this problem when debugging a program
compiled with optimisation (obviously not an ideal situation) and the
value is unavailable because the implementation has reused the
location, even though the original code does not.

-- Richard

I do not know of a debugger that does this. Maybe you know what
kind of command you have to give to gdb to do that?

Microsoft's debugger doesn't do this as far as I can tell.
The debugger I wrote for lcc-win doesn't either...

In any case, it is easier to do it yourself in plain C.
 
J

jacob navia

Richard said:
It's also likely to be impossible if you have modified any other
data the function uses, such as objects pointed to by the arguments.
Presumably you don't argue that you should never do that?

-- Richard

See the reply to Mt Butditt in the same thread.
 
R

Richard Tobin

If it's common in debuggers to want to see the original value of
an argument, then debuggers should record that value for you.

I believe that it's easy enough to simulate this in some debuggers
by telling them to store the variable somewhere each time the function
is entered.
[...]

I do not know of a debugger that does this. Maybe you know what
kind of command you have to give to gdb to do that?

Here's a program of the kind we're considering:

#include <stdio.h>

int fact(int n)
{
int f=1;

while(n > 1)
f *= n--;

f /= 0;

return f;
}

int main(void)
{
printf("%d\n", fact(6));
return 0;
}

The division by zero is intended to cause an error of course. We want
to see what the argument n was when fact() was entered. We set
a breakpoint on the function, then tell gdb to execute commands
at that breakpoint to store the value of n and continue:

(gdb) break fact
Breakpoint 1 at 0x1f90: file fact.c, line 5.
(gdb) commands
Type commands for when breakpoint 1 is hit, one per line.
End with a line saying just "end".
set $arg = n
continue
end

Now we run the program:

(gdb) run
Starting program: /private/tmp/fact
Reading symbols for shared libraries ++. done

Breakpoint 1, fact (n=6) at fact.c:5
5 int f=1;

Program received signal EXC_ARITHMETIC, Arithmetic exception.
0x00001fbb in fact (n=1) at fact.c:10
10 f /= 0;

And look at the saved value:

(gdb) print $arg
$1 = 6

-- Richard
 
Ad

Advertisements

D

Dik T. Winter

> Beej Jorgensen said:
>
> Yes, this has become apparent during the discussion.

And in that case the same argument can be used to not modify variables
that are not local to a particular block, it makes restarting the block
impossible. Consider the case with loop that contains a block of code.
If there is going something wrong at some iteration and if you modify
non-local variables you cannot restart that iteration...
 
J

jacob navia

Dik said:
And in that case the same argument can be used to not modify variables
that are not local to a particular block, it makes restarting the block
impossible. Consider the case with loop that contains a block of code.
If there is going something wrong at some iteration and if you modify
non-local variables you cannot restart that iteration...

Another "argument" that doesn't hold.

You can always restart the block by going to the start of the
function!

The function INTERFACE is a clear separation between parts of a
software. That interface should be held constant if possible.
 
N

Nick Keighley

posts should be standalone and understandable without
reading the subject line

Subject: "Modifying the arguments to a function is a bad idea"


My arguments for justifying the thesis in the subject line are:

o It makes the function not restartable

how do you "restart" a function? With a debugger?
I never restart functions with a debugger so I don't regard this as a
problem


<snip>
 
Ad

Advertisements

D

Dik T. Winter

>
> Another "argument" that doesn't hold.
>
> You can always restart the block by going to the start of the
> function!
>
> The function INTERFACE is a clear separation between parts of a
> software. That interface should be held constant if possible.

Yeah, nice if the crash occurs in the millionth execution of a loop.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top