Blocks and Their Use

Martin · Mar 25, 2007

I have a question regarding *use* of blocks.

In Plauger's THE STANDARD C LIBRARY in xfmtval.c in Chapter 6 I noticed in
function _Fmtval() that after some processing of 50 lines or so he creates a
block

char *_Fmtval( ... )
{

/* .. 50 lines or so of code ... */

{ /* build string in buf under control of fmt */
char *end, *s;
const char *g;
size_t i, ns;

for (s = buf; *fmt; ++fmt, s+= strlen(s))
/* ... */

}
return (buf);

} /* end function _Fmtval */

We all know this is legitimate, but I admit I have not seen this technique
employed before. Plauger also uses this technique in setlocal.c in the same
chapter.

Is this technique recommended, or are there caveats? I can see some C Coding
Guidelines excluding its use, if not specifically then by implication,
perhaps.

Having just added some code to an existing file in a void function, I
elected to employ this technique. It takes the form:

if ( error )
{
/* report error */
return;
}

if ( another_error )
{
/* report this error */
return;
}

/* ... a few more error checks ... */

/* no errors here, start processing */
{
unsigned char c;
/* ... more definitions ... */

/*... processing ... */
}

Which I think works well because the variables defined in the block are
necessary only if the errors looked for above are not present.

Has anyone any constructive comments about this technique?

Keith Thompson · Mar 25, 2007

Martin said:
I have a question regarding *use* of blocks.

In Plauger's THE STANDARD C LIBRARY in xfmtval.c in Chapter 6 I noticed in
function _Fmtval() that after some processing of 50 lines or so he creates a
block

char *_Fmtval( ... )
{

/* .. 50 lines or so of code ... */

{ /* build string in buf under control of fmt */
char *end, *s;
const char *g;
size_t i, ns;

for (s = buf; *fmt; ++fmt, s+= strlen(s))
/* ... */

}
return (buf);

} /* end function _Fmtval */

We all know this is legitimate, but I admit I have not seen this
technique employed before. Plauger also uses this technique in
setlocal.c in the same chapter.

Is this technique recommended, or are there caveats? I can see some
C Coding Guidelines excluding its use, if not specifically then by
implication, perhaps.

After reading the above, it wasn't at all clear to me what you meant
by "this technique". Reading the code that follows, I see that you're
referring to introducing a block for the purpose of declaring local
variables that aren't needed by the preceding code. (The subject
"Blocks and Their Use" should have clued me in!)

Yes, that's a perfectly valid technique, and I see nothing wrong with
it stylistically. Note that it doesn't necessarily save you any
memory; you might assume that the variables in the block won't be
allocated until and unless you enter the block, but that's not
guaranteed, and a compiler is free to allocate all of a function's
local variables (including ones in inner blocks) on entry to the
function. (Except for VLAs, I suppose.) But it does make it clear
that those variables are only used within the block, which makes the
code easier to understand.

Try moving the declarations of end, s, g, i, and ns to the beginning
of the function, above the "50 lines or so of code". Without
carefully reading all 50 lines, you can't be sure where those
variables are used.

I was about to express surprise that the code recomputes strlen(s) on
each iteration of the loop. It's a common newbie error to do
something like:

for (i = 0; i < strlen(s); i ++) {
/* ... code that uses s ... */
}

The problem is that strlen() has to scan the entire string, making the
loop O(N**2). Saving the value of strlen(s) in a variable would make
it O(N), a significant improvement.

But of course P.J. Plauger didn't make such a newbie mistake.
strlen(s) *has* to be recomputed on each iteration, because it changes
on each iteration. The "s += strlen(s)" is simply a clever way to
advance the pointer s up to the next '\0' character.
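
As a minimal sketch of that idiom (not Plauger's code): the hypothetical helper join_items() below builds a comma-separated string, and after each write the loop advances s to the new terminating '\0' with "s += strlen(s)". The function name, its parameters, and the use of C99 snprintf() are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes items[0..n-1] into buf, separated by commas.
   Output is truncated if buf is too small. Returns buf. */
char *join_items(char *buf, size_t bufsize, const char *const *items, size_t n)
{
    char *s = buf;
    size_t i;

    assert(buf != NULL && bufsize > 0);
    buf[0] = '\0';

    /* After each pass, s += strlen(s) moves s to the terminating '\0'
       of what has been written so far, so the next write appends. */
    for (i = 0; i < n; ++i, s += strlen(s)) {
        size_t room = bufsize - (size_t)(s - buf);  /* always >= 1 */
        snprintf(s, room, "%s%s", i ? "," : "", items[i]);
    }
    return buf;
}
```

Here strlen(s) only ever scans the piece just written, so the repeated calls do not make the loop quadratic in the total length.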

Ian Collins · Mar 25, 2007

Martin said:
I have a question regarding *use* of blocks.

Having just added some code to an existing file in a void function, I
elected to employ this technique. It takes the form:

if ( error )
{
/* report error */
return;
}

if ( another_error )
{
/* report this error */
return;
}

/* ... a few more error checks ... */

/* no errors here, start processing */
{
unsigned char c;
/* ... more definitions ... */

/*... processing ... */
}

Which I think works well because the variables defined in the block are
necessary only if the errors looked for above are not present.

Has anyone any constructive comments about this technique?

It is a perfectly valid technique for containing the scope of variables.
It is also a strong hint that the code in the block might be better off
in its own function.
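
A rough sketch of what that refactoring might look like, assuming the processing only needs a buffer and its length. The names process() and sum_bytes() and the byte-summing work are invented for illustration; they are not from the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical processing step that would otherwise be a bare block.
   Its working variables are now private to this function. */
static unsigned long sum_bytes(const unsigned char *data, size_t len)
{
    unsigned long total = 0;
    size_t i;

    assert(data != NULL);
    for (i = 0; i < len; ++i)
        total += data[i];
    return total;
}

/* Error checks first, each with an early return; then the real work.
   Returns 0 on success, -1 on error. */
int process(const unsigned char *data, size_t len, unsigned long *out)
{
    if (data == NULL || out == NULL)
        return -1;              /* report error */
    if (len == 0)
        return -1;              /* report this error */

    /* no errors here, start processing */
    *out = sum_bytes(data, len);
    return 0;
}
```

The trade-off is that anything the block used from the enclosing function now has to be passed in as an argument.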

Jack Klein · Mar 25, 2007

I have a question regarding *use* of blocks.

In Plauger's THE STANDARD C LIBRARY in xfmtval.c in Chapter 6 I noticed in
function _Fmtval() that after some processing of 50 lines or so he creates a
block

char *_Fmtval( ... )
{

/* .. 50 lines or so of code ... */

{ /* build string in buf under control of fmt */
char *end, *s;
const char *g;
size_t i, ns;

for (s = buf; *fmt; ++fmt, s+= strlen(s))
/* ... */

}
return (buf);

} /* end function _Fmtval */

We all know this is legitimate, but I admit I have not seen this technique
employed before. Plauger also uses this technique in setlocal.c in the same
chapter.

Is this technique recommended, or are there caveats? I can see some C Coding
Guidelines excluding its use, if not specifically then by implication,
perhaps.

Having just added some code to an existing file in a void function, I
elected to employ this technique. It takes the form:

if ( error )
{
/* report error */
return;
}

if ( another_error )
{
/* report this error */
return;
}

/* ... a few more error checks ... */

/* no errors here, start processing */
{
unsigned char c;
/* ... more definitions ... */

/*... processing ... */
}

Which I think works well because the variables defined in the block are
necessary only if the errors looked for above are not present.

Has anyone any constructive comments about this technique?

I use the technique of adding a block merely to open a local scope
most often when I am maintaining older code, my own or somebody
else's.

There is, unfortunately, a lot of older C code with very large
functions and all the variables used anywhere in the function defined
at the top.

In fact, I have seen C functions with dozens of cases in a switch
statement, all written in line, where some counter variable,
invariably named 'i', is used in two or three of the cases, and
defined way up at the top of the file.

The alternative is quite simply to put each case handler in its own
scope:

case '1':
{
/* define variables if needed */
break;
}

Opening a new scope lets you define new variables for code added in
the middle of a function, and helps you avoid accidentally modifying
the value of an existing one, especially when it is not easy to see
all the places where it might be used.
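
A small sketch of per-case scopes, under the assumption of a hypothetical handle() function with two invented commands. Each case declares its own i, with a different type, and neither can disturb the other or any variable declared at the top of the function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command handler: 'l' returns the length of arg,
   'd' returns the number of decimal digits in arg, anything else -1. */
int handle(char cmd, const char *arg)
{
    int result = -1;

    assert(arg != NULL);
    switch (cmd) {
    case 'l': {
        size_t i = strlen(arg);     /* this i belongs to case 'l' only */
        result = (int)i;
        break;
    }
    case 'd': {
        int i = 0;                  /* a different i, private to case 'd' */
        const char *p;

        for (p = arg; *p != '\0'; ++p)
            if (*p >= '0' && *p <= '9')
                ++i;
        result = i;
        break;
    }
    default:
        break;
    }
    return result;
}
```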

On the other hand, I never add a block just to define a local scope in
new code. The code can always be structured better.

In your particular example, which could be a simplification, you could
open that block with an else or else if.
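
One possible reading of that suggestion, as a sketch only: chain the error checks with else if and let the final else supply the block, so the braces belong to a statement instead of standing alone. The function scale_input(), its parameters, and the doubling are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical function: returns 0 on success, -1 after reporting an error. */
int scale_input(int error, int another_error, int input, int *out)
{
    assert(out != NULL);

    if (error) {
        fputs("first error\n", stderr);
        return -1;
    } else if (another_error) {
        fputs("second error\n", stderr);
        return -1;
    } else {
        /* These variables exist only on the no-error path. */
        int scaled = input * 2;

        *out = scaled;
    }
    return 0;
}
```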

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html

SM Ryan · Mar 26, 2007

# Is this technique recommended, or are there caveats? I can see some C Coding
# Guidelines excluding its use, if not specifically then by implication,
# perhaps.

It's useful if you don't want to scan lots of code to make sure some
code you're inserting doesn't step on existing code.

Also useful if you have a real macro processor and you're building up
the code in diverse locations.

Malcolm McLean · Mar 26, 2007

Martin said:
[block scope variables] Which I think works well because the variables
defined in the block are necessary only if the errors looked for above are
not present.

Has anyone any constructive comments about this technique?

There is a limit to the number of scopes a programmer can cope with. We
already have global, file scope, and function scope variables. Adding
another scope would be too much, except that globals are typically used so
rarely that we can discount them. However when you have a hierarchy of
blocks each with different variables, the code can become unreadable.
I am against block scope variables because they encourage hacking rather
than thinking of the function as a unit, as well as on account of their
potential for introducing too many scopes. However I am not militantly
against them in short leaf blocks.

Malcolm McLean · Mar 26, 2007

SM Ryan said:
# Is this technique recommended, or are there caveats? I can see some C Coding
# Guidelines excluding its use, if not specifically then by implication,
# perhaps.

It's useful if you don't want to scan lots of code to make sure some
code you're inserting doesn't step on existing code.

My point exactly.

Richard Tobin · Mar 26, 2007

There is a limit to the number of scopes a programmer can cope with. We
already have global, file scope, and function scope variables. Adding
another scope would be too much, except that globals are typically used so
rarely that we can discount them. However when you have a hierarchy of
blocks each with different variables, the code can become unreadable.

I can't deny that you may find that the case, but examples from other
languages such as Lisp show that it's just a matter of opinion,
somewhat influenced by the syntax of the language in question.

I find that, for code of equal length, introducing sub-blocks often
improves readability by making the scope of variables explicit. On
the other hand, the amount of vertical whitespace it introduces in
most C styles works in the opposite direction.

-- Richard

Mark McIntyre · Mar 26, 2007

you might assume that the variables in the block won't be
allocated until and unless you enter the block, but that's not
guaranteed, and a compiler is free to allocate all of a function's
local variables (including ones in inner blocks) on entry to the
function.

Really? Pathological case:

void foo()
{
double x = 4.5;
{
int x = 3;
}
{
char x[4]={"foo"};
}
}

Or is it a case of "it can, provided there are no side-effects"?

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Ian Collins · Mar 27, 2007

Mark said:
you might assume that the variables in the block won't be
allocated until and unless you enter the block, but that's not
guaranteed, and a compiler is free to allocate all of a function's
local variables (including ones in inner blocks) on entry to the
function.

Really? Pathological case:

void foo()
{
double x = 4.5;
{
int x = 3;
}
{
char x[4]={"foo"};
}
}

Or is it a case of "it can, provided there are no side-effects"?

The compiler is free to allocate the *storage* (typically space on the
stack). For the two inner blocks it could allocate sizeof(int)+4 bytes,
or overlay them in 4 bytes (assuming a 4-byte int).

Alan Curry · Mar 27, 2007

you might assume that the variables in the block won't be
allocated until and unless you enter the block, but that's not
guaranteed, and a compiler is free to allocate all of a function's
local variables (including ones in inner blocks) on entry to the
function.

Really? Pathological case:

void foo()
{
double x = 4.5;
{
int x = 3;
}
{
char x[4]={"foo"};
}
}

could easily be compiled the same as this, with all local variables allocated
at the top:

void foo()
{
double x_the_first;
int x_the_second;
char x_the_third[4];

x_the_first = 4.5;
{
x_the_second = 3;
}
{
strcpy(x_the_third, "foo");
}
}

Keith Thompson · Mar 27, 2007

Mark McIntyre said:
you might assume that the variables in the block won't be
allocated until and unless you enter the block, but that's not
guaranteed, and a compiler is free to allocate all of a function's
local variables (including ones in inner blocks) on entry to the
function.

Really? Pathological case:

void foo()
{
double x = 4.5;
{
int x = 3;
}
{
char x[4]={"foo"};
}
}

Or is it a case of "it can, provided there are no side-effects"?

What problem does this create?

Scope and storage duration are two different things. Assuming that,
let's say, double is 8 bytes and int is 4 bytes, an implementation could
allocate 16 bytes on entry to foo(), or 12 bytes if it chooses to
overlay the storage for the parallel blocks. Each variable has its own
address (the addresses of the int and the char[4] may or may not be
the same), and any reference to x will resolve to the correct object for
the scope in which the reference appears.
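
A short sketch that checks the name-resolution half of this, using a hypothetical shadow_check() modelled on the foo() above. It says nothing about where or when the storage is allocated, since that is left to the implementation; it only confirms that each x refers to the object declared in its own scope.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if every reference to x resolved to the object
   declared in the scope where the reference appears. */
int shadow_check(void)
{
    int ok = 1;
    double x = 4.5;

    {
        int x = 3;                        /* hides the double */
        ok = ok && (x == 3);
    }
    {
        char x[4] = "foo";                /* hides the double again */
        ok = ok && (strcmp(x, "foo") == 0);
    }

    /* Back in the outer scope, x is the double once more. */
    assert(sizeof x == sizeof(double));
    ok = ok && (x == 4.5);
    return ok;
}
```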

Chris Dollin · Mar 27, 2007

Mark said:
you might assume that the variables in the block won't be
allocated until and unless you enter the block, but that's not
guaranteed, and a compiler is free to allocate all of a function's
local variables (including ones in inner blocks) on entry to the
function.

Really? Pathological case:

void foo()
{
double x = 4.5;
{
int x = 3;
}
{
char x[4]={"foo"};
}
}

What about it? The compiler can allocate space for a double, an int,
and a char[] on function entry, using the first for the first `x`,
the second for the second, and the third for the third.

Or it could, on a suitable machine, allocate space for a double and
for an int-or-char[4], using the second for both nested `x`s.

[Or it could allocate no space at all and implement the function with
a `mov pc, r14` or your local flavour of return instruction ...]

--
The second Jena user conference! http://hpl.hp.com/conferences/juc2007/
"I just wonder when we're going to have to sit down and re-evaluate /Sahara/
our decision-making paradigm."

Hewlett-Packard Limited registered office: Cain Road, Bracknell,
registered no: 690597 England Berks RG12 1HN

Mark McIntyre · Mar 27, 2007

I guess your answer was "yes"...

let's say, double is 8 bytes and int is 4 bytes, an implementation could
allocate 16 bytes on entry to foo(), or 12 bytes if it chooses to
overlay the storage for the parallel blocks.

I believe the point you're making is that the variable *names* don't
exist in the compiled code, so as long as the compiler correctly
resolves each reference to the right object, it can allocate storage
at whatever point suits it best - even at programme startup should it
so desire. Obviously such allocation schemes would be suboptimal in
many environments.

--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Martin · Mar 27, 2007

Well, as I expected, I got a lot of informed responses. Thank you.

It seems that the use of blocks does not guarantee a saving in terms of
allocation of variables - one of the reasons I found them attractive and
*assumed* was a reason that Plauger uses them.

However, another reason I found attractive for the use of blocks was the
containment of variables defined within the block, to that block, useful in
code maintenance as some respondents have pointed out.

Ian Collins mentioned that using a block is a "strong hint that the code in
the block might be better off in its own function" - although true, it also
means that if variables are used within the block from an outer block, the
function would have to take them in as arguments.

My main question concerned my code. Basically, I am not fond of having a
situation where you do this:

if ( /* no error */ )
{
ret_val = SUCCESS;
/* lots of code */
}
else
{
ret_val = FAIL;
}

return ret_val;

I think this is better written:

if ( an error )
{
return FAIL;
}

/* now we can process, having obviated any errors */
/* ... processing ... */
return SUCCESS;

My original post actually has several situations:

if ( error_1 )
return error_1_code;

if ( error_2 )
return error_2_code;

/* a couple more */

/* now process having error checked above */
{
int var_1; /* these variables not needed if errors detected above */
char var_2;
/* processing */
}

Pardon the pseudo-C approach but I'm sure you know what I'm trying to
demonstrate. In fact, the processing I'm doing is all inside a case
statement in a rather large switch(). I'm maintaining it, by the way, it's
not the way I would have done it. I have made the case statement a block to
further contain the code (prior to Jack Klein's suggestion).

In fact, Jack mentioned "In your particular example, which could be a
simplification, you could open that block with an else or else if" but
I'm not really sure what you mean, Jack.

Ian Collins · Mar 27, 2007

Martin said:
Well, as I expected, I got a lot of informed responses. Thank you.

It seems that the use of blocks does not guarantee a saving in terms of
allocation of variables - one of the reasons I found them attractive and
*assumed* was a reason that Plauger uses them.

However, another reason I found attractive for the use of blocks was the
containment of variables defined within the block, to that block, useful in
code maintenance as some respondents have pointed out.

Ian Collins mentioned that using a block is a "strong hint that the code in
the block might be better off in its own function" - although true, it also
means that if variables are used within the block from an outer block, the
function would have to take them in as arguments.

True, the cost of any change has to be weighed against the benefits.

My main question concerned my code. Basically, I am not fond of having a
situation where you do this:

if ( /* no error */ )
{
ret_val = SUCCESS;
/* lots of code */
}
else
{
ret_val = FAIL;
}

return ret_val;

I think this is better written:

if ( an error )
{
return FAIL;
}

/* now we can process, having obviated any errors */
/* ... processing ... */
return SUCCESS;

This problem has been solved in C99, where you can mix variable
declarations and statements.

CBFalconer · Mar 28, 2007

Martin said:
.... snip ...

My main question concerned my code. Basically, I am not fond of
having a situation where you do this:

if ( /* no error */ )
{
ret_val = SUCCESS;
/* lots of code */
}
else
{
ret_val = FAIL;
}
return ret_val;

I think this is better handled as:

if (error) ret_val = FAIL;
else {
ret_val = SUCCESS;
/* lotsa code */
/* which may include "ret_val = FAIL" overrides */
}
return ret_val;

Which incorporates several guidelines. One is single point of
exit. Another is that the controlling condition is easily visible
from the indentation. A third is that the short condition comes
first, easing finding the controlling condition via the
indentation.

Keith Thompson · Mar 28, 2007

Ian Collins said:
Martin wrote: [...]

My main question concerned my code. Basically, I am not fond of having a
situation where you do this:

if ( /* no error */ )
{
ret_val = SUCCESS;
/* lots of code */
}
else
{
ret_val = FAIL;
}

return ret_val;

I think this is better written:

if ( an error )
{
return FAIL;
}

/* now we can process, having obviated any errors */
/* ... processing ... */
return SUCCESS;

Where the "now we can process" presumably includes the declaration of
any objects needed for that processing; in C90, this requires a block.

This problem has been solved in C99, where you can mix variable
declarations and statements.

Partly, but even in C99, if you don't introduce a block, the scope of
each variable still extends from its declaration to the end of the
function. More generally, the scope extends from the declaration to
the end of the enclosing block (which may be the entire body of the
function).

If you want parallel scopes, you still need blocks:

if (something) {
/* declare variables here */
do_something ...;
}
else {
/* declare other variables here */
do_something_else ...;
}

The compiler isn't obligated to overlay the two sets of variables, but
it's a reasonable and easy optimization.
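
To make the parallel-scope form concrete, here is a sketch with an invented function measure(). Each branch declares only the variables it needs, and none of them is visible in the other branch or after the if statement. Whether the two sets share storage is up to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical function: counts space-separated words when count_words
   is nonzero, otherwise returns the length of text. */
size_t measure(const char *text, int count_words)
{
    assert(text != NULL);

    if (count_words) {
        /* variables for this branch only */
        size_t words = 0;
        int in_word = 0;
        const char *p;

        for (p = text; *p != '\0'; ++p) {
            if (*p == ' ') {
                in_word = 0;
            } else if (!in_word) {
                in_word = 1;
                ++words;
            }
        }
        return words;
    } else {
        /* a different set of variables, in a parallel scope */
        size_t length = strlen(text);

        return length;
    }
}
```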

Keith Thompson · Mar 28, 2007

Chris Dollin said:
Mark McIntyre wrote: [...]

Really? Pathological case:

void foo()
{
double x = 4.5;
{
int x = 3;
}
{
char x[4]={"foo"};
}
}

What about it? The compiler can allocate space for a double, an int,
and a char[] on function entry, using the first for the first `x`,
the second for the second, and the third for the third.

Or it could, on a suitable machine, allocate space for a double and
for an int-or-char[4], using the second for both nested `x`s.

[...]

What do you mean by "on a suitable machine"? The only
machine-specific thing I can think of that would affect this would be
an alignment constraint; even then there could easily be a partial
overlap.

Richard Tobin · Mar 28, 2007

Keith Thompson said:
If you want parallel scopes, you still need blocks:

if (something) {
/* declare variables here */
do_something ...;
}
else {
/* declare other variables here */
do_something_else ...;
}

The compiler isn't obligated to overlay the two sets of variables, but
it's a reasonable and easy optimization.

Even with this:

/* declare variables here */
/* declare other variables here */
if (something) {
do_something using only first variables ...;
}
else {
do_something_else using only second variables ...;
}

a compiler can use the same memory for both sets of variables. The
analysis done by modern compilers makes this too an easy optimisation.

-- Richard
