Definition of expression and statement.

Discussion in 'C Programming' started by dspfun, Dec 29, 2007.

  1. dspfun

    dspfun Guest

    Hi!

    The words "expression" and "statement" are often used in C99 and C
    textbooks; however, I am not sure of the clear definition of these
    words with respect to C.

    Can somebody provide a sharp definition of "expression" and
    "statement"? What is the difference between an expression and a
    statement?

    This is what I have found (textbooks and own conclusions), please
    correct if/where wrong.

    -------------------------------------------------
    An expression is:
    An expression contains data or no data.
    Every expression has a type and, if the type is not a void, a value.
    An expression can contain zero or more operands, and zero or more
    operators.
    The simplest expressions consist of a single constant, a variable or
    a function call.
    An expression can contain an assignment.
    An expression never contains a semicolon.
    Expressions can be joined with other expressions to form more complex
    expressions.
    Expressions can serve as operands.
    A statement will become an expression if the semicolon is removed
    (not true for block statements though).
    The values of expressions that start immediately after a semicolon
    and end immediately before the next semicolon are always discarded.

    Examples:
    4 * 512 //Type: int. Value: 2048.
    printf("An example!\n") //Type: int Value: Whatever is returned from
    printf.
    1.0 + sin(x) //Type: double Value: Whatever is the result of the
    expression.
    srand((unsigned)time(NULL)) //Type: void. Value: None.
    (int*)malloc(sizeof(int)) //Type: int*. Value: The address returned
    by malloc.
    1++ //Type: int. Value: 2, right?
    a++ //Type: Depends on a. Value: One more than a.
    x = 5 //Type: depends on the type of variable x, right? Value: 5.
    2 * 32767 //Type: depends on INT_MAX, right? Value: 65534
    Question: what is the type of the expression above?
    a //Type: Depends on a. Value: Depends on a.
    1 //Type: int. Value: 1
    f() //Type: depends on return type of f(). Value: Depends on what
    f() returns.

    Right?

    In the expressions above the values of the expressions are "thrown
    away", right?

    Any more examples of expressions which are not the same/variants of
    above examples?

    -------------------------------------------------

    A statement is:
    Anything separated by semicolons, unless it's a declaration or an
    expression in a for statement.
    Statements specify an action to be performed, such as an operation or
    function call.
    Statements are program constructs followed by a semicolon.
    An expression that is executed is a statement, right?
    Statements do not have a value or a type.
    A statement specifies an action to be performed, such as an
    arithmetic operation or a function call.
    Every statement that is not a block is terminated by a semicolon.
    A statement is always "atomic", i.e., a statement cannot be broken
    down into "sub" statements.
    The following are statements:
    Assignment(=)
    Compound ({...})
    break
    continue
    goto
    label
    if
    do, while and for
    return
    switch

    Examples of statements:
    All the above expressions will become statements when a semicolon is
    added to the expression.

    Question: Is it possible to have a statement with a semicolon, which
    will not become an expression
    when the semicolon is removed?

    -------------------------------------------------
    Also,

    What is the definition of an expression statement, and how is it
    different from a statement and an expression?
    Is it just an expression followed by a semicolon?

    What is the definition of a block statement?
    Is it just one or more statements within curly braces?

    BRs!
    dspfun, Dec 29, 2007
    #1

  2. dspfun

    osmium Guest

    "dspfun" wrote:

    > The words "expression" and "statement" are often used in C99 and C-
    > textbooks, however, I am not sure of the clear defintion of these
    > words with respect to C.
    >
    > Can somebody provide a sharp defintion of "expression" and
    > "statement"? What is the difference between an expression and a
    > statement?


    I think the only really clear definition comes from a study of the BNF of
    the language. (BNF - Backus Normal Form / Backus-Naur Form.) Have you
    tried Wikipedia?
    osmium, Dec 29, 2007
    #2

  3. dspfun

    James Kuyper Guest

    dspfun wrote:
    > Hi!
    >
    > The words "expression" and "statement" are often used in C99 and C-
    > textbooks, however, I am not sure of the clear defintion of these
    > words with respect to C.
    >
    > Can somebody provide a sharp defintion of "expression" and
    > "statement"? What is the difference between an expression and a
    > statement?


    Section 6.5p1 says:
    "An _expression_ is a sequence of operators and operands that specifies
    computation of a value, or that designates an object or a function, or
    that generates side effects, or that performs a combination thereof."


    Section 6.8p2 says:
    "A _statement_ specifies an action to be performed. ..."

    The '_' characters around a word indicate that it was italicized in the
    original text. That is the standard's way of indicating that these
    clauses count as definitions of those terms.

    > This is what I have found (textbooks and own conclusions), please
    > correct if/where wrong.


    Note: I've only corrected you where wrong; I've cut out everything you
    wrote in which I found no error (which is not to say that there were no
    errors, only that I didn't find them).

    ....
    > An expression never contains a semicolon.


    Technically incorrect: c = ';' is an expression. However, expressions
    will never contain a semicolon as a token. In that expression, ';' is a
    token, but the semicolon character itself is not.

    ....
    > A statement will become an expression if the semicolon is removed
    > (not true for block statements though).


    This is true for expression statements, but not necessarily for other
    kinds. Example:

    return;

    ....
    > 1++ //Type: int. Value: 2, right?


    The operand of ++ must be a modifiable lvalue. It cannot be an
    integer literal.


    > a++ //Type: Depends on a. Value: One more than a.


    The value of that expression is the value of a before it was
    incremented. Note that if 'a' is already at its maximum, the behavior
    of that expression is undefined unless a has an unsigned type.
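
    A minimal sketch of the point about a++. The helper name
    post_increment_demo is hypothetical and used only for illustration: the
    expression a++ yields the value a had before the increment, while a
    itself ends up one larger.

```c
/* Hypothetical helper for illustration: stores the value of the
   expression a++ in *expr_value and returns the final value of a. */
int post_increment_demo(int a, int *expr_value)
{
    *expr_value = a++;   /* the value of a++ is the old a; a is then incremented */
    return a;
}
```

    Calling post_increment_demo(5, &v) should leave 5 in v and return 6,
    provided 5 is well below INT_MAX so that no overflow is involved.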

    ....
    > A statement is:
    > Anything separated by semicolons, unless it's a declaration or an
    > expression in a for statement.


    Statements are not separated by semicolons. Statements include the
    semicolon. Also, note that a compound statement is terminated by a '}',
    not a semicolon. Finally, note that declarations are also terminated by
    semicolons.

    ....
    > Statements are program constructs followed by a semicolon.


    Not in the case of compound statements.

    ....
    > An expression that is executed is a statement, right?


    No. The three expressions in a for(a; b; c) construct are executed, but
    none of them are statements in themselves.

    > ... A statement is always "atomic", i.e., a statement cannot be broken
    > down into "sub" statements.


    Not true for compound, selection, or iteration statements. Each of those
    contains sub-statements.

    > Question: Is it possible to have a statement with a semicolon, which
    > will not become an expression
    > when the semicolon is removed?


    return;

    > What is the defintion of an expression statement, and how is it
    > different from a statement and an expression?


    An expression statement is a particular kind of statement. There are
    many other kinds. An expression statement contains an expression; it is
    not itself an expression.

    > Is it just an expression followed by a semicolon.


    Yes.

    > What is the definition of a block statement?
    > Is it just one or more statements within curly braces?


    Yes.
    James Kuyper, Dec 29, 2007
    #3
  4. On Sat, 29 Dec 2007 16:40:04 +0000, James Kuyper wrote:
    > dspfun wrote:
    >> An expression never contains a semicolon.

    >
    > Technically incorrect: c = ';' is an expression. However, expressions
    > will never contain a semicolon as a token. In that expression, ';' is a
    > token, but the semicolon character itself is not.


    Semicolons can occur in declarations nested within expressions.

    (struct S { int member; }) { 0 }

    The above is a perfectly valid expression of type struct S.
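
    A small sketch that puts such an expression to use, assuming a C99 (or
    later) compiler. The function name compound_literal_member and its
    parameter are hypothetical; the compound literal has the same shape as
    the one above, with the semicolon after "int member" sitting inside the
    expression as a real token.

```c
/* Returns the member of a compound literal whose type name declares
   struct S inline. The ';' after "int member" is a token that appears
   inside the expression. Requires C99 or later. */
int compound_literal_member(int value)
{
    return ((struct S { int member; }) { value }).member;
}
```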
    Harald van Dijk, Dec 29, 2007
    #4
  5. dspfun

    manisha Guest

    On Dec 30, 1:53 am, dspfun <> wrote:
    > Hi!
    >
    > The words "expression" and "statement" are often used in C99 and C-
    > textbooks, however, I am not sure of the clear defintion of these
    > words with respect to C.
    >
    > Can somebody provide a sharp defintion of "expression" and
    > "statement"? What is the difference between an expression and a
    > statement?
    >
    > This is what I have found (textbooks and own conclusions), please
    > correct if/where wrong.
    >
    > -------------------------------------------------
    > An expression is:
    >  An expression contains data or no data.
    >  Every expression has a type and, if the type is not a void, a value.
    >  An expression can contain zero or more operands, and zero or more
    > operators.
    >  The simplest expressions consists of a single constant, a variable or
    > a function call.
    >  An expression can contain an assignment.
    >  An expression never contains a semicolon.
    >  Expressions can be joined with other expressions to form more complex
    > expressions.
    >  Expressions can serve as operands.
    >  A statement will become an expression if the semicolon is removed
    > (not true for block statements though).
    >  The values of expressions that starts immediately after a semicolon
    > and ends immediately before next semicolon are always discarded.
    >
    > Examples:
    >  4 * 512                        //Type: int.    Value: 2048.
    >  printf("An example!\n)    //Type: int     Value: Whatever is returned from
    > printf.
    >  1.0 + sin(x)           //Type: double  Value: Whatever is the result of the
    > expression.
    >  srand((unsigned)time(NULL))    //Type: void.   Value: None.
    >  (int*)malloc(sizeof(int))      //Type: int*.   Value: The address returned
    > by malloc.
    >  1++                    //Type: int.    Value: 2, right?
    >  a++                    //Type: Depends on a. Value: One more than a.
    >  x = 5                  //Type: depends on the type of variable x, right? Value: 5.
    >  2 * 32767                      //Type: depends on INT_MAX, right? Value: 65534
    >  Question: what is the type of the expression above?
    >  a                      //Type: Depends on a. Value: Depends on a.
    >  1                      //Type: int.     Value: 1
    >  f()                    //Type: depends on return type of f(). Value: Depends on what
    > f() returns.
    >
    > Right?
    >
    > In the expressions above the values of the expressions are "thrown
    > away", right?
    >
    > Any more examples of expressions which are not the same/variants of
    > above examples?
    >
    > -------------------------------------------------
    >
    > A statement is:
    >  Anything separated by semicolons, unless it's a declaration or an
    > expression in a for statement.
    >  Statements specify an action to be performed, such as an operation or
    > function call.
    >  Statements are program constructs followed by a semicolon.
    >  An expression that is executed is a statement, right?
    >  Statements do not have a value or a type.
    >  A statement specifies an action to be performed, such as an
    > arithmetic operation of a function call.
    >  Everey statement that is not a block is terminated by a semicolon.
    >  A statement is always "atomic", i.e., a statement cannot be broken
    > down into "sub" statements.
    > The following are statements:
    >  Assignment(=)
    >  Compound ({...})
    >  break
    >  continue
    >  goto
    >  label
    >  if
    >  do, while and for
    >  return
    >  switch
    >
    > Examples of statements:
    >  All the above expressions will become statements when a semicolon is
    > added to the expression.
    >
    > Question: Is it possible to have a statement with a semicolon, which
    > will not become an expression
    > when the semicolon is removed?
    >
    > -------------------------------------------------
    > Also,
    >
    > What is the defintion of an expression statement, and how is it
    > different from a statement and an expression?
    > Is it just an expression followed by a semicolon.
    >
    > What is the definition of a block statement?
    > Is it just one or more statements within curly braces?
    >
    > BRs!


    Hello,
    An expression is a combination of one or more operators, operands and
    constants, arranged according to the precedence of the operators
    and the rules of the language. An expression always
    produces a result. Expressions are in general of several types, such
    as constant, integral, float, logical, relational, boolean
    and bitwise, depending upon the value the expression produces.
    On the other hand, a statement may be any instruction
    given to the computer. It is followed by a semicolon and may contain
    keywords, variables, functions, etc. Statements are also of different
    types, e.g. control statements, looping statements, branching
    statements, i/o statements, type declarations, etc. When an
    expression is followed by a semicolon, the statement may be called
    an expression statement, e.g. c=a*b;
    A block statement is nothing but a group of statements enclosed within
    curly braces. It is sometimes also called a compound statement, and it
    always has to be placed within two braces. Most of the time it is
    used in loops and function definitions.
    manisha, Dec 29, 2007
    #5
  6. >The words "expression" and "statement" are often used in C99 and C-
    >textbooks, however, I am not sure of the clear defintion of these
    >words with respect to C.
    >
    >Can somebody provide a sharp defintion of "expression" and
    >"statement"? What is the difference between an expression and a
    >statement?


    An expression followed by a semicolon is one type of statement.
    It is NOT the only type of statement; there are many others.

    >An expression is:
    > An expression contains data or no data.


    I'm not sure what you mean by this, but the expression:
    ""
    might be considered to be an exception.

    > Every expression has a type and, if the type is not a void, a value.

    Ok.

    > An expression can contain zero or more operands, and zero or more
    >operators.

    Ok.
    > The simplest expressions consists of a single constant, a variable or
    >a function call.


    I don't think I'd call a function call "simple", especially since the
    arguments can get very complicated..

    > An expression can contain an assignment.

    Ok.
    > An expression never contains a semicolon.


    c = ';'
    is a valid expression. So is:
    message = "H;e;l;l;o;;;W;o;r;l;d;\n";
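
    As a rough illustration, here is a sketch in which the semicolon
    character appears inside a character constant without ending any
    statement. The function name count_semicolons is hypothetical.

```c
#include <stddef.h>

/* Counts the ';' characters in a string. The ';' in the comparison below
   is a character constant, a single token of its own; it does not
   terminate the if statement it appears in. */
size_t count_semicolons(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (*s == ';')
            n++;
    return n;
}
```

    Passing the string from the message example above should give 12.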

    > Expressions can be joined with other expressions to form more complex
    >expressions.


    Ok, but not to an unlimited extent, as there are type rules.

    > Expressions can serve as operands.


    Ok.

    > A statement will become an expression if the semicolon is removed
    >(not true for block statements though).


    This is only true for expression statements. The following are not
    expressions:
    return 5
    break
    int i
    continue
    and if, for, do-while, while, switch, etc. statements aren't expressions either.

    > The values of expressions that starts immediately after a semicolon
    >and ends immediately before next semicolon are always discarded.


    This is an expression statement you are describing, and yes, the value
    is discarded.

    >Examples:
    > 4 * 512 //Type: int. Value: 2048.
    > printf("An example!\n) //Type: int Value: Whatever is returned from
    >printf.
    > 1.0 + sin(x) //Type: double Value: Whatever is the result of the
    >expression.
    > srand((unsigned)time(NULL)) //Type: void. Value: None.
    > (int*)malloc(sizeof(int)) //Type: int*. Value: The address returned
    >by malloc.
    > 1++ //Type: int. Value: 2, right?


    Error. 1 is not an lvalue. This should not compile.

    > a++ //Type: Depends on a. Value: One more than a.


    Incorrect. The value returned by a++ is the original value of a.

    > x = 5 //Type: depends on the type of variable x, right? Value: 5.
    > 2 * 32767 //Type: depends on INT_MAX, right? Value: 65534


    This is signed int multiplied by signed int, so the result is signed int.
    The value might be 65534 if it is representable in signed int, which is
    not guaranteed (and won't be if int is 16 bits).
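
    One way to avoid relying on the width of int is to test the operand
    against INT_MAX before multiplying. This is only a sketch; the helper
    name doubling_fits_in_int is an assumption made for illustration.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if 2 * n is representable in int for a
   non-negative n, and 0 otherwise. The check is done before any
   multiplication, so no signed overflow can occur. */
int doubling_fits_in_int(int n)
{
    return n >= 0 && n <= INT_MAX / 2;
}
```

    With a 16-bit int, doubling_fits_in_int(32767) is 0; where INT_MAX is at
    least 65534 it is 1.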

    > Question: what is the type of the expression above?
    > a //Type: Depends on a. Value: Depends on a.
    > 1 //Type: int. Value: 1
    > f() //Type: depends on return type of f(). Value: Depends on what
    >f() returns.
    >
    >Right?
    >
    >In the expressions above the values of the expressions are "thrown
    >away", right?


    Yes, if they are used as expression statements. No, if they are used
    as function arguments or part of a larger expression.
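
    A short sketch of the difference, with hypothetical names (calls,
    next_id, discard_versus_use) chosen only for this illustration: the same
    call expression is first used as an expression statement, where its
    value is dropped, and then as the right operand of an assignment, where
    its value is kept.

```c
/* Hypothetical counter used only for this illustration. */
static int calls = 0;

static int next_id(void)
{
    return ++calls;
}

int discard_versus_use(void)
{
    int kept;

    calls = 0;          /* start from a known state                       */
    next_id();          /* expression statement: the value 1 is discarded */
    kept = next_id();   /* operand of '=': the value 2 is used            */
    return kept;
}
```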

    >Any more examples of expressions which are not the same/variants of
    >above examples?
    >
    >-------------------------------------------------
    >
    >A statement is:
    > Anything separated by semicolons, unless it's a declaration or an
    >expression in a for statement.


    This is way too simple and does not account for semicolons in character
    constants or quoted string constants or comments. It also doesn't account
    for things like:

    while(borg(foo++) > 0) { }


    > Statements specify an action to be performed, such as an operation or
    >function call.


    It is debatable whether a null statement (lone semicolon) can be considered
    to specify an action. Also a constant as an expression statement doesn't
    call for any action:
    42;

    > Statements are program constructs followed by a semicolon.


    Some statements don't have their own semicolon but use one in a
    statement that's a part of it, for example:

    if (foo) printf("Thou hast committed a foo!\n");

    > An expression that is executed is a statement, right?


    An expression that is a part of a larger expression is not a statement.
    An expression that is never executed is still an expression:

    if (0) {
        a++;
    } else {
        b++;
    }
    a++ and b++ above are both expression statements. The fact that a++ will
    never be executed is irrelevant.

    > Statements do not have a value or a type.
    > A statement specifies an action to be performed, such as an
    >arithmetic operation of a function call.


    This depends a little on how loose you are with the definition of "action".

    > Everey statement that is not a block is terminated by a semicolon.


    while (1) { 42; }
    is not a block (but contains one) and does not end in a semicolon.

    > A statement is always "atomic", i.e., a statement cannot be broken
    >down into "sub" statements.


    That gets iffy if you consider that a left brace followed by zero or more
    statements followed by a right brace is a statement.

    >The following are statements:
    > Assignment(=)

    I think you're looking for "expression statement" here.
    An assignment need not be an expression statement or in an expression statement:

    for (; foo(a = 3, b = 4, c = 5); ) { bar(); }
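
    A self-contained sketch along the same lines, with hypothetical names
    sum3 and assignments_as_arguments: each assignment is an expression
    whose value is passed as an argument, and none of them is an expression
    statement.

```c
/* Hypothetical helper: adds its three arguments. */
static int sum3(int x, int y, int z)
{
    return x + y + z;
}

int assignments_as_arguments(void)
{
    int a, b, c;

    /* Each assignment is an expression whose value becomes an argument.
       The order in which the arguments are evaluated is unspecified, but
       the three assignments touch different objects, so the result does
       not depend on that order. */
    int total = sum3(a = 3, b = 4, c = 5);

    return total + a + b + c;
}
```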

    > Compound ({...})
    > break
    > continue
    > goto
    > label
    > if
    > do, while and for
    > return
    > switch
    >
    >Examples of statements:
    > All the above expressions will become statements when a semicolon is
    >added to the expression.


    Which above expressions? Immediately above I see a list of statements,
    not expressions.

    An expression followed by a semicolon is an expression statement.

    >
    >Question: Is it possible to have a statement with a semicolon, which
    >will not become an expression
    >when the semicolon is removed?


    Yes, and you listed some of them above.
    break continue goto if do, while and for return switch

    >What is the defintion of an expression statement, and how is it
    >different from a statement and an expression?
    >Is it just an expression followed by a semicolon.


    Yes. A sub-expression of an expression is an expression but it is
    not an expression statement.

    >What is the definition of a block statement?
    >Is it just one or more statements within curly braces?

    Yes.
    Gordon Burditt, Dec 29, 2007
    #6
  7. James Kuyper <> writes:
    > dspfun wrote:
    >> The words "expression" and "statement" are often used in C99 and C-
    >> textbooks, however, I am not sure of the clear defintion of these
    >> words with respect to C.
    >>
    >> Can somebody provide a sharp defintion of "expression" and
    >> "statement"? What is the difference between an expression and a
    >> statement?

    [...]
    > Note: I've only corrected you where wrong; I've cut out everything you
    > wrote in which I found no error (which is not to say that there were
    > no errors, only that I didn't find them).

    [...]
    >
    > ...
    >> An expression never contains a semicolon.

    >
    > Technically incorrect: c = ';' is an expression. However, expressions
    > will never contain a semicolon as a token. In that expression, ';' is
    > a token, but the semicolon character itself is not.


    Harald showed an example of an expression containing a semicolon
    token. (I probably wouldn't have thought of that one myself.)

    [...]

    >> 1++ //Type: int. Value: 2, right?

    >
    > The left operand of ++ must be an modifiable lvalue. It cannot be an
    > integer literal.


    I think a lot of newbie C programmers are so fascinated by the "++"
    and "--" operators that they forget that the way to add one to an
    expression is simply "... + 1".

    [...]

    >> What is the defintion of an expression statement, and how is it
    >> different from a statement and an expression?

    >
    > An expression statement is a particular kind of statement. There are
    > many other kinds. An expression statement contains an expression; it
    > is not itself an expression.
    >
    >> Is it just an expression followed by a semicolon.

    >
    > Yes.


    According to the grammar, the expression in an expression statement
    is optional; thus a null statement
    ;
    is a special case of an expression statement.

    I don't know why it was defined this way. I think it would have been
    simpler to define the null statement as a separate kind of statement.
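
    A typical place where a null statement shows up is a loop whose header
    does all the work. The sketch below is only an illustration; the
    function name count_chars is hypothetical.

```c
#include <stddef.h>

/* Counts characters up to the terminating '\0'. All the work happens in
   the for-loop header, so the loop body is a null statement. */
size_t count_chars(const char *s)
{
    size_t n;

    for (n = 0; s[n] != '\0'; n++)
        ;   /* null statement: an expression statement with no expression */
    return n;
}
```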

    >> What is the definition of a block statement?
    >> Is it just one or more statements within curly braces?

    >
    > Yes.


    Correction: zero or more statements. Actually, zero or more
    "block-items", where a block-item is either a declaration or a
    statement. (In C90, all the declarations must precede all the
    statements; in C99, they can be mixed.)
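
    A minimal sketch of mixed block-items, assuming a C99 or later
    compiler. The function name mixed_block_items is hypothetical.

```c
/* C99 or later: declarations and statements may be interleaved in a block. */
int mixed_block_items(void)
{
    int total = 0;      /* declaration                                  */
    total += 1;         /* statement                                    */
    int step = 2;       /* declaration after a statement: C99, not C90  */
    total += step;      /* statement                                    */
    { }                 /* a block with zero block-items is a statement */
    return total;
}
```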

    --
    Keith Thompson (The_Other_Keith) <>
    [...]
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Dec 29, 2007
    #7
  8. manisha <> writes:
    [...]
    > an expression is a combination of one or more operators, operands and
    > constants which is arranged according to the precedences of operators
    > and rules of the corresponding languages, an expression every time
    > produces a result,


    An expression of type void produces no result.

    > expressions are in general of several types such
    > as..constant expression, integral, float, logical, relational, boolean
    > and bitwise depending upon the value which is produced by an
    > expression.


    Expressions can be classified in a number of ways, e.g., by the type
    of the expression (int, void, double*, etc.) or by the *kind* of
    expression, determined by the top-most operator. Your list mixes
    these two kinds of classification.

    > On the other hand, a statement may be any instruction
    > given to the computer it is followed by a semicolon, it may contain
    > keywords, variables, functions etc. statements are also of different
    > types for eg. control statements, looping statements, branching
    > statements, i/o statements, type declaration and etc.


    C has no i/o statements; i/o is done by function calls, which
    typically appear in expression statements.

    Declarations are not statements. <OT>I think they are in C++.</OT>

    > When an
    > expression is followed by a semicolon then such stmt. may be called as
    > a expression stmt. eg. c=a*b;
    > A block statement is nothing but a group of statements enclosed within
    > curly braces sometimes it is also called as compound statement and it
    > has to be every time placed within two braces, most of the times it is
    > used in loops and function definitions.


    A block statement can also contain declarations, or it can be empty.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    [...]
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Dec 29, 2007
    #8
  10. dspfun

    Army1987 Guest

    dspfun wrote:

    > Hi!
    >
    > The words "expression" and "statement" are often used in C99 and C-
    > textbooks, however, I am not sure of the clear defintion of these
    > words with respect to C.
    >
    > Can somebody provide a sharp defintion of "expression" and
    > "statement"? What is the difference between an expression and a
    > statement?
    >
    > This is what I have found (textbooks and own conclusions), please
    > correct if/where wrong.
    >
    > -------------------------------------------------
    > An expression is:
    > An expression contains data or no data.
    > Every expression has a type and, if the type is not a void, a value.
    > An expression can contain zero or more operands, and zero or more
    > operators.
    > The simplest expressions consists of a single constant, a variable or
    > a function call.
    > An expression can contain an assignment.
    > An expression never contains a semicolon.

    putchar(';') is an expression...
    > Expressions can be joined with other expressions to form more complex
    > expressions.
    > Expressions can serve as operands.
    > A statement will become an expression if the semicolon is removed
    > (not true for block statements though).

    Not true for return statements, either. Or break statements.
    The other way round (an expression becomes a statement when a semicolon is
    added) is correct.
    > The values of expressions that starts immediately after a semicolon
    > and ends immediately before next semicolon are always discarded.

    True, yet a very complicated way to state that.
    Simpler and more accurate: "A statement of the form expression; evaluates
    the expression for side effects, and discards its value."
    > Examples:
    > 4 * 512 //Type: int. Value: 2048.
    > printf("An example!\n) //Type: int Value: Whatever is returned from
    > printf.
    > 1.0 + sin(x) //Type: double Value: Whatever is the result of the
    > expression.
    > srand((unsigned)time(NULL)) //Type: void. Value: None.
    > (int*)malloc(sizeof(int)) //Type: int*. Value: The address returned
    > by malloc.
    > 1++ //Type: int. Value: 2, right?

    No. You can't modify a constant. (You meant 1+1, right?)
    > a++ //Type: Depends on a. Value: One more than a.
    > x = 5 //Type: depends on the type of variable x, right? Value: 5.
    > 2 * 32767 //Type: depends on INT_MAX, right? Value: 65534

    The type is int. Whether it works depends on INT_MAX.
    > Question: what is the type of the expression above?
    > a //Type: Depends on a. Value: Depends on a.
    > 1 //Type: int. Value: 1
    > f() //Type: depends on return type of f(). Value: Depends on what
    > f() returns.
    >
    > Right?

    Yeah.
    > In the expressions above the values of the expressions are "thrown
    > away", right?

    It depends on where they are.
    > Any more examples of expressions which are not the same/variants of
    > above examples?

    && || < > ?: etc...

    > A statement is:
    > Anything separated by semicolons, unless it's a declaration or an
    > expression in a for statement.

    {} is a statement.
    > Statements specify an action to be performed, such as an operation or
    > function call.

    Not necessarily. ((void)0); is a statement.
    > Statements are program constructs followed by a semicolon.
    > An expression that is executed is a statement, right?
    > Statements do not have a value or a type.
    > A statement specifies an action to be performed, such as an
    > arithmetic operation or a function call.
    > Every statement that is not a block is terminated by a semicolon.
    > A statement is always "atomic", i.e., a statement cannot be broken
    > down into "sub" statements.

    Wrong.
    if (foo) { bar(); baz(); } is a statement, but even bar(); and baz(); are
    themselves statements, and so is { bar(); baz(); }.
    > The following are statements:
    > Assignment(=)

    Assignments are expressions (though they become statements with a ;)
    > Compound ({...})
    > break
    > continue
    > goto
    > label
    > if
    > do, while and for
    > return
    > switch
    >
    > Examples of statements:
    > All the above expressions will become statements when a semicolon is
    > added to the expression.
    >
    > Question: Is it possible to have a statement with a semicolon, which
    > will not become an expression
    > when the semicolon is removed?

    return 0;
    break;
    goto lab;
    > ------------------------------------------------- Also,
    >
    > What is the defintion of an expression statement, and how is it
    > different from a statement and an expression? Is it just an expression
    > followed by a semicolon.

    Yes.

    > What is the definition of a block statement? Is it just one or more
    > statements within curly braces?

    Yes, but C99 complicates the rules.
    enum {a, b};
    int different(void)
    {
        if (sizeof(enum {b, a}) != sizeof(int))
            return a; // a == 1
        return b; // which b?
    }
    In C99 the first two lines after the { form a block, so, unlike in C89,
    the b in return b; is 1.

    --
    Army1987 (Replace "NOSPAM" with "email")
    Army1987, Dec 29, 2007
    #10
  11. dspfun

    Chris Torek Guest

    In article <>
    dspfun <> wrote:
    >The words "expression" and "statement" are often used in C99 and C-
    >textbooks, however, I am not sure of the clear defintion of these
    >words with respect to C.


    Others have gone through a lot of examples and given various
    corrections. I would just like to emphasize a few details.

    >Can somebody provide a sharp defintion of "expression" and
    >"statement"? What is the difference between an expression and a
    >statement?


    As at least one person noted, the real heart of the difference is
    actually syntactic. An "expression" is that which is permitted
    syntactically by the grammar in the C Standard (whichever standard
    you use -- C89 or C99).

    In any case, *every* C expression can be turned into a statement
    simply by adding a semicolon at the end, but the reverse is not
    true. This is because the grammar (C89 or C99, either one) has
    various additional things recognized as "statement" that, even if
    they end with a semicolon, are not recognized as an "expression"
    without that semicolon. For instance, a while loop:

    while (expr) statement;

    is itself a statement (specifically, an "iteration-statement"),
    but removing the semicolon does not turn it into an expression.

    The C99 grammar includes the following fragments:

    statement:
        labeled-statement
        compound-statement
        expression-statement
        selection-statement
        iteration-statement
        jump-statement

    expression-statement:
        expression-opt ;

    This last (the expression-statement part of the grammar) is why
    any expression can be turned into a statement.

    The fact that a while loop (like a do-while or for loop) is recognized
    only by the "iteration-statement" part of the grammar is why it
    does not become an expression when removing the semicolon.

    Last, although this is not relevant to the distinction between
    "expression" and "statement", there is a key item here that I
    think many people miss as well:

    > 1++ //Type: int. Value: 2, right?
    > a++ //Type: Depends on a. Value: One more than a.


    (As others noted, "1++" is a constraint violation and thus requires
    a diagnostic. "a++" is OK -- that is, is not a constraint violation
    as long as "a" is a "modifiable lvalue". It may have undefined
    behavior, e.g., if a is an "int" variable and is initially set to
    INT_MAX, but no diagnostic is required for this, and programmers
    should not expect one. The value is not "one more than a", but
    rather, "the value a had before the increment took place".)

    In C, expressions produce values (with one possible exception:
    expressions of type "void" produce no value, or produce "a value
    of type void", depending on who you ask; even the C Standard appears
    to be a bit confused on this issue :) ). However, expressions
    also have "side effects". (A "side effect" is, loosely speaking,
    a change in a variable. Things like printing output are also
    "side effects" in computing theory, although in C this is simply
    done with function calls, e.g., printf(). Side effects are quite
    important in computing theory because operations *without* side
    effects are always completely reversible. This means that "debugging"
    is, at its heart, simply the process of tracking all side effects
    -- all other operations can be trivially backed-up-over.)

    The various modifier operators, including the prefix and postfix
    increment and decrement, have TWO uses: they (a) produce a value,
    and (b) have a side effect. Sometimes, in programming in C, we
    want a value; sometimes we want a side effect; sometimes we even
    want both. We can use these modifier operators for their side
    effects, or for both their values *and* their side effects. For
    instance, in a loop like:

    for (i = 0; i < N; i++)

    we have two modifier-operators: initially we set i to 0, and each
    time at the end of the loop, we increment i. Here, the "=" operator
    is used purely for its side effect: it sets i to 0. The value of
    the entire operation is 0, but this value is discarded. Similarly,
    the "++" operator produces a value -- in this case, the previous
    value of i -- but we throw that value away, as the only thing we
    want is the side effect, of increasing i by 1.

    Because we only want the side effect, we could use any other operation
    that *also* increases i by 1:

    for (i = 0; i < N; ++i)

    and:

    for (i = 0; i < N; i = i + 1)

    are all equally valid ways to write the loop.

    Examples of places where we want *both* the value *and* the side
    effect are not quite as common, but do occur. For instance, if p
    points into a string that contains some 'x' characters, and *p is
    currently one of the 'x' characters, the following loop skips over
    that x and any subsequent 'x'. Note that the increment also happens
    on the final, failing test, so p ends up one past the first
    character that is not an 'x'. E.g., if p points at the first 'x' in
    "hexxllo world", p[-1] will be the first 'l' (and *p the second 'l')
    after the loop ends; if it points at the 'x' in "magix", p[-1] will
    be '\0' and p will point one past the end of the string, so *p must
    not be read:

    while (*p++ == 'x')
        continue;

    Here, the "++" operator is used both for the value it produces --
    i.e., "give me the value p had before an increment occurs" -- and
    for its side effect -- i.e., "and also please increment p before
    the next sequence point". (The old value of p is then given to
    the unary "*" operator, which fetches the character to which p
    pointed before the increment happened. The compiler is free to
    arrange for p to be incremented first or last or anywhere in between,
    as long as it manages to fetch *(whatever_p_used_to_be). On some
    machines, it may make sense to increment p first, then fetch p[-1];
    on some, it may make sense to increment p last; on some, it may be
    possible to increment p while simultaneously fetching, e.g., using
    the auto-increment addressing mode on a PDP-11, or the writeback
    feature of the ARM.)

    Something some C programmers do, but I claim is dodgy at best, is
    use modifier operators purely for their value. For instance,
    consider the following rather silly function, and an example of
    its use:

    int three_more(int x) {
        return x += 3;
    }

    #include <stdio.h>

    int main(void) {
        printf("%d\n", three_more(39));
        return 0;
    }

    which prints 42. The three_more() function uses the "+=" operator
    to modify x (a side effect) *and* produce a value (the value x will
    have after the increment-by-3), but -- by returning, in this case
    returning the value-after-increment -- immediately throws away the
    incremented variable "x". This is valid, "legal" C code, but to
    me it "makes more sense" to write:

    int better_three_more(int x) {
        return x + 3;
    }

    For some reason, beginning C programmers often seem to be fascinated
    by the "double effect" of modifier operators -- especially the
    prefix and postfix increment and decrement operators -- that have
    both a side effect *and* a value, and wind up "overusing" them (as
    in three_more() above). This seems to lead to the desire to write
    things like "1++" or "++41", which are not only pointless (a la
    the modification to x in three_more()), but invalid (draw a
    diagnostic, and usually fail to compile at all).
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Dec 29, 2007
    #11
  12. Chris Torek wrote:
    > In article <>
    > dspfun <> wrote:
    >> The words "expression" and "statement" are often used in C99 and C-
    >> textbooks, however, I am not sure of the clear defintion of these
    >> words with respect to C.

    >
    > Others have gone through a lot of examples and given various
    > corrections. I would just like to emphasize a few details.
    >
    >> Can somebody provide a sharp defintion of "expression" and
    >> "statement"? What is the difference between an expression and a
    >> statement?

    >
    > As at least one person noted, the real heart of the difference is
    > actually syntactic. An "expression" is that which is permitted
    > syntactically by the grammar in the C Standard (whichever standard
    > you use -- C89 or C99).
    >
    > In any case, *every* C expression can be turned into a statement
    > simply by adding a semicolon at the end, but the reverse is not
    > true. This is because the grammar (C89 or C99, either one) has
    > various additional things recognized as "statement" that, even if
    > they end with a semicolon, are not recognized as an "expression"
    > without that semicolon. For instance, a while loop:
    >
    > while (expr) statement;
    >
    > is itself a statement (specifically, an "iteration-statement"),
    > but removing the semicolon does not turn it into an expression.
    >
    > The C99 grammar includes the following fragments:
    >
    > statement:
    > labeled-statement
    > compound-statement
    > expression-statement
    > selection-statement
    > iteration-statement
    > jump-statement
    >
    > expression-statement:
    > expression-opt ;
    >
    > This last (the expression-statement part of the grammar) is why
    > any expression can be turned into a statement.


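    int main(int argc, char *argv[])
    {
        int a = 0;   /* initialized so that reading it is well defined */
        a+1;         /* expression statement: the value is discarded */
        return(0);
    }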

    Legal program. Doesn't do much though. And your compiler may emit a warning
    message. And it shows that an expression can be turned into a statement by
    putting a semicolon after it.

    As Chris points out you can't take a semicolon off a statement and always get an
    expression. An expression has a value.

    However "goto mess;" is a statement. You can't write "x = goto mess;" because
    "goto mess" isn't an expression. It doesn't have a value.
    Golden California Girls, Dec 30, 2007
    #12
  13. dspfun

    dspfun Guest


    > In C, expressions produce values (with one possible exception:
    > expressions of type "void" produce no value, or produce "a value
    > of type void", depending on who you ask; even the C Standard appears
    > to be a bit confused on this issue :) ).  However, expressions
    > also have "side effects".  (A "side effect" is, loosely speaking,
    > a change in a variable.   Things like printing output are also
    > "side effects" in computing theory, although in C this is simply
    > done with function calls, e.g., printf().  Side effects are quite
    > important in computing theory because operations *without* side
    > effects are always completely reversible.  This means that "debugging"
    > is, at its heart, simply the process of tracking all side effects
    > -- all other operations can be trivially backed-up-over.)
    >
    > The various modifier operators, including the prefix and postfix
    > increment and decrement, have TWO uses: they (a) produce a value,
    > and (b) have a side effect.  Sometimes, in programming in C, we
    > want a value; sometimes we want a side effect; sometimes we even
    > want both.  We can use these modifier operators for their side
    > effects, or for both their values *and* their side effects.


    Thank you Chris and others for great answers!

    Because of the *double effect* of modifier operators, is it a good
    idea to always convert expressions to void expressions when the
    value is not used but only the side effect is used? This way the
    discarding of the value is made explicit.

    For example:
    (void) a++

    Instead of:
    a++
    dspfun, Dec 30, 2007
    #13
  14. dspfun

    James Kuyper Guest

    dspfun wrote:
    > Because of the *double effect* of modifier operators, is it a good
    > idea to always convert expressions to void expressions when the
    > value is not used but only the side effect is used? This way the
    > discarding of the value is made explicit.
    >
    > For example:
    > (void) a++
    >
    > Instead of:
    > a++


    No, because the value of the expression in an expression-statement is
    always discarded, so you'd be putting (void) at the start of every
    expression-statement. You should consider that the discarding is
    implicit in the ';' at the end of the statement, and therefore doesn't
    require a (void) at the beginning.
    James Kuyper, Dec 30, 2007
    #14
  15. dspfun

    Army1987 Guest

    dspfun wrote:

    > Because of the *double effect* of modifier operators, is it a good
    > idea to always convert expressions to void expressions when the
    > value is not used but only the side effect is used? This way the
    > discarding of the value is made explicit.
    >
    > For example:
    > (void) a++
    >
    > Instead of:
    > a++


    It's a matter of style. I have even seen a program with
    #define V (void)
    and many instances of expressions (especially function calls) whose value
    was discarded were written as
    V printf("foo");
    The only thing that achieves is silencing lint and similar programs. I
    don't usually use (void), except when one would naturally think of an
    expression as "throwing away" something, e.g. (void)getchar(); throws
    away a character, or (void)rand(); (should I ever use it, I haven't so far)
    throws away a number from a pseudorandom sequence. On the other hand,
    fprintf(stderr, "Cannot open '%s' for reading: %s\n", argv[1],
    strerror(errno));
    simply prints an error message, and the fact that it does return a value
    which is discarded is somewhat irrelevant. So in this case I spare the
    (void).

    --
    Army1987 (Replace "NOSPAM" with "email")
    Army1987, Dec 30, 2007
    #15
  16. dspfun

    somenath Guest

    On Dec 30 2007, 3:48 am, Chris Torek <> wrote:
    > In article <>
    >
    > dspfun <> wrote:
    > >The words "expression" and "statement" are often used in C99 and C-
    > >textbooks, however, I am not sure of the clear defintion of these
    > >words with respect to C.

    >
    > Others have gone through a lot of examples and given various
    > corrections. I would just like to emphasize a few details.
    >
    > >Can somebody provide a sharp defintion of "expression" and
    > >"statement"? What is the difference between an expression and a
    > >statement?

    >
    > As at least one person noted, the real heart of the difference is
    > actually syntactic. An "expression" is that which is permitted
    > syntactically by the grammar in the C Standard (whichever standard
    > you use -- C89 or C99).
    >
    > In any case, *every* C expression can be turned into a statement
    > simply by adding a semicolon at the end, but the reverse is not
    > true. This is because the grammar (C89 or C99, either one) has
    > various additional things recognized as "statement" that, even if
    > they end with a semicolon, are not recognized as an "expression"
    > without that semicolon. For instance, a while loop:
    >
    > while (expr) statement;
    >
    > is itself a statement (specifically, an "iteration-statement"),
    > but removing the semicolon does not turn it into an expression.
    >
    > The C99 grammar includes the following fragments:
    >
    > statement:
    > labeled-statement
    > compound-statement
    > expression-statement
    > selection-statement
    > iteration-statement
    > jump-statement
    >
    > expression-statement:
    > expression-opt ;
    >
    > This last (the expression-statement part of the grammar) is why
    > any expression can be turned into a statement.
    >
    > The fact that a while loop (like a do-while or for loop) is recognized
    > only by the "iteration-statement" part of the grammar is why it
    > does not become a statement when removing the semicolon.
    >
    > Last, although this is not relevant to the distinction between
    > "expression" and "statement": There is a key item here that I
    > think many people miss here as well:
    >
    > > 1++ //Type: int. Value: 2, right?
    > > a++ //Type: Depends on a. Value: One more than a.

    >
    > (As others noted, "1++" is a constraint violation and thus requires
    > a diagnostic. "a++" is OK -- that is, is not a constraint violation
    > as long as "a" is a "modifiable lvalue". It may have undefined
    > behavior, e.g., if a is an "int" variable and is initially set to
    > INT_MAX, but no diagnostic is required for this, and programmers
    > should not expect one. The value is not "one more than a", but
    > rather, "the value a had before the increment took place".)
    >
    > In C, expressions produce values (with one possible exception:
    > expressions of type "void" produce no value, or produce "a value
    > of type void", depending on who you ask; even the C Standard appears
    > to be a bit confused on this issue :) ). However, expressions
    > also have "side effects". (A "side effect" is, loosely speaking,
    > a change in a variable. Things like printing output are also
    > "side effects" in computing theory, although in C this is simply
    > done with function calls, e.g., printf(). Side effects are quite
    > important in computing theory because operations *without* side
    > effects are always completely reversible. This means that "debugging"
    > is, at its heart, simply the process of tracking all side effects
    > -- all other operations can be trivially backed-up-over.)
    >
    > The various modifier operators, including the prefix and postfix
    > increment and decrement, have TWO uses: they (a) produce a value,
    > and (b) have a side effect. Sometimes, in programming in C, we
    > want a value; sometimes we want a side effect; sometimes we even
    > want both. We can use these modifier operators for their side
    > effects, or for both their values *and* their side effects. For
    > instance, in a loop like:
    >
    > for (i = 0; i < N; i++)
    >
    > we have two modifier-operators: initally we set i to 0, and each
    > time at the end of the loop, we increment i. Here, the "=" operator
    > is used purely for its side effect: it sets i to 0. The value of
    > the entire operation is 0, but this value is discarded. Similarly,
    > the "++" operator produces a value -- in this case, the previous
    > value of i -- but we throw that value away, as the only thing we
    > want is the side effect, of increasing i by 1.
    >
    > Because we only want the side effect, we could use any other operation
    > that *also* increases i by 1:
    >
    > for (i = 0; i < N; ++i)
    >
    > and:
    >
    > for (i = 0; i < N; i = i + 1)
    >
    > are all equally valid ways to write the loop.
    >
    > Examples of places where we want *both* the value *and* the side
    > effect are not quite as common, but do occur. For instance, if p
    > points into a string that contains some 'x' characters, and *p is
    > currently one of the 'x' characters, the following line skips over
    > that x and any subsequent 'x', so that *p will be whatever character
    > comes after the "x"s. E.g., if p points into "hexxllo world", *p
    > will be 'l' after the loop ends; if it points into "magix", *p will
    > be '\0':
    >
    > while (*p++ == 'x')
    > continue;
    >
    > Here, the "++" operator is used both for the value it produces --
    > i.e., "give me the value p had before an increment occurs" -- and
    > for its side effect -- i.e., "and also please increment p before
    > the next sequence point". (The old value of p is then given to
    > the unary "*" operator, which fetches the character to which p
    > pointed before the increment happened. The compiler is free to
    > arrange for p to be incremented first or last or anywhere in between,
    > as long as it manages to fetch *(whatever_p_used_to_be). On some
    > machines, it may make sense to increment p first, then fetch p[-1];
    > on some, it may make sense to increment p last; on some, it may be
    > possible to increment p while simultaneously fetching, e.g., using
    > the auto-increment addressing mode on a PDP-11, or the writeback
    > feature of the ARM.)
    >
    > Something some C programmers do, but I claim is dodgy at best, is
    > use modifier operators purely for their value. For instance,
    > consider the following rather silly function, and an example of
    > its use:
    >
    > int three_more(int x) {
    > return x += 3;
    > }
    >
    > #include <stdio.h>
    >
    > int main(void) {
    > printf("%d\n", three_more(39));
    > return 0;
    > }
    >
    > which prints 42. The three_more() function uses the "+=" operator
    > to modify x (a side effect) *and* produce a value (the value x will
    > have after the increment-by-3), but -- by returning, in this case
    > returning the value-after-increment -- immediately throws away the
    > incremented variable "x". This is valid, "legal" C code, but to
    > me it "makes more sense" to write:
    >
    > int better_three_more(int x) {
    > return x + 3;
    > }
    >
    > For some reason, beginning C programmers often seem to be fascinated
    > by the "double effect" of modifier operators -- especially the
    > prefix and postfix increment and decrement operators -- that have
    > both a side effect *and* a value, and wind up "overusing" them (as
    > in three_more() above). This seems to lead to the desire to write
    > things like "1++" or "++41", which are not only pointless (a la
    > the modification to x in three_more()), but invalid (draw a
    > diagnostic, and usually fail to compile at all).



    Could you explain why you consider the second function better?
    Let me explain why I ask. I was reading a C textbook that is
    well known in our country, and it says the following.


    "These instructions increase directly specify the required information
    so help in faster execution. 'C' makes
    efficient use of this feature by providing compound statements for
    which translation can be done directly to
    its corresponding machine instruction. For example:
    a=a+10;
    may be converted to,
    MOV AX,_a
    ADD 10
    MOV _a, AX
    Whereas a+=10; may be converted directly to,
    INC _a, 10
    in some machine."

    So according to this logic the first function, "int three_more(int x)",
    may be faster than "int better_three_more(int x)". Is that not
    correct?
    somenath, Jan 2, 2008
    #16
  17. somenath said:

    <snip>

    > So according to this


    (broken)

    > logic first function "int three_more(int x)"
    > may be faster then the "int better_three_more(int x)". Is it not
    > correct ?


    The formal answer is that the C Standard doesn't say either way.

    In practice:

    (a) the difference is likely to be minimal and not worth chasing;
    (b) if either one is going to be faster, it is more likely to be the one
    that doesn't pointlessly update an object that's about to be destroyed;
    (c) a good compiler will in any case optimise any difference away;
    (d) you should aim for clear code as a primary goal - write code that best
    expresses your algorithmic intent, rather than the code you think will run
    fastest, unless to do so would be grossly inefficient (e.g. recursive Fib,
    strlen in a loop condition, etc).

    --
    Richard Heathfield <http://www.cpax.org.uk>
    Email: -http://www. +rjh@
    Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
    "Usenet is a strange place" - dmr 29 July 1999
    Richard Heathfield, Jan 2, 2008
    #17
  18. dspfun

    Chris Torek Guest

    >On Dec 30 2007, 3:48 am, Chris Torek <> wrote:
    [mass snippage]
    >> int three_more(int x) {
    >> return x += 3;
    >> }
    >> int better_three_more(int x) {
    >> return x + 3;
    >> }

    [and that I think that "better_three_more" is a better-written
    function].

    In article <>,
    somenath <> wrote:
    >I would like to request you to explain why you are indicating second
    >function as better.
    >I would like to clarify my self why I requested so. I was reading one
    >C text book which is famous in our country it says as mentioned.
    >
    >"These instructions increase directly specify the required information
    >so help in faster execution. 'C' makes
    >efficient use of this feature by providing compound statements for
    >which translation can be done directly to
    >its corresponding machine instruction. For example:
    >140
    >a=a+10;
    >may be converted to,
    >MOV AX,_a
    >ADD 10
    >MOV _a, AX
    >Whereas a+=10; may be converted directly to,
    >INC _a, 10
    >in some machine."
    >
    >So according to this logic first function "int three_more(int x)"
    >may be faster then the "int better_three_more(int x)". Is it not
    >correct ?


    Well, putting aside the fact that there are no guarantees about
    what comes out of any given compiler, except that -- if it is a
    correct C compiler at all -- it must implement those things the C
    Standard requires ... there are, in essence, two "kinds" of compilers
    here: the "stupid", directly-literal kind, and the "smart" or
    "optimizing" compiler.

    The claims above are true only of the "stupid" compiler. It uses
    the syntax, rather than [%] the semantics, to pick out instructions.
    Since "a = a + 10" syntactically says "get me a, get me 10, do an
    add, and store that as the result", a stupid compiler does exactly
    that, in exactly that order. Since "a += 10" syntactically says
    "get me 10, add that to a", the stupid compiler can use an "add 10"
    instruction, if one exists.

    [% Actually "in addition to" -- it uses semantics attached to things
    like types of variables and constants in order to choose between
    "integer add" and "floating point add", for instance. But the first
    part of selection is syntax-driven.]

    (Note, the above *also* assumes that the machine *has* an "add 10"
    instruction. On a load/store machine, the line "a += 10" has to be
    compiled into:

    load a
    add #10
    store a

    anyway, which is the same thing the stupid compiler produces for the
    "a = a + 10" line. So on some machines, even the stupid compiler gets
    no benefit from the more-syntactically-compact "a += 10" line.)

    A "smart" compiler, by contrast, reads entire blocks of code --
    possibly as large as entire functions, source-files, or programs
    -- and does a lot of work to figure out the "best" machine code to
    implement that. A smart compiler will generally produce the same
    machine code for either line (in this case, because "alias analysis"
    is terrifically easy for ordinary variables, and the compiler can
    see that "a = a + 10" and "a += 10" have exactly the same required
    semantics). In other words, a "smart" compiler gets no help from
    the += operator.

    But let us look at the whole thing in context: we do not have a
    simple "a += 10" (or in this case x += 3), but rather:

    return x += 3;

    So in this case, in the "stupid" compiler (on the machine for which
    we got the "INC _x, 3" above), we have to:

    - add 3 to the local variable x
    - put that value into the return register
    - tear down the stack frame
    - return to caller

    which we do thus:

        INC [BP-8], 3      # x += 3
        MOV AX, [BP-8]     # return reg = x
        LEAVE              # remove stack frame
        RET

    Now compare that to the code the stupid compiler emits for "return
    x + 3":

        MOV AX, [BP-8]
        ADD 3
        LEAVE
        RET

    Although this is still four instructions, it actually runs in
    fewer clock cycles on the old versions of the CPU for which the
    stupid compiler was written. That is, even with the "stupid"
    compiler, the code is either faster, or no slower.

    Most compilers these days are reasonably smart, at least when run
    with optimization turned on. There *are* reasons to use "stupid"
    compilers, or the non-optimizing mode in an otherwise smart compiler:
    compilations run faster, for instance, and it is very difficult to
    trigger bugs in code that is not there, or not being used. :)
    Still, for most purposes, one mostly wants to turn optimization
    on. (For instance, in gcc, a number of very useful warnings are
    only enabled when optimization is on -- because it is the process
    of optimization itself that finds the bugs that the warnings point
    out).
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Jan 2, 2008
    #18
  19. dspfun

    somenath Guest

    On Jan 2, 12:53 pm, Chris Torek <> wrote:
    > >On Dec 30 2007, 3:48 am, Chris Torek <> wrote:

    > [mass snippage]
    > >>     int three_more(int x) {
    > >>         return x += 3;
    > >>     }
    > >>     int better_three_more(int x) {
    > >>         return x + 3;
    > >>     }

    >
    > [and that I think that "better_three_more" is a better-written
    > function].
    >
    > In article <>,
    >
    >
    >
    >
    >
    > somenath  <> wrote:
    > >I would like to request you to explain why you are indicating second
    > >function as better.
    > >I would like to clarify my self why I requested so. I was reading one
    > >C text book which is famous in our country it says as mentioned.

    >
    > >"These instructions increase directly specify the required information
    > >so help in faster execution. 'C' makes
    > >efficient use of this feature by providing compound statements for
    > >which translation can be done directly to
    > >its corresponding machine instruction. For example:
    > >140
    > >a=a+10;
    > >may be converted to,
    > >MOV AX,_a
    > >ADD 10
    > >MOV _a, AX
    > >Whereas a+=10; may be converted directly to,
    > >INC _a, 10
    > >in some machine."

    >
    > >So according to this logic the first function, "int three_more(int x)",
    > >may be faster than "int better_three_more(int x)". Is that not
    > >correct?

    >
    > Well, putting aside the fact that there are no guarantees about
    > what comes out of any given compiler, except that -- if it is a
    > correct C compiler at all -- it must implement those things the C
    > Standard requires ... there are, in essence, two "kinds" of compilers
    > here: the "stupid", directly-literal kind, and the "smart" or
    > "optimizing" compiler.
    >
    > The claims above are true only of the "stupid" compiler.  It uses
    > the syntax, rather than [%] the semantics, to pick out instructions.
    > Since "a = a + 10" syntactically says "get me a, get me 10, do an
    > add, and store that as the result", a stupid compiler does exactly
    > that, in exactly that order.  Since "a += 10" syntactically says
    > "get me 10, add that to a", the stupid compiler can use an "add 10"
    > instruction, if one exists.
    >
    > [% Actually "in addition to" -- it uses semantics attached to things
    > like types of variables and constants in order to choose between
    > "integer add" and "floating point add", for instance.  But the first
    > part of selection is syntax-driven.]
    >
    > (Note, the above *also* assumes that the machine *has* an "add 10"
    > instruction.  On a load/store machine, the line "a += 10" has to be
    > compiled into:
    >
    >     load a
    >     add #10
    >     store a
    >
    > anyway, which is the same thing the stupid compiler produces for the
    > "a = a + 10" line.  So on some machines, even the stupid compiler gets
    > no benefit from the more-syntactically-compact "a += 10" line.)
    >
    > A "smart" compiler, by contrast, reads entire blocks of code --
    > possibly as large as entire functions, source-files, or programs
    > -- and does a lot of work to figure out the "best" machine code to
    > implement that.  A smart compiler will generally produce the same
    > machine code for either line (in this case, because "alias analysis"
    > is terrifically easy for ordinary variables, and the compiler can
    > see that "a = a + 10" and "a += 10" have exactly the same required
    > semantics).  In other words, a "smart" compiler gets no help from
    > the += operator.
    >
    > But let us look at the whole thing in context: we do not have a
    > simple "a += 10" (or in this case x += 3), but rather:
    >
    >     return x += 3;
    >
    > So in this case, in the "stupid" compiler (on the machine for which
    > we got the "INC _x, 3" above), we have to:
    >
    >   - add 3 to the local variable x
    >   - put that value into the return register
    >   - tear down the stack frame
    >   - return to caller
    >
    > which we do thus:
    >
    >     INC [BP-8], 3   # x += 3
    >     MOV AX, [BP-8]  # return reg = x
    >     LEAVE           # remove stack frame
    >     RET
    >
    > Now compare that to the code the stupid compiler emits for "return
    > x + 3":
    >
    >     MOV AX, [BP-8]
    >     ADD 3
    >     LEAVE
    >     RET
    >
    > Although this is still four instructions, it actually runs in
    > fewer clock cycles on the old versions of the CPU for which the
    > stupid compiler was written.  That is, even with the "stupid"
    > compiler, the code is either faster, or no slower.
    >
    > Most compilers these days are reasonably smart, at least when run
    > with optimization turned on.  There *are* reasons to use "stupid"
    > compilers, or the non-optimizing mode in an otherwise smart compiler:
    > compilations run faster, for instance, and it is very difficult to
    > trigger bugs in code that is not there, or not being used. :)
    > Still, for most purposes, one mostly wants to turn optimization
    > on.  (For instance, in gcc, a number of very useful warnings are
    > only enabled when optimization is on -- because it is the process
    > of optimization itself that finds the bugs that the warnings point
    > out).


    Many thanks. From the above article I understood that
    1) We should not use compound assignment expressions (i.e. +=, -=,
    etc.) just to get faster code, because the optimizer emits suitable
    code for fast execution anyway.

    So where is the real use of this kind of expression?
    Should we use it only when the expression is long?
    I.e., does x += 10; come into the picture only when x is complex
    to type?
    somenath, Jan 2, 2008
    #19
  20. dspfun

    Chris Torek Guest

    In article <>
    somenath <> wrote:
    [regarding operators like *= or ++]
    >... So where is the real use of this kind of expression?


    Use them when they make the code clearer to human readers, mainly.

    >Should we use it only when the expression is long?
    >I.e., does x += 10; come into the picture only when x is complex
    >to type?


    If the "true purpose" of the computation is to add 10 to x, use x += 10.
    If the "true purpose" is simply to calculate 10 more than x, use x + 10.
    For instance:

    result = (x += 10);
    x = 0; /* don't want the 10-greater x anymore */

    is "misusing" the += operator, but:

    x += 10;
    ... use the augmented x for a while ...
    x++;
    ... use the augmented x some more ...

    is "properly using" the += and ++ operators, because the "..."
    sections of the code "want" the incremented variable.
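
    A slightly fuller sketch of the same distinction follows. The
    names ten_more and advance_cursor are made up for this example
    and do not come from any real code.

```c
/* Hypothetical: the caller only wants the value "10 more than x". */
int ten_more(int x)
{
    return x + 10;          /* nothing needs to be modified */
}

/* Hypothetical: later code wants the variable itself to have moved on,
 * so updating it in place with += and ++ says what is meant. */
void advance_cursor(int *pos)
{
    *pos += 10;             /* skip a 10-unit header */
    (*pos)++;               /* then step past one separator */
}
```

    After ten_more(x) the caller's x is unchanged; after
    advance_cursor(&x) it is 11 larger, which is the whole point of
    calling it.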

    This rule relies on deciding the "true purpose" of code, which is
    quite a difficult thing to do -- a human reader, or a computer,
    can follow the rules of the language to figure out what happens,
    step-by-step, but *why* it happens might be a mystery. (This is,
    in part, what comments are for -- the human writing the code can
    explain, in a comment, *why* the code is taking some series of
    steps.)

    This separation of "how" (the various individual steps needed) from
    "why" (the ultimate goal of any particular series of steps) is the
    heart of abstraction, which in turn is the real essence of computer
    programming. One immediate goal might be "tie shoelaces" and the
    steps involve manipulating finger-like appendages. Backing up
    another step reveals a higher-level goal, "put on shoes", of which
    "tie shoelaces" is simply one step. We must go up another level,
    though, in order to find out that "put on shoes" is just a step in
    the goal of going outside, which in turn is just a step in the goal
    of going to the market, and so on.

    In the same way, it is obvious enough what "x += 10" does -- it
    adds 10 to x and produces, as its value, the x+10 sum -- but we
    need to move up a level in order to see *why* someone is adding 10
    to x. If the reason is, or at least includes, "we need x to be 10
    bigger", then the += operator is quite appropriate.
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Jan 2, 2008
    #20