"i = i|0"

Discussion in 'C Programming' started by Stefan Ram, Jun 11, 2014.

  1. Stefan Ram

    Stefan Ram Guest

    Newsgroups: comp.lang.c,comp.lang.javascript

    I have read this in a World-Wide Web encyclopedia:

    For example, given the following C code:

    int f(int i) {
        return i + 1;
    }

    Emscripten would output the following JS code:

    function f(i) {
        i = i|0;
        return (i + 1)|0;
    }

    Do you think that the »|0« is necessary to express the
    C semantics in JavaScript, or could the speed of the
    generated code be improved by omitting it?

     
    Stefan Ram, Jun 11, 2014
    #1

  2. I don't have much knowledge of the underlying system.
    But my guess is that since Javascript uses a weakly typed
    system, "i +1" can mean several things depending on the
    type of i. i|0 probably forces it to an integer, and the
    converter just adds that to every integer expression, whether
    strictly required or not.
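    To make the guess above concrete, here is a minimal sketch of what `|0` does to a few inputs. The helper name `coerce` is made up for illustration; the behaviour shown is the standard ToInt32 conversion applied by the bitwise OR operator.

```javascript
// Hypothetical helper: applies the same coercion that Emscripten's output uses.
function coerce(x) {
  return x | 0;
}

// Values chosen to show truncation toward zero and the fallback to 0.
const samples = [3, 3.7, -3.7, "5", "abc", undefined, NaN];
for (const input of samples) {
  console.log(String(input), "->", coerce(input));
}
```

    Integers pass through unchanged, fractions are truncated toward zero, numeric strings are parsed, and anything that converts to NaN becomes 0.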
     
    Malcolm McLean, Jun 11, 2014
    #2

  3. Stefan Ram

    Stefan Ram Guest

    Newsgroups: comp.lang.c,comp.lang.javascript

    Yes, but if it was compiled from a correct C program, then
    every call within that program should already have an
    integer as its argument - and authors of other clients in
    JavaScript can be instructed to use only integer arguments.

    If they then call the function with non-integer arguments,
    it's their fault. (Or they can write a |0 wrapper instead
    of requiring each and every function to do |0 and thereby
    possibly slowing down execution and enlarging code size.)

     
    Stefan Ram, Jun 11, 2014
    #3
  4. Ike Naar

    Ike Naar Guest

    With a decent optimizing compiler, do you think that |0 would
    slow down execution or enlarge code size?
     
    Ike Naar, Jun 11, 2014
    #4
  5. James Kuyper

    James Kuyper Guest

    I'm no JS expert, but so far I haven't seen any responses from those
    who are, so I'll throw in my guesses.

    It seems reasonable to me that it would - in javascript, |0 doesn't just
    cause the integer value to be unchanged (something a smart compiler
    could drop) - it also (and first) causes conversion to a 32-bit integer
    type, if necessary. The compiler can't be expected to know that such a
    conversion will not, in fact, be necessary. This is a substantial
    difference from the original C code, where such conversions would occur,
    if necessary, in the calling code, where the compiler can be sure. A JS
    compiler would have to generate code to check the type of the argument,
    and code to perform the conversion. The checking part, at least, will
    have to be executed even though the conversion code will not.
     
    James Kuyper, Jun 11, 2014
    #5
  6. At least the C semantics are better matched; consider arbitrary input to
    the function, or an overflow occurring in the calculation.
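    A small sketch of the overflow case, assuming a 32-bit `int` on the C side. Signed overflow is undefined behaviour in C, so the wraparound shown here is only what typical two's-complement targets produce, not something the C standard promises. The function names are made up for the comparison.

```javascript
// Emscripten-style version with both coercions.
function fWithCoercion(i) {
  i = i | 0;
  return (i + 1) | 0;
}

// Same function with the |0 annotations removed.
function fPlain(i) {
  return i + 1;
}

const INT_MAX = 2147483647; // largest 32-bit signed integer

console.log(fWithCoercion(INT_MAX)); // wraps around to the most negative int32
console.log(fPlain(INT_MAX));        // keeps growing as a double
```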

    I'm not sure about the performance. Actually, this idiom is quite
    common, so it might cause performance improvements in some ECMAScript
    implementations. At least it is used in asm.js to enforce the type of i
    and the return value to be an integer type.
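    For reference, a minimal sketch of how the function might look inside an asm.js-style module. The module name `AsmModule` and the way it is instantiated are assumptions for illustration; an engine without asm.js support simply runs this as ordinary JavaScript with the same results.

```javascript
// Hypothetical module wrapper in the asm.js style.
function AsmModule(stdlib, foreign, heap) {
  "use asm";

  // "i = i | 0" declares the parameter as int;
  // "(...) | 0" on the return declares the return type as signed int.
  function f(i) {
    i = i | 0;
    return (i + 1) | 0;
  }

  return { f: f };
}

// The heap is unused here; 64 KiB is an arbitrary choice for the sketch.
const mod = AsmModule(globalThis, {}, new ArrayBuffer(0x10000));
console.log(mod.f(41));
```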

    [xpost & fup2 comp.lang.javascript]
     
    Christoph Michael Becker, Jun 11, 2014
    #6
  7. [F'up2 comp.lang.javascript]

    First of all, this does _not_ express the C semantics in JavaScript, and
    there is no other way. “int” is a *generic* type in C/C++; IIUC, the result
    could be a 32-bit integer when compiled for a 32-bit platform or a 64-bit
    integer when compiled for a 64-bit platform.

    <http://en.wikibooks.org/wiki/C_Programming/Reference_Tables#Table_of_Data_Types>
    <http://stackoverflow.com/questions/11438794/is-the-size-of-c-int-2-bytes-or-4-bytes>

    By contrast, using the binary bitwise OR operator, as with all ECMAScript
    binary bitwise operators, *always* creates an IEEE-754 double-precision
    *floating-point* value representing a *32-bit* integer value. Also, not
    only the result will be such a value, but also the operands are converted
    to such values internally, before the operation is performed.

    <http://ecma-international.org/ecma-262/5.1/#sec-11.10>
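    The conversion described above can be sketched in plain JavaScript. This is an illustrative re-implementation of the ToInt32 steps for ordinary values, not code from any engine; the function name `toInt32` is an assumption, and `Math.trunc` and `Number.isFinite` require ES2015.

```javascript
// Sketch of the ToInt32 steps: convert to Number, drop the fraction,
// wrap modulo 2^32, then map the upper half onto negative values.
function toInt32(value) {
  const number = Number(value);
  if (!Number.isFinite(number)) {
    return 0; // NaN and +/-Infinity map to 0
  }
  const TWO_32 = 4294967296;
  const truncated = Math.trunc(number);
  // Double modulo so the intermediate result is never negative.
  const wrapped = ((truncated % TWO_32) + TWO_32) % TWO_32;
  return wrapped >= 2147483648 ? wrapped - TWO_32 : wrapped;
}

// The result is still an ordinary Number (a double) whose value fits in 32 bits.
console.log(typeof toInt32(4294967295), toInt32(4294967295));
```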


    Conversion to an integer value of the ECMAScript Number type (i.e., where
    the fractional part of the mantissa is zero), which appears to be the goal
    here, can be better achieved with the Math.floor() and Math.ceil()
    functions, e.g.:

    if (typeof Number.prototype.toInteger == "undefined")
    {
      /* Truncate toward zero: ceil for negative values, floor otherwise */
      Number.prototype.toInteger = function () {
        return (this < 0 ? Math.ceil(this) : Math.floor(this));
      };
    }

    /**
     * Frobnicates this value
     *
     * @param {int} i
     *   The value to be frobnicated
     * @return {int}
     *   The frobnicated value
     */
    function f (i)
    {
      return (+i).toInteger() + 42;
    }
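    The two approaches differ once a value leaves the 32-bit range. A short sketch, using a standalone helper (the name `truncateTowardZero` is made up) instead of extending `Number.prototype`:

```javascript
// Same logic as the toInteger() method above, as a plain function.
function truncateTowardZero(x) {
  return x < 0 ? Math.ceil(x) : Math.floor(x);
}

const big = Math.pow(2, 32) + 1.5; // just outside the 32-bit range

console.log(truncateTowardZero(big)); // fraction dropped, magnitude kept
console.log(big | 0);                 // fraction dropped, then wrapped modulo 2^32
console.log(truncateTowardZero(NaN), NaN | 0); // NaN stays NaN vs. becomes 0
```

    Which behaviour is "closer to C" depends on what the C program would have done with the out-of-range value, so neither is a drop-in substitute for the other.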

    [JSX:array.js features another converter, for
    jsx.array.BigArray.prototype.slice() & friends¹, that for practical reasons
    more closely matches the Specification (ToInt32), but does not convert to
    32-bit floating-point integer.]

    Of course, this still does not remotely implement the C semantics. One
    aspect of it is that code where you pass a non-integer would not compile.
    Since ECMAScript uses dynamic type-checking, it is not possible to prevent
    compilation. But at the very least passing unsuitable values should cause
    an exception to be thrown, so that it becomes unnecessary to handle them,
    e.g.:

    function f (i)
    {
      if (i % 1 != 0)
      {
        /* JSX:object.js provides jsx.InvalidArgumentError instead */
        throw new TypeError('f: Invalid argument for "i": ' + i + ':'
          + typeof i + '[' + _getClass(i) + ']'
          + (i != null ? ' by ' + (_getFunctionName(i.constructor) || '?')
                       : '')
          + '; expected integer');
      }

      return i + 42;
    }
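    One caveat with a check of this form: `%` converts its operand to a number first, so a numeric string passes the test and `+` then concatenates. A hedged sketch of the difference, with made-up names `looseIsInteger` and `strictIsInteger`:

```javascript
// Mirrors the condition used above (inverted): true means "accepted".
function looseIsInteger(i) {
  return i % 1 == 0;
}

// Assumed stricter variant: also requires the primitive number type.
function strictIsInteger(i) {
  return typeof i === "number" && i % 1 === 0;
}

console.log(looseIsInteger("5"), "5" + 42);  // accepted, but "+" concatenates
console.log(strictIsInteger("5"));           // rejected
console.log(strictIsInteger(5), strictIsInteger(5.5), strictIsInteger(Infinity));
```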

    _________
    ¹ supports arrays with up to 2⁵³−1 numerically indexed elements²
    ² Because 2⁵³+1 is indistinguishable from 2⁵³ due to precision limits,
    so that element overflow could not be detected, I had to reduce
    jsx.array.BigArray.MAX_LENGTH to Math.pow(2, 53) - 1 recently.
    (This is also the reason why standard Array instances can hold only
    up to 2³²_−1_ elements, so that the largest possible index is
    Math.pow(2, 32) _- 2_.)
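    The precision limit mentioned in the footnote is easy to observe. A minimal sketch (the variable names are illustrative; `Number.MAX_SAFE_INTEGER` requires ES2015):

```javascript
const limit = Math.pow(2, 53);

// Above 2^53 a double can no longer represent every integer,
// so adding 1 rounds back to the same value.
const indistinguishable = (limit + 1 === limit);
console.log(indistinguishable);
console.log(limit - 1 === Number.MAX_SAFE_INTEGER);

// Standard arrays reject a length of 2^32 with a RangeError.
let rejected = false;
try {
  new Array(Math.pow(2, 32));
} catch (e) {
  rejected = e instanceof RangeError;
}
console.log(rejected);
```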
     
    Thomas 'PointedEars' Lahn, Jun 11, 2014
    #7
  8. [F'up2 comp.lang.javascript]

    That much is obvious.
    Usenet is not a real-time communications medium. Besides, what one sees may
    very well be different from what someone else sees, due to different feeds
    (different speeds and Paths), scorefiles, killfiles, and so on.
    The reference material is freely available; there is no need to guess. If
    you are not sure, just refrain from posting. Nobody is being helped by wild
    guesses from people who have a smattering; you do not have to save the world
    alone.
    There is no javascript. [0]
    No, it causes conversion to a 64-bit floating-point value that represents a
    32-bit integer value.
     
    Thomas 'PointedEars' Lahn, Jun 11, 2014
    #8
  9. BGB

    BGB Guest

    no JS expert either...

    but, I had been working on writing a VM which can run C and supports JS
    as a target...


    but, yeah, "|0" if anything, makes code faster, by offering a useful
    hint (and being trivial to optimize away). besides this, it also helps
    enforce integer semantics.


    AFAIK:

    the numeric type in JS isn't really all that well-defined (in terms of
    how it is implemented), but is generally implemented, essentially, as a
    64-bit double (nevermind if implementations may use integer-types
    internally when they can get away with it, but this is mostly invisible
    at the JS level).

    "|0" basically then effectively means (besides "OR with 0") "truncate
    value range to that of a 32-bit integer", but may indirectly serve as a
    hint to the JS engine that it may safely use a 32-bit integer
    representation internally (it may, but need not necessarily, imply a
    type conversion, depending mostly on the "phase of the moon" and "the
    current feelings of the JS engine at this particular moment in time",
    and need not involve a runtime check, say, if the JS-engine already
    knows that the caller calls this code with an integer value, ...).

    different JS engines generally implement numbers internally in different
    ways:
    NaN-tagged values (pretty much all numbers are double but NaN encodes
    special cases, such as object-pointers/references);
    tagged-reference types (where we have a value and a few tag bits
    indicate what it is, ex: pointer/integer/double/...);
    inferred basic types (untagged integer or double values, ...);
    ....


    or such...
     
    BGB, Jun 12, 2014
    #9
  10. Noob

    Noob Guest

    AFAIU, ILP64 platforms are rare, and SILP64 even more so.
    https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models
    FWIW, in C, operands of bitwise operators "shall have integer type"
    (otherwise it is a constraint violation).

    I'm not quite sure what the semantics of ORing two floating-point
    value should be (in a different language).

    Regards.
     
    Noob, Jun 12, 2014
    #10
  11. James Kuyper

    James Kuyper Guest

    With the javascript definition given by the OP, could f() be passed an
    argument which is not of numeric type, but which can be converted? Your
    argument seems to suggest that the compiler need not worry about
    performing such a conversion.
     
    James Kuyper, Jun 12, 2014
    #11
  12. James Kuyper

    James Kuyper Guest

    Yes, I know all of that. But I was getting impatient for someone to reply.
    My comments were not pure guesses. I searched for reference materials,
    found one (I didn't take notes, so I'm not sure which one) that was
    identified as an official standard (though possibly not the relevant
    one). It defined the behavior of bit-wise or in terms of calls to
    ToInt32() for each operand.
    That comment requires explanation. What you meant by it may be perfectly
    obvious to those who monitor comp.lang.javascript, but for this
    particular C expert, the very existence of comp.lang.javascript seems to
    contradict the most obvious meaning for that comment.

    The [0] seems to be intended as a cross-reference, but I couldn't locate it.
    If I'd bothered to look up what ToInt32() did, I would have noticed, but
    the name seemed perfectly clear, so I didn't bother. It would never have
    occurred to me that ToInt32() might have such behavior. From a C
    perspective, that's a mind-bogglingly inefficient way of doing things,
    though I suppose it makes sense from a javascript perspective (Sorry - I
    couldn't figure out how to write that sentence without referring to the
    thing you've said is non-existent).

    That doesn't change my main point: the conversion is still required - it
    can't be optimized away. Or am I wrong about that, too?
     
    James Kuyper, Jun 12, 2014
    #12
  14. [F'up2 comp.lang.javascript]

    James Kuyper wrote in comp.lang.c:
    It should.
    Yes, that only seems to be so. The newsgroup name, charter and tagline are
    both historic considering what is being discussed there now (and rightly
    so), and the newsgroup name is case-insensitive as newsgroup names usually
    go. (You would not talk about “c” either, would you?)
    It was in my signature; the URI still is.
    The only possible optimization of

    function f(i) {
        i = i|0;
        return (i + 1)|0;
    }

    in terms of runtime is

    function f(i) {
        return ((i|0) + 1)|0;
    }
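    That the two forms behave identically can be spot-checked with a few inputs. This is only a sanity check over hand-picked values, not a proof; the names `fTwoStep` and `fOneStep` are made up.

```javascript
// Original Emscripten-style form.
function fTwoStep(i) {
  i = i | 0;
  return (i + 1) | 0;
}

// Condensed single-expression form.
function fOneStep(i) {
  return ((i | 0) + 1) | 0;
}

const inputs = [0, 1, -1, 2.9, 2147483647, -2147483648, "7", NaN, undefined];
const allEqual = inputs.every((x) => fTwoStep(x) === fOneStep(x));
console.log(allEqual);
```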

    (Source code optimization could either add pretty printing or strip almost
    all whitespace.)

    The “| 0” operation itself cannot be optimized away, but as I indicated it
    can be replaced for greater flexibility (and a closer-to-C/C++-int
    implementation).

    F'up2 had been set. Please stop cross-posting (without F'up2).
     
    Thomas 'PointedEars' Lahn, Jun 12, 2014
    #14
  15. James Kuyper

    James Kuyper Guest

    Noted, and ignored. Explanation in last paragraph.
    I've found it, but I don't think it was reasonable to have expected
    someone to be able to find it there, without more of a clue about where
    to look. I routinely ignore signatures, I suspect that this is
    commonplace. My newsreader, like many, displays signatures in ways
    designed to avoid drawing attention to them. Specifically, it displays
    them in light grey text on a white background.

    The corresponding link brings up a blank screen on my system, so I'm
    still not sure what the cross-reference was intended to convey. From
    your comments earlier, I assume it's something about the case. Is
    Javascript better? Or should it be JavaScript?
    The original question, and everything I've ever had to say about this
    thread, has always been about whether the translation of a particular
    example of C code to a particular bit of Javascript was correct. I don't
    understand why you consider it inappropriate to cross-post such a
    discussion to both a C oriented newsgroup and a Javascript oriented one.
    I can't imagine a better case for cross-posting.
     
    James Kuyper, Jun 12, 2014
    #15
  16. BGB

    BGB Guest

    JS is weird, and if it *is* passed an integer value, often any
    checks/conversions can be optimized away.

    internally, a given function might end up compiled into several
    different internal functions:
    one which takes an integer value;
    one which takes a double, and converts it;
    one which blows up (known invalid types);
    one which performs a run-time check;
    ...

    then, if the caller passes an integer, it will internally call the
    version which accepts an integer (and thus no checks/conversion needed),
    rather than the version which takes other types.

    so, for example:
    function f(i) {
        i = i|0;
        return (i + 1)|0;
    }

    could potentially end up as 3-5 different functions internally.

    ex, in a C equivalent:
    int f_i(int i) {
        return (i + 1);
    }
    int f_d(double i) {
        int t_i;
        t_i=(int)i;
        return (t_i + 1);
    }
    ...

    then:
    f(3);
    f(3.14159);
    might become effectively:
    f_i(3);
    f_d(3.14159);

    usually, the argument lists the callers make use of will indicate which
    version of a function are generated, so if a function is only ever
    called with a particular combination of argument types, only this
    version will be compiled for.
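    The specialization idea can be imitated by hand in JavaScript. This is only a toy model of what an engine might do internally from type feedback; the names `fInt`, `fDouble`, `fGeneric`, `isInt32` and `fDispatch` are all hypothetical.

```javascript
// Variant for arguments already known to be int32: no input conversion.
function fInt(i) {
  return (i + 1) | 0;
}

// Variant for doubles: truncate to int32 first.
function fDouble(d) {
  return ((d | 0) + 1) | 0;
}

// Fallback for everything else: full coercion of arbitrary values.
function fGeneric(v) {
  return ((v | 0) + 1) | 0;
}

function isInt32(v) {
  return typeof v === "number" && (v | 0) === v;
}

// Hand-written stand-in for the engine choosing a specialized version.
function fDispatch(v) {
  if (isInt32(v)) return fInt(v);
  if (typeof v === "number") return fDouble(v);
  return fGeneric(v);
}

console.log(fDispatch(3), fDispatch(3.14159), fDispatch("3"));
```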


    in less-trivial cases (those involving objects and closures, ...), lots
    of other wackiness may come up though.
     
    BGB, Jun 12, 2014
    #16
  17. BartC

    BartC Guest

    "c" by itself is ambiguous. "javascript" considerably less so; even without
    context, people will know what you were on about (ie. the language formerly
    known as Javascript).
    If that function is being called as a=f(b), then another optimisation might
    be:

    a = b+1;

    if the function source is visible at the call-site (especially if being
    converted from c source code).

    But, how important is that |0 anyway; what would happen if the function was
    just:

    function f(i) {
        return i+1;
    } ?

    (I assume that in this language, it is possible to write such a function,
    but would anyone actually bother with sticking |0 everywhere? If called with
    an invalid argument, it would still fail wouldn't it?)
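    A quick sketch of what the stripped-down function does with a few arguments. The name `fPlain` is made up; the point is that none of these calls throws.

```javascript
// The function without any |0 annotations.
function fPlain(i) {
  return i + 1;
}

const results = {
  integer: fPlain(1),        // ordinary addition
  fraction: fPlain(1.5),     // fraction is kept
  string: fPlain("1"),       // "+" concatenates
  missing: fPlain(undefined) // undefined converts to NaN
};
console.log(results);
```

    So an "invalid" argument does not fail; it silently produces a non-integer, a string, or NaN that propagates to the caller.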
     
    BartC, Jun 12, 2014
    #17
  18. The explanation is simple. Lahn resides under a bridge near you eating
    any goats he comes across.
     
    Denis McMahon, Jun 12, 2014
    #18
  19. Richard Bos

    Richard Bos Guest

    My guess is that he wants to be called an EcmaScript(tm) Developer.

    Richard
     
    Richard Bos, Jun 12, 2014
    #19
  20. [F'up2 comp.lang.javascript]

    BartC wrote in comp.lang.javascript, comp.lang.c:
    It's attribution *line*, _not_ attribution novel.
    Why have you ignored that? This discussion has nothing to do with C
    (anymore).
    So those “people” are adopting a new term for a non-existing language that
    is thought to be a successor to some other non-existing language?

    (You have no clue what you are talking about. Visit the ECMAScript Support
    Matrix website to get a minimum one.)
    No, as that would change the semantics considerably. I also think inlining
    was not what was being asked for here; it is too obvious.
    Then the value of the “i” parameter would not necessarily be a numeric value
    whose fractional part is zero, neither would be the return value.
    What is “this language”?

    Your questions have been answered before. Please read more carefully.
     
    Thomas 'PointedEars' Lahn, Jun 12, 2014
    #20