inca: a hacked-up apl in c based on the J incunabulum

Discussion in 'C Programming' started by luser- -droog, Mar 27, 2014.

  1. luser- -droog

    James Kuyper Guest

    Compact source code? The speed with which source code is processed
    seldom is an important factor. In any event, this style of code should,
    if anything, slow down the compiler, because of the additional work that
    needs to be done during translation phase 4 (preprocessing). The only
    thing that this speeds up is the typing of the code - and even that only
    applies if the unreadability of the code doesn't slow down the code
    development process enough to make up for the smaller source code files.
     
    James Kuyper, Mar 27, 2014
    #21
    1. Advertisements

  2. (snip, I wrote)
    Sometimes more compact code is faster, but not always.

    Personally, I like table-driven code, which tends to be compact,
    some may find it more readable, some less.

    I prefer a loop with a single IF statement in it to tens or
    hundreds of consecutive IF statements testing different values
    for a single, or small number, of variables. (Yes, I have seen
    that latter in actual production code.)

    Other than intentional obfuscation, exactly what someone
    finds more readable can be different for different people.

    -- glen
     
    glen herrmannsfeldt, Mar 27, 2014
    #22
    1. Advertisements

  3. (snip)
    There was once a question asked to one of the original creators of
    unix, if they could go back and change one thing, what would they
    have done differently.

    As I remember it, it was spelling creat() with an extra e.

    Many unix commands are shorter than similar commands on other
    systems, and if not short enough, shell aliases will make them shorter.

    It is very common in PostScript to define shorter versions of
    common commands (I forget what PS calls them). (Though often that
    is for program generated, not meant to be read by humans, code.)
    -- glen
     
    glen herrmannsfeldt, Mar 27, 2014
    #23
  4. luser- -droog

    BartC Guest

    That doesn't follow, unless you're dealing with actual source code at run
    time.

    And even if significant, we don't know in this example how compact or
    otherwise the expanded code might be.
    This is one dispatch loop for a bytecode interpreter:

    typedef void (*(*fnptr))(void);

    do {
    (**(fnptr)(pcptr))();
    } while (!stopped);

    It's reasonably fast (over 100M bytecodes per second), and is still pretty
    clear, which is not unexpected for three lines of code! (BTW removing the
    stop condition, just while(1), makes it slower for some reason.)

    Faster (approaching 200M) is this, showing just one of nearly 300 similar
    cases, each of which is handled in the same way:

    while (1) {
    switch (*pcptr) {
    case knop: do_nop(); break;
    ....
    }
    }

    That's about as far as I could go while using standard C, and still having
    an actual inner loop. I've gone a couple of further steps, and managed
    200-400M bytecodes per second, but the basic C bytecode handler functions
    /are the same/ in all cases.

    That is, for every bytecode instruction, there is a discrete, dedicated
    function to handle it. You can't get more straightforward than that. And
    compiler inlining will also collapse a lot of these, there is no need for
    the code to be physically compact and unreadable.
     
    BartC, Mar 27, 2014
    #24
  5. (big snip)
    I was recently looking at the flow charts of the microcode for the
    IBM 360/40 CPU. (The smallest of the 360s that could run OS/360.)
    Seems that they do a 16 way branch on the high half of the opcode,
    followed by 16 way branches on the low half. I presume that it was
    too much to supply a 256 way branch in microcode.

    I wonder how the speed of nested 16 way switch/case would compare
    to a 256 way switch/case?

    -- glen
     
    glen herrmannsfeldt, Mar 27, 2014
    #25
  6. luser- -droog

    BartC Guest

    There must have had their reasons for using a two-level switch. It's also
    possible the flowchart didn't fully represent what was going on in the
    microcode.

    Although that would be a little different from a C switch which has to deal
    with a default case, which is extra overhead for each level.

    (Maybe there is a way of telling C not to use a default case, or it might be
    able to infer it if, say, the switch expression was 8-bits, and 256
    different case labels were provided.)
     
    BartC, Mar 27, 2014
    #26
  7. luser- -droog

    Kaz Kylheku Guest

    Compactness obtained from transformations like:

    #define l long_variable_name

    has no bearing on executable speed.
     
    Kaz Kylheku, Mar 27, 2014
    #27
  8. luser- -droog

    James Kuyper Guest

    Are you talking about the code compiling faster, or executing faster?
    Use of these macros might have a small amount of influence on the
    compilation speed, but they shouldn't have any effect on the execution
    speed. Are you really suggesting that use of these particular macros
    might speed up (rather than slow down, as I expect) compilation of the
    code sufficiently to justify using them?

    ....
    In some cases, yes, readability can be a highly subjective issue, but
    I'm not willing to concede that these particular macros could ever be
    anything but an obfuscation, and a fairly substantial one at that. That
    might not have been the author's intent, but it is in fact what he has
    achieved.
     
    James Kuyper, Mar 28, 2014
    #28
  9. luser- -droog

    Javier Lopez Guest

    El 27/03/2014 21:38, glen herrmannsfeldt escribió:
    I guess you are referring to generated machine code and then, yes, in
    modern architectures more compact code == faster because of cache
    locality. Likewise, larger code == slower because cache trashing.

    But compact object code doesn't have /anything/ to do with compact
    sources anyway.
     
    Javier Lopez, Mar 28, 2014
    #29
  10. luser- -droog

    Javier Lopez Guest

    El 27/03/2014 22:06, BartC escribió:
    This depends on how smart the compiler is and the target architecture.
    Because of indirect SIB (Scale-Index-Base) call on x86, I guess it
    should be faster to use a jump table than to use switch.

    If I'm not wrong, it should be faster to use switch if there's little
    cases because of iCache.
     
    Javier Lopez, Mar 28, 2014
    #30
  11. luser- -droog

    James Kuyper Guest

    Code using such macros has frequently been posted to this newsgroup by
    people using a variety of different names, such as io_x <>,
    Rosario1903 <>, "Citizen C" <>,
    "¬a\\/b" <>, av <>, RoSsIaCrIiLoIA <>, Giuseppe
    <>. I found several references to an older
    obfuscator named "Jens", though I didn't find any of his actual
    messages. The reason why they chose to do so has been frequently asked,
    but never (to my knowledge) satisfactorily answered. Hanlon's Razor
    suggests that it might have been intended merely to reduce typing, but
    I'm not willing to rule out a deliberate attempt to obfuscate.

    It's hard to believe that so many people could separately invent a
    concept so stupid; possibly they borrowed from each other. Since very
    few, if any, of them seem to have valid e-mail addresses, they could
    easily be multiple aliases for a much smaller cadre of idiots.

    One message I found traced the usage back to "Arthur Whitney's J
    Interpreter", which seems pretty suspicious in the context of this thread.
     
    James Kuyper, Mar 28, 2014
    #31
  12. (snip, I wrote)
    Sorry, I meant more generally on writing compact code, and not
    specifically on using macros to make it look smaller.
    For non-english speakers, I have no idea what macros might make
    it easier to read C code. I might believe that for some people
    single letters would be better.

    Reminds me of a comment from IBM about the Fortran H compiler.
    Seems that they use six (names are one to six characters long)
    balanced trees, one for each length. They suggest that for maximum
    compilation speed you should distribute your names evenly between
    one and six characters long. No suggestion of readability.

    I might believe that on a 32 character wide screen that shorter names
    make for more readable C. I did at one time use a computer running C
    with a 32 character screen. Though I could print them out on wider
    paper. (I had a 132 character wide printer at the time.)

    -- glen
     
    glen herrmannsfeldt, Mar 28, 2014
    #32
  13. luser- -droog

    Stefan Ram Guest

    Reminds me of ANSI C 1989 where for names with external linkage
    the first six characters were portably significant IIRC.
     
    Stefan Ram, Mar 28, 2014
    #33
  14. What, I wonder, is "suspicious" about this?

    My source says the same thing.
    http://www.jsoftware.com/jwiki/Essays/Incunabulum

    FWIW, I have removed the P and R macros and added
    many comments to inca.c and examples to the README,
    so you don't actually need to try to compile it to
    get a feel for what the heck I'm talking about.
     
    luser- -droog, Mar 28, 2014
    #34
  15. Oops. No. Rosy memory. I've been scrutinizing the
    original code for over a year, as the dates on my
    comments to the SO question linked in the README show.
     
    luser- -droog, Mar 28, 2014
    #35
  16. I'll look again. But I think I've added "I"s for all the
    returns actually used. Then remaining implicit ints
    should all be void-functions.

    I do understand that it appears that way. But that's
    not how it happened. Take a peek at the the J incunabulum
    that I started with. I merely tried to maintain a consistent
    style while adding extensions.
     
    luser- -droog, Mar 28, 2014
    #36
  17. luser- -droog

    BartC Guest

    Yes, that switch-based code I showed was faster than the simple
    function-call loop. Almost certainly because the former can make excellent
    use of inlining.

    What is faster still than the switch, is to use a jump-table containing
    label pointers. But this is not satisfactory because it uses both a
    non-standard gcc extension, and lots of gotos:

    loop:
    goto **pcptr;

    jnop:
    do_nop();
    goto loop;

    ..... // 280 more similar groups

    (The table of pointers exists, but the contents are used to fix-up the
    byte-code so it contains the label pointers directly.)

    Strangely, eliminating the second goto to achieve 'threaded' code (and have
    no 'inner loop') was slower:

    loop:
    goto **pcptr; // only executed once

    jnop:
    do_nop();
    goto **pcptr; // After each bytecode, jump direct to the next
    .....

    (Fastest is probably a threaded code 'inner loop' written in ASM and using
    dedicated registers. Initially however it is slower than even the
    function-call loop, because there is no inlining, and there are extra
    overheads in saving registers to call the C bytecode handlers. The speed
    comes from dealing with selected bytecodes in ASM and avoiding calling C as
    much as possible.

    This is all for a pure bytecode interpreter dealing with dynamic, tagged
    data. There are more sophisticated techniques but then it will no longer be
    a simple interpreter.)
     
    BartC, Mar 28, 2014
    #37
  18. luser- -droog

    BartC Guest

    I've done some work with the source. The results are here:

    http://pastebin.com/7J0tfbJU

    Mainly, replacing I by INT, A by ABC (to make things simpler for my editor),
    and giving more vertical space to the functions. Ideally everything there
    would be one statement per line, and the code would be consistently
    indented, but I guess tools exist to do all that.

    The overall structure is now a bit clearer (especially after I figured out
    that V1 and V2 define functions), but it still looks quite complex to fully
    understand. (Not knowing APL doesn't help.)

    The code is also still very fragile: most things I type in will crash it.
     
    BartC, Mar 28, 2014
    #38
  19. They're usually called "mnemonics", assuming you do it right.
    In postscript this can have benefits in storage space,
    transmission time, and execution speed. Handwritten code will
    often use loops and procedures, thus paying for decoding the
    names just once, but generated code is usually unrolled,
    so reducing a name by 1 char reduces the file by 1 char
    for each usage, which can be significant.
     
    luser- -droog, Mar 28, 2014
    #39
  20. If you're trying to write conforming modern C, you need to remove *all*
    occurrences of implicit int. A void function needs to be declared with
    an explicit return type of "void".

    The "implicit int" feature was removed from the language by the C99
    standard.

    [...]
    I found this:

    https://github.com/tangentstorm/j-incunabulum

    and in particular this:

    https://github.com/tangentstorm/j-incunabulum/blob/master/ji.c

    which I see is where the horrid macro definitions:

    #define P printf
    #define R return

    came from. It's based on some code written in 1989, which probably
    explains the use of implicit int. Quoting the README.md:

    Although the code is terse and may seem unreadable at first,
    it can be understood with a little effort. It may help to
    understand that the code was written for and by people who
    were very familiar with array-centric languages (APL) and who
    came from a strong mathematical background. Indeed, one of the
    main reasons for adopting a style like this is that it meakes
    it easier to reason about and manipulate the code algebraically.

    I'm not convinced about the claimed advantages of the style,
    though I suppose it's possible I'm missing something.
     
    Keith Thompson, Mar 28, 2014
    #40
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.