Understanding this (dmr C compiler)

Discussion in 'C Programming' started by fir, May 26, 2013.

  1. fir

    fir Guest

    Hello all, I recently came across this:

    http://cm.bell-labs.com/cm/cs/who/dmr/primevalC.html

    These are the sources of the original C compiler. I quite like
    them: they are short and clean (though there is no documentation
    or tutorial for them).

    I would very much like to understand the code as fully as
    possible. Can someone explain some of the crucial functions in it,
    what they do, and help me understand it?
     
    fir, May 26, 2013
    #1

  2. fir

    fir Guest

    On Sunday, 26 May 2013 at 19:27:27 UTC+2, fir wrote:
    For example, what does this do?




    pswitch() {
        extern swp[], isn, swtab[], printf, deflab, statement, brklab;
        extern label;
        int sswp[], dl, cv, swlab;

        sswp = swp;
        if (swp==0)
            swp = swtab;
        swlab = isn++;
        printf("jsr pc,bswitch; l%d\n", swlab);
        dl = deflab;
        deflab = 0;
        statement(0);
        if (!deflab) {
            deflab = isn++;
            label(deflab);
        }
        printf("L%d:.data;L%d:", brklab, swlab);
        while(swp>sswp & swp>swtab) {
            cv = *--swp;
            printf("%o; l%d\n", cv, *--swp);
        }
        printf("L%d; 0\n.text\n", deflab);
        deflab = dl;
        swp = sswp;
    }

    Could someone explain it in detail?
     
    fir, Jun 2, 2013
    #2

  3. fir

    osmium Guest

    osmium, Jun 2, 2013
    #3
  4. fir

    fir Guest

    On Sunday, 2 June 2013 at 16:00:24 UTC+2, osmium wrote:
    Weak joke ;) This is a serious question (and, in my opinion, an
    interesting topic), but understanding this C compiler's code would
    be easier with some comments from someone who has perhaps already
    written a C compiler, can manage to understand it, and could
    explain it.

    The function above probably writes the asm code for a switch
    statement, but what exactly does it do? (I understand it only
    partially.)
     
    fir, Jun 2, 2013
    #4
  5. It is tough to study one function of a complex system in isolation. If
    you really want to understand this, you need at least three things:

    An understanding of the PDP-11/20 assembler language which is the
    output it produces. Note also that some of the output is in octal and
    some in decimal.

    An understanding of the functions it calls. What do statement()
    and label() do?

    An appreciation of the fact that the code is not in C as we know
    it now but in an earlier language that will eventually evolve into C.
    For example, swp is not an array but a pointer. Is the '&' in the
    while statement the "logical and" or the "bitwise and" operator?
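
    On the '&' question: in modern C the two operators give the same
    truth value when both operands are comparison results (each is 0 or
    1), and differ only in whether the right-hand side is evaluated.
    Very early C reportedly had no separate `&&` yet, so `&` did both
    jobs. A minimal sketch in present-day C; the helper names are made
    up for illustration and do not come from the compiler:

```c
#include <assert.h>   /* for the checks shown in the note below */

/* Hypothetical helper (not from the compiler): a comparison that counts
   how many times it has been evaluated. */
static int evaluations = 0;

static int greater(int a, int b)
{
    evaluations++;
    return a > b;            /* relational operators yield 0 or 1 */
}

/* Bitwise AND: both comparisons are always evaluated. */
int above_both_bitwise(int p, int a, int b)
{
    return greater(p, a) & greater(p, b);
}

/* Logical AND: the second comparison is skipped when the first is 0. */
int above_both_logical(int p, int a, int b)
{
    return greater(p, a) && greater(p, b);
}

int evaluation_count(void)
{
    return evaluations;
}
```

    For any inputs, `assert(above_both_bitwise(p, a, b) ==
    above_both_logical(p, a, b))` holds; only the evaluation count
    differs. In the posted loop neither comparison has side effects,
    so either reading gives the same behavior.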

     
    Barry Schwarz, Jun 2, 2013
    #5
  6. fir

    Eric Sosman Guest

    The code you posted is not C, although it may be some kind
    of "C with extras" dialect. For example,

    extern swp[], isn, swtab[], printf, deflab, statement, brklab;
    extern label;

    Most of these declarations have meaning ("are legal") in C as it
    was circa 1989; all of them have been prohibited in C since 1999.
    Even in 1989 the declaration of `printf' was incorrect and caused
    undefined behavior. (The declaration of `isn' is also suspect, but
    the undefined behavior in its case is likely to be "it works as
    intended, by luck and not by design.")

    int sswp[], dl, cv, swlab;
    sswp = swp;

    This assignment cannot be performed in C.

    if (swp==0)

    This test is always false in C ...

    swp = swtab;

    ... which is lucky, because this assignment cannot be made in C.

    swlab = isn++;

    Since `isn' has not been initialized or assigned to, its value is
    indeterminate and the behavior of this line is undefined in C.

    ... and so on. I think your first task is to discover what
    language this code is written in (it isn't C). Then you can look
    for an appropriate forum; this isn't it.

    Moral: It isn't necessarily C just because it has semicolons.
     
    Eric Sosman, Jun 2, 2013
    #6
  7. fir

    mark.bluemel Guest

    "Quite fine" - in what sense and for what purpose?

    Studying a compiler for a no longer extant language written over 40 years ago in a language which is also no longer extant doesn't seem a particularly fruitful course of action.

    It's unlikely you'll find many people keen to join you in this exploration.

    If you are interested in compilers, there is probably a newsgroup better suited to aiding your studies (comp.compilers, perhaps). If you are interested in the language that this compiler is implemented in (as Barry and Eric point out, it isn't anything that would be recognised as C now - even Dennis Ritchie's comments on the webpage you reference suggest as much), then there is probably a newsgroup or mailing list somewhere which deals in computing archeology, but I don't know what it is.

    As the sources you're examining are neither written in nor compile what we now know as C, it's doubtful that this newsgroup will produce much help.
     
    mark.bluemel, Jun 2, 2013
    #7
  8. fir

    Seebs Guest

    This is a very very broad and open-ended question, and impossible to
    answer usefully without some kind of context.
    Probably, but... Maybe if you could narrow it down a little to indicate
    what it is you find confusing, or ask a concrete question? I don't know
    the program, but it's K&R C or thereabouts, and it's unclear what the
    problem you're having is. Do you need help understanding the pre-ISO
    C, or do you want to know the intended purpose of the routine, or what?

    An attempt to fully explain this code in detail would be the equivalent
    of a four-to-five page technical paper, easily, and would be of no value
    if it turned out your question had been something else.

    -s
     
    Seebs, Jun 2, 2013
    #8
  9. It is difficult to try to understand something as complex as a compiler
    with just one little part. Still...

    I believe that the [] notation is what today would be *swp, *swtab.

    It stayed legal for function parameters, but not for other variables.
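
    The surviving parameter form can be seen in modern C. A small
    sketch (function names are mine, purely for illustration) showing
    that a parameter declared with [] behaves exactly like one
    declared with *:

```c
#include <assert.h>
#include <stddef.h>

/* `int v[]` in a parameter list declares a pointer, so v can be
   advanced with ++ just like an `int *`. */
int sum_until_zero(int v[])
{
    int total = 0;
    assert(v != NULL);
    while (*v != 0)
        total += *v++;
    return total;
}

/* The same function written with the pointer notation. */
int sum_until_zero_ptr(int *v)
{
    int total = 0;
    assert(v != NULL);
    while (*v != 0)
        total += *v++;
    return total;
}
```

    Outside a parameter list, `int v[];` followed by `v++` or `v = p`
    is rejected by a modern compiler, which is why the posted
    assignments to swp and sswp look wrong to present-day eyes.
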
    In all the compilers I know of that have an ISN, it is the Internal Statement Number.
    Usually counting statements as they appear, for the compiler to keep
    track of, and sometimes appearing in error messages. At the time it
    was more usual than now for compilers to generate a listing file,
    among other things with the ISN printed at the beginning of each line.

    Then at the end, sometimes optional, an index of which variables and
    such were used at which ISN.
    In this case, the ISN is used as a comment in the generated assembly.
    Now the isn is used as a label in the assembly code.
    The DEC assemblers normally expect octal, but these are labels,
    which can be whatever symbolic names the person (or program) generating
    the code wants to use.
    I believe this is for the switch statement, which here writes out
    a table of octal values (presumably addresses) and comments with
    their label (ISN).
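
    To make the table-writing loop concrete, here is a rough sketch in
    modern C of how I read it. This is a guess at the mechanism, not
    the original code: I assume each `case` pushes a (label, value)
    pair onto a shared stack, since the posted loop pops a value and
    then a label. The names push_case, case_mark and emit_table, the
    table size, and the uniform "L" label spelling are all mine.

```c
#include <assert.h>
#include <stdio.h>

#define SWSIZ 64                 /* hypothetical table size */

static int swtab[SWSIZ];         /* shared stack of (label, value) pairs */
static int *swp = swtab;         /* next free slot */

/* Remember where the stack stood when a switch statement begins. */
int *case_mark(void)
{
    return swp;
}

/* Record one `case`. Assumed push order: label first, then value,
   the reverse of the pop order in the posted loop. */
void push_case(int label, int value)
{
    assert(swp + 2 <= swtab + SWSIZ);
    *swp++ = label;
    *swp++ = value;
}

/* Pop every pair pushed since `mark`, writing one "value; label" line
   per case (value in octal, label number in decimal), then a final
   entry naming the default label. Returns the number of cases. */
int emit_table(FILE *out, int *mark, int swlab, int deflab)
{
    int n = 0;
    fprintf(out, "L%d:", swlab);
    while (swp > mark) {
        int cv = *--swp;         /* case value was pushed last */
        int lab = *--swp;        /* its label sits underneath */
        fprintf(out, "%o; L%d\n", cv, lab);
        n++;
    }
    fprintf(out, "L%d; 0\n", deflab);
    return n;
}
```

    Saving the mark before compiling the body and popping back to it
    afterwards is what would let a nested switch share the same table:
    the inner switch only removes the pairs it pushed itself. With
    push_case(3, 8) and push_case(4, 9) followed by
    emit_table(stdout, mark, 2, 5), the cases come out in reverse
    order, 9 printed as octal 11 and 8 as octal 10.
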
    Should be interesting to get running on modern compilers.

    Note that there are emulators for the PDP-11 (someone suggested it
    was PDP-11 assembler code) that might be able to run the compiled code.

    You might even be able to port the compiler back to the PDP-11 again.

    -- glen
     
    glen herrmannsfeldt, Jun 2, 2013
    #9
  10. The link in the original post is:

    http://cm.bell-labs.com/cm/cs/who/dmr/primevalC.html

    It's the source code of one of the very first C compilers, written
    (mostly?) by Dennis Ritchie.
    It's far older than 1989. It's from 1972 or 1973. It's not just
    pre-ANSI; it's pre-K&R1.
    It's not the assignment that's changed, it's the declaration. In the
    version of C this was written in, "int sswp[];" actually defined sswp as
    a pointer.

    [...]
    And it isn't necessarily not-C just because it's not *modern* C.

    One could argue that it is C (because that's what C was like back then)
    or that it isn't (because it doesn't conform to any C standard). I'd
    say that comp.lang.c is the most appropriate place to discuss the code
    -- as long as it's made clear how old it is and how much the language
    has evolved since then.

    Some interesting quotes from the cited web page:

    The earlier compiler does not know about structures at all: the
    string "struct" does not appear anywhere. The second tape has a
    compiler that does implement structures in a way that begins to
    approach their current meaning. Their declaration syntax seems
    to use () instead of {}, but . and -> for specifying members
    of a structure itself and members of a pointed-to structure
    are both there.

    Neither compiler yet handled the general declaration syntax of
    today or even K&R I, with its compound declarators like the one
    in int **ipp; . The compilers have not yet evolved the notion
    of compounding of type constructors ("array of pointers to
    functions", for example). These would appear, though, by 5th
    or 6th edition Unix (say 1975), as described (in Postscript)
    in the C manual a couple of years after these versions.

    Instead, pointer declarations were written in the style int
    ip[];. A fossil from this era survives even in modern C, where
    the notation can be used in declarations of arguments. On the
    other hand, the later of the two does accept the * notation,
    even though it doesn't use it. (Evolving compilers written in
    their own language are careful not to take advantage of their
    own latest features.)

    It's interesting to note that the earlier compiler has a
    commented-out preparation for a "long" keyword; the later one
    takes over its slot for "struct." Implementation of long was
    a few years away.

    Aside from their small size, perhaps the most striking
    thing about these programs is their primitive construction,
    particularly the many constants strewn throughout; they are
    used for names of tokens, for example. This is because the
    preprocessor didn't exist at the time.

    The code is of great *historical* interest to programming language
    historians -- and it helps explain some of the odd quirks that have
    survived into modern ISO C. It's probably not the best resource
    for understanding either modern C or modern compiler construction.
     
    Keith Thompson, Jun 2, 2013
    #10
  11. It predates K&R C by 6-7 years -- years in which the language changed *a
    lot*.
     
    Keith Thompson, Jun 2, 2013
    #11
  12. fir

    James Kuyper Guest

    On 06/02/2013 01:57 PM, Eric Sosman wrote:
    ...
    The very first message in this thread contained the following link:
    <http://cm.bell-labs.com/cm/cs/who/dmr/primevalC.html>, which explains
    what language it was written in. Calling it "not C" is technically
    correct, but it would give a better idea of what is going on to say "not
    yet C". Calling it "C with extras" is funny, given what it actually is.
     
    James Kuyper, Jun 2, 2013
    #12
  13. fir

    Eric Sosman Guest

    Thanks, Richard, Keith, and James. The fossil-nature of the
    link had not penetrated my fossilized skull.

    As for "with extras:" Where is it written that `extras' is
    an unsigned type? ;-)
     
    Eric Sosman, Jun 2, 2013
    #13
  14. fir

    fir Guest

    On Sunday, 2 June 2013 at 21:27:45 UTC+2, Seebs wrote:
    I would like to understand how this compiler works: how it uses
    its main tables, what its main flow is, and what it actually does.
    I think that is possible. I have no trouble understanding this
    pre-C language, but I have no knowledge of compiler writing, so I
    do not understand the basic idea of its control flow or the basic
    tables that flow operates on. That is exactly what I am looking
    for help with.
     
    fir, Jun 2, 2013
    #14
  15. fir

    fir Guest

    On Sunday, 2 June 2013 at 21:40:03 UTC+2, glen herrmannsfeldt wrote:
    (Sorry, Google Groups spoils the quoted content above.)

    This isn advice is interesting. What do you think counts as a
    statement here, the 'lines' separated by ";"? Are the statement
    numbers counted through the whole source file or just within one
    compiled function? What do you think?

    What do you think the other entities here are? I think sw may be
    short for switch (as this is probably the routine that compiles a
    switch construct into assembly).

    Some comments on the code, and the full source, are under the
    first link in the initial post.

    (prof. fir)
     
    fir, Jun 2, 2013
    #15
  16. fir

    Seebs Guest

    It's a compiler for a language which hasn't been in use for ~30-40 years,
    and which is unlikely to ever be in use again, written in just such a
    language. I can't really see a lot of applicability here.

    So, basically: I think it's neat that you're curious, but speaking for
    myself, I am unwilling to put in many hours of researching a language
    which hasn't existed since before I had all my adult teeth just to
    explore an idle curiosity.

    -s
     
    Seebs, Jun 3, 2013
    #16
  17. fir

    fir Guest

    On Monday, 3 June 2013 at 01:08:43 UTC+2, Seebs wrote:
    I THINK it should be NOT SO HARD for someone who has some general
    knowledge of compiler writing (and reverse-engineering it may not
    be so idle; it could give some insights too). Also, I have never
    seen a smaller/simpler C (or simplified C) compiler than this one
    (if someone knows one that is simpler to understand, please tell
    me). Maybe there are some easy-to-read articles about hand-writing
    compilers for simple C dialects?
     
    fir, Jun 3, 2013
    #17
  18. You will need at least a little knowledge of compiler writing.

    You might find some 40 year old compiler books to help you.

    Compilers back then were often much simpler than most today.
    For one, the languages were simpler, but also they had to be
    to fit into smaller memory. (Though sometimes that makes things
    harder. For example, they might use overlays.)

    My favorite from those days is:

    http://www.amazon.com/dp/047132776X

    Which is available for $0.01 (plus $3.99 shipping).

    They don't get much cheaper than that.

    Well, if you aren't in the US you might want to find a more local
    source.

    There are likely many other 40 year old compiler books for similar
    prices. The basic ideas haven't changed that much.

    -- glen
     
    glen herrmannsfeldt, Jun 3, 2013
    #18
  19. fir

    fir Guest

    On Monday, 3 June 2013 at 01:58:57 UTC+2, glen herrmannsfeldt wrote:
    I am afraid I will not manage to buy this :C Couldn't you try to
    tell me the most important things, if you understand some of it,
    in conjunction with the linked "dmr c compiler"? (If you look at
    it you will see a really small source.)
    Or maybe there is some free article on these 40-year-old compilers?
    (I also think that those 40-year-old, 16 KB compilers should be
    much easier than today's.) ;]
     
    fir, Jun 3, 2013
    #19
  20. fir

    Öö Tiib Guest

    It is a program written in ancient C that translates ancient C text
    into the assembler of an ancient processor. It outputs that assembler
    using printf. It is hard to see how you could use that for anything.
    Writing such a compiler just takes an understanding of two languages
    and the skill to write a program that translates one language into
    the other. It is hard to find experts on a several-decades-old
    dialect of C and on that assembler, so what is the point?
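
    To illustrate the "translate one language into another and print
    the result" idea, here is a toy sketch in modern C. It is not how
    the dmr compiler parses anything, and the push/add/mul output is an
    invented stack-machine notation, not PDP-11 assembler. It handles
    only digits, '+', '*' and parentheses, with no spaces, and uses
    assert() as a crude stand-in for error reporting.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

static const char *src;      /* remaining input text */
static FILE *out;            /* where the generated code is printed */

static void expr(void);

/* factor: number | '(' expr ')' */
static void factor(void)
{
    if (*src == '(') {
        src++;
        expr();
        assert(*src == ')');
        src++;
    } else {
        int n = 0;
        assert(isdigit((unsigned char)*src));
        while (isdigit((unsigned char)*src))
            n = n * 10 + (*src++ - '0');
        fprintf(out, "push %d\n", n);
    }
}

/* term: factor { '*' factor } */
static void term(void)
{
    factor();
    while (*src == '*') {
        src++;
        factor();
        fprintf(out, "mul\n");
    }
}

/* expr: term { '+' term } */
static void expr(void)
{
    term();
    while (*src == '+') {
        src++;
        term();
        fprintf(out, "add\n");
    }
}

/* Compile one expression; returns 1 if the whole input was consumed. */
int compile(const char *text, FILE *dest)
{
    src = text;
    out = dest;
    expr();
    return *src == '\0';
}
```

    Calling compile("1+2*3", stdout) prints push 1, push 2, push 3,
    mul, add, one per line. Each parsing function prints code as soon
    as it recognizes its construct, which is the same overall shape as
    a routine that prints assembler text while it reads a statement.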

    Maybe find the source code of a somewhat more modern C compiler? I have
    seen things advertised now and then like "TinyCC" or "Portable C
    Compiler". You are more likely to get one of those to work on devices
    that you have, and maybe even to compile something for you.
     
    Öö Tiib, Jun 3, 2013
    #20
