understanding this (dmc c compiler)

Discussion in 'C Programming' started by fir, May 26, 2013.

  1. fir

    Seebs Guest

    Well, the thing is.

    You're asking for help because you don't understand the code. Which means
    you don't actually know what it takes to understand the code. So maybe you
    should take my word for it that it would be a lot of work to figure out
    how it works and explain it.
    So, I was hanging out under a street light, looking intently at the ground,
    and a guy asked me what I was doing. "I'm looking for a coin. I dropped it
    over there." I pointed at a dim alley about two hundred feet away. "So, if
    you dropped it over there, why are you looking here?" "Light's better."

    Trying to understand a compiler that is neither for C, nor written in C,
    is not going to do you much good. If compilers are too complicated (which
    they probably are), it might make sense to study something other than
    compilers until you understand C a lot better.

    Me, I learned C from roguelikes.

    -s
     
    Seebs, Jun 3, 2013
    #21

  2. Ah, that explains that oddity in modern C.
    Is this, perhaps, the basis of why GCC uses L#### for labels, with the
    number being the ISN? I'd idly wondered about that for years.

    Thanks; I learned two things today, which is more than my quota for
    Sundays :)

    S
     
    Stephen Sprunk, Jun 3, 2013
    #22

  3. (snip, I wrote)
    Reminds me of a story from when I took a compiler class, a little
    less than 40 years ago.

    Story is that one group wrote a compiler that generated assembly
    code with labels of the form XXXXnnnn. That is, four X's followed
    by a four digit number.

    At the same time, the assembler group wrote the assembler that hashed
    symbols using only the first four characters of the name. Oops.
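
    A minimal sketch of why that combination hurts. The story does not
    say which hash the assembler used, so the multiplicative hash, the
    name 'prefix_hash' and the constants PREFIX_LEN and NBUCKETS are all
    assumptions made for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define PREFIX_LEN 4    /* assumption: only the first four characters count */
#define NBUCKETS   64   /* hypothetical hash table size */

/* Hash at most PREFIX_LEN leading characters of a symbol name. */
static unsigned prefix_hash(const char *name)
{
    unsigned h = 0;
    size_t i;

    assert(name != NULL);
    for (i = 0; i < PREFIX_LEN && name[i] != '\0'; i++)
        h = h * 31u + (unsigned char)name[i];
    return h % NBUCKETS;
}
```

    With a hash like this, every generated label (XXXX0001, XXXX0002,
    and so on) lands in the same bucket, so each symbol lookup has to
    walk one long chain.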

    -- glen
     
    glen herrmannsfeldt, Jun 3, 2013
    #23
  4. fir

    Noob Guest

    Noob, Jun 3, 2013
    #24
  5. fir

    BartC Guest

    I've written loads of compilers of my own. But I have tremendous problems
    following my own code after being away from it for a couple of months, let
    alone for decades, and of someone else's compiler, and in an obsolete
    language.

    It's the bigger picture that is needed, not just some tiny fragment of code.

    Perhaps look at some other, more recent, and better documented compiler
    sources, then go back to that one. (And remember very old compilers might
    have needed to work differently because of memory constraints: a single pass
    for example, or multiple discrete passes, whatever.)
     
    BartC, Jun 3, 2013
    #25
  6. fir

    Mark Bluemel Guest


    This group mainly considers the C language, not compilers. It is not
    really a suitable place to seek compiler development tuition, least of
    all for a compiler which (again) is neither written in, nor compiles,
    what we now regard as C.

    If you want to learn about compiler writing, I suggest you try the
    comp.compilers newsgroup. There will probably be more compiler writing
    expertise there than there is here.

    If for some reason you want to learn about the specific compilers on
    the webpage you cite, you should hold a seance and try to contact the
    spirit of Dennis Ritchie to discuss it.
     
    Mark Bluemel, Jun 3, 2013
    #26

  7. Generally, the compiler handles expressions differently from all
    other constructs. The first part of the compiler is the lexical
    analyser, namely the 'symbol()' function in c00.c -- this function
    returns the code of the current symbol. Parsing mostly uses the
    recursive descent method: for example, the 'statement()' function
    is responsible for parsing (and generating code!) for a single
    (possibly compound) statement. Expressions are parsed using a
    priority-based parser (the 'tree()' function) that builds parse
    trees, which are stored in a temporary file for the second pass.
    The first pass produces assembly code for most constructs. However,
    for expressions, instead of real code the '#' sign is printed to
    the assembly file (and the tree is stored in the temporary file).

    The 'pswitch()' routine is responsible for parsing and generating
    code for the body of a switch statement. It is called after the
    'switch (....)' part has been handled. In particular, the code which
    computes the value to switch on has already been generated. The
    first part of 'pswitch()' handles variables: info about switch
    cases is stored in the global array 'swtab', and the 'swp' variable
    points to info about the current switch. The 'deflab' variable
    stores the label number of the default label of the current switch;
    'brklab' stores the label number of the label ending the current
    switch. 'pswitch()' stores info about a (possible) surrounding
    switch in its local variables, so that it can restore it at the
    end -- that way switches can nest. Note that 'pswitch()' restores
    'swp' and 'deflab', while 'brklab' is restored by the code calling
    'pswitch()'. After setting up the variables, 'pswitch()' generates
    code to perform the actual switch, using:

    printf("jsr pc,bswitch; l%d\n", swlab);
                            ^^^^
                            label
    I know only a little about PDP-11 assembly, but this seems to be a
    call to a subroutine responsible for performing the actual switch
    action. When the 'jsr' is executed, the value to be switched on is
    in a machine register. Apparently the label after the 'jsr'
    instruction serves as the second argument. More precisely, the
    assembler is supposed to replace the label by the corresponding
    address. This address points to a table of pairs (constant,
    address), where the constants correspond to cases, while each
    address is the address of the code for that case.

    After emitting the 'jsr' instruction and setting variables,
    'pswitch()' calls 'statement(0)' to parse the switch body (which is
    a single (usually compound) C statement). 'statement()' generates
    code for the body. When 'statement()' meets 'case', it generates
    (emits) a new label, puts the (constant, label) pair in the place
    pointed to by 'swp', and increments 'swp' to point to free space.
    When 'statement()' meets 'break', it generates a jump to 'brklab'.

    After 'statement()' has finished its work, 'pswitch()' first checks
    whether it needs to add a default label (if there is no explicit
    default case in the switch, the compiler must add a fake one). Then
    it emits 'brklab' and then emits the table of data used by the
    'jsr'. The '.data' assembler directive means that the corresponding
    output will be put in "data space", so that it is not mistaken for
    instructions. After emitting the table of switch data, 'pswitch()'
    emits the '.text' assembler directive, which means that what
    follows will be code (assembly instructions). Then 'pswitch()'
    restores the variables and returns.
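
    The flow described above can be sketched in modern C roughly as
    follows. This is not the original compiler source, only an
    illustration under stated assumptions: 'swtab', 'swp', 'deflab',
    'brklab' and 'bswitch' are the names used in the description, while
    'emit_switch', 'add_case', 'add_break', 'lookup', 'isn' as the label
    counter, 'struct swent', SWSIZ, the 'br' mnemonic and the exact
    layout of the emitted table are hypothetical. Unlike the original,
    this sketch also saves and restores 'brklab' itself, to stay
    self-contained.

```c
#include <assert.h>
#include <stdio.h>

#define SWSIZ 32                        /* hypothetical table capacity */

struct swent { int value; int label; }; /* one (constant, label) pair */

static struct swent swtab[SWSIZ];       /* cases of all open switches */
static struct swent *swp = swtab;       /* next free slot */
static int deflab;                      /* default label, 0 if none yet */
static int brklab;                      /* label just past the switch */
static int isn = 1;                     /* next unused label number */

/* The body parser met 'case value:': emit a label, record the pair. */
static void add_case(FILE *out, int value)
{
    assert(swp < swtab + SWSIZ);
    swp->value = value;
    swp->label = isn++;
    fprintf(out, "l%d:\n", swp->label);
    swp++;
}

/* The body parser met 'break': jump to the end of the current switch. */
static void add_break(FILE *out)
{
    fprintf(out, "br l%d\n", brklab);   /* mnemonic is illustrative */
}

/* Emit one switch; 'body' stands in for statement(0).
   Returns the number of cases the body recorded. */
static int emit_switch(FILE *out, void (*body)(FILE *))
{
    struct swent *first = swp;          /* our entries start here */
    int saved_deflab = deflab;          /* state of a surrounding switch */
    int saved_brklab = brklab;
    int swlab = isn++;                  /* label of the case table */
    int ncases;
    struct swent *p;

    deflab = 0;
    brklab = isn++;
    fprintf(out, "jsr pc,bswitch; l%d\n", swlab);
    body(out);
    if (deflab == 0) {                  /* no explicit default: fake one */
        deflab = isn++;
        fprintf(out, "l%d:\n", deflab);
    }
    fprintf(out, "l%d:\n.data\nl%d:\n", brklab, swlab);
    for (p = first; p < swp; p++)
        fprintf(out, "%d; l%d\n", p->value, p->label);
    fprintf(out, "l%d; 0\n.text\n", deflab);  /* assumed terminator */
    ncases = (int)(swp - first);
    swp = first;                        /* pop our entries */
    deflab = saved_deflab;
    brklab = saved_brklab;
    return ncases;
}

/* What a bswitch-like routine has to do at run time, in C terms. */
static int lookup(const struct swent *tab, int n, int dflt, int value)
{
    int i;

    for (i = 0; i < n; i++)
        if (tab[i].value == value)
            return tab[i].label;
    return dflt;
}

/* Example bodies: a switch nested inside the first case of another. */
static void inner_body(FILE *out)
{
    add_case(out, 7);
    add_break(out);
}

static void outer_body(FILE *out)
{
    add_case(out, 1);
    emit_switch(out, inner_body);       /* nested switch */
    add_break(out);                     /* uses the outer brklab again */
    add_case(out, 2);
}
```

    Calling emit_switch(stdout, outer_body) prints the assembly-like
    text for the outer switch with the inner one nested in it. Because
    each call remembers where its own entries start in 'swtab' and pops
    them at the end, the inner switch does not disturb the cases of the
    outer one.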
     
    Waldek Hebisch, Jul 29, 2013
    #27
