writing a compiler for Monster language using C language

Discussion in 'C Programming' started by Shravani, Mar 16, 2008.

  1. Shravani

    Shravani Guest

    hi,
    i have to write a compiler for Monster language using C.example
    programs of the language are:

    EXAMPLE 1

    /* Test Program - Module Structure */

    module fact

    export N2N;
    export fact;

    type N2N is -> int ret int throws;

    val fact : N2N;

    fact := fun(n) { if n=0 then return 1 else return n*fact(n-1); };

    end

    module main

    import fact.N2N;
    import fact.fact;

    type TLF is -> int ret throws;

    val x:int;

    val main : TLF;

    main := fun(n) { val result:int; result:= fact(n); print result; };


    end

    EXAMPLE 2

    module Stack

    export Stack;

    type OverFlowException is * errMsg : str * size : int;
    type UnderFlowException is * errMsg : str;
    type Stack is * push : -> int ret throws OverflowException
    * pop : -> ret throws UnderFlowException
    * top : -> ret int throws UnderFlowException
    * size : -> ret int throws
    * isEmpty : -> ret int throws;
    type IntArray is [] int;


    val s : Stack;

    val maxSize : int;

    val ofe : OverFlowException;
    maxSize := 100;
    ofe := { "push invalid: stack is full", 0 };

    s := obj {
    val contents : IntArray;
    val _size : int;

    contents := [0, 0, 0, 0 ,0];
    _size := 0;

    size := fun() { return _size; };
    isEmpty := fun() { return (size()) = 0; };
    top := fun() { val ufe : UnderFlowException;
    ufe.errMsg := "top invalid: stack is empty" ;
    if isEmpty() then throw ufe else return contents[size()];
    };
    pop := fun() { val ufe : UnderFlowException;
    ufe.errMsg := "top invalid: stack is empty" ;
    if isEmpty() then throw ufe else _size := _size - 1 ;
    };
    push := fun(value) { if _size<maxSize then { _size := _size+1;
    contents[_size] := value; }
    else throw ofe;
    };
    };

    end

    module mymodule

    import Stack.Stack;
    import Stack.s;

    type TopLevelFun is -> ret throws;

    val stackTest:TopLevelFun;

    stackTest:=fun() { val x:int; s.push(0); s.push(1); s.push(2);
    s.pop(); s.push(3);
    x:= s.top();
    print x; s.pop();
    y:= s.top();
    print y;
    };



    end



    What i have to do is to write a compiler for such programs.in the
    first phase i need to write a scanner that produces the token stream
    of the input program.that is:
    * Scanner
    Requirements Specification:
    Input: Program File
    Output: Token Stream
    Side Effects: Comments and White spaces removed
    Exceptions: Invalid tokens, Invalid comment
    Interface Requirements:
    void initializeScanner(char *filename);
    Token nextToken(); // should scan and return the next token void
    printTokenStream(FILE f);

    here's the list of tokens:
    /* Keywords */
    1. TK_KEY_MOD "module"
    2. TK_KEY_END "end"
    3. TK_KEY_IMPORT "import"
    4. TK_KEY_EXPORT "export"
    5. TK_KEY_TYPE "type"
    6. TK_KEY_IS "is"
    7. TK_KEY_INT "int"
    8. TK_KEY_FLOAT "float"
    9. TK_KEY_STR "str"
    10. TK_KEY_RET "ret"
    11. TK_KEY_THROWS "throws"
    12. TK_KEY_VAL "val"
    13. TK_KEY_FUN "fun"
    14. TK_KEY_IF "if"
    15. TK_KEY_THEN "then"
    16. TK_KEY_ELSE "else"
    17. TK_KEY_PRINT "print"
    18. TK_KEY_OBJ "obj"
    19. TK_KEY_RETURN "return"
    20. TK_KEY_THROW "throw"
    21. TK_KEY_THIS "this"


    /* Separators */
    1. TK_SEMI ';'
    2. TK_LPAR '('
    3. TK_RPAR ')'
    4. TK_COMMA ','
    5. TK_COLON ':'
    6. TK_LSQUA '['
    7. TK_RSQUA ']'
    8. TK_LBRACE '{'
    9. TK_RBRACE '}'


    /* Operators */
    1. TK_ARROW '->'
    2. TK_STAR '*'
    3. TK_ASSIGN ':='
    4. TK_DOT '.'
    5. TK_MINUS '-'
    6. TK_PLUS '+'
    7. TK_DIV '/'
    8. TK_HASH '#'
    9. TK_LESS '<'
    10. TK_GREAT '>'
    11. TK_EQUAL '='
    12. TK_AND '&'
    13. TK_OR '|'
    14. TK_NOT '!'

    /* Other */
    1. TK_ID (ALPHA | UNDER) (ALPHA | DIGIT | UNDER)*
    2. TK_INTVAL DIGIT+
    3. TK_FLOATVAL DIGIT+ '.' DIGIT+
    4. TK_STRVAL '"' (NOQUOTE)* '"'

    where ALPHA is any alphabetic character,
    DIGIT is any digit character,
    UNDER is the underscore character, and
    NOQUOTE is any character except the double-quote.


    could anyone please help me out in writing this scanner?i need to take
    the source program as input and produce the token stream,consisting of
    tokens of the form described above,as the output.
    Shravani, Mar 16, 2008
    #1
    1. Advertising

  2. Shravani

    Morris Dovey Guest

    Shravani wrote:
    >
    > hi,
    > i have to write a compiler for Monster language using C.example
    > programs of the language are:


    <snip>

    > could anyone please help me out in writing this scanner?i need to take
    > the source program as input and produce the token stream,consisting of
    > tokens of the form described above,as the output.


    You might find it helpful to write a BNF description of Monster.

    You'll probably want:

    A function to return the character at the current point in the
    input stream.

    A function to "eat" whitespace.

    A function to match a specific character sequence at the current
    point in the input stream.

    A function to match a specific single character at the current
    point in the input stream.

    The ability to revert the current point in the input stream to
    where it was before a specification failed, so that another
    specification can be tested.

    The ability to produce latent output.

    The ability to promote latent output to actual output.

    Function(s) for symbol table manipulation.

    --
    Morris Dovey
    DeSoto Solar
    DeSoto, Iowa USA
    http://www.iedu.com/DeSoto/
    Morris Dovey, Mar 16, 2008
    #2
    1. Advertising

  3. "Shravani" <> wrote in message
    news:...
    > hi,
    > i have to write a compiler for Monster language using C.example
    > programs of the language are:
    >

    Use MiniBasic as the basis for your compiler.

    Strip out all the BASIC keywords and add monster ones. You'll also need to
    do a considerable amount of work on the variables (MiniBasic's are all
    global). Basically start from a simple expression parser and work back up.

    I've got a semi-compiled version of MiniBasic I could pass to you. This
    tokenises expressions, indexes variables, and precalculates jumps. It also
    needs to put expression into reverse Polish and optimise them, which isn't
    done yet.

    --
    Free games and programming goodies.
    http://www.personal.leeds.ac.uk/~bgy1mm
    Malcolm McLean, Mar 16, 2008
    #3
  4. Shravani <> writes:
    > i have to write a compiler for Monster language using C.example
    > programs of the language are:
    >
    > EXAMPLE 1
    >
    > /* Test Program - Module Structure */

    [98 lines deleted]
    > end
    >
    >
    >
    > What i have to do is to write a compiler for such programs.in the
    > first phase i need to write a scanner that produces the token stream
    > of the input program.that is:

    [74 lines deleted]
    >
    >
    > could anyone please help me out in writing this scanner?i need to take
    > the source program as input and produce the token stream,consisting of
    > tokens of the form described above,as the output.


    What have you done so far? What specific problems have you run into?

    Don't expect us to do your homework for you.

    --
    Keith Thompson (The_Other_Keith) <>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Mar 16, 2008
    #4
  5. Shravani

    santosh Guest

    Shravani wrote:

    > hi,
    > i have to write a compiler for Monster language using C.example
    > programs of the language are:

    [...]
    > could anyone please help me out in writing this scanner?i need to take
    > the source program as input and produce the token stream,consisting of
    > tokens of the form described above,as the output.


    See:

    <http://www.compilers.net/>
    <http://www.personal.kent.edu/~rmuhamma/Compilers/compiler.html>
    <http://cs.wwc.edu/~aabyan/464/Book/>
    <http://www.cs.sjsu.edu/~louden/cmptext/>
    <http://channel9.msdn.com/ShowPost.aspx?PostID=305338>
    <http://www.gnu.org/software/bison/>
    <http://www.garshol.priv.no/download/text/bnf.html>
    <http://www.cs.chalmers.se/~markus/BNFC/>

    If you have problems in specific pieces of C code, you can post them
    here.
    santosh, Mar 16, 2008
    #5
  6. Shravani

    Bartc Guest

    "Shravani" <> wrote in message
    news:...
    > hi,
    > i have to write a compiler for Monster language using C.example
    > programs of the language are:
    >


    Couldn't resist trying out my new C skills. My attempt at this tokeniser is
    below.

    The specification isn't quite the same as yours, function names are
    different, the tokens are just printed to the console.

    Not all punctuation symbols are recognised; these are easy to add. I've
    added a couple extra tokens, tk_error, and tk_eof.

    Some tokens with an associated value: tk_id, tk_intval, tk_strval, return
    the value in global variables (this is frowned on apparently). Also floating
    number constants are not recognised. And letter case I've assumed is
    significant.

    --
    Bart

    /* tokenise file "input" */

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define tk_key_mod 1
    #define tk_key_end 2
    #define tk_key_import 3
    #define tk_key_export 4
    #define tk_key_type 5
    #define tk_key_is 6
    #define tk_key_int 7
    #define tk_key_float 8
    #define tk_key_str 9
    #define tk_key_ret 10
    #define tk_key_throws 11
    #define tk_key_val 12
    #define tk_key_fun 13
    #define tk_key_if 14
    #define tk_key_then 15
    #define tk_key_else 16
    #define tk_key_print 17
    #define tk_key_obj 18
    #define tk_key_return 19
    #define tk_key_throw 20
    #define tk_key_this 21
    #define tk_semi 22
    #define tk_lpar 23
    #define tk_rpar 24
    #define tk_comma 25
    #define tk_colon 26
    #define tk_lsqua 27
    #define tk_rsqua 28
    #define tk_lbrace 29
    #define tk_rbrace 30
    #define tk_arrow 31
    #define tk_star 32
    #define tk_assign 33
    #define tk_dot 34
    #define tk_minus 35
    #define tk_plus 36
    #define tk_div 37
    #define tk_hash 38
    #define tk_less 39
    #define tk_great 40
    #define tk_equal 41
    #define tk_and 42
    #define tk_or 43
    #define tk_not 44
    #define tk_id 45
    #define tk_intval 46
    #define tk_floatval 47
    #define tk_strval 48
    #define tk_eof 49
    #define tk_error 50

    #define eofchar EOF
    #define maxid 257

    char* tokennames[] = {
    "",
    "module", //tk_key_mod
    "end", //tk_key_end
    "import", //tk_key_import
    "export", //tk_key_export
    "type", //tk_key_type
    "is", //tk_key_is
    "int", //tk_key_int
    "float", //tk_key_float
    "str", //tk_key_str
    "ret", //tk_key_ret
    "throws", //tk_key_throws
    "val", //tk_key_val
    "fun", //tk_key_fun
    "if", //tk_key_if
    "then", //tk_key_then
    "else", //tk_key_else
    "print", //tk_key_print
    "obj", //tk_key_obj
    "return", //tk_key_return
    "throw", //tk_key_throw
    "this", //tk_key_this
    ";", //tk_semi
    "(", //tk_lpar
    ")", //tk_rpar
    ",", //tk_comma
    ":", //tk_colon
    "[", //tk_lsqua
    "]", //tk_rsqua
    "{", //tk_lbrace
    "}", //tk_rbrace
    "->", //tk_arrow
    "*", //tk_star
    ":=", //tk_assign
    ".", //tk_dot
    "-", //tk_minus
    "+", //tk_plus
    "/", //tk_div
    "#", //tk_hash
    "<", //tk_less
    ">", //tk_great
    "=", //tk_equal
    "&", //tk_and
    "|", //tk_or
    "!", //tk_not
    "$ident", //tk_id
    "$intconst", //tk_intval
    "$realconst", //tk_floatval
    "$stringconst", //tk_strval
    "<EOF>", //tk_eof
    "!ERROR!"}; //tk_error

    int lookedahead;
    int eofreached;
    int nextchar;

    char alphaname[maxid];
    int numbervalue;
    char stringvalue[maxid];

    #define testfile "input"

    void scannerinit(void);
    int getnextchar(FILE*);
    int peeknextchar(FILE*);
    int getnexttoken(FILE*);
    int lookupkeyword(char*);

    int main(void)
    {FILE *f;
    int t;

    f=fopen(testfile,"rb");
    if (f==0)
    {puts("Couldn't find input file");
    return EXIT_FAILURE;
    };

    scannerinit();

    while(1)
    {
    t=getnexttoken(f);

    switch (t)
    {
    case tk_id:
    printf("Id: %s\n",alphaname);
    break;
    case tk_intval:
    printf("Int const: %d\n",numbervalue);
    break;
    case tk_strval:
    printf("String: %s\n",stringvalue);
    break;
    default:
    printf("Token: %s\n",tokennames[t]);
    };

    if (t==tk_eof) break;
    };

    fclose(f);
    return EXIT_SUCCESS;
    }

    void scannerinit()
    {
    lookedahead=0;
    eofreached=0;
    }

    int getnexttoken(FILE* f)
    {int c,d;
    char name[maxid];
    int i,t;

    while(1)
    {
    c=getnextchar(f);

    if (c==eofchar) return tk_eof;

    switch (c)
    {
    case 13: case 10: case 32: case 9:
    break;

    case ';': return tk_semi;
    case '(': return tk_lpar;
    case ')': return tk_rpar;
    case '{': return tk_lbrace;
    case '}': return tk_rbrace;
    case ',': return tk_rpar;

    case ':':
    d=peeknextchar(f);
    if (d=='=')
    { getnextchar(f);
    return tk_assign;
    };
    return tk_colon;

    case '/':
    d=peeknextchar(f);
    if (d!='*') return tk_div;
    getnextchar(f);

    while (1)
    {
    c=getnextchar(f);
    if (c==eofchar) return tk_eof;
    if (c=='*' && peeknextchar(f)=='/')
    { getnextchar(f);
    break;
    }
    };
    break;

    case '"':
    i=0;
    while(1)
    {c=getnextchar(f);
    if (c==eofchar) break;
    if (c=='"') break;
    stringvalue[i++]=c;
    };
    stringvalue=0;

    return tk_strval;
    default:
    goto xyzzy;

    };

    };

    xyzzy:

    if ((c>='A' && c<'Z') || (c>='a' && c<='z') || c=='_')
    {
    alphaname[0]=c;
    i=1;

    while(1)
    {c=peeknextchar(f);
    if ((c>='A' && c<'Z') || (c>='a' && c<='z') || c=='_' || (c>='0' &&
    c<='9'))
    {getnextchar(f);
    alphaname[i++]=c;
    }
    else
    break;
    };
    alphaname=0;

    t=lookupkeyword(alphaname);
    if (t) return t;
    return tk_id;
    };

    if (c>='0' && c<'9')
    {numbervalue=c-'0';
    while (1)
    {c=peeknextchar(f);
    if (c>='0' && c<='9')
    {getnextchar(f);
    numbervalue = numbervalue*10+c-'0';
    }
    else
    break;

    };

    return tk_intval;
    };

    return tk_error;
    }

    int getnextchar(FILE* f)
    {
    int c;

    if (eofreached) return eofchar;

    if (lookedahead)
    {
    lookedahead=0;
    return nextchar;
    }

    c=fgetc(f);

    if (c==EOF)
    { eofreached=1;
    return eofchar;
    };
    return c;
    }

    int peeknextchar(FILE* f)
    {
    if (lookedahead) return nextchar;
    nextchar=getnextchar(f);
    lookedahead=1;
    return nextchar;
    }


    int lookupkeyword(char* name)
    {int i;

    for (i=tk_key_mod; i<=tk_key_this; ++i)
    if (strcmp(tokennames,name)==0)
    return i;
    return 0;
    }
    Bartc, Mar 16, 2008
    #6
  7. Shravani

    Bartc Guest

    "Bartc" <> wrote in message
    news:sW8Dj.24075$...
    >
    > "Shravani" <> wrote in message
    > news:...
    >> hi,
    >> i have to write a compiler for Monster language using C.example
    >> programs of the language are:
    >>

    >
    > Couldn't resist trying out my new C skills. My attempt at this tokeniser
    > is below.



    Oh, and I've assumed ASCII input. If this is homework you might lose marks
    for making any such assumptions. The constants 13, 10, 32 and 9 represent
    likely whitespace in ASCII.

    --
    Bart
    Bartc, Mar 16, 2008
    #7
  8. "Bartc" <> writes:
    > "Bartc" <> wrote in message
    > news:sW8Dj.24075$...
    >> "Shravani" <> wrote in message
    >> news:...
    >>> hi,
    >>> i have to write a compiler for Monster language using C.example
    >>> programs of the language are:
    >>>

    >>
    >> Couldn't resist trying out my new C skills. My attempt at this tokeniser
    >> is below.

    >
    > Oh, and I've assumed ASCII input. If this is homework you might lose marks
    > for making any such assumptions. The constants 13, 10, 32 and 9 represent
    > likely whitespace in ASCII.


    And if he submits the solution you posted, he might lose marks for
    plagiarism.

    We traditionally don't do people's homework for them.

    --
    Keith Thompson (The_Other_Keith) <>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Mar 16, 2008
    #8
  9. Shravani

    Bartc Guest

    "Keith Thompson" <> wrote in message
    news:...
    > "Bartc" <> writes:
    >> "Bartc" <> wrote in message
    >> news:sW8Dj.24075$...
    >>> "Shravani" <> wrote in message
    >>> news:...
    >>>> hi,
    >>>> i have to write a compiler for Monster language using C.example
    >>>> programs of the language are:
    >>>>
    >>>
    >>> Couldn't resist trying out my new C skills. My attempt at this tokeniser
    >>> is below.

    >>
    >> Oh, and I've assumed ASCII input. If this is homework you might lose
    >> marks
    >> for making any such assumptions. The constants 13, 10, 32 and 9 represent
    >> likely whitespace in ASCII.

    >
    > And if he submits the solution you posted, he might lose marks for
    > plagiarism.


    My code was just to give ideas. And there's enough missing or wrong bits to
    give scope for much more work. Anyway I'm sure the OPs style will be very
    different and he knows that. And f he's doing a whole compiler, it will get
    a lot harder.

    > We traditionally don't do people's homework for them.


    I was using it to exercise my own skills, then decided to upload my attempt.

    --
    Bart
    Bartc, Mar 16, 2008
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. =?Utf-8?B?Q2FsdmluIEtE?=

    Cookies ... monster. Please help

    =?Utf-8?B?Q2FsdmluIEtE?=, Jun 9, 2005, in forum: ASP .Net
    Replies:
    3
    Views:
    480
    =?Utf-8?B?Q2FsdmluIEtE?=
    Jun 15, 2005
  2. HIK
    Replies:
    3
    Views:
    523
  3. giob
    Replies:
    7
    Views:
    446
  4. Fred
    Replies:
    1
    Views:
    594
    Neredbojias
    Sep 26, 2005
  5. Maria
    Replies:
    0
    Views:
    1,674
    Maria
    Sep 8, 2010
Loading...

Share This Page