Obfuscated Languages Interpreter

Discussion in 'C Programming' started by G. Nick D'Andrea, Jun 30, 2003.

  1. I've just written an interpreter for an obfuscated language of my own
    design, which I call "tarfu." The language itself has no way of storing
    information except within the code itself, meaning there are no variables.
    Here's the source to the interpreter, I figured I'd post it here (because I
    couldn't find a newsgroup devoted to obfuscated languages) and see if you
    had any comments or questions. The interpreter is fairly straightforward
    and therefore I don't think it's necessary to include documentation about
    the language.
    G. Nick D'Andrea, Jun 30, 2003
    #1

  2. On Mon, 30 Jun 2003, G. Nick D'Andrea wrote:
    >
    > I've just written an interpreter for an obfuscated language of my own
    > design, which I call "tarfu." The language itself has no way of storing
    > information except within the code itself, meaning there are no variables.
    > Here's the source to the interpreter, I figured I'd post it here (because I
    > couldn't find a newsgroup devoted to obfuscated languages) and see if you
    > had any comments or questions.


    comp.lang.misc is the usual place for posting random language
    announcements. If you post to c.l.c you're going to get critiques
    of your C coding style, which maybe is what you want.

    Also, in either place you're going to get the admonishment not to try
    attaching files to your message. If it's a binary, post to one of
    the binaries groups. If it's a short text file, put it in the body
    of your message. If it's large, put it online and give a link to it in
    your message.

    I've opened your attachment anyway, but some people may not see it at all.
    So here it is, with my comments.

    ==begin file tarfu.c==

    /* tarfu interpreter
    * (c) 2003 by Harry Altman and G. Nick D'Andrea
    * Provided under the terms of
    * the GNU General Public License
    */

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define RIGHT 1 /* Will we ever actually use these? */
    #define LEFT -1

    unsigned char *p;
    size_t ip=0;
    size_t sp=0;
    size_t size;
    int dir=1;
    FILE *afile=NULL;

    (My comment: You really should try to give global variables
    expressive names. 'size' is NOT a good name to give to a global.
    At least make them static, so they're not overly polluting
    if you ever expand the program to more than one file.)
    (snip)

    int main(int argc, char **argv)
    {
    FILE *pfile;

    (Oh dear, another 8-character tabber. Either that, or you didn't
    detab your program before posting. That's ugly.)

    int i;
    if(argc==1)
    {
    puts("tarfu Interpreter.\n"
    "Usage: tarfu [OPTIONS] FILE\n"
    "Use \"tarfu -h\" for help");
    exit(1);

    (Non-standard return code.)

    }
    for(i=1;i<argc;i++)
    {
    if(!strcmp(argv[i],"-h"))

    As a style note, it might be more user-friendly to allow
    "-H" to be a help screen, too.

    {
    puts("tarfu Interpreter.\n"
    "Usage: tarfu [OPTIONS] FILE\n"
    "FILE: Script file\n"
    "OPTIONS:\n"
    "-a\tInput file for the script");
    exit(0);
    }
    else if(!strcmp(argv[i],"-a"))
    {
    if(!(afile=fopen(argv[++i],"r")))

    What if i==argc-1? Then you're passing NULL to fopen.
    That's not particularly good design, even if it is defined
    to do something sensible (which I'm not going to bother looking
    up).
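
    One way to guard that case, as a minimal sketch only. The helper name
    option_argument is made up for illustration and is not part of the
    posted program:

```c
#include <stddef.h>

/* Hypothetical helper: returns the word that follows argv[*i], or NULL
 * when the option at argv[*i] is the last word on the command line.
 * *i is advanced only when a following word actually exists. */
static const char *option_argument(int argc, char **argv, int *i)
{
    if (*i + 1 >= argc)
        return NULL;        /* e.g. "-a" given with nothing after it */
    *i += 1;
    return argv[*i];
}
```

    The "-a" branch could then print the usage message when the helper
    returns NULL, and call fopen only on a real file name.
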
    {
    fprintf(stderr, "File \"%s\" Not Found\n", argv[i]);
    exit(1);
    }
    }
    else
    {
    if(!(pfile=fopen(argv[i],"r")))
    {
    fprintf(stderr, "File \"%s\" Not Found\n", argv[i]);
    exit(1);
    }
    }
    }
    if (afile==NULL) afile=stdin;
    if(!(pfile))
    {
    fputs("Please specify an input file", stderr);
    }
    if(!(p=(char*)malloc((size=getsize(pfile))+1)))

    Casting malloc is only required in C++, and never good style.
    There's no reason to combine the assignments, either.

    size = getsize(pfile);
    p = malloc(size+1);
    if (p == NULL) ...

    See how much simpler that looks?

    {
    fputs("Not enough memory.\n",stderr);
    exit(1);
    }

    /* Get the input */
    slurp(pfile,p);

    /* Do stuff with the input */
    run();

    Don't forget to
    return 0;
    at the end of main().

    }

    size_t getsize(FILE *f)
    /* Precondition: FILE *f exists
    * Postcondition: Returns the size of file *f
    */
    {
    size_t i=0;
    while(getc(f)!=EOF)
    i++;
    rewind(f);
    return i;
    }

    /* From Harry's Hexed Interpreter */
    void slurp(FILE *infile, char *out)
    /* Precondition: *infile exists, *out is the program text
    * Postcondition: There is none, you fool!
    */
    {
    int c;
    size_t i=0;
    *out='\0';
    while((c=fgetc(infile))!=EOF)
    {
    out[i++]=(char)c; out[i]='\0';
    }
    }

    (The non-I/O part of this function can be sped up by a factor of two.

    void slurp(FILE *in, char *out)
    {
    int c;
    while ((c = getc(in)) != EOF)
    *out++ = (char) c;
    *out = '\0';
    return;
    }

    Why copy other people's code if it's so bad? Find code that
    does what you want *quickly*.)

    /* End code from Harry's Hexed Interpreter */

    void run(void)
    /* Precondition: Why don't you figure this out for yourself
    * Postcondition: The script is executed
    */
    {
    while(1)
    {
    pre();
    doStuff();
    ip+=dir;
    }
    }

    void pre(void)
    /* Precondition: None
    * Postcondition: Does the preprocessing for the iteration
    */
    {
    size_t i;
    int dir2=(dir==RIGHT);
    for(i=(dir2 ? 0 : size-1) ; i<size ; i+=dir) /* i is a size_t, and therefore unsigned */
    {
    if(p[i]=='*')
    {
    if(i+dir==size || i+dir ==-1)
    {
    break; /*continue would also work here*/
    }
    if(p[i-dir]=='*')
    {
    continue;
    }

    (It is now obvious that you are not even trying to write clearly.
    I'll make, say, two more comments and then stop reading.)

    if('0' <= p[i+dir] && p[i+dir] <= '9')
    {
    p[i+dir]-='0';
    if(!dir2) swap(p+i,p+i-1);
    if(sp>= i+dir2) sp--;
    if(ip>= i+dir2) ip--;
    memmove(p+i-!dir2,p+i+dir2,size-i+!dir2); /*Nick is more confused.*/
    (One.)
    }
    else if('a' <= p[i+dir] && p[i+dir] <= 'a'+30)
    {
    p[i+dir] -= ('a'-10);
    if(!dir2) swap(p+i,p+i-1);
    if(sp>= i+dir2) sp--;
    if(ip>= i+dir2) ip--;
    memmove(p+i-!dir2,p+i+dir2,size-i+!dir2); /*Nick is most confused.*/
    (Two.)

    -Arthur
    Arthur J. O'Dwyer, Jun 30, 2003
    #2

  3. [OT] Re: Obfuscated Languages Interpreter

    On Tue, 1 Jul 2003, John Smith wrote:
    >
    > "Arthur J. O'Dwyer" <> wrote in message
    > news:p...
    > >
    > > On Mon, 30 Jun 2003, G. Nick D'Andrea wrote:

    >
    > Gees Arthur, you must be bored to plough through that lot... Even worse than
    > I can produce :)


    Well, he asked for it. ;) Besides, I'm a fan of obfuscated languages,
    in general, so it caught my eye. OTOH, I *do* like to see some actual
    indication of what a program is supposed to do. It's fun to write
    intentionally obfuscated stuff; it's usually not fun to try to read it.
    So when I saw memmove() being scattered randomly around the code, I
    just stopped reading it. :)

    -Arthur
    Arthur J. O'Dwyer, Jul 1, 2003
    #3
  4. David Rubin (Guest)

    Arthur J. O'Dwyer wrote:

    [snip]
    > int main(int argc, char **argv)
    > {
    > FILE *pfile;
    >
    > (Oh dear, another 8-character tabber. Either that, or you didn't
    > detab your program before posting. That's ugly.)
    >
    > int i;
    > if(argc==1)
    > {
    > puts("tarfu Interpreter.\n"
    > "Usage: tarfu [OPTIONS] FILE\n"
    > "Use \"tarfu -h\" for help");
    > exit(1);
    >
    > (Non-standard return code.)
    >
    > }
    > for(i=1;i<argc;i++)
    > {
    > if(!strcmp(argv[i],"-h"))
    >
    > As a style note, it might be more user-friendly to allow
    > "-H" to be a help screen, too.


    IMO, you should print the usage whenever you get an *unrecognized* option. This
    allows people to type -h, -H, -?, and whatever else seems natural to them. If
    they are all unrecognized by your program, they all trigger the usage message.
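
    A rough sketch of that idea, under assumptions: the enum and the
    function name classify_arg are invented for illustration, and treating
    a lone "-" as a file name is a choice made here, not something the
    posted program defines.

```c
#include <string.h>

/* Hypothetical classification of one command-line word. */
enum arg_kind { ARG_FILE, ARG_INPUT_OPTION, ARG_USAGE };

static enum arg_kind classify_arg(const char *arg)
{
    if (arg[0] != '-' || arg[1] == '\0')
        return ARG_FILE;            /* no leading dash, or a lone "-" */
    if (strcmp(arg, "-a") == 0)
        return ARG_INPUT_OPTION;    /* the one option tarfu documents */
    return ARG_USAGE;               /* -h, -H, -? and anything else */
}
```

    main() would print the usage text and exit whenever classify_arg
    returns ARG_USAGE, so no separate test for "-h" is needed.
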

    >
    > {
    > puts("tarfu Interpreter.\n"
    > "Usage: tarfu [OPTIONS] FILE\n"
    > "FILE: Script file\n"
    > "OPTIONS:\n"
    > "-a\tInput file for the script");
    > exit(0);
    > }
    > else if(!strcmp(argv[i],"-a"))


    This way of parsing command line options is a little clunky because it uses a
    lot of code, doesn't allow you to combine options (e.g., -qvt, -qtv, -vtq, etc),
    and is not easily maintainable.
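
    To illustrate what combining options means, here is a small sketch
    that walks the letters of one word such as "-qvt". The flags q, v and
    t come from the example above and are hypothetical; tarfu itself only
    documents -a. On POSIX systems getopt() handles this for you, but it
    is not part of standard C.

```c
#include <stddef.h>

/* Hypothetical flag bits for the letters used in the example. */
enum { FLAG_Q = 1, FLAG_V = 2, FLAG_T = 4 };

/* Parses one word such as "-qvt". Returns the OR of the flags found,
 * or -1 if the word is not an option cluster or has an unknown letter. */
static int parse_flag_cluster(const char *arg)
{
    int flags = 0;
    size_t k;

    if (arg[0] != '-' || arg[1] == '\0')
        return -1;
    for (k = 1; arg[k] != '\0'; k++) {
        switch (arg[k]) {
        case 'q': flags |= FLAG_Q; break;
        case 'v': flags |= FLAG_V; break;
        case 't': flags |= FLAG_T; break;
        default:  return -1;        /* unknown letter: caller prints usage */
        }
    }
    return flags;
}
```

    Because each letter just sets a bit, "-qvt", "-qtv" and "-vtq" all
    produce the same result, and adding an option is one more case label.
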

    [snip - malloc error]
    > {
    > fputs("Not enough memory.\n",stderr);
    > exit(1);
    > }


    consider

    perror("tarfu");
    exit(EXIT_FAILURE);

    perror will print something like

    tarfu: out of memory

    Even better would be to assign

    argv0 = argv[0];

    in the beginning of your program and use it throughout to refer to the program
    name. That way, the errors reflect the name the user gives to the program, not
    the one you choose.
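
    A minimal sketch of the argv0 idea. The function names
    set_program_name and format_error are made up for illustration. Note
    also that perror() and strerror() report whatever errno holds, and ISO
    C does not promise that malloc sets errno on failure (POSIX does), so
    the exact text after the program name is system dependent.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Program name used in every message; "tarfu" is only a fallback. */
static const char *argv0 = "tarfu";

/* Hypothetical helper: call once at the top of main(). */
static void set_program_name(int argc, char **argv)
{
    if (argc > 0 && argv[0] != NULL && argv[0][0] != '\0')
        argv0 = argv[0];
}

/* Hypothetical helper: builds "<program>: <what>: <reason>" in buf,
 * where <reason> is the strerror() text for the error number err. */
static int format_error(char *buf, size_t len, const char *what, int err)
{
    return snprintf(buf, len, "%s: %s: %s", argv0, what, strerror(err));
}
```

    A caller would save errno right after the failing call, pass it as
    err, and write the buffer to stderr before exit(EXIT_FAILURE).
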

    >
    > /* Get the input */
    > slurp(pfile,p);


    fclose(pfile);

    > /* Do stuff with the input */
    > run();
    >
    > Don't forget to
    > return 0;
    > at the end of main().
    >
    > }


    [snip]
    > /* From Harry's Hexed Interpreter */
    > void slurp(FILE *infile, char *out)
    > /* Precondition: *infile exists, *out is the program text
    > * Postcondition: There is none, you fool!
    > */
    > {
    > int c;
    > size_t i=0;
    > *out='\0';
    > while((c=fgetc(infile))!=EOF)
    > {
    > out[i++]=(char)c; out[i]='\0';
    > }
    > }


    This would be a lot more efficient if you used fgets.
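
    fgets reads one line per call. Since getsize() has already counted the
    bytes, a single fread() call is another option. This is a sketch of
    that swapped-in technique, not the fgets version suggested above; the
    name slurp_block and its extra size parameter are assumptions:

```c
#include <stdio.h>

/* Hypothetical replacement for slurp(): reads up to size bytes in one
 * call and terminates the buffer. out must have room for size + 1 bytes.
 * Returns the number of bytes actually read. */
static size_t slurp_block(FILE *in, char *out, size_t size)
{
    size_t n = fread(out, 1, size, in);
    out[n] = '\0';
    return n;
}
```

    In the posted program the call would look like
    slurp_block(pfile, p, size), with p allocated as size + 1 bytes.
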

    It would be useful if you provided a grammar for your language so people could
    see how to program in it as well as check your interpreter for correctness.

    /david

    --
    FORTRAN was the language of choice
    for the same reason that three-legged races are popular.
    -- Ken Thompson, "Reflections on Trusting Trust"
    David Rubin, Jul 2, 2003
    #4
