Obfuscated Languages Interpreter

G. Nick D'Andrea

I've just written an interpreter for an obfuscated language of my own
design, which I call "tarfu." The language itself has no way of storing
information except within the code itself, meaning there are no variables.
Here's the source to the interpreter, I figured I'd post it here (because I
couldn't find a newsgroup devoted to obfuscated languages) and see if you
had any comments or questions. The interpreter is fairly straightforward
and therefore I don't think it's necessary to include documentation about
the language.
 
Arthur J. O'Dwyer

I've just written an interpreter for an obfuscated language of my own
design, which I call "tarfu." The language itself has no way of storing
information except within the code itself, meaning there are no variables.
Here's the source to the interpreter, I figured I'd post it here (because I
couldn't find a newsgroup devoted to obfuscated languages) and see if you
had any comments or questions.

comp.lang.misc is the usual place for posting random language
announcements. If you post to c.l.c you're going to get critiques
of your C coding style, which maybe is what you want.

Also, in either place you're going to get the admonishment not to try
attaching files to your message. If it's a binary, post to one of
the binaries groups. If it's a short text file, put it in the body
of your message. If it's large, put it online and give a link to it in
your message.

I've opened your attachment anyway, but some people may not see it at all.
So here it is, with my comments.

==begin file tarfu.c==

/* tarfu interpreter
* (c) 2003 by Harry Altman and G. Nick D'Andrea
* Provided under the terms of
* the GNU General Public License
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RIGHT 1 /* Will we ever actually use these? */
#define LEFT -1

unsigned char *p;
size_t ip=0;
size_t sp=0;
size_t size;
int dir=1;
FILE *afile=NULL;

(My comment: You really should try to give global variables
expressive names. 'size' is NOT a good name to give to a global.
At least make them static, so they're not overly polluting
if you ever expand the program to more than one file.)
(snip)

int main(int argc, char **argv)
{
FILE *pfile;

(Oh dear, another 8-character tabber. Either that, or you didn't
detab your program before posting. That's ugly.)

int i;
if(argc==1)
{
puts("tarfu Interpreter.\n"
"Usage: tarfu [OPTIONS] FILE\n"
"Use \"tarfu -h\" for help");
exit(1);

(Non-standard return code.)

}
for(i=1;i<argc;i++)
{
if(!strcmp(argv[i],"-h"))

As a style note, it might be more user-friendly to allow
"-H" to be a help screen, too.

{
puts("tarfu Interpreter.\n"
"Usage: tarfu [OPTIONS] FILE\n"
"FILE: Script file\n"
"OPTIONS:\n"
"-a\tInput file for the script");
exit(0);
}
else if(!strcmp(argv[i],"-a"))
{
if(!(afile=fopen(argv[++i],"r")))

What if i==argc-1? Then you're passing NULL to fopen.
That's not particularly good design, even if it is defined
to do something sensible (which I'm not going to bother looking
up).
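
A minimal sketch of the check being asked for here: see whether another
argument exists before consuming it. The helper name option_value is made up
for illustration and is not part of tarfu.c. As far as I know, handing a null
pointer to fopen is undefined behaviour rather than something sensible, since
the library only accepts null where a function's description says so.

```c
#include <stddef.h>

/* Hypothetical helper, not in the original tarfu.c.
 * Returns the argument following argv[*i] and advances *i past it,
 * or returns NULL and leaves *i alone when the option is the last
 * thing on the command line. */
const char *option_value(int argc, char **argv, int *i)
{
    if (*i + 1 >= argc)
        return NULL;        /* "-a" with nothing after it */
    *i += 1;
    return argv[*i];
}
```

In the "-a" branch this would be called before fopen, printing the usage
message and exiting if it returns NULL.
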
{
fprintf(stderr, "File \"%s\" Not Found\n", argv[i]);
exit(1);
}
}
else
{
if(!(pfile=fopen(argv[i],"r")))
{
fprintf(stderr, "File \"%s\" Not Found\n", argv[i]);
exit(1);
}
}
}
if (afile==NULL) afile=stdin;
if(!(pfile))
{
fputs("Please specify an input file", stderr);
}
if(!(p=(char*)malloc((size=getsize(pfile))+1)))

Casting malloc is only required in C++, and never good style.
There's no reason to combine the assignments, either.

size = getsize(pfile);
p = malloc(size+1);
if (p == NULL) ...

See how much simpler that looks?

{
fputs("Not enough memory.\n",stderr);
exit(1);
}

/* Get the input */
slurp(pfile,p);

/* Do stuff with the input */
run();

Don't forget to
return 0;
at the end of main().

}

size_t getsize(FILE *f)
/* Precondition: FILE *f exists
* Postcondition: Returns the size of file *f
*/
{
size_t i=0;
while(getc(f)!=EOF)
i++;
rewind(f);
return i;
}

/* From Harry's Hexed Interpreter */
void slurp(FILE *infile, char *out)
/* Precondition: *infile exists, *out is the program text
* Postcondition: There is none, you fool!
*/
{
int c;
size_t i=0;
*out='\0';
while((c=fgetc(infile))!=EOF)
{
out[i++]=(char)c; out[i]='\0';
}
}

(The non-I/O part of this function can be sped up by a factor of two.

void slurp(FILE *in, char *out)
{
int c;
while ((c = getc(in)) != EOF)
*out++ = (char) c;
*out = '\0';
return;
}

Why copy other people's code if it's so bad? Find code that
does what you want *quickly*.)

/* End code from Harry's Hexed Interpreter */

void run(void)
/* Precondition: Why don't you figure this out for yourself
* Postcondition: The script is executed
*/
{
while(1)
{
pre();
doStuff();
ip+=dir;
}
}

void pre(void)
/* Precondition: None
* Postcondition: Does the preprocessing for the iteration
*/
{
size_t i;
int dir2=(dir==RIGHT);
for(i=(dir2 ? 0 : size-1) ; i<size ; i+=dir) /* i is a size_t, and therefore unsigned */
{
if(p[i]=='*')
{
if(i+dir==size || i+dir ==-1)
{
break; /*continue would also work here*/
}
if(p[i-dir]=='*')
{
continue;
}

(It is now obvious that you are not even trying to write clearly.
I'll make, say, two more comments and then stop reading.)

if('0' <= p[i+dir] && p[i+dir] <= '9')
{
p[i+dir]-='0';
if(!dir2) swap(p+i,p+i-1);
if(sp>= i+dir2) sp--;
if(ip>= i+dir2) ip--;
memmove(p+i-!dir2,p+i+dir2,size-i+!dir2); /*Nick is more confused.*/
(One.)
}
else if('a' <= p[i+dir] && p[i+dir] <= 'a'+30)
{
p[i+dir] -= ('a'-10);
if(!dir2) swap(p+i,p+i-1);
if(sp>= i+dir2) sp--;
if(ip>= i+dir2) ip--;
memmove(p+i-!dir2,p+i+dir2,size-i+!dir2); /*Nick is most confused.*/
(Two.)

-Arthur
 
Arthur J. O'Dwyer

Gees Arthur, you must be bored to plough through that lot... Even worse than
I can produce :)

Well, he asked for it. ;) Besides, I'm a fan of obfuscated languages,
in general, so it caught my eye. OTOH, I *do* like to see some actual
indication of what a program is supposed to do. It's fun to write
intentionally obfuscated stuff; it's usually not fun to try to read it.
So when I saw memmove() being scattered randomly around the code, I
just stopped reading it. :)

-Arthur
 
David Rubin

Arthur J. O'Dwyer wrote:

[snip]
int main(int argc, char **argv)
{
FILE *pfile;

(Oh dear, another 8-character tabber. Either that, or you didn't
detab your program before posting. That's ugly.)

int i;
if(argc==1)
{
puts("tarfu Interpreter.\n"
"Usage: tarfu [OPTIONS] FILE\n"
"Use \"tarfu -h\" for help");
exit(1);

(Non-standard return code.)

}
for(i=1;i<argc;i++)
{
if(!strcmp(argv[i],"-h"))

As a style note, it might be more user-friendly to allow
"-H" to be a help screen, too.


IMO, you should print the usage whenever you get an *unrecognized* option. This
allows people to type -h, -H, -?, and whatever else seems natural to them. If
they are all unrecognized by your program, they all trigger the usage message.
{
puts("tarfu Interpreter.\n"
"Usage: tarfu [OPTIONS] FILE\n"
"FILE: Script file\n"
"OPTIONS:\n"
"-a\tInput file for the script");
exit(0);
}
else if(!strcmp(argv[i],"-a"))


This way of parsing command line options is a little clunky because it uses a
lot of code, doesn't allow you to combine options (e.g., -qvt, -qtv, -vtq, etc),
and is not easily maintainable.

[snip - malloc error]
{
fputs("Not enough memory.\n",stderr);
exit(1);
}

consider

perror("tarfu");
exit(EXIT_FAILURE);

perror will print something like

tarfu: out of memory

Even better would be to assign

argv0 = argv[0];

in the beginning of your program and use it throughout to refer to the program
name. That way, the errors reflect the name the user gives to the program, not
the one you choose.
/* Get the input */
slurp(pfile,p);

fclose(pfile);
/* Do stuff with the input */
run();

Don't forget to
return 0;
at the end of main().

}
[snip]
/* From Harry's Hexed Interpreter */
void slurp(FILE *infile, char *out)
/* Precondition: *infile exists, *out is the program text
* Postcondition: There is none, you fool!
*/
{
int c;
size_t i=0;
*out='\0';
while((c=fgetc(infile))!=EOF)
{
out[i++]=(char)c; out[i]='\0';
}
}


This would be a lot more efficient if you used fgets.
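
As a sketch of the block-read idea, here is a version that swaps in fread
rather than fgets, since fgets stops at every newline and would need a loop
anyway. The name slurp_block and the cap parameter are assumptions for the
example, not part of the original program.

```c
#include <stdio.h>

/* Reads at most cap-1 bytes from in into out and NUL-terminates.
 * Returns the number of bytes stored (not counting the terminator). */
size_t slurp_block(FILE *in, char *out, size_t cap)
{
    size_t n;

    if (cap == 0)
        return 0;           /* no room even for the terminator */
    n = fread(out, 1, cap - 1, in);
    out[n] = '\0';
    return n;
}
```

With the buffer allocated as in main(), the call would look something like
slurp_block(pfile, (char *)p, size + 1).
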

It would be useful if you provided a grammar for your language so people could
see how to program in it as well as check your interpreter for correctness.

/david
 
