compiling to python byte codes

Jeremy Bowers

I really think you should study programming language implementations
for some time before approaching your problem.

With respect Maurice, I think I have to agree on this.

In your situation, I can promise you that it is *faster* to take the time
to learn about this stuff correctly than to try to power through it
without learning; it is one of those places in computer technology
where there are such powerful tools to help you that it is better to
learn how to use them than to kludge through. Unfortunately, it is too
large a topic to cover in a Usenet posting.

If your institution offers a compilers class (a sadly diminishing number do),
try to take or audit that. (You most likely don't want a *compiler*, but
an *interpreter*; the course will explain the difference. An interpreter
typically uses much the same technology to implement, parsers and abstract
syntax trees and such, but is usually much easier to implement.) (I think
you hinted this was a thesis project, hence this suggestion.) Failing that,
you may need a compilers book and some self-study time. Again, I promise
you this is faster almost immediately than trying to power through this
without it.

If you are responsible for creating the opcodes directly, you may find a
better way to do what you are doing anyhow. Assembly is easy to implement
but (with apologies to the more experienced among us) sucks to program in.

Stepping up a level, are you sure you can't just implement a C or Python
library and let people write their own programs in Python? You'll never be
able to match Python-the-language's feature set.
 
Maurice LING

In your situation, I can promise you that it is *faster* to take the time
to learn about this stuff correctly than to try to power through it
without learning; it is one of those places in computer technology
where there are such powerful tools to help you that it is better to
learn how to use them than to kludge through. Unfortunately, it is too
large a topic to cover in a Usenet posting.

I realised that there are powerful tools such as lex and yacc around
that can save me a lot of time. I'll be using PLY for my purpose.

If your institution offers a compilers class (a sadly diminishing number),
try to take or audit that. (You most likely don't want a *compiler*, but
an *interpreter*; the course will explain the difference. An interpreter
typically uses much the same technology to implement, parsers and abstract
syntax trees and such, but is usually much easier to implement.) (I think
you hinted this was a thesis project, hence this suggestion.) Failing that,
you may need a compilers book and some self-study time. Again, I promise
you this is faster almost immediately than trying to power through this
without it.

I can only have the self-study options and good books on compiler
construction are rare. I am a molecular biologist by professional
training. There are things that are tough for me to understand and to
just find the answer about stacks vs register computers will take ages,
and I always appreciate people who do not treat me as an idiot. I'm sure
there are much more idiotic questions being asked in newsgroups.

Stepping up a level, are you sure you can't just implement a C or Python
library and let people write their own programs in Python? You'll never be
able to match Python-the-language's feature set.

What I'm doing is a special-purpose language (for modelling purposes).
 
Martin v. Löwis

Maurice said:
I can only have the self-study options and good books on compiler
construction are rare. I am a molecular biologist by professional
training. There are things that are tough for me to understand and to
just find the answer about stacks vs register computers will take ages,
and I always appreciate people who do not treat me as an idiot. I'm sure
there are much more idiotic questions being asked in newsgroups.

I certainly did not mean to declare you an idiot. Instead, I tried to
point out that this is a complex topic, one where a Usenet thread can
hardly give sufficient introduction. Instead, in such threads, posters
typically assume common background, with respect to grammars, syntax,
abstract syntax, intermediate representation (using trees or opcodes),
interpretation vs. compilation, and so on.

Regards,
Martin
 
Maurice LING

Probably my question should be phrased as: given that x86/PPC processors
are register-based (even more than a decade after the publication
of the book "Stack Computers: The New Wave") and there aren't many
examples of stack-based processors, why is there a difference? It seems
weird to me that if stack-based machines (physical processors or VMs)
are so good, why hasn't processor engineering caught up?

You've totally missed my question but thanks anyway, I've learnt. My
actual question had been partially answered.

Thanks
 
Martin v. Löwis

Maurice said:
Probably my question should be phrased as: given that x86/PPC processors
are register-based (even more than a decade after the publication
of the book "Stack Computers: The New Wave") and there aren't many
examples of stack-based processors, why is there a difference? It seems
weird to me that if stack-based machines (physical processors or VMs)
are so good, why hasn't processor engineering caught up?

Stack-based microprocessors would be very inefficient. If you don't
have registers, every operation will need to access the stack, which
is an access to main memory, which is expensive. The counter-argument
of interpreters against registers (difficult to decode opcodes, long
opcodes) does not hold for microprocessors, as they can decode the
instruction in parallel with doing other things (which an interpreter
couldn't).

For interpreters, the same rationale does not hold - even registers
would live in main memory, so there would be no performance gained.

Virtual machines are quite different from real machines, in many
respects.

Regards,
Martin
 
Jeremy Bowers

I realised that there are powerful tools such as lex and yacc around
that can save me a lot of time. I'll be using PLY for my purpose.

That helps.

I can only have the self-study options and good books on compiler
construction are rare. I am a molecular biologist by professional
training. There are things that are tough for me to understand and to
just find the answer about stacks vs register computers will take ages,
and I always appreciate people who do not treat me as an idiot. I'm sure
there are much more idiotic questions being asked in newsgroups.

I'm not sure if you feel I'm treating you as an idiot, or if you mean that
literally. Regardless, it isn't my intent. It is challenging because we
end up unable to share vocabulary.

What I'm doing is a special-purpose language (for modelling purposes).

OK, makes sense.

Now that you say you are using PLY, that at least gives us a common frame
of reference with code. Take a look at the calc.py file that should have
been included with your PLY distribution.

It implements a simple calculator interpreter. It is an "interpreter"
because as it encounters the input, it is dynamically executing it. A
"compiler" would actually just store it as a tree, then later output it
into some other format without execution, which is what a C++ compiler
does, outputting the opcodes for the CPU.

Because of that extra step in the middle, "building a tree", a compiler is
typically harder to write than an interpreter. Getting the output right
can also be tricky, and a challenge to debug.

Thus, in terms of PLY, my suggestion has been to write an interpreter,
like calc.py, not a compiler. Unfortunately, like I said earlier, control
flow can be a pain, because you can't execute directly like calc.py does.

You want something else, though, that builds something and then executes
it later.
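
To make the distinction above concrete without depending on PLY, here is a minimal sketch. The tuple-based tree and the names `evaluate` and `emit_python` are hypothetical, chosen for illustration; they are not taken from calc.py. The same tree is either executed on the spot (interpreter style) or turned into output text for later (compiler style).

```python
import operator

# Hypothetical tree format: a node is either a number
# or a tuple (op, left, right).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def evaluate(node):
    """Interpreter style: walk the tree and compute the value right away."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left), evaluate(right))
    return node

def emit_python(node):
    """Compiler style: walk the same tree but only produce output text."""
    if isinstance(node, tuple):
        op, left, right = node
        return "(%s %s %s)" % (emit_python(left), op, emit_python(right))
    return repr(node)

# The tree a parser might build for the input "1 + 2 * 3".
tree = ("+", 1, ("*", 2, 3))

value = evaluate(tree)       # executed now
source = emit_python(tree)   # text that could be executed later, if at all
```

Because the tree exists before anything runs, a construct such as a loop could be represented as a node and walked as many times as needed, which direct execution during parsing cannot do.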
 
greg

Maurice said:
Probably my question should be phrased as: given that x86/PPC processors
are register-based (even more than a decade after the publication
of the book "Stack Computers: The New Wave") and there aren't many
examples of stack-based processors, why is there a difference?

If you're implementing a machine in hardware, access to
registers is much faster than access to memory. Since the
current trend in hardware design seems to be "as fast as
possible, whatever it takes", today's architectures are
increasingly register-based.

But with an interpreter, things are very different. The
"registers" of the VM probably aren't going to be real
registers, but memory locations. Even if you do manage to
keep them in real registers, the time spent accessing them
is going to be small compared to the time spent fetching
instructions, decoding them and figuring out what operands
they refer to, so the speed advantage would be quite minimal.

Given that, and the fact that stack architectures are much
easier to generate code for, it's not surprising that most
VMs tend to have stack architectures.

Greg
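
As a rough illustration of why stack code is easy to generate, here is a small sketch. The opcode names (`PUSH`, `ADD`, `SUB`, `MUL`) and the tuple-based tree are assumptions made up for this example; they are not CPython's bytecode or any real instruction set. Code generation is just a post-order walk of the expression tree.

```python
import operator

# Hypothetical binary opcodes for a toy stack machine.
BINOPS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}

def compile_expr(node, code):
    """Post-order walk: emit code for both operands, then the operator."""
    if isinstance(node, tuple):
        op, left, right = node
        compile_expr(left, code)
        compile_expr(right, code)
        code.append((op,))
    else:
        code.append(("PUSH", node))
    return code

def run(code):
    """Execute the instruction list on an operand stack."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINOPS[instr[0]](left, right))
    return stack.pop()

# Tree for 1 + 2 * 3.
tree = ("ADD", 1, ("MUL", 2, 3))
code = compile_expr(tree, [])
result = run(code)
```

The generator never has to decide where an intermediate result lives; it is always on top of the stack. A register-based target would additionally have to choose a register for each intermediate value.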
 
Maurice LING

I'm not sure if you feel I'm treating you as an idiot, or if you mean that
literally. Regardless, it isn't my intent. It is challenging because we
end up unable to share vocabulary.

Figurative speech intended here. I wish to maintain the thought that
python users are helpful. :)

It implements a simple calculator interpreter. It is an "interpreter"
because as it encounters the input, it is dynamically executing it. A
"compiler" would actually just store it as a tree, then later output it
into some other format without execution, which is what a C++ compiler
does, outputting the opcodes for the CPU.

I'm looking at generating either Python source or MA (an intermediate
representation in assembly-like form) on the fly (as the lines are being
interpreted). The design of MA (2-operand code) works pretty much like
functions themselves, in that each "opcode" can be represented by a
function in Python (or any other language, I presume) and the "operands"
are like parameters. When MA was thought of, it was meant to target
Java, but I suppose it is possible to target Python.
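
The idea of one function per opcode could be sketched roughly as follows. The thread does not show MA's actual instruction set, so the opcodes `set`, `add` and `mul`, the whitespace-separated line format, and the dictionary used as variable storage are all assumptions made for illustration.

```python
# Each hypothetical opcode is a plain Python function; the operands of
# the MA-like instruction become its parameters.
def op_set(env, dest, value):
    env[dest] = float(value)

def op_add(env, dest, src):
    env[dest] = env[dest] + env[src]

def op_mul(env, dest, src):
    env[dest] = env[dest] * env[src]

# Dispatch table: opcode name -> implementing function.
OPCODES = {"set": op_set, "add": op_add, "mul": op_mul}

def run_ma(lines):
    """Run 2-operand instructions of the form '<opcode> <dest> <src>'."""
    env = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        opcode, operands = parts[0], parts[1:]
        OPCODES[opcode](env, *operands)
    return env

program = ["set x 2", "set y 3", "mul x y", "add x y"]
env = run_ma(program)
```

With this layout, adding an opcode means writing one function and registering it in the table; the loop in `run_ma` stays unchanged.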

So I think the "MA virtual machine" is like a python library.

Because of that extra step in the middle, "building a tree", a compiler is
typically harder to write than an interpreter. Getting the output right
can also be tricky, and a challenge to debug.

I can see this coming. It may be tricky to isolate an error to either
the tree-building or the test code.
 
Martin v. Löwis

Maurice said:
Sorry if I had misunderstood your intentions. The VM is a
rather old concept, dating back to the Forth days, but it has been revived
for the sake of portability. With the exception of the book by Joshua Engel,
I've not seen any book that is devoted to VMs. Do you know of any?

One of the older ones is

Goldberg, Robson. Smalltalk-80: The Language and its Implementation.
Addison-Wesley, 1983.

A thing that just turned up in a Google search is

Christian Queinnec. Lisp in Small Pieces.
Cambridge University Press, 1996

This covers 11 interpreters and 2 compilers.

For Scheme, there is an online book

http://www.cs.utexas.edu/users/wilson/schintro/schintro_toc.html

There also is an Icon book

Ralph E. Griswold and Madge T. Griswold.
The Implementation of the Icon Programming Language
Princeton University Press, 1986

On the language-independent/cross-language side, we have

Samuel Kamin. Programming Languages: An Interpreter-Based Approach.
Addison-Wesley, 1990.

and, of course

Aho, Sethi, Ullman
Compilers : Principles, Techniques, and Tools.
Addison-Wesley, 1988 (with many reprints)

As for books on compiler construction, many explain the same topics, which
doesn't help much when what I want to know is so specific, or they
are just too old to be of much use.

It turns out that this is an area of computing that is very old
(compared to the total age of electronic computing), and many of its
foundations have been built years ago. So even the old books are still
"valid".

Now, for *specific* questions, Usenet is the right medium, although
comp.compilers may be a better forum.

Regards,
Martin
 
Jacob Hallen

I am using SBML (Systems Biology Markup Language) as a front-end
modelling language for my project. For ease of further maintenance
of the model and for interoperability, my project requires me to
convert it into an intermediate form (MA), which is somewhat assembly-like
in structure, in that each instruction takes the form <opcode>
<operand>*. Here I am, attempting to write a virtual machine that can
run MA, using Python. So it becomes an MA virtual machine running on
the Python virtual machine.

My concern is: is it simpler to convert MA to Python source code or to
Python bytecode? What are the pros and cons? Assuming that converting to
Python source code is an option, I'm thinking that the MA virtual machine can
then read an MA instruction and output the corresponding Python source
code, but are there facilities in Python to run Python code, line by
line, as it is emitted by the MA virtual machine?

The Python virtual machine is not something that is fixed. It may change
between Python versions. For this reason, it is a bad idea to generate
bytecode directly, since you may have to redo the work many times. It is
much better to have Python as the target language of your SBML compilation.

Jacob Hallén
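
On the question of running generated Python source as it is produced: one facility is the built-in `exec`, which runs a source string in a namespace dictionary. The sketch below assumes Python 3 and uses a hypothetical `translate` function and made-up MA-like opcodes (`set`, `add`), since the real MA instruction set is not shown in the thread.

```python
def translate(ma_line):
    """Hypothetical translation of one MA-like line to one Python statement."""
    opcode, dest, src = ma_line.split()
    if opcode == "set":
        return "%s = %s" % (dest, float(src))
    if opcode == "add":
        return "%s = %s + %s" % (dest, dest, src)
    raise ValueError("unknown opcode: %r" % opcode)

namespace = {}   # variables persist here between statements
generated = []   # keep the emitted source for inspection

for ma_line in ["set x 2", "set y 3", "add x y"]:
    stmt = translate(ma_line)
    generated.append(stmt)
    exec(stmt, namespace)   # run the statement as soon as it is emitted
```

Each string passed to `exec` must be a complete statement, so a multi-line construct such as a loop would have to be collected and passed as one block rather than line by line. Since `exec` runs arbitrary code, this approach is only suitable for input you trust.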
