interpreter vs. compiled

C

castironpi

I'm curious about some of the details of the internals of the Python
interpreter:

I understand C from a hardware perspective.

x= y+ 1;

Move value from place y into a register
Add 1 to the value in the register
Move the addition's result to place x

The Python disassembly is baffling though.
0 SETUP_LOOP 31037 (to 31040)
3 STORE_SLICE+3
4 <49>

What are SETUP_LOOP and STORE_SLICE? What are these instructions?
 
I

I V

The Python disassembly is baffling though.

You can't disassemble strings of python source (well, you can, but, as
you've seen, the results are not meaningful). You need to compile the
source first:
  1           0 LOAD_NAME                0 (x)
              3 LOAD_CONST               0 (1)
              6 BINARY_ADD
              7 STORE_NAME               1 (y)
             10 LOAD_CONST               1 (None)
             13 RETURN_VALUE

You may well find these byte codes more meaningful. Note that there is a
list of opcodes at http://docs.python.org/lib/bytecodes.html
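A quick way to reproduce this (a sketch using the standard dis module; exact opcode names vary across CPython versions, e.g. BINARY_ADD became BINARY_OP in 3.11):

```python
import dis

# Compile the source to a code object first; dis can then produce
# a meaningful disassembly of its bytecode.
code = compile("y = x + 1", "<example>", "exec")
dis.dis(code)
```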
 
C

castironpi

You can't disassemble strings of python source (well, you can, but, as
you've seen, the results are not meaningful). You need to compile the
source first:


  1           0 LOAD_NAME                0 (x)
              3 LOAD_CONST               0 (1)
              6 BINARY_ADD          
              7 STORE_NAME               1 (y)
             10 LOAD_CONST               1 (None)
             13 RETURN_VALUE

You may well find these byte codes more meaningful. Note that there is a
list of opcodes at http://docs.python.org/lib/bytecodes.html

Oh. How is the stack represented? Does it keep track of which stack
positions (TOS, TOS1, etc.) are in what registers? Does stack
manipulation consume processor cycles? Here is what I'm thinking:

LOAD_NAME: stack= [ x ]
reg0: x
tos: reg0
LOAD_CONST: stack= [ 1, x ]
reg0: x
reg1: 1
tos: reg1
BINARY_ADD: stack= [ x+ 1, x ]
reg0: x
reg1: x+ 1
tos: reg1
STORE_NAME: y= [ x+ 1], stack= same
reg0: x
reg1: x+ 1
tos: reg1

I may be totally off.
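For reference, the VM's stack discipline differs a little from the sketch above: BINARY_ADD pops both operands and pushes the single result, so x does not remain beneath the sum, and STORE_NAME pops the value it stores. A plain-Python trace of the same four opcodes (x = 5 is a made-up example value):

```python
x = 5
stack = []

stack.append(x)     # LOAD_NAME x   -> stack is [5]
stack.append(1)     # LOAD_CONST 1  -> stack is [5, 1]

# BINARY_ADD pops both operands, pushes the single result.
b, a = stack.pop(), stack.pop()
stack.append(a + b)                 # -> stack is [6]

y = stack.pop()     # STORE_NAME y pops the result -> stack is []

print(y, stack)     # -> 6 []
```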
 
T

Terry Reedy

castironpi said:
Oh. How is the stack represented?

As usual, as successive locations in memory.
I have the impression that CPython uses the same stack C does.
While conceptually, CPython may put objects on the stack, I am pretty
sure it actually stacks references (C pointers) to objects in heap memory.
> Does it keep track of which stack
positions (TOS, TOS1, etc.) are in what registers?

I am sure they are not in registers, just normal memory.
The C code that implements bytecodes to act on stack values will use
registers just like any other C code. So using registers for the stack
would get in the way. Of course, the C code might load pointers on the
stack into address registers when actually needed. But this depends on
the address scheme of a particular processor and how the C code is
compiled to its object code.
Does stack manipulation consume processor cycles?

Of course. For much more, you should peruse the CPython source.
 
K

Kay Schluehr

Oh. How is the stack represented?

As a pointer to a pointer of PyObject structs.
Does it keep track of which stack
positions (TOS, TOS1, etc.) are in what registers? Does stack
manipulation consume processor cycles?

Python does not store values in registers. It stores locals in arrays
and accesses them by position ( you can see the positional index in
the disassembly right after the opcode name ) and globals / object
attributes in dicts.

For more information you might just download the source distribution
and look for src/Python/ceval.c. This file contains the main
interpreter loop.
 
C

castironpi

As a pointer to a pointer of PyObject structs.


Python does not store values in registers. It stores locals in arrays
and accesses them by position ( you can see the positional index in
the disassembly right after the opcode name ) and globals / object
attributes in dicts.

For more information you might just download the source distribution
and look for src/Python/ceval.c. This file contains the main
interpreter loop.

Oh. I was interpreting, no pun, that the column of numbers to the
left indicated how many processor cycles were consumed in each
operation. It doesn't quite make sense, unless BINARY_ADD can refer
to memory outside of the registers, which I doubt on the basis that
two addresses would have to fit into a single operation, plus the
architecture opcode. Given that, what does that column indicate?

I'm intimidated by the source but I may look.
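For the record, that left-hand column is not a cycle count: it is the byte offset of each instruction within the code object's co_code string, which is why it advances by the size of each encoded instruction (3 bytes for an opcode with an argument in CPython 2; modern CPython uses fixed-width 2-byte instructions). A sketch that prints the offsets:

```python
import dis

code = compile("y = x + 1", "<example>", "exec")
for ins in dis.get_instructions(code):
    # ins.offset is the instruction's position within code.co_code
    print(ins.offset, ins.opname)
```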
 
C

castironpi

As a pointer to a pointer of PyObject structs.


Python does not store values in registers. It stores locals in arrays
and accesses them by position ( you can see the positional index in
the disassembly right after the opcode name ) and globals / object
attributes in dicts.

For more information you might just download the source distribution
and look for src/Python/ceval.c. This file contains the main
interpreter loop.

Ah, found it. The parts that are making sense are:

register PyObject **stack_pointer;
#define TOP() (stack_pointer[-1])
#define BASIC_POP() (*--stack_pointer)

...(line 1159)...
w = POP();
v = TOP();
if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
/* INLINE: int + int */
register long a, b, i;
a = PyInt_AS_LONG(v);
b = PyInt_AS_LONG(w);
i = a + b;
if ((i^a) < 0 && (i^b) < 0)
goto slow_add;
x = PyInt_FromLong(i);

... Which is more than I was picturing was involved. I understand it
is also specific to CPython. Thanks for the pointer to the code.

My basic question was, what is the difference between compilers and
interpreters, and why are interpreters slow? I'm looking at some of
the answer right now in "case BINARY_ADD:".
 
D

Dan

As a pointer to a pointer of PyObject structs.
Python does not store values in registers. It stores locals in arrays
and accesses them by position ( you can see the positional index in
the disassembly right after the opcode name ) and globals / object
attributes in dicts.
For more information you might just download the source distribution
and look for src/Python/ceval.c. This file contains the main
interpreter loop.

Ah, found it. The parts that are making sense are:

register PyObject **stack_pointer;
#define TOP() (stack_pointer[-1])
#define BASIC_POP() (*--stack_pointer)

...(line 1159)...
w = POP();
v = TOP();
if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
/* INLINE: int + int */
register long a, b, i;
a = PyInt_AS_LONG(v);
b = PyInt_AS_LONG(w);
i = a + b;
if ((i^a) < 0 && (i^b) < 0)
goto slow_add;
x = PyInt_FromLong(i);

... Which is more than I was picturing was involved. I understand it
is also specific to CPython. Thanks for the pointer to the code.

My basic question was, what is the difference between compilers and
interpreters, and why are interpreters slow? I'm looking at some of
the answer right now in "case BINARY_ADD:".

The basic difference between a (traditional) compiler and an
interpreter is that a compiler emits (assembly) code for a specific
machine. Therefore it must know the specifics of the machine (how many
registers, memory addressing modes, etc), whereas interpreters
normally define themselves by their conceptual state, that is, a
virtual machine. The instructions (bytecode) of the virtual machine
are generally more high-level than real machine instructions, and the
semantics of the bytecode are implemented by the interpreter, usually
in a sort-of high level language like C. This means the interpreter
can run without detailed knowledge of the machine as long as a C
compiler exists. However, the trade off is that the interpreter
semantics are not optimized for that machine.
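Python's own built-in compiler illustrates the split: compile() produces portable VM bytecode rather than native instructions, and the interpreter executes that bytecode against a namespace at run time (a sketch):

```python
code = compile("y = x + 1", "<example>", "exec")

# co_code is the raw VM bytecode -- portable bytes, not machine code.
print(type(code.co_code), len(code.co_code))

# The interpreter loop executes it against a namespace at run time.
ns = {"x": 41}
exec(code, ns)
print(ns["y"])  # -> 42
```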

This all gets a little more hairy when you start talking about JITs,
runtime optimizations, and the like. For a real in-depth look at the
general topic of interpretation and virtual machines, I'd recommend
Virtual Machines by Smith and Nair (ISBN: 1-55860-910-5).

-Dan
 
C

castironpi

Ah, found it.  The parts that are making sense are:
register PyObject **stack_pointer;
#define TOP()           (stack_pointer[-1])
#define BASIC_POP()     (*--stack_pointer)
...(line 1159)...
w = POP();
v = TOP();
if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
        /* INLINE: int + int */
        register long a, b, i;
        a = PyInt_AS_LONG(v);
        b = PyInt_AS_LONG(w);
        i = a + b;
        if ((i^a) < 0 && (i^b) < 0)
                goto slow_add;
        x = PyInt_FromLong(i);
... Which is more than I was picturing was involved.  I understand it
is also specific to CPython.  Thanks for the pointer to the code.
My basic question was, what is the difference between compilers and
interpreters, and why are interpreters slow?  I'm looking at some of
the answer right now in "case BINARY_ADD:".

The basic difference between a (traditional) compiler and an
interpreter is that a compiler emits (assembly) code for a specific
machine. Therefore it must know the specifics of the machine (how many
registers, memory addressing modes, etc), whereas interpreters
normally define themselves by their conceptual state, that is, a
virtual machine. The instructions (bytecode) of the virtual machine
are generally more high-level than real machine instructions, and the
semantics of the bytecode are implemented by the interpreter, usually
in a sort-of high level language like C. This means the interpreter
can run without detailed knowledge of the machine as long as a C
compiler exists. However, the trade off is that the interpreter
semantics are not optimized for that machine.

This all gets a little more hairy when you start talking about JITs,
runtime optimizations, and the like. For a real in-depth look at the
general topic of interpretation and virtual machines, I'd recommend
Virtual Machines by Smith and Nair (ISBN: 1-55860-910-5).

-Dan

You're saying the VM can't compile code. That makes sense, it's not a
compiler. Do I understand correctly that JIT does compile to native
code in some cases?

Python: x= y+ 1
Python VM: push, push, add, store
Assembly: load, load, add, store

Except, the assembly doesn't contain the type-checking that
PyInt_AS_LONG does. But that's not the only thing that stops python
from precompiling to assembly directly. GNU doesn't come with
Python. What sorts of minimal information would be necessary to take
from the GNU libs for the user's specific processor, (the one they're
downloading their version of Python for), to move Python to the
further step of outputting the machine code?
 
T

Tim Roberts

castironpi said:
You're saying the VM can't compile code. That makes sense, it's not a
compiler.

I wouldn't say "can't". The current CPython VM does not compile code. It
COULD. The C#/.NET VM does. IronPython, for example, is an implementation
of Python that uses .NET. In that case, the code *IS* JIT compiled to
assembly when the program starts.
Do I understand correctly that JIT does compile to native
code in some cases?

VMs that use JIT do, yes.
But that's not the only thing that stops python
from precompiling to assembly directly. GNU doesn't come with
Python.

Do you mean Linux?
What sorts of minimal information would be necessary to take
from the GNU libs for the user's specific processor, (the one they're
downloading their version of Python for), to move Python to the
further step of outputting the machine code?

I don't know why you think GNU has anything to do with this. There's
nothing that prevents the Python run-time from JIT compiling the code.
IronPython does this. CPython does not. It's an implementation decision.
 
C

castironpi

I wouldn't say "can't".  The current CPython VM does not compile code.  It
COULD.  The C#/.NET VM does.  IronPython, for example, is an implementation
of Python that uses .NET.  In that case, the code *IS* JIT compiled to
assembly when the program starts.


VMs that use JIT do, yes.


Do you mean Linux?


I don't know why you think GNU has anything to do with this.  There's
nothing that prevents the Python run-time from JIT compiling the code.
IronPython does this.  CPython does not.  It's an implementation decision.

Compiling a program is different than running it. A JIT compiler is a
kind of compiler and it makes a compilation step. I am saying that
Python is not a compiler and in order to implement JIT, it would have
to change that fact.
of Python that uses .NET. In that case, the code *IS* JIT compiled to
assembly when the program starts.

But still not the user's code, only the interpreter, which is running
in assembly already anyway in CPython.
 
M

Martin v. Löwis

Oh. How is the stack represented?
As usual, as successive locations in memory.
I have the impression that CPython uses the same stack C does.

Actually, it doesn't (at least not for the evaluation stack).

In CPython, when a Python function starts, the maximum depth of the
evaluation stack is known, but it depends on the specific function
(of course). So Python needs to allocate an array for the evaluation
stack with known size, but can't do so on the C stack (at least not
portably), since you can't allocate a dynamically-sized array as
a local variable in C.

So instead, pymalloc is used to allocate the evaluation stack, and
it is part of the frame object (so the entire frame object is allocated
in one chunk, and then split up into local variables and evaluation
stack).
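The per-function maximum depth is exposed on code objects as co_stacksize (a sketch; the exact numbers vary by CPython version):

```python
def shallow():
    return 1

def deeper(a, b, c):
    return (a + b) * (b + c) - (a + c)  # holds more operands at once

# co_stacksize is the precomputed maximum evaluation-stack depth,
# known at compile time, per function.
print(shallow.__code__.co_stacksize, deeper.__code__.co_stacksize)
```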
While conceptually, CPython may put objects on the stack, I am pretty
sure it actually stacks references (C pointers) to objects in heap memory.
Correct.


I am sure they are not in registers, just normal memory.

Correct. As discussed above, they are located on the heap (making
Python's frame stack a spaghetti stack).

Regards,
Martin
 
F

Fredrik Lundh

castironpi said:
Compiling a program is different than running it. A JIT compiler is a
kind of compiler and it makes a compilation step. I am saying that
Python is not a compiler and in order to implement JIT, it would have
to change that fact.

good thing Python doesn't have to listen to you, then, so it can keep
using its built-in compiler to produce programs for its built-in VM.
> But still not the user's code, only the interpreter, which is running
> in assembly already anyway in CPython.

good luck with your future career in computing!

</F>
 
C

castironpi

Actually, it doesn't (at least not for the evaluation stack).

In CPython, when a Python function starts, the maximum depth of the
evaluation stack is known, but it depends on the specific function
(of course). So Python needs to allocate an array for the evaluation
stack with known size, but can't do so on the C stack (at least not
portably), since you can't allocate a dynamically-sized array as
a local variable in C.

So instead, pymalloc is used to allocate the evaluation stack, and
it is part of the frame object (so the entire frame object is allocated
in one chunk, and then split up into local variables and evaluation
stack).



Correct. As discussed above, they are located on the heap (making
Python's frame stack a spaghetti stack).

Regards,
Martin

Martin,

I am curious and pursuing it as an independent project. I'd like to
write a specialized function to allocate memory from a memory-mapped
file instead of the heap. On Windows, to use CreateFileMapping and
MapViewOfFile. The companion function, premalloc, would re-open an
existing Python object from a handle. (It would need a name or index -
offset look-up.)

svn.python.org is down, so I can't tell if Python already implements
its own memory management, and if so how that would extrapolate to a
byte-array allocated specifically.
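A minimal sketch of the file-backed-buffer half of that idea, using the standard mmap module (which wraps CreateFileMapping/MapViewOfFile on Windows); re-opening live Python objects from such a region is the genuinely hard part and is not shown:

```python
import mmap, os, tempfile

# Create a file and map it into memory; on Windows this uses
# CreateFileMapping/MapViewOfFile under the hood.
fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * 4096)

with mmap.mmap(fd, 4096) as buf:
    buf[0:5] = b"hello"        # store bytes in the mapped region
    print(bytes(buf[0:5]))     # -> b'hello'

os.close(fd)
os.unlink(path)
```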
 
T

Tim Roberts

castironpi said:
Compiling a program is different than running it. A JIT compiler is a
kind of compiler and it makes a compilation step. I am saying that
Python is not a compiler and in order to implement JIT, it would have
to change that fact.

And I'm saying you are wrong. There is NOTHING inherent in Python that
dictates that it be either compiled or interpreted. That is simply an
implementation decision. The CPython implementation happens to interpret.
The IronPython implementation compiles the intermediate language to native
machine language.
But still not the user's code, only the interpreter, which is running
in assembly already anyway in CPython.

In CPython, yes. In IronPython, no; the user's code is compiled into
machine language. Both of them are "Python".
 
B

Bob Martin

in 75186 20080725 050433 Tim Roberts said:
And I'm saying you are wrong. There is NOTHING inherent in Python that
dictates that it be either compiled or interpreted. That is simply an
implementation decision. The CPython implementation happens to interpret.
The IronPython implementation compiles the intermediate language to native
machine language.


In CPython, yes. In IronPython, no; the user's code is compiled into
machine language. Both of them are "Python".

It's amazing how many people cannot differentiate between language and implementation.
How many times have I read "x is an interpreted language"?
I know many languages are designed for either compilation or interpretation, but I have
used C and Pascal interpreters as well as Java and Rexx compilers.
 
J

John Nagle

Tim said:
And I'm saying you are wrong. There is NOTHING inherent in Python that
dictates that it be either compiled or interpreted. That is simply an
implementation decision. The CPython implementation happens to interpret.
The IronPython implementation compiles the intermediate language to native
machine language.

Well, actually there are some Python language features which make
static compilation to machine code difficult. PyPy and Shed Skin
have to impose some restrictions on dynamism to make efficient
compilation feasible. The big problem is "hidden dynamism", where the code
looks static, but at run time, some external piece of code replaces a
function or adds an unexpected attribute to what looked like a simple object
or function in the defining module.

In CPython, everything is a general object internally, and all the
names are resolved over and over again at run time by dictionary lookup.
This is simple, but there's a sizable speed penalty.

John Nagle
 
C

castironpi

And I'm saying you are wrong.  There is NOTHING inherent in Python that
dictates that it be either compiled or interpreted.  That is simply an
implementation decision.  The CPython implementation happens to interpret.
The IronPython implementation compiles the intermediate language to native
machine language.



In CPython, yes.  In IronPython, no; the user's code is compiled into
machine language.  Both of them are "Python".

In CPython yes. In IronPython yes: the parts that are compiled into
machine code are the interpreter, *not user's code*. Without that
step, the interpreter would be running on an interpreter, but that
doesn't get the user's statement 'a= b+ 1' into registers-- it gets
'push, push, add, pop' into registers.
 
F

Fuzzyman

In CPython yes.  In IronPython yes:  the parts that are compiled into
machine code are the interpreter, *not user's code*.  Without that
step, the interpreter would be running on an interpreter, but that
doesn't get the user's statement 'a= b+ 1' into registers-- it gets
'push, push, add, pop' into registers.

Well - in IronPython user code gets compiled to in memory assemblies
which can be JIT'ed.

Michael Foord
 
