Python Virtual Machine Reference

Kevin Albrecht

Fellow Pythonites,

I am trying to find a reference on the design and structure
of the Python Virtual Machine, but I can't seem to find one
anywhere. In particular, I am looking for a list of all the
instructions in the Python VM's "assembly language".

Thanks in advance,
Kevin Albrecht
 
Terry Reedy

Kevin Albrecht said:
Fellow Pythonites,

I am trying to find a reference on the design and structure
of the Python Virtual Machine, but I can't seem to find one
anywhere. In particular, I am looking for a list of all the
instructions in the Python VM's "assembly language".

See Library Reference section 18.10.1, "Python Byte Code Instructions"
(the documentation for the dis module). For more, see the source code
(ceval.c, I believe, has the main VM loop). Both are subject to change
with each release.
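
For a quick look at the instruction list from inside the interpreter, the dis module exposes its opcode tables. A minimal sketch, assuming a current Python 3; the helper name list_opcodes is only for illustration, and the list it returns will differ between releases:

```python
import dis


def list_opcodes():
    """Return (number, name) pairs for every opcode this interpreter defines."""
    # dis.opmap maps instruction names to their numeric opcodes.
    return sorted((code, name) for name, code in dis.opmap.items())


for code, name in list_opcodes():
    print(f"{code:3d} {name}")
```

Because the table is read from the running interpreter, it always matches the installed release, which the printed documentation may not.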

Terry J. Reedy
 
Dave Kuhlman

Kevin said:
Fellow Pythonites,

I am trying to find a reference on the design and structure
of the Python Virtual Machine, but I can't seem to find one
anywhere. In particular, I am looking for a list of all the
instructions in the Python VM's "assembly language".

The byte codes are defined in the Python Library Reference, sect.
"18.10.1 Python Byte Code Instructions":

http://www.python.org/doc/current/lib/bytecodes.html

The dis module lets you see what byte codes are generated for a
given piece of code. Example:

['EXTENDED_ARG', 'HAVE_ARGUMENT', '__all__', '__builtins__',
'__doc__', '__file__', '__name__', '_test', 'cmp_op', 'dis',
'disassemble', 'disassemble_string', 'disco', 'distb',
'findlabels', 'hascompare', 'hasconst', 'hasfree', 'hasjabs',
'hasjrel', 'haslocal', 'hasname', 'opmap', 'opname', 'sys',
'types']

  5           0 LOAD_CONST               1 (2)
              3 STORE_FAST               0 (count)

  6           6 SETUP_LOOP              22 (to 31)
              9 LOAD_GLOBAL              1 (sys)
             12 LOAD_ATTR                2 (path)
             15 GET_ITER
             19 STORE_FAST               1 (path)

  7          22 LOAD_FAST                1 (path)
             25 PRINT_ITEM
             26 PRINT_NEWLINE
             27 JUMP_ABSOLUTE           16
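
A listing like the one above comes from calling dis.dis on a function. The sketch below is a guess at a function of that shape: the name show_paths and its body are assumptions, written for Python 3, where print is a function. The exact opcodes and offsets vary by release, so no output is shown here:

```python
import dis
import sys


def show_paths():
    count = 2
    for path in sys.path:
        print(path)


# Print the human-readable disassembly to stdout.
dis.dis(show_paths)

# Or collect just the instruction names for programmatic use.
opnames = [instr.opname for instr in dis.get_instructions(show_paths)]
```

dis.get_instructions is handy when you want to inspect the instructions in code rather than read the printed table.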

The interpreter itself is implemented in
Python-2.3.3/Python/ceval.c in the Python source code distribution.
Search for "Interpreter main loop" and "Main switch on opcode". I
suppose you could use the source as a definition.

The byte code is generated in Python-2.3.3/Python/compile.c.

You may also want to look at the Python Library Reference, Sect.
"19. Python compiler package":

http://www.python.org/doc/current/lib/lib.html

But, I don't understand what the connection is between the AST
(abstract syntax tree) and the byte code interpreter's op-codes.
Maybe someone else can explain that. Perhaps the real Python
compiler does not use the AST.

Dave
 
Jeff Epler

Dave Kuhlman said:
But, I don't understand what the connection is between the AST
(abstract syntax tree) and the byte code interpreter's op-codes.
Maybe someone else can explain that. Perhaps the real Python
compiler does not use the AST.

The parser transforms the string of tokens into an AST, and then
generates bytecode by traversing the AST. After bytecode generation,
the AST is discarded.

The numeric values for nodes in the AST are subject to change between
releases. The numbers for the AST nodes for tokens are chosen manually,
and the numbers for productions (stmt, expr_stmt, atom, etc) are a
consequence of the order of appearance in the Grammar file.
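
To see the numbers in question on the token side, the standard token module publishes them. A small sketch, assuming a current Python 3 (the production numbers were exposed through the symbol module, which newer releases no longer ship); treat whatever values it prints as release-specific:

```python
import token

# token.tok_name maps each numeric token value back to its name;
# invert it to look numbers up by name.
token_numbers = {name: number for number, name in token.tok_name.items()}

for name in ("NAME", "NUMBER", "STRING"):
    print(name, token_numbers[name])
```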

Like the bytecode, the AST is really an implementation detail of
CPython.

Jeff
 
Michael Hudson

Jeff Epler said:
The parser transforms the string of tokens into an AST, and then
generates bytecode by traversing the AST. After bytecode generation,
the AST is discarded.

Which AST are we talking about here?

Python's built-in parser produces an exceedingly concrete syntax tree
(ECST? :). This is the input to the built-in compiler
(Python/compile.c). The Lib/compiler package translates the ECST into
a more genuine AST, and then walks over *that* to generate bytecode.

(The docs for the built-in parser module occasionally refer to the ECST
as an AST, but that's basically a lie).

What happens on the ast-branch (the "new compiler" branch), I'm not so
sure about. An AST (AFAIK different from Lib/compiler's) is involved
somewhere, but I'm not sure how it's produced.

Cheers,
mwh
 
logistix at cathoderaymission.net

Michael Hudson said:
What happens on the ast-branch (the "new compiler" branch), I'm not so
sure about. An AST (AFAIK different from Lib/compiler's) is involved
somewhere, but I'm not sure how it's produced.

It uses the existing mechanism to generate the parse tree, and then
transforms it into a real AST prior to bytecode generation. Writing a
new parser-generator has been deferred at this point.
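
In later CPython releases the AST step is exposed through the ast module, so the source-to-AST-to-bytecode pipeline can be driven by hand. A rough sketch under that assumption; the source string and the "<example>" filename are made up for illustration:

```python
import ast

source = "total = 1 + 2\n"

# Step 1: source text -> AST.
tree = ast.parse(source, filename="<example>")
print(ast.dump(tree))

# Step 2: AST -> code object holding the bytecode.
code = compile(tree, filename="<example>", mode="exec")

# Step 3: run the bytecode in a fresh namespace.
namespace = {}
exec(code, namespace)
print(namespace["total"])
```

Passing the tree (rather than the source string) to compile makes the intermediate AST explicit, which is the stage the thread is asking about.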
 
