Understanding Python's interpreter

  • Thread starter Gabriel Genellina

Gabriel Genellina

Rafael said:
I'm studying compilers now at my university and I can't quite
understand one thing about the Python interpreter. Why is its input a
binary file (pyc)? The LOAD_CONST opcode is 100 (dec) and STORE_FAST's
is 125 (dec). The translation of the following code:

foo.py:
x = 10

Could be this:

foo.pyc:
100 10
125 0

That way you wouldn't need code such as
static void
w_long(long x, WFILE *p)
{
    w_byte((char)( x      & 0xff), p);
    w_byte((char)((x>> 8) & 0xff), p);
    w_byte((char)((x>>16) & 0xff), p);
    w_byte((char)((x>>24) & 0xff), p);
}
since you could read it with strtol and write it back using a simple
printf. So you wouldn't need a bunch of casts, simply by using ASCII
input and output.

A "simple" printf or strtol involves hundreds of instructions. And
people want Python to execute fast...
What's the reason for having it in a binary form instead of writing
it in ASCII? (Yeah, I know ASCII would be binary also, but I think you
get my point.)

Speed? Efficiency? File size? Ease of use?
A .pyc *could* be written in ASCII, but what do you gain? Replacing a
few trivial functions in the Python core with a printf/scanf equivalent?
At the same time you lose a lot of speed, so I don't see the point.
So I was writing this interpreter for some assembly language defined
in class and I did something like Python does, so I had a binary file
to interpret. After a while I thought it was far harder to program.

Why harder? Once you read the file, they're just numbers. Anyway, being
harder to program the *interpreter* is not a problem, if you gain
something like speed or efficiency for the interpreted language.
And when I tried to code an assembler my problems got greater, as I
wanted to code it in Python (the interpreter was in C++) and I had a
hard time trying to figure out how I would print something that's not an
ASCII or Unicode string. As for the benefits, I couldn't figure out any.

Sorry, I could not understand what you said here.
 

Rafael Almeida

Hello,

I'm studying compilers now at my university and I can't quite
understand one thing about the Python interpreter. Why is its input a
binary file (pyc)? The LOAD_CONST opcode is 100 (dec) and STORE_FAST's
is 125 (dec). The translation of the following code:

foo.py:
x = 10

Could be this:

foo.pyc:
100 10
125 0

That way you wouldn't need code such as
static void
w_long(long x, WFILE *p)
{
    w_byte((char)( x      & 0xff), p);
    w_byte((char)((x>> 8) & 0xff), p);
    w_byte((char)((x>>16) & 0xff), p);
    w_byte((char)((x>>24) & 0xff), p);
}
since you could read it with strtol and write it back using a simple
printf. So you wouldn't need a bunch of casts, simply by using ASCII
input and output.

What's the reason for having it in a binary form instead of writing
it in ASCII? (Yeah, I know ASCII would be binary also, but I think you
get my point.)

So I was writing this interpreter for some assembly language defined
in class and I did something like Python does, so I had a binary file
to interpret. After a while I thought it was far harder to program.
And when I tried to code an assembler my problems got greater, as I
wanted to code it in Python (the interpreter was in C++) and I had a
hard time trying to figure out how I would print something that's not an
ASCII or Unicode string. As for the benefits, I couldn't figure out any.

I hope I'm not too off-topic here, but I thought this was probably the
best newsgroup to ask this in. If someone thinks there's another
newsgroup that's more appropriate, please tell me.

[]'s
Rafael
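
For comparison with the hypothetical foo.pyc listing in the post above, here is a small sketch that asks CPython itself which instructions it generates for "x = 10". It relies only on the standard dis module; the file name "foo.py" passed to compile() is just a label. Opcode numbers and even opcode names differ between CPython versions, so no expected output is shown. Two things are worth noting when reading the result: at module level the store is done with STORE_NAME (STORE_FAST is used for local variables inside functions), and in the versions where LOAD_CONST is emitted, its operand is an index into the code object's constants table rather than the literal value 10.

```python
import dis

# Compile the one-line module from the post; "foo.py" is only a label.
code = compile("x = 10", "foo.py", "exec")

# Collect the instruction names, then print number, name and operand
# for each instruction. The exact values depend on the CPython version.
opnames = []
for instr in dis.get_instructions(code):
    opnames.append(instr.opname)
    print(instr.opcode, instr.opname, instr.arg, instr.argrepr)
```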
 

Rafael Almeida

Speed? Efficiency? File size? Ease of use?
A .pyc *could* be written in ASCII, but what do you gain? Replacing a
few trivial functions in the Python core with a printf/scanf equivalent?
At the same time you lose a lot of speed, so I don't see the point.

Hm, I didn't realise that it would be that much slower.
Why harder? Once you read the file, they're just numbers. Anyway, being
harder to program the *interpreter* is not a problem, if you gain
something like speed or efficiency for the interpreted language.

Well, it's harder to get 4 bytes and create an int out of it in a
portable way than just call strtol or scanf, that's what I thought
while I was coding my interpreter. It's not the hardest thing to do,
of course, but it made me wonder why not just do the simplest thing.
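
As a rough illustration of the "4 bytes to an int" step being discussed, the sketch below mirrors the shifts in w_long() and adds the reverse operation. The function names write_long and read_long are hypothetical, chosen for this example only; the sketch assumes a signed 32-bit value stored least significant byte first, which is the order the w_long() code above writes.

```python
def write_long(x: int) -> bytes:
    """Mirror of w_long(): emit the low 32 bits of x, lowest byte first."""
    return bytes([
        x & 0xFF,
        (x >> 8) & 0xFF,
        (x >> 16) & 0xFF,
        (x >> 24) & 0xFF,
    ])


def read_long(data: bytes) -> int:
    """Reassemble a signed 32-bit integer from 4 little-endian bytes."""
    value = data[0] | (data[1] << 8) | (data[2] << 16) | (data[3] << 24)
    # Treat the top bit as the sign bit (two's complement).
    if value & 0x80000000:
        value -= 1 << 32
    return value


print(write_long(10), read_long(write_long(10)))
```

Because each byte is picked out by arithmetic on the value, the result does not depend on the byte order of the machine running the code. In Python the same decoding is also available as int.from_bytes(data, "little", signed=True).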

Since I've never seen a .pyc bigger than a few kilobytes, I thought an
ascii file would take more space, but it wouldn't be anything really
prohibitive.

I didn't think using strtol would make that much difference in speed.
But now you talked about it, and after thinking a little bit more about
it, I'm convinced that the speed difference may be relevant.
Sorry, I could not understand what you said here.

It's not anything important, I was just saying that I had to write a
little more code to make an integer such as 0xff into '\0\0\0\377' than
it would need to just print the integer. Well, unless there's already a
Python function that does just that and I didn't know about it. It was
just an example of how writing in ASCII is easier.
 

Martin v. Löwis

Why harder? Once you read the file, they're just numbers.
Well, it's harder to get 4 bytes and create an int out of it in a
portable way than just call strtol or scanf

That's not necessarily true for C. To deal with ASCII-printed
integers in C, you need to deal with memory management and
variable-sized buffers. For example, if you want to print to
memory, you need to overallocate memory, print to it, and
then shrink the extra allocation (e.g. by copying the string
elsewhere).

For 4-byte integers in binary, no memory-management issue arises.
You know exactly how much memory you will need.
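
The size argument can be illustrated from Python, as a sketch rather than a measurement of the C code: a signed 32-bit integer packed with the struct module always takes 4 bytes, while its decimal text form has a length that depends on the value. The sample values are arbitrary.

```python
import struct

samples = [0, 10, 65535, 2147483647, -2147483648]

# Binary: every signed 32-bit value occupies exactly 4 bytes.
binary_sizes = [len(struct.pack("<i", n)) for n in samples]

# Decimal text: the length depends on the value (and its sign),
# so a reader or writer cannot know the buffer size in advance.
text_sizes = [len(str(n)) for n in samples]

print(binary_sizes)
print(text_sizes)
```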
Since I've never seen a .pyc bigger than a few kilobytes, I thought an
ascii file would take more space, but it wouldn't be anything really
prohibitive.

That would probably defeat the point of .pyc files entirely: you
already *have* an ASCII version of it, the .py file. So why create
a second file?
It's not anything important, I was just saying that I had to write a
little more code to make an integer such as 0xff into '\0\0\0\377' than
it would need to just print the integer. Well, unless there's already a
Python function that does just that and I didn't know about it. It was
just an example of how writing in ASCII is easier.

Sure: Take a look at the struct module.
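
A minimal sketch of what that looks like, using only struct.pack and struct.unpack from the standard library. The format ">I" (big-endian, unsigned 32-bit) gives the '\0\0\0\377' form from the question; "<i" (little-endian, signed 32-bit) is chosen here on the assumption that one wants the same byte order that w_long() writes.

```python
import struct

# Big-endian 4-byte encoding: the '\0\0\0\377' form from the question.
big = struct.pack(">I", 0xFF)

# Little-endian 4-byte encoding: least significant byte first,
# the order in which w_long() emits its bytes.
little = struct.pack("<i", 0xFF)

# Decoding is the mirror operation; unpack always returns a tuple.
(value,) = struct.unpack("<i", little)

print(repr(big), repr(little), value)
```

Spelling out "<" or ">" in the format string fixes the byte order explicitly, so the bytes produced do not depend on the machine the script runs on.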

Regards,
Martin
 

Steve Holden

Rafael said:
Hm, I didn't realise that it would be that much slower.

You may mistakenly believe that casts are an expensive operation, when
in fact they take no time at all - they merely instruct the compiler to
treat specific pieces of data in specific ways.
Well, it's harder to get 4 bytes and create an int out of it in a
portable way than just call strtol or scanf, that's what I thought
while I was coding my interpreter. It's not the hardest thing to do,
of course, but it made me wonder why not just do the simplest thing.
Because they are smarter than you, without wishing to be too rude.
Since I've never seen a .pyc bigger than a few kilobytes, I thought an
ascii file would take more space, but it wouldn't be anything really
prohibitive.

I didn't think using strtol would make that much difference in speed.
But now you talked about it, and after thinking a little bit more about
it, I'm convinced that the speed difference may be relevant.
Which is a good reason to think about things *before* you post.
It's not anything important, I was just saying that I had to write a
little more code to make an integer such as 0xff into '\0\0\0\377' than
it would need to just print the integer. Well, unless there's already a
Python function that does just that and I didn't know about it. It was
just an example of how writing in ASCII is easier.

Speed, baby, speed.

regards
Steve
 

Mike C. Fletcher

Steve said:
Rafael Almeida wrote:
....
Because they are smarter than you, without wishing to be too rude.
Replace that with "more experienced", please. Otherwise it is a bit
rude, despite your wishes. We've always been newbie positive on
Python-list, and new compiler writers are just as important educational
targets as new Python programmers.

Have fun,
Mike

--
________________________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://www.vrplumber.com
http://blog.vrplumber.com
 

Steve Holden

Mike said:
Replace that with "more experienced", please. Otherwise it is a bit
rude, despite your wishes. We've always been newbie positive on
Python-list, and new compiler writers are just as important educational
targets as new Python programmers.
Maybe I was feeling a little crabby this morning. "More experienced" is
certainly a more appropriate way to express the sentiment. Thanks.

regards
Steve
 
