Why is it impossible to create a compiler that can compile Python to machinecode like C?

Discussion in 'Python' started by kramer65, Feb 28, 2013.

  1. kramer65

    kramer65 Guest

    Hello,

    I'm using Python for a while now and I love it. There is just one thing I cannot understand. There are compilers for languages like C and C++. why is it impossible to create a compiler that can compile Python code to machinecode?

    My reasoning is as follows:
    When GCC compiles a program written in C++, it simply takes that code and decides what instructions that would mean for the computer's hardware. What does the CPU need to do, what does the memory need to remember, etc. etc. If you can create this machinecode from C++, then I would suspect that it should also be possible to do this (without a C-step in between) for programs written in Python.

    Where is my reasoning wrong here? Is that because Python is dynamically typed? Does machinecode always need to know whether a variable is an int or a float? And if so, can't you build a compiler which creates machinecode that can handle both ints and floats in case of doubt? Or is it actually possible to do, but so much work that nobody does it?

    I googled around, and I *think* it is because of the dynamic typing, but I really don't understand why this would be an issue..

    Any insights on this would be highly appreciated!
    kramer65, Feb 28, 2013
    #1

  2. kramer65

    Matty Sarro Guest

    Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    Python is an interpreted language, not a compiled language. This is
    actually a good thing! What it means is that there is a "scripting engine"
    (we just call it the interpreter) that actually executes everything for
    you. That means that any operating system that has an interpreter written
    for it is capable of running the exact same code (there are lots of
    exceptions to this, but in general it is true). It makes code much more
    portable. Also, it makes it easy to troubleshoot (compiled programs are a
    pain in the butt unless you add additional debugging elements to them).

    A compiled program on the other hand must be specifically compiled for the
    destination architecture (so if you're trying to write an OSX executable on
    windows, you need a compiler capable of doing that). So doing any sort of
    cross platform development can take significantly longer. Plus then, as I
    said, debugging will require additional debug tracing elements to be added
    to the code you write. The benefit though is that compilers can optimize
    code for you when they compile, and the compiled code will tend to run
    faster since you're not dealing with an interpreter between you and the
    machine.

    Now, there are places where this line is blurred. For instance perl is an
    interpreted language, but capable of running EXTREMELY fast. Python is a
    little slower, but significantly easier to read and write than perl. You
    also have some weird ones like JAVA which actually have a virtual machine,
    and "half compile" source code into java "bytecode." This is then executed
    by the virtual machine.

    I guess the ultimate point is that they're all designed for different
    purposes, and to solve different problems. Python was intended to make
    fast-to-write, easily understandable, easily portable code which can be
    executed on any system which has the Python interpreter. It's not really
    intended for things which require lower level access to hardware. It's what
    we call a "high level" programming language.

    C (your example) was intended for very low level programming, things like
    operating systems, device drivers, networking stacks, where the speed of a
    compiled executable and direct access to hardware was a necessity. That's
    what Dennis Ritchie wrote it for. We call it a "mid level" programming
    language, or a "low level" programming language depending on who you talk
    to. I'd have to say mid level because low level would be writing in
    assembly or playing with a hex editor :)

    Different tools for different jobs.

    HTH.

    -Matty


    On Thu, Feb 28, 2013 at 3:25 PM, kramer65 wrote:

    > Hello,
    >
    > I'm using Python for a while now and I love it. There is just one thing I
    > cannot understand. There are compilers for languages like C and C++. why is
    > it impossible to create a compiler that can compile Python code to
    > machinecode?
    >
    > My reasoning is as follows:
    > When GCC compiles a program written in C++, it simply takes that code and
    > decides what instructions that would mean for the computer's hardware. What
    > does the CPU need to do, what does the memory need to remember, etc. etc.
    > If you can create this machinecode from C++, then I would suspect that it
    > should also be possible to do this (without a C-step in between) for
    > programs written in Python.
    >
    > Where is my reasoning wrong here? Is that because Python is dynamically
    > typed? Does machinecode always need to know whether a variable is an int or
    > a float? And if so, can't you build a compiler which creates machinecode
    > that can handle both ints and floats in case of doubt? Or is it actually
    > possible to do, but so much work that nobody does it?
    >
    > I googled around, and I *think* it is because of the dynamic typing, but I
    > really don't understand why this would be an issue..
    >
    > Any insights on this would be highly appreciated!
    Matty Sarro, Feb 28, 2013
    #2

  3. Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    kramer65, 28.02.2013 21:25:
    > I'm using Python for a while now and I love it. There is just one thing
    > I cannot understand. There are compilers for languages like C and C++.
    > why is it impossible to create a compiler that can compile Python code
    > to machinecode?


    All projects that implement such compilers prove that it's quite possible.

    The most widely used static Python compiler is Cython, but there are also a
    couple of experimental compilers that do similar things in more or less
    useful or usable ways. And there are also a couple of projects that do
    dynamic runtime compilation, most notably PyPy and Numba.

    You may want to take a look at the Python implementations page,
    specifically the list of Python compilers:

    http://wiki.python.org/moin/PythonImplementations#Compilers


    > Does machinecode always need to know whether a variable is an int or a
    > float?


    Not at all. You're mixing different levels of abstraction here.


    > And if so, can't you build a compiler which creates machinecode
    > that can handle both ints and floats in case of doubt?


    Sure. Cython does just that, for example, unless you tell it explicitly to
    restrict a variable to a specific type. Basically, you get Python semantics
    by default and C semantics if you want to.

    Stefan
    Stefan Behnel, Feb 28, 2013
    #3
  4. Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On Fri, Mar 1, 2013 at 7:50 AM, Matty Sarro wrote:
    > C (your example) was intended for very low level programming, things like
    > operating systems, device drivers, networking stacks, where the speed of a
    > compiled executable and direct access to hardware was a necessity. That's
    > what Dennis Ritchie wrote it for. We call it a "mid level" programming
    > language, or a "low level" programming language depending on who you talk
    > to. I'd have to say mid level because low level would be writing in assembly
    > or playing with a hex editor :)


    Assembly is for people who write C compilers.
    C is for people who write language interpreters/compilers.
    Everyone else uses a high level language.

    Not 100% accurate but a reasonable rule of thumb.

    ChrisA
    Chris Angelico, Feb 28, 2013
    #4
  5. Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    Stefan Behnel, 28.02.2013 22:03:
    > there are also a couple of projects that do
    > dynamic runtime compilation, most notably PyPy and Numba.


    Oh, and HotPy, I keep forgetting about that.

    > You may want to take a look at the Python implementations page,
    > specifically the list of Python compilers:
    >
    > http://wiki.python.org/moin/PythonImplementations#Compilers


    Stefan
    Stefan Behnel, Feb 28, 2013
    #5
  6. kramer65

    Dave Angel Guest

    Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On 02/28/2013 03:25 PM, kramer65 wrote:
    > Hello,
    >
    > I'm using Python for a while now and I love it. There is just one thing I cannot understand. There are compilers for languages like C and C++. why is it impossible to create a compiler that can compile Python code to machinecode?
    >
    > My reasoning is as follows:
    > When GCC compiles a program written in C++, it simply takes that code and decides what instructions that would mean for the computer's hardware. What does the CPU need to do, what does the memory need to remember, etc. etc. If you can create this machinecode from C++, then I would suspect that it should also be possible to do this (without a C-step in between) for programs written in Python.
    >
    > Where is my reasoning wrong here? Is that because Python is dynamically typed? Does machinecode always need to know whether a variable is an int or a float? And if so, can't you build a compiler which creates machinecode that can handle both ints and floats in case of doubt? Or is it actually possible to do, but so much work that nobody does it?
    >
    > I googled around, and I *think* it is because of the dynamic typing, but I really don't understand why this would be an issue..
    >
    > Any insights on this would be highly appreciated!
    >


    Sure, Python could be compiled into machine code. But what machine? Do
    you refer to the hardware inside one of the Pentium chips? Sorry, but
    Intel doesn't expose those instructions to the public. Instead, they
    wrote a microcode interpreter, and embedded it inside their processor,
    and the "machine languages" that are documented as the Pentium
    Instruction sets are what that interpreter handles. Good thing too, as
    the microcode machine language has changed radically over time, and I'd
    guess there have been at least a dozen major variants, and a hundred
    different sets of details.

    So if we agree to ignore that interpreter, and consider the externally
    exposed machine language, we can pick a subset of the various such
    instruction sets, and make that our target.

    Can Python be compiled directly into that instruction set? Sure, it
    could. But would it be practical to write a compiler that went directly
    to it, or is it simpler to target C, and use gcc?

    Let's look at gcc. When you run it, does it look like it compiles C
    directly to machine language? Nope. It has 3 phases (last I looked,
    which was admittedly over 20 years ago). The final phase translates an
    internal form of program description into a particular "machine
    language". Even the mighty gcc doesn't do it in one step. Guess what,
    that means other languages can use the same back end, and a given
    language can use different back ends for different target machine
    languages. (Incidentally, the Microsoft C compiler does the exact same
    thing, and a few of my patents involve injecting code between front end
    and back end.)

    So now we have three choices. We could target the C language, and use
    all of gcc, or we could target the intermediate language, and use only
    the backend of gcc. Unfortunately, that intermediate language isn't
    portable between compilers, so you'd either have to write totally
    separate Python compilers for each back end, or skip that approach, or
    abandon total portability.

    Well, we could write a Python compiler that targets an "abstract
    intermediate language," which in turn gets translated into each of the
    supported compiler's intermediate language. But that gets remarkably
    close to just targeting C in the first place.

    So how hard would it be just to directly target one machine language?
    Not too bad if you didn't try to do any optimizations, or adapt to the
    different quirks and missing features of the different implementations
    of that machine language. But I expect what you got would be neither
    smaller nor noticeably faster than the present system. Writing simple
    optimizations that improve some things is easy. Writing great
    optimizers that are also reliable and correct is incredibly hard. I'd
    expect that gcc has hundreds of man years of effort in it.

    Now, no matter which of these approaches you would take, there are some
    issues. The tricky part is not being flexible between int and float
    (and long, which is not part of the Intel machine instruction set), but
    between an unlimited set of possible meanings for each operation. Just
    picking on a+b, each class type that a and b might be can provide their
    own __add__ and/or __radd__ methods. All those have to be searched for,
    one has to be picked, and the code has to branch there. And that
    decision, in general, has to be made at runtime, not by the compiler.
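
    A rough sketch of that runtime search, written in Python itself. This is a simplified model and not CPython's actual implementation: the helper name binary_add and the class Metres are made up for illustration, and details such as in-place operators and C-level type slots are ignored.

```python
def binary_add(a, b):
    """Simplified model of how `a + b` is resolved at runtime (illustrative only)."""
    ta, tb = type(a), type(b)
    # Special methods are looked up on the type, not on the instance.
    add = getattr(ta, "__add__", None)
    # The reflected method is only tried when the operand types differ.
    radd = getattr(tb, "__radd__", None) if tb is not ta else None

    attempts = [(add, a, b), (radd, b, a)]
    # If the right operand's type is a subclass of the left's and overrides
    # __radd__, the reflected method gets the first try.
    if (radd is not None and issubclass(tb, ta)
            and radd is not getattr(ta, "__radd__", None)):
        attempts.reverse()

    for method, receiver, argument in attempts:
        if method is None:
            continue
        result = method(receiver, argument)
        if result is not NotImplemented:
            return result
    raise TypeError(f"unsupported operand type(s) for +: "
                    f"{ta.__name__!r} and {tb.__name__!r}")


class Metres:
    """Hypothetical class that only knows how to be the right-hand operand."""
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        return Metres(other + self.value)


print(binary_add(1, 2))                  # handled by int.__add__
print(binary_add(1, 2.5))                # falls through to float.__radd__
print(binary_add(10, Metres(5)).value)   # falls through to Metres.__radd__
```

    None of the three calls can be resolved without knowing the types of both operands, which in general are only known once the program is running.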

    So by default, the code ends up being a twisted set of four-way
    indirections, calls to dict lookups, and finally calling a function that
    actually does an instruction or two of real work. Guess what, an
    interpreter can store those details much more succinctly (code size),
    and can run those choices nearly as quickly. So we're back to CPython.

    Could it be improved? Sure, that's why there are multiple projects
    which try to improve performance of the reference implementation. But
    each project seems to get to the point where the early promise of
    dozen-fold improvement dwindles down to a few times as fast, and not for
    everything. There are lots of things that can be improved with static
    analysis (so we're sure of the types of certain things), restricted
    language (so the developer gives us extra clues). But that work is
    nothing compared to what it would take to re-implement the equivalent of
    the back ends of gcc.

    Java works roughly the same way as Python, compiling to byte code files,
    then interpreting them. The interpreter is given the fancy name
    "virtual machine" because it really is an instruction set, one that
    could have been interpreted by Intel in their internal microcode. But
    they have their own history to stay compatible with. Look at the Merced
    and how it's taken the world by storm (NOT).

    But Java is much stricter about its byte code files, so each function is
    much closer to machine level. Nearly all those Python indirections are
    eliminated by the compiler (because it's not as dynamic a language), and
    they do JIT compiling. The latter is why they're quick.



    --
    DaveA
    Dave Angel, Feb 28, 2013
    #6
  7. kramer65

    Modulok Guest

    Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    > I'm using Python for a while now and I love it. There is just one thing I
    > cannot understand. There are compilers for languages like C and C++. why is
    > it impossible to create a compiler that can compile Python code to
    > machinecode?


    Not exactly what you describe, but have you checked out PyPy?

    http://pypy.org/


    -Modulok-
    Modulok, Feb 28, 2013
    #7
  8. Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On Thu, Feb 28, 2013 at 12:25:07PM -0800, kramer65 wrote:
    > Hello,
    >
    > I'm using Python for a while now and I love it. There is just one thing I cannot understand. There are compilers for languages like C and C++. why is it impossible to create a compiler that can compile Python code to machinecode?
    >
    > My reasoning is as follows:
    > When GCC compiles a program written in C++, it simply takes that code and decides what instructions that would mean for the computer's hardware. What does the CPU need to do, what does the memory need to remember, etc. etc. If you can create this machinecode from C++, then I would suspect that it should also be possible to do this (without a C-step in between) for programs written in Python.
    >
    > Where is my reasoning wrong here? Is that because Python is dynamically typed? Does machinecode always need to know whether a variable is an int or a float? And if so, can't you build a compiler which creates machinecode that can handle both ints and floats in case of doubt? Or is it actually possible to do, but so much work that nobody does it?
    >
    > I googled around, and I *think* it is because of the dynamic typing, but I really don't understand why this would be an issue..
    >
    > Any insights on this would be highly appreciated!
    >

    Guido actually encourages people to try to build different compilers for
    Python. He thinks it might, one day, be possible to have a compiler for
    Python.
    But this could only be possible if there was some kind of global
    file based annotation saying you will not use some of the dynamic parts
    of Python. Else it won't be possible to create a compiler for such a
    highly dynamic language as Python.

    You can view the keynote where he talks about this here:
    http://www.youtube.com/watch?v=EBRMq2Ioxsc

    Jonas.
    Jonas Geiregat, Feb 28, 2013
    #8
  9. kramer65

    Nobody Guest

    Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On Thu, 28 Feb 2013 12:25:07 -0800, kramer65 wrote:

    > I'm using Python for a while now and I love it. There is just one thing
    > I cannot understand. There are compilers for languages like C and C++.
    > why is it impossible to create a compiler that can compile Python code
    > to machinecode?


    It's not impossible, it's just pointless.

    Because Python is dynamically-typed and late-bound, practically nothing is
    fixed at compile time. So a compiled Python program would just be a
    sequence of calls to interpreter functions.
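
    To picture what "a sequence of calls to interpreter functions" means, here is a hedged sketch of what straight-line "compiled" code for `x = a + b` might boil down to. The function name run_x_equals_a_plus_b and the use of a plain dict as the namespace are assumptions for illustration; operator.add stands in for the runtime's generic add routine.

```python
import operator


def run_x_equals_a_plus_b(namespace):
    """What 'compiled' code for `x = a + b` reduces to: calls into the runtime."""
    a = namespace["a"]            # name lookup, resolved at runtime
    b = namespace["b"]            # name lookup, resolved at runtime
    result = operator.add(a, b)   # generic add, dispatches on the runtime types
    namespace["x"] = result       # bind the result to the name "x"


ints = {"a": 2, "b": 3}
run_x_equals_a_plus_b(ints)

strings = {"a": "py", "b": "thon"}
run_x_equals_a_plus_b(strings)

print(ints["x"], strings["x"])
```

    The same four steps serve both calls; nothing in them is specific to ints or to strings.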

    > Where is my reasoning wrong here? Is that because Python is dynamically
    > typed? Does machinecode always need to know whether a variable is an int
    > or a float?


    Yes.

    > And if so, can't you build a compiler which creates
    > machinecode that can handle both ints and floats in case of doubt?


    Yes. But it's not just ints and floats. E.g. Python's "+" operator works
    on any pair of objects provided that either the left-hand operand has an
    __add__ method or the right-hand operand has a __radd__ method.

    > Or is it actually possible to do, but so much work that nobody does it?


    It's not that it's "so much work" as much as the fact that the resulting
    executable wouldn't be any faster than using the interpreter. IOW, it's so
    much work for little or no gain.
    Nobody, Feb 28, 2013
    #9
  10. kramer65

    Terry Reedy Guest

    Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    The subject line is wrong. There are multiple compilers. Someone just
    listed some of them today in another post.

    On 2/28/2013 3:50 PM, Matty Sarro wrote:
    > Python is an interpreted language, not a compiled language.


    A language is just a language. Implementations are implementations*.
    That aside, I pretty much agree with the rest of the response.

    * For instance, C is usually compiled, but I once used a C interpreter
    on unix.

    --
    Terry Jan Reedy
    Terry Reedy, Feb 28, 2013
    #10
  11. Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On Thu, 28 Feb 2013 22:33:36 +0100, Jonas Geiregat
    declaimed the following in gmane.comp.python.general:

    > But this could only be possible if there was some kind of global
    > file based annotation saying you will not use some of the dynamic parts
    > of Python. Else it won't be possible to create a compiler for such a
    > highly dynamic language as Python.
    >

    Oh, you could create a compiler -- but it would have to link to a
    library that included a Python interpreter to handle the dynamic
    operations <G>
    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Mar 1, 2013
    #11
  12. Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On Thu, 28 Feb 2013 15:50:00 -0500, Matty Sarro wrote:

    > Python is an interpreted language, not a compiled language.


    Actually, *languages* are neither interpreted nor compiled. A language is
    an abstract description of behaviour and syntax. Whether something is
    interpreted or compiled or a mixture of both is a matter of the
    implementation. There are C interpreters and Python compilers.



    [...]
    > Now, there are places where this line is blurred. For instance perl is
    > an interpreted language, but capable of running EXTREMELY fast. Python
    > is a little slower, but significantly easier to read and write than
    > perl. You also have some weird ones like JAVA which actually have a
    > virtual machine, and "half compile" source code into java "bytecode."
    > This is then executed by the virtual machine.


    Welcome to the 20th century -- nearly all so-called "interpreted"
    languages do that, including Python. Why do you think Python has a
    function called "compile", and what do you think the "c" in .pyc files
    stands for?
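
    A small way to see that byte-compilation step for yourself, assuming only the standard library. The file names hello.py and hello.pyc are arbitrary choices for this sketch.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, "hello.py")
    with open(source_path, "w") as handle:
        handle.write("print('hello')\n")

    # Byte-compile the source file; the return value is the path of the
    # .pyc file that was written.
    pyc_path = py_compile.compile(
        source_path,
        cfile=os.path.join(workdir, "hello.pyc"),
        doraise=True,
    )
    pyc_exists = os.path.isfile(pyc_path)
    pyc_size = os.path.getsize(pyc_path)

print(pyc_path.endswith(".pyc"), pyc_exists, pyc_size > 0)
```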

    The old model that you might have learned in school:

    * interpreters read a line of source code, execute it, then read the next
    line, execute it, then read the next one, and so forth...

    * compilers convert the entire source code to machine code, then execute
    the machine code.


    hasn't been generally true since, well, probably forever, but certainly
    not since the 1980s.

    These days, the best definition of "interpreted language" that I have
    read comes from Roberto Ierusalimschy, one of the creators of Lua:

    "...the distinguishing feature of interpreted languages is not that they
    are not compiled, but that the compiler is part of the language runtime
    and that, therefore, it is possible (and easy) to execute code generated
    on the fly."

    (Programming in Lua, 2nd edition, page 63.)

    In that sense, being an interpreter is a feature, and pure compilers are
    deficient.
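
    In Python terms, "executing code generated on the fly" can look like the following minimal sketch. The generated function name square and the "<generated>" file name are arbitrary choices for the example.

```python
# Source text built at runtime; it could just as well come from user input
# or be assembled by the program itself.
source = "def square(n):\n    return n * n\n"

# The compiler is available inside the running program...
code_object = compile(source, "<generated>", "exec")

# ...and so is the machinery to execute what it produced.
namespace = {}
exec(code_object, namespace)

square = namespace["square"]
print(square(7))  # prints 49
```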


    Oh, by the way, while it is true that the original version of Java used a
    pure virtual machine model, these days many Java compilers are capable of
    producing machine code.

    Just to drive home the lesson that *languages* aren't compiled or
    interpreted, but *implementations* are, consider these Python
    implementations with radically different execution styles:

    1) CPython, the one you are used to, compiles code to byte-code for a
    custom-made virtual machine;

    2) Jython generates code to run on a Java virtual machine;

    3) IronPython does the same for the .Net CLR;

    4) PyPy has a JIT compiler that generates machine code at runtime;

    5) Pynie compiles to byte-code for the Parrot virtual machine;

    6) Nuitka includes a static compiler that compiles to machine code;

    7) Berp generates Haskell code, which is then compiled and executed by a
    Haskell compiler, which may or may not generate machine code;

    8) Pyjamas compiles Python to Javascript;

    and others.


    And even machine code is not actually machine code. Some CPUs have an
    even lower level of micro-instructions, and an interpreter to translate
    the so-called "machine code" into micro-instructions before executing
    them.



    --
    Steven
    Steven D'Aprano, Mar 1, 2013
    #12
  13. Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On Thu, 28 Feb 2013 22:03:09 +0100, Stefan Behnel wrote:

    > The most widely used static Python compiler is Cython


    Cython is not a Python compiler. Cython code will not run in a vanilla
    Python implementation. It has different keywords and syntax, e.g.:

        cdef inline int func(double num):
            ...


    which gives SyntaxError in a Python compiler.

    Cython is an excellent language and a great addition to the Python
    ecosystem, but it is incorrect to call it "Python".


    --
    Steven
    Steven D'Aprano, Mar 1, 2013
    #13
  14. Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On Thu, 28 Feb 2013 12:25:07 -0800, kramer65 wrote:

    > Hello,
    >
    > I'm using Python for a while now and I love it. There is just one thing
    > I cannot understand. There are compilers for languages like C and C++.
    > why is it impossible to create a compiler that can compile Python code
    > to machinecode?


    Your assumption is incorrect. You can compile Python to machine-code, at
    least sometimes. It is quite tricky, for various reasons, but it can be
    done, at various levels of efficiency.

    One of the oldest such projects was Psyco, which was a Just-In-Time
    compiler for Python. When Psyco was running, it would detect at run time
    that you were doing calculations on (say) standard ints, compile on the
    fly a machine-code function to perform those calculations, and execute
    it. Psyco has more or less been made obsolete by PyPy, which does the
    same thing only even more so.
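
    A toy illustration of the idea of specializing on the types seen at run time. It is not Psyco's or PyPy's actual mechanism, and no machine code is generated here: the class SpecializingAdd is a made-up stand-in that picks a type-specific routine the first time it sees a pair of operand types and reuses it afterwards.

```python
import operator


class SpecializingAdd:
    """Toy model: choose a specialized add routine per pair of operand types."""

    def __init__(self):
        self.fast_paths = {}    # (type, type) -> specialized callable
        self.specializations = 0

    def _specialize(self, ta, tb):
        # A real JIT would emit machine code here; this sketch only picks
        # an existing routine that skips the generic dispatch.
        self.specializations += 1
        if ta is int and tb is int:
            return int.__add__
        if ta is float and tb is float:
            return float.__add__
        return operator.add     # generic fallback for everything else

    def __call__(self, a, b):
        key = (type(a), type(b))            # the "guard": check the types
        fast = self.fast_paths.get(key)
        if fast is None:                    # first time: specialize
            fast = self.fast_paths[key] = self._specialize(*key)
        return fast(a, b)


add = SpecializingAdd()
total = 0
for i in range(1000):
    total = add(total, i)                   # always (int, int)

print(total, add.specializations)
print(add(1.5, 2.5), add("a", "b"), add.specializations)
```

    The loop pays the cost of choosing a routine once and then stays on the int-only path, which is the effect a JIT aims for with real machine code.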

    http://en.wikipedia.org/wiki/Psyco
    http://en.wikipedia.org/wiki/PyPy


    > My reasoning is as follows:
    > When GCC compiles a program written in C++, it simply takes that code
    > and decides what instructions that would mean for the computer's
    > hardware. What does the CPU need to do, what does the memory need to
    > remember, etc. etc. If you can create this machinecode from C++, then I
    > would suspect that it should also be possible to do this (without a
    > C-step in between) for programs written in Python.


    In principle, yes, but in practice it's quite hard, simply because Python
    does so much more at runtime than C++ (in general).

    Take an expression like:

    x = a + b

    In C++, the compiler knows what kind of data a and b are, what kind of
    data x is supposed to be. They are often low-level machine types like
    int32 or similar, which the CPU can add directly (or at least, the
    compiler can fake it). Even if the variables are high-level objects, the
    compiler can usually make many safe assumptions about what methods will
    be called, and can compile instructions something like this pseudo-code:

    10 get the int64 at location 12348 # "a"
    20 get the int64 at location 13872 # "b"
    30 jump to the function at location 93788 # add two int64s
    40 store the result at location 59332 # "x"

    which is fast and efficient because most of the hard work is done at
    compile time. But it's also quite restrictive, because you can't change
    code on the fly, create new types or functions, etc. (Or, where you can,
    then you lose some of the advantages of C++ and end up with something
    like Python but with worse syntax.)

    In Python, you don't know what a and b are until runtime. They could be
    ints, or lists, or strings, or anything. The + operator could call a
    custom __add__ method, or a __radd__ method, from some arbitrary class.
    Because nearly everything is dynamic, the Python compiler cannot safely
    make many assumptions about the code at compile time. So you end up with
    code like this:

    10 search for the name "a" and take note of it
    20 search for the name "b" and take note of it
    30 decide whether to call a.__add__ or b.__radd__
    40 call the appropriate method
    50 bind the result to the name "x"
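
    A small runnable sketch of why those decisions cannot be made ahead of time. The function add and the class Vec are hypothetical names for this example; the point is that one and the same `a + b` means something different on every call, and can even change meaning while the program runs.

```python
def add(a, b):
    return a + b   # which code runs here depends entirely on the arguments


class Vec:
    """Hypothetical one-dimensional vector, just enough to define +."""
    def __init__(self, x):
        self.x = x

    def __add__(self, other):
        return Vec(self.x + other.x)


results = [
    add(1, 2),                # integer addition
    add("ab", "cd"),          # string concatenation
    add([1], [2]),            # list concatenation
    add(Vec(1), Vec(2)).x,    # user-defined __add__
]

# The meaning of + for Vec can be replaced while the program is running.
Vec.__add__ = lambda self, other: Vec(self.x * other.x)
patched = add(Vec(3), Vec(4)).x

print(results, patched)
```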


    You can get an idea of what Python actually does by disassembling the
    byte code into pseudo-assembly language:


    py> code = compile("x = a + b", '', 'single')
    py> from dis import dis
    py> dis(code)
      1           0 LOAD_NAME                0 (a)
                  3 LOAD_NAME                1 (b)
                  6 BINARY_ADD
                  7 STORE_NAME               2 (x)
                 10 LOAD_CONST               0 (None)
                 13 RETURN_VALUE
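
    The exact listing depends on the interpreter version: offsets differ, and newer CPython releases use a BINARY_OP instruction where older ones used BINARY_ADD. A version-tolerant way to inspect the same thing, assuming only the standard dis module:

```python
import dis

code = compile("x = a + b", "<example>", "single")

# Collect just the instruction names, which are more stable across
# versions than offsets and argument encodings.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
print(opnames)
```

    Whatever the version, the listing contains name lookups, one generic add instruction and a name store, with no type information anywhere.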


    Nevertheless, PyPy can often speed up Python code significantly,
    sometimes to the speed of C or even faster.

    http://morepypy.blogspot.com.au/2011/02/pypy-faster-than-c-on-carefully-crafted.html

    http://morepypy.blogspot.com.au/2011/08/pypy-is-faster-than-c-again-string.html



    --
    Steven
    Steven D'Aprano, Mar 1, 2013
    #14
  15. kramer65

    alex23 Guest

    Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On Mar 1, 1:47 pm, Steven D'Aprano wrote:
    > Cython is not a Python compiler. Cython code will not run in a vanilla
    > Python implementation. It has different keywords and syntax, e.g.:
    >
    > cdef inline int func(double num):
    >     ...
    >
    > which gives SyntaxError in a Python compiler.


    Cython has had a "pure Python" mode for several years now that allows
    you to decorate Python code or augment it with additional files
    containing the C specific declarations:

    http://docs.cython.org/src/tutorial/pure.html

    Both of which will be ignored by the regular Python interpreter,
    allowing you to write Python that is also suitable for Cython without
    the errors you mention.
    alex23, Mar 1, 2013
    #15
  16. kramer65

    alex23 Guest

    Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On Mar 1, 6:25 am, kramer65 wrote:
    > There are compilers for languages like C and C++. why
    > is it impossible to create a compiler that can compile
    > Python code to machinecode?


    This is a nice site listing a lot of the current approaches to that subject:

    http://compilers.pydata.org/
    alex23, Mar 1, 2013
    #16
  17. Re: Why is it impossible to create a compiler that can compile Python to machinecode like C?

    On Friday, March 1, 2013 at 4:25:07 AM UTC+8, kramer65 wrote:
    > Hello,
    >
    > I'm using Python for a while now and I love it. There is just one thing I cannot understand. There are compilers for languages like C and C++. why is it impossible to create a compiler that can compile Python code to machinecode?
    >
    > My reasoning is as follows:
    > When GCC compiles a program written in C++, it simply takes that code and decides what instructions that would mean for the computer's hardware. What does the CPU need to do, what does the memory need to remember, etc. etc. If you can create this machinecode from C++, then I would suspect that it should also be possible to do this (without a C-step in between) for programs written in Python.
    >
    > Where is my reasoning wrong here? Is that because Python is dynamically typed? Does machinecode always need to know whether a variable is an int or a float? And if so, can't you build a compiler which creates machinecode that can handle both ints and floats in case of doubt? Or is it actually possible to do, but so much work that nobody does it?
    >
    > I googled around, and I *think* it is because of the dynamic typing, but I really don't understand why this would be an issue..
    >
    > Any insights on this would be highly appreciated!


    I think a smart object can run experiments during its lifetime,
    sensing and collecting data to improve its methods in the long run.

    This will definitely require a dynamic language.
    88888 Dihedral, Mar 1, 2013
    #17
  18. Re: Why is it impossible to create a compiler than can compile Python to machinecode like C?

    Steven D'Aprano, 01.03.2013 04:47:
    > On Thu, 28 Feb 2013 22:03:09 +0100, Stefan Behnel wrote:
    >
    >> The most widely used static Python compiler is Cython

    >
    > Cython is not a Python compiler. Cython code will not run in a vanilla
    > Python implementation. It has different keywords and syntax, e.g.:
    >
    > cdef inline int func(double num):
    > ...
    >
    > which gives SyntaxError in a Python compiler.


    Including Cython, if you're compiling a ".py" file. The above is only valid
    syntax in ".pyx" files. Two languages, one compiler. Or three languages, if
    you want, because Cython supports both Python 2 and Python 3 code in
    separate compilation modes.

    The old model, which you might have learned at school:

    * a Python implementation is something that runs Python code

    * a Cython implementation is something that does not run Python code

    hasn't been generally true since, well, probably forever. Even Cython's
    predecessor Pyrex was capable of compiling a notable subset of Python code,
    and Cython gained support for pretty much all Python language features
    about two years ago. Quoting the project homepage: "the Cython language is
    a superset of the Python language".

    http://cython.org/

    If you don't believe that, just try it yourself. Try to compile some Python
    3 code with it, if you find the time. Oh, and pass the "-3" option to the
    compiler in that case, so that it knows that it should switch to Python 3
    syntax/semantics mode. It can't figure that out from the file extension
    (although you can supply the language level of the file in a header comment
    tag). And while you're at it, also pass the "-a" option to let it generate
    an HTML analysis of your code that highlights CPython interaction and thus
    potential areas for manual optimisation.
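    As a minimal sketch of what that could look like: the file name
    "example.py" and the function mean() below are made up for illustration;
    the "-3" and "-a" options and the header comment tag are the ones described
    above. The file is plain Python, so it also runs unchanged in CPython.

```python
# cython: language_level=3
#
# Hypothetical file "example.py". To compile it (assuming Cython is installed):
#
#     cython -3 -a example.py
#
# "-3" selects Python 3 syntax/semantics; "-a" additionally writes an HTML
# analysis of the code. The header comment on the first line is the in-file
# alternative to passing "-3" on the command line.


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)


if __name__ == "__main__":
    print(mean([1, 2, 3, 4]))
```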

    The "superset" bit doesn't mean I've stopped fixing the bugs that CPython's
    regression test suite reveals from time to time. If you want to get an idea
    of Cython's compatibility level, take a look at the test results: there are
    still about 470 failing tests left out of 26000 in the test suites of Py2.7
    and 3.4:

    https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests-pyregr/

    One reason for a couple of those failures (definitely not all of them) is
    that Cython rejects some code at compile time that CPython only rejects at
    runtime. That's because the tests were explicitly written for CPython and
    assume that the runtime cannot detect some errors before executing the
    code. So, in a way, being capable of doing static analysis actually
    prevents Cython from being fully CPython compatible. I do not consider that
    a bad thing.
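    To illustrate the general category (not necessarily the specific tests that
    fail), here is a sketch of code that CPython compiles happily and only
    rejects when it runs. The undefined name is deliberately hypothetical, and
    whether a given static compiler rejects this exact snippet at compile time
    depends on its settings, so treat that part as an assumption.

```python
# A function that refers to a name that is never defined anywhere.
SOURCE = '''
def broken():
    return undefined_helper(42)  # hypothetical name, deliberately undefined
'''

# CPython's bytecode compiler accepts this without complaint...
code = compile(SOURCE, "<example>", "exec")
namespace = {}
exec(code, namespace)

# ...and the error only shows up once the offending line actually runs.
try:
    namespace["broken"]()
except NameError as exc:
    print("runtime error:", exc)
```

    A test that expects the NameError at call time implicitly assumes that
    nothing looked at the function body before it was executed.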

    And, BTW, we also compile most of Python's benchmark suite by now:

    https://sage.math.washington.edu:8091/hudson/view/bench/

    The results are definitely not C-ishly fast, usually only some 10-80%
    improvement or so, e.g. only some 35% in the Django benchmark, but some of
    the results are quite ok for plain Python code that is not manually
    optimised for compilation. Remember, there are lots of optimisations that
    we deliberately do not apply, and static analysis generally cannot detect a
    lot of dynamic code patterns, runtime determined types, etc. That's clearly
    PyPy's domain, with its own set of pros and cons.

    The idea behind Cython is not that it will magically make your plain Python
    code incredibly fast. The idea is to make it really, really easy for users
    to bring their code up to C speed *themselves*, in the exact spots where
    the code really needs it. And yes, as was already mentioned in this thread,
    there is a pure Python mode for this that allows you to keep your code in
    plain Python syntax while optimising it for compilation. The "Cython
    optimised" benchmarks on the page above do exactly that.

    I wrote a half-rant about static Python compilation in a recent blog post.
    It's in English, and you might actually want to read it. I would say that I
    can claim to know what I'm talking about.

    http://blog.behnel.de/index.php?p=241

    Stefan
    Stefan Behnel, Mar 1, 2013
    #18
  19. Re: Why is it impossible to create a compiler than can compile Python to machinecode like C?

    On Fri, 01 Mar 2013 08:48:34 +0100, Stefan Behnel wrote:

    > Steven D'Aprano, 01.03.2013 04:47:
    >> On Thu, 28 Feb 2013 22:03:09 +0100, Stefan Behnel wrote:
    >>
    >>> The most widely used static Python compiler is Cython

    >>
    >> Cython is not a Python compiler. Cython code will not run in a vanilla
    >> Python implementation. It has different keywords and syntax, e.g.:
    >>
    >> cdef inline int func(double num):
    >> ...
    >>
    >> which gives SyntaxError in a Python compiler.

    >
    > Including Cython, if you're compiling a ".py" file. The above is only
    > valid syntax in ".pyx" files. Two languages, one compiler. Or three
    > languages, if you want, because Cython supports both Python 2 and Python
    > 3 code in separate compilation modes.



    Ah, that's very interesting, and thank you for the correction. I have
    reset my thinking about Cython.



    --
    Steven
    Steven D'Aprano, Mar 2, 2013
    #19
  20. Re: Why is it impossible to create a compiler than can compile Python to machinecode like C?

    On 2013-02-28, kramer65 <> wrote:

    > I'm using Python for a while now and I love it. There is just one
    > thing I cannot understand. There are compilers for languages like C
    > and C++. why is it impossible to create a compiler that can compile
    > Python code to machinecode?


    The main issue is that Python has dynamic typing. The type of object
    that is referenced by a particular name can vary, and there's no way
    (in general) to know at compile time what the type of object "foo" is.

    That makes generating object code to manipulate "foo" very difficult.
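
    A small sketch of the problem (the function name describe() is made up for
    illustration): the source text is identical for every call, yet the
    operation that "*" has to perform differs each time.

```python
def describe(foo):
    # Nothing in the source says what type "foo" is; that is only known
    # once a call actually happens.
    return foo * 2


print(describe(21))      # integer multiplication
print(describe(1.5))     # floating-point multiplication
print(describe("ab"))    # string repetition
print(describe([1, 2]))  # list repetition
```

    A C compiler would pick one machine instruction for "foo * 2" from the
    declared type; here there is no declaration to pick from.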


    > My reasoning is as follows: When GCC compiles a program written in
    > C++, it simply takes that code and decides what instructions that
    > would mean for the computer's hardware. What does the CPU need to do,
    > what does the memory need to remember, etc. etc. If you can create
    > this machinecode from C++, then I would suspect that it should also
    > be possible to do this (without a C-step in between) for programs
    > written in Python.
    >
    > Where is my reasoning wrong here? Is that because Python is
    > dynamically typed?


    Yes.

    > Does machinecode always need to know whether a
    > variable is an int or a float?


    Yes. Not only might it be an int or a float, it might be a string, a
    list, a dictionary, a network socket, a file, or some user-defined
    object type that the compiler has no way of knowing about.

    > And if so, can't you build a compiler which creates machinecode that
    > can handle both ints and floats in case of doubt?


    That's pretty much what you've got now. The Python compiler compiles
    the source code as much as it can, and the VM is the "machinecode that
    can handle both ints and floats".
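
    You can see this with the standard "dis" module. A sketch, with a made-up
    function add(); the exact opcode name printed depends on the CPython
    version, but in each case it is a single generic instruction that says
    nothing about the operand types.

```python
import dis


def add(a, b):
    return a + b


# The compiler emits one generic "binary add" instruction for "a + b".
for instruction in dis.get_instructions(add):
    print(instruction.opname, instruction.argrepr)

# The VM then executes that same bytecode for very different operands,
# looking up what "+" means for them at run time.
print(add(1, 2))
print(add(1.5, 2))
print(add("a", "b"))
```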

    > Or is it actually possible to do, but so much work that nobody does
    > it?
    >
    > I googled around, and I *think* it is because of the dynamic typing,
    > but I really don't understand why this would be an issue..


    Can you explain how to generate machine code to handle any possible
    object type that any Python user might ever create?
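
    For instance, in this sketch the class Money and the function total() are
    made up for illustration. The loop in total() is compiled once, but what
    "+" does inside it is decided by whatever objects show up at run time,
    including types the compiler has never seen.

```python
class Money:
    """Hypothetical user-defined type with its own meaning for '+'."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


def total(items, start):
    # "+" here means whatever the runtime type of "result" says it means.
    result = start
    for item in items:
        result = result + item
    return result


print(total([1, 2, 3], 0))
print(total([Money(150), Money(250)], Money(0)).cents)
```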

    --
    Grant Edwards               grant.b.edwards at gmail.com
    Yow! for ARTIFICIAL FLAVORING!!
    Grant Edwards, Mar 4, 2013
    #20