Re: Python is faster than C

Discussion in 'Python' started by Armin Rigo, Apr 3, 2004.

  1. Armin Rigo

    Armin Rigo Guest

    Hello Robert,

    On Sat, Apr 03, 2004 at 12:30:38PM -0800, Robert Brewer wrote:
    > > enumerate() should return a normal list, and
    > > it should be someone else's job to ensure that it is
    > > correctly optimized away if possible
    >
    > I'd like to think I'm not understanding your point, but you made it so
    > danged *clear*.
    >
    > Enumerate should absolutely *not* return a normal list.


    You missed my point indeed. There are two levels here: one is the language
    specification (the programmer's experience), and one is the CPython
    implementation. My point is that with some more cleverness in the
    implementation, iterators would be much less needed at the language
    specification level (I'm not saying never, I think generators are great, for
    example).

    > The use case I think you're missing is when I do not want the enumeration
    > optimized at all; I want it performed on-the-fly on purpose:


    This is what I mean by "optimized": done lazily, on-the-fly. I want better
    implementations of lists, callbacks-on-changes, static bytecode analysis, and
    more. I don't want another notion than lists at the language level. Your
    example:

    > for i, line in enumerate(file('40GB.csv')):


    is among the easiest to optimize, even if the language specification said that
    enumerate returns a list. I can think of several ways to do that. For
    example, because the result of enumerate() is only ever used in a for loop, the
    implementation knows it can internally return an iterator instead of the whole list. There
    are some difficulties, but nothing critical. Another option which is harder
    in CPython but which we are experimenting with in PyPy would be to return a
    Python object of type 'list' but with a different, lazy implementation.
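
A minimal sketch of the "list with a lazy implementation" idea, written in present-day Python 3 (which postdates this thread). The names `LazyList` and `numbered` are hypothetical, and this is not how CPython or PyPy implement lists. It only pulls items when they are asked for, but it still caches what it has read; dropping the cache safely would require the implementation to prove the object is traversed once and in order.

```python
from collections.abc import Sequence


class LazyList(Sequence):
    """List-like, read-only view over an iterable; items are pulled on demand.

    Hypothetical illustration only, not an existing CPython or PyPy type.
    """

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = []  # grows only as far as callers actually look

    def _fill(self, count=None):
        """Pull from the source until `count` items are cached (all if None)."""
        while count is None or len(self._cache) < count:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                break

    def __getitem__(self, index):
        if isinstance(index, slice) or index < 0:
            self._fill()            # slices and negative indices need the end
        else:
            self._fill(index + 1)   # otherwise read only as far as needed
        return self._cache[index]

    def __len__(self):
        self._fill()                # the length is only known after reading all
        return len(self._cache)


def numbered(lines):
    """Stand-in for an enumerate() that is specified to return a 'list'."""
    return LazyList(enumerate(lines))


pulled = []  # records which source items have really been produced


def source():
    for n in range(5):
        pulled.append(n)
        yield "line %d" % n


rows = numbered(source())
first = rows[0]
pulled_after_first = list(pulled)   # only one item has been read so far
total = len(rows)                   # this forces the rest
```

Iterating with `for i, line in rows:` goes through `__getitem__` with increasing indices, so the source is consumed one item at a time rather than up front.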

    > Forcing enumerate to return a list would drag not only the entire
    > 40GB.csv into memory, but also the entire set of i. Using an iterator in
    > this case instead of a list *is* the optimization.


    Yes, and I'm ranting against the idea that the programmer should be bothered
    about it, when it could be as efficient automatically. From the programmer's
    perspective, iterators are mostly like a sequence that you can only access
    once and in order. A better implementation can figure out for itself when you
    are only accessing this sequence once and in order. I mean, it is just like
    range(1000000) which is a list all right, but there is just no reason why this
    list should consume 4MB of CPython's memory when the same information can be
    encoded in a couple of ints as long as you don't change the list. The
    language doesn't need xrange() -- it is an implementation issue that shows up
    in the Python language.
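
For comparison, Python 3 (released years after this thread) moved range() in roughly this direction: it returns an immutable sequence object that stores only start, stop and step. It is not a list, though, so this is closer to xrange() taking over the name than to the transparent optimization argued for above. A small sketch of the difference in size:

```python
import sys

numbers = range(1000000)   # Python 3: lazy sequence holding start/stop/step
as_list = list(numbers)    # roughly what Python 2's range() built eagerly

lazy_size = sys.getsizeof(numbers)
# getsizeof() on a list counts the pointer array only, not the int objects.
list_size = sys.getsizeof(as_list)

print(lazy_size, list_size)

# The lazy object still supports the read-only parts of the list interface.
length = len(numbers)
item = numbers[123456]
found = 999999 in numbers
```

The exact byte counts depend on the interpreter build, so none are quoted here.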


    Armin
     
    Armin Rigo, Apr 3, 2004
    #1

  2. Armin Rigo wrote:

    >> Forcing enumerate to return a list would drag not only the entire
    >> 40GB.csv into memory, but also the entire set of i. Using an iterator
    >> in this case instead of a list *is* the optimization.
    >
    > Yes, and I'm ranting against the idea that the programmer should be
    > bothered about it, when it could be as efficient automatically. From
    > the programmer's perspective, iterators are mostly like a sequence
    > that you can only access once and in order. A better implementation
    > can figure out for itself when you are only accessing this sequence
    > once and in order.


    It seems bad to me to teach programmers to depend on such optimizations
    happening automatically. People will then sometimes depend on an
    optimization at a time when Python does not perform it, and the program
    will unexpectedly try to consume 4 GB of memory or whatever. This is
    especially likely if so many such optimizations are added that
    programmers lose track of which ones are performed when. Debugging such
    a problem will be no fun either, for the same reason.

    By all means use the same types and language constructs that already
    exist instead of heaping on new ones, but add a way to say 'optimize
    this!' and raise an exception if the construct is used in a way which
    prevents the optimization.
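
One way the 'optimize this!' marker could look at user level, as a hedged sketch: a wrapper that promises single-pass, in-order use and raises as soon as the promise is broken. The names `single_pass` and `SinglePassError` are hypothetical; nothing like this exists in the standard library.

```python
class SinglePassError(RuntimeError):
    """Raised when a use would require the full list to exist in memory."""


class single_pass:
    """Marks an iterable as 'must be consumed once, in order' (hypothetical)."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._started = False

    def __iter__(self):
        if self._started:
            raise SinglePassError("second pass would need a real list")
        self._started = True
        return self._source

    def __getitem__(self, index):
        raise SinglePassError("random access would need a real list")


lines = single_pass(["a", "b", "c"])

# Allowed: one pass, in order.
seen = [(i, line) for i, line in enumerate(lines)]

# Not allowed: indexing, or iterating a second time.
errors = []
for misuse in (lambda: lines[0], lambda: list(lines)):
    try:
        misuse()
    except SinglePassError as exc:
        errors.append(str(exc))
```

The point of the design is that the failure is immediate and local, instead of showing up later as unexpected memory use.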

    --
    Hallvard
     
    Hallvard B Furuseth, Apr 4, 2004
    #2

  3. Aahz

    Aahz Guest

    Armin Rigo wrote:
    >
    >You missed my point indeed. There are two levels here: one is the
    >language specification (the programmer's experience), and one is the
    >CPython implementation. My point is that with some more cleverness in
    >the implementation, iterators would be much less needed at the language
    >specification level (I'm not saying never, I think generators are
    >great, for example).


    Yes, exactly. Without generators, I'm not sure iterators would have
    taken off to the extent they have.

    >Yes, and I'm ranting against the idea that the programmer should be
    >bothered about it, when it could be as efficient automatically. From
    >the programmer's perspective, iterators are mostly like a sequence that
    >you can only access once and in order. A better implementation can
    >figure out for itself when you are only accessing this sequence once
    >and in order. I mean, it is just like range(1000000) which is a list
    >all right, but there is just no reason why this list should consume
    >4MB of CPython's memory when the same information can be encoded in
    >a couple of ints as long as you don't change the list. The language
    >doesn't need xrange() -- it is an implementation issue that shows up in
    >the Python language.


    While I'm generally in favor of what you're talking about, it seems to a
    certain extent that you're simply shifting complexity. Maintaining the
    simplicity of the Python VM is an important goal, I think, and some of
    your suggestions run counter to that goal.
    --
    Aahz () <*> http://www.pythoncraft.com/

    "usenet imitates usenet" --Darkhawk
     
    Aahz, Apr 5, 2004
    #3
  4. Matthias

    Matthias Guest

    Aahz writes:
    > While I'm generally in favor of what you're talking about, it seems to a
    > certain extent that you're simply shifting complexity. Maintaining the
    > simplicity of the Python VM is an important goal, I think, and some of
    > your suggestions run counter to that goal.


    Isn't the whole idea of very high level languages to shift complexity
    from the user code to the language implementation?

    That's not a rhetorical question: Why is it that "simplicity of the
    Python VM is an important goal"? I would guess that overall it pays
    to have a more complex language implementation and be rewarded by
    simpler user code: For any decent language there's much more user code
    out there than language implementation code.

    One example where Python in the past made (in my opinion, for my
    particular projects) the wrong choice is speed: People argued that
    "simplicity of the Python VM" is more important than speed gains. The
    result (for my code) was that after profiling, etc., I was coding
    significant parts of my programs in C. No productivity gain
    observed. With JIT compilation (psyco) this step might become
    unnecessary: More complex VM, greatly simplified user code.
     
    Matthias, Apr 5, 2004
    #4
  5. Matthias wrote:
    > That's not a rhetorical question: Why is it that "simplicity of the
    > Python VM is an important goal"?


    Replace 'simplicity' with 'portability'. This is especially true for JIT
    compilers, which are not only complex, but are unportable by design.

    > I would guess that overall it pays
    > to have a more complex language implementation and be rewarded by
    > simpler user code: For any decent language there's much more user code
    > out there than language implementation code.


    The question is not 'does it pay?', the question is 'who pays?'.

    Daniel
     
    Daniel Dittmar, Apr 5, 2004
    #5
  6. Matthias writes:

    > (Aahz) writes:
    > > While I'm generally in favor of what you're talking about, it seems to a
    > > certain extent that you're simply shifting complexity. Maintaining the
    > > simplicity of the Python VM is an important goal, I think, and some of
    > > your suggestions run counter to that goal.
    >
    > Isn't the whole idea of very high level languages to shift complexity
    > from the user code to the language implementation?
    >
    > That's not a rhetorical question: Why is it that "simplicity of the
    > Python VM is an important goal"?


    Well, possibly because it's easier to predict what a simple
    implementation is up to.

    Cheers,
    mwh

    --
    You sound surprised. We're talking about a government department
    here - they have procedures, not intelligence.
    -- Ben Hutchings, cam.misc
     
    Michael Hudson, Apr 5, 2004
    #6
  7. Joe Mason

    Joe Mason Guest

    Matthias wrote:
    > Isn't the whole idea of very high level languages to shift complexity
    > from the user code to the language implementation?


    Yes, but there's always a tradeoff to be made. Going through
    contortions in the VM to make user code only slightly simpler isn't
    worthwhile. Neither is going through contortions to optimize an
    extremely rare bit of user code.

    Also, a big part of keeping user code simple is making the language
    conceptually simple. If lots of optimizations are done behind the
    user's back, it can be confusing to remember which operations are
    already optimized.

    Take the example of storing large lists as a small pattern and a
    repetition counter. I'd argue that this is hard to get right at the VM
    level because you need to consider lots of cases - it's much easier for
    the user who knows exactly what patterns need to be optimized. I also
    doubt it will come up too often. Finally, a user might assume the
    optimization is more powerful than it is - for instance, noticing that
    huge lists of repeating numbers magically take little memory, they might
    fill a huge list in randomly and assume it will work.
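
A sketch of the user-level alternative mentioned above, where the programmer who knows the pattern stores it explicitly as a small pattern plus a repetition counter. The class name `RepeatedPattern` is hypothetical. It is deliberately read-only: supporting arbitrary writes would mean materializing the full list, which is exactly the case where the saving disappears.

```python
from collections.abc import Sequence


class RepeatedPattern(Sequence):
    """Read-only sequence equal to list(pattern) * count (hypothetical helper)."""

    def __init__(self, pattern, count):
        self._pattern = tuple(pattern)
        self._count = max(count, 0)   # like list * n, negative means empty

    def __len__(self):
        return len(self._pattern) * self._count

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self[i] for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        # Only the pattern is stored; the position wraps around it.
        return self._pattern[index % len(self._pattern)]


big = RepeatedPattern([0, 1], 500000)     # behaves like a 1,000,000-item list
last = big[999999]
second_to_last = big[-2]
head = big[:3]
matches_real_list = list(RepeatedPattern("ab", 3)) == list("ab") * 3
```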

    Joe
     
    Joe Mason, Apr 5, 2004
    #7
