I could use some help making this Python code run faster using only Python code.

Discussion in 'Python' started by Python Maniac, Sep 20, 2007.

  1. I am new to Python however I would like some feedback from those who
    know more about Python than I do at this time.

    import os
    from os import path

    def scrambleLine(line):
        s = ''
        for c in line:
            s += chr(ord(c) | 0x80)
        return s

    def descrambleLine(line):
        s = ''
        for c in line:
            s += chr(ord(c) & 0x7f)
        return s

    def scrambleFile(fname,action=1):
        if (path.exists(fname)):
            try:
                f = open(fname, "r")
                toks = fname.split('.')
                while (len(toks) > 2):
                    toks.pop()
                fname = '.'.join(toks)
                if (action == 1):
                    _fname = fname + '.scrambled'
                elif (action == 0):
                    _fname = fname + '.descrambled'
                if (path.exists(_fname)):
                    os.remove(_fname)
                ff = open(_fname, "w+")
                if (action == 1):
                    for l in f:
                        ff.write(scrambleLine(l))
                elif (action == 0):
                    for l in f:
                        ff.write(descrambleLine(l))
            except Exception, details:
                print 'ERROR :: (%s)' % details
            finally:
                f.close()
                ff.close()
        else:
            print 'WARNING :: Missing file "%s" - cannot continue.' % fname
     
    Python Maniac, Sep 20, 2007
    #1

  2. Python Maniac

    Paul Hankin Guest

    On Sep 20, 10:59 pm, Python Maniac <> wrote:
    > I am new to Python however I would like some feedback from those who
    > know more about Python than I do at this time.
    >
    > def scrambleLine(line):
    >     s = ''
    >     for c in line:
    >         s += chr(ord(c) | 0x80)
    >     return s
    >
    > def descrambleLine(line):
    >     s = ''
    >     for c in line:
    >         s += chr(ord(c) & 0x7f)
    >     return s
    > ...


    Well, scrambleLine will remove line-endings, so when you're descrambling
    you'll be processing the entire file at once. This is particularly bad
    because of the way your functions work, adding a character at a time to s.

    Probably your easiest bet is to iterate over the file using read(N)
    for some small N rather than doing a line at a time. Something like:

    process_bytes = (descrambleLine, scrambleLine)[action]
    while 1:
        r = f.read(16)
        if not r: break
        ff.write(process_bytes(r))

    In general, rather than building strings by starting with an empty
    string and repeatedly adding to it, you should use ''.join(...)

    For instance...
    def descrambleLine(line):
        return ''.join(chr(ord(c) & 0x7f) for c in line)

    def scrambleLine(line):
        return ''.join(chr(ord(c) | 0x80) for c in line)

    It's less code, more readable and faster!
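
    Putting the two suggestions together, a rough sketch of the whole loop
    (untested; the function and parameter names are arbitrary, Python 2 and
    binary-mode files assumed):

    def scrambleLine(chunk):
        return ''.join(chr(ord(c) | 0x80) for c in chunk)

    def descrambleLine(chunk):
        return ''.join(chr(ord(c) & 0x7f) for c in chunk)

    def process_file(inname, outname, action=1, chunksize=4096):
        # Pick the transform once, outside the loop, then stream the
        # file through it in fixed-size chunks instead of line by line.
        process_bytes = (descrambleLine, scrambleLine)[action]
        f = open(inname, 'rb')
        ff = open(outname, 'wb')
        try:
            while 1:
                r = f.read(chunksize)
                if not r:
                    break
                ff.write(process_bytes(r))
        finally:
            f.close()
            ff.close()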

    --
    Paul Hankin
     
    Paul Hankin, Sep 20, 2007
    #2

  3. Re: I could use some help making this Python code run faster using only Python code.

    Python Maniac wrote:
    > I am new to Python however I would like some feedback from those who
    > know more about Python than I do at this time.
    >
    > def scrambleLine(line):
    >     s = ''
    >     for c in line:
    >         s += chr(ord(c) | 0x80)
    >     return s
    >
    > def descrambleLine(line):
    >     s = ''
    >     for c in line:
    >         s += chr(ord(c) & 0x7f)
    >     return s


    These might benefit from using a lookup dictionary that maps each
    character to the outcome of the operation. Actually, it becomes twice
    as fast:

    import time

    lt = {}
    for i in xrange(256):
        lt[chr(i)] = chr(i | 0x80)

    def scrambleLine(line):
        s = ''
        for c in line:
            s += chr(ord(c) | 0x80)
        return s

    def scrambleLineLU(line):
        s = ''
        for c in line:
            s += lt[c]
        return s


    if __name__ == "__main__":
        line = "abcdefghijklmnop" * 1000000
        start = time.time()
        scrambleLine(line)
        print time.time() - start

        start = time.time()
        scrambleLineLU(line)
        print time.time() - start



    Diez
     
    Diez B. Roggisch, Sep 20, 2007
    #3
  4. Re: I could use some help making this Python code run faster using only Python code.

    On 9/20/07, Python Maniac <> wrote:
    > I am new to Python however I would like some feedback from those who
    > know more about Python than I do at this time.


    Well, you could save some time by not applying the scramble one line
    at a time (that is if you don't mind losing the line endings in the
    scrambled version). For that to be effective though, you probably want
    to open in binary mode. Also, your scramble can be written using list
    comprehension.

    Code:
    def scramble(s, key=0x80):
       return ''.join([chr(ord(c) ^ key) for c in s])
    
    output = scramble(f.read())
    
    If you use xor (^) as above, you can use the same method for scramble
    as descramble (running it again with the same key will descramble) and
    you can use an arbitrary key. Though, with 255 combinations, it isn't
    very strong encryption.
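
    A quick round-trip check (a rough sketch, not part of the suggestion
    above) shows why one function handles both directions: XORing twice
    with the same key gives back the original byte.

    def scramble(s, key=0x80):
        return ''.join([chr(ord(c) ^ key) for c in s])

    if __name__ == '__main__':
        text = 'hello world'
        once = scramble(text)       # bit 7 of every byte is flipped
        twice = scramble(once)      # flipped back: x ^ k ^ k == x
        assert twice == text
        print repr(once)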

    If you want stronger encryption you can use the following AESish algorithm:

    Code:
    import random
    def scramble(s, key):
        random.seed(key)
        return ''.join([chr(ord(c) ^ random.randint(0,255)) for c in s])
    
    This allows you to use much larger keys, but with a similar effect.
    Still not strong enough to be unbreakable, but much better than the
    original. It is strong enough that someone knowing how you scrambled
    it will have trouble unscrambling it even if they don't know the key.
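
    A sketch of the round trip (assuming both ends run the same Python
    version, since the keystream random produces for a given seed is only
    guaranteed within one implementation):

    import random

    def scramble(s, key):
        # Re-seeding with the same key regenerates the identical
        # keystream, so one function both scrambles and descrambles.
        random.seed(key)
        return ''.join([chr(ord(c) ^ random.randint(0, 255)) for c in s])

    if __name__ == '__main__':
        secret = scramble('attack at dawn', key=12345)
        print repr(secret)
        print scramble(secret, key=12345)   # prints 'attack at dawn'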

    Matt
     
    Matt McCredie, Sep 20, 2007
    #4
  5. Python Maniac

    Tim Williams Guest

    Re: I could use some help making this Python code run faster using only Python code.

    On 20/09/2007, Python Maniac <> wrote:
    > I am new to Python however I would like some feedback from those who
    > know more about Python than I do at this time.
    >
    > def scrambleLine(line):
    >     s = ''
    >     for c in line:
    >         s += chr(ord(c) | 0x80)
    >     return s
    >
    > def descrambleLine(line):
    >     s = ''
    >     for c in line:
    >         s += chr(ord(c) & 0x7f)
    >     return ''.join( [chr(ord(c) & 0x7f) for c in line] )
    >
    > def scrambleFile(fname,action=1):
    >     if (path.exists(fname)):
    >         try:
    >             f = open(fname, "r")
    >             toks = fname.split('.')
    >             while (len(toks) > 2):
    >                 toks.pop()
    >             fname = '.'.join(toks)
    >             if (action == 1):
    >                 _fname = fname + '.scrambled'
    >             elif (action == 0):
    >                 _fname = fname + '.descrambled'
    >             if (path.exists(_fname)):
    >                 os.remove(_fname)
    >             ff = open(_fname, "w+")
    >             if (action == 1):
    >                 for l in f:
    >                     ff.write(scrambleLine(l))
    >             elif (action == 0):
    >                 for l in f:
    >                     ff.write(descrambleLine(l))
    >         except Exception, details:
    >             print 'ERROR :: (%s)' % details
    >         finally:
    >             f.close()
    >             ff.close()
    >     else:
    >         print 'WARNING :: Missing file "%s" - cannot continue.' % fname
    >
    > --



    def scrambleLine(line):
        return ''.join( [chr(ord(c) | 0x80) for c in line] )

    def descrambleLine(line):
        return ''.join( [chr(ord(c) & 0x7f) for c in line] )

    def scrambleFile(fname,action=1):
        try:
            f = open(fname, "r")
            fname = '.'.join(fname.split('.')[:2])
            if action:
                _fname = fname + '.scrambled'
            else:
                _fname = fname + '.descrambled'
            ff = open(_fname, "w")
            if action:
                ff.write('\r\n.join([scrambleLine(l) for l in f ]))
            else :
                ff.write('\r\n.join([descrambleLine(l) for l in f ]))
            f.close()
            ff.close()
        except Exception, details:
            print 'ERROR :: (%s)' % details

    HTH :)
     
    Tim Williams, Sep 21, 2007
    #5
  6. Python Maniac

    Tim Williams Guest

    Re: I could use some help making this Python code run faster using only Python code.

    On 21/09/2007, Tim Williams <> wrote:
    > On 20/09/2007, Python Maniac <> wrote:
    > > I am new to Python however I would like some feedback from those who
    > > know more about Python than I do at this time.
    > >
    > > def scrambleLine(line):
    > >     s = ''
    > >     for c in line:
    > >         s += chr(ord(c) | 0x80)
    > >     return s
    > >
    > > def descrambleLine(line):
    > >     s = ''
    > >     for c in line:
    > >         s += chr(ord(c) & 0x7f)
    > >     return ''.join( [chr(ord(c) & 0x7f) for c in line] )
    > >
    > > def scrambleFile(fname,action=1):
    > >     if (path.exists(fname)):
    > >         try:
    > >             f = open(fname, "r")
    > >             toks = fname.split('.')
    > >             while (len(toks) > 2):
    > >                 toks.pop()
    > >             fname = '.'.join(toks)
    > >             if (action == 1):
    > >                 _fname = fname + '.scrambled'
    > >             elif (action == 0):
    > >                 _fname = fname + '.descrambled'
    > >             if (path.exists(_fname)):
    > >                 os.remove(_fname)
    > >             ff = open(_fname, "w+")
    > >             if (action == 1):
    > >                 for l in f:
    > >                     ff.write(scrambleLine(l))
    > >             elif (action == 0):
    > >                 for l in f:
    > >                     ff.write(descrambleLine(l))
    > >         except Exception, details:
    > >             print 'ERROR :: (%s)' % details
    > >         finally:
    > >             f.close()
    > >             ff.close()
    > >     else:
    > >         print 'WARNING :: Missing file "%s" - cannot continue.' % fname
    > >
    > > --

    >
    >
    > def scrambleLine(line):
    >     return ''.join( [chr(ord(c) | 0x80) for c in line] )
    >
    > def descrambleLine(line):
    >     return ''.join( [chr(ord(c) & 0x7f) for c in line] )
    >
    > def scrambleFile(fname,action=1):
    >     try:
    >         f = open(fname, "r")
    >         fname = '.'.join(fname.split('.')[:2])
    >         if action:
    >             _fname = fname + '.scrambled'
    >         else:
    >             _fname = fname + '.descrambled'
    >         ff = open(_fname, "w")
    >         if action:
    >             ff.write('\r\n.join([scrambleLine(l) for l in f ]))
    >         else :
    >             ff.write('\r\n.join([descrambleLine(l) for l in f ]))
    >         f.close()
    >         ff.close()
    >     except Exception, details:
    >         print 'ERROR :: (%s)' % details
    >
    > HTH :)
    >


    or maybe even this:

    Apologies for the self-reply, it's late here!
    (a couple of typos fixed too!)

    def scrambleFile(fname,action=1):
        try:
            f = open(fname, "r")
            fname = '.'.join(fname.split('.')[:2])
            if action:
                ff = open(fname + '.scrambled', "w")
                ff.write('\r\n'.join([scrambleLine(l) for l in f ]))
            else :
                ff = open(fname + '.descrambled', "w")
                ff.write('\r\n'.join([descrambleLine(l) for l in f ]))
            f.close()
            ff.close()

        except Exception, details:
            print 'ERROR :: (%s)' % details

    :)
     
    Tim Williams, Sep 21, 2007
    #6
  7. On Sep 20, 3:57 pm, "Matt McCredie" <> wrote:
    > On 9/20/07, Python Maniac <> wrote:
    >
    > > I am new to Python however I would like some feedback from those who
    > > know more about Python than I do at this time.

    >
    > Well, you could save some time by not applying the scramble one line
    > at a time (that is if you don't mind losing the line endings in the
    > scrambled version). For that to be effective though, you probably want
    > to open in binary mode. Also, your scramble can be written using list
    > comprehension.
    >
    >
    Code:
    > def scramble(s, key=0x80):
    >    return ''.join([chr(ord(c) ^ key) for c in s])
    >
    > output = scramble(f.read())
    > 
    >
    > If you use xor (^) as above, you can use the same method for scramble
    > as descramble (running it again with the same key will descramble) and
    > you can use an arbitrary key. Though, with 255 combinations, it isn't
    > very strong encryption.
    >
    > If you want stronger encryption you can use the following AESish algorithm:
    >
    >
    Code:
    > import random
    > def scramble(s, key):
    >     random.seed(key)
    >     return ''.join([chr(ord(c) ^ random.randint(0,255)) for c in s])
    > 
    >
    > This allows you to use much larger keys, but with a similar effect.
    > Still not strong enough to be unbreakable, but much better than the
    > original. It is strong enough that someone knowing how you scrambled
    > it will have trouble unscrambling it even if they don't know the key.
    >
    > Matt


    So far I like what was said in this reply; however, my ultimate goal is
    to allow the scramble method to be more than what is shown above.

    I considered using XOR; however, XOR only works for the case shown above,
    where the scramble method simply sets or clears the MSB.

    What I want to be able to do is set or reset the MSB in addition to
    performing a series of additional steps, without negatively impacting
    performance in the latter cases, where the MSB is not the only technique
    being employed.

    Hopefully this sheds more light on the goal I have in mind.
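
    One way to keep those extra steps cheap would be to compose them all
    into a single 256-entry table built once up front, so the per-character
    work stays a single lookup no matter how many steps are layered on. A
    rough sketch (the two example steps are invented for illustration and,
    like the original code, assume 7-bit ASCII input):

    def build_table(steps):
        # Run every step over each possible byte value once, up front.
        table = []
        for i in range(256):
            b = i
            for step in steps:
                b = step(b) & 0xff
            table.append(chr(b))
        return ''.join(table)   # 256-char string, usable with str.translate

    set_msb = lambda b: b | 0x80     # step 1: set the MSB
    mix_low = lambda b: b ^ 0x2a     # step 2: stir the low bits a little

    scramble_table = build_table([set_msb, mix_low])

    def scrambleLine(line):
        # The per-character work is now a single C-level table lookup.
        return line.translate(scramble_table)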

    BTW - My original code is able to scramble a 20 MB file in roughly 40
    secs using a Dell E6600 2.4 GHz. When I began writing the code for
    this problem my best runtime was about 65 secs so I know I was heading
    in the right direction.

    I was about to begin the process of using the D Language and pyd to
    gain better performance, but then I thought I might also take this
    opportunity to learn something about Python before delving into D.
    Obviously I could simply code the whole process in D, but that defeats
    the purpose of using Python in the first place, so I would tend to
    limit my low-level coding to the task of scrambling each line of text.

    The ironic thing about this exercise was that the optimization
    techniques that worked for Python actually decreased performance in the
    Ruby version I coded. It seems Ruby has some definite ideas about what
    it considers optimal, and they don't have much to do with what I would
    call traditional optimization techniques. For instance, Ruby treated
    the putc method as less optimal than the write method, which makes no
    sense to me, but that's life with Ruby.
     
    Python Maniac, Sep 21, 2007
    #7
  8. Python Maniac

    Guest

    On Sep 20, 5:46 pm, Paul Hankin <> wrote:
    > On Sep 20, 10:59 pm, Python Maniac <> wrote:
    >
    > > I am new to Python however I would like some feedback from those who
    > > know more about Python than I do at this time.

    >
    > > def scrambleLine(line):
    > >     s = ''
    > >     for c in line:
    > >         s += chr(ord(c) | 0x80)
    > >     return s

    >
    > > def descrambleLine(line):
    > >     s = ''
    > >     for c in line:
    > >         s += chr(ord(c) & 0x7f)
    > >     return s
    > > ...

    >
    > Well, scrambleLine will remove line-endings, so when you're
    > descrambling
    > you'll be processing the entire file at once. This is particularly bad
    > because of the way your functions work, adding a character at a time
    > to
    > s.
    >
    > Probably your easiest bet is to iterate over the file using read(N)
    > for some small N rather than doing a line at a time. Something like:
    >
    > process_bytes = (descrambleLine, scrambleLine)[action]
    > while 1:
    >     r = f.read(16)
    >     if not r: break
    >     ff.write(process_bytes(r))
    >
    > In general, rather than building strings by starting with an empty
    > string and repeatedly adding to it, you should use ''.join(...)
    >
    > For instance...
    > def descrambleLine(line):
    >     return ''.join(chr(ord(c) & 0x7f) for c in line)
    >
    > def scrambleLine(line):
    >     return ''.join(chr(ord(c) | 0x80) for c in line)
    >
    > It's less code, more readable and faster!


    I would have thought that also from what I've heard here.

    def scrambleLine(line):
        s = ''
        for c in line:
            s += chr(ord(c) | 0x80)
        return s

    def scrambleLine1(line):
        return ''.join([chr(ord(c) | 0x80) for c in line])

    if __name__=='__main__':
        from timeit import Timer
        t = Timer("scrambleLine('abcdefghijklmnopqrstuvwxyz')",
                  "from __main__ import scrambleLine")
        print t.timeit()

    ## scrambleLine
    ## 13.0013366039
    ## 12.9461998318
    ##
    ## scrambleLine1
    ## 14.4514098748
    ## 14.3594400695

    How come it's not? Then I noticed you don't have brackets in
    the join statement. So I tried without them and got

    ## 17.6010847978
    ## 17.6111472418

    Am I doing something wrong?

    >
    > --
    > Paul Hankin
     
    , Sep 21, 2007
    #8
  9. Re: I could use some help making this Python code run faster using only Python code.

    Python Maniac wrote:
    > I am new to Python however I would like some feedback from those who
    > know more about Python than I do at this time.
    >
    > def scrambleLine(line):
    >     s = ''
    >     for c in line:
    >         s += chr(ord(c) | 0x80)
    >     return s
    >
    > def descrambleLine(line):
    >     s = ''
    >     for c in line:
    >         s += chr(ord(c) & 0x7f)
    >     return s
    >


    Try using str.translate instead - it's usually much faster than a pure python loop:
    >>> scramble = "".join(chr(c | 0x80) for c in range(256))
    >>> "source text".translate(scramble)

    '\xf3\xef\xf5\xf2\xe3\xe5\xa0\xf4\xe5\xf8\xf4'
    >>> descramble = "".join(chr(c & 0x7F) for c in range(256))
    >>> '\xf3\xef\xf5\xf2\xe3\xe5\xa0\xf4\xe5\xf8\xf4'.translate(descramble)

    'source text'
    >>>


    You might then do the translation inline e.g., untested:

    > def scrambleFile(fname,action=1):

    translators = {0: descramble, 1: scramble} # defined as above
    try:
        translation_action = translators[action]
    except KeyError:
        raise ValueError("action must be 0 or 1")

    > if (path.exists(fname)):

        ....
        ....
        for line in f:
            ff.write(line.translate(translation_action))
        ....
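
    For reference, here is one way the pieces might fit together end to
    end (a sketch only; the file naming and the warning message follow the
    original post rather than anything tested here):

    from os import path

    scramble = ''.join(chr(c | 0x80) for c in range(256))
    descramble = ''.join(chr(c & 0x7f) for c in range(256))

    def scrambleFile(fname, action=1):
        translators = {0: descramble, 1: scramble}
        try:
            table = translators[action]
        except KeyError:
            raise ValueError("action must be 0 or 1")
        if not path.exists(fname):
            print 'WARNING :: Missing file "%s" - cannot continue.' % fname
            return
        outname = fname + ('.scrambled' if action else '.descrambled')
        f = open(fname, 'rb')
        ff = open(outname, 'wb')
        try:
            # Each line (newline byte included) goes through one C-level
            # translate call instead of a Python character loop.
            for line in f:
                ff.write(line.translate(table))
        finally:
            f.close()
            ff.close()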

    HTH
    Michael
     
    Michael Spencer, Sep 21, 2007
    #9
  10. Python Maniac

    Paul Rubin Guest

    Python Maniac <> writes:
    > I am new to Python however I would like some feedback from those who
    > know more about Python than I do at this time.


    Use the array module and do 32-bit or 64-bit operations with it.
    See http://nightsong.com/phr/crypto/p3.py for a more serious
    encryption module written that way.
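
    A rough sketch of that idea (not taken from p3.py; the typecode and
    padding handling here are guesses, and Python 2 strings are assumed):

    import array

    def scramble_words(data, typecode='L'):
        # Pack the bytes into machine words and OR a repeated 0x80 mask
        # into each word, so the Python-level loop runs once per word
        # rather than once per character.
        itemsize = array.array(typecode).itemsize
        pad = (-len(data)) % itemsize
        words = array.array(typecode, data + '\x00' * pad)
        mask = int('80' * itemsize, 16)       # 0x80808080 for 4-byte words
        for i in xrange(len(words)):
            words[i] |= mask
        return words.tostring()[:len(data)]   # drop the padding again

    Descrambling works the same way with a repeated 0x7f mask ANDed in
    instead.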
     
    Paul Rubin, Sep 21, 2007
    #10
  11. Python Maniac

    Ian Clark Guest

    Re: I could use some help making this Python code run faster using only Python code.

    wrote:
    > On Sep 20, 5:46 pm, Paul Hankin <> wrote:
    >> On Sep 20, 10:59 pm, Python Maniac <> wrote:
    >>
    >>> I am new to Python however I would like some feedback from those who
    >>> know more about Python than I do at this time.
    >>> def scrambleLine(line):
    >>>     s = ''
    >>>     for c in line:
    >>>         s += chr(ord(c) | 0x80)
    >>>     return s
    >>> def descrambleLine(line):
    >>>     s = ''
    >>>     for c in line:
    >>>         s += chr(ord(c) & 0x7f)
    >>>     return s
    >>> ...

    >> Well, scrambleLine will remove line-endings, so when you're
    >> descrambling
    >> you'll be processing the entire file at once. This is particularly bad
    >> because of the way your functions work, adding a character at a time
    >> to
    >> s.
    >>
    >> Probably your easiest bet is to iterate over the file using read(N)
    >> for some small N rather than doing a line at a time. Something like:
    >>
    >> process_bytes = (descrambleLine, scrambleLine)[action]
    >> while 1:
    >>     r = f.read(16)
    >>     if not r: break
    >>     ff.write(process_bytes(r))
    >>
    >> In general, rather than building strings by starting with an empty
    >> string and repeatedly adding to it, you should use ''.join(...)
    >>
    >> For instance...
    >> def descrambleLine(line):
    >>     return ''.join(chr(ord(c) & 0x7f) for c in line)
    >>
    >> def scrambleLine(line):
    >>     return ''.join(chr(ord(c) | 0x80) for c in line)
    >>
    >> It's less code, more readable and faster!

    >
    > I would have thought that also from what I've heard here.
    >
    > def scrambleLine(line):
    >     s = ''
    >     for c in line:
    >         s += chr(ord(c) | 0x80)
    >     return s
    >
    > def scrambleLine1(line):
    >     return ''.join([chr(ord(c) | 0x80) for c in line])
    >
    > if __name__=='__main__':
    >     from timeit import Timer
    >     t = Timer("scrambleLine('abcdefghijklmnopqrstuvwxyz')",
    >               "from __main__ import scrambleLine")
    >     print t.timeit()
    >
    > ## scrambleLine
    > ## 13.0013366039
    > ## 12.9461998318
    > ##
    > ## scrambleLine1
    > ## 14.4514098748
    > ## 14.3594400695
    >
    > How come it's not? Then I noticed you don't have brackets in
    > the join statement. So I tried without them and got
    >
    > ## 17.6010847978
    > ## 17.6111472418
    >
    > Am I doing something wrong?
    >
    >> --
    >> Paul Hankin

    >
    >


    I got similar results as well. I believe the reason for join actually
    performing slower is because join iterates twice over the sequence. [1]
    The first time is to determine the size of the buffer to allocate and
    the second is to populate the buffer.

    Ian

    [1] http://mail.python.org/pipermail/python-list/2007-September/458119.html
     
    Ian Clark, Sep 21, 2007
    #11
  12. On Sep 20, 7:13 pm, "" <> wrote:

    > How come it's not? Then I noticed you don't have brackets in
    > the join statement. So I tried without them and got


    If memory serves me right, newer versions of Python will recognize and
    optimize string concatenation via the += operator, so the advice to
    use join does not apply.

    i.
     
    Istvan Albert, Sep 21, 2007
    #12
  13. On Sep 20, 7:13 pm, "" <> wrote:
    > On Sep 20, 5:46 pm, Paul Hankin <> wrote:
    >
    >
    >
    > > On Sep 20, 10:59 pm, Python Maniac <> wrote:

    >
    > > > I am new to Python however I would like some feedback from those who
    > > > know more about Python than I do at this time.

    >
    > > > def scrambleLine(line):
    > > >     s = ''
    > > >     for c in line:
    > > >         s += chr(ord(c) | 0x80)
    > > >     return s

    >
    > > > def descrambleLine(line):
    > > >     s = ''
    > > >     for c in line:
    > > >         s += chr(ord(c) & 0x7f)
    > > >     return s
    > > > ...

    >
    > > Well, scrambleLine will remove line-endings, so when you're
    > > descrambling
    > > you'll be processing the entire file at once. This is particularly bad
    > > because of the way your functions work, adding a character at a time
    > > to
    > > s.

    >
    > > Probably your easiest bet is to iterate over the file using read(N)
    > > for some small N rather than doing a line at a time. Something like:

    >
    > > process_bytes = (descrambleLine, scrambleLine)[action]
    > > while 1:
    > >     r = f.read(16)
    > >     if not r: break
    > >     ff.write(process_bytes(r))

    >
    > > In general, rather than building strings by starting with an empty
    > > string and repeatedly adding to it, you should use ''.join(...)

    >
    > > For instance...
    > > def descrambleLine(line):
    > >     return ''.join(chr(ord(c) & 0x7f) for c in line)

    >
    > > def scrambleLine(line):
    > >     return ''.join(chr(ord(c) | 0x80) for c in line)

    >
    > > It's less code, more readable and faster!

    >
    > I would have thought that also from what I've heard here.
    >
    > def scrambleLine(line):
    >     s = ''
    >     for c in line:
    >         s += chr(ord(c) | 0x80)
    >     return s
    >
    > def scrambleLine1(line):
    >     return ''.join([chr(ord(c) | 0x80) for c in line])
    >
    > if __name__=='__main__':
    >     from timeit import Timer
    >     t = Timer("scrambleLine('abcdefghijklmnopqrstuvwxyz')",
    >               "from __main__ import scrambleLine")
    >     print t.timeit()
    >
    > ## scrambleLine
    > ## 13.0013366039
    > ## 12.9461998318
    > ##
    > ## scrambleLine1
    > ## 14.4514098748
    > ## 14.3594400695
    >
    > How come it's not? Then I noticed you don't have brackets in
    > the join statement. So I tried without them and got
    >
    > ## 17.6010847978
    > ## 17.6111472418
    >
    > Am I doing something wrong?



    It has to do with the input string length; try multiplying it by 10 or
    100. Below is a more complete benchmark; for largish strings, the imap
    version is the fastest among those using the original algorithm. Of
    course using a lookup table as Diez showed is even faster. FWIW, here
    are some timings (Python 2.5, WinXP):

    scramble: 1.818
    scramble_listcomp: 1.492
    scramble_gencomp: 1.535
    scramble_map: 1.377
    scramble_imap: 1.332
    scramble_dict: 0.817
    scramble_dict_map: 0.419
    scramble_dict_imap: 0.410

    And the benchmark script:

    from itertools import imap

    def scramble(line):
        s = ''
        for c in line:
            s += chr(ord(c) | 0x80)
        return s

    def scramble_listcomp(line):
        return ''.join([chr(ord(c) | 0x80) for c in line])

    def scramble_gencomp(line):
        return ''.join(chr(ord(c) | 0x80) for c in line)

    def scramble_map(line):
        return ''.join(map(chr, map(0x80.__or__, map(ord,line))))

    def scramble_imap(line):
        return ''.join(imap(chr, imap(0x80.__or__,imap(ord,line))))


    scramble_table = dict((chr(i), chr(i | 0x80)) for i in xrange(256))

    def scramble_dict(line):
        s = ''
        for c in line:
            s += scramble_table[c]
        return s

    def scramble_dict_map(line):
        return ''.join(map(scramble_table.__getitem__, line))

    def scramble_dict_imap(line):
        return ''.join(imap(scramble_table.__getitem__, line))


    if __name__=='__main__':
        funcs = [scramble, scramble_listcomp, scramble_gencomp,
                 scramble_map, scramble_imap,
                 scramble_dict, scramble_dict_map, scramble_dict_imap]
        s = 'abcdefghijklmnopqrstuvwxyz' * 100
        assert len(set(f(s) for f in funcs)) == 1
        from timeit import Timer
        setup = "import __main__; line = %r" % s
        for name in (f.__name__ for f in funcs):
            timer = Timer("__main__.%s(line)" % name, setup)
            print '%s:\t%.3f' % (name, min(timer.repeat(3,1000)))


    George
     
    George Sakkis, Sep 21, 2007
    #13
  14. Python Maniac

    Duncan Booth Guest

    George Sakkis <> wrote:

    > It has to do with the input string length; try multiplying it by 10 or
    > 100. Below is a more complete benchmark; for largish strings, the imap
    > version is the fastest among those using the original algorithm. Of
    > course using a lookup table as Diez showed is even faster. FWIW, here
    > are some timings (Python 2.5, WinXP):
    >
    > scramble: 1.818
    > scramble_listcomp: 1.492
    > scramble_gencomp: 1.535
    > scramble_map: 1.377
    > scramble_imap: 1.332
    > scramble_dict: 0.817
    > scramble_dict_map: 0.419
    > scramble_dict_imap: 0.410


    I added another one:

    import string
    scramble_translation = string.maketrans(
        ''.join(chr(i) for i in xrange(256)),
        ''.join(chr(i | 0x80) for i in xrange(256)))
    def scramble_translate(line):
        return string.translate(line, scramble_translation)

    ....
    funcs = [scramble, scramble_listcomp, scramble_gencomp,
             scramble_map, scramble_imap,
             scramble_dict, scramble_dict_map, scramble_dict_imap,
             scramble_translate
             ]


    and I think I win:

    scramble: 1.949
    scramble_listcomp: 1.439
    scramble_gencomp: 1.455
    scramble_map: 1.470
    scramble_imap: 1.546
    scramble_dict: 0.914
    scramble_dict_map: 0.415
    scramble_dict_imap: 0.416
    scramble_translate: 0.007
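
    For whole files that fit comfortably in memory (a 20 MB file does),
    the same table can be applied to the entire contents in one call; a
    sketch, with the table built as above:

    import string

    scramble_translation = string.maketrans(
        ''.join(chr(i) for i in xrange(256)),
        ''.join(chr(i | 0x80) for i in xrange(256)))

    def scramble_file(src, dst):
        f = open(src, 'rb')
        data = f.read()
        f.close()
        ff = open(dst, 'wb')
        ff.write(data.translate(scramble_translation))  # one C-level pass
        ff.close()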
     
    Duncan Booth, Sep 21, 2007
    #14
  15. On Sep 21, 12:56 am, Duncan Booth <>
    wrote:
    > George Sakkis <> wrote:
    > > It has to do with the input string length; try multiplying it by 10 or
    > > 100. Below is a more complete benchmark; for largish strings, the imap
    > > version is the fastest among those using the original algorithm. Of
    > > course using a lookup table as Diez showed is even faster. FWIW, here
    > > are some timings (Python 2.5, WinXP):

    >
    > > scramble: 1.818
    > > scramble_listcomp: 1.492
    > > scramble_gencomp: 1.535
    > > scramble_map: 1.377
    > > scramble_imap: 1.332
    > > scramble_dict: 0.817
    > > scramble_dict_map: 0.419
    > > scramble_dict_imap: 0.410

    >
    > I added another one:
    >
    > import string
    > scramble_translation = string.maketrans(
    >     ''.join(chr(i) for i in xrange(256)),
    >     ''.join(chr(i | 0x80) for i in xrange(256)))
    > def scramble_translate(line):
    >     return string.translate(line, scramble_translation)
    >
    > ...
    > funcs = [scramble, scramble_listcomp, scramble_gencomp,
    >          scramble_map, scramble_imap,
    >          scramble_dict, scramble_dict_map, scramble_dict_imap,
    >          scramble_translate
    >          ]
    >
    > and I think I win:
    >
    > scramble: 1.949
    > scramble_listcomp: 1.439
    > scramble_gencomp: 1.455
    > scramble_map: 1.470
    > scramble_imap: 1.546
    > scramble_dict: 0.914
    > scramble_dict_map: 0.415
    > scramble_dict_imap: 0.416
    > scramble_translate: 0.007


    Wow !

    Now I am very impressed with Python !

    The difference between where I began (70.155 secs) and where we ended
    (2.278 secs) is a whopping 30.8x speedup, achieved with some rather
    simple techniques that are nothing more than variations on the theme
    of hoisting work out of loops, along with some very powerful iterator
    functions from Python.

    My best runtime with Ruby using the same machine and OS was 67.797
    secs, which is 29.8x slower than the fastest Python runtime - roughly
    the same factor by which the Python code was sped up. The irony with
    Ruby was that the use of a hash actually made the Ruby code run slower
    than when a hash was not used.

    Now I think I will code this little scrambler using nothing but the D
    Language just to see whether there is any benefit in using D over
    Python for this sort of problem.
     
    Python Maniac, Sep 21, 2007
    #15
  16. On Sep 21, 1:00 pm, Python Maniac <> wrote:

    > My best runtime with Ruby using the same machine and OS was 67.797
    > secs which is 29.8x slower than the fastest Python runtime. This
    > makes Ruby almost as slow as Python was made faster. The irony with
    > Ruby was that the use of a hash in Ruby actually made the Ruby code
    > run slower than when a hash was not used.


    I'm not familiar with Ruby, but chances are that if you post at
    c.l.ruby you'll get some good suggestions on how to speed up your Ruby
    code too (although the difference may not be so dramatic).

    > Now I think I will code this little scrambler using nothing but the D
    > Language just to see whether there is any benefit in using D over
    > Python for this sort of problem.


    And then you'll do it in assembly to see how much you gain compared
    to D? If this is not just for learning purposes, perhaps you should
    be thinking of it the other way around: how fast is fast enough?
    Optimizing just for the sake of optimization, without a specific
    performance goal in mind, is not the most productive way to do things
    (again, leaving learning motivations aside).

    George
     
    George Sakkis, Sep 21, 2007
    #16
  17. Re: I could use some help making this Python code run faster using only Python code.

    > Now I think I will code this little scrambler using nothing but the D
    > Language just to see whether there is any benefit in using D over
    > Python for this sort of problem.


    Isn't D compiled to machine code? I would expect it to win hands down.
    That is, unless it is horribly unoptimized.

    Matt
     
    Matt McCredie, Sep 21, 2007
    #17
  18. On Sep 21, 3:02 pm, "Matt McCredie" <> wrote:
    > > Now I think I will code this little scrambler using nothing but the D
    > > Language just to see whether there is any benefit in using D over
    > > Python for this sort of problem.

    >
    > Isn't D compiled to machine code? I would expect it to win hands down.
    > That is, unless it is horribly unoptimized.
    >
    > Matt


    Well D code is compiled into machine code that runs via a VM.

    My initial D code ran in about 6 secs as compare with the 2.278 secs
    for the optimized Python code.

    If I wanted the D code to run faster than the optimized Python, I
    would have to apply the same Pythonic optimizations when crafting the
    D code, and even then I would guess the optimized D code might run
    only 2x faster than the optimized Python code.

    In real terms < 3 secs to process a 20 MB file is more than reasonable
    performance with no need to perform any additional optimizations.

    For this particular problem, Python performs as well as the D-powered
    machine code, with far less effort on my part than it would take to
    make the D code run faster than the Python code.

    All this tells me the following:

    * List Comprehensions are very powerful for Python.
    * String translation is a must when processing string based data in an
    iterative manner.
    * Ruby has no hope of being able to beat Python for this type of
    problem given the proper Python optimizations are used.
    * There is no value in wasting time with lower-level languages to make
    Python run faster for this type of problem.

    It would be nice if Python could be made to automatically detect the
    list-comprehension and string-translation patterns used by the
    unoptimized Python code and turn them into optimized Python code on
    the fly at runtime. I am more than a little amazed nobody has chosen
    to build a JIT (Just-In-Time) compiler or cached compiler into Python,
    but maybe that sort of thing is just not needed, given that Python
    code can be easily optimized to run 30x faster.
     
    Python Maniac, Sep 22, 2007
    #18
  19. Re: I could use some help making this Python code run faster using only Python code.

    > It would be nice if Python could be made to automatically detect the
    > LC and string translation patterns used by the unoptimized Python code
    > and make them into optimized Python code on the fly at runtime. I am
    > more than a little amazed nobody has chosen to build a JIT (Just In-
    > Time compiler) or cached-compiler into Python but maybe that sort of
    > thing is just not needed given the fact that Python code can be easily
    > optimized to run 30x faster.


    See PyPy http://codespeak.net/pypy/ for a JIT compiler for Python.
    It is still in the research phase, but worth taking a look at.

    Matt
     
    Matt McCredie, Sep 22, 2007
    #19
  20. On Sep 21, 4:48 pm, "Matt McCredie" <> wrote:
    > > It would be nice if Python could be made to automatically detect the
    > > LC and string translation patterns used by the unoptimized Python code
    > > and make them into optimized Python code on the fly at runtime. I am
    > > more than a little amazed nobody has chosen to build a JIT (Just In-
    > > Time compiler) or cached-compiler into Python but maybe that sort of
    > > thing is just not needed given the fact that Python code can be easily
    > > optimized to run 30x faster.

    >
    > See PyPy http://codespeak.net/pypy/ for a JIT compiler for Python.
    > It is still in the research phase, but worth taking a look at.
    >
    > Matt


    You need to check out a project called Psyco (a forerunner of PyPy).

    I was able to get almost 2x better performance by adding 3 lines of
    code for Psyco.

    See also: http://psyco.sourceforge.net/download.html

    I am rather amazed! Psyco was able to give much better performance
    above and beyond the already optimized Python code, without negatively
    impacting performance during its analysis at runtime.
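
    For reference, the usual Psyco pattern (from the Psyco documentation;
    the exact three lines used here may differ) is roughly:

    try:
        import psyco
        psyco.full()      # JIT-compile functions as they are first called
    except ImportError:
        pass              # run unmodified if Psyco is not installed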
     
    Python Maniac, Sep 22, 2007
    #20
