Strings and Lists

Discussion in 'Python' started by Tom Longridge, Apr 18, 2005.

  1. My current Python project involves lots repeatating code blocks,
    mainly centred around a binary string of data. It's a genetic
    algorithm in which there are lots of strings (the chromosomes) which
    get mixed, mutated and compared a lot.

    Given Python's great list processing abilities and the relative
    inefficiencies some string operations, I was considering using a list
    of True and False values rather than a binary string.

    I somehow doubt there would be a clear-cut answer to this, but from
    this description, does anyone have any reason to think that one way
    would be much more efficient than the other? (I realise the best way
    would be to do both and `timeit` to see which is faster, but it's a
    sizeable program and if anybody considers it a no-brainer I'd much
    rather know now!)

    Any advice would be gladly recieved.
     
    Tom Longridge, Apr 18, 2005
    #1
    1. Advertising

  2. Tom Longridge

    Guest

    Hello Tom,

    I think it is more efficient if we can use list (with True,False)
    member to do genetics algorithms. Of course a lot of works to do to
    change from string binary into boolean list.

    I do programming genetics algorithms in C# I guess I have to modify my
    program also because my old program use binary string manipulation.

    Sincerely Yours,
    Pujo
     
    , Apr 18, 2005
    #2
    1. Advertising

  3. Hi,
    I not sure what sorts of operations you plan to do. But if you
    intend to use fixed length arrays or even carrying out repetetive
    operations. You should probably look at numeric
    http://numeric.scipy.org/


    On 18 Apr 2005 04:42:17 -0700, Tom Longridge <> wrote:
    > My current Python project involves lots repeatating code blocks,
    > mainly centred around a binary string of data. It's a genetic
    > algorithm in which there are lots of strings (the chromosomes) which
    > get mixed, mutated and compared a lot.
    >
    > Given Python's great list processing abilities and the relative
    > inefficiencies some string operations, I was considering using a list
    > of True and False values rather than a binary string.
    >
    > I somehow doubt there would be a clear-cut answer to this, but from
    > this description, does anyone have any reason to think that one way
    > would be much more efficient than the other? (I realise the best way
    > would be to do both and `timeit` to see which is faster, but it's a
    > sizeable program and if anybody considers it a no-brainer I'd much
    > rather know now!)
    >
    > Any advice would be gladly recieved.
    > --
    > http://mail.python.org/mailman/listinfo/python-list
    >



    --
    http://blogs.applibase.net/sidharth
     
    Sidharth Kuruvila, Apr 18, 2005
    #3
  4. Tom Longridge

    Bill Mill Guest

    On 18 Apr 2005 04:42:17 -0700, Tom Longridge <> wrote:
    > My current Python project involves lots repeatating code blocks,
    > mainly centred around a binary string of data. It's a genetic
    > algorithm in which there are lots of strings (the chromosomes) which
    > get mixed, mutated and compared a lot.
    >
    > Given Python's great list processing abilities and the relative
    > inefficiencies some string operations, I was considering using a list
    > of True and False values rather than a binary string.
    >
    > I somehow doubt there would be a clear-cut answer to this, but from
    > this description, does anyone have any reason to think that one way
    > would be much more efficient than the other? (I realise the best way
    > would be to do both and `timeit` to see which is faster, but it's a
    > sizeable program and if anybody considers it a no-brainer I'd much
    > rather know now!)
    >
    > Any advice would be gladly recieved.
    > --
    > http://mail.python.org/mailman/listinfo/python-list
    >


    Tom,

    it seems to me that, if efficiency is your main goal, you should store
    your data as a list of integers, and use the bit-twiddling operators
    to get at your data. These should be *very* fast, as well as memory
    efficient.

    Peace
    Bill Mill
    bill.mill at gmail.com
     
    Bill Mill, Apr 18, 2005
    #4
  5. Tom Longridge

    Peter Hansen Guest

    Tom Longridge wrote:
    > My current Python project involves lots repeatating code blocks,
    > mainly centred around a binary string of data. It's a genetic
    > algorithm in which there are lots of strings (the chromosomes) which
    > get mixed, mutated and compared a lot.
    >
    > Given Python's great list processing abilities and the relative
    > inefficiencies some string operations, I was considering using a list
    > of True and False values rather than a binary string.
    >
    > I somehow doubt there would be a clear-cut answer to this, but from
    > this description, does anyone have any reason to think that one way
    > would be much more efficient than the other? (I realise the best way
    > would be to do both and `timeit` to see which is faster, but it's a
    > sizeable program and if anybody considers it a no-brainer I'd much
    > rather know now!)
    >
    > Any advice would be gladly recieved.


    It depends more on the operations you are performing, and
    more importantly *which of those you have measured and
    found to be slow*, than on anything else.

    If, for example, you've got a particular complex set of
    slicing, dicing, and mutating operations going on, then
    that might say use one type of data structure.

    If, on the other hand, the evaluation of the fitness
    function is what is taking most of the time, then you
    should focus on what that algorithm does and needs
    and, after profiling (to get real data rather than
    shots-in-the-dark), you can pick an appropriate data
    structure for optimizing those operations.

    Strings seem to be what people pick, by default, without
    much thought, but I doubt they're the right thing
    for the majority of GA work...

    -Peter
     
    Peter Hansen, Apr 18, 2005
    #5
  6. Tom Longridge

    Donn Cave Guest

    In article <>,
    (Tom Longridge) wrote:

    > My current Python project involves lots repeatating code blocks,
    > mainly centred around a binary string of data. It's a genetic
    > algorithm in which there are lots of strings (the chromosomes) which
    > get mixed, mutated and compared a lot.
    >
    > Given Python's great list processing abilities and the relative
    > inefficiencies some string operations, I was considering using a list
    > of True and False values rather than a binary string.
    >
    > I somehow doubt there would be a clear-cut answer to this, but from
    > this description, does anyone have any reason to think that one way
    > would be much more efficient than the other? (I realise the best way
    > would be to do both and `timeit` to see which is faster, but it's a
    > sizeable program and if anybody considers it a no-brainer I'd much
    > rather know now!)


    I make no representations about how much mileage you would
    get out of it, but if character data suits your purposes and
    you just need a mutable array - in principle, I think it would
    be more efficient to use an array. Like,
    import array
    a = array.array('c', strdata)

    As I understand it, this object simply contains character data,
    not a list of 1 character string objects, so it's much more
    economical to store and examine.

    Donn Cave,
     
    Donn Cave, Apr 18, 2005
    #6
  7. Thank you all very much for your responses. It's especially reassuring
    to hear about other Python GA's as I have had some scepticism about
    Python's speed (or lack of it) being too big a problem for such an
    application.

    With regard to using numeric, arrays or integer lists -- I didn't
    mention that these strings can also contain wild cards (so I suppose
    it's not really binary -- sorry). This is traditionally done using a
    '#' symbol, but I was imagining using a value of None in a boolean list
    to represent this. Also there is currently a fair bit of research going
    into other representations (floating-point values, paired values etc)
    so I was hoping to be able to keep my framework extensible for the
    future.

    Many thanks again for your help. I will ``take the plunge'' and give
    the boolean list a go I think!
     
    Tom Longridge, Apr 19, 2005
    #7
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. robin
    Replies:
    10
    Views:
    543
    Dave Hansen
    Apr 12, 2006
  2. =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==

    List of lists of lists of lists...

    =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==, May 8, 2006, in forum: Python
    Replies:
    5
    Views:
    410
    =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==
    May 15, 2006
  3. Daniel Nogradi
    Replies:
    3
    Views:
    350
    Dennis Lee Bieber
    Nov 10, 2006
  4. Ben

    Strings, Strings and Damned Strings

    Ben, Jun 22, 2006, in forum: C Programming
    Replies:
    14
    Views:
    767
    Malcolm
    Jun 24, 2006
  5. Gabriel Zachmann

    Bug with lists of pairs of lists and append()

    Gabriel Zachmann, Sep 28, 2007, in forum: Python
    Replies:
    2
    Views:
    235
    Gabriel Zachmann
    Oct 1, 2007
Loading...

Share This Page