How to check for single character change in a string?

Discussion in 'Python' started by tinnews@isbd.co.uk, Dec 24, 2011.

  1. Guest

    Can anyone suggest a simple/easy way to count how many characters have
    changed in a string?

    E.g. giving results as follows:-

    abcdefg abcdefh 1
    abcdefg abcdekk 2
    abcdefg gfedcba 6


    Note that position is significant, a character in a different position
    should not count as a match.

    Is there any simpler/neater way than just a for loop running through
    both strings and counting non-matching characters?

    --
    Chris Green
    , Dec 24, 2011
    #1

  2. Ian Kelly Guest

    On Sat, Dec 24, 2011 at 8:26 AM, <> wrote:
    > Can anyone suggest a simple/easy way to count how many characters have
    > changed in a string?
    >
    > E.g. giving results as follows:-
    >
    >    abcdefg     abcdefh         1
    >    abcdefg     abcdekk         2
    >    abcdefg     gfedcba         6
    >
    >
    > Note that position is significant, a character in a different position
    > should not count as a match.
    >
    > Is there any simpler/neater way than just a for loop running through
    > both strings and counting non-matching characters?


    No, but the loop approach is pretty simple:

    sum(a == b for a, b in zip(str1, str2))
    Ian Kelly, Dec 24, 2011
    #2

  3. Roy Smith Guest

    In article <>, wrote:

    > Can anyone suggest a simple/easy way to count how many characters have
    > changed in a string?


    Depending on exactly how you define "changed", you're probably talking
    about either Hamming distance or Levenshtein distance. I would start
    with the Wikipedia articles on both those topics and explore from there.

    There are Python packages for computing many of these metrics. For
    example, http://pypi.python.org/pypi/python-Levenshtein/

    > Is there any simpler/neater way than just a for loop running through
    > both strings and counting non-matching characters?


    If you don't care about insertions and deletions (and it sounds like you
    don't), then this is the way to do it. It's O(n), and you're not going
    to get any better than that. It's a one-liner in Python:

    >>> s1 = 'abcdefg'
    >>> s2 = 'abcdekk'
    >>> len([x for x in zip(s1, s2) if x[0] != x[1]])
    2

    But go read the Wikipedia articles. Computing distance between
    sequences is an interesting, important, and well-studied topic. It's
    worth exploring a bit.
    Roy Smith, Dec 24, 2011
    #3
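To make the distinction between the two metrics concrete, here is a minimal sketch contrasting a positional (Hamming-style) count with a plain-Python Levenshtein distance. The function names `positional_mismatches` and `levenshtein` are made up for this illustration; this is not the API of the python-Levenshtein package mentioned above, just the textbook dynamic-programming approach.

```python
def positional_mismatches(s1, s2):
    # Same idea as the one-liner in the post: compare position by position.
    # zip() stops at the shorter string, so extra trailing characters are ignored.
    return len([x for x in zip(s1, s2) if x[0] != x[1]])


def levenshtein(s1, s2):
    """Edit distance counting insertions, deletions and substitutions."""
    # previous[j] holds the distance between the processed prefix of s1
    # and the first j characters of s2.
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + 1,               # delete c1
                current[j - 1] + 1,            # insert c2
                previous[j - 1] + (c1 != c2),  # substitute, or free if equal
            ))
        previous = current
    return previous[-1]


# A one-character shift changes every position but needs only two edits.
print(positional_mismatches('abcdefg', 'bcdefgh'))  # 7
print(levenshtein('abcdefg', 'bcdefgh'))            # 2
```

Since the original question says position is significant, the positional count is the one that matches the requested results; the edit distance only matters if shifted text should count as a small change.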
  4. Roy Smith Guest

    In article <>,
    Roy Smith <> wrote:

    > >>> len([x for x in zip(s1, s2) if x[0] != x[1]])


    Heh, Ian Kelly's version:

    > sum(a == b for a, b in zip(str1, str2))


    is cleaner than mine. Except that Ian's counts matches and the OP asked
    for non-matches, but that's an exercise for the reader :)
    Roy Smith, Dec 24, 2011
    #4
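For readers who would rather skip the exercise, one possible answer is to flip the comparison from `==` to `!=`. The sketch below wraps that in a helper (the name `count_changes` is invented here) and runs it on the three example pairs from the original question. It assumes both strings have the same length, because `zip` silently stops at the shorter one.

```python
def count_changes(str1, str2):
    # True counts as 1 when summed, so this counts the positions
    # at which the two strings hold different characters.
    return sum(a != b for a, b in zip(str1, str2))


examples = [
    ('abcdefg', 'abcdefh'),
    ('abcdefg', 'abcdekk'),
    ('abcdefg', 'gfedcba'),
]
for old, new in examples:
    print(old, new, count_changes(old, new))
```

This prints 1, 2 and 6 for the three pairs, matching the results asked for in the first post.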
  5. Arnaud Delobelle

    On 24 December 2011 16:10, Roy Smith <> wrote:
    > In article <>,
    >  Roy Smith <> wrote:
    >
    >> >>> len([x for x in zip(s1, s2) if x[0] != x[1]])

    >
    > Heh, Ian Kelly's version:
    >
    >> sum(a == b for a, b in zip(str1, str2))

    >
    > is cleaner than mine.  Except that Ian's counts matches and the OP asked
    > for non-matches, but that's an exercise for the reader :)


    Here's a variation on the same theme:

    sum(map(str.__ne__, str1, str2))

    --
    Arnaud
    Arnaud Delobelle, Dec 24, 2011
    #5
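A closely related spelling, offered only as a sketch, swaps `str.__ne__` for `operator.ne` from the standard library. The behaviour on strings should be the same, and it also accepts sequences whose items are not strings. The helper name `count_differences` is made up for this example.

```python
import operator


def count_differences(seq1, seq2):
    # map() with two iterables calls operator.ne(a, b) pairwise and,
    # like zip(), stops when the shorter input runs out.
    return sum(map(operator.ne, seq1, seq2))


print(count_differences('abcdefg', 'abcdekk'))   # 2
print(count_differences([1, 2, 3], [1, 5, 3]))   # 1
```

Both versions rely on `True` being summed as 1, which holds because `bool` is a subclass of `int`.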
  6. Rick Johnson Guest

    On Dec 24, 11:09 am, Arnaud Delobelle <> wrote:

    > sum(map(str.__ne__, str1, str2))


    Mirror, mirror, on the wall. Who's the cleanest of them all?
    Rick Johnson, Dec 24, 2011
    #6
  7. Guest

    Roy Smith <> wrote:
    > In article <>,
    > Roy Smith <> wrote:
    >
    > > >>> len([x for x in zip(s1, s2) if x[0] != x[1]])

    >
    > Heh, Ian Kelly's version:
    >
    > > sum(a == b for a, b in zip(str1, str2))

    >
    > is cleaner than mine. Except that Ian's counts matches and the OP asked
    > for non-matches, but that's an exercise for the reader :)


    :)

    I'm actually walking through a directory tree and checking that file
    characteristics don't change in a sequence of files.

    What I'm looking for is 'unusual' changes in file characteristics
    (they're image files with camera information and such in them) in a
    sequential list of files.

    Thus if file001, file002, file003, file004 have the same camera type
    I'm happy, but if file003 appears to have been taken with a different
    camera something is probably amiss. I realise there will be *two*
    character changes when going from file009 to file010 but I can cope
    with that. I can't just extract the sequence number because in some
    cases they have non-numeric names, etc.

    --
    Chris Green
    , Dec 26, 2011
    #7
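One way the counting one-liner might be applied to a sequence of names is sketched below. It is a guess at the use case, not the poster's actual script: the helper names, the sample file names and the threshold of two changes (taken from the file009 to file010 remark) are all assumptions.

```python
def count_changes(str1, str2):
    # Positions at which the two strings differ (zip stops at the shorter one).
    return sum(a != b for a, b in zip(str1, str2))


def unusual_steps(names, max_changes=2):
    """Yield consecutive pairs that differ in more than max_changes positions."""
    for previous, current in zip(names, names[1:]):
        if count_changes(previous, current) > max_changes:
            yield previous, current


# Hypothetical, already-sorted list of file names.
names = ['file008', 'file009', 'file010', 'pict011']
print(list(unusual_steps(names)))  # [('file010', 'pict011')]
```

The step from file009 to file010 changes two characters and passes, while the jump to a differently named file is reported. The same loop could compare any per-file string, such as a camera model read from the image metadata.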
