This does not yield minimal edit sequences, but does tend to yield matches
that "look right" to people.
[end quote]
There are many ways that one can calculate the difference between two
strings, e.g. is the string "acab" an "ab" with "ac" inserted at the
front, or "ab" with "ca" inserted in the middle? The Unix diff utility
may return minimal diffs, but difflib does not make that claim.
You want to see "HIDEDCT1" match closer to "HIDESCT1" than "HIDEDST1":
HIDEDCT1 -- John's "best match" target string
HIDEDST1 -- difflib's "best match" target string
HIDESCT1 -- source string
John's best match matches in seven of eight positions, compared to six
of eight for the difflib best match. Disregarding order, both have seven
matching characters. That's a pretty slim difference between the two:
' H I D E- D S+ C T 1'
I honestly don't know how to interpret those results
.... print "a[%d] and b[%d] match for %d elements" % block
....
a[0] and b[0] match for 4 elements
a[5] and b[5] match for 3 elements
a[8] and b[8] match for 0 elements.... print "a[%d] and b[%d] match for %d elements" % block
....
a[0] and b[0] match for 4 elements
a[5] and b[4] match for 1 elements
a[6] and b[6] match for 2 elements
a[8] and b[8] match for 0 elements
I think what you are seeing is an artifact of the precise details of the
pattern matcher. Feel free to subclass the SequenceMatcher or Differ
classes to get the results you expect