almost equal strings

Roedy Green

The rsync remote-update protocol allows rsync to transfer just the
differences between two sets of files across the network connection, using
an efficient checksum-search algorithm described in the technical
report that accompanies this package.

I'm afraid I can't tell you any more than that!

You need to figure out how to break the document into chunks at some
optimal boundary. It would be lines or sentences for text.
It would be lines for code.
It would be records for a file of fixed-length records.
It would be UTF strings for a DataOutputStream consisting only of
strings.
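
Here is a rough sketch of what that chunking might look like, in Python rather than Java just to keep it short. The function names are made up for illustration, and the sentence splitter is deliberately naive (it will trip over abbreviations like "e.g."), so treat it as a starting point only.

```python
import re


def chunk_lines(text):
    """Split text or source code into lines, keeping the line endings."""
    return text.splitlines(keepends=True)


def chunk_sentences(text):
    """Naive sentence split: break on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def chunk_records(data, record_length):
    """Split bytes into fixed-length records; the last one may be short."""
    return [data[i:i + record_length] for i in range(0, len(data), record_length)]
```

Which splitter you pick depends on the kind of document; the point is only that both the old and the new version get cut at the same sort of boundary.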

You need that chunking to easily recognise a piece that has moved,
something quite common in word processing and programming. You can hash
your chunks to help find duplicates in the old and new versions.
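
A minimal sketch of the hashing idea, again in Python. The names `chunk_hash` and `match_chunks` are hypothetical, and SHA-256 is just one convenient choice of hash; this is not how rsync itself does it. Each new chunk is looked up by hash in an index of the old chunks, so a chunk that merely moved still finds its old position.

```python
import hashlib


def chunk_hash(chunk):
    """Hash one text chunk; SHA-256 is an assumption, any good hash would do."""
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()


def match_chunks(old_chunks, new_chunks):
    """Pair each new chunk index with the index of an identical old chunk.

    Returns a list of (new_index, old_index) tuples; old_index is None
    when the chunk does not appear anywhere in the old version.
    """
    # Index the old chunks by hash, remembering the first position of each.
    old_index = {}
    for i, chunk in enumerate(old_chunks):
        old_index.setdefault(chunk_hash(chunk), i)

    # Look every new chunk up in that index.
    return [(j, old_index.get(chunk_hash(chunk)))
            for j, chunk in enumerate(new_chunks)]


if __name__ == "__main__":
    old = ["intro\n", "body\n", "end\n"]
    new = ["body\n", "intro\n", "extra\n", "end\n"]
    for new_i, old_i in match_chunks(old, new):
        print(new_i, old_i)
```

Chunks that come back with None are new material. The rest exist in both versions, and a run whose old indices are out of order relative to its neighbours is a candidate for a moved piece. Duplicate chunks (blank lines, say) all map to the first old occurrence here, which a real tool would need to handle more carefully.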
--
Roedy Green Canadian Mind Products
http://mindprod.com
PM Steven Harper is fixated on the costs of implementing Kyoto, estimated as high as 1% of GDP.
However, he refuses to consider the costs of not implementing Kyoto which the
famous economist Nicholas Stern estimated at 5 to 20% of GDP
 
