Looking for MD5-like fingerprint for JPG-files

dede

Dear all,

in order to provide a convenient method for photographers like me to
detect "equal" or "similar" pictures, I am trying to develop a Perl
function/method that does exactly this:

Input: JPG file
Output: MD5-like fingerprint of JPG (to be stored in a db)

It should be a hash-value that is very close if two pics are "almost
identical". It must be robust against at least JPG-rotations
(90/180/270 degrees) and "reasonable" scalings. The analysis will be
stored in the EXIF-data of the JPG so the analysed data should be only
the "naked JPG-data" itself.

My basic idea is to create a 2-dimensional bitmap that will be
"normalized", i.e. rotated to a "zero-position" and scaled to let's
say a 1000x1000 JPG.
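
One way to read this idea, as a minimal sketch rather than a finished solution: assume the JPG has already been decoded into rows of grayscale values (the decoding needs an image module and is left out here). The helper names shrink, rotate90, bits, fingerprint and hamming are made up for illustration. The sketch uses a small grid instead of 1000x1000, and a simple "average hash" (one bit per cell, set if the cell is brighter than the mean) as a stand-in for the fingerprint. The bit string is computed for all four 90-degree rotations and the smallest one is taken as the "zero-position".

```perl
use strict;
use warnings;

# Shrink a grayscale bitmap (array of rows, values 0..255) to $n x $n
# cells by averaging all source pixels that fall into each cell.
sub shrink {
    my ($pix, $n) = @_;
    my ($h, $w) = (scalar @$pix, scalar @{ $pix->[0] });
    my (@sum, @cnt);
    for my $y (0 .. $h - 1) {
        my $cy = int($y * $n / $h);
        for my $x (0 .. $w - 1) {
            my $cx = int($x * $n / $w);
            $sum[$cy][$cx] += $pix->[$y][$x];
            $cnt[$cy][$cx]++;
        }
    }
    return [ map { my $r = $_;
                   [ map { $cnt[$r][$_] ? $sum[$r][$_] / $cnt[$r][$_] : 0 } 0 .. $n - 1 ]
                 } 0 .. $n - 1 ];
}

# Rotate a square grid by 90 degrees clockwise.
sub rotate90 {
    my ($g) = @_;
    my $n = @$g;
    return [ map { my $r = $_; [ map { $g->[$n - 1 - $_][$r] } 0 .. $n - 1 ] } 0 .. $n - 1 ];
}

# One bit per cell: 1 if the cell is brighter than the grid mean.
sub bits {
    my ($g) = @_;
    my @v = map { @$_ } @$g;
    my $mean = 0;
    $mean += $_ for @v;
    $mean /= @v;
    return join '', map { $_ > $mean ? 1 : 0 } @v;
}

# Fingerprint = smallest bit string among the four rotations, so a
# rotated copy of the same picture ends up in the same "zero-position".
sub fingerprint {
    my ($pix, $n) = @_;
    my $g = shrink($pix, $n);
    my @cand;
    for (1 .. 4) {
        push @cand, bits($g);
        $g = rotate90($g);
    }
    return (sort @cand)[0];
}

# Number of positions in which two equally long fingerprints differ.
sub hamming {
    my ($f1, $f2) = @_;
    return scalar grep { substr($f1, $_, 1) ne substr($f2, $_, 1) } 0 .. length($f1) - 1;
}

# Demo data: an 8x8 gradient standing in for decoded pixel data.
my $pic     = [ map { my $y = $_; [ map { 16 * $y + 2 * $_ } 0 .. 7 ] } 0 .. 7 ];
my $rotated = rotate90($pic);    # the same picture turned by 90 degrees

printf "distance after rotation: %d\n",
    hamming(fingerprint($pic, 4), fingerprint($rotated, 4));
```

"Almost identical" pictures would then be those whose fingerprints have a small Hamming distance; the threshold is something to tune by experiment. The rotation handling is exact only when the picture dimensions are multiples of the grid size, as in the demo; for other sizes the cell boundaries shift slightly and a few bits may differ.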

"Sugar" for this algorithm could be robustness against primitive
operations like flipping, clipping, changing contrast, watermarking,
etc.

Is there anyone in the community who has done this already? Any help
will be appreciated.

Thanx in advance. Merci.

Andreas
 
A. Sinan Unur

(e-mail address removed) (dede) wrote in @posting.google.com:
> Dear all,
>
> in order to provide a convenient method for photographer like me to
> detect "equal" or "similar" pictures I am trying to develop a perl
> function/method that does exactly this:
>
> Input: JPG file
> Output: MD5-like fingerprint of JPG (to be stored in a db)

Well, I don't think you want MD5-like: Those algorithms are designed so
that small variations in input cause large variations in the output (not
that I know much).
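
To illustrate the point, here is a minimal sketch using the core Digest::MD5 module. The two strings are placeholders standing in for the data of two nearly identical pictures, and hex_diff is a helper name made up for this example.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Two inputs that differ in a single character (placeholder strings,
# not real JPG data).
my $pic1 = 'pixel data of a picture, version A';
my $pic2 = 'pixel data of a picture, version B';

my ($h1, $h2) = (md5_hex($pic1), md5_hex($pic2));

# Count the hex digits that differ, position by position.
sub hex_diff {
    my ($x, $y) = @_;
    return scalar grep { substr($x, $_, 1) ne substr($y, $_, 1) } 0 .. length($x) - 1;
}

print "$h1\n$h2\n";
printf "%d of %d hex digits differ\n", hex_diff($h1, $h2), length $h1;
```

The two digests should look unrelated even though the inputs differ in only one character, which is the opposite of the "very close" behaviour asked for.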
> It should be a hash-value that is very close if two pics are "almost
> identical". It must be robust against at least JPG-rotations
> (90/180/270 degrees) and "reasonable" scalings. The analysis will be
> stored in the EXIF-data of the JPG so the analysed data should be only
> the "naked JPG-data" itself.

That is not an easy problem since JPEG is a lossy algorithm. This is the
same issue that crops up in trying to digitally watermark compressed
music files (again, I do not know that much about this stuff).

OTOH, I know there is research in this area. Google is your friend.

http://www.linux-mag.com/2003-08/perl_01.html

Looks like a promising starting point.

Sinan
 
Chris

Randal said:
> A> OTOH, I know there is research in this area. Google is your friend.
>
> A> http://www.linux-mag.com/2003-08/perl_01.html
>
> Heh. I was just going to suggest my column, although for many reasons
> I'd direct the user over here:
>
> <http://www.stonehenge.com/merlyn/LinuxMag/col50.html>
>
> print "Just another Perl column hacker,"

Ewww. Beat to the punch, right from the horse's mouth... I was going
to suggest the same column. This question couldn't be a better fit for
your column. It was the first thing I thought of, which is funny
because when I read your column, I remember thinking, "Only Randal
Schwartz would ever want to do something like this ANYWAY..." And looky
here at this question... :cool:

Chris
 
dede

Thank you very much guys!

I am always impressed by the quick and comprehensive responses
I receive from you. I'll try my best "to give something back" :)

Greetings from Paris, France
dede

P.S.: I feel honored by the direct response of Randal himself - Merci!
 
