Extracting text from .png images

  • Thread starter Henrik Berg Nielsen

Henrik Berg Nielsen

Hi group!

I need to extract some text (well, numbers actually) from a bunch of
similar-looking .png images. After extraction the numbers will be fed to a
Python script for further processing. Any good ideas on how to go about
this? I have no idea whatsoever about how to extract the numbers out of the
images...

Thanks in advance,

Henrik
 

John J. Lee

Henrik Berg Nielsen said:
I need to extract some text (well, numbers actually) from a bunch of
similar-looking .png images. After extraction the numbers will be fed to a
Python script for further processing. Any good ideas on how to go about
this? I have no idea whatsoever about how to extract the numbers out of the
images...

OCR is the TLA you're looking for ("Optical Character Recognition").

Dunno if there are any good free OCR engines. With these sorts of
hard algorithms, you tend to get what you pay for.


John
 

Indigo Moon Man

Henrik Berg Nielsen said:
I need to extract some text (well, numbers actually) from a bunch of
similar-looking .png images. After extraction the numbers will be fed
to a Python script for further processing. Any good ideas on how to go
about this? I have no idea whatsoever about how to extract the
numbers out of the images...

This might help you out...
http://www.pricelessware.org/2003/PL2003TEXT.htm#Convert-OCR

I'm not sure if it handles PNG; you might have to convert the files to TIFF or
BMP or something first.
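
If the OCR tool turns out not to read PNG, the conversion can be scripted. A minimal sketch, assuming the Pillow imaging library is installed (it is not mentioned anywhere in this thread) and that the images sit in one directory. The function name and the grayscale step are illustrative choices, not requirements of any particular OCR program:

```python
from pathlib import Path

from PIL import Image  # assumption: Pillow is installed (pip install Pillow)


def convert_pngs_to_tiff(src_dir, dst_dir):
    """Convert every .png in src_dir to a grayscale .tif in dst_dir.

    Returns the list of files written, sorted by name.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    written = []
    for png_path in sorted(src.glob("*.png")):
        tiff_path = dst / (png_path.stem + ".tif")
        with Image.open(png_path) as img:
            # "L" is Pillow's 8-bit grayscale mode.
            img.convert("L").save(tiff_path, format="TIFF")
        written.append(tiff_path)
    return written
```

Swapping "TIFF" for "BMP" (and the .tif suffix for .bmp) should give BMP output instead.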
 

Lee Harr

Henrik Berg Nielsen said:
Hi group!

I need to extract some text (well, numbers actually) from a bunch of
similar-looking .png images. After extraction the numbers will be fed to a
Python script for further processing. Any good ideas on how to go about
this? I have no idea whatsoever about how to extract the numbers out of the
images...


http://www.claraocr.org/
 

Skip Montanaro

John> OCR is the TLA you're looking for ("Optical Character Recognition").

John> Dunno if there are any good free OCR engines. With these sorts of
John> hard algorithms, you tend to get what you pay for.

Which often means there's a piece of free software out there which works
better than the most expensive commercial solutions. <wink>

A little googling suggests this might be a candidate:

http://www.claraocr.org/

I have no idea if there's an exported library and/or a Python wrapper, but
it's probably worth a look.
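
If an engine only ships as a command-line program, it can still be driven from Python by running it as a subprocess and picking the numbers out of whatever text it prints. A rough sketch under that assumption: the command line is a placeholder to be filled in from the tool's own documentation (no particular Clara OCR invocation is assumed here), and the helper names are made up for illustration:

```python
import re
import subprocess

# Matches integers and decimals such as 42, -7 or 3.14.
NUMBER_RE = re.compile(r"-?\d+(?:\.\d+)?")


def extract_numbers(text):
    """Return every number found in text as a float, in order."""
    return [float(m) for m in NUMBER_RE.findall(text)]


def ocr_numbers(command, image_path):
    """Run an external OCR command on image_path and parse its stdout.

    `command` is a list such as ["some-ocr-tool", "--some-flag"]; that
    example is hypothetical and must be replaced with a real invocation.
    """
    result = subprocess.run(
        command + [str(image_path)],
        capture_output=True,
        text=True,
        check=True,  # raise if the OCR program exits with an error
    )
    return extract_numbers(result.stdout)
```

Since OCR output can contain misread characters, it is probably worth sanity-checking the parsed values before processing them further.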

Skip
 

Tim Roberts

Henrik Berg Nielsen said:
I need to extract some text (well, numbers actually) from a bunch of
similar-looking .png images. After extraction the numbers will be fed to a
Python script for further processing. Any good ideas on how to go about
this? I have no idea whatsoever about how to extract the numbers out of the
images...

Are you hoping to extract the "password" characters from the pictures
presented by the whois checks? If so, you should give up now, because
those images are SPECIFICALLY designed to make them almost impervious to
automated recognition.
 

Lukas Ccenovsky

Henrik said:
Hi group!

I need to extract some text (well, numbers actually) from a bunch of
similar-looking .png images. After extraction the numbers will be fed to a
Python script for further processing. Any good ideas on how to go about
this? I have no idea whatsoever about how to extract the numbers out of the
images...

Hi,
I'm dealing with a similar problem now. My pictures are very complicated
(construction drawings). I am trying to use Gamera
(http://dkc.jhu.edu/gamera/) for OCR and it seems very promising.
 

Bengt Richter

Tim Roberts said:
Are you hoping to extract the "password" characters from the pictures
presented by the whois checks? If so, you should give up now, because
those images are SPECIFICALLY designed to make them almost impervious to
automated recognition.

Sounds interesting as a problem, but I wouldn't want to create a skeleton key
for any bad guys ;-)

Regards,
Bengt Richter
 
