Java OCR ?

Discussion in 'Java' started by Soefara, Sep 18, 2003.

  1. Soefara

    Soefara Guest

    Is it just me, or does no Java OCR package exist ?

    I've seen one reference - www.javaocr.com - but if you download the
    demo, it's actually a 33KB .jar file which in turn calls a 220KB DLL
    (which will run on Windows only). Maybe I'm misunderstanding
    something, but this looks a bit misleading.

    Even PHP apparently has OCR packages, according to SourceForge. How
    can it be that Java does not ?

    Soefara
     
    Soefara, Sep 18, 2003
    #1

  2. Soefara:

    >Is it just me or no Java OCR package exists ?


    Good OCR is hard and requires a lot of research and experience.
    Finereader has an SDK that works under Windows and Linux:
    <http://www.abbyy.com/developer_toolkits1.asp?param=28807&from=topcom2>.
    Maybe it can be interfaced from Java?

    Regards,
    Marco
    --
    Please reply in the newsgroup, not by email!
    Java programming tips: http://jiu.sourceforge.net/javatips.html
    Other Java pages: http://www.geocities.com/marcoschmidt.geo/java.html
     
    Marco Schmidt, Sep 19, 2003
    #2

  3. Soefara

    Soefara Guest

    Thank you for the reply Marco.

    > Good OCR is hard and requires a lot of research and experience.
    > Finereader has an SDK that works under Windows and Linux:
    > <http://www.abbyy.com/developer_toolkits1.asp?param=28807&from=topcom2>.
    > Maybe it can be interfaced from Java?


    I'm not sure how I'd go about "interfacing" that. However,
    there do seem to be quite a few open source and Linux OCR
    packages, some of which can be driven from the command line,
    the most prominent of which is Clara
    (see http://www.claraocr.org/faq.html).

    Is there any danger in executing an external program (such
    as Clara) from within a Java servlet using something like this ?

    Runtime.getRuntime().exec("/full/path/to/program [optional-arguments]");
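
    On the question of running an external program from a servlet, the usual risks are a child process that blocks on a full output pipe, a process that never returns and ties up the request thread, and command injection if any part of the command line comes from the request. A minimal sketch of one way to limit those, assuming Java 8 or later: the command is passed as a fixed argument list (no shell parses it), output goes to a temporary file, and the wait has a timeout. ExternalTool and run are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of any library.
public class ExternalTool {

    /** Runs the command and returns its combined stdout/stderr as text. */
    public static String run(List<String> command, long timeoutSeconds) {
        Path out = null;
        try {
            // Redirecting to a file means the child never blocks on a full pipe.
            out = Files.createTempFile("tool-output", ".txt");
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .redirectOutput(out.toFile())
                    .start();
            // Give up instead of holding the calling thread forever.
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                p.destroyForcibly();
                throw new IllegalStateException("Timed out: " + command.get(0));
            }
            if (p.exitValue() != 0) {
                throw new IllegalStateException("Exit code " + p.exitValue() + ": " + command.get(0));
            }
            return new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for " + command.get(0), e);
        } finally {
            if (out != null) {
                try {
                    Files.deleteIfExists(out);
                } catch (IOException ignored) {
                    // Best-effort cleanup of the temporary file.
                }
            }
        }
    }
}
```

    A call would look something like ExternalTool.run(Arrays.asList("/full/path/to/program", "optional-argument"), 30), with each argument as its own list element rather than one space-separated string.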


    Soefara
     
    Soefara, Sep 20, 2003
    #3
  4. Soefara

    Ayesh

    Joined:
    May 23, 2008
    Messages:
    1
    Copied from java.sun; I hope it can help some lost soul in need of OCR :)

    Greetings,

    I know these posts have been here for quite a long time, but I found them while looking for an OCR solution in Java, and I would like to share the FREE answer I have created.

    I browsed lots of posts while searching for OCR in Java, and they all linked to Asprise / javaocr, but those are unaffordable for a non-commercial project.

    So I searched for OCR software in any language, with the aim of interfacing it with Java.

    -I discovered GOCR (http://jocr.sourceforge.net/), which is a command-line OCR program. It was a beginning ^^ I downloaded and used the Windows version. After a few tests I was able to figure out how to use it, but I have to feed it PPM images.

    -Here comes the second piece of software, nconvert (http://pagesperso-orange.fr/pierre.g/xnview/fr_nconvert.html), which can convert images to PPM.
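
    As an aside on the PPM step: binary PPM ("P6") is a simple format, a short ASCII header followed by raw red, green and blue bytes, so the conversion could probably also be done from Java itself. The following is a sketch of that alternative, not the nconvert approach described in this post; PpmWriter and toPpm are hypothetical names, and it assumes GOCR accepts a plain 8-bit P6 file.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: encodes an image as binary PPM (P6), 8 bits per channel.
public class PpmWriter {

    public static byte[] toPpm(BufferedImage img) {
        int w = img.getWidth();
        int h = img.getHeight();
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Header: magic number, width and height, maximum channel value.
        byte[] header = ("P6\n" + w + " " + h + "\n255\n").getBytes(StandardCharsets.US_ASCII);
        out.write(header, 0, header.length);

        // Pixel data: rows top to bottom, three bytes (R, G, B) per pixel.
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                out.write((rgb >> 16) & 0xFF);
                out.write((rgb >> 8) & 0xFF);
                out.write(rgb & 0xFF);
            }
        }
        return out.toByteArray();
    }
}
```

    The returned bytes can be written to disk with java.nio.file.Files.write(path, bytes) before calling the OCR program.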

    So I wrote 2 static classes to act as the OCR.

    The main part is the class OCR, which takes a screenshot of the screen, sets the proper colors (I've made gocr work only with black letters on a white background), writes the image to disk, and then calls nconvert and gocr.

    By parsing the output stream of the GOCR process you should get your recognized text. There is the "replace" thing in the return statement because I work on numbers and gocr makes some mistakes with 1-l and O-0 ^^
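
    The "replace" clean-up mentioned above can be pulled out into its own small method. This is only a sketch with hypothetical names (OcrDigits, normalize); it applies the same two substitutions as the return statement in the OCR class below.

```java
// Hypothetical helper for OCR output that is known to contain only numbers.
public class OcrDigits {

    /** Maps the letters 'l' and 'O' onto the digits they are commonly mistaken for. */
    public static String normalize(String raw) {
        return raw.replace('l', '1').replace('O', '0').trim();
    }
}
```

    This is only safe for numeric fields: applied to ordinary text it would turn every lowercase "l" and uppercase "O" into a digit.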

    It's not a strong OCR facility, but it can help with small applications. Hope it helps, and many thanks to nconvert and gocr ;)

    Code:
    package t3x.tnn.utility;
    
    import java.awt.Color;
    import java.awt.Point;
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;
    
    import javax.imageio.ImageIO;
    
    public class OCR {
    	static public String recognize(Point hg, Point bd, Color color, boolean isColorEcriture){
    		String res = null;
    		File fImg = new File("screenshot.png");
    		while(res == null){
    			BufferedImage img = ScreenHandler.getScreen(hg, bd);
    			if(isColorEcriture)
    				img = changeWithColorEcriture(img, color);
    			else
    				img = changeWithColorFond(img, color);
    			try {
    				ImageIO.write(img, "PNG", fImg);
    				Process p = Runtime.getRuntime().exec("nconvert -out ppm -o text.ppm screenshot.png");
    				p.waitFor();
    				p.destroy();
    				p = Runtime.getRuntime().exec("gocr045 text.ppm");
    				p.waitFor();
    				if(p.getInputStream().available()>0)
    					res = IOHandler.getResponse(p.getInputStream());
    				p.destroy();
    			}catch (InterruptedException e) {
    				e.printStackTrace();
    			} catch (IOException e) {
    				e.printStackTrace();
    			}
    		}
    		if(fImg.exists())
    			fImg.delete();
    		File texte = new File("text.ppm");
    		if(texte.exists())
    			texte.delete();
    		return res.replace("l", "1").replace("O", "0").trim();
    	}
    
    	private static BufferedImage changeWithColorEcriture(BufferedImage bi, Color ecriture) {
    		if (bi != null) {                       
    			int w = bi.getWidth();
    			int h = bi.getHeight();
    			int pixel;
    			BufferedImage bitmp = new BufferedImage(w, h, bi.getType());
    			BufferedImage biOut = new BufferedImage(w, h, bi.getType());
    
    			for (int x = 0; x < w; x++) {
    				for (int y = 0; y < h; y++) {
    					pixel = bi.getRGB(x, y);
    					if(pixel != ecriture.getRGB())
    						pixel = Color.BLUE.getRGB();
    					else
    						pixel = Color.BLACK.getRGB();
    					bitmp.setRGB(x, y, pixel); 
    				}
    			}
    
    			for (int x = 0; x < w; x++) {
    				for (int y = 0; y < h; y++) {
    					pixel = bitmp.getRGB(x, y);
    					if(pixel == Color.BLUE.getRGB())
    						pixel = Color.WHITE.getRGB();
    					biOut.setRGB(x, y, pixel);
    				}
    			}
    
    			return biOut;
    		} else {
    			return bi;
    		}
    	}
    	
    	private static BufferedImage changeWithColorFond(BufferedImage bi, Color fond) {
    		if (bi != null) {                       
    			int w = bi.getWidth();
    			int h = bi.getHeight();
    			int pixel;
    			BufferedImage bitmp = new BufferedImage(w, h, bi.getType());
    			BufferedImage biOut = new BufferedImage(w, h, bi.getType());
    
    			for (int x = 0; x < w; x++) {
    				for (int y = 0; y < h; y++) {
    					pixel = bi.getRGB(x, y);
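    					// Background pixels are marked blue (whitened in the second pass);
    					// everything else is text and must become black, not white.
    					if(pixel == fond.getRGB())
    						pixel = Color.BLUE.getRGB();
    					else
    						pixel = Color.BLACK.getRGB();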
    					bitmp.setRGB(x, y, pixel); 
    				}
    			}
    
    			for (int x = 0; x < w; x++) {
    				for (int y = 0; y < h; y++) {
    					pixel = bitmp.getRGB(x, y);
    					if(pixel == Color.BLUE.getRGB())
    						pixel = Color.WHITE.getRGB();
    					biOut.setRGB(x, y, pixel);
    				}
    			}
    
    			return biOut;
    		} else {
    			return bi;
    		}
    	}
    }
    
    Code:
    package t3x.tnn.utility;
    
    import java.awt.AWTException;
    import java.awt.Color;
    import java.awt.Dimension;
    import java.awt.Point;
    import java.awt.Rectangle;
    import java.awt.Robot;
    import java.awt.image.BufferedImage;
    
    public class ScreenHandler {
    
    	public static Color getPixelColor(Point p){
    		return getPixelColor(p.x, p.y);
    	}
    
    	public static BufferedImage getScreen(Point hg, Point bd){
    		checkNano();
    		return nano.createScreenCapture(new Rectangle(hg, new Dimension(bd.x-hg.x, bd.y-hg.y)));
    	}
    	
    	public static boolean areImagesEqual(BufferedImage img1, BufferedImage img2){
    		int[] timg1 = getPixels(img1);
    		int[] timg2 = getPixels(img2);
    		for(int i = 0 ; i < timg1.length; i++){
    			if(timg1[i]!=timg2[i]){
    				return false;
    			}
    		}
    		return true;
    	}
    	
    	public static Color analyse(Point depart, int deviation, Color fond){
    		for(int i= depart.x; i < depart.x+deviation; i++){
    			Color col = ScreenHandler.getPixelColor(i, depart.y);
    			if(!col.equals(fond))
    				return col;
    		}
    		//IOHandler.abort("[ScreenHandler.analyse] : No game color found");
    		return null;
    	}
    ///////////////////////////////////////////////////////////////////////////////////
    	private static Robot nano;
    	
    	private static Color getPixelColor(int x, int y){
    		checkNano();
    		return nano.getPixelColor(x, y);
    	}
    	
    	private static int[] getPixels(BufferedImage img){
    		// Passing null lets the raster allocate an array of exactly the right size.
    		java.awt.image.WritableRaster r = img.getRaster();
    		return r.getPixels(r.getMinX(), r.getMinY(), r.getWidth(), r.getHeight(), (int[]) null);
    	}
    	
    	private static void checkNano(){
    		if(nano == null)
    			try {
    				nano = new Robot();
    			} catch (AWTException e) {
    				e.printStackTrace();
    			}
    	}
    }
    
     
    Ayesh, May 23, 2008
    #4
  5. Soefara

    eduardoavdr

    Joined:
    Dec 15, 2008
    Messages:
    1
    Trying to use your solution but...

    Hi Ayesh,

    We've developed a web app which indexes documents; you can see it at nootes dot org

    What we want now is to make a Swing app which lets me scan documents and do OCR on them so they can be uploaded to my web app using web services (already developed).

    The thing is that I found your solution perfect for my needs, but when I tried to use it in the NetBeans IDE I got an error on the following line:

    res = IOHandler.getResponse(p.getInputStream());

    What package do I need in order to use this function?

    Thanks for your help and your time.
     
    Last edited: Dec 16, 2008
    eduardoavdr, Dec 15, 2008
    #5
  6. Soefara

    clueless

    Joined:
    Aug 2, 2009
    Messages:
    1
    I am aware that this thread is rather old but I am in need of help! I used Java quite extensively a few years back but unfortunately am a little rusty with it. I am trying to make an OCR program and think that the method posted here using gocr and nconvert is a good way to avoid using Asprise OCR, which has to be paid for...

    Anyway, using BlueJ, I am having the same problem as the above poster: "cannot find symbol - variable IOHandler". I thought I would try it in NetBeans too, just to make sure it wasn't a BlueJ quirk, but I get the same error message.

    From what I can gather, the variable IOHandler hasn't been defined in the OCR class, but I am unsure what type of variable to declare it as so that it can use the getResponse() method. Does anyone have any idea?
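
    For anyone hitting the same error: IOHandler is not a JDK class and was never posted in this thread, so its real implementation is unknown. It is called as IOHandler.getResponse(...), which means it is a class with a static method rather than a variable to declare, presumably living in the author's own t3x.tnn.utility package. A minimal stand-in, assuming getResponse simply reads the process output to the end and returns it as text (the UTF-8 charset is also an assumption), might look like this:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Guess at the missing class; put it in the same package as OCR.
public class IOHandler {

    /** Reads the stream until end of input and returns its contents as a String. */
    public static String getResponse(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        // read() returns -1 once the process has closed its output.
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

    The commented-out IOHandler.abort(...) call in ScreenHandler suggests the original class had other methods too, but the posted OCR code only needs getResponse.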

    I have searched high and low to find a solution but to no avail, I really hope someone can point me in the right direction.

    Thanks.

    Ewen
     
    clueless, Aug 2, 2009
    #6
