Discussion in 'Java' started by mambenanje, Aug 22, 2006.

    I need help with a little practise project I am undertaking right now. I
    have a couple of question papers with diagrams and text, and I wish to
    scan the papers to either PDF or JPEG/PNG format. My main
    problem is to read the file using Java and extract the images and
    questions. Can someone help me or give me a clue on this?
    mambenanje, Aug 22, 2006
  2. Hi,

    "OCR" is nothing you will solve in "a little practise project". Forget
    it. (Use existing OCR software instead, or type the text into your
    computer yourself.) There is nothing more to say about that...

    Ingo R. Homann, Aug 22, 2006
  3. As Ingo said, OCR isn't something you can just throw together. It is
    very sensitive and your results can vary (especially when dealing with
    OCR software for faxes, which is where I've dealt with it, but that's
    another issue). Do you need to search the text that is on your diagrams
    or be able to copy/paste it into another application? If not, then you
    don't even need OCR, just something that can read the existing file
    format and convert it to your PDF/JPG/PNG format.
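
    For the no-OCR case described above, here is a minimal sketch that uses
    only the standard javax.imageio package to read a scanned image and
    re-encode it as PNG. The class and method names (ScanConverter, toPng,
    convertToPng) are made up for illustration. ImageIO handles formats such
    as JPEG, PNG, GIF and BMP but not PDF; reading PDF would need a
    third-party library such as Apache PDFBox, which is not shown here.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper class; the names are illustrative only.
public class ScanConverter {

    /** Re-encodes an in-memory image as PNG and returns the encoded bytes. */
    public static byte[] toPng(BufferedImage image) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no writer exists for the format.
        if (!ImageIO.write(image, "png", out)) {
            throw new IOException("No PNG writer available");
        }
        return out.toByteArray();
    }

    /** Reads a scanned image file and writes it back out as a PNG file. */
    public static void convertToPng(File source, File target) throws IOException {
        BufferedImage image = ImageIO.read(source);
        // ImageIO.read returns null when no registered reader understands the file.
        if (image == null) {
            throw new IOException("Unsupported image format: " + source);
        }
        if (!ImageIO.write(image, "png", target)) {
            throw new IOException("No PNG writer available");
        }
    }
}
```

    Calling ScanConverter.convertToPng(new File("paper.jpg"), new
    File("paper.png")) would convert one scan; both file names are
    placeholders.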

    OCR actually reads the text in your source and separates it from the
    image version of the text so you can manipulate it as text. If you
    have no need to manipulate it like that, then you don't
    need OCR.
    Brandon McCombs, Aug 22, 2006
    OK, thanks for the help.
    This is what I want to do:
    1) scan the question paper to any file format
    2) get the text and pictures found in the file and analyse them with no
    human copying and pasting
    3) send the information into a database
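
    The three steps above can be sketched roughly as follows. This is only
    an outline under several assumptions: the scan is already an image that
    javax.imageio can read, the positions of the question text and the
    diagram on the page are known in advance (finding them automatically is
    not attempted), and text recognition is delegated to an existing OCR
    engine hidden behind a hypothetical TextRecognizer interface. The table
    name "questions" and its columns "body" and "diagram" are likewise made
    up for illustration.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.imageio.ImageIO;

public class QuestionPaperPipeline {

    /** Hypothetical adapter around whatever OCR engine is used; not a real library API. */
    public interface TextRecognizer {
        String recognize(BufferedImage region);
    }

    /** One extracted question: the recognized text plus its diagram as PNG bytes. */
    public static final class Question {
        public final String text;
        public final byte[] diagramPng;

        public Question(String text, byte[] diagramPng) {
            this.text = text;
            this.diagramPng = diagramPng;
        }
    }

    /** Step 2: crop the (assumed known) text and diagram areas out of the scanned page. */
    public static Question extract(BufferedImage page, Rectangle textArea,
                                   Rectangle diagramArea, TextRecognizer ocr)
            throws IOException {
        BufferedImage textImage = page.getSubimage(
                textArea.x, textArea.y, textArea.width, textArea.height);
        BufferedImage diagram = page.getSubimage(
                diagramArea.x, diagramArea.y, diagramArea.width, diagramArea.height);

        // Encode the diagram as PNG so it can be stored as a binary column.
        ByteArrayOutputStream png = new ByteArrayOutputStream();
        if (!ImageIO.write(diagram, "png", png)) {
            throw new IOException("No PNG writer available");
        }
        return new Question(ocr.recognize(textImage).trim(), png.toByteArray());
    }

    /** Step 3: insert the question into an assumed table questions(body, diagram). */
    public static void store(Connection db, Question q) throws SQLException {
        try (PreparedStatement st = db.prepareStatement(
                "INSERT INTO questions (body, diagram) VALUES (?, ?)")) {
            st.setString(1, q.text);
            st.setBytes(2, q.diagramPng);
            st.executeUpdate();
        }
    }
}
```

    Keeping the OCR engine behind an interface means the rest of the
    pipeline does not depend on which OCR tool is eventually chosen, and it
    can be tried out with a stub before any OCR tool is picked.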

    This will help me work with several question papers, and when papers
    come in the future I only have to scan them and pass them through the
    application. I cannot use another OCR tool for this; well, maybe I
    can't, because I don't know
    mambenanje, Aug 27, 2006
