Getting the Text from Image and PDF

Jan

Hi friends,

This is Jan, I am new to this Group.

I have a requirement here.

Is there any Java API for getting the text data from image and PDF formats? Please let me know. If you find anything, please suggest it.


Thanks && Regards..
Jan
 
Andreas Leitgeb

Jan said:
Hi friends,
This is Jan, I am new to this Group.
I have a requirement here.
Is there any Java API for getting the text data from image
and PDF formats?

For reading characters from graphical data, search for "OCR" (optical
character recognition) together with "java".

PDFs may contain the text directly (non-graphically), which would make
extraction much easier (and not require ocr).
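
To make the distinction concrete, here is a minimal sketch that only decides which route a file should take, by looking at its first bytes. The class, enum and method names are made up for illustration, and it uses nothing beyond the standard library. It does not extract any text itself, and a file that starts with "%PDF-" may still consist of scanned pages only, in which case OCR is needed after all.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: all names here are invented for this sketch.
final class ExtractionRouter {
    enum Kind { PDF, PNG, JPEG, UNKNOWN }

    // Well-known leading bytes ("magic numbers") of each format.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PNG_MAGIC = {(byte) 0x89, 'P', 'N', 'G'};
    private static final byte[] JPEG_MAGIC = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    // Classifies a file from its first few bytes.
    static Kind sniff(byte[] head) {
        if (startsWith(head, PDF_MAGIC)) return Kind.PDF;   // try direct text extraction first
        if (startsWith(head, PNG_MAGIC)) return Kind.PNG;   // pixels only: needs OCR
        if (startsWith(head, JPEG_MAGIC)) return Kind.JPEG; // pixels only: needs OCR
        return Kind.UNKNOWN;
    }

    // Convenience overload that reads just the first 8 bytes of a file.
    static Kind sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return sniff(in.readNBytes(8));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(data, 0, prefix.length, prefix, 0, prefix.length);
    }
}
```

A PDF result means "try to pull the text out directly, and fall back to OCR if nothing comes back"; PNG or JPEG means OCR is the only option.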
 
Roedy Green

Is there any Java API for getting the text data from image and PDF formats?
Please let me know. If you find anything, please suggest it.

You could spawn a copy of Nuance Omnipage. It can OCR PDFs. You might
also just look for PDF --> X converters.

See http://mindprod.com/pdf.html
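
Spawning a converter from Java could look roughly like the sketch below. I don't know Omnipage's command line, so this swaps in two other command-line tools as stand-ins: Poppler's pdftotext (a PDF --> text converter) and the Tesseract OCR program. It assumes both are installed and on the PATH, and that their output is UTF-8. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical wrapper: class and method names are invented for this sketch.
final class ExternalConverter {

    // Assumption: Poppler's pdftotext is on the PATH. A "-" as the output
    // file makes it write the extracted text to standard output.
    static List<String> pdfToTextCommand(String pdfPath) {
        return List.of("pdftotext", pdfPath, "-");
    }

    // Assumption: the Tesseract CLI is on the PATH. Using "stdout" as the
    // output base makes it print the recognised text instead of writing a file.
    static List<String> ocrCommand(String imagePath) {
        return List.of("tesseract", imagePath, "stdout");
    }

    // Runs the command, waits for it to finish and returns what it printed.
    static String run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.DISCARD) // ignore progress chatter
                .start();
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException(command.get(0) + " exited with code " + exitCode);
        }
        // Assumption: the tool emits UTF-8.
        return new String(output, StandardCharsets.UTF_8);
    }
}
```

Usage would be something like ExternalConverter.run(ExternalConverter.pdfToTextCommand("report.pdf")), where report.pdf is a placeholder file name. Any other converter can be dropped in by changing the command list.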
 
