Daniel
Hi All,
I'm working on OCR of old periodicals. The issue is this: I can't access
the layout data encoded in the OCR PDF files and use it regardless of the
original format. There is an appropriate XML standard, ALTO, which matches
each text character with its corresponding graphic zone. But I don't know how
to generate ALTO output. Do you know of any software that produces such
output? Any pointers?
Thanks a lot.
Daniel
Paris