Hello,
I was recently digging into the same area in Python and came to the
following conclusions:
1. You must choose between Abbyy's command-line OCR SDK, which is
proprietary and fairly expensive, and the free Tesseract OCR. Abbyy's
product is great at recognition but has a very restrictive license,
while Tesseract is great and trainable but has very poor layout
analysis.
2. I am not aware of any existing wrapper for either of these
products. Writing a basic wrapper shouldn't be a real problem though,
since basic interaction with them is limited to forking an external
process. Additionally, Tesseract has API bindings for Python, and it
seems that implementing them for Ruby would be an easy task too.
Tesseract would work for you if your input is uniformly formatted
blocks of text. Otherwise you would have to implement an image layout
analysis engine on your own. Also, you had better use the SVN trunk of
Tesseract, because it contains many changes compared to the last
packaged version.
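
To give an idea of how small the basic wrapper from point 2 can be,
here is a rough sketch in Python (Python 3), since that is what I was
using. It only assumes the usual "tesseract IMAGE OUTPUTBASE [-l LANG]"
command line, which writes the recognised text to OUTPUTBASE.txt. The
names build_command and ocr_image are made up for this example, and
your Tesseract build may need different options.

```python
import os
import subprocess
import tempfile


def build_command(image_path, output_base, lang=None, executable="tesseract"):
    """Return the argument list for one Tesseract run (hypothetical helper)."""
    cmd = [executable, image_path, output_base]
    if lang:
        # -l selects the language data to use, e.g. "eng".
        cmd += ["-l", lang]
    return cmd


def ocr_image(image_path, lang=None, executable="tesseract"):
    """Run Tesseract as an external process and return the recognised text."""
    with tempfile.TemporaryDirectory() as tmp:
        # Tesseract appends ".txt" to the output base name itself.
        output_base = os.path.join(tmp, "out")
        subprocess.run(
            build_command(image_path, output_base, lang, executable),
            check=True,                  # raise CalledProcessError on failure
            stdout=subprocess.DEVNULL,   # discard progress messages
            stderr=subprocess.DEVNULL,
        )
        with open(output_base + ".txt", encoding="utf-8") as fh:
            return fh.read()
```

Calling ocr_image("page.png", lang="eng") should then return the text
of the page as a string, provided the tesseract binary is on your PATH.
The same approach (build an argument list, run it, read the output
file) should carry over to Ruby more or less directly.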