Extracting images from a PDF file

Doug Farrell

Hi all,

Does anyone know how to extract images from a PDF file? What I'm looking
to do is use pdflib_py to open large PDF files on our Linux servers,
then use PIL to verify image data. I want to do this in order
to find corrupt images in the PDF files. If anyone could help
me out, or point me in the right direction, it would be most
appreciated!

Also, does anyone know of a way to validate a PDF file?

Thanks in advance,
Doug
 
Carl K

Doug said:
Hi all,

Does anyone know how to extract images from a PDF file? What I'm looking
to do is use pdflib_py to open large PDF files on our Linux servers,
then use PIL to verify image data. I want to do this in order
to find corrupt images in the PDF files. If anyone could help
me out, or point me in the right direction, it would be most
appreciated!

If you are ok shelling out to a binary:

pdfimages - Portable Document Format (PDF) image extractor (version
3.00)
http://packages.ubuntu.com/gutsy/text/xpdf-utils
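
Shelling out from Python could look roughly like the sketch below. It assumes pdfimages is on the PATH and is called as `pdfimages [options] PDF-file image-root`; the function names and the `img` prefix are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_pdfimages_command(pdf_path, output_dir, prefix="img"):
    """Return the argument list for one pdfimages run.

    -j asks pdfimages to save JPEG (DCT) images as .jpg files;
    other images are written in PPM/PBM format.
    """
    image_root = Path(output_dir) / prefix
    return ["pdfimages", "-j", str(pdf_path), str(image_root)]


def extract_images(pdf_path, output_dir, prefix="img"):
    """Run pdfimages and return the files it wrote, sorted by name."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if pdfimages exits non-zero,
    # which is itself a hint that the PDF may be damaged.
    subprocess.run(build_pdfimages_command(pdf_path, out, prefix), check=True)
    return sorted(out.glob(prefix + "-*"))
```

Calling `extract_images("report.pdf", "extracted")` (hypothetical file names) should leave files named like `img-000.jpg` or `img-001.ppm` in `extracted/`.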

I am trying to convert the PDF to a PNG, but without having to run external
commands, so I will understand if you aren't happy with pdfimages.

Carl K
 
writeson

Carl K said:
If you are ok shelling out to a binary:

pdfimages - Portable Document Format (PDF) image extractor (version
3.00)
http://packages.ubuntu.com/gutsy/text/xpdf-utils

I am trying to convert the PDF to a PNG, but without having to run external
commands, so I will understand if you aren't happy with pdfimages.

Carl K

Carl,

Thanks for the feedback, and I don't mind shelling out to an external
command if it gets the job done. Thanks for the link to xpdf-utils,
I'm going to look into it this morning.
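
For the verification step, the check on the extracted files might be a loop like this. It is only a sketch, assuming Pillow (the maintained fork of PIL) is installed; `find_corrupt_images` is a hypothetical helper name, not part of any library.

```python
from pathlib import Path


def find_corrupt_images(image_dir):
    """Return (path, error message) pairs for files Pillow rejects."""
    # Imported here so a script containing this helper still loads
    # on machines where Pillow is not installed.
    from PIL import Image

    corrupt = []
    for path in sorted(Path(image_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            # verify() checks the file structure without decoding pixels.
            with Image.open(path) as img:
                img.verify()
            # An image cannot be used after verify(), so reopen it and
            # decode fully; this also catches truncated image data.
            with Image.open(path) as img:
                img.load()
        except Exception as exc:
            corrupt.append((path, str(exc)))
    return corrupt
```

Anything in the directory that is not an image at all will also be reported, so it is best pointed at a directory that holds only the pdfimages output.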

Doug
 
Max Erickson

Doug Farrell said:
Hi all,

Does anyone know how to extract images from a PDF file? What I'm
looking to do is use pdflib_py to open large PDF files on our
Linux servers, then use PIL to verify image data. I want to do
this in order to find corrupt images in the PDF files. If anyone
could help me out, or point me in the right direction, it would
be most appreciated!

Also, does anyone know of a way to validate a PDF file?

Thanks in advance,
Doug

There is some discussion here:

http://nedbatchelder.com/blog/200712.html#e20071210T064608
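
If an external command is not an option, one pure-Python fallback for JPEG images only is to scan the raw PDF bytes for stream objects that begin with the JPEG start-of-image marker. The sketch below is a heuristic, not a PDF parser: it assumes the JPEGs are stored as-is (the DCTDecode filter with no further encoding) and it will miss images stored any other way. The function name is made up for illustration.

```python
JPEG_START = b"\xff\xd8"  # JPEG start-of-image marker
JPEG_END = b"\xff\xd9"    # JPEG end-of-image marker


def extract_jpegs(pdf_bytes):
    """Return byte strings that look like JPEG files embedded in a PDF."""
    jpegs = []
    pos = 0
    while True:
        stream_at = pdf_bytes.find(b"stream", pos)
        if stream_at < 0:
            break
        end_at = pdf_bytes.find(b"endstream", stream_at)
        if end_at < 0:
            break
        # The stream body follows the keyword and an end-of-line sequence.
        body = pdf_bytes[stream_at + len(b"stream"):end_at].lstrip(b"\r\n")
        if body.startswith(JPEG_START):
            # Cut at the last end-of-image marker inside the stream.
            eoi = body.rfind(JPEG_END)
            if eoi >= 0:
                jpegs.append(body[:eoi + len(JPEG_END)])
        # Resume after this "endstream" so it is not matched as "stream".
        pos = end_at + len(b"endstream")
    return jpegs
```

Each returned byte string can be written to a `.jpg` file or wrapped in `io.BytesIO` and handed to PIL for checking.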



max
 
writeson

writeson said:
Carl,

Thanks for the feedback, and I don't mind shelling out to an external
command if it gets the job done. Thanks for the link to xpdf-utils,
I'm going to look into it this morning.

Doug

Hi,

I believe our Linux servers run CentOS (4.X), and the repositories for
this version don't have xpdf-utils available. I'm going to look into
editing the sources.list file in order to get yum to install the
necessary dependencies for me, as xpdf-utils looks very useful!

Doug
 
