read part of jpeg file by pure python

Pacino · Sep 13, 2007

Hi, everyone,

I am wondering whether it's possible to read part (e.g. 1000*1000) of
a huge JPEG file (e.g. 30000*30000) and save it to another JPEG file
in pure Python. I could not read the whole file and split it, because
that would take about 2GB of memory.
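
For a sense of scale, here is a rough sketch of the arithmetic, assuming the decoded image is held as 8-bit RGB (three bytes per pixel). The function name is only illustrative, and a real decoder may need more than this:

```python
# Rough memory estimate for a fully decoded image held in RAM.
# Assumption: 3 channels (RGB), 1 byte per channel, no extra decoder overhead.
def decoded_size_bytes(width, height, channels=3, bytes_per_channel=1):
    return width * height * channels * bytes_per_channel

full = decoded_size_bytes(30000, 30000)   # the whole picture
tile = decoded_size_bytes(1000, 1000)     # the part that is actually wanted

print(f"full image: {full / 2**30:.2f} GiB")
print(f"1000x1000 tile: {tile / 2**20:.2f} MiB")
```

Under that assumption the full picture lands in the low gigabytes, while the tile itself needs only a few megabytes.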

Can anyone help me? Any comments would be appreciated.

Thanks.

Robert

Laurent Pointal · Sep 13, 2007

Just reading parts of the *file* is easy (see the tell(), seek() and read()
methods on files).
But to extract a part of the *picture*, you must decompress the picture
in memory, grab the sub-picture, and save it back, generally with
compression. I can't see how you can bypass the decompress/compress
phases and the corresponding memory use.
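
A minimal sketch of the file-level part, using only seek(), read() and tell(). The helper name `read_slice` is made up for illustration, and the demo uses an in-memory stand-in rather than a real JPEG:

```python
import io

def read_slice(f, offset, length):
    """Return `length` raw bytes starting at byte `offset` of an open binary file."""
    f.seek(offset)
    return f.read(length)

# Demo on an in-memory stand-in; a real file opened with open(path, "rb")
# supports the same seek()/read()/tell() calls.
demo = io.BytesIO(bytes(range(256)))
chunk = read_slice(demo, 16, 4)
print(chunk, demo.tell())
```

In a JPEG these bytes are still compressed data, so a byte range does not correspond to a rectangle of pixels.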

A+

Laurent.

Pacino · Sep 13, 2007

The most difficult part is the decompression. I don't want the whole
picture to be decompressed in memory, because that would consume a
lot of memory (2GB, as I mentioned). My goal is to decompress only part of
the picture into memory.

I just read an article on this subject
(http://mail.python.org/pipermail/image-sig/1999-April/000713.html),
but haven't tested it yet.

Robert

michal.zaborowski · Sep 13, 2007

I have no idea what it does. Anyway, JPEG encoding works like this:
1. RGB -> YCbCr
2. The data is divided into 8x8 blocks.
3. Each block is run through a discrete cosine transform.
4. The result is quantized to remove "fast changes" (high-frequency detail).
5. The result is then compressed with the Huffman algorithm.

So to get part of the image, you can take a smaller image before step 4.
As far as I understand the code presented at
http://mail.python.org/pipermail/image-sig/1999-April/000713.html,
the full image will be loaded and then cropped.
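
To make steps 2 to 4 a bit more concrete, here is a rough pure-Python sketch of the transform on a single 8x8 block. It is not a JPEG codec: it skips the level shift and colour conversion, and it uses one hypothetical step size where JPEG uses an 8x8 quantization table (the "fast changes" filtering in step 4 is quantization).

```python
import math

N = 8  # JPEG works on 8x8 blocks

def dct_2d(block):
    """Naive 2-D DCT-II of one 8x8 block (a list of 8 lists of 8 numbers)."""
    def scale(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * scale(u) * scale(v) * total
    return out

def quantize(coeffs, step=16):
    """Divide each coefficient by a step size and round.

    Assumption: a single step size for all 64 coefficients, purely for
    illustration; real JPEG uses a table with larger steps for high frequencies.
    """
    return [[round(value / step) for value in row] for row in coeffs]

# A perfectly flat block has no "fast changes": only the DC term survives.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
q = quantize(coeffs)
print(q[0][0], sum(abs(v) for row in q for v in row))
```

Each block is transformed independently of its neighbours, which is why working on a sub-region is conceivable at this stage. The Huffman-coded stream in step 5 is a different matter, since it is read sequentially.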

Pacino · Sep 14, 2007

Thanks. It seems there is no way to meet the requirements.

Dennis Lee Bieber · Sep 14, 2007

Implement a JPEG decoder that doesn't work in memory, but rather writes to
a disk file. Then extract the portion(s) of the disk file you need,
re-encode, etc.
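
A minimal sketch of the extraction step, assuming the decoder has already written the picture to disk as raw 8-bit RGB, row by row, with no header. That layout and the name `read_region` are assumptions made for illustration:

```python
import io

BYTES_PER_PIXEL = 3  # assumption: 8-bit RGB, rows stored top to bottom

def read_region(raw, image_width, left, top, width, height):
    """Read a width x height rectangle from a raw RGB file, one seek per row."""
    rows = []
    for y in range(top, top + height):
        raw.seek((y * image_width + left) * BYTES_PER_PIXEL)
        rows.append(raw.read(width * BYTES_PER_PIXEL))
    return b"".join(rows)

# Demo: a tiny 4x3 "image" in which every pixel is (x, y, 0).
img_w, img_h = 4, 3
raw = io.BytesIO(bytes(v for y in range(img_h)
                         for x in range(img_w)
                         for v in (x, y, 0)))
region = read_region(raw, img_w, left=1, top=1, width=2, height=2)
print(list(region))
```

Only the requested rows are ever held in memory, so a 1000*1000 region costs about 3MB regardless of how large the file on disk is.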

I haven't studied the JPEG format, but wouldn't the size of the image be
somewhere in the file header? That would let you allocate an RGB buffer
on disk. Then start decompressing (is the Huffman coding by scan row, or by
block, or something else? That would control how much you need to keep in
memory before doing seek() on the output file). Since part of the algorithm
works on 8x8 blocks, you /will/ have a lot of seek() calls: write the RGB for
8 pixels, seek to the next row, write, etc.
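
For what it's worth, the dimensions are stored in the start-of-frame (SOF) segment near the beginning of a JPEG stream. Below is a rough sketch of reading them and of the seek-per-row block write described above. The function names are made up, the header in the demo is synthetic, and the block writer assumes plain 8x8 RGB blocks with dimensions that are multiples of 8 (real files with chroma subsampling decode in larger units).

```python
import io
import struct

# Start-of-frame markers: 0xC0..0xCF except DHT (0xC4), JPG (0xC8), DAC (0xCC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}

def read_jpeg_size(f):
    """Return (width, height) from the first start-of-frame segment."""
    if f.read(2) != b"\xff\xd8":
        raise ValueError("missing SOI marker, not a JPEG stream")
    while True:
        byte = f.read(1)
        if not byte:
            raise ValueError("no start-of-frame segment found")
        if byte != b"\xff":
            continue                      # scan forward to the next marker prefix
        marker = f.read(1)
        while marker == b"\xff":          # skip fill bytes before the marker code
            marker = f.read(1)
        if not marker:
            raise ValueError("no start-of-frame segment found")
        if marker[0] in SOF_MARKERS:
            # segment length, sample precision, height, width (big-endian)
            _length, _precision, height, width = struct.unpack(">HBHH", f.read(7))
            return width, height
        (length,) = struct.unpack(">H", f.read(2))
        f.seek(length - 2, 1)             # skip the rest of this segment

def write_block(raw, image_width, bx, by, block):
    """Write one 8x8 RGB block (8 rows of 24 bytes) at block position (bx, by)."""
    for row_index, row in enumerate(block):
        y = by * 8 + row_index
        raw.seek((y * image_width + bx * 8) * 3)
        raw.write(row)

# Synthetic header: SOI, a 16-byte APP0 segment, then SOF0 for a 300x200 image.
header = (b"\xff\xd8"
          + b"\xff\xe0" + struct.pack(">H", 16) + b"JFIF\x00" + bytes(9)
          + b"\xff\xc0" + struct.pack(">HBHHB", 17, 8, 200, 300, 3) + bytes(9))
size = read_jpeg_size(io.BytesIO(header))

# "Allocate" a 16x8 raw RGB buffer by writing its last byte, then fill block (1, 0).
out = io.BytesIO()
out.seek(16 * 8 * 3 - 1)
out.write(b"\x00")
write_block(out, 16, 1, 0, [b"\xab" * 24] * 8)
data = out.getvalue()
print(size, len(data))
```

With a real file the same write_block() call would go to an open(path, "r+b") handle, so only one block at a time needs to be in memory.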
