using PIL for PCA analysis

devnew

Hi guys,
I am trying out PCA using Python. I have a set of JPEG (RGB color)
images whose pixel data I need to extract into a matrix
(rows = number of images, cols = number of pixels).
For this I need to represent an image as an array.
I was able to do this using Java's BufferedImage as below:

<javacode>
int[] rgbdata = new int[width * height];
image.getRGB(0, 0, width, height, rgbdata, 0, width);

// widen each packed pixel value to a double
double[] doubles = new double[rgbdata.length];
for (int i = 0; i < rgbdata.length; i++) {
    doubles[i] = (double) rgbdata[i];
}
</javacode>

This doubles[] now represents a single image's pixels.

Then I can get a matrix of, say, 4 images (each of 4x3 size):
<sampledata>
images[][] rows=4,cols=12
[
[-4413029.0, -1.0463919E7,... -5201255.0]

[-5399916.0, -9411231.0, ... -6583163.0]

[-3886937.0, -1.0202292E7,... -6648444.0]

[-5597295.0, -7901339.0,... -5989995.0]
]
</sampledata>
I can normalise the above matrix to zero mean and then find the
covariance matrix by
images * transpose(images)

My problem is how I can use PIL to do the same thing. If I extract the
image data using im.getdata()
I will get
<sampledata>
[
[(188, 169, 155), (96, 85, 81),.. (176, 162, 153)]

[(173, 154, 148), (112, 101, 97),.. (155, 140, 133)]

[(196, 176, 167), (100, 83, 76), ... (154, 141, 132)]

[(170, 151, 145), (135, 111, 101), ... (164, 153, 149)]
]
</sampledata>
I do not know how to find the covariance matrix from such a matrix. It
would have been ideal if they were single values instead of tuples. I
can't use greyscale images since the input images are all RGB JPEGs.

Can someone suggest a solution?
Thanks,
dn
 
Bronner, Gregory

Since nobody has responded to this:

I know nothing about PIL, but you can do this using numpy and scipy
fairly easily, and you can transform PIL arrays into Numpy arrays pretty
quickly as well.



 
devnew

On Feb 21, 7:35 pm, "Bronner, Gregory" wrote:
you can do this using numpy and scipy
fairly easily, and you can transform PIL arrays into Numpy arrays pretty
quickly as well.

I can use a numpy ndarray or matrix once I have a PIL array with
elements in the correct format (i.e. a single number for each pixel
instead of a tuple of integers).
It is the image data extraction step that is giving me the problem.

That is, I want PIL to return an image as something like
[-4413029.0, -1.0463919E7,... -5201255.0]
instead of
[(188, 169, 155), (96, 85, 81),.. (176, 162, 153)]

Any PIL experts, please help.
dn
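
The negative numbers in the Java sample are consistent with each pixel's alpha, red, green and blue bytes being packed into one signed 32-bit integer, which is what BufferedImage.getRGB() returns. Below is a minimal sketch of the same packing in Python, assuming alpha = 255; `pack_argb` is a made-up helper name, not a PIL function. It uses only the three pixels shown in the first row of the PIL sample. The last lines show an alternative that is probably more useful for PCA: keeping R, G and B as separate columns instead of mixing them into one number.

```python
import numpy as np

# First row of the RGB sample above (only the three pixels that are shown).
row = [(188, 169, 155), (96, 85, 81), (176, 162, 153)]

def pack_argb(rgb):
    """Pack an (R, G, B) tuple into a signed 32-bit ARGB integer (alpha assumed 255)."""
    r, g, b = rgb
    value = (0xFF << 24) | (r << 16) | (g << 8) | b
    # Reinterpret the unsigned value as signed 32-bit; the top bit is set, so it is negative.
    return value - (1 << 32)

packed = [float(pack_argb(p)) for p in row]
print(packed)  # [-4413029.0, -10463919.0, -5201255.0]

# Alternative layout: one column per channel value (R, G, B, R, G, B, ...).
flat = np.asarray(row, dtype=float).reshape(-1)
print(flat.shape)  # (9,)
```

The packed values match the first row of the Java sample, so the two data sets describe the same pixels. With the flattened layout the matrix has 3 x num_pixels columns per image.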
 
Paul McGuire



I'm surprised PIL doesn't have a grayscale conversion, but here is one
that can manipulate your RGB values:

sampledata = [
[(188, 169, 155), (96, 85, 81), (176, 162, 153)],
[(173, 154, 148), (112, 101, 97), (155, 140, 133)],
[(196, 176, 167), (100, 83, 76), (154, 141, 132)],
[(170, 151, 145), (135, 111, 101), (164, 153, 149)],
]

# following approx from http://www.dfanning.com/ip_tips/color2gray.html
# weighted sum of the R, G, B channels, truncated to an integer
grayscale = lambda rgb: int(0.3*rgb[0] + 0.59*rgb[1] + 0.11*rgb[2])
print([[grayscale(rgb) for rgb in row] for row in sampledata])

prints (reformatted to match your sampledata):

[
[173, 87, 165],
[159, 103, 143],
[181, 87, 143],
[156, 117, 155]
]
 
harryos

Paul McGuire wrote:
# following approx from http://www.dfanning.com/ip_tips/color2gray.html
grayscale = lambda (R,G,B) : int(0.3*R + 0.59*G + 0.11*B)
print [ [ grayscale(rgb) for rgb in row ] for row in sampledata ]


Paul,
in the PIL handbook they mention a luma transform on page 15, under the
im.convert() section:
L = R * 299/1000 + G * 587/1000 + B * 114/1000
Is that not similar to what you mentioned? (I am a newbie in this area.)
If I want an array of PIL image data I can use
img = Image.open("myimg.jpg").convert("L")
pixelarray = img.getdata()

Thus I guess I can build a matrix from a set of images.
Is there something wrong in the way I do this above? Maybe I can use
that to find the covariance matrix for the set of images?

H
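
A rough sketch of the covariance step, assuming each grayscale image has already been flattened into one row (for example from img.getdata()). The numbers are the grayscale values Paul McGuire printed above, and "zero mean" is taken here to mean subtracting the mean image, i.e. the per-pixel average over all images; that reading is an assumption.

```python
import numpy as np

# Grayscale values from the output above: 4 images (rows) x 3 pixels (cols).
images = np.array([
    [173, 87, 165],
    [159, 103, 143],
    [181, 87, 143],
    [156, 117, 155],
], dtype=float)

# Assumption: zero mean = subtract the mean image (per-pixel mean over all images).
mean_image = images.mean(axis=0)
centered = images - mean_image

# images * transpose(images): a small (num_images x num_images) matrix.
cov = centered @ centered.T
print(cov.shape)  # (4, 4)
```

Multiplying in this order keeps the matrix at num_images x num_images, which stays small even when each image has many thousands of pixels.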
 
Jan Erik Solem

if i want to do an array of PIL image data i can use
img=Image.open("myimg.jpg").convert("L")
pixelarray=img.getdata()

convert("L") is a good way to make images grayscale. An alternative to
getdata() is numpy's array:
pixelarray = numpy.array(img)
This gives lots of possibilities for working with the images numerically,
for example for PCA (see the example code in the link below).
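
As an illustration of the numpy.array(img) route, here is a small sketch that builds the images matrix (one row per image). It assumes the Pillow packaging of PIL (`from PIL import Image`) and uses tiny solid-colour images created in memory as stand-ins for Image.open("myimg.jpg"); the colours are taken from the sample data earlier in the thread.

```python
import numpy as np
from PIL import Image  # assumes Pillow; classic PIL used "import Image"

# Stand-ins for real JPEG files: four solid-colour RGB images, 4 wide x 3 high.
colors = [(188, 169, 155), (173, 154, 148), (196, 176, 167), (170, 151, 145)]
imgs = [Image.new("RGB", (4, 3), c) for c in colors]

rows = []
for img in imgs:
    gray = np.array(img.convert("L"))  # 2-D array, shape (height, width) = (3, 4)
    rows.append(gray.flatten())        # one image becomes one row of 12 values

matrix = np.array(rows, dtype=float)
print(matrix.shape)  # (4, 12)
```

All images must have the same size for the rows to line up; resizing beforehand would be needed otherwise.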


thus i guess i can build a matrix of a set of images
is there something wrong in the way i do this above?may be i can use
that to find covariance matrix for the set of images?

I wrote a short script for doing PCA on images using Python, with some
explanations and example code, here:
http://jesolem.blogspot.com/2009/01/pca-for-images-using-python.html
Could be of help to you guys.
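
For readers who just want the overall shape of the computation, below is a generic PCA sketch using numpy's SVD. It is not the script from the link above; the data is random placeholder values standing in for a real (num_images x num_pixels) matrix, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder data: 4 "images" with 12 pixel values each.
X = rng.integers(0, 256, size=(4, 12)).astype(float)

# Centre the data by subtracting the mean image.
mean_image = X.mean(axis=0)
centered = X - mean_image

# Rows of Vt are the principal directions in pixel space.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

# Variance captured by each direction, and each image's coordinates in that basis.
variances = S**2 / (X.shape[0] - 1)
projected = centered @ Vt.T
print(Vt.shape, projected.shape)  # (4, 12) (4, 4)
```

The squared singular values S**2 are the eigenvalues of centered @ centered.T, so this gives the same components as diagonalising the small covariance matrix discussed earlier in the thread.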
 
