regular expressions eliminating filenames of type foo.thumbnail.jpg


oscartheduck

Hi folks,

I'm trying to alter a program I posted about a few days ago. It
creates thumbnail images from master images. Nice and simple.

To make sure I can match all variations in spelling of jpeg, and
different cases, I'm using regular expressions.


The code is currently:

-----

#!/usr/bin/env python
from PIL import Image
import glob, os, re

size = 128, 128

def thumbnailer(dir, filenameRx):
    for picture in [ p for p in os.listdir(dir) if
                     os.path.isfile(os.path.join(
                     dir,p)) and filenameRx.match(p) ]:
        file, ext = os.path.splitext(picture)
        im = Image.open (picture)
        im.thumbnail(size, Image.ANTIALIAS)
        im.save(file + ".thumbnail" + ext)

jpg = re.compile(".*\.(jpg|jpeg)", re.IGNORECASE)
thumbnailer(".", jpg)

-----

The problem is this. This code outputs foo.thumbnail.jpg when run, and
when run again it creates foo.thumbnail.thumbnail.jpg and so on,
filling a directory.
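
The runaway naming can be reproduced without touching any image files. The sketch below is only an illustration: `thumbnail_name` is a hypothetical helper that builds the output name the same way the script does, and the pattern is the one from the post.

```python
import os
import re

# Same pattern as in the post: nothing in it excludes ".thumbnail." names.
jpg = re.compile(r".*\.(jpg|jpeg)", re.IGNORECASE)

def thumbnail_name(picture):
    """Hypothetical helper: root + '.thumbnail' + ext, as the script does."""
    root, ext = os.path.splitext(picture)
    return root + ".thumbnail" + ext

first_run = thumbnail_name("foo.jpg")
second_run = thumbnail_name(first_run)

print(first_run)                   # foo.thumbnail.jpg
print(bool(jpg.match(first_run)))  # True: the thumbnail is picked up again
print(second_run)                  # foo.thumbnail.thumbnail.jpg
```

Because os.path.splitext only splits off the last extension, every run appends one more ".thumbnail" to whatever matched.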

The obvious solution is to filter out any name that contains the term
"thumbnail", which I can once again do with a regular expression. My
problem is the construction of this expression.

The relevant page in the tutorial docs is: http://docs.python.org/lib/re-syntax.html

It lists (?<!...) as the proper syntax, with the example given being

I tried adding something like that to my original regex, but it added
a third argument, which re.compile can't accept.

jpg = re.compile("(?<!thumbnail).*\.(jpg|jpeg)", ".*\.(jpg|jpeg)",
re.IGNORECASE)

I tried it with re.search instead and received a lot of errors.


So I tried this:

jpg = re.compile(".*\.(jpg|jpeg)", re.IGNORECASE)
jpg = re.compile("(?<!*thumbnail).jpg", "jpg")
thumbnailer(".", jpg)


Two assignments, but I receive more errors telling me this:

[~/pictures]$ ./thumbnail.2.py
Traceback (most recent call last):
  File "./thumbnail.2.py", line 15, in ?
    jpg = re.search("(?<!*thumbnail).jpg", "jpg")
  File "/usr/local/lib/python2.4/sre.py", line 134, in search
    return _compile(pattern, flags).search(string)
  File "/usr/local/lib/python2.4/sre.py", line 227, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat
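
The "nothing to repeat" message most likely comes from the "*" sitting directly after "(?<!", where there is no preceding item for it to repeat. Note also that a lookbehind placed at the very start of a pattern is always satisfied at position 0, since nothing precedes it. The following is a sketch of one possible fix, not the only one: it puts a fixed-width lookbehind just before the extension. The "jpe?g" spelling and the trailing "$" are additions of this sketch, not part of the original pattern.

```python
import re

# The pattern from the traceback does not compile.
try:
    re.compile(r"(?<!*thumbnail).jpg")
    compiled = True
except re.error:
    compiled = False

# Sketch of an alternative: reject names where ".thumbnail" comes
# immediately before the extension. Lookbehind must be fixed-width,
# which the literal ".thumbnail" is.
jpg = re.compile(r".*(?<!\.thumbnail)\.jpe?g$", re.IGNORECASE)

names = ["foo.jpg", "foo.thumbnail.jpg", "Bar.JPEG", "foo.THUMBNAIL.JPG"]
matched = [n for n in names if jpg.match(n)]

print(compiled)  # False
print(matched)   # ['foo.jpg', 'Bar.JPEG']
```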




--------------------------------------------


I'm stuck as to where to go forwards from here. The code which
produced the above error is:

---------
#!/usr/bin/env python
from PIL import Image
import glob, os, re

size = 128, 128

def thumbnailer(dir, filenameRx):
    for picture in [ p for p in os.listdir(dir) if
                     os.path.isfile(os.path.join(
                     dir,p)) and filenameRx.match(p) ]:
        file, ext = os.path.splitext(picture)
        im = Image.open (picture)
        im.thumbnail(size, Image.ANTIALIAS)
        im.save(file + ".thumbnail" + ext)

jpg = re.compile(".*\.(jpg|jpeg)", re.IGNORECASE)
jpg = re.search("(?<!*thumbnail).jpg", "jpg")
thumbnailer(".", jpg)
png = re.compile(".*\.png", re.IGNORECASE)
thumbnailer(".", png)
gif = re.compile(".*\.gif", re.IGNORECASE)
thumbnailer(".", gif)

---------


I'd like to know where I can find more information about regexes and
how to think with them, as well as some idea of the solution to this
problem. As it stands, I can solve it with a simple os.system call and
allow my OS to do the hard work, but I'd like the code to be portable.
 
Justin Ezequiel

Why not ditch regular expressions altogether for this problem?

[ p for p in os.listdir(dir)
if os.path.isfile(os.path.join(dir,p))
and p.lower().find('.thumbnail.')==-1 ]
 
half.italian

I like `and '.thumbnail.' not in p]` as a better ending. :)
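
One thing to watch with the shorter ending: it drops the .lower() call, so it becomes case-sensitive. A minimal sketch that keeps the `in` test case-insensitive is shown below; `is_master_image` is a hypothetical name used only for illustration, and the tuple form of str.endswith needs Python 2.5 or later.

```python
def is_master_image(name):
    """True for .jpg/.jpeg names that do not contain '.thumbnail.'."""
    lowered = name.lower()
    return lowered.endswith((".jpg", ".jpeg")) and ".thumbnail." not in lowered

names = ["foo.jpg", "foo.thumbnail.jpg", "Bar.JPEG",
         "notes.txt", "Baz.Thumbnail.jpg"]
masters = [n for n in names if is_master_image(n)]

print(masters)  # ['foo.jpg', 'Bar.JPEG']
```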

~Sean
 
oscartheduck

I eventually went with:

#!/usr/bin/env python
from PIL import Image
import glob, os, re

size = 128, 128

def thumbnailer(dir, filenameRx):
    for picture in [ p for p in os.listdir(dir) if
                     os.path.isfile(os.path.join(
                     dir,p)) and filenameRx.match(p) if 'thumbnail' not in p]:
        file, ext = os.path.splitext(picture)
        im = Image.open (picture)
        im.thumbnail(size, Image.ANTIALIAS)
        im.save(file + ".thumbnail" + ext)

jpg = re.compile(".*\.(jpg|jpeg)", re.IGNORECASE)
thumbnailer(".", jpg)


Thanks for the help!
 
oscartheduck

Well, darn.

I just discovered that the computer this is going to run on only has
python version 2.2 installed on it, which apparently doesn't support
checking for a whole word with "in", but instead checks a single letter
against other single letters. 2.4 and, presumably, 2.5 (though I've
actually not used it much) are such huge leaps forward relative to 2.2
that it's frightening to think about where the language is going.

But for now, I'm going to have to work on this problem again from
scratch, it seems.
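
For context, multi-character substring tests with `in` arrived in Python 2.3; before that the left operand had to be a single character. The str.find form suggested earlier in the thread does not depend on that change. A minimal sketch, with `contains` as a hypothetical helper name:

```python
def contains(text, fragment):
    """Substring test via str.find, which returns -1 when fragment is absent."""
    return text.find(fragment) != -1

print(contains("foo.thumbnail.jpg", "thumbnail"))  # True
print(contains("foo.jpg", "thumbnail"))            # False
```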
 
Gabriel Genellina

You can still use a regular expression:

thumbnailRx = re.compile(r"\.thumbnail\.")

for picture in [... and filenameRx.match(p) and not thumbnailRx.search(p)]:
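
Spelled out on a plain list of names, the two-pattern filter might look like the sketch below. The `select` function and the sample names are assumptions made for illustration; the two patterns are the ones from the thread. Note that match() only tries the start of the string, while search() scans the whole name, which is what the ".thumbnail." test needs.

```python
import re

filenameRx = re.compile(r".*\.(jpg|jpeg)", re.IGNORECASE)
thumbnailRx = re.compile(r"\.thumbnail\.")  # case-sensitive, as posted

def select(names):
    """Keep names that look like JPEGs and are not already thumbnails."""
    return [p for p in names
            if filenameRx.match(p) and not thumbnailRx.search(p)]

picked = select(["foo.jpg", "foo.thumbnail.jpg", "bar.JPEG", "x.png"])
print(picked)  # ['foo.jpg', 'bar.JPEG']
```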
 
attn.steven.kuo

Or, one can forgo regular expressions:

prefix = '.thumbnail'
for p in os.listdir(dir):
    root, ext = os.path.splitext(p)
    if not os.path.isfile(os.path.join(dir, p)) \
       or ext.lower() not in ('.jpg', '.jpeg') \
       or root[-10:].lower() == prefix:
        continue
    if os.path.isfile(os.path.join(dir, "%s%s%s" % (root, prefix, ext))):
        print "A thumbnail of %s already exists" % p
    else:
        print "Making a thumbnail of %s" % os.path.join(dir, "%s%s%s" %
                                                        (root, prefix, ext))
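
The same decisions can be tried out on a plain list of names, which makes the logic easy to check without a directory of images. This is only a sketch under that assumption: `plan_thumbnails` and `SUFFIX` are hypothetical names, and membership in the list stands in for the os.path.isfile checks above.

```python
import os

SUFFIX = ".thumbnail"

def plan_thumbnails(names):
    """Return (master, thumbnail) pairs for masters that still lack a thumbnail."""
    existing = set(names)
    todo = []
    for name in names:
        root, ext = os.path.splitext(name)
        if ext.lower() not in (".jpg", ".jpeg"):
            continue  # not a JPEG
        if root.lower().endswith(SUFFIX):
            continue  # already a thumbnail
        target = root + SUFFIX + ext
        if target not in existing:
            todo.append((name, target))
    return todo

plan = plan_thumbnails(["a.jpg", "a.thumbnail.jpg", "b.JPG", "c.txt"])
print(plan)  # [('b.JPG', 'b.thumbnail.JPG')]
```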
 
