Newbie regular expression ?

len

I have the following statement and it works fine;

list1 = glob.glob('*.dat')

However, I now have an additional requirement: the file name must begin
with any case variant of "unq" (UNQ, Unq, unq, ...).

As an example, if I had the following four files in the directory:

unq123abc.dat
xy4223.dat
myfile.dat
UNQxyc123489-24.dat

Only unq123abc.dat and UNQxyc123489-24.dat would be selected.

I have read through the documentation and I am now thoroughly confused!

Len Sumnler
 
Fredrik Lundh

len said:
> I have the following statement and it works fine;
>
> list1 = glob.glob('*.dat')

That's a glob pattern, not a regular expression.

> however I now have an additional requirement: the string must begin
> with any form of "UNQ,Unq,unq,..."

list1 = glob.glob('*.dat')
list1 = [file for file in list1 if file.lower().startswith("unq")]

</F>
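
A minimal sketch of the same filter wrapped in a function, for the case where the glob pattern includes a directory. glob.glob then returns paths rather than bare file names, so the prefix test is applied to os.path.basename. The function name unq_files and its parameters are made up for illustration and do not come from the thread.

```python
import glob
import os

def unq_files(directory=".", prefix="unq", pattern="*.dat"):
    """Return sorted paths in `directory` that match `pattern` and whose
    file name starts with `prefix`, ignoring case."""
    # Hypothetical helper: the name and parameters are illustrative only.
    wanted = prefix.lower()
    paths = glob.glob(os.path.join(directory, pattern))
    # Compare against the file name only, not the directory part of the path.
    return sorted(p for p in paths
                  if os.path.basename(p).lower().startswith(wanted))
```

Calling unq_files() with no arguments looks in the current directory, which corresponds to the two-line version above.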
 
jepler

Here are two ideas that come to mind:
files = glob.glob("UNQ*.dat") + glob.glob("Unq*.dat") + glob.glob("unq*.dat")

files = [f for f in glob.glob("*.dat") if f[:3] in ("UNQ", "Unq", "unq")]

Jeff

 
David Murmann

> Here are two ideas that come to mind:
> files = glob.glob("UNQ*.dat") + glob.glob("Unq*.dat") + glob.glob("unq*.dat")
>
> files = [f for f in glob.glob("*.dat") if f[:3] in ("UNQ", "Unq", "unq")]

Actually, I think he wanted "unq" to be matched case-insensitively, which
could be done with:

files = [f for f in glob.glob("*.dat") if f.lower().startswith("unq")]

David.
 
Micah Elliott

> I have the following statement and it works fine;
>
> list1 = glob.glob('*.dat')
>
> however I now have an additional requirement: the string must begin
> with any form of "UNQ,Unq,unq,..."
>
> as an example if I had the following four files in the directory:
>
> unq123abc.dat
> xy4223.dat
> myfile.dat
> UNQxyc123489-24.dat
>
> only unq123abc.dat and UNQxyc123489-24.dat would be selected

If glob is your preferred means, one option is:

$ touch unq1.dat UnQ1.dat unQ1.dat UNQ1.dat foo.dat
$ python -c 'import glob; print(glob.glob("[uU][nN][qQ]*.dat"))'
['unq1.dat', 'UnQ1.dat', 'unQ1.dat', 'UNQ1.dat']
$ man 3 fnmatch
 
Micah Elliott

> $ man 3 fnmatch

Actually "man 7 glob" would be better (assuming you've got *nix). Also
note that globs are not regular expressions. "pydoc glob" is another
reference.
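
If a regular expression is wanted after all, the standard library can do the conversion: fnmatch.translate turns a glob pattern into a regular expression string, which can then be compiled with re.IGNORECASE. This is only a sketch; the names unq_dat, names and selected are chosen for illustration, and the file list is the one from the original question.

```python
import fnmatch
import re

# fnmatch.translate converts a glob pattern into a regular expression string.
# Compiling it with re.IGNORECASE makes the whole match case-insensitive.
unq_dat = re.compile(fnmatch.translate("unq*.dat"), re.IGNORECASE)

names = ["unq123abc.dat", "xy4223.dat", "myfile.dat", "UNQxyc123489-24.dat"]

# Keep only the names the case-insensitive pattern accepts.
selected = [n for n in names if unq_dat.match(n)]
print(selected)
```

Note that this ignores case in the ".dat" extension as well, which the glob versions in this thread do not.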
 
Steve Holden

len said:
> I have the following statement and it works fine;
>
> list1 = glob.glob('*.dat')
>
> however I now have an additional requirement: the string must begin
> with any form of "UNQ,Unq,unq,..."
>
> as an example if I had the following four files in the directory:
>
> unq123abc.dat
> xy4223.dat
> myfile.dat
> UNQxyc123489-24.dat
>
> only unq123abc.dat and UNQxyc123489-24.dat would be selected
>
> I have read through the documentation and I am now sooooo
> confussedddddd!!

You don't need regular expressions. You want

list1 = glob.glob("[Uu][Nn][Qq]*.dat")

regards
Steve
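
Writing the character classes by hand gets tedious for longer prefixes. Here is a small sketch that builds them; case_insensitive_glob is a hypothetical helper, not part of the glob module, and it assumes the prefix contains only plain ASCII letters and digits (no glob metacharacters such as "*" or "[").

```python
import fnmatch

def case_insensitive_glob(text):
    """Turn each letter of `text` into a two-character class, e.g. "u" -> "[uU]"."""
    # Hypothetical helper; assumes `text` holds only ASCII letters and digits.
    parts = []
    for ch in text:
        if ch.isalpha():
            parts.append("[%s%s]" % (ch.lower(), ch.upper()))
        else:
            parts.append(ch)
    return "".join(parts)

pattern = case_insensitive_glob("unq") + "*.dat"

# fnmatchcase applies the glob pattern to a name without touching the
# filesystem and without normalising case first.
print(pattern, fnmatch.fnmatchcase("UnQ1.dat", pattern))
```

The resulting pattern can be passed straight to glob.glob in place of the hand-written one.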
 
len

Thanks everyone for your help.

I took the option of f1.lower().startswith("unq").

Len Sumnler
 
