Expanding a perl regex to a list of files with full paths

Neil Shadrach

I have a situation where I'd like to specify a list of files (full paths)
with a Perl regular expression rather than shell wildcards, something like
qr!/usr/path[abc]/file[123]|/user/[abc]path/longer/file[456]!
If a list of all files on the machine (find / -print) were cheap to get, I could
simply match against it, but it isn't cheap, and I will want to re-check the expansion at intervals.
A reasonable compromise would seem to be to allow individual directory or file names
to be regular expressions while keeping the separators fixed.
That is, for qr!/usr/path[abc]/file[123]! I could open each directory in turn and pick
out matches. My first example would then have to be handled as two separate expressions,
qr!/usr/path[abc]/file[123]! and qr!/user/[abc]path/longer/file[456]!

Is there a better way to do this? I'd really like to be able to use a single regular expression
with no restrictions ( and without the need to split on the directory separator ).
I've looked at File::Find but couldn't see how it could be used for this case.
 
Anno Siegel

Neil Shadrach said:
I've looked at File::Find but couldn't see how it could be used for this case.

Why not? $File::Find::name contains the complete path, you can match
against that.
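
A minimal sketch of that suggestion, assuming the search can be limited to one or more known root directories. The helper name expand_regex is hypothetical, and the restriction to plain files (-f) is an assumption that the original question does not state.

```perl
use strict;
use warnings;
use File::Find;

# Return every plain file under the given roots whose full path matches $re.
# expand_regex is a hypothetical helper name, not part of File::Find.
sub expand_regex {
    my ($re, @roots) = @_;
    my @matches;
    find(
        {
            no_chdir => 1,    # do not chdir, so the full path can be tested directly
            wanted   => sub {
                # $File::Find::name holds the complete path of the current entry
                push @matches, $File::Find::name
                    if -f $File::Find::name && $File::Find::name =~ $re;
            },
        },
        @roots,
    );
    my @sorted = sort @matches;
    return @sorted;
}
```

It could be called as expand_regex(qr!^/usr/path[abc]/file[123]$!, '/usr'). The anchors are added here so that the pattern has to cover the whole path. Note that this still visits every entry below the roots; it only filters what is reported.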

In general, there is no other way but to dive into *all* directories
looking for a match. With a general regex you won't be able to tell
from looking at a directory name whether the directory can hold
matching files or not. Just think of /^a.*a$/ or even /^(.).*\1/.
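
To make the point concrete, here is a small check using the two patterns above. The sample paths and the helper name path_matches are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $path matches $re.
sub path_matches {
    my ($re, $path) = @_;
    return $path =~ $re ? 1 : 0;
}

# The first character must occur again later in the string.
my $backref = qr/^(.).*\1/;
# The string must start and end with "a".
my $a_to_a  = qr/^a.*a$/;

for my $case ([$backref, '/usr'], [$backref, '/usr/bin'],
              [$a_to_a,  'a/b'],  [$a_to_a,  'a/b/a']) {
    my ($re, $path) = @$case;
    printf "%-10s %s\n", $path, path_matches($re, $path) ? 'matches' : 'no match';
}
```

Here '/usr' and 'a/b' do not match, while the longer paths '/usr/bin' and 'a/b/a' do, so a failed match on a directory name says nothing about what lies below it.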

So, unless you can divide the regex into parts that match individual
path components, you will have to walk the entire directory tree.
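
If the expression can be divided that way, the walk might be sketched as below. This is only one possible approach, under the assumption that the caller supplies one pattern per path level; expand_components is a hypothetical name, and an alternation such as the one in the question would need one call per alternative.

```perl
use strict;
use warnings;

# Expand one pattern per path level, starting at $root.
# Only entries whose names match the current level's pattern are kept,
# so directories that cannot lead to a match are never opened.
sub expand_components {
    my ($root, @parts) = @_;
    my @paths = ($root);
    while (@parts) {
        my $re = shift @parts;
        my @next;
        for my $dir (@paths) {
            opendir(my $dh, $dir) or next;    # skip non-directories and unreadable entries
            my $prefix = $dir =~ m{/\z} ? $dir : "$dir/";
            for my $entry (sort readdir $dh) {
                next if $entry eq '.' || $entry eq '..';
                # the pattern has to cover the whole component name
                push @next, "$prefix$entry" if $entry =~ /\A(?:$re)\z/;
            }
            closedir $dh;
        }
        @paths = @next;
    }
    return @paths;
}
```

For example, expand_components('/', qr/usr/, qr/path[abc]/, qr/file[123]/) would open only / and /usr plus the matching path[abc] directories. The sketch does not check whether the final entries are plain files.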

Anno
 
Neil Shadrach

Anno said:
Why not? $File::Find::name contains the complete path, you can match
against that.

True. What I rather vaguely had in mind was a way of using the module that
would optimize the search, that is, not descend into paths that could not possibly
match, if this could be determined from the expression. I guess this implies breaking
the expression down in some way. I did this by splitting on '/', but of course I've
then lost the single regex. That may be my best practical option.

Anno said:
In general, there is no other way but to dive into *all* directories
looking for a match. With a general regex you won't be able to tell
from looking at a directory name whether the directory can hold
matching files or not. Just think of /^a.*a$/ or even /^(.).*\1/.

So, unless you can divide the regex into parts that match individual
path components, you will have to walk the entire directory tree.

That's the conclusion I came to but I've been wrong before so I thought I'd ask :)

Thanks

Neil Shadrach
 
