Regex for special chars..

NurAzije

Hi,
I need a regular expression which will take all chars from a string
which can be used for file naming on Linux, something which will
filter the string from any char which is not allowed to be in a regular
file name.
I think the allowed ones are a-zA-Z0-9. I need a regex that will filter
out everything that is not in this combination.
Thank you

regards,
Nur
 
Jürgen Exner

NurAzije said:
I need a regular expression which will take all chars from a string
which can be used for file naming on Linux, something which will
filter the string from any char which is not allowed to be in a
regular file name.
I think the allowed ones are a-zA-Z0-9

There are definitely many more. As far as I remember any character even
including line break and CR can be used. Exception being the forward slash
because that is reserved as the directory separator.
But why don't you ask in a NG that actually deals with Linux? BTW: _WHICH_
Linux file system? AFAIR there are about half a dozen.

NurAzije said:
I need a regex that will filter
out everything that is not in this combination.

REs don't filter, they match.


jue
 
Anno Siegel

NurAzije said:
Hi,
I need a regular expression which will take all chars from a string
which can be used for file naming on Linux, something which will
filter the string from any char which is not allowed to be in a regular
file name.
I think the allowed ones are a-zA-Z0-9. I need a regex that will filter
out everything that is not in this combination.

You can use a lot more characters in file names.

To check a string for occurrence of a set of characters use tr///:

if ( $str !~ tr/a-zA-Z0-9//c ) { # string is okay: nothing outside a-zA-Z0-9
}
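
A minimal sketch of the same check as a complete script, in case the one-liner is hard to read. The helper name only_alnum and the sample names are made up for illustration; the point is that tr/// with /c and an empty replacement list modifies nothing and just returns a count.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $name contains only a-z, A-Z and 0-9.
sub only_alnum {
    my ($name) = @_;
    # With /c and an empty replacement list, tr/// changes nothing and
    # returns how many characters fall outside the listed set.
    my $bad = ( $name =~ tr/a-zA-Z0-9//c );
    return $bad == 0;
}

for my $name ( 'report2005', 'my file.txt' ) {
    printf "%-12s %s\n", $name,
        only_alnum($name) ? 'okay' : 'has other characters';
}
```

Note that an empty string also passes this check, so a real script would probably want to test for that separately.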

Anno
 
Brian McCauley

Jürgen Exner said:
There are definitely many more. As far as I remember any character even
including line break and CR can be used. Exception being the forward slash
because that is reserved as the directory separator.

You also can't use NUL (ie. character 0) because the POSIX API uses
NUL-terminated strings.

Jürgen Exner said:
But why don't you ask in a NG that actually deals with Linux?

Hmmm.... do you think he'd get a very positive reception?
 
Tad McClellan

NurAzije said:
I need a regular expression which will take all chars from a string
which can be used for file naming on Linux,


warn "'$fname' has illegal chars\n" if $fname =~ m#/|\000#; # untested


There are only 2 ASCII characters that are not allowed in
filenames on the *nix filesystems that I've seen.
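
If the goal is to repair a name rather than just warn about it, the same two characters can be replaced instead of matched. This is only a sketch: sanitize_posix is a made-up name, and it assumes "/" and NUL really are the only bytes the target file system rejects.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the two bytes a POSIX file name cannot
# contain ("/" and NUL) with "_" and leave everything else alone.
sub sanitize_posix {
    my ($fname) = @_;
    # The replacement list is shorter than the search list, so its last
    # character ("_") is reused for both "/" and NUL.
    $fname =~ tr{/\000}{_};
    return $fname;
}

print sanitize_posix("2005/06 report"), "\n";   # prints: 2005_06 report
```

The empty string and the names "." and ".." still cannot be used for a new file, so a caller would have to handle those separately.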
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to Tad McClellan]

Tad McClellan said:
warn "'$fname' has illegal chars\n" if $fname =~ m#/|\000#; # untested
There are only 2 ASCII characters that are not allowed in
filenames on the *nix filesystems that I've seen.

The convenience of ASCII is that there are so many of the standards to
choose from... So if you consider OS X filesystem as *nix, things
quickly go down the drain (UTF-8 encoding *enforced* on the file
system level).
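
For a file system like that, one rough pre-check is whether the name is well-formed UTF-8 at all. The sketch below assumes that "enforced" means malformed byte sequences are rejected; is_valid_utf8 is a made-up helper name, while decode(), FB_CROAK and LEAVE_SRC come from the core Encode module.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: true when the byte string $bytes decodes as
# strict UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
    # decode() from modifying $bytes.
    my $ok = eval {
        Encode::decode( 'UTF-8', $bytes,
            Encode::FB_CROAK() | Encode::LEAVE_SRC() );
        1;
    };
    return $ok ? 1 : 0;
}

# "\xC5\xA1" is the UTF-8 encoding of š; a lone "\x9A" is not valid UTF-8.
print is_valid_utf8("dpds\xC5\xA1") ? "valid\n" : "invalid\n";
print is_valid_utf8("dpds\x9A")     ? "valid\n" : "invalid\n";
```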

Hope this helps,
Ilya
 
NurAzije

I have a script which will take the string from the DB, then compare
the string with this REGEX and replace every char which is not from the
allowed ascii with "_", then name a file with the new string.. for
example:
"asjiuel,dpdsš3898d*?jn" to "asjiuel_dpds_3898d__jn"
I need the right regex that will mark everything not allowed..
Thank you..
 
John W. Krahn

NurAzije said:
I have a script which will take the string from the DB, then compare
the string with this REGEX and replace every char which is not from the
allowed ascii with "_", then name a file with the new string.. for
example:
"asjiuel,dpdsš3898d*?jn" to "asjiuel_dpds_3898d__jn"
I need the right regex that will mark everything not allowed..

What do you mean by "allowed ascii"? ',', '*' and '?' are ASCII. And why do
you think that you need to use a regular expression?

use utf8;   # the literal below contains a non-ASCII character (š)
my $string = q[asjiuel,dpdsš3898d*?jn];

$string =~ tr/a-zA-Z0-9/_/c;

print "$string\n";



John
 
NurAzije

Hi,
this $string =~ tr/a-zA-Z0-9/_/c; will do the opposite of what I need; I
need something to replace everything not in a-zA-Z0-9 with _.
By "allowed" I meant the ones I can use to name a file.
 
Anno Siegel

NurAzije said:
Hi,
this $string =~ tr/a-zA-Z0-9/_/c; will do the opposite of what I need; I
need something to replace everything not in a-zA-Z0-9 with _.

Have you bothered to look up what the /c option does in tr///?
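
For anyone who has not looked it up yet, here is a small sketch of the difference. The sample string is made up; /c complements the search list, so the translation applies to every character that is NOT listed.

```perl
use strict;
use warnings;

my $without_c = 'dp,ds*38';
my $with_c    = 'dp,ds*38';

# Without /c: every character that IS in a-zA-Z0-9 becomes "_".
$without_c =~ tr/a-zA-Z0-9/_/;

# With /c: the set is complemented, so every character that is NOT in
# a-zA-Z0-9 becomes "_".
$with_c =~ tr/a-zA-Z0-9/_/c;

print "$without_c\n";   # prints: __,__*__
print "$with_c\n";      # prints: dp_ds_38
```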

Anno
 
Brad Baxter

NurAzije said:
Thank you guys, I have found it:
[^a-z|A-Z|0-9]
thank you anyway..

No, you haven't. IMO, you should be thanking John for this
(apparently, you didn't try running it):

use utf8;   # the literal below contains a non-ASCII character (š)
my $string = q[asjiuel,dpdsš3898d*?jn];

$string =~ tr/a-zA-Z0-9/_/c;

# prints: asjiuel_dpds_3898d__jn
print "$string\n";


Because what you think you have found does this:

use utf8;   # the literal below contains a non-ASCII character (š)
my $str = q[as|jiuel,dp|dsš38|98d*?jn];

$str =~ s/[^a-z|A-Z|0-9]/_/g;

# prints: as|jiuel_dp|ds_38|98d__jn
print "$str\n";

Hint: lose the or-bars. 'Or' is understood in character
classes.
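
A quick sketch of the corrected class next to the tr///c form, using a made-up ASCII-only sample string. With the or-bars gone, "|" is no longer treated as an allowed character, and both forms give the same result.

```perl
use strict;
use warnings;

my $input = 'as|jiuel,dp|ds*?jn';

# Character class without or-bars: "|" is replaced like any other
# character outside a-zA-Z0-9.
( my $via_regex = $input ) =~ s/[^a-zA-Z0-9]/_/g;

# The tr///c form, for comparison.
( my $via_tr = $input ) =~ tr/a-zA-Z0-9/_/c;

print "$via_regex\n";   # prints: as_jiuel_dp_ds__jn
print "$via_tr\n";      # prints: as_jiuel_dp_ds__jn
```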

Regards,
 
Tad McClellan

NurAzije said:
this $string =~ tr/a-zA-Z0-9/_/c; will do the opposite of what I need,


Why do you say that?

Did you try it?

Show us the code where it fails to do what you asked for, and
we will be able to explain it to you.

NurAzije said:
I need something to replace everything not in a-zA-Z0-9 with _.


The code above will replace everything not in a-zA-Z0-9 with _,
so what is the problem?
 
