Regex for special chars..

Discussion in 'Perl Misc' started by NurAzije, Apr 18, 2006.

  1. NurAzije

    NurAzije Guest

    Hi,
    I need a regular expression that keeps only the characters of a string
    that can be used in file names on Linux, that is, something that
    strips out any character not allowed in a regular file name.
    I think the allowed ones are a-zA-Z0-9. I need a regex that filters
    out everything that is not in this set.
    Thank you

    regards,
    Nur
     
    NurAzije, Apr 18, 2006
    #1

  2. NurAzije wrote:
    > I need a regular expression that keeps only the characters of a string
    > that can be used in file names on Linux, that is, something that
    > strips out any character not allowed in a regular file name.
    > I think the allowed ones are a-zA-Z0-9


    There are definitely many more. As far as I remember, any character can
    be used, even line feed and CR. The exception is the forward slash,
    because it is reserved as the directory separator.
    But why don't you ask in a NG that actually deals with Linux? BTW: _WHICH_
    Linux file system? AFAIR there are about half a dozen.

    > I need a regex that filters
    > out everything that is not in this set.


    REs don't filter, they match.


    jue
     
    Jürgen Exner, Apr 18, 2006
    #2

  3. Anno Siegel

    Anno Siegel Guest

    NurAzije wrote in comp.lang.perl.misc:
    > Hi,
    > I need a regular expression that keeps only the characters of a string
    > that can be used in file names on Linux, that is, something that
    > strips out any character not allowed in a regular file name.
    > I think the allowed ones are a-zA-Z0-9. I need a regex that filters
    > out everything that is not in this set.


    You can use a lot more characters in file names.

    To check whether a string contains any character outside a given set,
    use tr/// with the /c (complement) option:

    if ( $str !~ tr/a-zA-Z0-9//c ) { # string is okay
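
    As a minimal sketch of that check, assuming a hypothetical helper
    named only_alnum (the name is not from this thread): with /c and an
    empty replacement list, tr/// just counts the characters outside the
    set and leaves the string unchanged.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: true when $str contains nothing outside a-zA-Z0-9.
    # With /c and an empty replacement list, tr/// only counts the
    # complement characters; it does not modify $str.
    sub only_alnum {
        my ($str) = @_;
        my $bad = ( $str =~ tr/a-zA-Z0-9//c );
        return $bad == 0;
    }

    print only_alnum('report2006') ? "okay\n" : "not okay\n";
    print only_alnum('a*b?c')      ? "okay\n" : "not okay\n";
    ```

    The first call reports "okay" and the second "not okay", because
    '*' and '?' fall outside the set.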

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
     
    Anno Siegel, Apr 18, 2006
    #3
  4. Jürgen Exner wrote:
    > NurAzije wrote:
    > > I need a regular expression that keeps only the characters of a string
    > > that can be used in file names on Linux, that is, something that
    > > strips out any character not allowed in a regular file name.
    > > I think the allowed ones are a-zA-Z0-9

    >
    > There are definitely many more. As far as I remember, any character can
    > be used, even line feed and CR. The exception is the forward slash,
    > because it is reserved as the directory separator.


    You also can't use NUL (i.e. character 0) because the POSIX API uses
    NUL-terminated strings.

    > But why don't you ask in a NG that actually deals with Linux?


    Hmmm.... do you think he'd get a very positive reception?
     
    Brian McCauley, Apr 18, 2006
    #4
  5. NurAzije wrote:

    > I need a regular expression that keeps only the characters of a string
    > that can be used in file names on Linux,



    warn "'$fname' has illegal chars\n" if $fname =~ m#/|\000#; # untested


    There are only 2 ASCII characters that are not allowed in
    filenames on the *nix filesystems that I've seen.
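
    If the goal is to replace just those two characters rather than
    warn about them, a rough sketch could look like this. The helper
    name replace_slash_and_nul is made up for illustration, and "_" as
    the replacement is taken from the example later in this thread:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: replace only "/" and NUL, the two characters
    # named above, and leave everything else alone.
    sub replace_slash_and_nul {
        my ($fname) = @_;
        # Both search characters map to "_": tr/// repeats the last
        # replacement character when the replacement list is shorter.
        ( my $safe = $fname ) =~ tr{/\000}{_};
        return $safe;
    }

    print replace_slash_and_nul("a/b c*d"), "\n";    # prints: a_b c*d
    ```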


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Apr 18, 2006
    #5
  6. [A complimentary Cc of this posting was sent to
    Tad McClellan], who wrote:

    > warn "'$fname' has illegal chars\n" if $fname =~ m#/|\000#; # untested


    > There are only 2 ASCII characters that are not allowed in
    > filenames on the *nix filesystems that I've seen.


    The convenience of ASCII is that there are so many standards to
    choose from... So if you consider the OS X filesystem as *nix, things
    quickly go down the drain (UTF-8 encoding *enforced* on the file
    system level).

    Hope this helps,
    Ilya
     
    Ilya Zakharevich, Apr 19, 2006
    #6
  7. NurAzije

    NurAzije Guest

    I have a script which takes a string from the DB, compares it against
    this regex, replaces every char that is not in the allowed ASCII set
    with "_", and then names a file with the new string. For example:
    "asjiuel,dpdsš3898d*?jn" becomes "asjiuel_dpds_3898d__jn"
    I need the right regex that will match everything not allowed.
    Thank you.
     
    NurAzije, Apr 19, 2006
    #7
  8. NurAzije wrote:
    > I have a script which takes a string from the DB, compares it against
    > this regex, replaces every char that is not in the allowed ASCII set
    > with "_", and then names a file with the new string. For example:
    > "asjiuel,dpdsš3898d*?jn" becomes "asjiuel_dpds_3898d__jn"
    > I need the right regex that will match everything not allowed.


    What do you mean by "allowed ascii"? ',', '*' and '?' are ASCII. And why do
    you think that you need to use a regular expression?

    use utf8;  # assumes this source is saved as UTF-8, so "š" is one character
    my $string = q[asjiuel,dpdsš3898d*?jn];

    $string =~ tr/a-zA-Z0-9/_/c;

    print "$string\n";



    John
    --
    use Perl;
    program
    fulfillment
     
    John W. Krahn, Apr 19, 2006
    #8
  9. NurAzije

    NurAzije Guest

    Hi,
    this $string =~ tr/a-zA-Z0-9/_/c; will do the opposite of what I need. I
    need something to replace everything not in a-zA-Z0-9 with _ ..
    By "allowed" I meant the ones I can use to name a file..
     
    NurAzije, Apr 19, 2006
    #9
  10. NurAzije

    NurAzije Guest

    Thank you guys, I have found it:
    [^a-z|A-Z|0-9]
    thank you anyway..
     
    NurAzije, Apr 19, 2006
    #10
  11. Anno Siegel

    Anno Siegel Guest

    NurAzije wrote in comp.lang.perl.misc:
    > Hi,
    > this $string =~ tr/a-zA-Z0-9/_/c; will do the opposite of what I need. I
    > need something to replace everything not in a-zA-Z0-9 with _ ..


    Have you bothered to look up what the /c option does in tr///?

    Anno
     
    Anno Siegel, Apr 19, 2006
    #11
  12. Brad Baxter

    Brad Baxter Guest

    NurAzije wrote:
    > Thank you guys, I have found it:
    > [^a-z|A-Z|0-9]
    > thank you anyway..


    No, you haven't. IMO, you should be thanking John for this
    (apparently, you didn't try running it):

    my $string = qq[asjiuel,dpds\2323898d*?jn];  # qq so that \232 is one byte, not four characters

    $string =~ tr/a-zA-Z0-9/_/c;

    # prints: asjiuel_dpds_3898d__jn
    print "$string\n";


    Because what you think you have found does this:

    my $str = qq[as|jiuel,dp|ds\23238|98d*?jn];  # qq so that \232 is one byte, not four characters

    $str =~ s/[^a-z|A-Z|0-9]/_/g;

    # prints: as|jiuel_dp|ds_38|98d__jn
    print "$str\n";

    Hint: lose the or-bars. 'Or' is understood in character
    classes.
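
    A short sketch of the class with the or-bars removed, next to the
    tr///c form from earlier in the thread. The variable names are only
    for illustration, and the input is the string above without the
    non-ASCII byte:

    ```perl
    use strict;
    use warnings;

    my $input = 'as|jiuel,dp|ds38|98d*?jn';

    # Character class without or-bars: "|" is now treated like any other
    # character outside a-zA-Z0-9 and gets replaced.
    ( my $by_regex = $input ) =~ s/[^a-zA-Z0-9]/_/g;

    # The tr///c form suggested earlier in the thread, for comparison.
    ( my $by_tr = $input ) =~ tr/a-zA-Z0-9/_/c;

    print "$by_regex\n";    # prints: as_jiuel_dp_ds38_98d__jn
    print "$by_tr\n";       # prints: as_jiuel_dp_ds38_98d__jn
    ```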

    Regards,

    --
    Brad
     
    Brad Baxter, Apr 19, 2006
    #12
  13. NurAzije wrote:

    > this $string =~ tr/a-zA-Z0-9/_/c; will do the opposite of what I need.



    Why do you say that?

    Did you try it?

    Show us the code where it fails to do what you asked for, and
    we will be able to explain it to you.


    > I
    > need something to replace everything not in a-zA-Z0-9 with _ ..



    The code above will replace everything not in a-zA-Z0-9 with _,
    so what is the problem?


     
    Tad McClellan, Apr 19, 2006
    #13