remove non alphanumeric characters

Discussion in 'C Programming' started by joe, Mar 2, 2007.

  1. joe

    joe Guest

    Hello, I have a database program that uses char arrays to output data to
    reports. I would like to remove all invalid characters from the array
    and replace them with a blank space. I have problems with ( ' return
    and some non-ASCII characters. Any quick and dirty way to do this?
    Thanks.
     
    joe, Mar 2, 2007
    #1

  2. joe

    santosh Guest

    joe wrote:
    > Hello, I have a database program that uses char arrays to output data to
    > reports. I would like to remove all invalid characters from the array
    > and replace them with a blank space. I have problems with ( ' return
    > and some non-ASCII characters. Any quick and dirty way to do this?
    > Thanks.


    Have a look at the various is* functions in ctype.h. A combination of
    them should do what you want. For example, ispunct returns true if the
    argument is a printable character that is neither alphanumeric nor a
    space. Similarly, iscntrl returns true if its argument is a control
    character. You can use such functions (which may also be implemented as
    macros) to identify and strip out the unwanted characters.

    Beware that some of the is* functions (isblank, for example) were added
    in C99, which not every compiler fully supports.
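
    As a rough sketch of that idea (the function name strip_punct_and_cntrl
    is made up for illustration, and the default "C" locale is assumed),
    something along these lines would blank out punctuation and control
    characters in place:

```c
#include <ctype.h>

/* Hypothetical helper: overwrite punctuation and control characters
   with spaces, in place. The cast to unsigned char matters: passing a
   negative char value to the is* functions is undefined behaviour. */
void strip_punct_and_cntrl(char *s)
{
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (ispunct(c) || iscntrl(c))
            *s = ' ';
    }
}
```

    Note that '\n' and '\t' are control characters too, so this version
    blanks them as well; test for them first if they should survive.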
     
    santosh, Mar 2, 2007
    #2

  3. joe

    Guest

    On 2 Mar, 16:51, "joe" <> wrote:
    > Hello, I have a database program that uses char arrays to output data to
    > reports. I would like to remove all invalid characters from the array
    > and replace them with a blank space. I have problems with ( ' return
    > and some non-ASCII characters. Any quick and dirty way to do this?



    Apart from scanning the array and checking each character with
    isprint()?

    I doubt it. But I don't think something like this (WARNING: untested)
    is too hard:-

    #include <ctype.h>

    void cleanup(char *string) {
        while (*string) {
            /* cast: a negative char passed to isprint() is undefined */
            if (!isprint((unsigned char)*string)) {
                *string = ' ';
            }
            string++;
        }
    }

    Adjust to your needs, but I think the isxxxx() functions in ctype.h
    are what you need.

    Variations are to declare a lookup table of valid characters and
    validate against it, or (more efficiently) to do what I believe
    isxxxx() normally does and set up an array of flags which can be
    indexed by the character we are testing.
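
    A minimal sketch of the flag-array variation might look like this. The
    names keep, init_keep and cleanup_table are invented for the example,
    and how a real ctype.h implements isxxxx() is not something this
    relies on:

```c
#include <limits.h>

/* One flag per possible unsigned char value; nonzero means "keep". */
static unsigned char keep[UCHAR_MAX + 1];

/* Hypothetical setup: mark every character in allowed as valid. */
void init_keep(const char *allowed)
{
    int i;

    for (i = 0; i <= UCHAR_MAX; i++)
        keep[i] = 0;
    for (; *allowed != '\0'; allowed++)
        keep[(unsigned char)*allowed] = 1;
}

/* Replace every character whose flag is not set with a space. */
void cleanup_table(char *string)
{
    for (; *string != '\0'; string++) {
        if (!keep[(unsigned char)*string])
            *string = ' ';
    }
}
```

    Call init_keep() once with the set of characters you accept, then run
    cleanup_table() on each array. Indexing through unsigned char keeps
    non-ASCII bytes from producing a negative index.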

    (I expect this will get torn to shreds ...)
     
    , Mar 2, 2007
    #3
  4. "joe" <> writes:
    > Hello, I have a database program that uses char arrays to output data to
    > reports. I would like to remove all invalid characters from the array
    > and replace them with a blank space. I have problems with ( ' return
    > and some non-ASCII characters. Any quick and dirty way to do this?


    The first thing you need to do is define what you mean by "invalid
    characters".

    You tell us you "have problems", but you don't tell us what the
    problems are; that makes it impossible to suggest a solution.

    Sometimes 90% of the effort of getting an answer is just asking the
    right question.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Mar 2, 2007
    #4
  5. "joe" <> wrote in message
    > Hello, I have a database program that uses char arrays to output data to
    > reports. I would like to remove all invalid characters from the array
    > and replace them with a blank space. I have problems with ( ' return
    > and some non-ASCII characters. Any quick and dirty way to do this?
    > Thanks.
    >

    Why do it quick and dirty when a decent program only takes a minute?

    #include <ctype.h>

    /*
    Must a character be replaced by a space?
    Params: ch - character to test
    Returns: 1 if must be replaced, 0 if must be retained
    */
    int replaceme(char ch)
    {
        /* the is* functions need a value representable as unsigned char */
        unsigned char uch = (unsigned char) ch;

        if(isalnum(uch))
            return 0;
        if(isspace(uch))
        {
            if(ch == '\n' || ch == '\t')
                return 0;
            else
                return 1;
        }
        /* other conditions here for punctuation and so on */
        return 1; /* assumed default: replace anything not retained above */
    }

    /*
    This might need a substantial rewrite if you wish to distinguish gibberish
    from a name which might have one or two European or punctuation characters
    embedded in it, e.g. O'Rourke, Bronte with two dots over the e, and so
    forth.
    */
    void fixstring(char *str)
    {
        while(*str)
        {
            if(replaceme(*str))
                *str = ' ';
            str++;
        }
    }
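
    One way the "other conditions" for punctuation could be filled in is
    with a small whitelist. This is only a sketch: the name replaceme_names
    and the contents of extra_ok are assumptions, chosen so that a name
    like O'Rourke survives.

```c
#include <ctype.h>
#include <string.h>

/* Assumed whitelist: punctuation that may legitimately appear in names. */
static const char extra_ok[] = "'-.";

/* Returns 1 if ch must be replaced, 0 if it must be retained. */
int replaceme_names(char ch)
{
    unsigned char c = (unsigned char)ch;

    if (isalnum(c) || c == ' ')
        return 0;
    /* strchr() also matches the terminating '\0', so exclude it here */
    if (c != '\0' && strchr(extra_ok, c) != NULL)
        return 0;
    return 1;
}
```

    Accented letters are a separate matter: in the default "C" locale
    isalnum() rejects them, so they would still be replaced here.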

    --
    Free games and programming goodies.
    http://www.personal.leeds.ac.uk/~bgy1mm
     
    Malcolm McLean, Mar 3, 2007
    #5
  6. joe

    joe Guest

    Thanks guys, I will try to test some of this code. I have two problems
    that arise from weird characters. I am storing the output from my Sybase
    database in a char array. Sometimes a weird character like ' crashes
    the C program. Sometimes a '(' messes up the HTML pages. I am sure
    there are more problems, but those are the two I remember. I will try
    the isprint trick. Thanks.
     
    joe, Mar 5, 2007
    #6
