Is there a string function to trim all non-ascii characters out of a string

Discussion in 'Python' started by silverburgh.meryl@gmail.com, Dec 31, 2007.

  1. Guest

    Hi,

    Is there a string function to trim all non-ascii characters out of a
    string?
    Say I have a string in Python (which is UTF-8 encoded): is there a
    Python function that converts it to a string composed of only ASCII
    characters?

    Thank you.
    , Dec 31, 2007
    #1

  2. Dan Bishop Guest

    Re: Is there a string function to trim all non-ascii characters out of a string

    On Dec 31, 2:20 am, ""
    <> wrote:
    > Hi,
    >
    > Is there a string function to trim all non-ascii characters out of a
    > string?
    > Let say I have a string in python (which is utf8 encoded), is there a
    > python function which I can convert that to a string which composed of
    > only ascii characters?
    >
    > Thank you.


    def ascii_chars(string):
        return ''.join(char for char in string if ord(char) < 128)
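    A short usage sketch of the function above, written for Python 3 (an
    assumption; the thread predates it). The sample bytes are hypothetical
    and are decoded first, so the filter sees whole characters rather than
    the individual bytes of a multi-byte UTF-8 sequence:

```python
def ascii_chars(string):
    """Return only the characters of `string` whose code point is below 128."""
    return ''.join(char for char in string if ord(char) < 128)

# Hypothetical sample input: the UTF-8 bytes for '£42 café', decoded to text.
sample = b'\xc2\xa342 caf\xc3\xa9'.decode('utf-8')
result = ascii_chars(sample)
print(result)  # 42 caf
```

    Note that the non-ASCII characters are dropped silently, so 'café'
    comes out as 'caf'.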
    Dan Bishop, Dec 31, 2007
    #2

  3. abhishek Guest

    Re: Is there a string function to trim all non-ascii characters out of a string

    On Dec 31, 1:20 pm, ""
    <> wrote:
    > Hi,
    >
    > Is there a string function to trim all non-ascii characters out of a
    > string?
    > Let say I have a string in python (which is utf8 encoded), is there a
    > python function which I can convert that to a string which composed of
    > only ascii characters?
    >
    > Thank you.


    Use this function --

    def omitNonAscii(nstr):
        sstr=''
        for r in nstr:
            if ord(r)<127:  # note: also drops DEL (0x7f); ord(r)<128 keeps all of ASCII
                sstr+=r
        return sstr
    abhishek, Dec 31, 2007
    #3
  5. John Machin Guest

    Re: Is there a string function to trim all non-ascii characters out of a string

    On Dec 31, 7:20 pm, ""
    <> wrote:
    > Hi,
    >
    > Is there a string function to trim all non-ascii characters out of a
    > string?
    > Let say I have a string in python (which is utf8 encoded), is there a
    > python function which I can convert that to a string which composed of
    > only ascii characters?
    >


    OK, I'll bite: why do you want to throw data away?
    John Machin, Dec 31, 2007
    #5
  6. Paul McGuire Guest

    Re: Is there a string function to trim all non-ascii characters out of a string

    On Dec 31, 2:54 am, abhishek <> wrote:
    >
    > Use this function --
    >
    > def omitNonAscii(nstr):
    >     sstr=''
    >     for r in nstr:
    >         if ord(r)<127:
    >             sstr+=r
    >     return sstr


    <Yoda>
    Learn the ways of the generator expression you must.
    </Yoda>
    See Dan Bishop's post.

    -- Paul
    Paul McGuire, Dec 31, 2007
    #6
  7. Duncan Booth Guest

    Re: Is there a string function to trim all non-ascii characters out of a string

    "" <> wrote:

    > Hi,
    >
    > Is there a string function to trim all non-ascii characters out of a
    > string?
    > Let say I have a string in python (which is utf8 encoded), is there a
    > python function which I can convert that to a string which composed of
    > only ascii characters?
    >
    > Thank you.


    Yes, just decode it to unicode (which you should do as the first thing for
    any encoded strings) and then encode it back to ascii with error handling
    set how you want:

    >>> s = '\xc2\xa342'
    >>> s.decode('utf8').encode('ascii', 'replace')
    '?42'
    >>> s.decode('utf8').encode('ascii', 'ignore')
    '42'
    >>> s.decode('utf8').encode('ascii', 'xmlcharrefreplace')
    '&#163;42'
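    The session above is Python 2, where a plain string holds bytes. As a
    hedged sketch of the same decode-then-encode idea in Python 3 (an
    assumption, not part of the original reply), the input would be a
    bytes object and each result is again bytes. The variable names are
    illustrative only, and 'backslashreplace' is one more standard error
    handler added for comparison:

```python
raw = b'\xc2\xa342'            # UTF-8 bytes for '£42'
text = raw.decode('utf-8')     # decode to text first

replaced = text.encode('ascii', 'replace')             # '?' for each bad char
ignored = text.encode('ascii', 'ignore')               # bad chars dropped
xmlref = text.encode('ascii', 'xmlcharrefreplace')     # numeric XML entity
escaped = text.encode('ascii', 'backslashreplace')     # Python-style escape

print(replaced)  # b'?42'
print(ignored)   # b'42'
print(xmlref)    # b'&#163;42'
print(escaped)   # b'\\xa342'
```

    Only 'ignore' and 'replace' lose information; the other two handlers
    keep the code point in an ASCII-safe spelling.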
    Duncan Booth, Dec 31, 2007
    #7
  8. Re: Is there a string function to trim all non-ascii characters out of a string

    On Mon, 31 Dec 2007 01:09:09 -0800, John Machin wrote:

    > On Dec 31, 7:20 pm, ""
    > <> wrote:
    >> Hi,
    >>
    >> Is there a string function to trim all non-ascii characters out of a
    >> string?
    >> Let say I have a string in python (which is utf8 encoded), is there a
    >> python function which I can convert that to a string which composed of
    >> only ascii characters?
    >>
    >>

    > OK, I'll bite: why do you want to throw data away?


    Maybe he has to send the data to a device that can't deal with more than
    7-bit ASCII.

    Maybe he's sick of seeing text with "missing character" squares all over
    from all the characters that his fonts can't display.

    Maybe the string ends up as a file name on an operating system that
    doesn't support unicode.

    Or maybe he's just a curmudgeon who thinks life was better when there
    were only 128 characters available.


    --
    Steven
    Steven D'Aprano, Dec 31, 2007
    #8
  9. John Machin Guest

    Re: Is there a string function to trim all non-ascii characters out of a string

    On Dec 31, 7:20 pm, ""
    <> wrote:
    > Hi,
    >
    > Is there a string function to trim all non-ascii characters out of a
    > string?
    > Let say I have a string in python (which is utf8 encoded), is there a
    > python function which I can convert that to a string which composed of
    > only ascii characters?
    >


    You actually asked TWO different questions, and have got answers
    mainly to the first one. Here's a very simple answer to the second
    question, which has the advantage of no loss of information:

    repr(your_utf8_string.decode('utf8'))
    or merely
    repr(your_utf8_string)
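    In Python 3 (an assumption; the expressions above are Python 2),
    repr() no longer escapes printable non-ASCII characters, and the
    built-in ascii() plays that role. A minimal sketch, with hypothetical
    sample data, showing that the escaped form is pure ASCII and can be
    turned back into the original text:

```python
import ast

# Hypothetical sample input: the UTF-8 bytes for 'café £42', decoded to text.
text = b'caf\xc3\xa9 \xc2\xa342'.decode('utf-8')

escaped = ascii(text)                 # ASCII-only, quoted, with \x escapes
restored = ast.literal_eval(escaped)  # parse the literal back into a str

print(escaped)           # 'caf\xe9 \xa342'
print(restored == text)  # True
```

    ast.literal_eval is used here rather than eval so that only a plain
    literal is parsed when reversing the escaping.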

    Cheers,
    John
    John Machin, Dec 31, 2007
    #9
  10. Re: Is there a string function to trim all non-ascii characters out of a string

    wrote:
    >
    > Is there a string function to trim all non-ascii characters out of a
    > string?
    > Let say I have a string in python (which is utf8 encoded), is there a
    > python function which I can convert that to a string which composed of
    > only ascii characters?


    I'd recommend rethinking this approach.
    In the worst case the result is an empty string... ;-)

    Ciao, Michael.
    Michael Ströder, Dec 31, 2007
    #10
  11. Re: Is there a string function to trim all non-ascii characters out of a string

    Hallöchen!

    Paul McGuire writes:

    > On Dec 31, 2:54 am, abhishek <> wrote:
    >>
    >> Use this function --
    >>
    >> def omitNonAscii(nstr):
    >>     sstr=''
    >>     for r in nstr:
    >>         if ord(r)<127:
    >>             sstr+=r
    >>     return sstr

    >
    > <Yoda>
    > Learn the ways of the generator expression you must.
    > </Yoda>


    Stupid me! How could I miss such a lovely feature in the language?

    Tschö,
    Torsten.

    --
    Torsten Bronger, aquisgrana, europa vetus
    Jabber ID:
    (See http://ime.webhop.org for further contact info.)
    Torsten Bronger, Dec 31, 2007
    #11
