Re: Detect string has non-ASCII chars without checking each char?

Discussion in 'Python' started by Vlastimil Brom, Aug 21, 2010.

  1. 2010/8/21 <>:
    > Python 2.6: Is there a built-in way to check if a Unicode string has
    > non-ASCII chars without having to check each char in the string?
    >
    > Here's my use case: I have a section of code that makes frequent calls to
    > hasattr. The attribute name being tested is derived from incoming data which
    > at times can contain international content.
    >
    > hasattr() raises an exception when passed a Unicode attribute name. I would
    > have expected a simple True/False return value vs. an encoding error.
    >
    > UnicodeEncodeError: 'ascii' codec can't encode character u'\u012c' in
    > position 0: ordinal not in range(128)
    >
    > Is this behavior by design or could I be encoding the string I'm passing
    > hasattr() incorrectly?
    >
    > If it's by design, I'm thinking the best approach for me would be to write a
    > hasattr_enhanced() function that traps the Unicode encoding exception and
    > returns False and use this function in place of hasattr(). Any thoughts on
    > this strategy?
    >
    > Thank you,
    > Malcolm

    Hi,
    I can't comment on the use case itself, but to check whether a unicode
    string is pure ASCII you could use a simple hack (I'm not sure about
    possible drawbacks).
    It probably still examines every character internally, but in a more
    straightforward way:

    >>> a = u"abc"
    >>> b = u"abc\u012c"
    >>> a.encode("ascii", "ignore").decode("ascii") == a
    True
    >>> b.encode("ascii", "ignore").decode("ascii") == b
    False


    Others may supply more general/elegant/... approaches.

    vbr
    Vlastimil Brom, Aug 21, 2010
    #1
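
A hedged sketch of the wrapper idea from the question, alongside a strict-encode variant of the ASCII check shown above. The names `is_ascii` and `Record` are hypothetical and do not come from the thread; `hasattr_enhanced` is the name proposed in the question. It assumes the inputs are unicode strings. On Python 3, `hasattr()` accepts non-ASCII names and returns a plain bool, so the except branch should only matter on Python 2.

```python
def is_ascii(text):
    """Return True if every character of the unicode string is ASCII."""
    try:
        # Strict mode (the default) raises on the first non-ASCII character.
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def hasattr_enhanced(obj, name):
    """hasattr() that returns False instead of raising on a non-ASCII name."""
    try:
        return hasattr(obj, name)
    except UnicodeEncodeError:
        # Reached on Python 2 when the name cannot be encoded as ASCII.
        return False


class Record(object):  # hypothetical example class
    title = "example"


print(is_ascii(u"abc"))
print(is_ascii(u"abc\u012c"))
print(hasattr_enhanced(Record(), u"title"))
print(hasattr_enhanced(Record(), u"\u012c"))
```

Compared with the round-trip comparison, the strict encode stops at the first offending character and does not build a second string to compare against.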

  2. John Nagle

    On 8/21/2010 1:21 PM, Vlastimil Brom wrote:
    > 2010/8/21<>:
    >> Python 2.6: Is there a built-in way to check if a Unicode string has
    >> non-ASCII chars without having to check each char in the string?
    >>
    >> Here's my use case: I have a section of code that makes frequent calls to
    >> hasattr. The attribute name being tested is derived from incoming data which
    >> at times can contain international content.


    Bad idea. Use a dict; don't try to pretend that an object is a dict.
    This isn't JavaScript. Incidentally, inheriting from "dict" works,
    and is quite useful.

    class item(dict):
        pass  # class body omitted in the original post

    p = item()
    p['abc'] = 1

    Inheriting from dict wasn't possible in early versions of Python, which
    led to a style of abusing objects as if they were dictionaries.

    Also note that 1) spaces in attribute names can be troublesome, and
    2) duplicating the name of a function or built-in attribute will
    override it, usually leading to unwanted results.

    John Nagle
    John Nagle, Aug 22, 2010
    #2
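
A minimal sketch of the dict-based alternative suggested above, assuming the incoming names are used as dictionary keys rather than attribute names. The class name `Item` and the sample keys are illustrative only.

```python
class Item(dict):
    """A plain dict subclass; keys may be any hashable value."""
    pass


p = Item()
p[u"abc"] = 1
p[u"\u012c"] = 2  # a non-ASCII key needs no special handling

# Membership tests take the place of hasattr() and never raise on non-ASCII keys.
print(u"abc" in p)
print(u"\u012c" in p)
print(u"missing" in p)

# get() returns a default instead of raising KeyError for absent keys.
print(p.get(u"missing", 0))
```

Because the data lives in the dict rather than on the instance, a key such as u"keys" or u"update" cannot shadow a method, which sidesteps the name-clash problem mentioned in the post.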