Difference between str.isdigit() and str.isdecimal() in Python 3

Discussion in 'Python' started by Marco, May 16, 2012.

  1. Marco

    Marco Guest

    Hi all, because

    "There should be one-- and preferably only one --obvious way to do it",

    there should be a difference between the two methods in the subject, but
    I can't find it:

    >>> '123'.isdecimal(), '123'.isdigit()
    (True, True)
    >>> print('\u0660123')
    ٠123
    >>> '\u0660123'.isdigit(), '\u0660123'.isdecimal()
    (True, True)
    >>> print('\u216B')
    Ⅻ
    >>> '\u216B'.isdecimal(), '\u216B'.isdigit()
    (False, False)

    Can anyone give me some help?
    Regards, Marco
     
    Marco, May 16, 2012
    #1

  2. Marco wrote:
    > >>> '123'.isdecimal(), '123'.isdigit()
    > (True, True)
    > >>> print('\u0660123')
    > ٠123
    > >>> '\u0660123'.isdigit(), '\u0660123'.isdecimal()
    > (True, True)
    > >>> print('\u216B')
    > Ⅻ
    > >>> '\u216B'.isdecimal(), '\u216B'.isdigit()
    > (False, False)


    [chr(a) for a in range(0x20000) if chr(a).isdigit()]

    Congratulations, you found a bug! Or maybe not; it all depends on whether
    Roman numerals are considered digits. I could imagine there being a
    difference.

    :)

    Uli
     
    Ulrich Eckhardt, May 16, 2012
    #2

  3. Marco

    Marco Guest

    Re: Difference between str.isdigit() and str.isdecimal() in Python3

    On 05/16/2012 06:24 PM, Ulrich Eckhardt wrote:

    > Marco wrote:
    >> > >>> '123'.isdecimal(), '123'.isdigit()
    >> > (True, True)
    >> > >>> print('\u0660123')
    >> > ٠123
    >> > >>> '\u0660123'.isdigit(), '\u0660123'.isdecimal()
    >> > (True, True)
    >> > >>> print('\u216B')
    >> > Ⅻ
    >> > >>> '\u216B'.isdecimal(), '\u216B'.isdigit()
    >> > (False, False)


    > [chr(a) for a in range(0x20000) if chr(a).isdigit()]


    Thanks to your list comprehension I found they are not equal:

    >>> set([chr(a) for a in range(0x110000) if chr(a).isdigit()]) - \
    ... set([chr(a) for a in range(0x110000) if chr(a).isdecimal()])
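
    To see what that difference actually contains, here is a minimal sketch
    of the same comparison. It assumes Python 3 and scans every code point up
    to sys.maxunicode inclusive. The helper name digit_only_chars is made up
    for this illustration, and the exact result depends on the Unicode
    version bundled with the interpreter.

```python
import sys
import unicodedata


def digit_only_chars():
    """Return characters for which isdigit() is True but isdecimal() is False.

    Hypothetical helper, not part of the standard library.
    """
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if chr(cp).isdigit() and not chr(cp).isdecimal()
    ]


extra = digit_only_chars()
print(len(extra), "characters are digits but not decimals")

# Show a few of them with their official Unicode names.
for ch in extra[:5]:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
```

    Testing each character once with both methods avoids building two full
    sets just to subtract them.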

    Marco
     
    Marco, May 16, 2012
    #3
  4. jmfauth

    jmfauth Guest

    On 16 May, 17:48, Marco wrote:
    > Hi all, because
    >
    > "There should be one-- and preferably only one --obvious way to do it",
    >
    > there should be a difference between the two methods in the subject, but
    > I can't find it:
    >
    >  >>> '123'.isdecimal(), '123'.isdigit()
    > (True, True)
    >  >>> print('\u0660123')
    > ٠123
    >  >>> '\u0660123'.isdigit(), '\u0660123'.isdecimal()
    > (True, True)
    >  >>> print('\u216B')
    > Ⅻ
    >  >>> '\u216B'.isdecimal(), '\u216B'.isdigit()
    > (False, False)
    >
    > Can anyone give me some help?
    > Regards, Marco


    It seems to me that it is correct, and the reason lies in this:

    >>> import unicodedata as ud
    >>> ud.category('\u216b')
    'Nl'
    >>> ud.category('1')
    'Nd'
    >>>
    >>> # Note
    >>> ud.numeric('\u216b')
    12.0
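
    The same lookups can be wrapped in a small helper, sketched below on the
    assumption that only the standard unicodedata module is used. The name
    describe is hypothetical. unicodedata.decimal, unicodedata.digit and
    unicodedata.numeric are called with a default of None so that characters
    lacking a given property do not raise ValueError.

```python
import unicodedata


def describe(ch):
    """Return (category, decimal, digit, numeric) for a single character.

    Hypothetical helper; a missing property is reported as None.
    """
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


# ASCII one, Arabic-Indic zero, superscript two, Roman numeral twelve.
for ch in ("1", "\u0660", "\u00b2", "\u216b"):
    print(f"U+{ord(ch):04X}", describe(ch),
          ch.isdecimal(), ch.isdigit(), ch.isnumeric())
```

    A character with a decimal value passes isdecimal(). One with only a
    digit value, such as U+00B2, passes isdigit() but not isdecimal(). One
    with only a numeric value, such as U+216B, passes isnumeric() alone.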

    jmf
     
    jmfauth, May 16, 2012
    #4
  5. Marco wrote:

    > Hi all, because
    >
    > "There should be one-- and preferably only one --obvious way to do it",
    >
    > there should be a difference between the two methods in the subject, but
    > I can't find it:
    >
    > >>> '123'.isdecimal(), '123'.isdigit()
    > (True, True)
    > >>> print('\u0660123')
    > ٠123
    > >>> '\u0660123'.isdigit(), '\u0660123'.isdecimal()
    > (True, True)
    > >>> print('\u216B')
    > Ⅻ
    > >>> '\u216B'.isdecimal(), '\u216B'.isdigit()
    > (False, False)
    >
    > Can anyone give me some help?


    RTFM.

    $ python3 -c 'print("42".isdecimal.__doc__ + "\n");
    print("42".isdigit.__doc__)'
    S.isdecimal() -> bool

    Return True if there are only decimal characters in S,
    False otherwise.

    S.isdigit() -> bool

    Return True if all characters in S are digits
    and there is at least one character in S, False otherwise.

    --
    PointedEars

    Please do not Cc: me. / Bitte keine Kopien per E-Mail.
     
    Thomas 'PointedEars' Lahn, May 16, 2012
    #5
  6. On Wed, 16 May 2012 17:48:19 +0200, Marco wrote:

    > Hi all, because
    >
    > "There should be one-- and preferably only one --obvious way to do it",
    >
    > there should be a difference between the two methods in the subject, but
    > I can't find it:


    The Fine Manual has more detail, although I admit it isn't *entirely*
    clear what it is talking about if you're not a Unicode expert:


    http://docs.python.org/py3k/library/stdtypes.html#str.isdecimal

    str.isdecimal()
    Return true if all characters in the string are decimal characters
    and there is at least one character, false otherwise. Decimal characters
    are those from general category “Nd”. This category includes digit
    characters, and all characters that can be used to form decimal-radix
    numbers, e.g. U+0660, ARABIC-INDIC DIGIT ZERO.

    str.isdigit()
    Return true if all characters in the string are digits and there is
    at least one character, false otherwise. Digits include decimal
    characters and digits that need special handling, such as the
    compatibility superscript digits. Formally, a digit is a character that
    has the property value Numeric_Type=Digit or Numeric_Type=Decimal.


    And also:

    str.isnumeric()
    Return true if all characters in the string are numeric characters,
    and there is at least one character, false otherwise. Numeric characters
    include digit characters, and all characters that have the Unicode
    numeric value property, e.g. U+2155, VULGAR FRACTION ONE FIFTH. Formally,
    numeric characters are those with the property value Numeric_Type=Digit,
    Numeric_Type=Decimal or Numeric_Type=Numeric.


    Examples:

    py> c = '\u2155'
    py> print(c)
    ⅕
    py> c.isdecimal(), c.isdigit(), c.isnumeric()
    (False, False, True)
    py> import unicodedata
    py> unicodedata.numeric(c)
    0.2

    py> c = '\u00B2'
    py> print(c)
    ²
    py> c.isdecimal(), c.isdigit(), c.isnumeric()
    (False, True, True)
    py> unicodedata.numeric(c)
    2.0
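
    One practical consequence, offered as a sketch rather than something the
    manual excerpt above states: int() converts strings made of decimal
    characters, so isdecimal() is the check that matches it, while a string
    that only passes isdigit() can still fail to convert. The helper name
    to_int_or_none is hypothetical.

```python
def to_int_or_none(text):
    """Convert text with int() only if every character is a decimal character.

    Hypothetical helper; returns None instead of raising ValueError.
    """
    if text.isdecimal():
        return int(text)
    return None


# ASCII digits, Arabic-Indic zero + ASCII, superscript two, vulgar fraction.
for sample in ("123", "\u0660123", "\u00b2", "\u2155"):
    print(repr(sample), sample.isdecimal(), sample.isdigit(),
          to_int_or_none(sample))

# isdigit() alone is not a safe guard for int():
try:
    int("\u00b2")
except ValueError as exc:
    print("int() rejected SUPERSCRIPT TWO:", exc)
```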


    --
    Steven
     
    Steven D'Aprano, May 17, 2012
    #6
  7. Marco

    Marco Guest

    Re: Difference between str.isdigit() and str.isdecimal() in Python3

    On 05/17/2012 02:15 AM, Steven D'Aprano wrote:

    > the Fine Manual has more detail, although I admit it isn't *entirely*
    > clear what it is talking about if you're not a Unicode expert:
    >
    >
    > http://docs.python.org/py3k/library/stdtypes.html#str.isdecimal


    You are right, that is clear, thanks :)

    > Examples:
    >
    > py> c = '\u2155'
    > py> print(c)
    > ⅕
    > py> c.isdecimal(), c.isdigit(), c.isnumeric()
    > (False, False, True)
    > py> import unicodedata
    > py> unicodedata.numeric(c)
    > 0.2
    >
    > py> c = '\u00B2'
    > py> print(c)
    > ²
    > py> c.isdecimal(), c.isdigit(), c.isnumeric()
    > (False, True, True)
    > py> unicodedata.numeric(c)
    > 2.0


    Perfect explanation, thanks again, Marco
     
    Marco, May 17, 2012
    #7