Help with Code

Discussion in 'Perl Misc' started by David Williams, Oct 2, 2006.

  1. Hello all,
    I am asking for help with the following code:


    if($old=~/checksum=(\d+)/)

    I think the =~ is not equal to meaning if
    $old (which is a filehandler) is not equal to something.

    also, checksum, is that a UNIX command? checksum is not a variable
    anywhere in the code I am debugging. I did a man page on checksum
    and got nothing back.

    Lastly, what is \d+ ? The PERL book says \d is digit but I don't
    understand.

    Thanks for any help!

    David


    --
    David Williams
    Georgia Institute of Technology, Atlanta Georgia, 30332
     
    David Williams, Oct 2, 2006
    #1

  2. David Williams

    Paul Lalli Guest

    David Williams wrote:
    > I am asking for help with the following code:
    >
    >
    > if($old=~/checksum=(\d+)/)
    >
    > I think the =~ is not equal to


    No, that's not correct. =~ is the binding operator. It says to look
    for a pattern (the right argument) within the string (the left
    argument).

    The "not equal to" operator is != for numbers, and ne for strings.
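
    A minimal sketch of how these operators differ. The variable names and
    sample values are made up for illustration and are not taken from the
    program under discussion:

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample values, chosen only for illustration.
    my $count = 42;
    my $name  = 'checksum';
    my $text  = 'header checksum=1234 trailer';

    # Returns one short description per operator whose test succeeds.
    sub compare_demo {
        my @results;
        push @results, '!= numeric differs' if $count != 7;           # numeric inequality
        push @results, 'ne string differs'  if $name  ne 'md5';       # string inequality
        push @results, '=~ pattern found'   if $text  =~ /checksum=/; # pattern match
        push @results, '!~ pattern absent'  if $text  !~ /md5=/;      # negated pattern match
        return @results;
    }

    print "$_\n" for compare_demo();
    ```

    Note that !~ is the negated form of =~, which is probably the closest
    thing to "does not match" that the original question was guessing at.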

    > meaning if
    > $old (which is a filehandler) is not equal to something.


    If $old is a filehandle, then you're not going to be able to do any
    useful comparisons on it. You must read a string from the filehandle
    and do your comparison or pattern matching on that string. You
    generally do this with the < > operator, like so:

    my $line = <$old>;
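
    As a rough sketch of that read-then-match idea, assuming the file holds
    one name=value pair per line (that layout, the sample data and the sub
    name first_checksum are all assumptions, not something the original
    program is known to do):

    ```perl
    use strict;
    use warnings;

    # Hypothetical file contents; an in-memory filehandle stands in for a real file.
    my $data = "name=report.txt\nchecksum=1234\nsize=99\n";

    # Reads the handle line by line and returns the first checksum found,
    # or undef if no line matches.
    sub first_checksum {
        my ($fh) = @_;
        while ( my $line = <$fh> ) {
            return $1 if $line =~ /checksum=(\d+)/;
        }
        return;
    }

    open my $old, '<', \$data or die "Cannot open in-memory file: $!";
    my $found = first_checksum($old);
    close $old;

    print defined $found ? "checksum is $found\n" : "no checksum line\n";
    ```

    The pattern is applied to each string read from the handle, never to the
    handle itself.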

    > also, checksum, is that a UNIX command? checksum is not a variable
    > anywhere in the code I am debugging. I did a man page on checksum
    > and got nothing back.


    In the above code, 'checksum' is simply a part of the pattern that is
    being searched for in the variable $old. It is not a variable nor a
    UNIX command.

    > Lastly, what is \d+ ? The PERL book says \d is digit but I don't
    > understand.


    \d+ is a regular expression token that means "1 or more of any digit".


    The entirety of the code above says:

    "if the string contained in $old contains the string 'checksum',
    followed by an equals sign, followed by 1 or more of any digit,
    then..."

    (It also stores the digits it found in the variable $1, but only if the
    match succeeds.)
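
    A small sketch of why the success check matters before using $1. The sub
    name extract_checksum and the sample strings are hypothetical:

    ```perl
    use strict;
    use warnings;

    # Returns the captured digits, or undef when the pattern does not match.
    # Testing the match result first matters: a failed match does not reset $1,
    # so it may still hold a value left by an earlier successful match.
    sub extract_checksum {
        my ($string) = @_;
        if ( $string =~ /checksum=(\d+)/ ) {
            return $1;
        }
        return;
    }

    for my $sample ( 'id=7 checksum=98765 ok', 'checksum=abc', 'no sum here' ) {
        my $digits = extract_checksum($sample);
        printf "%-24s => %s\n", $sample, defined $digits ? $digits : '(no match)';
    }
    ```

    Only the first sample matches; the second fails because "checksum=" is
    not followed by a digit.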

    You can read more about regular expressions by typing these into your
    command window:
    perldoc perlretut
    perldoc perlre
    perldoc perlreref

    And you can find all the operators and what they mean by typing:
    perldoc perlop

    Hope this helps,
    Paul Lalli
     
    Paul Lalli, Oct 2, 2006
    #2

  3. David Williams

    David Squire Guest

    David Williams wrote:
    > Hello all,
    > I am asking for help with the following code:
    >
    >
    > if($old=~/checksum=(\d+)/)
    >
    > I think the =~ is not equal to meaning if
    > $old (which is a filehandler) is not equal to something.


    No. It is an operator that associates a string with a regular expression
    for matching.
    >
    > also, checksum, is that a UNIX command? checksum is not a variable
    > anywhere in the code I am debugging. I did a man page on checksum
    > and got nothing back.


    That's because it is part of the regular expression pattern in the match
    in that line. It's a literal pattern of characters.

    > Lastly, what is \d+ ? The PERL


    There's no such language as "PERL". It's Perl.

    > book says \d is digit but I don't
    > understand.


    You need to go back to your book, or to the documentation that comes
    with Perl, and learn the basics about regular expressions and the Perl
    operators that use them. See, for example:

    perldoc perlretut


    DS
     
    David Squire, Oct 2, 2006
    #3
  4. David Williams

    Ian Wilson Guest

    David Williams wrote:
    > Hello all,
    > I am asking for help with the following code:


    I recommend you read an introductory book such as "Learning Perl".

    You may like to review what is available at http://learn.perl.org

    Also, at a command prompt you can review the documentation included with
    perl - e.g. view the table of contents using the command `perldoc perltoc`

    > if($old=~/checksum=(\d+)/)
    >
    > I think the =~ is not equal to meaning if
    > $old (which is a filehandler) is not equal to something.


    I find it's not worth guessing blindly; reading the documentation works
    well for me. Start at `perldoc perlop` and look for "Binding Operators"

    >
    > also, checksum, is that a UNIX command? checksum is not a variable
    > anywhere in the code I am debugging. I did a man page on checksum
    > and got nothing back.


    Your "checksum=" is fixed text in a Perl regular expression. It is
    something that is being searched for in the contents of the variable
    $old. In your case $old presumably contains something like
    "lorem ipsum checksum=3489589713485 dolor sit amet" and your program
    needs to extract the checksum value.

    >
    > Lastly, what is \d+ ? The PERL book says \d is digit but I don't
    > understand.


    \d matches "0", "1" ... "8" or "9"
    + means match one or more of the preceding item (here, \d)
    (\d+) 'captures' a sequence of one or more digits, for example an
    integer such as "6" or "1238". Capturing means that perl stores the
    matched text in a special variable for you to use later.
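
    To make the difference between \d and \d+ concrete, here is a short
    sketch using the sample string from above. The helper name capture is
    made up; it relies on a match in list context returning its captures,
    which is an alternative to reading the special variable:

    ```perl
    use strict;
    use warnings;

    # Hypothetical input, modelled on the sample string above.
    my $text = 'lorem ipsum checksum=3489589713485 dolor sit amet';

    # Returns the first capture group of $pattern applied to $text,
    # or undef if the pattern does not match.
    sub capture {
        my ( $string, $pattern ) = @_;
        my ($captured) = $string =~ $pattern;   # list context yields the captures
        return $captured;
    }

    print "one digit:  ", capture( $text, qr/checksum=(\d)/ ),  "\n";   # \d alone: a single digit
    print "all digits: ", capture( $text, qr/checksum=(\d+)/ ), "\n";   # \d+: the whole run
    ```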

    >
    > Thanks for any help!


    Please follow the references before posting more questions of this sort.
     
    Ian Wilson, Oct 2, 2006
    #4
  5. David Williams

    -berlin.de Guest

    David Williams wrote in comp.lang.perl.misc:
    > Hello all,
    > I am asking for help with the following code:
    >
    >
    > if($old=~/checksum=(\d+)/)
    >
    > I think the =~ is not equal to meaning if


    No. "Not equal" can be expressed as "!=" or "ne" in Perl. The "=~"
    you have here is a binding operator. In this case it means to match
    the string $old against the pattern (/checksum=(\d+)/) on the right side.

    > $old (which is a filehandler) is not equal to something.


    It doesn't make sense for $old to be a filehandle (not filehandler).
    The pattern match that happens is only useful with a string.

    > also, checksum, is that a UNIX command? checksum is not a variable
    > anywhere in the code I am debugging. I did a man page on checksum
    > and got nothing back.


    It may or may not be a Unix command, that doesn't matter. In your
    code it is just part of the pattern to match.

    > Lastly, what is \d+ ? The PERL book says \d is digit but I don't
    > understand.


    You need to understand regular expressions, which are Perl's (and many
    other languages') way of expressing string patterns. The specific
    pattern /checksum=(\d+)/ matches any string that contains the characters
    "checksum=" immediately followed by one or more digits.

    "hihi haha checksum=123 hoho"

    would be an example.

    > Thanks for any help!


    You won't be able to debug a Perl program (even a short one) with
    ad-hoc explanations given on Usenet. You'll either have to learn
    enough Perl to understand the program or get someone else to do it.

    Anno
     
    -berlin.de, Oct 2, 2006
    #5
  6. David Williams

    Dr.Ruud Guest

    Ian Wilson schreef:

    > \d matches "0", "1" ... "8" or "9"


    Last time I checked, \d matched 268 different characters. Dear
    programmer, if you mean [0-9], then write [0-9].
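
    A minimal sketch of the difference, assuming a Unicode-aware perl. The
    sub name digit_kind and its labels are purely illustrative:

    ```perl
    use strict;
    use warnings;

    # Classifies a single character by which digit pattern it satisfies.
    sub digit_kind {
        my ($char) = @_;
        return 'ASCII digit'     if $char =~ /\A[0-9]\z/;
        return 'non-ASCII digit' if $char =~ /\A\d\z/;
        return 'not a digit';
    }

    # U+0661 is ARABIC-INDIC DIGIT ONE, written as an escape to keep the source ASCII.
    for my $char ( '7', "\x{0661}", 'x' ) {
        printf "U+%04X: %s\n", ord($char), digit_kind($char);
    }
    ```

    On Perl 5.14 and later, the /a modifier also restricts \d to [0-9]; that
    option does not exist on the 5.8 perls in use here.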

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Oct 3, 2006
    #6
  7. David Williams

    Paul Lalli Guest

    Dr.Ruud wrote:
    > Ian Wilson schreef:
    >
    > > \d matches "0", "1" ... "8" or "9"

    >
    > Last time I checked, \d matched 268 different characters. Dear
    > programmer, if you mean [0-9], then write [0-9].


    Er. Huh? I realize that \w will match not only 'a'..'z', 'A'..'Z',
    '0'..'9', and _, and that all the "international" letters such as á
    and Ñ are included as well, depending on locale. But other than the
    ten characters Ian implied, what else does \d match?

    I did take a look at `perldoc perlreref`, which in turn referred me to
    `perldoc perllocale`, but I confess that I don't get it - I'm extremely
    naïve when it comes to locales...

    Paul Lalli
     
    Paul Lalli, Oct 3, 2006
    #7
  8. David Williams

    Dr.Ruud Guest

    Paul Lalli schreef:
    > Dr.Ruud:
    >> Ian Wilson:


    >>> \d matches "0", "1" ... "8" or "9"

    >>
    >> Last time I checked, \d matched 268 different characters. Dear
    >> programmer, if you mean [0-9], then write [0-9].

    >
    > Er. Huh? I realize that \w will match not only 'a'..'z', 'A'..'Z',
    > '0'..'9', and _, and that all the "international" letters such as á
    > and Ñ are included as well, depending on locale. But other than the
    > ten characters Ian implied, what else does \d match?
    >
    > I did take a look at `perldoc perlreref`, which in turn referred me to
    > `perldoc perllocale`, but I confess that I don't get it - I'm
    > extremely naïve when it comes to locales...


    The following tries to promote Data::Alias as well:

    #!/usr/bin/perl
    # Id: unicount.pl
    # Subject: show some Unicode statistics

    use warnings ;
    use strict ;
    use Data::Alias ;

    binmode STDOUT, ':utf8' ;

    my @table =
    # +--Name------+---qRegexp--------+-C-+-L-+-U-+
    (
      [ 'xdigit'   , qr/[[:xdigit:]]/ , 0 , 0 , 0 ] ,
      [ 'ascii'    , qr/[[:ascii:]]/  , 0 , 0 , 0 ] ,
      [ '\\d'      , qr/\d/           , 0 , 0 , 0 ] ,
      [ 'digit'    , qr/[[:digit:]]/  , 0 , 0 , 0 ] ,
      [ 'IsNumber' , qr/\p{IsNumber}/ , 0 , 0 , 0 ] ,
      [ 'alpha'    , qr/[[:alpha:]]/  , 0 , 0 , 0 ] ,
      [ 'alnum'    , qr/[[:alnum:]]/  , 0 , 0 , 0 ] ,
      [ 'word'     , qr/[[:word:]]/   , 0 , 0 , 0 ] ,
      [ 'graph'    , qr/[[:graph:]]/  , 0 , 0 , 0 ] ,
      [ 'print'    , qr/[[:print:]]/  , 0 , 0 , 0 ] ,
      [ 'blank'    , qr/[[:blank:]]/  , 0 , 0 , 0 ] ,
      [ 'space'    , qr/[[:space:]]/  , 0 , 0 , 0 ] ,
      [ 'punct'    , qr/[[:punct:]]/  , 0 , 0 , 0 ] ,
      [ 'cntrl'    , qr/[[:cntrl:]]/  , 0 , 0 , 0 ] ,
    ) ;

    my @codepoints =
    (
      0x0000  .. 0xD7FF,
      0xE000  .. 0xFDCF,
      0xFDF0  .. 0xFFFD,
      0x10000 .. 0x1FFFD,
      0x20000 .. 0x2FFFD,
    # 0x30000 .. 0x3FFFD, # etc.
    ) ;

    for my $row ( @table )
    {
      alias my ($name, $qrx, $count, $lower, $upper) = @$row ;

      printf "\n%s\n", $name ;

      my $n = 0 ;

      for ( @codepoints )
      {
        local $_ = chr ; # int-2-char conversion
        $n++ ;

        if ( /$qrx/ )
        {
          $count++ ;
          $lower++ if / [[:lower:]] /x ;
          $upper++ if / [[:upper:]] /x ;
        }
      }

      my $show_lower_upper =
        ($lower || $upper)
        ? sprintf( ' (lower:%6d, upper:%6d)'
                 , $lower
                 , $upper
                 )
        : '' ;

      printf "%6d /%6d =%7.3f%%%s\n"
           , $count
           , $n
           , 100 * $count / $n
           , $show_lower_upper
    }

    print "\n" ;

    __END__


    Results (v5.8.6, i386-freebsd-64int)

    xdigit
    22 /194522 = 0.011% (lower: 6, upper: 6)

    ascii
    128 /194522 = 0.066% (lower: 26, upper: 26)

    \d
    268 /194522 = 0.138%

    digit
    268 /194522 = 0.138%

    IsNumber
    612 /194522 = 0.315%

    alpha
    91183 /194522 = 46.875% (lower: 1380, upper: 1160)

    alnum
    91451 /194522 = 47.013% (lower: 1380, upper: 1160)

    word
    91801 /194522 = 47.193% (lower: 1380, upper: 1160)

    graph
    102330 /194522 = 52.606% (lower: 1380, upper: 1160)

    print
    102349 /194522 = 52.616% (lower: 1380, upper: 1160)

    blank
    18 /194522 = 0.009%

    space
    24 /194522 = 0.012%

    punct
    374 /194522 = 0.192%

    cntrl
    6473 /194522 = 3.328%

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Oct 3, 2006
    #8
  9. On 2006-10-03 12:12, Paul Lalli wrote:
    > Dr.Ruud wrote:
    >> Ian Wilson schreef:
    >>
    >> > \d matches "0", "1" ... "8" or "9"

    >>
    >> Last time I checked, \d matched 268 different characters. Dear
    >> programmer, if you mean [0-9], then write [0-9].

    >
    > Er. Huh? I realize that \w will match not only 'a'..'z', 'A'..'Z',
    > '0'..'9', and _, and that all the "international" letters such as á
    > and Ñ are included as well, depending on locale. But other than the
    > ten characters Ian implied, what else does \d match?


    The digits in all the non-latin scripts. Try:


    #!/usr/bin/perl
    use warnings;
    use strict;
    use charnames qw();

    for my $c (0x0000 .. 0xD7FF,
               0xE000 .. 0xFDCF,
               0xFDF0 .. 0xFFFD,
               0x1_0000 .. 11_0000
              ) {
        my $s = pack 'U', $c;
        if ($s =~ /\d/) {
            printf ("%5d %5x %s %s\n", $c, $c, $s, charnames::viacode($c));
        }
    }

    On my system this prints 218 digits:

    48 30 0 DIGIT ZERO
    49 31 1 DIGIT ONE
    50 32 2 DIGIT TWO
    51 33 3 DIGIT THREE
    52 34 4 DIGIT FOUR
    53 35 5 DIGIT FIVE
    54 36 6 DIGIT SIX
    55 37 7 DIGIT SEVEN
    56 38 8 DIGIT EIGHT
    57 39 9 DIGIT NINE
    1632 660 ٠ ARABIC-INDIC DIGIT ZERO
    1633 661 ١ ARABIC-INDIC DIGIT ONE
    1634 662 ٢ ARABIC-INDIC DIGIT TWO
    1635 663 ٣ ARABIC-INDIC DIGIT THREE
    1636 664 ٤ ARABIC-INDIC DIGIT FOUR
    1637 665 ٥ ARABIC-INDIC DIGIT FIVE
    1638 666 ٦ ARABIC-INDIC DIGIT SIX
    1639 667 ٧ ARABIC-INDIC DIGIT SEVEN
    1640 668 ٨ ARABIC-INDIC DIGIT EIGHT
    1641 669 ٩ ARABIC-INDIC DIGIT NINE
    1776 6f0 ۰ EXTENDED ARABIC-INDIC DIGIT ZERO
    1777 6f1 ۱ EXTENDED ARABIC-INDIC DIGIT ONE
    1778 6f2 ۲ EXTENDED ARABIC-INDIC DIGIT TWO
    1779 6f3 ۳ EXTENDED ARABIC-INDIC DIGIT THREE
    1780 6f4 ۴ EXTENDED ARABIC-INDIC DIGIT FOUR
    1781 6f5 ۵ EXTENDED ARABIC-INDIC DIGIT FIVE
    1782 6f6 ۶ EXTENDED ARABIC-INDIC DIGIT SIX
    1783 6f7 ۷ EXTENDED ARABIC-INDIC DIGIT SEVEN
    1784 6f8 ۸ EXTENDED ARABIC-INDIC DIGIT EIGHT
    1785 6f9 ۹ EXTENDED ARABIC-INDIC DIGIT NINE
    2406 966 ० DEVANAGARI DIGIT ZERO
    2407 967 १ DEVANAGARI DIGIT ONE
    2408 968 २ DEVANAGARI DIGIT TWO
    2409 969 ३ DEVANAGARI DIGIT THREE
    2410 96a ४ DEVANAGARI DIGIT FOUR
    2411 96b ५ DEVANAGARI DIGIT FIVE
    2412 96c ६ DEVANAGARI DIGIT SIX
    2413 96d ७ DEVANAGARI DIGIT SEVEN
    2414 96e ८ DEVANAGARI DIGIT EIGHT
    2415 96f ९ DEVANAGARI DIGIT NINE
    2534 9e6 ০ BENGALI DIGIT ZERO
    2535 9e7 ১ BENGALI DIGIT ONE
    2536 9e8 ২ BENGALI DIGIT TWO
    2537 9e9 ৩ BENGALI DIGIT THREE
    2538 9ea ৪ BENGALI DIGIT FOUR
    2539 9eb ৫ BENGALI DIGIT FIVE
    2540 9ec ৬ BENGALI DIGIT SIX
    2541 9ed ৭ BENGALI DIGIT SEVEN
    2542 9ee ৮ BENGALI DIGIT EIGHT
    2543 9ef ৯ BENGALI DIGIT NINE
    2662 a66 ੦ GURMUKHI DIGIT ZERO
    2663 a67 ੧ GURMUKHI DIGIT ONE
    2664 a68 ੨ GURMUKHI DIGIT TWO
    2665 a69 ੩ GURMUKHI DIGIT THREE
    2666 a6a ੪ GURMUKHI DIGIT FOUR
    2667 a6b ੫ GURMUKHI DIGIT FIVE
    2668 a6c ੬ GURMUKHI DIGIT SIX
    2669 a6d ੭ GURMUKHI DIGIT SEVEN
    2670 a6e ੮ GURMUKHI DIGIT EIGHT
    2671 a6f ੯ GURMUKHI DIGIT NINE
    2790 ae6 ૦ GUJARATI DIGIT ZERO
    2791 ae7 ૧ GUJARATI DIGIT ONE
    2792 ae8 ૨ GUJARATI DIGIT TWO
    2793 ae9 ૩ GUJARATI DIGIT THREE
    2794 aea ૪ GUJARATI DIGIT FOUR
    2795 aeb ૫ GUJARATI DIGIT FIVE
    2796 aec ૬ GUJARATI DIGIT SIX
    2797 aed ૭ GUJARATI DIGIT SEVEN
    2798 aee ૮ GUJARATI DIGIT EIGHT
    2799 aef ૯ GUJARATI DIGIT NINE
    2918 b66 ୦ ORIYA DIGIT ZERO
    2919 b67 ୧ ORIYA DIGIT ONE
    2920 b68 ୨ ORIYA DIGIT TWO
    2921 b69 ୩ ORIYA DIGIT THREE
    2922 b6a ୪ ORIYA DIGIT FOUR
    2923 b6b ୫ ORIYA DIGIT FIVE
    2924 b6c ୬ ORIYA DIGIT SIX
    2925 b6d ୭ ORIYA DIGIT SEVEN
    2926 b6e ୮ ORIYA DIGIT EIGHT
    2927 b6f ୯ ORIYA DIGIT NINE
    3047 be7 ௧ TAMIL DIGIT ONE
    3048 be8 ௨ TAMIL DIGIT TWO
    3049 be9 ௩ TAMIL DIGIT THREE
    3050 bea ௪ TAMIL DIGIT FOUR
    3051 beb ௫ TAMIL DIGIT FIVE
    3052 bec ௬ TAMIL DIGIT SIX
    3053 bed ௭ TAMIL DIGIT SEVEN
    3054 bee ௮ TAMIL DIGIT EIGHT
    3055 bef ௯ TAMIL DIGIT NINE
    3174 c66 ౦ TELUGU DIGIT ZERO
    3175 c67 ౧ TELUGU DIGIT ONE
    3176 c68 ౨ TELUGU DIGIT TWO
    3177 c69 ౩ TELUGU DIGIT THREE
    3178 c6a ౪ TELUGU DIGIT FOUR
    3179 c6b ౫ TELUGU DIGIT FIVE
    3180 c6c ౬ TELUGU DIGIT SIX
    3181 c6d ౭ TELUGU DIGIT SEVEN
    3182 c6e ౮ TELUGU DIGIT EIGHT
    3183 c6f ౯ TELUGU DIGIT NINE
    3302 ce6 ೦ KANNADA DIGIT ZERO
    3303 ce7 ೧ KANNADA DIGIT ONE
    3304 ce8 ೨ KANNADA DIGIT TWO
    3305 ce9 ೩ KANNADA DIGIT THREE
    3306 cea ೪ KANNADA DIGIT FOUR
    3307 ceb ೫ KANNADA DIGIT FIVE
    3308 cec ೬ KANNADA DIGIT SIX
    3309 ced ೭ KANNADA DIGIT SEVEN
    3310 cee ೮ KANNADA DIGIT EIGHT
    3311 cef ೯ KANNADA DIGIT NINE
    3430 d66 ൦ MALAYALAM DIGIT ZERO
    3431 d67 ൧ MALAYALAM DIGIT ONE
    3432 d68 ൨ MALAYALAM DIGIT TWO
    3433 d69 ൩ MALAYALAM DIGIT THREE
    3434 d6a ൪ MALAYALAM DIGIT FOUR
    3435 d6b ൫ MALAYALAM DIGIT FIVE
    3436 d6c ൬ MALAYALAM DIGIT SIX
    3437 d6d ൭ MALAYALAM DIGIT SEVEN
    3438 d6e ൮ MALAYALAM DIGIT EIGHT
    3439 d6f ൯ MALAYALAM DIGIT NINE
    3664 e50 ๐ THAI DIGIT ZERO
    3665 e51 ๑ THAI DIGIT ONE
    3666 e52 ๒ THAI DIGIT TWO
    3667 e53 ๓ THAI DIGIT THREE
    3668 e54 ๔ THAI DIGIT FOUR
    3669 e55 ๕ THAI DIGIT FIVE
    3670 e56 ๖ THAI DIGIT SIX
    3671 e57 ๗ THAI DIGIT SEVEN
    3672 e58 ๘ THAI DIGIT EIGHT
    3673 e59 ๙ THAI DIGIT NINE
    3792 ed0 ໐ LAO DIGIT ZERO
    3793 ed1 ໑ LAO DIGIT ONE
    3794 ed2 ໒ LAO DIGIT TWO
    3795 ed3 ໓ LAO DIGIT THREE
    3796 ed4 ໔ LAO DIGIT FOUR
    3797 ed5 ໕ LAO DIGIT FIVE
    3798 ed6 ໖ LAO DIGIT SIX
    3799 ed7 ໗ LAO DIGIT SEVEN
    3800 ed8 ໘ LAO DIGIT EIGHT
    3801 ed9 ໙ LAO DIGIT NINE
    3872 f20 ༠ TIBETAN DIGIT ZERO
    3873 f21 ༡ TIBETAN DIGIT ONE
    3874 f22 ༢ TIBETAN DIGIT TWO
    3875 f23 ༣ TIBETAN DIGIT THREE
    3876 f24 ༤ TIBETAN DIGIT FOUR
    3877 f25 ༥ TIBETAN DIGIT FIVE
    3878 f26 ༦ TIBETAN DIGIT SIX
    3879 f27 ༧ TIBETAN DIGIT SEVEN
    3880 f28 ༨ TIBETAN DIGIT EIGHT
    3881 f29 ༩ TIBETAN DIGIT NINE
    4160 1040 ၀ MYANMAR DIGIT ZERO
    4161 1041 ၁ MYANMAR DIGIT ONE
    4162 1042 ၂ MYANMAR DIGIT TWO
    4163 1043 ၃ MYANMAR DIGIT THREE
    4164 1044 ၄ MYANMAR DIGIT FOUR
    4165 1045 ၅ MYANMAR DIGIT FIVE
    4166 1046 ၆ MYANMAR DIGIT SIX
    4167 1047 ၇ MYANMAR DIGIT SEVEN
    4168 1048 ၈ MYANMAR DIGIT EIGHT
    4169 1049 ၉ MYANMAR DIGIT NINE
    4969 1369 ፩ ETHIOPIC DIGIT ONE
    4970 136a ፪ ETHIOPIC DIGIT TWO
    4971 136b ፫ ETHIOPIC DIGIT THREE
    4972 136c ፬ ETHIOPIC DIGIT FOUR
    4973 136d ፭ ETHIOPIC DIGIT FIVE
    4974 136e ፮ ETHIOPIC DIGIT SIX
    4975 136f ፯ ETHIOPIC DIGIT SEVEN
    4976 1370 ፰ ETHIOPIC DIGIT EIGHT
    4977 1371 ፱ ETHIOPIC DIGIT NINE
    6112 17e0 ០ KHMER DIGIT ZERO
    6113 17e1 ១ KHMER DIGIT ONE
    6114 17e2 ២ KHMER DIGIT TWO
    6115 17e3 ៣ KHMER DIGIT THREE
    6116 17e4 ៤ KHMER DIGIT FOUR
    6117 17e5 ៥ KHMER DIGIT FIVE
    6118 17e6 ៦ KHMER DIGIT SIX
    6119 17e7 ៧ KHMER DIGIT SEVEN
    6120 17e8 ៨ KHMER DIGIT EIGHT
    6121 17e9 ៩ KHMER DIGIT NINE
    6160 1810 ᠐ MONGOLIAN DIGIT ZERO
    6161 1811 ᠑ MONGOLIAN DIGIT ONE
    6162 1812 ᠒ MONGOLIAN DIGIT TWO
    6163 1813 ᠓ MONGOLIAN DIGIT THREE
    6164 1814 ᠔ MONGOLIAN DIGIT FOUR
    6165 1815 ᠕ MONGOLIAN DIGIT FIVE
    6166 1816 ᠖ MONGOLIAN DIGIT SIX
    6167 1817 ᠗ MONGOLIAN DIGIT SEVEN
    6168 1818 ᠘ MONGOLIAN DIGIT EIGHT
    6169 1819 ᠙ MONGOLIAN DIGIT NINE
    6470 1946 ᥆ LIMBU DIGIT ZERO
    6471 1947 ᥇ LIMBU DIGIT ONE
    6472 1948 ᥈ LIMBU DIGIT TWO
    6473 1949 ᥉ LIMBU DIGIT THREE
    6474 194a ᥊ LIMBU DIGIT FOUR
    6475 194b ᥋ LIMBU DIGIT FIVE
    6476 194c ᥌ LIMBU DIGIT SIX
    6477 194d ᥍ LIMBU DIGIT SEVEN
    6478 194e ᥎ LIMBU DIGIT EIGHT
    6479 194f ᥏ LIMBU DIGIT NINE
    65296 ff10 ０ FULLWIDTH DIGIT ZERO
    65297 ff11 １ FULLWIDTH DIGIT ONE
    65298 ff12 ２ FULLWIDTH DIGIT TWO
    65299 ff13 ３ FULLWIDTH DIGIT THREE
    65300 ff14 ４ FULLWIDTH DIGIT FOUR
    65301 ff15 ５ FULLWIDTH DIGIT FIVE
    65302 ff16 ６ FULLWIDTH DIGIT SIX
    65303 ff17 ７ FULLWIDTH DIGIT SEVEN
    65304 ff18 ８ FULLWIDTH DIGIT EIGHT
    65305 ff19 ９ FULLWIDTH DIGIT NINE
    66720 104a0 𐒠 OSMANYA DIGIT ZERO
    66721 104a1 𐒡 OSMANYA DIGIT ONE
    66722 104a2 𐒢 OSMANYA DIGIT TWO
    66723 104a3 𐒣 OSMANYA DIGIT THREE
    66724 104a4 𐒤 OSMANYA DIGIT FOUR
    66725 104a5 𐒥 OSMANYA DIGIT FIVE
    66726 104a6 𐒦 OSMANYA DIGIT SIX
    66727 104a7 𐒧 OSMANYA DIGIT SEVEN
    66728 104a8 𐒨 OSMANYA DIGIT EIGHT
    66729 104a9 𐒩 OSMANYA DIGIT NINE

    hp


    --
    _ | Peter J. Holzer | > Wieso sollte man etwas erfinden was nicht
    |_|_) | Sysadmin WSR | > ist?
    | | | | Was sonst wäre der Sinn des Erfindens?
    __/ | http://www.hjp.at/ | -- P. Einstein u. V. Gringmuth in desd
     
    Peter J. Holzer, Oct 3, 2006
    #9
  10. David Williams

    Paul Lalli Guest

    Peter J. Holzer wrote:
    > On 2006-10-03 12:12, Paul Lalli wrote:
    > > Dr.Ruud wrote:
    > >> Ian Wilson schreef:
    > >>
    > >> > \d matches "0", "1" ... "8" or "9"
    > >>
    > >> Last time I checked, \d matched 268 different characters. Dear
    > >> programmer, if you mean [0-9], then write [0-9].

    > >
    > > Er. Huh? I realize that \w will match not only 'a'..'z', 'A'..'Z',
    > > '0'..'9', and _, and that all the "international" letters such as á
    > > and Ñ are included as well, depending on locale. But other than the
    > > ten characters Ian implied, what else does \d match?

    >
    > The digits in all the non-latin scripts. Try:


    It absolutely never even occurred to me that other characters would be
    considered digits. Like I said, I'm depressingly un-informed about
    locales and internationalization. Thanks for the information.

    Paul Lalli
     
    Paul Lalli, Oct 3, 2006
    #10
  11. David Williams

    Ian Wilson Guest

    Dr.Ruud wrote:
    > Ian Wilson schreef:
    >
    >
    >>\d matches "0", "1" ... "8" or "9"

    >
    > Last time I checked, \d matched 268 different characters.


    Both the above statements are true :)
    All 268 are characters, all are digits, few are numeric!

    > Dear programmer, if you mean [0-9], then write [0-9].


    No one has really followed up on this in the context set by the OP.

    Assuming that some program writes a decimal checksum to a file and that
    checksum contains non-ASCII numerals, would Perl arithmetic do the
    right thing?

    -----------------------8<-----------------------------
    #!/usr/bin/perl
    #
    use warnings;
    use strict;

    checksum('foo 1234 bar');
    checksum("fie \x{0101} fum");
    checksum("baz \x{0661}\x{0662}\x{0663}\x{0664} qux");

    sub checksum {
        my $text = shift;
        if ($text =~ /(\d+)/) {
            print "$1 + 1 = ", $1+1, "\n";
        } else {
            print "no numbers in '$text' \n";
        }
    }
    -----------------------8<-----------------------------
    $ perl -v
    This is perl, v5.8.0 built for i386-linux-thread-multi

    $ perl numbers.pl
    1234 + 1 = 1235
    Wide character in print at numbers.pl line 15.
    no numbers in 'fie ā fum'
    Argument "\x{661}\x{662}..." isn't numeric in addition (+) at numbers.pl
    line 13.
    Wide character in print at numbers.pl line 13.
    ١٢٣٤ + 1 = 1

    (Actually the last line looked different before I cut & pasted it, it
    ended " + 1 = 1")

    Why doesn't perl handle any unicode digit named "XXXX DIGIT NINE" as
    numerically equivalent to DIGIT NINE?
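
    In the meantime, one possible workaround is to translate the digits to
    ASCII before doing arithmetic. This is only a sketch: the sub name
    arabic_indic_to_number is made up, and it covers just the ARABIC-INDIC
    range used in the test above, not every script that \d matches.

    ```perl
    use strict;
    use warnings;

    # Maps ARABIC-INDIC digits (U+0660..U+0669) onto ASCII 0-9 so that Perl's
    # arithmetic can use them. Other scripts would each need their own range.
    sub arabic_indic_to_number {
        my ($digits) = @_;
        ( my $ascii = $digits ) =~ tr/\x{0660}-\x{0669}/0-9/;
        return $ascii + 0;   # force numeric context
    }

    my $text = "baz \x{0661}\x{0662}\x{0663}\x{0664} qux";
    if ( $text =~ /(\d+)/ ) {
        my $value = arabic_indic_to_number($1);
        print "$value + 1 = ", $value + 1, "\n";
    }
    ```

    Only the translated ASCII value is printed, so this avoids both the
    "isn't numeric" and the "Wide character" warnings seen above.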
     
    Ian Wilson, Oct 6, 2006
    #11
