trying to generate integer from string

Discussion in 'Perl Misc' started by bpatton, Apr 25, 2007.

  1. bpatton

    bpatton Guest

    I'm trying to generate a unique integer from a string. It must
    generate the same integer each time it has the same string. I'm
    trying to use unpack to do this.
    Here is a small sample. My real version now has about 2000 strings, but
    this is only going up.
    my $s1 = '-Dfull_drc=true -Dgds_file=gds.VIA4T -Dgds_layer=VIA4T -Dstore_layer=VIA4T';
    my $s2 = '-Dfull_drc=true -Dgds_file=gds.VIA1T -Dgds_layer=VIA1T -Dstore_layer=VIA1T';
    my ($u1,$u2);
    ($u1) = unpack("%J*",$s1);
    ($u2) = unpack("%J*",$s2);
    print "u1 = $u1\n";
    print "u2 = $u2\n";

    If I change the J to an A this example works OK, but hundreds of other
    ones fail.
    I'm checking these by creating a Perl hash where the $u# integer is the
    key and the string is the value.
    So I check for the existence of a key; if it exists, I compare the
    values. If they are not equal, it is an error (a collision).


    Here is my actual code (minus genRppPermutations, which is too large to
    post). $s1 and $s2 above are examples of what genRppPermutations returns.
    my @switchList;
    my ($rpp,%hash,$key,$string);
    foreach $rpp ( qw ( COMBINE GEN_STORE L2G.gatet L2G.met L2G.primary
                        L2G.umc MASTER_RPP PG_PASS1 PG_PASS2 PG_PASS2.SPLITPOL
                        PG_PASS2.met PG_PASS3) ) {
        @switchList = genRppPermutations($rpp);
        foreach $string (@switchList) {
            ($key) = unpack("%A*",$string);
            if (exists $hash{$key}) {
                unless ($hash{$key} eq $string) {
                    print "collision between strings, both generated '$key'\n";
                    print " s1 : $string\n";
                    print " s2 : $hash{$key}\n";
                }
            } else {
                $hash{$key} = $string;
            }
        }
    }
     
    bpatton, Apr 25, 2007
    #1
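
One likely reason the checksum approach in the question collides: the `%` prefix asks unpack for a checksum, which perlfunc documents as the sum of the unpacked values. A sum ignores ordering, so any two strings built from the same units in a different order come out equal. The sketch below illustrates this with `%32C*` (a 32-bit sum of byte values) rather than the `%J*` or `%A*` templates from the post; `byte_sum` is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;

# unpack's "%<bits>" prefix returns a checksum (a sum) of the unpacked
# items instead of the items themselves; with C* the items are bytes.
sub byte_sum {
    my ($string) = @_;
    return unpack('%32C*', $string);
}

my $forward  = byte_sum('abc');   # 97 + 98 + 99
my $reversed = byte_sum('cba');   # same bytes, different order

print "abc => $forward\n";
print "cba => $reversed\n";
print "distinct strings, same checksum\n" if $forward == $reversed;
```

Because switch lists like the ones in the question share most of their characters, an additive checksum has very little to tell them apart.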

  2. bpatton

    Mirco Wahab Guest

    bpatton wrote:
    > I'm trying to generate a unique integer from a string. It must
    > generate the same integer each time it has the same string. I'm
    > trying to use unpack to do this.
    > Here is a small sample. My real version now has about 2000 strings, but
    > this is only going up.
    > my $s1 = '-Dfull_drc=true -Dgds_file=gds.VIA4T -Dgds_layer=VIA4T -Dstore_layer=VIA4T';
    > my $s2 = '-Dfull_drc=true -Dgds_file=gds.VIA1T -Dgds_layer=VIA1T -Dstore_layer=VIA1T';
    > my ($u1,$u2);
    > ($u1) = unpack("%J*",$s1);
    > ($u2) = unpack("%J*",$s2);
    > print "u1 = $u1\n";
    > print "u2 = $u2\n";
    >
    > If I change the J to an A this example works OK, but hundreds of other
    > ones fail.
    > I'm checking these by creating a Perl hash where the $u# integer is the
    > key and the string is the value.
    > So I check for the existence of a key; if it exists, I compare the
    > values. If they are not equal, it is an error (a collision).


    Depending on the length of the strings, compute a 16-20 byte 'fingerprint'
    of each one, for example with the MD5 (16 bytes) or SHA-1 (20 bytes)
    algorithm. There are modules for this purpose; you could use one of the
    Digest:: modules (http://search.cpan.org/~gaas/Digest-1.15/Digest.pm),
    e.g. Digest::SHA1.

    >
    > Here is my actual code (minus genRppPermutations, which is too large to
    > post). $s1 and $s2 above are examples of what genRppPermutations returns.


    Example:
    ==>

    use strict;
    use warnings;
    # sha1_hex returns the 20-byte SHA-1 digest as a 40-character hex string
    use Digest::SHA1 qw(sha1_hex);

    my @strings = qw'
        COMBINE GEN_STORE L2G.gatet L2G.met L2G.primary L2G.umc MASTER_RPP PG_PASS1
        PG_PASS2 PG_PASS2.SPLITPOL PG_PASS2.met PG_PASS3';

    my ($rpp, %hash, $key, $string, $collision);

    foreach $rpp (@strings) {
        # genRppPermutations() is the poster's own routine, not shown in the thread
        foreach $string ( genRppPermutations($rpp) ) {
            $key = sha1_hex( $string );
            if( exists $hash{$key} ) {
                # same digest seen before: a collision only if the strings differ
                if( $hash{$key} ne $string ) {
                    print "collision " . ++$collision . " between generated '$key'\n";
                    print " s1 : $string\n";
                    print " s2 : $hash{$key}\n";
                }
            }
            else {
                $hash{$key} = $string;
                print "$key, ";
            }
        }
    }
    print "all ok!\n" unless $collision;

    <==

    Regards

    M.
     
    Mirco Wahab, Apr 25, 2007
    #2
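
The digest in the post above is a 40-character hex string, not an integer. If an actual integer is required, one option (not taken from the thread) is to keep only the first 8 hex digits of the digest and convert them with `hex`. This is a minimal sketch under that assumption: `string_to_uint32` is a hypothetical helper, and it uses the core `Digest::SHA` module, which provides a `sha1_hex` function like the one in `Digest::SHA1`.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical helper: derive a 32-bit unsigned integer from a string by
# keeping the first 8 hex digits (32 bits) of its SHA-1 digest.
sub string_to_uint32 {
    my ($string) = @_;
    return hex( substr( sha1_hex($string), 0, 8 ) );
}

my $id_first = string_to_uint32('-Dfull_drc=true -Dgds_file=gds.VIA4T');
my $id_again = string_to_uint32('-Dfull_drc=true -Dgds_file=gds.VIA4T');

print "first call : $id_first\n";
print "second call: $id_again\n";
```

Truncating to 32 bits gives up most of the digest's collision resistance, so the string comparison from the original loop is still needed.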

  3. bpatton

    Mirco Wahab Guest

    Mirco Wahab wrote:
    > Depending on the length of the strings, compute a 16-20 byte 'fingerprint'
    > of each one, for example with the MD5 (16 bytes) or SHA-1 (20 bytes)
    > algorithm. There are modules for this purpose; you could use one of the
    > Digest:: modules (http://search.cpan.org/~gaas/Digest-1.15/Digest.pm),
    > e.g. Digest::SHA1.


    If you need "normal" 4-byte integers as keys, you could look at the
    CRC32 algorithm, for which a module is also available. The following
    uses "regular" integers as keys (only the modified parts are shown):
    ==>
    ...
    use Digest::CRC qw'crc32';
    ...

    ...
        foreach $string ( genRppPermutations($rpp) ) {
            my $key = crc32($string);
            if( exists $hash{$key} ) {
                if( $hash{$key} ne $string ) {
                    print "collision " . ++$collision . " between generated '$key'\n";
                    print " s1 : $string\n s2 : $hash{$key}\n";
                }
            }
            else {
                $hash{$key} = $string;
                printf "0x%08X, ", $key;
            }
        }
    ...


    <==

    Regards

    M.
     
    Mirco Wahab, Apr 25, 2007
    #3
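
How often should 4-byte keys be expected to collide? A rough estimate comes from the birthday approximation, assuming the keys behave like uniformly random 32-bit values (an idealization, since CRC32 was not designed to guarantee that). The sketch below is not from the thread, and `collision_probability` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Approximate probability that at least two of $count distinct strings
# share a key, assuming keys are uniform over 2**$bits possible values.
sub collision_probability {
    my ($count, $bits) = @_;
    my $pairs = $count * ($count - 1) / 2;      # number of string pairs
    return 1 - exp( -$pairs / 2**$bits );       # birthday approximation
}

my $p_two_thousand = collision_probability(2_000,     32);
my $p_two_million  = collision_probability(2_000_000, 32);

printf "%9d strings: %.6f\n", 2_000,     $p_two_thousand;
printf "%9d strings: %.6f\n", 2_000_000, $p_two_million;
```

Under this assumption a collision among 2,000 strings is unlikely but not impossible, while among two million strings it is practically certain, which is why the loop still compares the stored string before trusting a key.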
  4. bpatton

    -berlin.de Guest

    bpatton <> wrote in comp.lang.perl.misc:
    > I'm trying to generate a unique integer from a string. It must
    > generate the same integer each time it has the same string. I'm
    > trying to use unpack to do this.
    > Here is a small sample. My real version now has about 2000 strings, but
    > this is only going up.


    But two thousand is nothing. Just use the strings as hash keys.

    If it were two million or more, using a digest could be meaningful.
    If so, use a module that generates a tried-and-(mathematically-)proven
    digest instead of an ad-hoc solution.

    Anno
     
    -berlin.de, Apr 25, 2007
    #4
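
The suggestion to use the strings themselves as hash keys can be sketched as follows. If an integer per string is still wanted, a counter can hand one out the first time each string is seen. This is only an illustration: `id_for_string`, `%id_for`, and `$next_id` are hypothetical names, and the integers are stable within a single run only (keeping them across runs would mean saving the table).

```perl
use strict;
use warnings;

my %id_for;          # string => integer assigned the first time it is seen
my $next_id = 0;

# Hypothetical helper: the same string always maps to the same integer,
# and two different strings can never share one.
sub id_for_string {
    my ($string) = @_;
    $id_for{$string} = $next_id++ unless exists $id_for{$string};
    return $id_for{$string};
}

my $s1 = '-Dfull_drc=true -Dgds_file=gds.VIA4T -Dgds_layer=VIA4T -Dstore_layer=VIA4T';
my $s2 = '-Dfull_drc=true -Dgds_file=gds.VIA1T -Dgds_layer=VIA1T -Dstore_layer=VIA1T';

my $u1       = id_for_string($s1);
my $u2       = id_for_string($s2);
my $u1_again = id_for_string($s1);

print "u1 = $u1\n";
print "u2 = $u2\n";
print "u1 again = $u1_again\n";
```

Since Perl's hash compares the full key, no separate collision check is needed with this approach.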