Very simple hash/regex question

Discussion in 'Perl Misc' started by Tuxedo, Aug 23, 2012.

  1. Tuxedo

    Tuxedo Guest

    What is a simple way to copy a hash into for example %hash_copy and change
    all characters in the keys of the copied hash to lowercases and all
    whitespaces to underscores?

    my %hash = ('My first subject key' => 'my first value',
    'my second Subject key' => 'my second value');

    my %hash_copy = %hash;

    ...?

    Many thanks for any suggestions.

    Tuxedo
    Tuxedo, Aug 23, 2012
    #1

  2. Tuxedo

    Klaus Guest

    On 23 August, 20:03, Tuxedo <> wrote:
    > What is a simple way to copy a hash into for example %hash_copy and change
    > all characters in the keys of the copied hash to lowercases and all
    > whitespaces to underscores?


    use 5.014;
    my %hash = ('My first subject key' => 'my first value',
    'my second Subject key' => 'my second value');

    my %hash_copy = map { lc($_ =~ s/ /_/gr) => $hash{$_} } %hash;
    Klaus, Aug 23, 2012
    #2

  3. On 2012-08-23 19:05, Klaus <> wrote:
    > On 23 August, 20:03, Tuxedo <> wrote:
    >> What is a simple way to copy a hash into for example %hash_copy and change
    >> all characters in the keys of the copied hash to lowercases and all
    >> whitespaces to underscores?

    >
    > use 5.014;
    > my %hash = ('My first subject key' => 'my first value',
    > 'my second Subject key' => 'my second value');
    >
    > my %hash_copy = map { lc($_ =~ s/ /_/gr) => $hash{$_} } %hash;


    use Data::Dumper;
    say Dumper \%hash_copy;


    $VAR1 = {
    'my_first_subject_key' => 'my first value',
    'my_second_value' => undef,
    'my_first_value' => undef,
    'my_second_subject_key' => 'my second value'
    };

    not quite what Tuxedo wanted, I think.
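The stray "value => undef" entries come from list flattening: a hash used in list context yields its keys and its values interleaved, so map runs its block on every one of them. A minimal sketch using the example data from the thread (the names @flat and @only_keys are only for illustration):

```perl
use strict;
use warnings;

my %hash = ('My first subject key'  => 'my first value',
            'my second Subject key' => 'my second value');

# A hash in list context yields keys AND values, interleaved.
my @flat = %hash;
print scalar(@flat), "\n";         # 4 items for 2 pairs

# keys %hash yields only the keys, so map would see each pair once.
my @only_keys = keys %hash;
print scalar(@only_keys), "\n";    # 2
```

With map over %hash, each value is also lowercased and underscored, then looked up in %hash as if it were a key, which is why it maps to undef.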

    hp


    --
    _ | Peter J. Holzer | Deprecating human carelessness and
    |_|_) | Sysadmin WSR | ignorance has no successful track record.
    | | | |
    __/ | http://www.hjp.at/ | -- Bill Code on
    Peter J. Holzer, Aug 23, 2012
    #3
  4. Tuxedo

    Tim McDaniel Guest

    In article <>,
    Klaus <> wrote:
    >On 23 August, 20:03, Tuxedo <> wrote:
    >> What is a simple way to copy a hash into for example %hash_copy and change
    >> all characters in the keys of the copied hash to lowercases and all
    >> whitespaces to underscores?

    >
    >use 5.014;
    >my %hash = ('My first subject key' => 'my first value',
    > 'my second Subject key' => 'my second value');
    >
    >my %hash_copy = map { lc($_ =~ s/ /_/gr) => $hash{$_} } %hash;


    Fundamental problem 0: You didn't test the proposal.

    1: %hash returns both keys and values, so hash_copy would get two
    entries for each pair in the original hash, one of them being
    "transformed value => undef". You want "map{...}keys %hash".

    2: I don't believe Perl defines an order of operations in this case,
    where one part of the expression modifies $_ and another part uses
    it. If it evaluates left to right, then $hash{$_} will try to use the
    transformed $_, so it won't find the value. (This bit me on my first
    attempt too.)

    3: The spec is "all whitespaces". I think that means \s, not ' '.

    4: The explicit "$_ =~" is unnecessary, because $_ is the default
    operand of s///.

    My correction to that:

    use 5.014;
    my %hash_copy = map { my $key = $_; lc(s/\s/_/gr) => $hash{$key} } keys %hash;

    But since you need a temp anyway (or so I think), there's no need for
    s///r, so no need to require 5.014. So this also works without 5.014:

    my %hash_copy = map { my $orig_ = $_; s/\s/_/g; lc($_) => $hash{$orig_} } keys %hash;

    On the whole, I think this looks cleaner with just a plain loop:

    my %hash_copy = ();
    while (my ($key, $value) = each %hash) {
        $key =~ s/\s/_/g;
        $hash_copy{lc($key)} = $value;
    }


    --
    Tim McDaniel,
    Tim McDaniel, Aug 23, 2012
    #4
  5. Tuxedo

    Tuxedo Guest

    Peter J. Holzer wrote:

    > On 2012-08-23 19:05, Klaus <> wrote:
    > > On 23 August, 20:03, Tuxedo <> wrote:
    > >> What is a simple way to copy a hash into for example %hash_copy and
    > >> change all characters in the keys of the copied hash to lowercases and
    > >> all whitespaces to underscores?

    > >
    > > use 5.014;
    > > my %hash = ('My first subject key' => 'my first value',
    > > 'my second Subject key' => 'my second value');
    > >
    > > my %hash_copy = map { lc($_ =~ s/ /_/gr) => $hash{$_} } %hash;

    >
    > use Data::Dumper;
    > say Dumper \%hash_copy;
    >
    >
    > $VAR1 = {
    > 'my_first_subject_key' => 'my first value',
    > 'my_second_value' => undef,
    > 'my_first_value' => undef,
    > 'my_second_subject_key' => 'my second value'
    > };
    >
    > not quite what Tuxedo wanted, I think.
    >
    > hp
    >
    >


    I'm not quite sure either to be honest....

    What I have so far is a hash, like:

    my %hash = ('My first subject key' => 'my first value',
    'my second Subject key' => 'my second value');

    First I thought I should duplicate the hash into a copy named for example
    %hash_copy. Then modify the keys, not the values.

    I would then run the script via cgi parameters, using the keys as they
    appear in the lowercase and with underscores, e.g. 'my_second_subject_key'.

    use CGI qw(param);
    my $subject = param('subject');

    The original %hash keys can contain spaces and capitals.

    I would then like to access and print the key string in its original
    whitespace and partly uppercase format, e.g. 'my second Subject key' when
    accessing the script by for example.pl?subject=my_second_subject_key

    Maybe it will be necessary to know the position of the given key in the
    %hash_copy in order to access the key string in the same numerical position
    as in the original %hash?

    The idea is simply to access and print an original key string, based on the
    modified one in the query string, without unnecessarily resorting to the
    idea of maintaining near duplicate hashes manually....

    As mentioned, I'm not quite sure which is the best way to go about this.

    Many thanks for any ideas.

    Tuxedo
    Tuxedo, Aug 23, 2012
    #5
  6. Tuxedo

    Tuxedo Guest

    Tim McDaniel wrote:

    > In article
    > <>,
    > Klaus <> wrote:
    > >On 23 August, 20:03, Tuxedo <> wrote:
    > >> What is a simple way to copy a hash into for example %hash_copy and
    > >> change all characters in the keys of the copied hash to lowercases and
    > >> all whitespaces to underscores?

    > >
    > >use 5.014;
    > >my %hash = ('My first subject key' => 'my first value',
    > > 'my second Subject key' => 'my second value');
    > >
    > >my %hash_copy = map { lc($_ =~ s/ /_/gr) => $hash{$_} } %hash;

    >
    > Fundamental problem 0: You didn't test the proposal.
    >
    > 1: %hash returns both keys and values, so hash_copy would get two
    > entries for each pair in the original hash, one of them being
    > "transformed value => undef". You want "map{...}keys %hash".
    >
    > 2: I don't believe Perl defines an order of operations in this case,
    > where one part of the expression modifies $_ and another part uses
    > it. If it evaluates left to right, then $hash{$_} will try to use the
    > transformed $_, so it won't find the value. (This bit me on my first
    > attempt too.)
    >
    > 3: The spec is "all whitespaces". I think that means \s, not ' '.
    >
    > 4: The explicit "$_ =~" is unnecessary, because $_ is the default
    > operand of s///.
    >
    > My correction to that:
    >
    > use 5.014;
    > my %hash_copy = map { my $key = $_; lc(s/\s/_/gr) => $hash{$key} } keys
    > %hash;
    >
    > But since you need a temp anyway (or so I think), there's no need for
    > s///r, so no need to require 5.014. So this also works without 5.014:
    >
    > my %hash_copy = map { my $orig_ = $_; s/\s/_/g; lc($_) => $hash{$orig_} }
    > keys %hash;
    >
    > On the whole, I think this looks cleaner with just a plain loop:
    >
    > my %hash_copy = ();
    > while (my ($key, $value) = each %hash) {
    > $key =~ s/\s/_/g;
    > $hash_copy{lc($key)} = $value;
    > }
    >
    >


    Thanks for posting these examples. I will test.

    Tuxedo
    Tuxedo, Aug 23, 2012
    #6
  7. Tuxedo

    Klaus Guest

    On 23 August, 22:22, (Tim McDaniel) wrote:
    > In article <..com>,
    >
    > Klaus  <> wrote:
    > >On 23 August, 20:03, Tuxedo <> wrote:
    > >> What is a simple way to copy a hash into for example %hash_copy and change
    > >> all characters in the keys of the copied hash to lowercases and all
    > >> whitespaces to underscores?

    >
    > >use 5.014;
    > >my %hash = ('My first subject key' => 'my first value',
    > >        'my second Subject key' => 'my second value');

    >
    > >my %hash_copy = map { lc($_ =~ s/ /_/gr) => $hash{$_} } %hash;

    >
    > Fundamental problem 0: You didn't test the proposal.


    doh and double-doh, I typed perl code on the fly and I messed up !

    You (and Peter J. Holzer) are of course right. I didn't test my code
    and I apologise.
    Klaus, Aug 23, 2012
    #7
  8. Tuxedo

    Tim McDaniel Guest

    In article <k164o4$6bm$>,
    Tuxedo <> wrote:
    >Peter J. Holzer wrote:
    >> not quite what Tuxedo wanted, I think.

    >
    >I'm not quite sure either to be honest....


    Well, it's hard to get to a desired destination when you don't know
    where you're going ...

    >What I have so far is a hash, like:
    >
    >my %hash = ('My first subject key' => 'my first value',
    > 'my second Subject key' => 'my second value');
    >
    >First I thought I should duplicate the hash into a copy named for example
    >%hash_copy. Then modify the keys, not the values.


    Well, to be precise, you can't per se modify the key of a hash
    element. You can get the effect of that by creating a new hash
    member, assigning an older value to be its value, then deleting the
    older key and its value.
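A minimal sketch of that "rename" effect, assuming the same normalization as earlier in the thread (lowercase, whitespace to underscores); the variable names are only for illustration:

```perl
use strict;
use warnings;

my %h = ('my second Subject key' => 'my second value');

# "Rename" a key: delete returns the removed value, which is then
# stored under the new key.
my $old = 'my second Subject key';
(my $new = lc $old) =~ s/\s/_/g;
$h{$new} = delete $h{$old};

print join(', ', map { "$_ => $h{$_}" } sort keys %h), "\n";
# prints: my_second_subject_key => my second value
```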

    >I would then run the script via cgi parameters, using the keys as they
    >appear in the lowercase and with underscores, e.g. 'my_second_subject_key'.
    >
    >use CGI qw(param);
    >my $subject = param('subject');
    >
    >The original %hash keys can contain spaces and capitals.
    >I would then like to access and print the key string in its original
    >whitespace and partly uppercase format, e.g. 'my second Subject key'
    >when accessing the script by for
    >example.pl?subject=my_second_subject_key


    Had it been a case of needing to go from 'my second Subject key' to
    'my_second_subject_key', you could use a hash or a sub. However, the
    other direction is one to many -- given 'my_second_subject_key', you
    can't tell what the original was. So you'd have to use a hash. In
    any event, you have to determine whether it's possible to have a
    collision like "my second Subject key" and "mY SeCond\tSUBJECT\rKeY",
    and if so, what you plan to do about it.
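One way to check for such collisions is to group the original keys by their normalized form and look for groups with more than one member. This is only a sketch: normalize_key and %originals_for are hypothetical names, and the third hash entry is added just to have a non-colliding key.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase, then turn each whitespace character into "_".
sub normalize_key {
    my ($key) = @_;
    (my $norm = lc $key) =~ s/\s/_/g;
    return $norm;
}

my %hash = (
    'my second Subject key'   => 'my second value',
    "mY SeCond\tSUBJECT\rKeY" => 'another value',
    'My first subject key'    => 'my first value',
);

# Group the original keys by their normalized form.
my %originals_for;
push @{ $originals_for{ normalize_key($_) } }, $_ for sort keys %hash;

# Any group with more than one member is a collision.
my @collisions = grep { @{ $originals_for{$_} } > 1 } sort keys %originals_for;
print "collision on '$_'\n" for @collisions;
# prints: collision on 'my_second_subject_key'
```

What to do on a collision (reject the data, keep the first key, keep a list) is a policy decision the sketch leaves open.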

    >Maybe it will be necessary to know the position of the given key in
    >the %hash_copy in order to access the key string in the same
    >numerical position as in the original %hash?


    You cannot access a hash by a numerical position. You can only go
    directly to an element via its key.

    >The idea is simply to access and print an original key string, based
    >on the modified one in the query string, without unnecessarily
    >resorting to the idea of maintaining near duplicate hashes
    >manually....


    Well, sorry, but that's what you're going to have to do.

    If you are planning to do changes and references in lots of places,
    then you might encapsulate the tracking in a module, or even a class,
    with map_original_to_normalized(), map_normalized_to_original(), add,
    delete, and such.
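A rough sketch of what such a class could look like. The package name KeyMap, the internal field names, and the value() accessor are assumptions made for illustration; the method is called remove rather than delete here only to avoid clashing with the builtin of that name.

```perl
use strict;
use warnings;

package KeyMap;    # hypothetical class name

sub new {
    my ($class) = @_;
    my $self = { orig_for => {}, value_for => {} };
    return bless $self, $class;
}

# Original spelling -> normalized spelling (lowercase, whitespace to "_").
sub map_original_to_normalized {
    my ($self, $orig) = @_;
    (my $norm = lc $orig) =~ s/\s/_/g;
    return $norm;
}

# Normalized spelling -> original spelling; needs the stored table.
sub map_normalized_to_original {
    my ($self, $norm) = @_;
    return $self->{orig_for}{$norm};
}

sub add {
    my ($self, $orig, $value) = @_;
    my $norm = $self->map_original_to_normalized($orig);
    die "collision on '$norm'\n"
        if exists $self->{orig_for}{$norm} && $self->{orig_for}{$norm} ne $orig;
    $self->{orig_for}{$norm}  = $orig;
    $self->{value_for}{$norm} = $value;
    return $norm;
}

sub value {
    my ($self, $norm) = @_;
    return $self->{value_for}{$norm};
}

sub remove {
    my ($self, $norm) = @_;
    delete $self->{orig_for}{$norm};
    return delete $self->{value_for}{$norm};
}

package main;

my $map = KeyMap->new;
$map->add('My first subject key'  => 'my first value');
$map->add('my second Subject key' => 'my second value');

print $map->map_normalized_to_original('my_second_subject_key'), "\n";
print $map->value('my_second_subject_key'), "\n";
```

The point of the wrapper is that both tables are updated in one place, so they cannot drift apart the way two hand-maintained hashes can.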

    --
    Tim McDaniel,
    Tim McDaniel, Aug 23, 2012
    #8
  9. Tuxedo

    Tuxedo Guest

    Tim McDaniel wrote:

    [...]

    > collision like "my second Subject key" and "mY SeCond\tSUBJECT\rKeY",
    > and if so, what you plan to do about it.


    Thanks for mentioning this, it could indeed happen.

    [...]

    > You cannot access a hash by a numerical position. You can only go
    > directly to an element via its key.


    I suspected as much, so I'm not sure how it can be done.

    > >The idea is simply to access and print an original key string, based
    > >on the modified one in the query string, without unnecessarily
    > >resorting to the idea of maintaining near duplicate hashes
    > >manually....

    >
    > Well, sorry, but that's what you're going to have to do.


    All I had planned was to keep normal key values as they would be written in
    a natural language, then change them to a format which contains no spaces
    or capitals, then access both the original (normalised) key strings and
    values as well as the modified ones with underscores, while only knowing
    the modified key string at the time the script runs on a CGI request.
    Maintaining two separate hashes that can be accessed by the same
    key string from the query string can of course be done instead. It just
    means duplicating some information manually, which is no big deal, although
    I suspect it can be done better. Anyway, then there would be the main hash:

    my %hash = ('my_first_subject_key' => 'my first value',
    'my_second_subject_key' => 'my second value');

    And an additional hash, providing the normalised word strings as values:

    my %hash_normalize = ('my_first_subject_key' => 'My first subject key',
    'my_second_subject_key' => 'my second Subject key');

    I can now access both normal and modified versions using one parameter as a
    key to both hashes.

    > If you are planning to do changes and references in lots of places,
    > then you might encapsulate the tracking in a module, or even a class,
    > with map_original_to_normalized(), map_normalized_to_original(), add,
    > delete, and such.


    I'm not sure what kind of tracking module you refer to? Also, I don't fully
    understand the map_original_to_normalized() and
    map_normalized_to_original() class ideas.

    Or maybe some other data structure could be better suited for my purpose.

    Many thanks,
    Tuxedo
    Tuxedo, Aug 24, 2012
    #9
  10. Tuxedo

    Tuxedo Guest

    Ben Morrow wrote:

    [...]

    > I would structure this like this:
    >
    > my %hash = (
    > my_first_subject_key => {
    > key => "My first subject key",
    > value => "my first value",
    > },
    > ...
    > );
    >
    > See perllol, perldsc and perlreftut. Of course, you might want to give
    > the subhashes more meaningful keys than 'key' and 'value'.
    >
    > Ben
    >
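The nested structure does not have to be typed in by hand. A minimal sketch, assuming the original flat hash from the start of the thread is available as %flat (a name chosen here), that builds it and then looks up a request parameter; $subject stands in for param('subject'):

```perl
use strict;
use warnings;

my %flat = ('My first subject key'  => 'my first value',
            'my second Subject key' => 'my second value');

# Build: normalized key => { key => original spelling, value => value }
my %hash;
for my $orig (keys %flat) {
    (my $norm = lc $orig) =~ s/\s/_/g;
    $hash{$norm} = { key => $orig, value => $flat{$orig} };
}

# $subject stands in for the value of param('subject').
my $subject = 'my_second_subject_key';
if (my $entry = $hash{$subject}) {
    print "$entry->{key}: $entry->{value}\n";
}
# prints: my second Subject key: my second value
```

An unknown parameter simply finds no entry, so the if block is skipped.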


    Thanks for the example. I will delve into those manuals.

    Tuxedo
    Tuxedo, Aug 24, 2012
    #10
  11. Tuxedo

    Tim McDaniel Guest

    In article <>,
    Ben Morrow <> wrote:
    >
    >Quoth :
    >> In article <>,
    >> Klaus <> wrote:
    >> >
    >> >my %hash_copy = map { lc($_ =~ s/ /_/gr) => $hash{$_} } %hash;

    ><snip>
    >>
    >> 2: I don't believe Perl defines an order of operations in this case,
    >> where one part of the expression modifies $_ and another part uses
    >> it. If it evaluates left to right, then $hash{$_} will try to use the
    >> transformed $_, so it won't find the value. (This bit me on my first
    >> attempt too.)

    >
    >I believe the order of operations is always well-defined in Perl: that
    >is, I don't know of any cases where it's been changed, nor any cases
    >where changing the order wouldn't be considered a bug.


    TL;DR: show me where Perl systematically talks about order of
    evaluation, except implicitly in some places, or talks about anything
    like "sequence points".

    I don't know of any place that Perl explicitly defines the order of
    evaluation and refers to anything like C's "sequence points", except
    where implied by things like "If the argument before the ? is true,
    the argument before the : is returned, otherwise the argument after
    the : is returned.". For example, for ++ and --, man perlop has

    Note that just as in C, Perl doesn't define when the variable is
    incremented or decremented. You just know it will be done sometime
    before or after the value is returned. This also means that
    modifying a variable twice in the same statement will lead to
    undefined behavior. Avoid statements like:

    $i = $i ++;
    print ++ $i + $i ++;

    Perl will not guarantee what the result of the above statements
    is.

    But C actually *does* define when the increment or decrement happens:
    some time after the previous sequence point and before the next one.
    C would not have sequence points in the problematic areas above, mind
    you, so that wouldn't matter in these two lines. But C does define
    that in
    ... (i++, i) ...
    the increment happens no later than the comma operator, so the value
    of "i" alone is the incremented version. (If I'm reading a draft
    standard right, if it matches the current version, and if my old
    neurons are firing right.)

    >In this case, '=>' is just sugar for ',', so the order would be
    >well-defined even in C.


    No it would not. In the map above, the tokens "=>" and "," are not
    comma operators, which in C would cause a sequence point. The only
    place I know of in C where "," represents a list of values is in
    initializations of arrays or structs or the like, and the draft I saw
    (can't find the real standard) has "The evaluations of the
    initialization list expressions are indeterminately sequenced with
    respect to one another and thus the order in which any side effects
    occur is unspecified."

    --
    Tim McDaniel,
    Tim McDaniel, Aug 24, 2012
    #11
  12. On 2012-08-23 20:22, Tim McDaniel <> wrote:
    > In article <>,
    > Klaus <> wrote:
    >>On 23 August, 20:03, Tuxedo <> wrote:
    >>> What is a simple way to copy a hash into for example %hash_copy and change
    >>> all characters in the keys of the copied hash to lowercases and all
    >>> whitespaces to underscores?

    >>
    >>use 5.014;
    >>my %hash = ('My first subject key' => 'my first value',
    >> 'my second Subject key' => 'my second value');
    >>
    >>my %hash_copy = map { lc($_ =~ s/ /_/gr) => $hash{$_} } %hash;

    >
    > Fundamental problem 0: You didn't test the proposal.
    >
    > 1: %hash returns both keys and values, so hash_copy would get two
    > entries for each pair in the original hash, one of them being
    > "transformed value => undef". You want "map{...}keys %hash".


    Yup.


    > 2: I don't believe Perl defines an order of operations in this case,
    > where one part of the expression modifies $_ and another part uses
    > it. If it evaluates left to right, then $hash{$_} will try to use the
    > transformed $_, so it won't find the value. (This bit me on my first
    > attempt too.)


    $_ isn't transformed because of the /r modifier. So the order doesn't
    matter (although I agree with Ben that it's well-defined in this case).


    > My correction to that:
    >
    > use 5.014;
    > my %hash_copy = map { my $key = $_; lc(s/\s/_/gr) => $hash{$key} } keys %hash;
    >
    > But since you need a temp anyway (or so I think), there's no need for
    > s///r, so no need to require 5.014.


    The /r avoids the need for the temporary variable. You either need a
    temporary variable (then it works with any version of perl) or /r (then
    you need 5.14), but not both.
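The two alternatives side by side, as a sketch on the example data (the names %with_r, %with_temp and $k are chosen here for illustration):

```perl
use strict;
use warnings;
use 5.014;    # only the first variant needs this, for s///r

my %hash = ('My first subject key'  => 'my first value',
            'my second Subject key' => 'my second value');

# Variant 1: s///r returns the changed copy and leaves $_ alone,
# so $hash{$_} still sees the original key. No temporary needed.
my %with_r = map { lc(s/\s/_/gr) => $hash{$_} } keys %hash;

# Variant 2: no /r, so the copy is made by hand before substituting.
my %with_temp = map { (my $k = lc $_) =~ s/\s/_/g; $k => $hash{$_} } keys %hash;

print "$_ => $with_r{$_}\n" for sort keys %with_r;
# prints:
# my_first_subject_key => my first value
# my_second_subject_key => my second value
```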

    hp

    --
    _ | Peter J. Holzer | Deprecating human carelessness and
    |_|_) | Sysadmin WSR | ignorance has no successful track record.
    | | | |
    __/ | http://www.hjp.at/ | -- Bill Code on
    Peter J. Holzer, Aug 25, 2012
    #12
  13. Tuxedo

    Tim McDaniel Guest

    In article <>,
    Peter J. Holzer <> wrote:
    >On 2012-08-23 20:22, Tim McDaniel <> wrote:
    >> 2: I don't believe Perl defines an order of operations in this case,
    >> where one part of the expression modifies $_ and another part uses
    >> it. If it evaluates left to right, then $hash{$_} will try to use the
    >> transformed $_, so it won't find the value. (This bit me on my first
    >> attempt too.)

    >
    >$_ isn't transformed because of the /r modifier. So the order doesn't
    >matter (although I agree with Ben that it's well-defined in this case).


    I am not familiar with s///r, as $ORKPLACE doesn't have the current
    Perl. Thank you for the correction.

    Were it to depend on the order of effects (if there were no explicit
    definition as, for example, && and || provide), I would intensely
    dislike it, even if experimentally it were to work.

    --
    Tim McDaniel,
    Tim McDaniel, Aug 25, 2012
    #13
  14. Ben Morrow <> writes:

    [...]

    > my $i = 2;
    > my $j;
    > $j = \++$i, $i = 10, say $$j;
    >
    > will print '10' despite the assignment to $j happening before the
    > assignment to $i.


    Eh ... considering that the value of $j is a reference to $i, what
    else should $$j print except the current value of $i?
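The aliasing can be checked directly. A small sketch, split into separate statements for readability rather than chained with commas as in the quoted code:

```perl
use strict;
use warnings;

my $i = 2;
my $j = \++$i;    # pre-increment hands back $i itself; $i is now 3
print "same variable\n" if $j == \$i;

$i = 10;
print $$j, "\n";  # $j still points at $i, so this prints 10
```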
    Rainer Weikusat, Aug 25, 2012
    #14
  15. Tuxedo wrote:
    > What is a simple way to copy a hash into for example %hash_copy and change
    > all characters in the keys of the copied hash to lowercases and all
    > whitespaces to underscores?
    >
    > my %hash = ('My first subject key' => 'my first value',
    > 'my second Subject key' => 'my second value');
    >
    > my %hash_copy = %hash;
    >
    > ..?


    $ perl -e'
    use Data::Dumper;
    my %hash = (
        q/My first subject key/ => q/my first value/,
        q/my second Subject key/ => q/my second value/,
    );
    my %hash_copy = %hash;
    print Dumper \%hash_copy;
    for my $key ( keys %hash_copy ) {
        ( my $new_key = lc $key ) =~ s/\s/_/g;
        $hash_copy{ $new_key } = delete $hash_copy{ $key };
    }
    print Dumper \%hash_copy;
    '
    $VAR1 = {
    'My first subject key' => 'my first value',
    'my second Subject key' => 'my second value'
    };
    $VAR1 = {
    'my_first_subject_key' => 'my first value',
    'my_second_subject_key' => 'my second value'
    };



    John
    --
    Any intelligent fool can make things bigger and
    more complex... It takes a touch of genius -
    and a lot of courage to move in the opposite
    direction. -- Albert Einstein
    John W. Krahn, Aug 27, 2012
    #15
