Iterating over a hash

limitz · Jul 6, 2007

Hey, I have a question about iterating over a hash.

For example, if I have a hash table %hash1:

%hash1 = (
    1 => 1,
    2 => 2,
    3 => 3,
);

And given this string 123123123123.

How would I increment the values in the hash every time it recognizes
the value?
For example:

$numbers = 123123123123;

$beginning = <STDIN>;
print "Input Ending Base Number:\n";
$end = <STDIN>;
$length = $end-$beginning+1;

$one = '1';
$onecount = 0;
for my $i (0 .. $length-1) {
    my $count = substr($numbers, $i, 2);
    if ($count =~ $one) {
        $onecount++;
    }
}

Instead of copying and pasting this code three times and changing the
variable names, how would I have it read directly into the hash and
iterate over the hash?

Thanks

Gunnar Hjalmarsson · Jul 6, 2007

limitz said:
Hey I have a question with iterating over a hash.

For example, if I have a hash table %hash1:

%hash1(
1 => 1,
2=> 2,
3=> 3,
);

And given this string 123123123123.

How would increment the values in the hash every time it recognizes
the value?

To me, the problem you describe rather seems to be about iterating over
a string of digits. Are you after something like this?

#!/usr/bin/perl
use strict;
use warnings;

my %hash = ( 1 => 0, 2 => 0, 3 => 0 );
my $digits = '123123123123';

foreach my $dig ( split //, $digits ) {
    $hash{$dig}++ if exists $hash{$dig};
}

print "$_: $hash{$_}\n" for keys %hash;

__END__

xhoster · Jul 7, 2007

limitz said:
Hey I have a question with iterating over a hash.

For example, if I have a hash table %hash1:

%hash1(
1 => 1,
2=> 2,
3=> 3,
);

And given this string 123123123123.

How would increment the values in the hash every time it recognizes
the value?
For example:

$numbers = 123123123123;

You may want to make that a string rather than a number, or at some
point you will lose precision.

$numbers = '123123123123';

$beginning = <STDIN>;
print "Input Ending Base Number:\n";
$end = <STDIN>;
$length = $end-$beginning+1;

Since we don't know what it is you type in for STDIN, this makes a poor
example. It would be better to hard-code the values you want for testing
purposes.

$one = '1';
$onecount = 0;
for my $i (0 .. $length-1) {
my $count = substr($numbers, $i, 2);
if ($count =~ $one) {
$onecount++;
}
}

Assuming this actually does what you want, you want something quite
strange: counting how many of the overlapping "digraphs" of a string
contain a given digit in at least one of the two places. ("Digraph" is
not exactly the right word; I mean every combination of two adjacent
characters taken together. There is also some possible weirdness at the
end of the string if $length is the same as or longer than length
$numbers.) In that case you could do it something like this:

my %hash1 = (
    1 => 0,
    2 => 0,
    3 => 0,
);

for my $i (0 .. $length-1) {
    foreach (keys %hash1) {
        $hash1{$_}++ if substr($numbers, $i, 2) =~ /$_/;
    }
}

If performance was important, you could use index rather than a regex,
or try switching the nesting of the inner and outer loops, or a variety
of other things.
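
A minimal sketch of the index variant mentioned above. The hard-coded values for $numbers and $length are assumptions chosen for testing, not values from the original post:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hard-coded test values (assumptions, not from the original post).
my $numbers = '123123123123';
my $length  = 6;    # number of window positions to examine

my %hash1 = ( 1 => 0, 2 => 0, 3 => 0 );

for my $i ( 0 .. $length - 1 ) {
    my $window = substr( $numbers, $i, 2 );    # two-character sliding window
    for my $key ( keys %hash1 ) {
        # index returns -1 when $key does not occur in $window
        $hash1{$key}++ if index( $window, $key ) >= 0;
    }
}

print "$_: $hash1{$_}\n" for sort keys %hash1;
```

Taking the substr once per position, outside the inner loop, also avoids recomputing the same window for every key.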

On the other hand, if the code you posted doesn't do what you want in the
first place, then it isn't clear what you want.

Xho

Tad McClellan · Jul 7, 2007

$length = $end-$beginning+1;
                         ^^

for my $i (0 .. $length-1) {
                       ^^

What is the point of adding one followed by subtracting one?

John W. Krahn · Jul 7, 2007

limitz said:
Hey I have a question with iterating over a hash.

For example, if I have a hash table %hash1:

%hash1(
1 => 1,
2=> 2,
3=> 3,
);

And given this string 123123123123.

How would increment the values in the hash every time it recognizes
the value?
For example:

$numbers = 123123123123;

$hash1{ $_ }++ for $numbers =~ /[123]/g;

John

Jürgen Exner · Jul 7, 2007

Tad said:
What is the point of adding one followed by subtracting one?

In this particular case it actually makes sense. Otherwise the variable
would have to be called "LastIndex" or "LengthMinusOne" or something like
that.
While you are technically right, from a documentation and readability point
of view, using $length and subtracting 1 when needed seems preferable.

jue

limitz · Jul 10, 2007

xhoster said:
You may want to make that a string rather than a number, or at some
point you will lose precision.

$numbers = '123123123123';

Since we don't know what it is you type in for STDIN, this makes a poor
example. It would be better to hard-code the values you want for testing
purposes.

Assuming this actually does what you want, you want something quite
strange: counting how many of the overlapping "digraphs" of a string
contain a given digit in at least one of the two places. ("Digraph" is
not exactly the right word; I mean every combination of two adjacent
characters taken together. There is also some possible weirdness at the
end of the string if $length is the same as or longer than length
$numbers.) In that case you could do it something like this:

my %hash1 = (
    1 => 0,
    2 => 0,
    3 => 0,
);

for my $i (0 .. $length-1) {
    foreach (keys %hash1) {
        $hash1{$_}++ if substr($numbers, $i, 2) =~ /$_/;
    }
}

If performance was important, you could use index rather than a regex,
or try switching the nesting of the inner and outer loops, or a variety
of other things.

On the other hand, if the code you posted doesn't do what you want in the
first place, then it isn't clear what you want.

Xho

Actually, that is what I want it to do, something like a sliding
frame. As for the values in STDIN, they are numerical values. This
way, I can specify an arbitrary sequence for it to iterate over.
Although this example seems ridiculous, it's actually a simplification
of a first-order Markov chain transition matrix.

I'm actually getting an error when running this. After modifying my code
to fit this (essentially what you suggested, but with a print statement
at the end), I get this error: "Use of uninitialized value in string
C:\Perl\bin\markovgen2.pl line 107, <STDIN> line 2."

This is my exact code:

print "\nThere are a total of 18990 bases in the entire sequence\n";
print "\nWhat is the starting base number? Keep in mind that Perl begins the tally";
print "\nwith 0. So if you wanted to start from base 30, input in 29\n";

#Here we can define where we want the sequence to begin and end
print "Input Starting Number:\n";
$beginning_of_sequence = <STDIN>;
print "Input Ending Base Number:\n";
$end_of_sequence = <STDIN>;
$length_of_sequence = $end_of_sequence-$beginning_of_sequence+1;

%dinucleotidepair = (
AT => 0,
AC => 0,
AG => 0,
AA => 0,

TA => 0,
TC => 0,
TG => 0,
TT => 0,

CA => 0,
CT => 0,
CG => 0,
CC => 0,

GA => 0,
GT => 0,
GC => 0,
GG => 0,
);

#$AC = 'AC';
#$ACcountss = 0;
#for my $i (0 .. $length_of_sequence-1) {
# my $dinuc = substr($fastasequence, $i, 2);
# if ($dinuc =~ $AC) {
# $ACcountss++;
# }
#}

for my $i (0 .. $length_of_sequence-1) {
    foreach (keys %dinucleotidepair) {
        $dinucleotidepair{$_}++ if substr($fastasequence, $i, 2) =~ /$_/;
    }
}
print "$dinucleotidepair";

Can anyone explain to me the reason the error is popping up? Thanks.

~Frank
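
One likely cause, assuming line 107 is the final print statement: in Perl, $dinucleotidepair (a scalar) and %dinucleotidepair (a hash) are separate variables, so the string interpolates a scalar that was never assigned. A minimal sketch of printing the hash contents instead, using a shortened hash with made-up counts for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened hash with made-up counts, for illustration only.
my %dinucleotidepair = ( AC => 4, AT => 2, TG => 0 );

# Iterate over the keys (sorted for stable output) and print each pair.
for my $pair ( sort keys %dinucleotidepair ) {
    print "$pair $dinucleotidepair{$pair}\n";
}
```

With "use strict" in place, the original print would be rejected at compile time because $dinucleotidepair is never declared, which makes this kind of mix-up easier to spot.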

limitz · Jul 10, 2007

Actually, I solved my own problem, but have a fresh problem. Here is
the working script:

print "\nThere are a total of 18990 bases in the entire sequence\n";
print "\nWhat is the starting base number? Keep in mind that Perl begins the tally";
print "\nwith 0. So if you wanted to start from base 30, input in 29\n";

#Here we can define where we want the sequence to begin and end
print "Input Starting Number:\n";
$beginning_of_sequence = <STDIN>;
print "Input Ending Base Number:\n";
$end_of_sequence = <STDIN>;
$length_of_sequence = $end_of_sequence-$beginning_of_sequence+1;

%dinucleotidepair = (
AT => 0,
AC => 0,
AG => 0,
AA => 0,

TA => 0,
TC => 0,
TG => 0,
TT => 0,

CA => 0,
CT => 0,
CG => 0,
CC => 0,

GA => 0,
GT => 0,
GC => 0,
GG => 0,
);

#$AC = 'AC';
#$ACcountss = 0;
#for my $i (0 .. $length_of_sequence-1) {
# my $dinuc = substr($fastasequence, $i, 2);
# if ($dinuc =~ $AC) {
# $ACcountss++;
# }
#}

for my $i (0 .. $length_of_sequence-1) {
    foreach (keys %dinucleotidepair) {
        $dinucleotidepair{$_}++ if substr($fastasequence, $i, 2) =~ /$_/;
    }
}

while ( my($keys,$values) = each(%dinucleotidepair) ) {
    print "$keys $values\n";
}

#print "The Fasta sequence segment has $ACcountss AC's in $beginning_of_sequence to $end_of_sequence",
#printf "for a relative frequency of %f\n", $ACcountss/$length_of_sequence;

My new problem is this. I have to calculate the relative frequencies
for everything. That means that if one of the keys in my hash has an
occurrence, I have to divide its count by $length_of_sequence to find
the relative frequency. Then, that relative frequency will be used in
another Perl script.

My question is this: How do I manipulate the individual elements in a
hash whose value is not zero?

Secondly, how do I save that as a variable that can be used by another
Perl script for calculation purposes?

Thanks!

~Frank
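
A minimal sketch of one way to compute relative frequencies for the nonzero entries only. The counts and the value of $length_of_sequence are made up for illustration, and %relative_frequency is a hypothetical name not used in the original script:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up counts and window length, for illustration only.
my %dinucleotidepair   = ( AC => 4, AT => 2, TG => 0 );
my $length_of_sequence = 30;

# Hypothetical second hash holding the relative frequencies.
my %relative_frequency;
for my $pair ( keys %dinucleotidepair ) {
    next if $dinucleotidepair{$pair} == 0;    # skip pairs that never occurred
    $relative_frequency{$pair} = $dinucleotidepair{$pair} / $length_of_sequence;
}

printf "%s %.3f\n", $_, $relative_frequency{$_} for sort keys %relative_frequency;
```

Keeping the frequencies in a separate hash leaves the raw counts untouched, which may be convenient if both are needed later.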

limitz · Jul 10, 2007

My new problem is this. I have to calculate the relative frequencies
for everything. That means that if one of the keys in my hash has an
occurrence, I have to divide its count by $length_of_sequence to find
the relative frequency. Then, that relative frequency will be used in
another Perl script.

My question is this:
How do I manipulate the individual elements in a hash whose value is
not zero?

For example, in $fastasequence the first 30 nucleotide bases contain
4 occurrences of the pair "AC" yet no occurrences of the base
combination "TG".

Thus, the relative frequency of AC is 4/30 = 0.133.

Secondly, how do I save 0.133 as a variable that can be carried over
and used by another Perl script for calculation purposes?

Thanks!

~Frank
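
A minimal sketch of one possible way to hand the result to another script, assuming a file on disk is acceptable: the core Storable module can write a hash to a file and read it back. The file name frequencies.sto and the hash name %relative_frequency are hypothetical, and both halves are shown in one file only to keep the example self-contained:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(store retrieve);

# Made-up frequency; the file name is a hypothetical choice.
my %relative_frequency = ( AC => 4 / 30 );
my $file = 'frequencies.sto';

# First script: write the hash to disk.
store( \%relative_frequency, $file );

# Second script: read it back as a hash reference.
my $loaded = retrieve($file);
printf "AC %.3f\n", $loaded->{AC};
```

A plain text file with one "key value" pair per line would work as well and is easier to inspect by hand.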
