Hash counting

Discussion in 'Ruby' started by Stuart Clarke, Feb 2, 2009.

  1. I am trying to load some data into a hash and then count how many times
    each value occurs. If a value occurs 5 or more times, we add some data
    to an array. Below is my code, which I will explain:

    eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    eventsbydate[26..30]
    counts = Hash.new(0)
    if eventdateID.find {|d| (counts[d] +=1) >= 5}
    @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    #{@tab}#{event.event_type} #{@tab} #{type}")
    end

    The first line uses gsub to turn a date and time value into an ID and
    pushes that ID onto an array. We then process the array, and if an
    entry (a date/time ID) occurs 5 or more times we add some data to
    another array.

    My testing with this code is not picking up on any such occurrences,
    which I know exist; see below:

    MonFeb022009
    MonFeb022009
    MonFeb022009
    MonFeb022009
    MonFeb022009
    MonFeb022009
    MonFeb022009
    MonFeb022009
    MonFeb022009

    Does anyone have any ideas why my code is not working?

    I do not get errors, it just does not return any data.

    Thanks in advance
    --
    Posted via http://www.ruby-forum.com/.
     
    Stuart Clarke, Feb 2, 2009
    #1

  2. STDERR.puts is your friend.

    > eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    > eventsbydate[26..30]


    STDERR.puts "A: #{eventdateID.inspect}"

    > counts = Hash.new(0)
    > if eventdateID.find {|d| (counts[d] +=1) >= 5}
    > @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    > #{@tab}#{event.event_type} #{@tab} #{type}")


    STDERR.puts "B: #{@alerts.inspect}"

    > end


    Then you can see if the data is what you expect before you go into the
    loop.

    Note that 'find' will abort after one successful match. Is that what you
    want?
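
    To make that concrete, here is a minimal sketch with made-up IDs (the
    array literal is an assumption, not data from the actual log). Because
    find returns as soon as the block is true, counting stops at the fifth
    match:

```ruby
# Made-up sample: nine identical IDs followed by one different ID.
ids = ["MonFeb022009"] * 9 + ["TueAug052008"]

counts = Hash.new(0)

# find returns the first element for which the block is truthy,
# so iteration ends as soon as some count reaches 5.
first_hit = ids.find { |d| (counts[d] += 1) >= 5 }

p first_hit  # the ID whose count reached 5 first
p counts     # only the elements visited before find stopped are counted
```

    So find can answer "is there any ID with at least 5 hits?", but the
    hash is not a complete tally afterwards.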
     
    Brian Candler, Feb 2, 2009
    #2

  3. Stuart,

    I believe this will get you closer to what you want..

    [1,2,2,3,3,3].inject({}) do |hash, val|
    hash[val] ||= 0
    hash[val] +=1
    hash
    end

    => {1=>1,2=>2,3=>3}
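
    For comparison, the same tally can probably be written a little more
    compactly with a default value of 0 and each_with_object. This is only
    a sketch on the same sample array; the variable names are illustrative:

```ruby
values = [1, 2, 2, 3, 3, 3]

# Hash.new(0) makes missing keys read as 0, so no "||= 0" step is needed.
# each_with_object passes the same hash to every iteration and returns it.
tally = values.each_with_object(Hash.new(0)) { |val, hash| hash[val] += 1 }

p tally
```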

    hth

    ilan


    Stuart Clarke wrote:
    > I am trying to load some data into a hash and then count how many times
    > it occurs in the hash, if it occurs more than 5 times then we are adding
    > some data to an array. Below is my code which I will explain
    >
    > eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    > eventsbydate[26..30]
    > counts = Hash.new(0)
    > if eventdateID.find {|d| (counts[d] +=1) >= 5}
    > @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    > #{@tab}#{event.event_type} #{@tab} #{type}")
    > end
    >

     
    Ilan Berci, Feb 2, 2009
    #3
  4. On 02.02.2009 21:43, Stuart Clarke wrote:
    > I am trying to load some data into a hash and then count how many times
    > it occurs in the hash, if it occurs more than 5 times then we are adding
    > some data to an array. Below is my code which I will explain
    >
    > eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    > eventsbydate[26..30]
    > counts = Hash.new(0)
    > if eventdateID.find {|d| (counts[d] +=1) >= 5}
    > @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    > #{@tab}#{event.event_type} #{@tab} #{type}")
    > end


    You probably rather want

    eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    eventsbydate[26..30]
    counts = Hash.new(0)
    eventdateID.each {|d| counts[d] +=1}
    counts.each do |d,cnt|
    @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    #{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
    end

    Cheers

    robert
     
    Robert Klemme, Feb 2, 2009
    #4
  5. Thanks Robert.

    This is closer to what I need. However, I am still getting no result;
    everything works until we get to this section:

    > counts = Hash.new(0)
    > eventdateID.each {|d| counts[d] +=1}
    > counts.each do |d,cnt|
    > @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    > #{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
    > end


    Does it make any difference that the data being read into the
    eventdateID is alphanumeric eg:

    MonFeb022009
    MonFeb022009
    MonFeb022009

    Many thanks.


    Robert Klemme wrote:
    > On 02.02.2009 21:43, Stuart Clarke wrote:
    >> end

    > You probably rather want
    >
    > eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    > eventsbydate[26..30]
    > counts = Hash.new(0)
    > eventdateID.each {|d| counts[d] +=1}
    > counts.each do |d,cnt|
    > @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    > #{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
    > end
    >
    > Cheers
    >
    > robert


     
    Stuart Clarke, Feb 2, 2009
    #5
  6. I have worked out the problem but I am a little unsure how to solve it.

    We have counts which holds all of the event IDs, however |d, cnt| is
    not counting the number of matching ID numbers and it just assigns each
    ID the number 1.

    So given this example, we would expect cnt to find the ID of ?? as
    occurring more than 5 times and ignore the rest:

    MonFeb022009
    MonFeb022009
    MonFeb022009
    MonFeb022009
    MonFeb022009
    MonFeb022009
    MonFeb022009
    TueAug052008
    TueAug052008
    WedAug062008

    However, instead cnt is just placing the number 1 for each ID, for example:

    1
    1
    1
    1
    1
    1
    1
    1
    1
    1

    Can anyone help me with a fix? Many thanks

    Stuart Clarke wrote:
    > Thanks Robert.
    >
    > This is more to what I need. However I am still getting no result,
    > everything works until we get to this section:
    >
    >> counts = Hash.new(0)
    >> eventdateID.each {|d| counts[d] +=1}
    >> counts.each do |d,cnt|
    >> @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    >> #{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
    >> end

    >
    > Does it make any difference that the data being read into the
    > eventdateID is alphanumeric eg:
    >
    > MonFeb022009
    > MonFeb022009
    > MonFeb022009
    >
    > Many thanks.
    >
    >
    > Robert Klemme wrote:
    >> On 02.02.2009 21:43, Stuart Clarke wrote:
    >>> end

    >> You probably rather want
    >>
    >> eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    >> eventsbydate[26..30]
    >> counts = Hash.new(0)
    >> eventdateID.each {|d| counts[d] +=1}
    >> counts.each do |d,cnt|
    >> @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    >> #{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
    >> end
    >>
    >> Cheers
    >>
    >> robert


     
    Stuart Clarke, Feb 2, 2009
    #6
  8. 2009/2/3 Stuart Clarke <>:
    > I have worked out the problem but I am a little unsure how to solve it.
    >
    > We have counts which holds all of the event ID's, however |d, cnt| is
    > not counting the number of matching ID numbers and it just assigns each
    > ID the number 1.


    What does that mean? What's in the Hash?

    > So given this example, we would expect cnt to find the ID of ?? as
    > occurring more than 5 times and ignore the rest:
    >
    > MonFeb022009
    > MonFeb022009
    > MonFeb022009
    > MonFeb022009
    > MonFeb022009
    > MonFeb022009
    > MonFeb022009
    > TueAug052008
    > TueAug052008
    > WedAug062008
    >
    > However, instead cnt is just placing the number 1 for each ID, for example:
    >
    > 1
    > 1
    > 1
    > 1
    > 1
    > 1
    > 1
    > 1
    > 1
    > 1
    >
    > Can anyone help me with a fix? Many thanks


    Frankly, you lost me there. Please do this:

    require 'pp'

    File.open('/tmp/log', 'w') {|io| io.write(counts.pretty_inspect)}

    And look at the output and / or post it here.

    robert

    --
    remember.guy do |as, often| as.you_can - without end
     
    Robert Klemme, Feb 3, 2009
    #8
  9. Thanks for replying and sorry for the confusion.

    My hash (counts) contains date and time IDs like so: TueAug052008

    When I do a puts on counts I get a list of these as per their date and
    time values, which is what I want. However, counting to see if there are
    more than 5 occurrences of one of the ID values fails and doesn't find
    anything in my data set.

    I have done as you asked and the output is as follows:

    {"WedAug062008"=>1}

    This suggests there is a problem. Just for your information doing an
    output on counts (the hash) gives this:

    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    WedAug062008
    WedAug062008

    Thanks for your help


    Robert Klemme wrote:
    > 2009/2/3 Stuart Clarke <>:
    >> I have worked out the problem but I am a little unsure how to solve it.
    >>
    >> We have counts which holds all of the event ID's, however |d, cnt| is
    >> not counting the number of matching ID numbers and it just assigns each
    >> ID the number 1.

    >
    > What does that mean? What's in the Hash?
    >
    >> TueAug052008
    >> 1
    >> 1
    >> 1
    >> 1
    >> 1
    >>
    >> Can anyone help me with a fix? Many thanks

    >
    > Frankly, you lost me there. Please do this:
    >
    > require 'pp'
    >
    > File.open('/tmp/log', 'w') {|io| io.write(counts.pretty_inspect)}
    >
    > And look at the output and / or post it here.
    >
    > robert


     
    Stuart Clarke, Feb 3, 2009
    #9
  10. 2009/2/3 Stuart Clarke <>:
    > Thanks for replying and sorry for the confusion.
    >
    > My hash (counts) contains date and time ID's like so TueAug052008


    Obviously not as the output below demonstrates that there is just a
    single entry in the Hash.

    > When I do a puts on counts I get a list of these as per their date and
    > time values, which is what I want. However, counting to see if there are
    > more than 5 occurrences of one of the ID values fails and doesn't find
    > anything in my data set.
    >
    > I have done as you asked and the output is as follows:
    >
    > {"WedAug062008"=>1}


    Looks like there is a lot missing.

    > This suggests there is a problem. Just for your information doing an
    > output on counts (the hash) gives this:


    What does "doing an output" mean? Please be more specific (e.g. by
    posting complete code, ideally a test case that someone else can
    execute), otherwise nobody can help you.

    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > WedAug062008
    > WedAug062008


    Cheers

    robert

     
    Robert Klemme, Feb 3, 2009
    #10
  11. OK, I will get straight to the code causing the problem. First off, you
    need to know that 'eventdateID' is an array full of values taken from log
    files. A sample of the values in this array:

    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > WedAug062008
    > WedAug062008



    Then I have the following code:

    >> counts = Hash.new(0)
    >> eventdateID.each {|d| counts[d] +=1}
    >> counts.each do |d,cnt|
    >> @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    >> #{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
    >> end



    The @alerts.push data is again specific to the logs I am parsing.
    Basically each record in the log is given an ID number based on the time
    and date values which goes into eventdateID. The purpose of the code
    above is to check if any of the ID numbers occur more than 5 times in
    eventdateID.


    counts = Hash.new(0) - empty hash called counts
    eventdateID.each {|d| counts[d] +=1} - process each ID value in
    eventdateID and load into the hash counts
    counts.each do |d,cnt| - process counts and see how many of each ID
    value exist
    @alerts.push ............. if cnt >=5 - If there are more than 5 of an
    ID push some of the log data to an array which matches the eventdateID


    I have done some checking

    eventdateID.each {|d| counts[d] +=1}
    @alerts.push

    gives

    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    WedAug062008
    WedAug062008

    At this stage we are on the right lines: we have the hash counts with
    some date IDs in it.

    Another test was:

    eventdateID.each {|d| counts[d] +=1}
    counts.each do |d,cnt|
    @alerts.push cnt

    This gives

    1
    1
    1
    1
    1
    1
    1

    This is where the problem is. I want it to identify that

    MonFeb0220091 occurs 6 times in the counts hash
    WedAug062008 occurs twice in the counts hash

    As a result of this I am expecting my code to output the log data to the
    @alerts array based on the eventdateID MonFeb0220091 as it occurs more
    than 5 times. Below is my code again to summarise. The restriction is
    that you do not have the logs, but I can assure you the data in
    eventdateID are values like this: MonFeb0220091.

    Code block:

    eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    eventsbydate[26..30]
    counts = Hash.new(0)
    eventdateID.each {|d| counts[d] +=1}
    counts.each do |d,cnt|
    @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    #{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
    end


    Thanks again.




    Robert Klemme wrote:
    > 2009/2/3 Stuart Clarke <>:
    >> Thanks for replying and sorry for the confusion.
    >>
    >> My hash (counts) contains date and time ID's like so TueAug052008

    >
    > Obviously not as the output below demonstrates that there is just a
    > single entry in the Hash.
    >
    >> When I do a puts on counts I get a list of these as per there date and
    >> time values which is what I want. However counting to see if there is
    >> more than 5 occurances of one the ID values fails and doesn't find
    >> anything in my data set.
    >>
    >> I have done as you asked and the output is as follows:
    >>
    >> {"WedAug062008"=>1}

    >
    > Looks like there is a lot missing.
    >
    >> This suggests there is a problem. Just for your information doing an
    >> output on counts (the hash) gives this:

    >
    > What does "doing an output" mean? Please be more specific (e.g. by
    > posting complete code, ideally a test case that someone else can
    > execute), otherwise nobody can help you.
    >
    >> MonFeb0220091
    >> MonFeb0220091
    >> MonFeb0220091
    >> MonFeb0220091
    >> MonFeb0220091
    >> MonFeb0220091
    >> MonFeb0220091
    >> MonFeb0220091
    >> WedAug062008
    >> WedAug062008

    >
    > Cheers
    >
    > robert


     
    Stuart Clarke, Feb 3, 2009
    #11
  12. On Tue, Feb 3, 2009 at 6:16 PM, Stuart Clarke
    <> wrote:

    > OK, I will get straight to the code causing the problem. First off, you
    > need to know that 'eventdateID' is an array full of values taken from log
    > files. A sample of the values in this array:
    >
    >> MonFeb0220091
    >> MonFeb0220091
    >> MonFeb0220091
    >> MonFeb0220091
    >> MonFeb0220091
    >> MonFeb0220091
    >> WedAug062008
    >> WedAug062008

    >
    >
    > Then I have the following code:
    >
    >>> counts = Hash.new(0)
    >>> eventdateID.each {|d| counts[d] +=1}
    >>> counts.each do |d,cnt|
    >>> @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    >>> #{@tab}#{event.event_type} #{@tab} #{type}") if cnt >= 5
    >>> end


    > Another test was:
    >
    > eventdateID.each {|d| counts[d] +=1}
    > counts.each do |d,cnt|
    > @alerts.push cnt
    >
    > This gives
    >
    > 1
    > 1
    > 1
    > 1
    > 1
    > 1
    > 1
    >


    Sorry Stuart, can you show the exact code that produces that output
    (including the puts that you are using to print those values)? Cause
    this works for me as it is:


    irb(main):009:0> eventDateID = %w{MonFeb0220091 MonFeb0220091
    MonFeb0220091 MonFeb0220091 MonFeb0220091 MonFeb0220091 WedAug062008
    WedAug062008}
    => ["MonFeb0220091", "MonFeb0220091", "MonFeb0220091",
    "MonFeb0220091", "MonFeb0220091", "MonFeb0220091", "WedAug062008",
    "WedAug062008"]
    irb(main):010:0> counts = Hash.new(0)
    => {}
    irb(main):011:0> eventDateID.each {|d| counts[d] += 1}
    => ["MonFeb0220091", "MonFeb0220091", "MonFeb0220091",
    "MonFeb0220091", "MonFeb0220091", "MonFeb0220091", "WedAug062008",
    "WedAug062008"]
    irb(main):012:0> counts
    => {"MonFeb0220091"=>6, "WedAug062008"=>2}
    irb(main):013:0> @alerts = []
    => []
    irb(main):014:0> counts.each do |id, cnt|
    irb(main):015:1* @alerts.push(id) if cnt >= 5
    irb(main):016:1> end
    => {"MonFeb0220091"=>6, "WedAug062008"=>2}
    irb(main):017:0> @alerts
    => ["MonFeb0220091"]


    If each element in the array eventDateID is stored in the hash as a
    different key (which is what seems to be happening), maybe what is
    inside the array are not strings, but another class that has a
    different implementation of eql?.
    Can you inspect the eventDateID array to check that?
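
    A minimal sketch of what that hypothesis would look like in practice.
    Stamp is a hypothetical wrapper class invented for this illustration
    (nothing in the thread says the log library returns such objects). It
    does not override eql? or hash, so equal-looking instances become
    separate hash keys:

```ruby
# Hypothetical wrapper class; it inherits identity-based eql?/hash
# from Object, so two instances with the same text are different keys.
class Stamp
  attr_reader :text

  def initialize(text)
    @text = text
  end
end

raw = Array.new(3) { Stamp.new("MonFeb0220091") }

# Counting by object: every instance is its own key, so each count is 1.
by_object = Hash.new(0)
raw.each { |s| by_object[s] += 1 }

# Counting by the underlying string: equal strings share one key.
by_string = Hash.new(0)
raw.each { |s| by_string[s.text] += 1 }

p by_object.values
p by_string
```

    Calling eventdateID.map { |e| e.class } (or just inspect) on the real
    array would show whether this applies.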

    Jesus.
     
    Jesús Gabriel y Galán, Feb 3, 2009
    #12
  13. Thanks for getting back to me.

    I have done something similar to you in Fxri and got those results
    earlier. It seems you may be correct and eventdateID and id, cnt do not
    like each other so much. After doing an inspect on the eventdateID
    array I only get the following:

    ["WedAug062008"]

    This is strange as it seems to be missing all the other data. For your
    information, in my actual code I do @alerts.push(counts)

    counts = Hash.new(0)
    eventdateID.each {|d| counts[d] += 1}
    @alerts.push(counts)
    counts.each do |id,cnt|

    and get what is expected:

    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    MonFeb0220091
    WedAug062008
    WedAug062008

    It's the next step, counts.each do |id,cnt|, which is the problem.

    > If each element in the array eventDateID is stored in the hash as a
    > different key (which is what seems to be happening), maybe what is
    > inside the array are not strings, but another class that has a
    > different implementation of eql?.


    Not sure what you mean by this. This is how I make eventdateID, just
    regular expressions on a string from a struct:

    eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    eventsbydate[26..30]

    Many thanks

    Jesús Gabriel y Galán wrote:
    > On Tue, Feb 3, 2009 at 6:16 PM, Stuart Clarke
    > <> wrote:
    >
    >>> WedAug062008
    >>>> end

    >> 1
    >> 1
    >> 1
    >> 1
    >> 1
    >>

    >
    > Sorry Stuart, can you show the exact code that produces that output
    > (including the puts that you are using to print those values)? Cause
    > this works for me as it is:
    >
    >
    > irb(main):009:0> eventDateID = %w{MonFeb0220091 MonFeb0220091
    > MonFeb0220091 MonFeb0220091 MonFeb0220091 MonFeb0220091 WedAug062008
    > WedAug062008}
    > => ["MonFeb0220091", "MonFeb0220091", "MonFeb0220091",
    > "MonFeb0220091", "MonFeb0220091", "MonFeb0220091", "WedAug062008",
    > "WedAug062008"]
    > irb(main):010:0> counts = Hash.new(0)
    > => {}
    > irb(main):011:0> eventDateID.each {|d| counts[d] += 1}
    > => ["MonFeb0220091", "MonFeb0220091", "MonFeb0220091",
    > "MonFeb0220091", "MonFeb0220091", "MonFeb0220091", "WedAug062008",
    > "WedAug062008"]
    > irb(main):012:0> counts
    > => {"MonFeb0220091"=>6, "WedAug062008"=>2}
    > irb(main):013:0> @alerts = []
    > => []
    > irb(main):014:0> counts.each do |id, cnt|
    > irb(main):015:1* @alerts.push(id) if cnt >= 5
    > irb(main):016:1> end
    > => {"MonFeb0220091"=>6, "WedAug062008"=>2}
    > irb(main):017:0> @alerts
    > => ["MonFeb0220091"]
    >
    >
    > If each element in the array eventDateID is stored in the hash as a
    > different key (which is what seems to be happening), maybe what is
    > inside the array are not strings, but another class that has a
    > different implementation of eql?.
    > Can you inspect the eventDateID array to check that?
    >
    > Jesus.


     
    Stuart Clarke, Feb 3, 2009
    #13
  14. On Tue, Feb 3, 2009 at 7:34 PM, Stuart Clarke
    <> wrote:
    > Thanks for getting back to me.


    > and get what is expected:
    >
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > MonFeb0220091
    > WedAug062008
    > WedAug062008
    >
    > Its the next step counts.each do |id,cnt| which is the problem.


    Sorry, but can you post a complete executable piece of code we can use
    to reproduce the problem?
    You have this:

    eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] + eventsbydate[26..30]

    but what is eventdateID? Maybe you have an earlier line of code like
    eventdateID = [].
    I'd like to see the complete picture. Also, what is eventsbydate?
    By the way, now I'm realizing that eventsbydate might be a string, so
    how can eventdateID contain more than 1 entry at all?
    If that's true, then

    eventsbydate.gsub(/\s/, '')[0..7] + eventsbydate[26..30]

    is also a string. So you are pushing a single string into eventdateID,
    so when you later iterate you only get one iteration. Perhaps you have
    a loop around the piece of code you showed? If that's the case, then
    it makes sense that you never get more than 1 count per entry, because
    you are creating the hash every time. So, I think it would be easier
    if you pasted the complete program.
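
    A small sketch of the difference, using made-up stamps in place of the
    real log records (the stamps array and the variable names are
    assumptions for illustration only):

```ruby
# Made-up stand-in for the per-record date IDs.
stamps = %w[MonFeb0220091 MonFeb0220091 MonFeb0220091 WedAug062008]

# Suspected shape of the bug: the array and the hash are rebuilt for
# every record, so each hash only ever sees one ID.
inside = []
stamps.each do |stamp|
  ids = []
  ids.push(stamp)
  counts = Hash.new(0)
  ids.each { |d| counts[d] += 1 }
  inside << counts
end

# Fix: create the hash once, outside the loop, and let it accumulate.
counts = Hash.new(0)
stamps.each { |stamp| counts[stamp] += 1 }

p inside.last  # the hash from the final iteration only
p counts       # the tally across all records
```

    The last hash built inside the loop holds a single entry with a count
    of 1, which would match the {"WedAug062008"=>1} output posted earlier.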


    >> If each element in the array eventDateID is stored in the hash as a
    >> different key (which is what seems to be happening), maybe what is
    >> inside the array are not strings, but another class that has a
    >> different implementation of eql?.

    >
    > Not sure what you mean by this.


    It was another hypothesis, but I think you can forget about it, since
    I'm pretty sure now that with the piece of code you showed you are
    only ever pushing one string into eventdateID.

    Jesus.
     
    Jesús Gabriel y Galán, Feb 3, 2009
    #14
  15. Thanks for your response. It makes a lot more sense and you are on the
    right lines I think. There is other code around this but it does not
    bear much relevance:

    def scanEVTWithSource(file, source)
    @alerts = []
    @evtLogArray = []
    begin
    #read the contents of the event logs files
    evtLog = EventLog.open_backup(file, source)

    #put data into an array
    @evtLogArray = evtLog.read.sort { |a, b| (a.event_id <=>
    b.event_id).nonzero? || (a.time_written <=> b.time_written)}

    #event log data collected
    evtLog.close

    if evtLogArray.length == 0
    return
    end

    #failed logons where more than 10 have occurred in a day
    if event.event_id == 529
    eventdateID = []
    #assign all time written values to the eventsbydate array
    eventsbydate = "#{event.time_written}"
    eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    eventsbydate[26..30]
    counts = Hash.new(0)
    eventdateID.each {|d| counts[d] += 1}
    counts.each do |id,cnt|
    @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    #{@tab} #{event.event_type} #{@tab} #{type}") if cnt >= 5
    end
    end
    end


    I will explain this.

    The scanEVTWithSource(file, source) - takes data and arguments from two
    other methods which assist with the reading of the log files.

    @evtLogArray - an array full of log data which is inspected in structs

    The rest we know about, but for example event.event_id is a struct to
    inspect the ID field.

    Hope this helps and thank you very much for your help. You are right:
    eventsbydate is a string based on data from the event.time_written
    struct, using gsub etc. to chomp it down into the values you have
    already seen.

    Regards

    Jesús Gabriel y Galán wrote:
    > On Tue, Feb 3, 2009 at 7:34 PM, Stuart Clarke
    > <> wrote:
    >> Thanks for getting back to me.

    >
    >>
    >> Its the next step counts.each do |id,cnt| which is the problem.

    >
    > Sorry, but can you post a complete executable piece of code we can use
    > to reproduce the problem?
    > You have this:
    >
    > eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    > eventsbydate[26..30]
    >
    > but what is eventdateID? Maybe you have an earlier line of code like
    > eventdateID = [].
    > I'd like to see the complete picture. Also, what is eventsbydate?
    > By the way, now I'm realizing that eventsbydate might be a string, so
    > how can eventdateID contain more than 1 entry at all?
    > If that's true, then
    >
    > eventsbydate.gsub(/\s/, '')[0..7] + eventsbydate[26..30]
    >
    > is also a string. So you are pushing a single string into eventdateID,
    > so when you later iterate you only get one iteration. Perhaps you have
    > a loop around the piece of code you showed? If that's the case, then
    > it makes sense that you never get more than 1 count per entry, because
    > you are creating the hash every time. So, I think it would be easier
    > if you pasted the complete program.
    >
    >
    >>> If each element in the array eventDateID is stored in the hash as a
    >>> different key (which is what seems to be happening), maybe what is
    >>> inside the array are not strings, but another class that has a
    >>> different implementation of eql?.

    >>
    >> Not sure what you mean by this.

    >
    > It was another hypothesis, but I think you can forget about it, since
    > I'm pretty sure now that with the piece of code you showed you are
    > only ever pushing one string into eventdateID.
    >
    > Jesus.


     
    Stuart Clarke, Feb 3, 2009
    #15
  16. Hi --

    On Wed, 4 Feb 2009, Stuart Clarke wrote:

    > Thanks for your response. It makes a lot more sense and you are on the
    > right lines I think. There is other code around this but it does not
    > bear much relevance:
    >
    > def scanEVTWithSource(file, source)
    > @alerts = []
    > @evtLogArray = []
    > begin
    > #read the contents of the event logs files
    > evtLog = EventLog.open_backup(file, source)
    >
    > #put data into an array
    > @evtLogArray = evtLog.read.sort { |a, b| (a.event_id <=>
    > b.event_id).nonzero? || (a.time_written <=> b.time_written)}


    I haven't really been following this thread but this caught my eye and
    I thought I'd mention this other technique:

    array.sort_by {|e| [e.event_id, e.time_written] }
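
    A quick sketch to show the two forms give the same order. The Event
    struct and the integer time_written values are stand-ins invented for
    this example, not the real event-log records:

```ruby
# Stand-in record; integers play the role of timestamps here.
Event = Struct.new(:event_id, :time_written)

events = [
  Event.new(529, 30),
  Event.new(100, 20),
  Event.new(529, 10),
  Event.new(100, 5)
]

# Original style: compare ids first, fall back to time when ids are equal.
with_sort = events.sort do |a, b|
  (a.event_id <=> b.event_id).nonzero? || (a.time_written <=> b.time_written)
end

# sort_by style: arrays compare element by element, giving the same order.
with_sort_by = events.sort_by { |e| [e.event_id, e.time_written] }

p with_sort == with_sort_by
p with_sort_by.map(&:to_a)
```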


    David

    --
    David A. Black / Ruby Power and Light, LLC
    Ruby/Rails consulting & training: http://www.rubypal.com
    Coming in 2009: The Well-Grounded Rubyist (http://manning.com/black2)

    http://www.wishsight.com => Independent, social wishlist management!
     
    David A. Black, Feb 3, 2009
    #16
  17. On Tue, Feb 3, 2009 at 11:12 PM, Stuart Clarke
    <> wrote:
    > Thanks for your response. It makes a lot more sense and you are on the
    > right lines I think. There is other code around this but it does not
    > bear much relevance:
    >
    > def scanEVTWithSource(file, source)
    > @alerts = []
    > @evtLogArray = []


    This is unneeded, since you later assign another array to this
    variable without using this one.

    > begin
    > #read the contents of the event logs files
    > evtLog = EventLog.open_backup(file, source)
    >
    > #put data into an array
    > @evtLogArray = evtLog.read.sort { |a, b| (a.event_id <=>
    > b.event_id).nonzero? || (a.time_written <=> b.time_written)}


    Are you sure you want to put this in an instance variable?

    > #event log data collected
    > evtLog.close
    > if evtLogArray.length == 0


    Shouldn't this be checking the @evtLogArray?

    > return
    > end
    >
    > #failed logons where more than 10 have occurred in a day
    > if event.event_id == 529


    Here we are reaching the culprit, I think. What is event? It's not
    defined in this method...

    > eventdateID = []
    > #assign all time written values to the eventsbydate array
    > eventsbydate = "#{event.time_written}"
    > eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    > eventsbydate[26..30]
    > counts = Hash.new(0)
    > eventdateID.each {|d| counts[d] += 1}
    > counts.each do |id,cnt|
    > @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    > #{@tab} #{event.event_type} #{@tab} #{type}") if cnt >= 5
    > end
    > end
    > end
    >


    Let me try to write what I think you want, because I still think the
    above code is not what you are actually running: as written it will
    raise a NameError at the evtLogArray.length call (only @evtLogArray is
    assigned). The following is untested:


    def scanEVTWithSource(file, source)
      @alerts = []
      # read the contents of the event log file
      evtLog = EventLog.open_backup(file, source)
      # put data into an array; sort it using David's advice
      evtLogArray = evtLog.read.sort_by { |e| [e.event_id, e.time_written] }

      # event log data collected
      evtLog.close
      return if evtLogArray.length == 0

      # Important part here: create the hash outside the loop
      # and, actually, do a loop on evtLogArray
      counts = Hash.new(0)
      # select relevant events, mapping them to the modified string
      events = evtLogArray.select { |event| event.event_id == 529 }
      events.each do |event|
        event_time = event.time_written.to_s
        eventsbydate = event_time.gsub(/\s/, '')[0..7] + event_time[26..30]
        counts[eventsbydate] += 1
      end
      counts.each do |id, cnt|
        # Now I have a problem here: what we are putting in the hash
        # is a string, not an event object
        # @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
        #   #{@tab} #{event.event_type} #{@tab} #{type}") if cnt >= 5
        @alerts.push(id) if cnt >= 5
      end
    end

    I hope this helps. I don't have time now to solve the issue about you
    wanting to push the event object to the alerts array, instead of just
    the calculated string, but I hope you find a way to do that easily.
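
    One possible way to keep the event objects rather than only the
    day string is to group the events by that string. This is a minimal
    sketch, not tested against the real library: it assumes
    time_written is a Time object, and Event here is a hypothetical
    stand-in for the records that EventLog#read returns.

```ruby
# Hypothetical stand-in for the event log records; only the fields
# used in this thread are modelled.
Event = Struct.new(:event_id, :time_written, :event_type)

# Group failed-logon events (ID 529) by calendar day and return a hash
# of day key => events, keeping only days that reach the threshold.
def events_over_threshold(events, threshold = 5)
  failed = events.select { |e| e.event_id == 529 }
  # Assumes time_written is a Time; builds keys such as "MonFeb022009".
  by_day = failed.group_by { |e| e.time_written.strftime("%a%b%d%Y") }
  by_day.select { |_day, day_events| day_events.size >= threshold }
end
```

    Each event in the returned hash could then be formatted and pushed
    onto @alerts the same way the original code builds its strings.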

    Let me know if this helped.

    Jesus.
     
    Jesús Gabriel y Galán, Feb 3, 2009
    #17
  18. 2009/2/3 Jesús Gabriel y Galán <>:
    > On Tue, Feb 3, 2009 at 11:12 PM, Stuart Clarke
    > <> wrote:
    >> Thanks for your response. It makes a lot more sense and you are on the
    >> right lines I think. There is other code around this but it does not
    >> bear much relevance:
    >>
    >> def scanEVTWithSource(file, source)
    >> @alerts = []
    >> @evtLogArray = []

    >
    > This is unneeded, since you later assign another array to this
    > variable without using this one.


    Also, since these variables are reinitialized on every method call,
    chances are they can be local variables rather than instance
    variables - unless, of course, some other method in the class (which
    class?) uses what scanEVTWithSource leaves behind in those instance
    variables.

    I suspect the issue is somewhere above the method. For example,
    you might have a loop calling scanEVTWithSource and expecting the
    counts to be aggregated across all calls, but they aren't, since you
    reinitialize the Hash on each call.
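
    If aggregation across calls is what is wanted, the tally has to
    live outside the method that is called repeatedly. A rough sketch
    of that idea follows; the class and method names are made up for
    illustration and are not taken from the posted code.

```ruby
# Hypothetical tally object: the hash is created once in initialize,
# so counts accumulate over every call to #scan.
class LogonTally
  attr_reader :counts

  def initialize
    @counts = Hash.new(0) # created once, not once per scan
  end

  # day_keys is an array of strings such as "MonFeb022009".
  def scan(day_keys)
    day_keys.each { |key| @counts[key] += 1 }
    self
  end

  # Day keys seen at least `threshold` times across all scans so far.
  def days_over(threshold)
    @counts.select { |_key, n| n >= threshold }.keys
  end
end
```

    With the hash created inside scan instead, each call would start
    from zero and a threshold of 5 might never be reached.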

    >> begin
    >> #read the contents of the event logs files
    >> evtLog = EventLog.open_backup(file, source)
    >>
    >> #put data into an array
    >> @evtLogArray = evtLog.read.sort { |a, b| (a.event_id <=>
    >> b.event_id).nonzero? || (a.time_written <=> b.time_written)}

    >
    > Are you sure you want to put this in an instance variable?
    >
    >> #event log data collected
    >> evtLog.close
    >> if evtLogArray.length == 0

    >
    > Shouldn't this be checking the @evtLogArray?
    >
    >> return
    >> end
    >>
    >> #failed logons where more than 10 have occurred in a day
    >> if event.event_id == 529

    >
    > Here we are reaching the culprit, I think. What is event? It's not
    > defined in this method...
    >
    >> eventdateID = []
    >> #assign all time written values to the eventsbydate array
    >> eventsbydate = "#{event.time_written}"
    >> eventdateID.push eventsbydate.gsub(/\s/, '')[0..7] +
    >> eventsbydate[26..30]
    >> counts = Hash.new(0)
    >> eventdateID.each {|d| counts[d] += 1}
    >> counts.each do |id,cnt|
    >> @alerts.push("#{event.event_id} #{@tab} #{event.time_written}
    >> #{@tab} #{event.event_type} #{@tab} #{type}") if cnt >= 5
    >> end
    >> end
    >> end
    >>


    I absolutely agree with your other comments. I still think we
    haven't seen all the code. Also, the whole problem is not very clear
    to me either.

    Cheers

    robert

    --
    remember.guy do |as, often| as.you_can - without end
     
    Robert Klemme, Feb 4, 2009
    #18
  19. On Wed, Feb 4, 2009 at 12:04 AM, Stuart Clarke
    <> wrote:
    >
    > counts = Hash.new(0)
    > eventdateID.each {|d| counts[d] += 1}


    Here is your problem. Hash.new(0) means "when I query the hash, and
    the key I request is not in there, return 0". It does not actually add
    {key => 0} to the hash itself. To do that, you need the block form of
    Hash.new, whose block receives the hash itself and the key:

    counts = Hash.new {|h, k| h[k] = 0}

    irb(main):001:0> a = Hash.new(0)
    => {}
    irb(main):002:0> b = Hash.new {|h,k| h[k] = 0}
    => {}
    irb(main):003:0> a['hello']
    => 0
    irb(main):004:0> b['hello']
    => 0
    irb(main):005:0> a
    => {}
    irb(main):006:0> b
    => {"hello"=>0}
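
    The same comparison can be run as a small script. This is only a
    sketch; stored_after_read? is a helper name invented here for
    illustration.

```ruby
# Reads a key and reports whether that read alone stored the key.
def stored_after_read?(hash, key)
  hash[key]      # plain read, result discarded
  hash.key?(key) # did the read leave an entry behind?
end

with_value = Hash.new(0)                   # default value form
with_block = Hash.new { |h, k| h[k] = 0 }  # default block form

puts stored_after_read?(with_value, "hello") # prints false
puts stored_after_read?(with_block, "hello") # prints true
```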

    martin
     
    Martin DeMello, Feb 4, 2009
    #19
  20. On Wed, Feb 4, 2009 at 3:27 PM, Martin DeMello <> wrote:
    > On Wed, Feb 4, 2009 at 12:04 AM, Stuart Clarke
    > <> wrote:
    >>
    >> counts = Hash.new(0)
    >> eventdateID.each {|d| counts[d] += 1}

    >
    > Here is your problem. Hash.new(0) means "when I query the hash, and
    > the key I request is not in there, return 0". It does not actually add
    > {key => 0} to the hash itself.


    This is true, but counts[d] += 1 is actually counts[d] = counts[d] + 1
    so the RHS will evaluate to 1 the first time, assigning it to the hash:

    irb(main):001:0> h = Hash.new(0)
    => {}
    irb(main):002:0> h["a"] += 1
    => 1
    irb(main):003:0> h
    => {"a"=>1}

    So the above snippet is correct for generating a histogram.
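
    For completeness, here is the histogram idiom wrapped in two small
    methods, with the threshold check applied afterwards. The method
    names are only illustrative.

```ruby
# Count how often each item occurs. Hash.new(0) is enough here because
# counts[item] += 1 expands to counts[item] = counts[item] + 1, and the
# assignment stores the key.
def histogram(items)
  counts = Hash.new(0)
  items.each { |item| counts[item] += 1 }
  counts
end

# Items that occur at least `threshold` times.
def frequent(items, threshold)
  histogram(items).select { |_item, n| n >= threshold }.keys
end
```

    Calling frequent with the nine "MonFeb022009" entries from the
    first post and a threshold of 5 returns that one key.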

    Jesus.
     
    Jesús Gabriel y Galán, Feb 4, 2009
    #20