Faster CSV, need help with logic

Discussion in 'Ruby' started by Nick Barba, Jun 11, 2009.

  1. Nick Barba

    Nick Barba Guest

    I'm having a tough time figuring out how to go about solving a specific
    problem.

    I have a csv file that looks like this:

    Name,Estimated Hours,Actual Hours,Date
    Black, 30, 10, 2009-03-31
    Black,30,10,2009-04-30
    Casey,200,200,2009-04-30
    Clothier,80,40,2009-04-30
    Avino,100,100,2009-05-31
    Black,30,5,2009-05-31
    Clothier,80,50,2009-05-31

    I need to figure out how to consolidate the rows so that there is just
    one row per name, adding any actual hours together, and leaving the
    latest date.

    So that would become:

    Name,Estimated Hours,Actual Hours,Date
    Casey,200,200,2009-04-30
    Avino,100,100,2009-05-31
    Black,30,25,2009-05-31
    Clothier,80,90,2009-05-31

    Anyone have any ideas? I was thinking I would need to start by looking
    at the first line, then scan the rest of the file for rows that have
    matching names and store all of those. Then write a csv file with just
    that new combined line, and then delete all rows with those names. Then
    move on to the next name and do the same thing. Problem is I can't seem
    to figure out how to code it...
    --
    Posted via http://www.ruby-forum.com/.
     
    Nick Barba, Jun 11, 2009
    #1

  2. Nick Barba

    snex Guest

    Store everything in a hash table with the name as the key. If the key
    already exists, add the hours and replace the date if the new one is
    later; otherwise insert the row with the data given.
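
    A minimal sketch of that hash idea, not anyone's posted code. It
    assumes Ruby 1.9 or later, where FasterCSV became the standard csv
    library. The names INPUT and consolidate are made up for illustration,
    and dates are compared as plain strings, which only works because they
    are in YYYY-MM-DD form.

```ruby
require "csv"

# Sample input from the original post (the first Black row has stray spaces).
INPUT = <<~CSV
  Name,Estimated Hours,Actual Hours,Date
  Black, 30, 10, 2009-03-31
  Black,30,10,2009-04-30
  Casey,200,200,2009-04-30
  Clothier,80,40,2009-04-30
  Avino,100,100,2009-05-31
  Black,30,5,2009-05-31
  Clothier,80,50,2009-05-31
CSV

# Hypothetical helper: returns one [name, estimated, actual, date] row per name.
def consolidate(csv_text)
  totals = {} # name => [name, estimated, actual_sum, latest_date]
  CSV.parse(csv_text, headers: true) do |row|
    # Strip whitespace so " 30" and "30" are treated the same.
    name, estimated, actual, date = row.fields.map { |f| f.to_s.strip }
    if totals.key?(name)
      entry = totals[name]
      entry[2] += actual.to_i           # add the actual hours
      entry[3] = date if date > entry[3] # keep the latest ISO date
    else
      totals[name] = [name, estimated.to_i, actual.to_i, date]
    end
  end
  totals.values
end

rows = consolidate(INPUT)
puts "Name,Estimated Hours,Actual Hours,Date"
rows.each { |r| puts r.to_csv }
```

    Rows come out in order of first appearance, since a Ruby hash keeps
    insertion order. Sort the result yourself if you need a different order.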
     
    snex, Jun 11, 2009
    #2

  3. Nick Barba

    James Gray Guest

    On Jun 11, 2009, at 12:40 PM, Nick Barba wrote:

    > I'm having a tough time figuring out how to go about solving a
    > specific
    > problem.


    How about something like this?

    #!/usr/bin/env ruby -wKU

    require "rubygems"
    require "faster_csv"

    data = FCSV.parse( DATA.read, :headers           => true,
                                  :header_converters => :symbol,
                                  :return_headers    => true )
    FCSV { |csv| csv << data[0].fields }
    data[:name].uniq.each do |name|
      next if name == "Name"
      matches = data.select { |row| row[:name] == name }
      FCSV { |csv| csv << [ name,
                            matches.first[:estimated_hours],
                            matches.map { |row| row[:actual_hours] }.
                                    inject(0) { |sum, n| sum + n.to_i },
                            matches.map { |row|
                              row[:date] }.sort.reverse.first ] }
    end

    __END__
    Name,Estimated Hours,Actual Hours,Date
    Black, 30, 10, 2009-03-31
    Black,30,10,2009-04-30
    Casey,200,200,2009-04-30
    Clothier,80,40,2009-04-30
    Avino,100,100,2009-05-31
    Black,30,5,2009-05-31
    Clothier,80,50,2009-05-31

    Hope that helps.

    James Edward Gray II
     
    James Gray, Jun 11, 2009
    #3
  4. Nick Barba

    Joe Smoe Guest

    Joe Smoe, Jun 11, 2009
    #4
  5. >
    > data = FCSV.parse( DATA.read, :headers => true,
    > :header_converters => :symbol,
    > :return_headers => true )


    I really like the logic in this code. It makes good sense, but when I
    try it on my CSV file I get no results. Do you know where DATA in the
    above code comes from? I have tried setting DATA to the path where my
    file is located, but it says that I have an uninitialized constant
    DATA (NameError).

    DATA = FasterCSV.read("C:/myCSV.CSV") # when I tried this I removed the
    # .read from FCSV.parse( DATA.read

    DATA = "C:/myCSV.CSV"
     
    Mmcolli00 Mom, Nov 3, 2009
    #5
  6. On Nov 3, 2009, at 4:26 PM, Mmcolli00 Mom wrote:

    >>
    >> data = FCSV.parse( DATA.read, :headers => true,
    >> :header_converters => :symbol,
    >> :return_headers => true )

    >
    > I really like the logic in this code. It makes good but when I try
    > this
    > on my CSV file I am getting no results. Do you know where DATA in the
    > above code comes from?


    DATA is a special Ruby shortcut for easy scripting. Inside a Ruby
    source file you can start a line with the magic __END__ tag to end the
    code and start the data section. Ruby will open an IO object, point
    it at the data you put below __END__, and stick that object in the
    constant DATA for easy access.

    I bet you can go back and read the email where I used it and it will
    make more sense now. See how I just dumped the CSV under my __END__
    tag? That's what the code was reading.
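
    A small sketch of the DATA / __END__ behavior described above. DATA is
    only set for the script file Ruby is actually running, so this writes a
    tiny script to a temp file and runs it with the current interpreter.
    The temp-file wrapper and the variable names are just scaffolding for
    the demo; in a real script you would put __END__ straight into your own
    file.

```ruby
require "tempfile"
require "rbconfig"

# The script we want to demonstrate. The squiggly heredoc strips the
# indentation, so __END__ lands at the start of its own line in the file.
script = <<~'RUBY'
  # Everything after the __END__ line is data, not code.
  puts DATA.read
  __END__
  Name,Actual Hours
  Black,10
RUBY

# Write the script to disk and run it as the main program.
output = Tempfile.create(["data_demo", ".rb"]) do |f|
  f.write(script)
  f.flush
  IO.popen([RbConfig.ruby, f.path], &:read)
end

puts output
```

    The child script prints only the two CSV lines that sit below __END__.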

    You could replace DATA.read in your own code with
    File.read("path/to/file.csv"). Or you could just put the path where I
    stuck DATA.read, but call FCSV.read() instead of FCSV.parse().
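
    A rough sketch of those two options side by side. It assumes Ruby 1.9
    or later, where FasterCSV became the standard csv library, so CSV.parse
    and CSV.read stand in for FCSV.parse and FCSV.read. The temp file and
    its single data row are stand-ins so the sketch runs anywhere; swap in
    your real path.

```ruby
require "csv"
require "tempfile"

# Hypothetical stand-in for your real CSV file.
file = Tempfile.new(["hours", ".csv"])
file.write("Name,Estimated Hours,Actual Hours,Date\nBlack,30,10,2009-04-30\n")
file.close

options = { headers: true, header_converters: :symbol, return_headers: true }

# Option 1: read the file into a string yourself, then parse the string.
parsed = CSV.parse(File.read(file.path), **options)

# Option 2: hand the path to CSV.read and let it open the file for you.
read = CSV.read(file.path, **options)

# Both give a table whose row 0 is the header row and row 1 the first data row.
puts parsed[1][:name]
puts read[1][:actual_hours]

file.unlink # remove the stand-in file
```

    Either way the rest of the script stays the same, because both calls
    return the same kind of table object.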

    I hope that helps.

    James Edward Gray II
     
    James Edward Gray II, Nov 3, 2009
    #6

  7. >
    > DATA is a special Ruby shortcut for easy scripting. Inside a Ruby
    > source file you can start a line with the magic __END__ tag to end the
    > code and start the data section. Ruby will open an IO object, point
    > it at the data you put below __END__, and stick that object in the
    > constant DATA for easy access.
    >
    > I bet you can go back and read the email where I used it and it will
    > make more sense now. See how I just dumped the CSV under my __END__
    > tag? That's what the code was reading.
    >


    Very cool!!! I like this! Thanks so much for explaining James!

    -MC
     
    Mmcolli00 Mom, Nov 3, 2009
    #7