CSV import to hash to compare with database

Discussion in 'Ruby' started by John Mcleod, Aug 17, 2009.

  1. John Mcleod

    John Mcleod Guest

    Hello All,
    Is it practical to import data from a csv file, via FasteCSV to a
    temporary data structure, like a hash, for editing purposes?
    I'm looking to read a file, send the data to a temp location, then
    compare the file with similar files in a database table, if any. Then
    import the edited data to the database table.

    Thank you for any help.

    JohnM
    --
    Posted via http://www.ruby-forum.com/.
    John Mcleod, Aug 17, 2009
    #1
    1. Advertising

  2. John Mcleod

    James Gray Guest

    On Aug 17, 2009, at 10:54 AM, John Mcleod wrote:

    > Hello All,


    Hello.

    > Is it practical to import data from a csv file, via FasteCSV to a
    > temporary data structure, like a hash, for editing purposes?


    I don't see why it wouldn't be. FasterCSV::Row has a to_hash()
    method. I use this regularly to pump CSV rows into ActiveRecord's
    create!() method to add records to my database.

    James Edward Gray II
    James Gray, Aug 17, 2009
    #2
    1. Advertising

  3. John Mcleod

    John Mcleod Guest

    Mr. Gray,
    Thank you for the quick reply.
    I'm a PHP guy with 3 weeks in Ruby on Rails (Department changed platform
    3 months after they hired me)
    Could you please point me to a good 'FasterCSV' resource with some
    examples?
    Thank you very much.
    JohnM


    James Gray wrote:
    > On Aug 17, 2009, at 10:54 AM, John Mcleod wrote:
    >
    >> Hello All,

    >
    > Hello.
    >
    >> Is it practical to import data from a csv file, via FasteCSV to a
    >> temporary data structure, like a hash, for editing purposes?

    >
    > I don't see why it wouldn't be. FasterCSV::Row has a to_hash()
    > method. I use this regularly to pump CSV rows into ActiveRecord's
    > create!() method to add records to my database.
    >
    > James Edward Gray II


    --
    Posted via http://www.ruby-forum.com/.
    John Mcleod, Aug 17, 2009
    #3
  4. John Mcleod

    James Gray Guest

    On Aug 17, 2009, at 11:56 AM, John Mcleod wrote:

    > Mr. Gray,


    You can stick with James. I'm not that old yet. ;)

    > Thank you for the quick reply.


    Sure. Happy to help.

    > Could you please point me to a good 'FasterCSV' resource with some
    > examples?


    The documentation is pretty thorough:

    http://fastercsv.rubyforge.org/

    and the tests show usage:

    http://fastercsv.rubyforge.org/svn/trunk/test/

    If your needs are simple though, maybe this code is all you need:

    FCSV.foreach( path, :headers => true,
    :header_converters => :symbol ) do |row|
    YourARModelClass.create!(row.to_hash)
    end

    Hopefully that at least gets you started.

    James Edward Gray II
    James Gray, Aug 18, 2009
    #4
  5. John Mcleod

    John Mcleod Guest

    James,
    Thank you for the example code.
    It's funny, I found an old post of yours from 2005.

    h = Hash.new(0)
    file = FasterCSV.read(filename)[1..-1].each { |row| h[row[0]] += 1 }

    I've tested it and it seems to work.
    I will however, check out the documentation and your test code.

    This is step 1 though. Next, populate a hash with data from the
    database comparing data from the CSV file and making a editable CSV grid
    (post to rails forum).

    I'm not done yet.

    Thanks again.

    JohnM

    James Gray wrote:
    > On Aug 17, 2009, at 11:56 AM, John Mcleod wrote:
    >
    >> Mr. Gray,

    >
    > You can stick with James. I'm not that old yet. ;)
    >
    >> Thank you for the quick reply.

    >
    > Sure. Happy to help.
    >
    >> Could you please point me to a good 'FasterCSV' resource with some
    >> examples?

    >
    > The documentation is pretty thorough:
    >
    > http://fastercsv.rubyforge.org/
    >
    > and the tests show usage:
    >
    > http://fastercsv.rubyforge.org/svn/trunk/test/
    >
    > If your needs are simple though, maybe this code is all you need:
    >
    > FCSV.foreach( path, :headers => true,
    > :header_converters => :symbol ) do |row|
    > YourARModelClass.create!(row.to_hash)
    > end
    >
    > Hopefully that at least gets you started.
    >
    > James Edward Gray II


    --
    Posted via http://www.ruby-forum.com/.
    John Mcleod, Aug 18, 2009
    #5
  6. John,

    if I understand you correctly you have data in CSV files and you have
    a relational database. You want to compare the content of those files
    with data already present in the database and edit the CSV data before
    importing it into the DB.

    I do not know what your schema looks like nor what "comparing" in your
    case means, but did you consider first loading CSV data into the
    database using some staging table, doing the comparison and editing
    there and then copying the data over to the target location? That way
    you can use the full power of SQL for comparison purposes. And there
    are a lot of tools that allow to edit database data in tabular
    representation.

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
    Robert Klemme, Aug 18, 2009
    #6
  7. John Mcleod

    John Mcleod Guest

    Robert,
    Yes, you understand my dilemma.
    I want to thank you for your keen insight. I didn't even thought of
    that.

    When I say "compare", I mean a similarity comparison not an exact
    comparison.
    I installed 'amatch' gem. It has several comparison features that I
    believe will benefit me. In particular, the Levenshtein distance.

    Thank you again for the advice.

    JohnM

    Robert Klemme wrote:
    > John,
    >
    > if I understand you correctly you have data in CSV files and you have
    > a relational database. You want to compare the content of those files
    > with data already present in the database and edit the CSV data before
    > importing it into the DB.
    >
    > I do not know what your schema looks like nor what "comparing" in your
    > case means, but did you consider first loading CSV data into the
    > database using some staging table, doing the comparison and editing
    > there and then copying the data over to the target location? That way
    > you can use the full power of SQL for comparison purposes. And there
    > are a lot of tools that allow to edit database data in tabular
    > representation.
    >
    > Kind regards
    >
    > robert


    --
    Posted via http://www.ruby-forum.com/.
    John Mcleod, Aug 18, 2009
    #7
  8. 2009/8/18 John Mcleod <>:
    > Yes, you understand my dilemma.
    > I want to thank you for your keen insight. I didn't even thought of
    > that.


    It may not be a viable solution in your case but it could be
    worthwhile to ponder.

    > When I say "compare", I mean a similarity comparison not an exact
    > comparison.
    > I installed 'amatch' gem. It has several comparison features that I
    > believe will benefit me. In particular, the Levenshtein distance.


    Two things come to mind: depending on the RDBMS you are using it may
    have similar tools (functions) for text processing and analysis.
    Depending on your requirements you can even write a database function
    that calculates Levenshtein Distance.

    Second, you could extract the data from the database via any of Ruby's
    database interfaces, do the comparison in Ruby code and write out
    results. Then, edit them and import the result back into the DB.

    > Thank you again for the advice.


    Your welcome!

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
    Robert Klemme, Aug 18, 2009
    #8
  9. John Mcleod

    John Mcleod Guest

    Robert,

    >Two things come to mind: depending on the RDBMS you are using it may
    > have similar tools (functions) for text processing and analysis.


    Currently, my development database is SQLite, but my supervisor's plan
    is to have Oracle as a production database.

    Once again, I'll be in learning mode. (knowledge of Oracle = 0) I'm
    experienced in mySQL mostly.

    > Second, you could extract the data from the database via any of Ruby's
    > database interfaces, ...


    Are the "interfaces" easy to work with?

    >Then, edit them and import the result back into the DB.


    I've been looking at jquery jgrid for rails. It looks easy to implement
    (that's a topic for another forum)

    Thanks again.

    John

    Robert Klemme wrote:
    > 2009/8/18 John Mcleod <>:
    >> Yes, you understand my dilemma.
    >> I want to thank you for your keen insight. I didn't even thought of
    >> that.

    >
    > It may not be a viable solution in your case but it could be
    > worthwhile to ponder.
    >
    >> When I say "compare", I mean a similarity comparison not an exact
    >> comparison.
    >> I installed 'amatch' gem. It has several comparison features that I
    >> believe will benefit me. In particular, the Levenshtein distance.

    >
    > Two things come to mind: depending on the RDBMS you are using it may
    > have similar tools (functions) for text processing and analysis.
    > Depending on your requirements you can even write a database function
    > that calculates Levenshtein Distance.
    >
    > Second, you could extract the data from the database via any of Ruby's
    > database interfaces, do the comparison in Ruby code and write out
    > results. Then, edit them and import the result back into the DB.
    >
    >> Thank you again for the advice.

    >
    > Your welcome!
    >
    > Kind regards
    >
    > robert


    --
    Posted via http://www.ruby-forum.com/.
    John Mcleod, Aug 18, 2009
    #9
  10. 2009/8/18 John Mcleod <>:
    > Robert,
    >
    >>Two things come to mind: depending on the RDBMS you are using it may
    >> have similar tools (functions) for text processing and analysis.

    >
    > Currently, my development database is SQLite, but my supervisor's plan
    > is to have Oracle as a production database.
    >
    > Once again, I'll be in learning mode. =A0(knowledge of Oracle =3D 0) =A0I=

    'm
    > experienced in mySQL mostly.
    >
    >> Second, you could extract the data from the database via any of Ruby's
    >> database interfaces, ...

    >
    > Are the "interfaces" easy to work with?


    DBI is pretty straightforward - once you have it installed IIRC. It's
    a while since I used it.

    But you can as easily export data as CSV with SQL Plus - example for
    the SCOTT demo schema that comes with the database (if it is
    installed):

    c:\Temp>sqlplus -S scott/PWD @emp
    7369;"SMITH";"CLERK"
    7499;"ALLEN";"SALESMAN"
    7521;"WARD";"SALESMAN"
    7566;"JONES";"MANAGER"
    7654;"MARTIN";"SALESMAN"
    7698;"BLAKE";"MANAGER"
    7782;"CLARK";"MANAGER"
    7788;"SCOTT";"ANALYST"
    7839;"KING";"PRESIDENT"
    7844;"TURNER";"SALESMAN"
    7876;"ADAMS";"CLERK"
    7900;"JAMES";"CLERK"
    7902;"FORD";"ANALYST"
    7934;"MILLER";"CLERK"

    c:\Temp>

    emp.sql:

    set pagesize 0 linesize 200
    set feedback off
    select empno || ';"' || ename || '";"' || job || '"'
    from emp
    order by empno
    /
    exit

    You can find all the docs over at http://tahiti.oracle.com/

    Kind regards

    robert

    --=20
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
    Robert Klemme, Aug 19, 2009
    #10
  11. Hi,
    If you wish to avoid programming and would consider a third party
    tool, Try this one:

    http://www.nobhillsoft.com/Columbo.aspx

    it pretty much comapres everything tabular, so you can easily compare
    CSV files. Its still on beta so its free, and our tech support will be
    happy to walk you through the steps required to get it done, should
    you run into difficulties
    Jonathan Sion, Sep 16, 2009
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Dave
    Replies:
    4
    Views:
    1,807
    Tom Forsmo
    Oct 23, 2006
  2. Chris Rebert

    Re: python compare and process a csv file

    Chris Rebert, Nov 4, 2009, in forum: Python
    Replies:
    1
    Views:
    1,456
    Peter Otten
    Nov 4, 2009
  3. rp
    Replies:
    1
    Views:
    520
    red floyd
    Nov 10, 2011
  4. Srijayanth Sridhar
    Replies:
    19
    Views:
    617
    David A. Black
    Jul 2, 2008
  5. Colvin
    Replies:
    3
    Views:
    168
    Colvin
    Dec 30, 2003
Loading...

Share This Page