1.9 CSV Parsing Issues

Discussion in 'Ruby' started by Kenny Lam, Nov 4, 2010.

  1. Kenny Lam

    Kenny Lam Guest

    I'm currently porting a script to 1.9 and I'm having problems getting
    CSV parsing to work. This script worked fine in 1.8.7 and used the
    FasterCSV library for parsing. After playing around in the IRB, I have
    determined that the current parser seems incapable of handling newlines
    as row separators (a rather basic and important feature).

    I tested with a simple file whose contents are:
    field1,field2
    field3,field4

    This file was created using a basic text editor and does not contain any
    unorthodox newline characters. Attempting to parse this file results in
    the following error:

    C:/Ruby192/lib/ruby/1.9.1/csv.rb:1885:in `block (2 levels) in shift':
    Unquoted fields do not allow \r or \n (line 1). (CSV::MalformedCSVError)
    from C:/Ruby192/lib/ruby/1.9.1/csv.rb:1856:in `each'
    from C:/Ruby192/lib/ruby/1.9.1/csv.rb:1856:in `block in shift'
    from C:/Ruby192/lib/ruby/1.9.1/csv.rb:1818:in `loop'
    from C:/Ruby192/lib/ruby/1.9.1/csv.rb:1818:in `shift'
    from C:/Ruby192/lib/ruby/1.9.1/csv.rb:1760:in `each'

    The return value of the opened csv file shows row_sep to be "\r\n" which
    seems correct. I have tried manually setting the value of row_sep when
    calling CSV::open but I get the same issue.

    Once again, I do not have this problem with FasterCSV under 1.8.7 (which
    as I understand, is the same code used in 1.9's csv library). I'm using
    Ruby 1.9.2p0 on Windows XP. I would greatly appreciate any help.

    --
    Posted via http://www.ruby-forum.com/.
     
    Kenny Lam, Nov 4, 2010
    #1

  2. On Nov 4, 2010, at 1:40 PM, Kenny Lam wrote:

    > I'm currently porting a script to 1.9 and I'm having problems getting
    > CSV parsing to work.


    > I tested with a simple file whose contents are:
    > field1,field2
    > field3,field4


    CSV should definitely handle that data. Indeed it does for me:

    $ ruby -v -r csv -e 'p CSV.parse("field1,field2\r\nfield3,field4\r\n")'
    ruby 1.9.2dev (2010-04-28 trunk 27536) [x86_64-darwin10.3.0]
    [["field1", "field2"], ["field3", "field4"]]

    > This file was created using a basic text editor and does not contain any
    > unorthodox newline characters.


    Can we see exactly what the file does contain, with code like:

    $ ruby -e 'p File.read("path/to/file.csv")'

    ?

    James Edward Gray II
     
    James Edward Gray II, Nov 4, 2010
    #2

  3. Kenny Lam

    Kenny Lam Guest

    File.read shows "field1,field2\nfield3,field4\n"
    I have played around with some of the other methods and have
    determined that this problem only seems to occur when using CSV::open
    and then looping through with CSV::each. CSV::foreach and CSV::parse
    seem fine. Unfortunately, I need to use CSV::open because I need a
    reference to the opened file object in order to do some file cursor
    manipulation.

    Another thing I have noted is that running CSV.open('file','r') shows
    the result:
    <#CSV io_type:File io_path:"/log/test.log" encoding:CP850 lineno:0
    col_sep:"," row_sep:"\r\n" quote_char:"\"">

    While CSV.open('test.log','r',:row_sep => '\r\n') shows the result:
    <#CSV io_type:File io_path:"/log/test.log" encoding:CP850 lineno:0
    col_sep:"," row_sep:"\\r\\n" quote_char:"\"">

    The double backslashes make me question if the escape character is being
    processed correctly. I am relatively new to Ruby, am I using the
    language incorrectly or is this a bug?

    --
    Posted via http://www.ruby-forum.com/.
     
    Kenny Lam, Nov 4, 2010
    #3
  4. On Nov 4, 2010, at 2:26 PM, Kenny Lam wrote:

    > File.read shows "field1,field2\nfield3,field4\n"


    Great. That's what we expected to see. You are right about the content.

    > I have played around with the some of the other methods and have
    > determined that this problem only seems to occur when using CSV::open
    > and then looped through with CSV::each. CSV::foreach and CSV::parse
    > seem fine.


    Ah, and let me guess, you always pass a read mode of 'r' to open(),
    right? CSV is clever and it shuts off Ruby's line ending translation on
    Windows using 'rb' if you don't specify a mode. By specifying a mode, you
    leave this feature on, which allows Ruby to switch \r\n to \n as it did
    with the read above.
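
A minimal sketch of the two ways of opening the file described above: leaving the mode off, and passing 'rb' explicitly. It assumes a throwaway temp file standing in for the original data (the `sample` name is made up for illustration). Outside Windows there is no line ending translation in either mode, so this only shows that both calls parse "\r\n" rows the same way:

```ruby
require "csv"
require "tempfile"

# Hypothetical sample file; binmode keeps the bytes exactly "\r\n" on any platform.
sample = Tempfile.new(["sample", ".csv"])
sample.binmode
sample.write("field1,field2\r\nfield3,field4\r\n")
sample.close

# No mode given: CSV decides how to open the file itself.
rows_default = CSV.open(sample.path) { |csv| csv.read }

# Explicit binary mode: also keeps Windows from translating "\r\n" to "\n".
rows_binary = CSV.open(sample.path, "rb") { |csv| csv.read }

p rows_default
p rows_binary

sample.unlink
```

The call to avoid on Windows is the one from the original post, CSV.open(path, 'r'), since the explicit text mode switches the translation back on.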

    > Unfortunately, I need to use CSV::open because I need a
    > reference to the opened file object in order to do some file cursor
    > manipulation.


    No worries, open() is going to work for you.

    > Other things I have noted is that when running CSV.open('file','r') the
    > result is show:
    > <#CSV io_type:File io_path:"/log/test.log" encoding:CP850 lineno:0
    > col_sep:"," row_sep:"\r\n" quote_char:"\"">
    >
    > While CSV.open('test.log','r',:row_sep => '\r\n') shows result:
    > <#CSV io_type:File io_path:"/log/test.log" encoding:CP850 lineno:0
    > col_sep:"," row_sep:"\\r\\n" quote_char:"\"">
    >
    > The double backslashes make me question if the escape character is being
    > processed correctly. I am relatively new to Ruby, am I using the
    > language incorrectly or is this a bug?


    You have a misunderstanding of Ruby Strings. Double quotes allow for
    escapes like \r or \n, but single quotes do not. You've set the
    :row_sep to literally backslash, r, backslash, and n.
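
The difference is easy to check in irb. A short sketch (the variable names are only for illustration):

```ruby
single = '\r\n' # four characters: backslash, r, backslash, n
double = "\r\n" # two characters: carriage return, line feed

p single.length # => 4
p double.length # => 2
p single.bytes  # => [92, 114, 92, 110]
p double.bytes  # => [13, 10]
```

Inspecting the single-quoted string prints it with doubled backslashes, which is where the row_sep:"\\r\\n" in the output above comes from.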

    I imagine all you need to do is switch your open() call to:

    CSV.open('path/to/file')

    The library should take it from there.
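
As a rough sketch of how that might look with some cursor movement afterwards: the thread does not say what manipulation is needed, so rewinding is only an assumed example, and the temp file and `sample` name are made up for illustration:

```ruby
require "csv"
require "tempfile"

# Hypothetical sample file; binmode keeps the bytes exactly "\r\n" on any platform.
sample = Tempfile.new(["sample", ".csv"])
sample.binmode
sample.write("field1,field2\r\nfield3,field4\r\n")
sample.close

csv = CSV.open(sample.path) # no mode argument
first_row = csv.shift       # read a single row
csv.rewind                  # move the underlying file back to the start
all_rows = csv.read         # read every row from the beginning again
csv.close
sample.unlink

p first_row
p all_rows
```

The object returned by open() stays available for calls like rewind because no block is passed; with a block, the file is closed when the block ends.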

    Hope that helps.

    James Edward Gray II
     
    James Edward Gray II, Nov 4, 2010
    #4
  5. Kenny Lam

    Kenny Lam Guest

    Kenny Lam, Nov 4, 2010
    #5
  6. On Nov 4, 2010, at 2:52 PM, Kenny Lam wrote:

    > Excellent, that works perfectly. Thanks a lot for your help.


    My pleasure.

    James Edward Gray II
     
    James Edward Gray II, Nov 4, 2010
    #6
