Changing the quote-character in csv parsing

Discussion in 'Ruby' started by Jens Auer, Mar 28, 2006.

  1. Jens Auer

    Jens Auer Guest

    I have a bunch of files containing lines of comma-separated values.
    Unfortunately, the character used for quoting is a single quote (') and
    not the double quote ("). How can I tell the csv library (or fastercsv
    or any other csv-parsing library) which character is used for quoting?
    Some of the files contain fields like 'quoted, but with comma', which
    are split into two fields at the comma:
    irb(main):003:0> line = "one, 'quoted', 'quoted, but with comma'"
    => "one, 'quoted', 'quoted, but with comma'"
    irb(main):006:0> CSV::parse_line('some words "some quoted text" some
    more words', ' ')
    => ["some", "words", "some quoted text", "some", "more", "words"]
    irb(main):001:0> require 'rubygems'
    => true
    irb(main):002:0> require_gem 'fastercsv'
    => true
    irb(main):004:0> line.parse_csv
    => ["one", " 'quoted'", " 'quoted", " but with comma'"]

    The output should be ["one", "'quoted'", "'quoted, but with comma'"]

    I have already searched the rdoc for the csv library without any success.
    Jens Auer, Mar 28, 2006
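
For readers arriving at this thread later: FasterCSV became Ruby's standard csv library in Ruby 1.9, and that library accepts a `quote_char` option. The following is a minimal sketch, assuming a current Ruby with the csv library available; it is not what was on offer in csv.rb at the time of the thread. The variable names are illustrative only. Because the line in the question has a space after each comma, the second call assumes that ", " can be treated as the column separator.

```ruby
require 'csv'

# Same fields as in the question, but without spaces after the commas.
line = "one,'quoted','quoted, but with comma'"

# quote_char names the character that wraps quoted fields.
fields = CSV.parse_line(line, quote_char: "'")
p fields

# The question's line has a space after each comma. Treating ", " as the
# separator keeps the quote at the very start of each field.
spaced = "one, 'quoted', 'quoted, but with comma'"
spaced_fields = CSV.parse_line(spaced, col_sep: ", ", quote_char: "'")
p spaced_fields
```

Note that the parser strips the surrounding single quotes from each field, so the result holds `quoted, but with comma` rather than `'quoted, but with comma'` as written in the desired output above.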

  2. Jan Topinski

    Jan Topinski Guest

    I don't know how it is with fastercsv, but as far as I understand csv.rb
    has the double quote hardcoded. I think the best option is to swap all
    double quotes in your text with single quotes and vice versa. You can do
    it with gsub this way:

    # Swap every single quote for a double quote and vice versa, in place.
    line.gsub!(/'|"/) { |c| c == "'" ? '"' : "'" }

    Jan Topinski, Mar 28, 2006

  3. "He said, \"I don't care.\"".tr("'\"", "\"'")
    William James, Mar 28, 2006
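
The quote-swapping idea from the two replies above can be sketched end to end as follows. This is only an illustration under stated assumptions: `swap_quotes` and `parse_single_quoted` are hypothetical helper names, not part of any library, and the input is assumed to have no spaces between a comma and the opening quote. Swapping back after parsing restores any double quotes that were part of the data.

```ruby
require 'csv'

# Hypothetical helper: exchange ' and " using String#tr, as suggested above.
def swap_quotes(text)
  text.tr("'\"", "\"'")
end

# Hypothetical helper: parse one line whose fields are quoted with '.
def parse_single_quoted(line)
  # After the swap the line uses ", which the default parser understands.
  fields = CSV.parse_line(swap_quotes(line))
  # Swap back so quotes inside the data return to their original form.
  # Empty unquoted fields come back as nil and are left alone.
  fields.map { |field| field.nil? ? nil : swap_quotes(field) }
end

p parse_single_quoted("one,'quoted','quoted, but with comma'")
p parse_single_quoted("one,'He said \"hi\", twice'")
```

The second call shows why the swap back matters: without it, the double quotes around hi would come out as single quotes.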
  4. Jan Topinski

    Jan Topinski Guest

    Oops, I must be blind ;)
    Jan Topinski, Mar 28, 2006