Complex CSV parsing

Stuart Clarke

I am trying to parse data from a file where the values are comma
separated. However, there is slightly more to the file than just commas;
see below:

for (;;);{"t":"msg","c":"p_114000000","ms":[{"type":"msg","msg":{"text":"you around"}]}

From reading around, FasterCSV seems to be the way forward, so I have
this code:

require 'rubygems'
require 'faster_csv'

FasterCSV.foreach("C:\\Documents and Settings\\sjc\\Desktop\\p_1149549999=2[1].txt", :row_sep => ",") do |row|
  puts row[0]
  break
end

However, I am getting an error like this:

C:/Program Files/ruby/lib/ruby/gems/1.8/gems/fastercsv-1.4.0/lib/faster_csv.rb:1650:in `shift': Illegal quoting on line 1. (FasterCSV::MalformedCSVError)
from C:/Program Files/ruby/lib/ruby/gems/1.8/gems/fastercsv-1.4.0/lib/faster_csv.rb:1568:in `loop'
from C:/Program Files/ruby/lib/ruby/gems/1.8/gems/fastercsv-1.4.0/lib/faster_csv.rb:1568:in `shift'
from C:/Program Files/ruby/lib/ruby/gems/1.8/gems/fastercsv-1.4.0/lib/faster_csv.rb:1513:in `each'
from C:/Program Files/ruby/lib/ruby/gems/1.8/gems/fastercsv-1.4.0/lib/faster_csv.rb:1017:in `foreach'
from C:/Program Files/ruby/lib/ruby/gems/1.8/gems/fastercsv-1.4.0/lib/faster_csv.rb:1191:in `open'
from C:/Program Files/ruby/lib/ruby/gems/1.8/gems/fastercsv-1.4.0/lib/faster_csv.rb:1016:in `foreach'
from C:/Documents and Settings/sjc/Desktop/test.rb:4

Can anyone help me out with this?

Many thanks
 
Brian Candler

Stuart said:
I am trying to parse data from a file where the values are comma
separated. However, there is slightly more to the file than just commas;
see below:

for (;;);{"t":"msg","c":"p_114000000","ms":[{"type":"msg","msg":{"text":"you around"}]}

That is not valid CSV, so FasterCSV won't help you.

I think you'll need to describe more carefully how you want this input
line broken up, and give some more examples.

It looks to me like a nested structure. If every line has exactly the
same set of fields you may get away with a regexp. But if not, you may
have to write a full-blown parser for this language.
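
As a rough illustration of the regexp route, here is a minimal sketch. It assumes every line carries a "c" field followed later by a "text" field, and that neither value contains an escaped double quote. The variable names and the choice of fields to extract are hypothetical; the sample line is the one from the question, with the wrapped text joined by a space.

```ruby
# Sample line as posted in the question (wrapped text joined with a space).
line = 'for (;;);{"t":"msg","c":"p_114000000","ms":[{"type":"msg","msg":{"text":"you around"}]}'

# Assumption: "c" always appears before "text", and the values never
# contain an escaped double quote. Any other layout will not match.
pattern = /"c":"([^"]*)".*"text":"([^"]*)"/

match = pattern.match(line)
if match
  channel = match[1]  # value of the "c" field
  text = match[2]     # value of the "text" field
  puts "#{channel}: #{text}"
else
  puts "line does not have the expected fields"
end
```

This only holds up while the field order and nesting stay fixed; as soon as lines vary, a real parser is the safer choice.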

However, this may be sufficiently close to JSON that you could use an
existing JSON parser. http://www.json.org/

(But in that case, I don't know what the "for(;;);" is doing on the
front)
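
If the data does turn out to be JSON with a fixed prefix, something like the following sketch might work: strip the leading "for (;;);" and hand the rest to a JSON parser. Two assumptions are baked in. First, the sample line as posted opens one more brace than it closes, so the line below adds a closing brace on the guess that it was lost when the message was pasted. Second, it uses `require 'json'`, which ships with Ruby 1.9 and later; on Ruby 1.8 the json gem would need to be installed first.

```ruby
require 'json'

# Hypothetical sample: the line from the question with one closing brace
# added after the "text" object so that the braces balance.
line = 'for (;;);{"t":"msg","c":"p_114000000","ms":[{"type":"msg","msg":{"text":"you around"}}]}'

# Remove the leading "for (;;);" (with or without the space), if present.
payload = line.sub(/\Afor\s*\(;;\);/, '')

# Parse what is left as JSON; this raises JSON::ParserError on bad input.
data = JSON.parse(payload)

puts data['t']
puts data['ms'][0]['msg']['text']
```

The result is an ordinary Ruby Hash, so nested values are reached with plain indexing rather than by splitting on commas.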
 
Mark Thomas

Stuart said:
I am trying to parse data from a file where the values are comma
separated. However, there is slightly more to the file than just commas;
see below:

for (;;);{"t":"msg","c":"p_114000000","ms":[{"type":"msg","msg":{"text":"you around"}]}

That's not a CSV file. It looks like some sort of serialized data
structure. If you know how it was serialized, you should be able to
easily restore the structure.

If not, you can use Treetop to specify the grammar including the
balanced delimiters {}, (), and [], which apparently take precedence
over the commas, and perform the parsing.
 
Ollivier Robert

Stuart said:
I am trying to parse data from a file where the values are comma
separated. However, there is slightly more to the file than just commas;
see below:

for (;;);{"t":"msg","c":"p_114000000","ms":[{"type":"msg","msg":{"text":"you around"}]}

It looks more like JSON than CSV; try using a JSON parser instead.
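
When running a JSON parser over a whole file of such lines, it may help to collect the lines that fail rather than stop at the first one. The sketch below assumes one record per line and a "for (;;);" prefix on each; the `lines` array stands in for the file contents and the first entry is made up. The second entry is the line exactly as posted, which is expected to be rejected because its braces do not balance.

```ruby
require 'json'

# Hypothetical input standing in for the lines of the file.
lines = [
  'for (;;);{"t":"msg","c":"p_1","ms":[{"type":"msg","msg":{"text":"hello"}}]}',
  'for (;;);{"t":"msg","c":"p_114000000","ms":[{"type":"msg","msg":{"text":"you around"}]}'
]

parsed = []
rejected = []

lines.each_with_index do |line, index|
  begin
    # Strip the "for (;;);" prefix, then parse the remainder as JSON.
    parsed << JSON.parse(line.sub(/\Afor\s*\(;;\);/, ''))
  rescue JSON::ParserError => e
    # Record the 1-based line number and the parser's message for later review.
    rejected << [index + 1, e.message]
  end
end

puts "parsed #{parsed.size}, rejected #{rejected.size}"
```

Reading a real file would swap the array for something like `File.foreach(path)`, with `path` being whatever file is being processed.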
 
