faster_csv vs File+split, why is it not faster?

Pablo Q.

[Note: parts of this message were removed to make it a legal post.]

Hi folks,

Why am I getting this result? Is it due just to this specific problem?

The file has 293858 records; here are some sample records:


"MARCOS, LUIS","547 N LAKE ST","","MUNDELEIN","IL","000000000"
"BALDWIN, T & S","4732 NE 203RD ST","","LAKE FOREST PARK","WA","000000000"
"RYBOLT, C","401 CEDAR DR","","CLINTON","IL","000000000"
"WELDT, KRISTINA","1945 N ORLEANS ST","","MCHENRY","IL","000000000"
.....

CODE

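require 'benchmark'
require 'rubygems'
require 'faster_csv'

Benchmark.bm do |x|
  x.report do
    # Full CSV parse of every record
    FasterCSV.foreach("data_test/match.csv") do |row|
    end
  end
end

Benchmark.bm do |x|
  x.report do
    # Naive parse: assumes every field is quoted and each record is one line
    File.foreach("data_test/match.csv") do |line|
      row = line.chomp.split("\",\"", -1)
      row[0].gsub!('"', '')    # strip the leading quote of the first field
      row[-1].gsub!('"', '')   # strip the trailing quote of the last field
    end
  end
end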


RESULTS

      user     system      total        real
 16.180000   0.740000  16.920000 ( 17.246190)
      user     system      total        real
  5.830000   0.120000   5.950000 (  6.028469)

Is this true?
 
James Gray

Is it true that reading the file and calling String#split is faster
than FasterCSV? Yeah, I bet it is. Likely reasons are:

* String#split is written in C, while FasterCSV is pure Ruby
* The split code doesn't handle all types of CSV data, so it has less work to do

To give some examples, your split code can't correctly parse this valid CSV data:

no,quotes

Or this:

"embedded
newlines"
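
The two cases above can be checked with a minimal sketch. It assumes Ruby 1.9 or later, where the standard `csv` library (the successor to FasterCSV) is available; `naive_split` is a hypothetical helper that mirrors the split code from the original post, not part of any library.

```ruby
require 'csv'

# Hypothetical helper mirroring the split approach from the original post.
# It assumes every field is double-quoted and each record sits on one line.
def naive_split(line)
  row = line.chomp.split('","', -1)
  row[0]  = row[0].delete('"')   # strip the leading quote of the first field
  row[-1] = row[-1].delete('"')  # strip the trailing quote of the last field
  row
end

quoted   = %("RYBOLT, C","401 CEDAR DR","","CLINTON","IL","000000000"\n)
unquoted = "no,quotes\n"
embedded = %("embedded\nnewlines"\n)

# Fully quoted, single-line record: both approaches return the same fields.
naive_quoted = naive_split(quoted)
csv_quoted   = CSV.parse_line(quoted)
puts naive_quoted == csv_quoted

# Unquoted fields: the naive split never finds its '","' separator.
p naive_split(unquoted)
p CSV.parse_line(unquoted)

# Embedded newline: reading line by line cuts one record into two rows.
naive_rows = embedded.each_line.map { |l| naive_split(l) }
p naive_rows
p CSV.parse(embedded)
```

On data like the sample records, where every field is quoted and fits on one line, both approaches return the same fields, which is why the shortcut appears to work for that file.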

Hope that explains things a bit.

James Edward Gray II
 
Pablo Q.

[Note: parts of this message were removed to make it a legal post.]

I thought so...

I'm just comparing code that handles a single special case against the
full implementation of the library.

Thank you for your time!
 
