Special characters in csv header using fastercsv

J

John Mcleod

Hello all,
I'm kind of new to fastercsv and only have 2 months with Ruby on Rails.
So this my sound a little newbie.

First, I read a csv file (approx. 6,000 lines). This file comes from a
different department, so I have no control over the headers.

Second, while reading or parsing the file, the rows are inputted into a
database.

My problem is my database columns can not have special characters, but
the csv headers have the special characters.

Is there a way to remove the special characters before inserting in
database?

Here is a view of my code.

def import_irb_file
# set file name
file = params[:irb][:file]
rowcount = 0

Irb.transaction do
FasterCSV.parse(file,
:headers => true,
:header_converters => :symbol,
:converters => :all,
:encoding => 'u' ) do |row|
Irb.create!(row.to_hash)
rowcount += 1
end
end
# if successful then display, then redirect to index page
flash[:notice] = "Successfully added #{rowcount} project(s)."
redirect_to :action => :index

rescue => exception
file_name = params[:irb]['file'].original_filename
file_parts = params[:irb]['file'].original_filename.split('.')
ext = file_parts[1]

if ext != 'csv'
error = "CSV file is required"
else
error = ERB::Util.h(exception.to_s) # get the error and HTML
escape it
end
# If an exception in thrown, the transaction rolls back and we end
up in this
# rescue block

flash[:error] = "Error adding projects to IRB table. (#{error}).
Please try again. "

redirect_to :controller => 'irbs', :action => 'new'

end

Thank you for any advice.

JohnM
 
M

Marnen Laibow-Koser

John said:
Hello all,
I'm kind of new to fastercsv and only have 2 months with Ruby on Rails.
So this my sound a little newbie.

First, I read a csv file (approx. 6,000 lines). This file comes from a
different department, so I have no control over the headers.

Second, while reading or parsing the file, the rows are inputted into a
database.

My problem is my database columns can not have special characters,
[...]

Why can't they? And are you creating a DB column for each column in the
CSV file?

Best,
 
J

John Mcleod

Hello Marnen,
Thank you for replying.

I would consider the non-use of special characters in a database column
title, a matter of good form.
The only special character I use in database column titles is an
underscore and sometimes not that.

John
John said:
Hello all,
I'm kind of new to fastercsv and only have 2 months with Ruby on Rails.
So this my sound a little newbie.

First, I read a csv file (approx. 6,000 lines). This file comes from a
different department, so I have no control over the headers.

Second, while reading or parsing the file, the rows are inputted into a
database.

My problem is my database columns can not have special characters,
[...]

Why can't they? And are you creating a DB column for each column in the
CSV file?

Best,
 
J

James Edward Gray II

Hello all,
Hello.

First, I read a csv file (approx. 6,000 lines). This file comes from a
different department, so I have no control over the headers.
=20
Second, while reading or parsing the file, the rows are inputted into = a
database.
=20
My problem is my database columns can not have special characters, but
the csv headers have the special characters.
=20
Is there a way to remove the special characters before inserting in
database?

Sure. FasterCSV support header_converters that can transform your =
headers in any way that you need. Can you show a sample header and what =
you would like it to become?

James Edward Gray II=
 
J

John Mcleod

James,
Thank you for replying.

sample headers:
Q.20C - Population May Include (Target)
IRB#

Desired headers:
Q20C_Population_May_Include_Target
IRB_id

John
 
J

James Edward Gray II

James,
Thank you for replying.

sample headers:
Q.20C - Population May Include (Target)
IRB#

Desired headers:
Q20C_Population_May_Include_Target
IRB_id

Try adding this argument where faster CSV opens the file:

:header_converters => lambda { |h| h.tr(" ", "_").delete("^a-zA-Z0-9_") }

Hope that helps.

James Edward Gray II
 
M

Marnen Laibow-Koser

[Please do not top-post.]

John said:
Hello Marnen,
Thank you for replying.

I would consider the non-use of special characters in a database column
title, a matter of good form.
The only special character I use in database column titles is an
underscore and sometimes not that.

I agree with you when I'm creating the column names manually. But if
you're loading them dynamically from a CSV file, then it probably makes
sense to transform the names as little as possible


Best,
 
J

John Mcleod

I'm not sure if the placement is correct but I'm still getting "Error
adding projects to IRB table. (unknown attribute:
Q16_Research_Category_International). Please try again." errors.

Here's my updated code.

def import_irb_file
# set file name
file = params[:irb][:file]
rowcount = 0

Irb.transaction do
FasterCSV.parse(file,
:headers => true,
:header_converters => lambda { |h| h.tr(" ",
"_").delete("^a-zA-Z0-9_")},
:converters => :all ) do |row|
Irb.create!(row.to_hash)
rowcount += 1
end
end
# if successful then display, then redirect to index page
flash[:notice] = "Successfully added #{rowcount} project(s)."
redirect_to :action => :index

rescue => exception
file_name = params[:irb]['file'].original_filename
file_parts = params[:irb]['file'].original_filename.split('.')
ext = file_parts[1]

if ext != 'csv'
error = "CSV file is required"
else
error = ERB::Util.h(exception.to_s) # get the error and HTML
escape it
end
# If an exception in thrown, the transaction rolls back and we end
up in this
# rescue block
flash[:error] = "Error adding projects to IRB table. (#{error}).
Please try again. "
redirect_to :controller => 'irbs', :action => 'new'
end

John
 
J

James Edward Gray II

I'm not sure if the placement is correct

You're placement looks fine to me.
but I'm still getting "Error=20
adding projects to IRB table. (unknown attribute:=20
Q16_Research_Category_International). Please try again." errors.

The name Q16_Research_Category_International doesn't have any characters =
in it. It looks like what you asked me for, but the table doesn't seem =
to have that column. So, you tell me what we did wrong. :)

James Edward Gray II=
 
J

John Mcleod

You are correct.
I need to check all my DB column titles. I updated the column title and
the next error was another DB column.

In your code...
:header_converters => lambda { |h| h.tr(" ",
"_").delete("^a-zA-Z0-9_")},

what is happening here?
h.tr(" ", "_").delete("^a-zA-Z0-9_#")

it looks like you're checking a table row for an instance of " " and
replacing with "_"
but I'm unsure of the ".delete("^a-zA-Z0-9_")"

John
 
A

Aldric Giacomoni

John said:
what is happening here?
h.tr(" ", "_").delete("^a-zA-Z0-9_#")

it looks like you're checking a table row for an instance of " " and
replacing with "_"
but I'm unsure of the ".delete("^a-zA-Z0-9_")"

John

http://rubular.com/ is your best friend.
 
J

James Edward Gray II

In your code...
:header_converters =3D> lambda { |h| h.tr(" ",=20
"_").delete("^a-zA-Z0-9_")},
=20
what is happening here?
h.tr(" ", "_").delete("^a-zA-Z0-9_#")
=20
it looks like you're checking a table row for an instance of " " and=20=
replacing with "_"

tr(), for transliterate, is used to replace characters. You are right =
that I'm using it to switch all spaces to underscores.
but I'm unsure of the ".delete("^a-zA-Z0-9_")"

delete() allows me to list characters I want to remove. It understands =
simple ranges, like a-z and 0-9. Also, if the first character is a ^, =
the entire character set is negated. Thus my call means delete all non =
letters, numbers, and underscore characters.

James Edward Gray II=
 
A

Aldric Giacomoni

James said:
I didn't use any regular expressions. ;)

James Edward Gray II

I'll be quiet now :) My brain parsed it correctly and categorized it
incorrectly.
 
J

John Mcleod

So to delete or remove say additional special characters, you would add
additional characters to the range?

h.tr(" ", "_").delete("^a-zA-Z0-9_#$-")
 
J

James Edward Gray II

So to delete or remove say additional special characters, you would = add=20
additional characters to the range?
=20
h.tr(" ", "_").delete("^a-zA-Z0-9_#$-")

As I said in my last message, the leading ^ means NOT. Thus, I deleted =
all characters that are NOT letters, numbers, or underscores. That =
includes characters like #, $, and -. Your change added those =
characters to the list NOT to delete, so they would now be skipped.

James Edward Gray II=
 
J

John Mcleod

Thanks for all the help.
After reading your posts and reading "FasterCSV" docs, I'm starting to
understand better.
This issue is solved.
Now on to another, thus a different post on a different FasterCSV
problem.

Thank you James.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Staff online

Members online

Forum statistics

Threads
473,764
Messages
2,569,566
Members
45,041
Latest member
RomeoFarnh

Latest Threads

Top