Convert text string i.e 'Peter' into integer ID

J

Justus Ohlhaver

Ken said:
I'm going to suggest what Todd Benson and Rolando Abarca suggested,
which
is to just work with strings in the database. Don't bother with
computing
some kind of (possibly unique) hash. Use a CREATE INDEX statement to
index the headline field, and you'll probably never notice a speed
difference between your roundabout method and feeding in the string
directly to the database.

--Ken

Thanks again for all you help everyone!

I have made one small test already using an additional integer column
instead of the original headline string. To convert the headline string
into an integer value I used the .hash method. The db I'm using is
mysql. Using a very small sample of entries (about 1000) I found
virtually no difference at all in the time it took to check the entire
table for existing entries when comparing using the string column vs.
using the integer column for all searches. If there is any difference in
time it takes it would be less than 1%. Considering that there is an
additional computation (.hash method) being performed when using the
integer column one could maybe assume that the latter - the integer
column - by itself must slightly faster for the database to check. In
any case I am going to stick with the original string column for the
headline field for now.

I will try to optimize the table indexing the headline field as
suggested. One question regarding this: Can this be done from rails or
are these mysql commands ('CREATE INDEX' etc.)?

Thanks again for all the help!
Justus







and found virtually no difference in the time it took to compare about
100,000 rows in MySQL.
using an integer value which was derived using 'headline'.hash

Again, thanks everybody
 
P

Peter Szinek

I will try to optimize the table indexing the headline field as
suggested. One question regarding this: Can this be done from rails or
are these mysql commands ('CREATE INDEX' etc.)?


Yeah, you can do it from Rails migrations:

http://apidock.com/rails/ActiveRecord/Migration, section 'Available
transformations':

add_index(table_name, column_names, options):
Adds a new index with the name of the column. Other options
include :name and :unique (e.g. { :name => "users_name_index", :unique
=> true }).

Cheers,
Peter
 
L

Lloyd Linklater

Justus said:
Todd, I was told that searching by integers instead of strings would
speed up performance when using large mysql tables. Is that not so?

Well, it depends.

Searching for numbers can be faster only if the numbers are sorted. For
this you would usually have an index. This way, you halfsplit your way
to the number you want.

i.e. Go the middle: is it bigger or smaller? That narrows your search
by half with one check. Then go to the middle of the half it will be in
and repeat.

The problem with headlines is that you are not necessarily searching for
a whole headline.

e.g. searching for "man on the moon" might not find "There is a man on
the moon again." Then, there is case sensitivity which will affect
conversions to numbers often enough.

You could have what many websites have: a key word search. You could
have a list of key words, "man", "moon", etc., in another file and have
that column indexed. Then, you search for keywords and yield all
articles that have all of them. That would be fast and cover a lot of
bases.

IHTH
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,774
Messages
2,569,599
Members
45,165
Latest member
JavierBrak
Top