julie_smith
Hi,
I have an articles table with columns such as
id, name, author, section, creationdate, description, longmatter, etc.
I am using MySQL.
Some of the columns are fixed-value fields (enumerations):
for example, section will hold news, sports, politics, etc.,
while description is a text field holding any amount of arbitrary
text.
I now have 50,000 articles under different sections,
and I want to implement a "similar articles" feature.
By this I mean that when an article is shown,
I want to display all the articles similar to it (10
per page).
How do I calculate the similarity of one article with all 50,000
articles?
I don't want articles from the same section only.
Since the search result has to be very fast,
can I create some algorithm that looks through all the fields in
each row of the
articles table and assigns a weight/checksum to that row?
Then, in the "similar articles" part, I would display all the articles
whose checksum is within +-5 of the checksum of the
currently displayed article.
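To make the idea concrete, here is a rough sketch of what I have in mind, written in Python only for illustration. Everything in it is an assumption on my part: the section codes, the way row_score combines the fields, and the function names are all made up, and I do not know whether a score like this really reflects similarity.

```python
# Hypothetical sketch: every name and weight below is made up for illustration.
from bisect import bisect_left, bisect_right

# Assumed numeric codes for the enumerated "section" field.
SECTION_CODE = {"news": 10, "sports": 20, "politics": 30}


def row_score(article):
    """Collapse one row into a single number (the 'weight/checksum')."""
    words = article["description"].lower().split()
    # Assumed weighting: section code plus the number of distinct words.
    return SECTION_CODE[article["section"]] + len(set(words))


def build_index(articles):
    """Precompute (score, id) pairs once, sorted by score."""
    return sorted((row_score(a), a["id"]) for a in articles)


def similar_ids(index, article, tolerance=5):
    """Return ids whose score is within +-tolerance of this article's score."""
    score = row_score(article)
    # Binary search for both ends of the score window in the sorted index.
    lo = bisect_left(index, (score - tolerance, float("-inf")))
    hi = bisect_right(index, (score + tolerance, float("inf")))
    # Leave out the article that is currently being displayed.
    return [aid for _, aid in index[lo:hi] if aid != article["id"]]
```

In MySQL I imagine the score would be stored in an indexed numeric column and the lookup would be a BETWEEN query on that column, paged 10 rows at a time.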
Thanks in advance,
Julie