Gandalf
Hi all! I need to create a "site search" feature for a website. I would
like to create a service that can be pointed at a directory. It should
walk all subfolders, read all HTML, ASP, PHP, TXT and PDF files, and
create a table indexed by words. The most important requirements are:
1. It should index PDF files too. (The site contains many datasheets, so
this is crucial.)
2. It should not index markup keywords inside HTML and PDF files (so if
somebody searches for "green", it should only find "green
cables" and "green grass", but not <FONT COLOR="GREEN">).
Is there a library out there that can do this for me? I can easily
do all parts except parsing a file and gathering its keywords.
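To make requirement 2 concrete, here is a rough sketch of the behaviour for HTML only, assuming Python and its standard html.parser module. The names TextOnly, visible_words and build_index are made up for illustration, and PDF text extraction is not covered (that part would need a separate library).

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects visible text only; tag names and attributes never reach the index."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attributes (e.g. COLOR="GREEN") are ignored.
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def visible_words(html):
    """Return the lowercased words found in the visible text of an HTML string."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower())


def build_index(docs):
    """Map each word to the set of document names whose visible text contains it."""
    index = defaultdict(set)
    for name, html in docs.items():
        for word in visible_words(html):
            index[word].add(name)
    return index
```

With this, a page containing <FONT COLOR="GREEN">red cables</FONT> would be indexed under "red" and "cables" but not under "green", "font" or "color".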
Thanks in advance.
Laci 2.0