Search and Asp.Net

shapper

Hello,

I am working on a web site which displays various information from a
database:

1. Text articles

2. PDF files for download. The links and displayed information are also
retrieved from a database.

I need to create a search engine on my web site.

I am using Asp.Net 2.0 and SQL 2005.

Could someone give me some tips?

What are the options in the market, in terms of search in Asp.Net?

And what about Google search?

Anyway, I am completely ignorant about this, in terms of development, so
any help will be useful.

Thanks,

Miguel
 
Alec MacLean

Hi Miguel,

If I've understood you correctly, you have a database containing various PDF
files and the other data values you mention?

If this is the case then your search would need to be conducted on the SQL
content. You could use a parameterised query to do this, but I can see
there being an issue with trying to search the content of the PDF files, as
these would be held as image objects.

Unless you have some other meta data fields that indicate the content of the
PDFs?

Anyway, general approach to searching the "links and displayed information"
would be to do a SELECT using the LIKE operator:

CREATE PROCEDURE usp_SearchMyDB
@Criteria VARCHAR(15)
AS
SELECT RecordID, myDescription, myLink
FROM myTableName
WHERE myDescription LIKE '%' + @Criteria + '%'

This uses a wildcard wrap around the criteria value to allow greater
matching opportunity.

You should probably also take steps to prevent the @Criteria value from
carrying malicious code (SQL injection). Passing it as a sproc parameter,
as above, mitigates this somewhat, provided the sproc does not build
dynamic SQL from it. (I'm not a DBA, so a post to the SQL newsgroups will
probably yield better advice.)
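
As a rough illustration of the parameterised approach, here is a minimal Python sketch. It uses the standard sqlite3 module and an in-memory table purely as a stand-in for SQL Server, so the demo rows and the helper names (escape_like, search, make_demo_db) are made up for the example. The table and column names are the ones from the sproc above. The idea it shows is that the search text is bound as a parameter, and that "%" and "_" typed by the user are escaped so they match literally.

```python
import sqlite3

def escape_like(term, escape="\\"):
    """Escape LIKE wildcards so the user's text is matched literally."""
    # Escape the escape character first, then the two standard wildcards.
    # (SQL Server's LIKE also treats "[" specially; handle it there as well.)
    for ch in (escape, "%", "_"):
        term = term.replace(ch, escape + ch)
    return term

def search(conn, criteria):
    """Run the wildcard-wrapped LIKE search with a bound parameter."""
    pattern = "%" + escape_like(criteria) + "%"
    sql = (
        "SELECT RecordID, myDescription, myLink "
        "FROM myTableName "
        "WHERE myDescription LIKE ? ESCAPE '\\'"
    )
    # The pattern travels as a parameter, never as part of the SQL text.
    return conn.execute(sql, (pattern,)).fetchall()

def make_demo_db():
    """Build a throwaway in-memory table with made-up rows (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE myTableName ("
        "RecordID INTEGER PRIMARY KEY, myDescription TEXT, myLink TEXT)"
    )
    conn.executemany(
        "INSERT INTO myTableName VALUES (?, ?, ?)",
        [
            (1, "Map of England", "/files/england.pdf"),
            (2, "Guide: 100% recycled paper", "/files/paper.pdf"),
            (3, "London street map", "/files/london.pdf"),
        ],
    )
    return conn

if __name__ == "__main__":
    conn = make_demo_db()
    for row in search(conn, "England"):
        print(row)
```

With ADO.NET the same idea would presumably be expressed by adding @Criteria as a command parameter rather than concatenating it into the command text.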

Anyway, hope that helps.

Al
 
Van den Driessche Willy

The easiest way is to rely on robots like Google to do the work.

If you want to include a search of your own, you can use Microsoft Indexing
Service to do the work for you.
Indexing Service must be properly configured on your machine
(normally it is).

Then you can query the catalog using normal ADO.NET objects:

http://idunno.org/dotNet/indexserver.aspx

(Note: for PDF files you need to download a free iFilter implementation from
Adobe (http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611).)

In my experience, querying the catalog is easy. The difficulty is finding the
most useful query for multiple-keyword searches without bothering the user
too much with the details.

(You can always test a query beforehand in the MMC Indexing Service snap-in
under Control Panel/Administrative tools/system.)

Also, do not forget to specify which directories you want to search because
every file in every directory gets included by default.

Lastly, Microsoft has a newsgroup devoted to Indexing Service.

(I used Indexing Service for www.fwo.be (still in beta) and I find it
works quite well (and fast, but it is not on the home page yet).)

Hope this helps.
 
shapper

Hi,

I will have a keywords field for every file or article on my web site.

So in this first step querying the database will be enough ... I think.

I want to place a search text box in my web site.

However, I have a problem. I would like to make it possible for people to
use something like:

England AND Maps, England NOT London, "New York Tourism", ...

Just like in Google. Do you think this would be easy?

Is something like that done?

Basically, I need to identify which parts are the search keywords and which
are operations like "AND", "NOT", etc.

And, of course, transform that into a query to the database.

This was my first idea ... but I feel it will be really complicated to
do.

Thanks,
Miguel
 
Van den Driessche Willy

So you have more of a parsing problem (which should probably go to a
parser newsgroup).
Parsing is not so difficult, but then that goes for most things once you've
done them more than once.

What you need is first of all a specification of your "language", just like
a programming language syntax.

You will specify the lexical tokens, probably "word", "and", "or", "not",
"comma", "OpenParentheses", "CloseParentheses".
You will write a lexical analyzer that will convert a "character stream"
into a "token stream" (a lexical analyzer has everything to do with regular
expressions, so you might want to use the Regex class for this).
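
To make the lexical analyzer step concrete, here is a minimal sketch in Python (the same idea carries over to the .NET Regex class). The token names, the decision to treat only upper-case AND/OR/NOT as operators, and the extra PHRASE token for quoted text are all assumptions made for the example, not part of any specification.

```python
import re

# One alternative per token kind; the group name becomes the token kind.
TOKEN_RE = re.compile(r'''
    (?P<PHRASE>"[^"]*")        # quoted phrase, e.g. "New York Tourism"
  | (?P<OPEN>\()               # OpenParentheses
  | (?P<CLOSE>\))              # CloseParentheses
  | (?P<COMMA>,)               # comma
  | (?P<WORD>[^\s(),"]+)       # anything else up to a delimiter
''', re.VERBOSE)

# Assumption: operators must be written in upper case, as in the examples.
KEYWORDS = {"AND", "OR", "NOT"}

def tokenize(text):
    """Convert a character stream into a list of (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1                      # whitespace only separates tokens
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:                 # e.g. an unterminated quote
            raise ValueError(
                f"unexpected character {text[pos]!r} at position {pos}")
        kind, value = match.lastgroup, match.group()
        if kind == "PHRASE":
            value = value[1:-1]           # drop the surrounding quotes
        elif kind == "WORD" and value in KEYWORDS:
            kind = value                  # promote AND / OR / NOT to operators
        tokens.append((kind, value))
        pos = match.end()
    return tokens

if __name__ == "__main__":
    for token in tokenize('England AND Maps, (London OR "New York Tourism")'):
        print(token)
```

Raising an error on unexpected input, rather than silently skipping it, makes it easier to tell the user what was wrong with the search text.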

Using these you will come up with a syntax for your language (this is off the
top of my head, so please check that it is what you want):
Query ::= AndFactor [ #comma# Query ]
AndFactor ::= OrFactor [ #AND# AndFactor ]
OrFactor ::= NotFactor [ #OR# OrFactor ]
NotFactor ::= WordList | #NOT# NotFactor
WordList ::= Word [ WordList ] | #OpenParentheses# Query #CloseParentheses#

This will most probably result in a typical parse tree. That parse tree will
then have to be "unparsed" into an SQL substring.

On the other hand, you can probably google-find a nice working parser
already made for you.

Hope this helps
 
