[Search Engine - Internal site] DB or not DB ?

Discussion in 'ASP .Net' started by rs, Jun 8, 2006.

  1. rs

    rs Guest

    Hello,

    I have a site with more than 15,000 pages.
    Almost all of the content on each page is text.
    Each page is about 10-25 KB.

    I need to build an internal search engine
    using ASP.NET code.


    Which is the better way?

    1)


    Create a database (I have SQL Server 2005 Express)
    with a table containing 5 columns:
    Id, page-link, page-title, keywords, and the full text content of the page

    Example row:
    05
    /Einstein.htm
    Einstein life
    birth, death
    Einstein was born in... and he won the Nobel prize... and died in
    Princeton.

    Then query the database using SELECT
    with CONTAINS (on the 5th column)
    and output the results with
    Me.Response.Write WhatIFound
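
A minimal sketch of option 1, written in Python with SQLite so that it runs without a server. Everything here is an assumption for illustration: the table name `pages`, the column names, and the helper `search_pages` are hypothetical, and SQLite's `LIKE` only stands in for SQL Server's `CONTAINS`, which needs a full-text index on the searched column.

```python
import sqlite3

# Hypothetical schema mirroring the five columns described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE pages (
        id       INTEGER PRIMARY KEY,
        link     TEXT NOT NULL,
        title    TEXT NOT NULL,
        keywords TEXT NOT NULL,
        body     TEXT NOT NULL
    )
    """
)
conn.execute(
    "INSERT INTO pages (id, link, title, keywords, body) VALUES (?, ?, ?, ?, ?)",
    (5, "/Einstein.htm", "Einstein life", "birth, death",
     "Einstein was born in... and he won the Nobel prize..."),
)


def search_pages(conn, term):
    """Return (link, title) pairs for pages whose body contains the term.

    LIKE scans every row; it is only a stand-in for a full-text predicate.
    The term is bound as a parameter, never concatenated into the SQL.
    """
    # Escape LIKE wildcards so user input is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT link, title FROM pages WHERE body LIKE ? ESCAPE '\\' ORDER BY id",
        (f"%{escaped}%",),
    )
    return rows.fetchall()


print(search_pages(conn, "Nobel"))
```

Whatever database is used, binding the search term as a parameter instead of building the SQL string by hand keeps user input from being interpreted as SQL.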



    or



    2)


    Use no database
    and search within the page tags (Title, Keywords, Body),
    I presume using regular expressions and StringBuilder,
    and then output the results with
    Me.Response.Write WhatIFound
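
A rough sketch of option 2, again in Python for illustration only. The regular expressions, the helper names `extract_fields` and `scan_pages`, and the sample page are assumptions; real pages with unusual markup would need a proper HTML parser, since regular expressions are fragile on HTML.

```python
import re

# Simplified patterns; they assume well-formed, conventional markup.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)
KEYWORDS_RE = re.compile(
    r'<meta\s+name=["\']keywords["\']\s+content=["\'](.*?)["\']',
    re.IGNORECASE | re.DOTALL,
)
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def extract_fields(html):
    """Pull title, keywords and tag-free body text out of one page."""
    def first(pattern):
        match = pattern.search(html)
        return match.group(1).strip() if match else ""

    # Replace inner tags with spaces, then collapse runs of whitespace.
    body_text = " ".join(TAG_RE.sub(" ", first(BODY_RE)).split())
    return {
        "title": first(TITLE_RE),
        "keywords": first(KEYWORDS_RE),
        "body": body_text,
    }


def scan_pages(pages, term):
    """pages maps link -> raw HTML; returns (link, title) for each hit."""
    needle = term.lower()
    hits = []
    for link, html in pages.items():
        fields = extract_fields(html)
        if any(needle in value.lower() for value in fields.values()):
            hits.append((link, fields["title"]))
    return hits


# Hypothetical sample page.
SAMPLE_PAGES = {
    "/Einstein.htm": (
        "<html><head><title>Einstein life</title>"
        '<meta name="keywords" content="birth, death"></head>'
        "<body><p>Einstein won the <b>Nobel</b> prize.</p></body></html>"
    ),
}

print(scan_pages(SAMPLE_PAGES, "nobel"))
```

Note that this approach re-reads and re-parses every page on every query. With 15,000 pages of 10-25 KB each, that is roughly 150-375 MB of HTML per search unless the extracted text is cached somewhere.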



    -----------------

    Which of the two methods is better?

    Also, any suggestions, optimizations, or advice about
    either method are welcome.

    -------------------


    Thanks
     
    rs, Jun 8, 2006
    #1

  2. rs

    Guest

    Re: DB or not DB ?

    Ask yourself WWGD (what would Google do). You definitely need some
    sort of indexing tool that spiders the pages, so that content changes
    are picked up, and stores the indexed results in a database. That
    said, I wouldn't reinvent the wheel here. There are plenty of
    third-party tools that do exactly what you want. Just search Google for
    "intranet search engine".
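
To make the indexing idea concrete, here is a minimal sketch of an inverted index in Python. It is not what any particular third-party tool does; the function names `tokenize`, `build_index` and `lookup`, and the two sample pages, are hypothetical. The index is built once (or whenever pages change), and each query then touches only the index, not the pages.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return WORD_RE.findall(text.lower())


def build_index(pages):
    """pages maps link -> plain text; returns word -> set of links."""
    index = defaultdict(set)
    for link, text in pages.items():
        for word in tokenize(text):
            index[word].add(link)
    return index


def lookup(index, query):
    """Return the sorted links of pages containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


# Hypothetical sample data standing in for the spidered pages.
pages = {
    "/Einstein.htm": "Einstein won the Nobel prize",
    "/Other.htm": "Another page about the Nobel prize",
}
index = build_index(pages)
print(lookup(index, "nobel prize"))
print(lookup(index, "einstein nobel"))
```

In a real setup the word-to-pages mapping would be stored in a database table rather than in memory, so it survives restarts and can be rebuilt by a scheduled indexing job.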


     
    , Jun 9, 2006
    #2

  3. rs

    rs Guest

    Re: DB or not DB ?

    I will not add many pages (5-10 pages a year),
    so indexing is not a problem.

    I'm a new programmer and want to learn.

    I'd like to get technical information
    about sizes, speed, queries, caching...
    and then decide which of the two methods is better.


     
    rs, Jun 9, 2006
    #3
  4. rs

    Guest

    Re: DB or not DB ?

    You want to automate the indexing here: the flexibility it gives you
    is well worth the effort of building it. Store your collection/indexing
    results in a database, and querying, caching, speed and sizes will be
    handled for you (you can learn about database tuning here, a piece of
    knowledge almost all programmers should have). You can use a built-in
    text searching mechanism (every RDBMS that I know of has one) or write
    (or reuse) an implementation of any of the string searching algorithms
    out there. Make sure you abstract whatever implementation you choose
    for each part (collection, indexing, searching, etc.) as much as
    possible so you can modify things as needed (e.g. plugging in a
    different search algorithm, database, etc.).
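
One way to read the advice about abstraction, sketched in Python under assumptions: the interface `SearchBackend`, the two backends, and `render_results` are all hypothetical names. The page-rendering code depends only on the interface, so a different search algorithm or a database-backed implementation can be swapped in without touching it.

```python
from typing import Protocol


class SearchBackend(Protocol):
    """Anything that can turn a search term into a list of page links."""

    def search(self, term: str) -> list[str]: ...


class SubstringBackend:
    """Scans the stored text of every page for the term as a substring."""

    def __init__(self, pages: dict[str, str]) -> None:
        self._pages = pages

    def search(self, term: str) -> list[str]:
        needle = term.lower()
        return sorted(
            link for link, text in self._pages.items() if needle in text.lower()
        )


class WordSetBackend:
    """Precomputes a word set per page and matches whole words only."""

    def __init__(self, pages: dict[str, str]) -> None:
        self._words = {
            link: set(text.lower().split()) for link, text in pages.items()
        }

    def search(self, term: str) -> list[str]:
        needle = term.lower()
        return sorted(
            link for link, words in self._words.items() if needle in words
        )


def render_results(backend: SearchBackend, term: str) -> str:
    """Format results; knows nothing about how the search is done."""
    links = backend.search(term)
    return "\n".join(links) if links else "No results"


# Hypothetical sample data.
pages = {"/Einstein.htm": "Einstein won the Nobel prize"}
print(render_results(SubstringBackend(pages), "nob"))
print(render_results(WordSetBackend(pages), "nob"))
```

The two backends deliberately give different answers for a partial word, which shows why keeping the choice behind one interface makes it easy to compare and replace implementations later.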


     
    , Jun 9, 2006
    #4