.NET and Oracle BLOB

Discussion in 'ASP .Net' started by Robert Vabo, Aug 20, 2003.

  1. Robert Vabo

    Robert Vabo Guest

    I have a database that is going to contain a lot of documents in
    .DOC, .TXT, .PPT, .PDF, etc. formats. I want to index the documents so that
    I can run a free-text search on the database table. I also want to insert
    and retrieve the documents using .NET (C# or VB.NET).

    Can anyone out there give me some tips, links, or other helpful hints?

    --
    Regards
    Robert Vabo
    Gecko AS
    www.gecko.no
    Robert Vabo, Aug 20, 2003
    #1

  2. Mark Kamoski

    Mark Kamoski Guest

    Robert--

    Hmmm. Well, you've asked a lot, actually.

    Here are some thoughts, because I find the question interesting.

    While storing documents in a database has often "seemed" like a good idea,
    the truth is that it is not. In short, a database is for storing data. A
    file system is for storing files. Sure, one can store binary data in a
    database, and maybe (just maybe) this is OK in a case or two, but the best
    place for documents seems to be the file system. That's the OS's job and it
    does it VERY well. One can get a DB to do it, but it is clunky at best.

    A good way to manage such documents, if you must have a database "handle" on
    them, is to store the filename and perhaps the location in a database, as a
    "pointer" to the file itself. However, if you do this then there is an
    argument that says there are plenty of built-in DotNet classes for getting
    to and from the file system (which is a good argument), so the database is
    redundant anyway. Still, having the filenames collected neatly may be a
    good idea now and again.
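
    As a rough illustration of the "pointer" idea, here is a minimal sketch.
    It is written in Python with an in-memory SQLite database standing in for
    Oracle, purely so that it is self-contained; the table name, column names,
    and function names are all made up for the example, and the same shape
    should carry over to C# or VB.NET.

```python
import os
import sqlite3


def open_catalog():
    """Create an in-memory catalog; SQLite is only a stand-in database here."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        " id INTEGER PRIMARY KEY,"
        " filename TEXT NOT NULL UNIQUE,"
        " location TEXT NOT NULL,"
        " size_bytes INTEGER NOT NULL)"
    )
    return conn


def register_document(conn, path):
    """Record a file's name, location, and size; the bytes stay on disk."""
    conn.execute(
        "INSERT INTO documents (filename, location, size_bytes) VALUES (?, ?, ?)",
        (os.path.basename(path), os.path.abspath(path), os.path.getsize(path)),
    )
    conn.commit()


def read_document(conn, filename):
    """Look up the stored location, then read the file from the file system."""
    row = conn.execute(
        "SELECT location FROM documents WHERE filename = ?", (filename,)
    ).fetchone()
    if row is None:
        raise KeyError(filename)
    with open(row[0], "rb") as handle:
        return handle.read()
```

    The database row holds only metadata, so moving or deleting a file behind
    the database's back leaves a dangling pointer; any real system would need
    a policy for that.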

    With files in a file system, one can hook to the file system's
    functionality for searching, or use some kind of indexing system, and so
    on. Usually, to build a searchable index, one gets a product or uses the
    OS's functionality. It is an involved task to write this sort of code; but,
    of course, it CAN be done.
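
    To give a feel for what "a searchable index" means at its very simplest,
    here is a toy inverted index in Python. It assumes the text has already
    been extracted from each document (which is the hard part for PPT, PDF,
    and the like, and is not shown), and the function names are invented for
    the example. A real product does far more (stemming, ranking, incremental
    updates).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it.

    `documents` is a dict of document name -> already-extracted plain text.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

    For example, after `index = build_index({"a.txt": "Oracle BLOB storage"})`,
    calling `search(index, "oracle storage")` returns the set containing
    `"a.txt"`.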

    Now, if one simply MUST store files in a database, then it is going to be
    tricky building a dynamic index on documents of type PPT and the like. I
    expect it can be done, but I would want to avoid it. But, I am a shirker
    looking for the easiest way. Furthermore, building and keeping this "search
    index" fresh is going to take time, especially if there are "a lot of
    documents", as you have mentioned. Then again, some data analysis is
    required here. For example, if the system is not in use 24 hours a day,
    and if one does not need an up-to-the-minute index, then building a day-old
    index would be an option. And so on.

    Now, another way that I have addressed this issue is to truly separate
    content from format. I have designed a newsgroup system that stores each
    post's text in the database, as plain text. The formatting is handled by
    CSS and/or XSLT. This way, the database just handles plain text and it is
    easy to search. Furthermore, this is a relatively low-traffic newsgroup.
    Finally, there is a limit to the size of each post (which I control), so
    the database is not storing large pieces of text. All of this, however,
    makes for a much different problem set when compared to the one you
    describe; but, it may help you to think about the issues involved.
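
    A minimal sketch of that plain-text approach might look like the
    following. It is Python with an in-memory SQLite database as a stand-in,
    not the actual newsgroup system; the table layout, the 4000-character
    limit, and the function names are assumptions made for the example.

```python
import sqlite3

MAX_POST_CHARS = 4000  # hypothetical size limit, chosen only for illustration


def open_store():
    """Create a table of plain-text posts with a size limit enforced by the DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        " id INTEGER PRIMARY KEY,"
        " body TEXT NOT NULL"
        f" CHECK (length(body) <= {MAX_POST_CHARS}))"
    )
    return conn


def add_post(conn, body):
    """Store one post as plain text and return its id."""
    cursor = conn.execute("INSERT INTO posts (body) VALUES (?)", (body,))
    conn.commit()
    return cursor.lastrowid


def find_posts(conn, term):
    """Return ids of posts whose text contains `term` (simple substring match)."""
    # Escape LIKE wildcards so the term is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT id FROM posts WHERE body LIKE ? ESCAPE '\\' ORDER BY id",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in rows]
```

    Because the stored text carries no formatting, an ordinary substring
    query is enough here; an over-long post is rejected by the CHECK
    constraint rather than by application code.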

    As I mentioned, this is a BIG topic, so I'll stop here while I'm behind.
    There will be many arguments for and against what I have said, some good on
    both sides. Please just take this as food for thought. I doubt that I have
    clarified anything at all here; but, I hope that I have at least muddied
    the waters.

    HTH.

    --Mark.


    Mark Kamoski, Aug 27, 2003
    #2