Random Access File as Set

Discussion in 'Java' started by carlbernardi@gmail.com, Oct 24, 2006.

  1. Guest

    Hi,

    I need to be able to use the Random Access File class as a Set class so
    there will be no duplicate entries. I thought about building a Random
    Access File that scans itself and won't add a similar entry but I have
    a few million unique entries of which many could be similar.

    Thanks,

    Carl
     
    , Oct 24, 2006
    #1

  2. wrote:
    > Hi,
    >
    > I need to be able to use the Random Access File class as a Set class so
    > there will be no duplicate entries. I thought about building a Random
    > Access File that scans itself and won't add a similar entry but I have
    > a few million unique entries of which many could be similar.


    Before there was SQL, and before there were B-trees, there were hash
    files. You're basically asking the first question asked by the first
    programmer who, nearly half a century ago, designed the first IBM RAMAC
    305 disk file.

    --
    John W. Kennedy
    "The blind rulers of Logres
    Nourished the land on a fallacy of rational virtue."
    -- Charles Williams. "Taliessin through Logres: Prelude"
     
    John W. Kennedy, Oct 24, 2006
    #2

  3. <> wrote in message
    news:...
    > Hi,
    >
    > I need to be able to use the Random Access File class as a Set class so
    > there will be no duplicate entries. I thought about building a Random
    > Access File that scans itself and won't add a similar entry but I have
    > a few million unique entries of which many could be similar.


    With a million entries this really sounds like it should be a database.
    I've built file-based object sets by using an index that contains the hash
    code and points to the address in the Random Access File where the object is
    found, but I was dealing with only tens of thousands. What are you storing
    in the file--arbitrary objects or strings or numbers? Are you expecting to
    delete items from the file? How big are the objects and are they easy to
    test for equality? Can you make a truly unique hash code?
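
    To make the index idea above concrete, here is a minimal sketch, not the
    code I actually used. It assumes the entries are strings, keeps the index
    in memory, and never deletes. The class name FileBackedStringSet and the
    length-prefixed record layout are made up for illustration. Because hash
    codes are not truly unique, each hash code maps to a list of file offsets,
    and a candidate is compared byte-for-byte against the records stored there.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a string set whose data lives in a RandomAccessFile
// and whose index (hash code -> record offsets) lives in memory.
class FileBackedStringSet implements AutoCloseable {
    private final RandomAccessFile file;
    private final Map<Integer, List<Long>> index = new HashMap<>();

    FileBackedStringSet(String path) throws IOException {
        file = new RandomAccessFile(path, "rw");
        // Assumption for the sketch: start empty rather than rebuild
        // the index from an existing file.
        file.setLength(0);
    }

    // Returns true if the value was new and has been appended to the file.
    boolean add(String value) throws IOException {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        List<Long> offsets =
                index.computeIfAbsent(value.hashCode(), k -> new ArrayList<>());
        // Equal hash codes do not imply equal values, so check every
        // record already stored under this hash code.
        for (long offset : offsets) {
            if (Arrays.equals(bytes, readRecord(offset))) {
                return false; // duplicate, nothing written
            }
        }
        long offset = file.length();
        file.seek(offset);
        file.writeInt(bytes.length); // record = 4-byte length + UTF-8 bytes
        file.write(bytes);
        offsets.add(offset);
        return true;
    }

    private byte[] readRecord(long offset) throws IOException {
        file.seek(offset);
        byte[] bytes = new byte[file.readInt()];
        file.readFully(bytes);
        return bytes;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

    The file is only read when two entries share a hash code, so most adds
    cost one append. With a few million entries the in-memory index itself
    gets large, which is part of why I would lean toward a database.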

    Matt Humphrey http://www.iviz.com/
     
    Matt Humphrey, Oct 24, 2006
    #3
  4. Mark Rafn Guest

    <> wrote:
    >I need to be able to use the Random Access File class as a Set class so
    >there will be no duplicate entries. I thought about building a Random
    >Access File that scans itself and won't add a similar entry but I have
    >a few million unique entries of which many could be similar.


    How big is your dataset? Don't overlook the idea of just dumping it all into
    a HashSet or TreeSet in memory (and deciding which to use will be useful even
    if you decide to go disk-based). Modern systems with multiple GiB of RAM can
    handle things that would have been lunacy a few years ago.
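
    As a rough illustration of the in-memory route (a sketch only; the class
    and method names here are invented), a HashSet drops duplicates as you
    add, and copying into a TreeSet gives sorted iteration if you need it:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper showing in-memory de-duplication with the
// standard collections.
class InMemoryDedup {
    // HashSet: duplicates are ignored on add; iteration order is unspecified.
    static Set<String> unique(Iterable<String> entries) {
        Set<String> seen = new HashSet<>();
        for (String entry : entries) {
            seen.add(entry); // returns false, and changes nothing, on a duplicate
        }
        return seen;
    }

    // TreeSet: same uniqueness guarantee, but keeps entries in sorted order.
    static Set<String> uniqueSorted(Iterable<String> entries) {
        Set<String> sorted = new TreeSet<>();
        for (String entry : entries) {
            sorted.add(entry);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("pear", "apple", "pear", "fig");
        System.out.println(unique(input).size());
        System.out.println(uniqueSorted(input));
    }
}
```

    Whether a few million entries fit depends on how big each entry is and
    how much heap you give the JVM, so measure before committing to this.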

    If you intend to scale to many gigabytes, then an on-disk solution is
    needed. Look into Sleepycat (Berkeley DB) or some other disk hashing or
    tree storage system. Don't try to write it yourself unless you're doing it
    as a learning project and don't mind making bunches of mistakes.
    --
    Mark Rafn <http://www.dagon.net/>
     
    Mark Rafn, Oct 24, 2006
    #4
