Data Structures Question...

Discussion in 'C++' started by Saikrishna, Apr 6, 2004.

  1. Saikrishna

    Saikrishna Guest

    Hi friends,
    I am trying to choose the best possible data structure for the probelm
    I am going to describe now.

    I have lets say tens of thousands of numbers in file1 and tens of
    thousands of numbers in another file2. file1 & file2 contents (only
    numbers) can be entirely different.

    Now the program should be able to read the files and give me a
    difference file. Ofcourse this is a very easy implementation if I go
    with primitive programming using "ifs" and "whiles".

    This is what I need to do. First I need to rearrange the data in file1
    in vary compact format so that my program uses as little RAM as
    possible. SETS are one way of doing that.

    For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
    file1 may contain several thousand numbers) then I can rearrange them
    like these sets

    {1-4} // Numbers 1 to 4
    {9-12} // Numbers 9 to 12
    {6-7} // Numbers 6 to 7
    {22} // Single number 22.

    And similraly I need to rearrange contents of file2 in this format.

    Now my question is the best possible method for storage of this
    datastructure.
    Linked Lists, Tree or others or simply use STLs? Also I will be happy
    if somebody can give me an idea about the best way of doing
    comparision process
    between sets of file1 and file2 and produce the difference set in a
    number format in a result file "file3".

    Thanks,
    Sai
     
    Saikrishna, Apr 6, 2004
    #1
    1. Advertising

  2. Saikrishna wrote:
    > Hi friends,
    > I am trying to choose the best possible data structure for the probelm
    > I am going to describe now.
    >
    > I have lets say tens of thousands of numbers in file1 and tens of
    > thousands of numbers in another file2. file1 & file2 contents (only
    > numbers) can be entirely different.
    >
    > Now the program should be able to read the files and give me a
    > difference file. Ofcourse this is a very easy implementation if I go
    > with primitive programming using "ifs" and "whiles".
    >
    > This is what I need to do. First I need to rearrange the data in file1
    > in vary compact format so that my program uses as little RAM as
    > possible. SETS are one way of doing that.
    >
    > For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
    > file1 may contain several thousand numbers) then I can rearrange them
    > like these sets
    >
    > {1-4} // Numbers 1 to 4
    > {9-12} // Numbers 9 to 12
    > {6-7} // Numbers 6 to 7
    > {22} // Single number 22.
    >
    > And similraly I need to rearrange contents of file2 in this format.
    >
    > Now my question is the best possible method for storage of this
    > datastructure.
    > Linked Lists, Tree or others or simply use STLs? Also I will be happy
    > if somebody can give me an idea about the best way of doing
    > comparision process
    > between sets of file1 and file2 and produce the difference set in a
    > number format in a result file "file3".
    >
    > Thanks,
    > Sai


    Your "sets" are also termed "runs" in a typical Merge Sort
    algorithm.

    This is possibly an algorithm issue, not a language issue.
    My suggestion is to sort both files into a third, then
    remove duplicates. Perhaps the folks in news:comp.programming
    can offer better advice. Follow-ups set.

    --
    Thomas Matthews

    C++ newsgroup welcome message:
    http://www.slack.net/~shiva/welcome.txt
    C++ Faq: http://www.parashift.com/c -faq-lite
    C Faq: http://www.eskimo.com/~scs/c-faq/top.html
    alt.comp.lang.learn.c-c++ faq:
    http://www.raos.demon.uk/acllc-c /faq.html
    Other sites:
    http://www.josuttis.com -- C++ STL Library book
     
    Thomas Matthews, Apr 6, 2004
    #2
    1. Advertising

  3. Re: [OT] Data Structures Question...

    There are three DIFFerent methods to attack the problem.

    The first method is to look at the implementation of the diff utility
    for Linux/Unix.
    The second is to look at the implemenation of the sequnce-alignment from
    biological-computer-science. There they determine the best aligments of
    the gene sequences and it may have smth to do with your problem.
    The laziest method is to ask in some programming / algorithmics group,
    since algorithms/data structures are

    O F F - T O P I C

    here. That means that people here are not always qualified enough to
    reply. (No offence meant.)

    --
    Best regards,
    Alex.

    PS. To email me, remove "loeschedies" from the email address given.
     
    Alexander Malkis, Apr 6, 2004
    #3
  4. Thomas Matthews schrieb:
    > Saikrishna wrote:
    > > I am trying to choose the best possible data structure for the probelm
    > > I am going to describe now.


    > > For example if file1 contains 2,3,1,7,9,10,11,12,4,6,22 ( In reality
    > > file1 may contain several thousand numbers) then I can rearrange them
    > > like these sets
    > >
    > > {1-4} // Numbers 1 to 4
    > > {9-12} // Numbers 9 to 12
    > > {6-7} // Numbers 6 to 7
    > > {22} // Single number 22.
    > >
    > > And similraly I need to rearrange contents of file2 in this format.
    > >
    > > Now my question is the best possible method for storage of this
    > > datastructure.
    > > Linked Lists, Tree or others or simply use STLs?


    Have a look at the data structure called "discrete interval encoding
    tree" or "diet"; I think that's what you're after.

    See e.g. http://www.nist.gov/dads/HTML/discretintrv.html

    Michael
    --
    Feel the stare of my burning hamster and stop smoking!
     
    Michael Mendelsohn, Apr 8, 2004
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. tweak
    Replies:
    14
    Views:
    2,788
    Eric Sosman
    Jun 11, 2004
  2. Shwetabh

    Data structures question

    Shwetabh, Jul 26, 2005, in forum: C Programming
    Replies:
    5
    Views:
    275
    Lawrence Kirby
    Aug 1, 2005
  3. Alfonso Morra
    Replies:
    11
    Views:
    721
    Emmanuel Delahaye
    Sep 24, 2005
  4. JSeb
    Replies:
    4
    Views:
    467
    Juha Nieminen
    Feb 12, 2008
  5. codingJoe
    Replies:
    3
    Views:
    321
    codingJoe
    Oct 31, 2009
Loading...

Share This Page