help on how to save/load this data structure?

Discussion in 'Java' started by Kevin, May 18, 2005.

  1. Kevin

    Kevin Guest

    Hi guys,
    I am wondering if anyone has suggestions on how to code the following
    data structure and requirements:

    The story:

    1) There is a large amount of log data, stored line by line as text.
    Each line has a line ID (integer). Basically, we can think of each line
    as the log at a given time; say, one line is added to the log each
    second. There are more than 10 million lines in total.
    2) There are a large number of possible events (say 200K, each
    identified by an event ID). When an event occurs, it generates a value
    in the log data. Since events can occur concurrently, one line of
    data may contain many values.

    The abstract data structure:

    Each event ID (Integer) must map to the many line IDs (Integer) of the
    lines in which that event occurs.
    If the total size were small, we could use a naïve approach: store all
    the IDs in a hash map, with the event ID as key and an ArrayList (or a
    Hashtable, since the line IDs do not need to be in order) as value,
    where each item in the ArrayList is a line ID (Integer).
    There are ways to save some memory, such as using a custom int array
    instead of Integer objects (8 bytes each), etc. But at the size
    mentioned above, these tricks are not enough.
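
    For a data set small enough to fit in memory, the int-array idea
    above could be sketched as follows. The names EventIndex and IntList
    are hypothetical, and the code assumes a modern JDK (Java 8 or later)
    rather than the Java 1.4 mentioned in this thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory index: event ID -> growable int[] of line IDs. */
final class EventIndex {
    /** Growable array of primitive ints, avoiding one Integer object per line ID. */
    static final class IntList {
        private int[] data = new int[4];
        private int size;

        void add(int value) {
            if (size == data.length) {
                data = Arrays.copyOf(data, size * 2); // double the capacity when full
            }
            data[size++] = value;
        }

        int[] toArray() {
            return Arrays.copyOf(data, size); // trimmed copy of the used part
        }
    }

    private final Map<Integer, IntList> linesByEvent = new HashMap<>();

    /** Records that the given event produced a value on the given log line. */
    void add(int eventId, int lineId) {
        linesByEvent.computeIfAbsent(eventId, k -> new IntList()).add(lineId);
    }

    /** All line IDs recorded for one event (empty if the event never occurred). */
    int[] linesFor(int eventId) {
        IntList list = linesByEvent.get(eventId);
        return list == null ? new int[0] : list.toArray();
    }
}
```

    Only the 200K event keys are boxed here; the line IDs themselves stay
    as 4-byte ints.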

    The required operations on the data:

    The application needs to build such a data structure which supports
    these two operations:
    1) Given an event ID, find all the line IDs of that event.
    2) Given a group of event IDs, find all the line IDs of the group
    (basically a "union" of the set of line IDs of each event ID).
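
    One possible way to do the union in operation 2 is a java.util.BitSet
    with one bit per line ID; for 10 million lines that is roughly
    1.25 MB. This is only a sketch: LineIdUnion is a hypothetical name,
    and it assumes the line IDs are non-negative and that the per-event
    arrays have already been loaded from wherever they are stored.

```java
import java.util.BitSet;

/** Hypothetical helper: union of the line-ID sets of several events. */
final class LineIdUnion {
    /**
     * Merges per-event line-ID arrays into one sorted, duplicate-free array.
     * expectedMaxLineId is only a sizing hint; the BitSet grows if needed.
     */
    static int[] union(int[][] lineIdsPerEvent, int expectedMaxLineId) {
        BitSet seen = new BitSet(expectedMaxLineId); // one bit per possible line ID
        for (int[] lineIds : lineIdsPerEvent) {
            for (int lineId : lineIds) {
                seen.set(lineId); // setting a bit twice is harmless, so duplicates vanish
            }
        }
        int[] result = new int[seen.cardinality()];
        int i = 0;
        // Walk the set bits in ascending order.
        for (int id = seen.nextSetBit(0); id >= 0; id = seen.nextSetBit(id + 1)) {
            result[i++] = id;
        }
        return result;
    }
}
```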

    Any ideas on how to build such a big structure? I do not think there is
    any way to fit it all into memory (Java 1.4's maximum heap size is
    about 1.3G on win32, I think). If we can swap part of it out to a file
    and read it in only when needed, how should we construct the structure
    so we can
    do the job more efficiently? Or will it be better (faster) if we put
    all the IDs into a database table and use SQL to get them?
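
    As a rough illustration of the "swap out to a file" idea: store each
    event's line IDs contiguously in one binary file and keep only a small
    directory (event ID to file offset and count) in memory. DiskEventIndex
    and its layout are hypothetical, the code assumes a modern JDK, and
    building the file for the full data set would need batching or an
    external sort, which is not shown.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical on-disk layout: each event's line IDs stored contiguously as 4-byte ints. */
final class DiskEventIndex {
    // eventId -> {byte offset in file, number of line IDs}; one entry per event
    private final Map<Integer, long[]> directory = new HashMap<>();
    private final File file;

    DiskEventIndex(File file) {
        this.file = file;
    }

    /** Writes every event's line IDs to the file and remembers where each run starts. */
    void build(Map<Integer, int[]> lineIdsByEvent) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            long offset = 0;
            for (Map.Entry<Integer, int[]> e : lineIdsByEvent.entrySet()) {
                int[] ids = e.getValue();
                directory.put(e.getKey(), new long[] {offset, ids.length});
                for (int id : ids) {
                    out.writeInt(id); // big-endian, 4 bytes
                }
                offset += 4L * ids.length;
            }
        }
    }

    /** Reads back only the line IDs of one event with a single seek and read. */
    int[] linesFor(int eventId) throws IOException {
        long[] entry = directory.get(eventId);
        if (entry == null) {
            return new int[0];
        }
        int count = (int) entry[1];
        byte[] raw = new byte[count * 4];
        try (RandomAccessFile in = new RandomAccessFile(file, "r")) {
            in.seek(entry[0]);
            in.readFully(raw);
        }
        int[] ids = new int[count];
        ByteBuffer.wrap(raw).asIntBuffer().get(ids); // big-endian, matching writeInt
        return ids;
    }
}
```

    With about 200K events the directory itself is small, so only the
    line-ID runs that a query actually touches are read from disk.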

    Thanks a lot, and have a great day. :)

    By the way, is there a faster way to write/read a large number of ints
    to and from a file? Some days ago, I ran a test using
    ObjectOutputStream's writeInt(); if I remember right, it took about
    3 seconds to write 10^7 ints to a file, which came to about 38M.
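
    One alternative worth measuring is to go through java.nio (available
    since Java 1.4) and hand the whole int array to a FileChannel in bulk
    instead of calling writeInt() once per value. The sketch below is not
    benchmarked; IntFileIO is a hypothetical name, the code uses the
    java.nio.file API from later JDKs, and it assumes the data fits in a
    single in-memory buffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for bulk int I/O; writes raw big-endian 4-byte ints. */
final class IntFileIO {
    static void writeInts(Path path, int[] values) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(4 * values.length);
        buf.asIntBuffer().put(values); // fills the bytes; buf's own position stays at 0
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }

    static int[] readInts(Path path) throws IOException {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining()) {
                if (ch.read(buf) < 0) {
                    break; // end of file
                }
            }
            buf.flip();
            int[] values = new int[buf.remaining() / 4];
            buf.asIntBuffer().get(values);
            return values;
        }
    }
}
```

    The file holds exactly 4 bytes per int, with no stream header.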
     
    Kevin, May 18, 2005
    #1

  2. Wendy Smoak

    Wendy Smoak Guest

    "Kevin" <> wrote:

    > Or will it be better (faster) if we put
    > all the IDs into a database table and use SQL to get them?


    I think you answered your own question. :)

    --
    Wendy
     
    Wendy Smoak, May 19, 2005
    #2

  3. Kevin

    Kevin Guest

    I have never programmed SQL in Java before. Would issuing SQL calls be
    slow? I have the feeling that a large number of SQL calls will be slow
    (especially since I can only find a normal, not super fast, machine for
    the DB server).

    Wendy Smoak wrote:
    > "Kevin" <> wrote:
    >
    > > Or will it be better (faster) if we put
    > > all the IDs into a database table and use SQL to get them?

    >
    > I think you answered your own question. :)
    >
    > --
    > Wendy
     
    Kevin, May 19, 2005
    #3
  4. Kevin

    Kevin Guest

    By the way, I don't mind whether we use a database or not. But it seems
    the end user would like a "stand-alone" program, so using a database
    would make him kind of unhappy. :-(
     
    Kevin, May 19, 2005
    #4
  5. Robert Mischke

    "Kevin" <> wrote:

    >By the way, I don't mind whether we use a database or not. But it seems
    >the end user would like a "stand-alone" program, so using a database
    >would make him kind of unhappy. :-(


    There are "embedded" databases which don't require a separate server,
    for example http://hsqldb.sourceforge.net/ .

    To your original question: Yes, I think a database is the way to go -
    that's what databases exist for :)
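
    To give an idea of what the database route could look like from Java,
    here is a minimal JDBC sketch. The table event_line(event_id, line_id)
    and the class EventLineQueries are made up for illustration, and how
    to obtain the Connection (driver and JDBC URL for the embedded
    database) is left to that database's documentation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical queries against an assumed table event_line(event_id, line_id). */
final class EventLineQueries {
    /** Builds "... WHERE event_id IN (?, ?, ...)" with one placeholder per event ID. */
    static String unionSql(int eventCount) {
        if (eventCount < 1) {
            throw new IllegalArgumentException("need at least one event ID");
        }
        StringBuilder sql = new StringBuilder(
                "SELECT DISTINCT line_id FROM event_line WHERE event_id IN (");
        for (int i = 0; i < eventCount; i++) {
            sql.append(i == 0 ? "?" : ", ?");
        }
        return sql.append(")").toString();
    }

    /** Line IDs for one event or, with several IDs, the union over the group. */
    static List<Integer> lineIdsFor(Connection conn, int[] eventIds) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(unionSql(eventIds.length))) {
            for (int i = 0; i < eventIds.length; i++) {
                ps.setInt(i + 1, eventIds[i]); // JDBC parameters are 1-based
            }
            try (ResultSet rs = ps.executeQuery()) {
                List<Integer> lineIds = new ArrayList<>();
                while (rs.next()) {
                    lineIds.add(rs.getInt(1));
                }
                return lineIds;
            }
        }
    }
}
```

    Both required operations become a single query this way, so the
    number of SQL calls stays small; an index on event_id would probably
    be needed for it to be fast.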

    Robert
     
    Robert Mischke, May 19, 2005
    #5
  6. Kevin

    Kevin Guest

    Thanks a lot. That URL will be very helpful. :)
     
    Kevin, May 19, 2005
    #6
