Huge Data Handling

Discussion in 'Perl Misc' started by Vishal G, Sep 30, 2008.

  1. Vishal G

    Vishal G Guest

    Hi Guys,

    I am trying to modify a bioinformatics package written in Perl that
    was designed to handle a DNA sequence about 500,000 bases long (a
    string containing 500,000 characters).

    I have to extend it to handle DNA sequences 100 million bases long.

    Each base in the DNA carries this information: base (A, C, G or T),
    qual (0-99), position (1-length).

    There is one main DNA sequence and, on average, 500,000 parts (each at
    most 2,000 characters long, with the same set of information).

    The program first creates an alignment like this:

                                         *
    Main - .....ACCCTTTGTCTAGTCGTATCGTCGATCGTCGCTAGCTCTGCT....
    Part -                  GTCGTATCGTCGAACGTCGCTAGCTC
    Part -         CTTTGTCTAGTCGTATCGTCGATCGTCGCT
    Part -                           TCGAACGTCGCTAGCTCTG
    Now, let's say I have to go through each position and find how many
    variations are present at a given position (with their original
    position and quality).

    Look at the position marked *: there is a T-A variation.
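
To make the task concrete, here is a minimal Perl sketch of that per-position comparison. It is not the package's actual code: the `find_variations` helper, the 0-based offsets and the toy data (taken from the alignment in this post) are assumptions for illustration, and quality values are left out to keep it short.

```perl
use strict;
use warnings;

# Hypothetical toy data: the main sequence, plus each part together with
# the 0-based offset at which it aligns to the main sequence.
my $main  = 'ACCCTTTGTCTAGTCGTATCGTCGATCGTCGCTAGCTCTGCT';
my @parts = (
    [ 12, 'GTCGTATCGTCGAACGTCGCTAGCTC' ],
    [  3, 'CTTTGTCTAGTCGTATCGTCGATCGTCGCT' ],
    [ 21, 'TCGAACGTCGCTAGCTCTG' ],
);

# Returns { position => { base => count } } for every position where at
# least one part disagrees with the main sequence.
sub find_variations {
    my ($main_seq, $part_list) = @_;
    my %variation;
    for my $part (@$part_list) {
        my ($offset, $seq) = @$part;
        for my $i (0 .. length($seq) - 1) {
            my $pos  = $offset + $i;
            my $base = substr($seq, $i, 1);
            next if $base eq substr($main_seq, $pos, 1);
            $variation{$pos}{$base}++;
        }
    }
    return \%variation;
}

my $variation = find_variations($main, \@parts);

# Report positions 1-based, as in the post.
for my $pos (sort { $a <=> $b } keys %$variation) {
    my $found = join ', ',
        map { "$_ x $variation->{$pos}{$_}" } sort keys %{ $variation->{$pos} };
    printf "position %d: main=%s, parts=%s\n",
        $pos + 1, substr($main, $pos, 1), $found;
}
```

Only positions that actually differ end up in the result hash, so the output stays small even when the sequences are long.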

    Right now they are using hashes to capture this:

    %A, %C, %G, %T

    Loop for main DNA {
        $A{$pos} = $qual;    # there is an A base at this position in the
                             # main sequence, with this quality
    }

    Then update the quality by adding the quality of the parts:

    Loop for parts {
        $A{$pos} += $qual;   # for A parts
        $T{$pos} += $qual;   # for T parts
    }

    But because the dataset is huge, this consumes a lot of memory.

    So basically I am trying to figure out a way to store this information
    without using much memory.

    If you don't understand the problem above, don't worry. Just tell me
    how to handle huge data which needs to be accessed frequently, using
    the least possible memory.

    Thanks in advance
    Vishal G, Sep 30, 2008
    #1

  2. Vishal G wrote:
    >
    > just tell me how to handle huge data which need to accessed frequently
    > using least possible memory..


    perldoc -q "How can I make my Perl program take less memory"
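
One technique that FAQ entry discusses, as far as I recall, is simulating an array with `substr()` or `vec()` instead of a real Perl array or hash. Applied to the question, the four hashes could become four packed strings with one fixed-width counter per position. The sketch below is an illustration under stated assumptions, not the package's code: the `add_qual` and `get_qual` helpers, the 32-bit counter width, the short `$length` and the sample values are all hypothetical.

```perl
use strict;
use warnings;

my $length = 1_000;   # hypothetical; the real sequence is ~100 million bases
my $bits   = 32;      # assumed counter width; must hold the largest quality sum

# One packed string per base instead of one hash per base.
# Each position occupies exactly $bits / 8 bytes, with no per-key overhead.
my %sum = map { $_ => "\0" x ($length * $bits / 8) } qw(A C G T);

# Add a quality value for a base at a 1-based position.
sub add_qual {
    my ($base, $pos, $qual) = @_;
    my $i = $pos - 1;
    vec($sum{$base}, $i, $bits) = vec($sum{$base}, $i, $bits) + $qual;
    return;
}

# Read back the accumulated quality for a base at a 1-based position.
sub get_qual {
    my ($base, $pos) = @_;
    return vec($sum{$base}, $pos - 1, $bits);
}

# Hypothetical records: the main sequence has T at position 26,
# and two parts have A there.
add_qual('T', 26, 40);
add_qual('A', 26, 30);
add_qual('A', 26, 25);

printf "pos 26: A=%d T=%d\n", get_qual('A', 26), get_qual('T', 26);
```

With 32-bit counters, 100 million positions cost 400 MB per base, so 1.6 GB for all four strings. That is still large, but the cost is fixed and known in advance, whereas every hash entry carries its own key and bookkeeping overhead. If the sums are known to stay small, a narrower width such as 16 bits would halve it, and tying the data to disk is another option when even that is too much.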


    John
    --
    Perl isn't a toolbox, but a small machine shop where you
    can special-order certain sorts of tools at low cost and
    in short order. -- Larry Wall
    John W. Krahn, Sep 30, 2008
    #2
