Huge Data Handling

Vishal G

Hi Guys,

I am trying to edit a bioinformatics package written in Perl. It was
written to handle DNA sequences about 500,000 bases long (a string
containing 500,000 characters).

I have to enhance it to handle DNA 100 million bases long...

Each base in the DNA carries this information: base (A, C, G or T), qual
(0-99), position (1-length).

There is one main DNA sequence and, on average, 500,000 parts (each at
most 2,000 characters long, with the same set of information)...

The program first creates an alignment like

                                     *
Main - .....ACCCTTTGTCTAGTCGTATCGTCGATCGTCGCTAGCTCTGCT....
Part -                  GTCGTATCGTCGAACGTCGCTAGCTC
Part -         CTTTGTCTAGTCGTATCGTCGATCGTCGCT
Part -                           TCGAACGTCGCTAGCTCTG

Now, let's say I have to go through each position and find how many
variations are present at a given position (with their original
position and quality).

Look at the * position: there is a T-A variation (the main sequence has
T, two of the parts have A).

Right now they are using hashes to capture this:

%A, %C, %G, %T

Loop For Main DNA {
    $A{$pos} = $qual;    # there is an A base at this position in the
                         # main sequence, with some qual
}

Update the qual by adding the qual of the parts

Loop For Parts {
    $A{$pos} += $qual;   # for A parts
    $T{$pos} += $qual;   # for T parts
}
But because the dataset is huge, it consumes a lot of memory...

So basically I am trying to figure out a way to store this information
without using much memory.

If you don't understand the above problem, don't worry....

Just tell me how to handle huge data which needs to be accessed
frequently using the least possible memory..

Thanks in advance
 

John W. Krahn

Vishal said:
Just tell me how to handle huge data which needs to be accessed
frequently using the least possible memory..

perldoc -q "How can I make my Perl program take less memory"


John
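
One technique that may help here is replacing the four hashes with four packed strings accessed through vec(), so each position costs a fixed number of bytes rather than a full hash entry. The following is only a minimal sketch under assumptions that are not in the original post: the helper names add_qual and get_qual are hypothetical, the 32-bit counter width is a guess at what the summed quals need, and position 1000 is made-up sample data.

```perl
use strict;
use warnings;

# Assumption: 32-bit counters are wide enough for the summed quals.
# That is 4 bytes per position per base; 16 would halve it if sums fit.
my $BITS = 32;

# One packed string per base instead of one hash per base.
my %score = map { $_ => '' } qw(A C G T);

# Hypothetical helper: add a quality value for a base at a position.
# vec() is an lvalue and grows the string as needed.
sub add_qual {
    my ($base, $pos, $qual) = @_;
    vec($score{$base}, $pos, $BITS) += $qual;
    return;
}

# Hypothetical helper: read the summed quality back (0 if never set).
sub get_qual {
    my ($base, $pos) = @_;
    return vec($score{$base}, $pos, $BITS);
}

# Made-up sample data: main sequence has T, two parts have A.
add_qual('T', 1000, 40);
add_qual('A', 1000, 30);
add_qual('A', 1000, 20);

for my $base (qw(A C G T)) {
    my $q = get_qual($base, 1000);
    print "$base at 1000: $q\n" if $q;
}
```

One caveat with this layout: a stored 0 cannot be told apart from "no such base here", which the hashes could distinguish with exists. If a qual of 0 is meaningful, storing qual + 1 or keeping a separate count vector would be one way around that.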
 
