Huge Data Handling

Vishal G · Sep 30, 2008

Hi Guys,

I am trying to modify a bioinformatics package written in Perl that
was designed to handle DNA sequences about 500,000 bases long (a
string containing 500,000 characters).

I have to extend it to handle DNA 100 million bases long.

Each base in the DNA carries this information: base (A, C, G or T),
qual (0-99), position (1 to length).

There is one main DNA sequence and, on average, 500,000 parts (each at
most 2,000 characters long, with the same set of information).

The program first creates an alignment like this:

                                     *
Main - .....ACCCTTTGTCTAGTCGTATCGTCGATCGTCGCTAGCTCTGCT....
Part -                  GTCGTATCGTCGAACGTCGCTAGCTC
Part -         CTTTGTCTAGTCGTATCGTCGATCGTCGCT
Part -                           TCGAACGTCGCTAGCTCTG

Now, let's say I have to go through each position and find how many
variations are present at that position (with their original
position and quality).

Look at the * position: there is a T-A variation.

Right now they are using hashes to capture this:

%A, %C, %G, %T

Loop For Main DNA {
    $A{$pos} = $qual;    # there is an A base at this position in the
                         # main sequence, with this qual
}

Update the qual by adding the qual of the parts

Loop For Parts {
    $A{$pos} += $qual;   # for A parts
    $T{$pos} += $qual;   # for T parts
}
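As a rough, runnable sketch of the hash approach described above: the sample sequences come from the alignment shown earlier, while the 0-based offsets, the constant quality of 30, and the single hash-of-hashes %sum (standing in for %A, %C, %G, %T) are assumptions made only for illustration.

```perl
use strict;
use warnings;

# Sample data taken from the alignment above. Offsets are 0-based
# positions in the main sequence (an assumption for this sketch).
my $main  = 'ACCCTTTGTCTAGTCGTATCGTCGATCGTCGCTAGCTCTGCT';
my @parts = (
    { offset => 12, seq => 'GTCGTATCGTCGAACGTCGCTAGCTC' },
    { offset => 3,  seq => 'CTTTGTCTAGTCGTATCGTCGATCGTCGCT' },
    { offset => 21, seq => 'TCGAACGTCGCTAGCTCTG' },
);
my $qual = 30;    # made-up constant quality for every base

# One inner hash per base, keyed by position, holding the summed quality.
my %sum = ( A => {}, C => {}, G => {}, T => {} );

# Record the main sequence.
for my $pos ( 0 .. length($main) - 1 ) {
    my $base = substr( $main, $pos, 1 );
    $sum{$base}{$pos} = $qual;
}

# Add the quality of every aligned part.
for my $part (@parts) {
    for my $i ( 0 .. length( $part->{seq} ) - 1 ) {
        my $base = substr( $part->{seq}, $i, 1 );
        $sum{$base}{ $part->{offset} + $i } += $qual;
    }
}

# Report which bases were seen at position 25 (the * column).
my @seen = grep { exists $sum{$_}{25} } sort keys %sum;
print "position 25: ", join( ', ', map { "$_=$sum{$_}{25}" } @seen ), "\n";
# prints "position 25: A=60, T=60"
```

Every position that is touched creates its own hash entry, which is where the memory goes once the main sequence grows to 100 million bases.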
But because the dataset is huge, this consumes a lot of memory.

So basically I am trying to figure out a way to store this information
without using much memory.

If you don't understand the above problem, don't worry.

Just tell me how to handle huge data that needs to be accessed
frequently using the least possible memory.

Thanks in advance

John W. Krahn · Sep 30, 2008

Vishal said:
just tell me how to handle huge data which need to accessed frequently
using least possible memory..

perldoc -q "How can I make my Perl program take less memory"

John
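One technique of the kind that FAQ entry discusses is simulating an array with a packed string and vec(). The following is only a minimal sketch under assumptions, not something posted in the thread: the helper names add_qual and get_qual, the 32-bit slot size, and the tiny $len are all hypothetical. Each base gets one string with a fixed 4 bytes per position, so 100 million positions would cost about 400 MB per base (1.6 GB for all four), with no per-entry hash overhead.

```perl
use strict;
use warnings;

my $len = 42;    # hypothetical length; the real sequence is ~100 million

# One packed string per base: a 32-bit summed quality for each position.
my %sum = map { $_ => "\0" x ( 4 * $len ) } qw(A C G T);

# Hypothetical helper: add a quality score for a base at a 0-based position.
sub add_qual {
    my ( $base, $pos, $qual ) = @_;
    vec( $sum{$base}, $pos, 32 ) += $qual;
}

# Hypothetical helper: read the summed quality back.
# Caveat: 0 cannot distinguish "base never seen" from "total quality 0".
sub get_qual {
    my ( $base, $pos ) = @_;
    return vec( $sum{$base}, $pos, 32 );
}

# Made-up scores for the * column of the alignment (position 25).
add_qual( 'T', 25, 30 );    # main sequence
add_qual( 'T', 25, 30 );    # a part that agrees with main
add_qual( 'A', 25, 30 );    # a part carrying the variation
add_qual( 'A', 25, 30 );    # another part carrying the variation

printf "%s=%d\n", $_, get_qual( $_, 25 ) for qw(A C G T);
```

Whether this is enough depends on the machine. If even the packed strings do not fit in memory, the same layout could be kept in a file and read in chunks, but that is beyond this sketch.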
