Speed Freak

Dave Bee

This is a conceptual question rather than a specific coding one, but
hopefully someone has played around with something similar. In a
nutshell, I have around 10 million information entries, each with lots
of data points. My current script has two stages: the first organises
certain data points into some large (huge) hashes, and the second
forks off lots of children and does the subsequent processing to
produce LDIFs, using the information in the hashes. (Thanks to
copy-on-write, and the fact that the children don't need to update the
hashes, this doesn't use a great deal of memory.)
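
For clarity, the stage-two pattern I mean is roughly the following.
This is a simplified, self-contained sketch rather than my actual
code, with stand-in data and a made-up sharding rule; the point is
that the children inherit the hashes at fork time and only ever read
them:

#!/usr/bin/perl
use strict;
use warnings;

# --- Stage one (single process): build the big lookup hashes ---
# Stand-in data; the real script derives this from ~10M entries.
my %lookup = map { ("id$_" => "data$_") } 1 .. 100_000;

# --- Stage two: fork workers that read the hashes via copy-on-write ---
my $workers = 8;
my @pids;
for my $w (0 .. $workers - 1) {
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {                                  # child
        open my $out, '>', "out-$w.ldif" or die $!;
        for my $id (sort keys %lookup) {
            # Cheap deterministic shard: checksum the key, mod worker count.
            next unless unpack('%32C*', $id) % $workers == $w;
            print {$out} "dn: uid=$id\nnote: $lookup{$id}\n\n";   # fake LDIF
        }
        close $out;
        exit 0;    # child exits without ever writing to %lookup
    }
    push @pids, $pid;
}
waitpid($_, 0) for @pids;    # parent reaps all the children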

My current problem is with stage one. It is, by current necessity, a
single process, since it needs to refer to information already in the
hashes as it builds them, and that single process is the choke point.
I would like to cut down the time the first stage takes (~50 minutes),
and I am at liberty to use any interesting techniques to do so. My
hardware is somewhat above spec (24-CPU 6800, 48G RAM, etc.) and can
be dedicated 100% to the script when it runs, so unusual and
incredibly memory- or CPU-wasteful techniques are more than welcome.

I've thought of threading (no real experience, but I could probably
figure something out), a parent hash-controller with multiple forked
children, etc. (Something like the sketch below is what I have in mind
for the latter.) I'm just curious whether anyone has done something
similar and already knows the most efficient way of doing this.
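
To be concrete about the hash-controller idea, this is roughly what
I'm picturing: shard the input across forked children, let each build
a partial hash, serialise the partials back to the parent (Storable
here), and merge them there. Everything below is stand-in names and
data, and it only helps if the lookups during the build stay within a
shard, which I'd still have to verify for my data:

#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(store retrieve);

my $workers = 8;

# Stand-in input; in reality each child would read its slice of the
# 10M entries straight from disk rather than inheriting an array.
my @entries = map { "entry$_" } 1 .. 80_000;

my @pids;
for my $w (0 .. $workers - 1) {
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {                                  # child
        my %partial;
        for my $i (0 .. $#entries) {
            next unless $i % $workers == $w;          # this child's shard
            $partial{ $entries[$i] } = length $entries[$i];   # stand-in work
        }
        store \%partial, "partial-$w.sto";            # hand results back
        exit 0;
    }
    push @pids, $pid;
}
waitpid($_, 0) for @pids;

# Merge the partial hashes; this becomes the new serial section,
# so the approach only wins if building dominates merging.
my %lookup;
for my $w (0 .. $workers - 1) {
    my $part = retrieve("partial-$w.sto");
    @lookup{ keys %$part } = values %$part;
    unlink "partial-$w.sto";
}
print scalar(keys %lookup), " keys merged\n";

The merge at the end is the new serial step, so I'd only expect a win
if building the partials dominates merging them.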

Dave
 
