need advice... accessing a huge collection


GrelEns

hello,

I have almost 1,000 tar.gz files in different directories (I cannot change
that), and these archives contain over 1,000,000 text files altogether. I
would like to build a tool that can access any of these text files, or any
sub-collection of them, as quickly as possible, in order to serve them over
HTTP on user request.

Does anyone have ideas on a good way to do this?

(I was thinking of a dictionary whose keys are the filenames and whose
values are the paths to the archives containing them, and of extracting all
the requested files from the same archive in one pass; see the sketch
below.)
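
Roughly like this (untested; ARCHIVE_ROOT is just a placeholder for
wherever the directories actually live):

import os
import tarfile

ARCHIVE_ROOT = "/data/archives"  # placeholder; the real layout differs

def build_index(root):
    """Map each member filename to the tar.gz archive containing it."""
    index = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".tar.gz"):
                archive = os.path.join(dirpath, name)
                with tarfile.open(archive, "r:gz") as tf:
                    for member in tf.getnames():
                        # note: a filename present in two archives
                        # keeps only the last archive seen
                        index[member] = archive
    return index

def fetch(index, wanted):
    """Group requested filenames by archive so each tar.gz is opened once."""
    by_archive = {}
    for fname in wanted:
        by_archive.setdefault(index[fname], []).append(fname)
    contents = {}
    for archive, names in by_archive.items():
        with tarfile.open(archive, "r:gz") as tf:
            for fname in names:
                contents[fname] = tf.extractfile(fname).read()
    return contents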

I was also wondering which would be fastest:
- on each user request, rebuilding the dictionary by reading key/value
pairs from a file,
- or, on the first request, generating a hard-coded Python dictionary as a
module and then importing it,
- or maybe something else (storing the mapping in a database...)? (a rough
sketch of one build-once-then-reload variant follows this list)
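
For concreteness, a rough sketch of the build-once-and-reload idea using
pickle (pickle and the INDEX_FILE location are my own assumptions, not a
benchmark):

import os
import pickle

INDEX_FILE = "index.pkl"  # assumed cache location

def load_index(root):
    """Load a previously saved index, or build and save it once."""
    if os.path.exists(INDEX_FILE):
        with open(INDEX_FILE, "rb") as f:
            return pickle.load(f)
    index = build_index(root)  # from the sketch above
    with open(INDEX_FILE, "wb") as f:
        pickle.dump(index, f)
    return index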

thanx
 

Paul Rubin

GrelEns said:
I was also wondering which would be fastest:
- on each user request, rebuilding the dictionary by reading key/value
pairs from a file,
- or, on the first request, generating a hard-coded Python dictionary as a
module and then importing it,
- or maybe something else (storing the mapping in a database...)?

If the tar files are static (not being updated), the simplest thing is to
use dbm to store the dictionary.
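
Something along these lines (the entry shown is made up; dbm stores keys
and values as byte strings, hence the decode on the way out):

import dbm

# One-time build; `index` is the filename -> archive-path mapping
# described above (a single made-up entry stands in for it here).
index = {"doc0001.txt": "/data/archives/a/batch01.tar.gz"}

with dbm.open("fileindex", "c") as db:
    for fname, archive in index.items():
        db[fname] = archive

# Per-request lookup: opening the dbm file is cheap, and only the
# requested key is read from disk; the full dictionary is never
# rebuilt in memory.
with dbm.open("fileindex", "r") as db:
    archive = db["doc0001.txt"].decode()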
 
