Multi thread reading a file

Mag Gam

Hello All,

I am very new to Python and I am in the process of loading a very
large compressed csv file into another format. I was wondering if I
can do this with a multi-threaded approach.

Here is the pseudo code I was thinking about:

Let T = Total number of lines in a file, for example 1000000 (1 million lines)
Let B = Total number of lines in a buffer, for example 10000 lines


Create a thread to read until the buffer is full
Create another thread to read the next buffer (so we have 2 threads
now). But since the file is zipped, I have to wait until the first
thread is completed, unless someone knows of a clever technique.
Write the content of thread 1 into a numpy array
Write the content of thread 2 into a numpy array

But I don't think we are capable of multiprocessing tasks for this....
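
Here is a rough sketch of how the steps above might look, assuming the file is gzip-compressed. Since a gzip stream generally has to be decompressed from the beginning, only one thread reads; it hands buffers of B lines to the main thread through a queue. The names read_buffers and load, the queue size of 4, and the assumption that every field is numeric with no header row are all made up for illustration. Each converted buffer could then be passed to numpy.array.

```python
import csv
import gzip
import queue
import threading


def read_buffers(path, lines_per_buffer, out_queue):
    """Reader thread: decompress sequentially and hand off buffers of rows."""
    try:
        with gzip.open(path, "rt", newline="") as handle:
            buffer = []
            for row in csv.reader(handle):
                buffer.append(row)
                if len(buffer) == lines_per_buffer:
                    out_queue.put(buffer)
                    buffer = []
            if buffer:
                out_queue.put(buffer)  # last, partially filled buffer
    finally:
        out_queue.put(None)  # sentinel: tells the consumer there is no more data


def load(path, lines_per_buffer=10000):
    """Consume buffers in the main thread while the reader keeps decompressing."""
    buffers = queue.Queue(maxsize=4)  # bounded so the reader cannot run far ahead
    reader = threading.Thread(
        target=read_buffers,
        args=(path, lines_per_buffer, buffers),
        daemon=True,
    )
    reader.start()
    converted = []
    while True:
        buffer = buffers.get()
        if buffer is None:
            break
        # Assumption: all fields are numeric. numpy.array(...) could wrap this.
        converted.append([[float(value) for value in row] for row in buffer])
    reader.join()
    return converted
```

Calling load("data.csv.gz") on a hypothetical file would return one list of float rows per buffer. Whether this is actually faster than a plain loop would have to be measured.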


Any ideas? Has anyone ever tackled a problem like this before?

Lawrence D'Oliveiro

Mag Gam said:
I am very new to Python and I am in the process of loading a very
large compressed csv file into another format. I was wondering if I
can do this with a multi-threaded approach.

Why bother?

Mag Gam

LOL! :)

Why not? I think I will just do it single-threaded for now, and if
performance is really that bad I can reinvestigate.
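
For reference, a minimal single-threaded sketch might look like the following, again assuming a gzip-compressed csv file. The name iter_buffers and the default buffer size are hypothetical; it simply yields lists of up to B rows at a time so the whole file never has to sit in memory.

```python
import csv
import gzip
from itertools import islice


def iter_buffers(path, lines_per_buffer=10000):
    """Yield successive lists of at most lines_per_buffer csv rows."""
    with gzip.open(path, "rt", newline="") as handle:
        rows = csv.reader(handle)
        while True:
            # islice pulls the next chunk of rows without reading the whole file
            buffer = list(islice(rows, lines_per_buffer))
            if not buffer:
                return
            yield buffer
```

Each yielded buffer is a list of rows of strings, which could be converted with numpy.array(buffer, dtype=float) if all the fields are numeric.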

Either way, thanks everyone for your feedback! I think I like Python a
lot because of the great user community and willingness to help!