How can you make idle processors pick up Java work?


qwertmonkey

Is there a way to make the idle processors pick up and share the work as well,
or do I have to use multiple threads?
~
a) I need to actually scan large text files (10+ million lines).
b) On each line there is a NL sentence.
c) That processing should be run only once, but as fast as possible.
~
d) If you go:
d.1) int iPrx = Runtime.getRuntime().availableProcessors();
d.2) count all lines
d.3) split the file in (total lines)/iPrx
d.4) then run iPrx threads (or executable instances using a batch script)
the time you waste on d.2) and d.3) makes the whole strategy pointless.
~
I have no way to influence how those large files are generated
~
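e) Because the files are so large, you can't even map them in one go:
~
FileInputStream FIS = new FileInputStream(IFl);
FileChannel IFlChnl = FIS.getChannel();
// size() returns a long; this cast overflows for files larger than 2 GB
int iChnlSz = (int)IFlChnl.size();
// a single mapping cannot exceed Integer.MAX_VALUE bytes
MappedByteBuffer MptBytBfr = IFlChnl.map(FileChannel.MapMode.READ_ONLY, 0, iChnlSz);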
~
So, apparently, the only option I have is:
~
BufferedReader BfR = Files.newBufferedReader(DirPth, ChrStUTF8);
String aSx = BfR.readLine();
while(aSx != null){
  // process the sentence in aSx here
  aSx = BfR.readLine();
}
~
Do you know of a faster way to go about this?
~
lbrtchx
 

David Lamb

~
a) I need to actually scan large text files (10+ million lines).
b) On each line there is a NL sentence.
c) That processing should be run only once, but as fast as possible.
~
d) If you go:
d.1) int iPrx = Runtime.getRuntime().availableProcessors();
d.2) count all lines
d.3) split the file in (total lines)/iPrx
d.4) then run iPrx threads (or executable instances using a batch script)
the time you waste on d.2) and d.3) makes the whole strategy pointless.

How slow is the NL processing? Does it make any sense to read lines in
one thread and pass each off to one of the iPrx-1 other threads that
might run on separate processors?
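
A minimal sketch of that hand-off, assuming the per-line work can be expressed as a function that returns a number to be summed. The class name LineHandoff, the queue capacity of 1024 and the perLine callback are illustrative assumptions, not anything from the original post. The calling thread does all the reading and puts each line on a bounded queue; the worker threads take lines off it until they see a sentinel.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.ToLongFunction;

class LineHandoff {
    // End-of-input sentinel. It is compared by identity, so it can never
    // be confused with a real line that happens to contain the same text.
    private static final String POISON = new String("<eof>");

    /** Reads on the calling thread, processes on `workers` threads, returns the summed results. */
    static long run(BufferedReader in, int workers, ToLongFunction<String> perLine) {
        // Bounded queue: the reader blocks when the workers fall behind,
        // so memory use stays flat no matter how large the file is.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        AtomicLong total = new AtomicLong();
        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    for (String line = queue.take(); line != POISON; line = queue.take()) {
                        total.addAndGet(perLine.applyAsLong(line));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].setDaemon(true); // do not keep the JVM alive if the reader fails
            pool[i].start();
        }
        try {
            for (String line = in.readLine(); line != null; line = in.readLine()) {
                queue.put(line);
            }
            for (int i = 0; i < workers; i++) queue.put(POISON); // one sentinel per worker
            for (Thread t : pool) t.join();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total.get();
    }
}
```

With iPrx processors you would call something like LineHandoff.run(BfR, Math.max(1, iPrx - 1), line -> ...). Whether this helps at all depends on the answer to the question above: if the per-line work is cheap, the queue traffic may cost more than it saves. The sketch also does not handle a perLine callback that throws.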
 

Joshua Cranmer

[Gah, your newsreader is incapable of threading posts correctly. Please
find a non-broken one.]

~
a) I need to actually scan large text files (10+ million lines).
b) On each line there is a NL sentence.
c) That processing should be run only once, but as fast as possible.

Only 10M-line files?

The easiest way to do this is to just make a ThreadPoolExecutor and have
your main thread dispatch requests as fast as possible to the pool. Or
you can do the work pooling yourself, which may be faster since you're
not continually posting Runnables, but timing results would be
necessary to convince me.
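
A rough sketch of the ThreadPoolExecutor variant, under a few assumptions of mine: the per-line work returns a number to be summed, lines are submitted in batches to cut down on the number of Runnables, and the pool's queue is bounded with CallerRunsPolicy so the reading thread does some of the work itself instead of piling up tasks in memory. The names PooledLineScan, batchSize and perLine are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.ToLongFunction;

class PooledLineScan {
    /** Reads on the calling thread, dispatches batches of lines to the pool, returns the summed results. */
    static long run(BufferedReader in, int threads, int batchSize, ToLongFunction<String> perLine) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(4 * threads),       // bounded backlog of batches
                new ThreadPoolExecutor.CallerRunsPolicy());  // when full, the reader runs the batch itself
        AtomicLong total = new AtomicLong();
        try {
            List<String> batch = new ArrayList<>(batchSize);
            for (String line = in.readLine(); line != null; line = in.readLine()) {
                batch.add(line);
                if (batch.size() == batchSize) {
                    submit(pool, batch, perLine, total);
                    batch = new ArrayList<>(batchSize); // fresh list; the old one now belongs to the task
                }
            }
            if (!batch.isEmpty()) submit(pool, batch, perLine, total); // leftover partial batch
            pool.shutdown();
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // no-op after a clean shutdown; stops the workers on failure
        }
        return total.get();
    }

    private static void submit(ThreadPoolExecutor pool, List<String> batch,
                               ToLongFunction<String> perLine, AtomicLong total) {
        pool.execute(() -> {
            long sum = 0;
            for (String s : batch) sum += perLine.applyAsLong(s);
            total.addAndGet(sum); // one shared-counter update per batch, not per line
        });
    }
}
```

The batch size is the knob to time: a batch of 1 is the "post a Runnable per line" case, and larger batches move toward doing the work pooling yourself.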

There are other options, but chances are your disk drive is going to
saturate first (in short, they involve reading non-consecutive pages of
the file, which is generally a recipe for disaster).
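
The post does not spell out those other options. One reading, and this is my assumption, is to split the file by byte offset rather than by line count, which avoids the counting pass from d.2). The hypothetical ChunkBoundaries.split below picks n roughly equal byte ranges and moves each boundary forward to just past the next '\n', so every range starts at the beginning of a line. This is safe for UTF-8 because the byte 0x0A never occurs inside a multi-byte sequence.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChunkBoundaries {
    /** Returns n+1 offsets; chunk i covers bytes [result[i], result[i+1]) and starts on a line boundary. */
    static long[] split(Path file, int n) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            long[] bounds = new long[n + 1]; // bounds[0] is 0
            bounds[n] = size;
            ByteBuffer buf = ByteBuffer.allocate(8192);
            for (int i = 1; i < n; i++) {
                // Start looking at the ideal cut point, but never before the previous boundary.
                long pos = Math.max(bounds[i - 1], size * i / n);
                bounds[i] = size; // used if no newline follows the cut point
                scan:
                while (pos < size) {
                    buf.clear();
                    int read = ch.read(buf, pos); // positional read; channel position is untouched
                    if (read <= 0) break;
                    for (int k = 0; k < read; k++) {
                        if (buf.get(k) == '\n') {
                            bounds[i] = pos + k + 1; // first byte after the newline
                            break scan;
                        }
                    }
                    pos += read;
                }
            }
            return bounds;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each range could then be handed to its own thread and read or mapped separately, provided a single range stays under the 2 GB mapping limit. On a single spinning disk this is exactly the non-consecutive access pattern warned about above, so it would need measuring before being trusted.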
 
