Runtime.getRuntime().exec() very slow in Java program.

au.danji

I am trying to use Runtime.getRuntime().exec(shellCMD) to copy
files on a Linux system, but exec() is very slow: it copies only
around 2-10 documents/second to my target directory when I have
1000 files. Can anyone give some suggestions about my code below?
Thanks.

for (doc ahit : docList) {
    try {
        shellCMD = "cp " + srcDir + "/" + ahit.doc_id + " " + tarDumpDir; // copy xml to tmp folder
        Process process = Runtime.getRuntime().exec(shellCMD);
        //process.waitFor();
        process.getInputStream().close();
        process.getOutputStream().close();
        process.getErrorStream().close();
    }
    catch (Exception e) {
        logError("Copy XML fail: " + e);
    }
}
 
Lew

au.danji said:
I am trying to run the Runtime.getRuntime().exec(shellCMD) to copy
files on a linux system.
but the getRuntime.exec() is very slow, it can only copy around 2-10
documents/second to my target directory when I have 1000 files.

How fast should it be?

Why do you think it should be that fast instead?

au.danji said:
Can anyone give some suggestions about my code below? thanks.

[code snipped]

I am only speculating, but several things occur to me.

How fast would a shell script run:

#!/bin/bash
for fl in $*
do
bash cp ${srcDir}/${fl} ${tarDumpDir}/
done
?

Your program has to start a shell for each file copied. Given that
you show us the "cp" command, presumably that shell has to process
/etc/profile, ~/.profile and ~/.bashrc (or equivalent) each time, not
to mention the scripts in /etc/profile.d/. Add to that the overhead
of 'Runtime#exec()'.

It would likely run faster if you either ran a single shell command to
copy all the files, or used pure Java to do the copy without using
'Runtime' at all.

With a pure Java approach, you can put each copy in its own thread to
achieve a measure of parallelism. Plus it would be portable.
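
For illustration, here is a minimal sketch of that pure-Java, multithreaded approach. It assumes Java 7 or later (for java.nio.file.Files) and a list of plain file names; the class name ParallelCopy, the method copyAll and the thread-count parameter are hypothetical names made up for this example, not anything from the original program.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: copies files in-process on a small thread pool.
final class ParallelCopy {

    /** Copies each named file from srcDir into targetDir; returns how many succeeded. */
    static int copyAll(Path srcDir, Path targetDir, List<String> names, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Path>> pending = new ArrayList<>();
        for (String name : names) {
            // Each copy is an independent task; no external process is started.
            pending.add(pool.submit(() -> {
                return Files.copy(srcDir.resolve(name), targetDir.resolve(name),
                        StandardCopyOption.REPLACE_EXISTING);
            }));
        }
        pool.shutdown(); // accept no new tasks; queued copies still run

        int copied = 0;
        for (Future<Path> task : pending) {
            try {
                task.get(); // blocks until this copy has finished
                copied++;
            } catch (ExecutionException e) {
                // Mirrors the original logError call; a failed copy does not stop the rest.
                System.err.println("Copy XML fail: " + e.getCause());
            }
        }
        return copied;
    }
}
```

A call might look like ParallelCopy.copyAll(Paths.get(srcDir), Paths.get(tarDumpDir), names, 4). Whether extra threads help at all depends on the disks involved, so the pool size is something to measure rather than assume.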
 
Knute Johnson

Just use the Java buffered streams with large buffers to do your copying.
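
A minimal sketch of what that could look like, assuming Java 7 or later for try-with-resources. The class name BufferedCopy and the 64 KB buffer size are assumptions picked for the example; the best buffer size depends on the system and is worth measuring.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copies one file through buffered streams.
final class BufferedCopy {

    // Assumed buffer size; tune it for the files and disks in question.
    private static final int BUFFER_SIZE = 64 * 1024;

    /** Copies source to target and returns the number of bytes written. */
    static long copy(File source, File target) throws IOException {
        long total = 0;
        // try-with-resources closes (and flushes) both streams, even on failure.
        try (InputStream in = new BufferedInputStream(
                     new FileInputStream(source), BUFFER_SIZE);
             OutputStream out = new BufferedOutputStream(
                     new FileOutputStream(target), BUFFER_SIZE)) {
            byte[] chunk = new byte[BUFFER_SIZE];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

In the loop from the question, the exec() call would become something like BufferedCopy.copy(new File(srcDir, ahit.doc_id), new File(tarDumpDir, ahit.doc_id)), assuming doc_id is a String holding the file name.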
 
Lew

BTW, 'doc' as a type name does not follow the Java naming conventions,
which call for the first letter of type names to be upper case.

<http://java.sun.com/docs/codeconv/index.html>

A common and sensible variant of these conventions is to place an
opening brace on its own line, indented to the same level as the
control structure and the closing brace. Otherwise the Java community
follows them rather closely. To do so promotes effective
communication and helps minimize bugs.
 
Arne Vajhøj

au.danji said:
I am trying to run the Runtime.getRuntime().exec(shellCMD) to copy
files on a linux system.
but the getRuntime.exec() is very slow, it can only copy around 2-10
documents/second to my target directory

That does not say anything, because we do not know how big
your files are.

au.danji said:
when I have 1000 files. Can
anyone give some suggestions about my code below? thanks.

[code snipped]

Both process creation and file creation are expensive operations. But
they should be a lot faster than 2-10 per second.

So if it is small files, then it should be faster.

But there are a lot of parameters that influence it: available memory,
other IO on the same disks, etc.

BTW, what is the purpose of getting all the streams and closing them?

And not calling waitFor in the loop will help parallelization, but
never calling waitFor at all leaves the situation after the loop a
bit fuzzy: you do not know whether the copies have finished or failed.
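
As a sketch of one way to make the end state definite: start a single cp process for all the files, let it inherit the JVM's streams instead of closing them, and wait for its exit code. This assumes a Unix-like system with cp on the PATH and a file list short enough to fit on one command line; the class name SingleCp and the method copyWithOneProcess are hypothetical. For what it is worth, Runtime.exec(String) splits the command on whitespace and launches the program directly, so no shell is involved in either variant.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one external cp invocation instead of one per file.
final class SingleCp {

    /** Runs "cp file1 file2 ... targetDir" once and returns cp's exit code. */
    static int copyWithOneProcess(Path srcDir, Path targetDir, List<String> names)
            throws IOException, InterruptedException {
        List<String> command = new ArrayList<>();
        command.add("cp");
        for (String name : names) {
            command.add(srcDir.resolve(name).toString());
        }
        command.add(targetDir.toString()); // last argument is the destination directory

        Process process = new ProcessBuilder(command)
                .inheritIO() // cp's error messages go to the JVM's own stderr
                .start();
        // After waitFor returns, the copy has either completed or failed.
        return process.waitFor();
    }
}
```

A non-zero return value means cp reported a problem, which the caller can log the same way the original code logs exceptions.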

Arne
 
Jon Gómez

Lew said:
I am only speculating, but several things occur to me.

How fast would a shell script run:

#!/bin/bash
for fl in $*
do
bash cp ${srcDir}/${fl} ${tarDumpDir}/
done
?

You should remove "bash" in the line underneath "do", as it expects a
script in that context and spawning another copy of bash is anyway
un-necessary.

Also, if the filenames have certain kinds of whitespace the shell
variables should be quoted: "${fl}", etc. This can cause trouble,
otherwise:

touch tmp1
hi="tmp1 tmp2"
cp ${hi}
# tmp2 now exists: the unquoted expansion was split
# into two arguments, "tmp1" and "tmp2"
# (how it is split depends on IFS)
The OP could also straight-out shell glob in the call to exec(). Here's
an example:

Runtime.getRuntime().exec(
    new String[] {
        "sh", "-c", "cp ? to/"
    },
    null,
    null
);

You stole my thunder though, with the shell script suggestion :). I
totally didn't even think of multithreading, I was so obsessed with the
inefficiency of buffered I/O for a simple system call (SYS_rename?), but
that's a great idea (especially if we want to be more portable).

Jon.
 
Arne Vajhøj

Jon said:
You should remove "bash" in the line underneath "do", as it expects a
script in that context and spawning another copy of bash is anyway
un-necessary.

It is unnecessary functionality-wise.

But it is necessary to do the same as the Java program.

Arne
 
Roedy Green

Use the FileTransfer class to copy files purely within Java. This will
be much faster than spawning a command processor and copy. It will
not have to load any new code for each copy.

see http://mindprod.com/products1.html#FILETRANSFER

If the files are fairly small, you can do it even faster with

http://mindprod.com/products1.html#HUNKIO
--
Roedy Green Canadian Mind Products
http://mindprod.com
