Invoking 'diff' from java with piped input

David Kensche

Hello,
I want to call GNU diff from a java class with the following command

diff -u r2.txt r1.txt > r2r1.patch

except that the three files don't exist: I am given the
input as two Strings and I want to read the output from a stream.
Afaics I have to invoke

Process diffProc = Runtime.getRuntime().exec("diff -u");

and can then read the result from the InputStream available via

diffProc.getInputStream();

My problem is: how do I give the input? I assume there has to be a
way to pipe the Strings to the process but I can't make out how.
Has anyone tried something similar successfully?

Thanks beforehand,
David
 
Michael Borgwardt

David said:
Hello,
I want to call GNU diff from a java class with the following command [...]
My problem is: how do I give the input? I assume there has to be a
way to pipe the Strings to the process but I can't make out how.

Just write them to a file, then give the file name as argument. No,
there is no other way, because a process can have only one input stream.

The other way would be to find and use a diff implementation in Java.
 
Gordon Beaton

I want to call GNU diff from a java class with the following command

diff -u r2.txt r1.txt > r2r1.patch

except that the three files don't exist: I am given the
input as two Strings and I want to read the output from a stream.
Afaics I have to invoke

Process diffProc = Runtime.getRuntime().exec("diff -u");

and can then read the result from the InputStream available via

diffProc.getInputStream();

My problem is: how do I give the input? I assume there has to be a
way to pipe the Strings to the process but I can't make out how.
Has anyone tried something similar successfully?

Any process has only one standard input stream, so diff can't read
both inputs that way. It can, however, read one of them from stdin.
You tell it to do that by specifying a hyphen (-) as one of the
filenames. This is described in the diff documentation. Your program
can write to diff's stdin by writing to diffProc.getOutputStream().
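
A minimal sketch of the hyphen approach, assuming a POSIX-style diff on the PATH and Java 9 or later (for InputStream.readAllBytes). The class name StdinDiff and its diff method are made up for illustration. One input goes into a temporary file, the other is written to diff's stdin from a helper thread so that the main thread can drain the output at the same time.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the name and structure are illustrative only.
class StdinDiff {

    static String diff(String orig, String rev) throws IOException, InterruptedException {
        Path origFile = Files.createTempFile("orig", ".txt");
        try {
            Files.write(origFile, orig.getBytes(StandardCharsets.UTF_8));

            // "-" tells diff to read the second input from its standard input.
            Process diffProc = new ProcessBuilder("diff", "-u", origFile.toString(), "-")
                    .redirectErrorStream(true)
                    .start();

            // Write stdin in a separate thread so the main thread can read stdout.
            Thread writer = new Thread(() -> {
                try (OutputStream stdin = diffProc.getOutputStream()) {
                    stdin.write(rev.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    // diff went away early; its exit status is checked below.
                }
            });
            writer.start();

            // Read all output BEFORE waiting for the process to finish.
            String patch = new String(
                    diffProc.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            writer.join();

            // diff exits with 0 (no differences), 1 (differences) or 2 (trouble).
            int status = diffProc.waitFor();
            if (status > 1) {
                throw new IOException("diff failed with status " + status + ": " + patch);
            }
            return patch;
        } finally {
            Files.deleteIfExists(origFile);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(diff("a\nb\n", "a\nc\n"));
    }
}
```

The header lines of the resulting patch name the temporary file and "-", so they may need rewriting before the patch is stored.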

From Java you can write to named pipes using regular file operations,
so presumably you could create a named pipe, pass its name to diff,
and then write to it from your Java program. You will, however, need
an external helper (mknod or mkfifo) to create the named pipe.

If this seems complicated, consider writing the data to two temporary
files and passing their names to diff.
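
The temporary-file variant could look roughly like this. It is a sketch under the same assumptions (diff on the PATH, Java 9 or later); TempFileDiff is a made-up name, and error handling is kept to a minimum.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the name and structure are illustrative only.
class TempFileDiff {

    static String diff(String orig, String rev) throws IOException, InterruptedException {
        Path origFile = Files.createTempFile("orig", ".txt");
        Path revFile = Files.createTempFile("rev", ".txt");
        try {
            Files.write(origFile, orig.getBytes(StandardCharsets.UTF_8));
            Files.write(revFile, rev.getBytes(StandardCharsets.UTF_8));

            Process diffProc = new ProcessBuilder(
                    "diff", "-u", origFile.toString(), revFile.toString())
                    .redirectErrorStream(true)
                    .start();
            // diff needs no stdin here, so close it right away.
            diffProc.getOutputStream().close();

            // Drain the output first, then wait for the exit status.
            String patch = new String(
                    diffProc.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            int status = diffProc.waitFor();
            if (status > 1) { // 0 = same, 1 = different, 2 = trouble
                throw new IOException("diff failed with status " + status + ": " + patch);
            }
            return patch;
        } finally {
            Files.deleteIfExists(origFile);
            Files.deleteIfExists(revFile);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(diff("a\nb\n", "a\nc\n"));
    }
}
```

No threads are needed in this variant because both inputs are completely on disk before diff starts.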

Alternately, there may be a Java version of diff that you can invoke
directly. For example, this: http://www.bmsi.com/java/#diff

/gordon
 
David Kensche

Hello again,
I tried to implement the solution with the named pipes. This is my
second try. I first tried it without the threads. The result is that
diff does not produce output, i.e. in the end 'patchString' is null.
If I neither flush nor close the writers, I get the same behaviour.
If, on the other hand, I flush or close them (or both), diff does not
terminate. Do you know how to handle these streams?

Again, many thanks beforehand,
David

public PatchScript createPatch(final String orig, final String rev)
        throws DiffFailedException {
    logger.debug("orig=\n"+orig);
    logger.debug("rev=\n"+rev);
    PatchScript patch = null;
    logger.debug("Write input to named pipes.");
    new Thread() {
        public void run() {
            try {
                logger.debug("Open pipe to write input to.");
                Runtime.getRuntime().exec("mkfifo original");
                FileWriter oWriter = new FileWriter(new File("original"));
                oWriter.write(orig);
                oWriter.flush();
                oWriter.close();
            } catch(IOException e) {
                logger.warn("Could not write 'orig' to named pipe.", e);
            }
        }
    }.start();
    new Thread() {
        public void run() {
            try {
                logger.debug("Open pipe to write input to.");
                Runtime.getRuntime().exec("mkfifo revision");
                FileWriter rWriter = new FileWriter(new File("revision"));
                rWriter.write(rev);
                rWriter.flush();
                rWriter.close();
            } catch(IOException e) {
                logger.warn("Could not write 'rev' to named pipe.", e);
            }
        }
    }.start();

    try {
        logger.debug("Start 'diff' process.");
        Process diffProc = Runtime.getRuntime().exec("diff -u original revision");
        logger.debug("Wait for 'diff' to finish.");
        diffProc.waitFor();
        logger.debug("diff-status="+diffProc.exitValue()+". Read patch script.");
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(diffProc.getInputStream()));
        String patchString = null;
        String line = reader.readLine();
        if(line != null) patchString = line;
        while((line = reader.readLine()) != null) patchString += line + "\n";
        logger.debug("patch=\n" + patchString);
        patch = parser.parse(patchString);
    } catch(Exception e) {
        throw new DiffFailedException("Could not create patch script!", e);
    }
    return patch;
}
 
Gordon Beaton

I tried to implement the solution with the named pipes. This is my
second try. I first tried it without the threads. The result is that
diff does not produce output, i.e. in the end 'patchString' is null.
If I neither flush nor close the writers, I get the same behaviour.
If, on the other hand, I flush or close them (or both), diff does not
terminate. Do you know how to handle these streams?

Your code works when I run it, however there may be a race condition
when you create the fifos in a separate thread. Do you know that they
exist before diff attempts to open them? Create the fifos in the main
thread; write to them in the writer threads.

Also, in your example, you wait for diff to finish before reading its
output stream. To avoid deadlocking, you need to read diff's output
while it runs. If diff's output buffer fills and nobody is reading
from it, diff is prevented from continuing (and you block waiting
for it).
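
Both points can be sketched as follows: the fifos are created (and mkfifo is waited for) in the main thread, the writer threads only write, and the output is read before waitFor() is called. This is an illustrative sketch, not a drop-in replacement for the posted method. It assumes mkfifo and diff are on the PATH and Java 9 or later; FifoDiff and the pipe file names are made up.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the name and structure are illustrative only.
class FifoDiff {

    static String diff(String orig, String rev) throws IOException, InterruptedException {
        Path dir = Files.createTempDirectory("fifodiff");
        Path origPipe = dir.resolve("original.fifo");
        Path revPipe = dir.resolve("revision.fifo");
        try {
            // Main thread: create both fifos and wait until they really exist.
            mkfifo(origPipe);
            mkfifo(revPipe);

            // Writer threads: opening a fifo blocks until diff opens it for reading.
            startWriter(origPipe, orig);
            startWriter(revPipe, rev);

            Process diffProc = new ProcessBuilder(
                    "diff", "-u", origPipe.toString(), revPipe.toString())
                    .redirectErrorStream(true)
                    .start();

            // Read the output while diff runs, THEN wait for it to finish.
            String patch = new String(
                    diffProc.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            int status = diffProc.waitFor();
            if (status > 1) { // 0 = same, 1 = different, 2 = trouble
                throw new IOException("diff failed with status " + status + ": " + patch);
            }
            return patch;
        } finally {
            Files.deleteIfExists(origPipe);
            Files.deleteIfExists(revPipe);
            Files.deleteIfExists(dir);
        }
    }

    private static void mkfifo(Path pipe) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("mkfifo", pipe.toString())
                .redirectErrorStream(true)
                .start();
        String message = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        if (p.waitFor() != 0) {
            throw new IOException("mkfifo failed: " + message);
        }
    }

    private static void startWriter(Path pipe, String text) {
        Thread writer = new Thread(() -> {
            try (OutputStream out = new FileOutputStream(pipe.toFile())) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                // diff stopped reading early; its exit status reports the problem.
            }
        });
        // Daemon thread: if diff never opens the fifo, a writer stuck in
        // open() must not keep the JVM alive.
        writer.setDaemon(true);
        writer.start();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(diff("a\nb\n", "a\nc\n"));
    }
}
```

Checking mkfifo's exit status also surfaces failures such as a pipe that already exists or a directory that is not writable, which would otherwise go unnoticed.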

/gordon
 
Alan Gutierrez

David said:
Hello,
I want to call GNU diff from a java class with the following command [...]
My problem is: how do I give the input? I assume there has to be a
way to pipe the Strings to the process but I can't make out how.

Michael said:
Just write them to a file, then give the file name as argument. No,
there is no other way, because a process can have only one input stream.

The other way would be to find and use a diff implementation in Java.

Like, for example, the diff algorithm that comes with Eclipse.

It is under org.eclipse.compare.

I've extracted it for use with a testing framework. It is very easy
to use, and it is cross-platform pure Java.
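
To give an idea of what a pure-Java line diff involves, here is a minimal sketch based on a longest-common-subsequence table. It is not the algorithm from org.eclipse.compare or any other library mentioned in this thread; the class name LineDiff is made up, and the output is a plain listing with " ", "-" and "+" prefixes rather than a unified patch with hunks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the name and structure are illustrative only.
class LineDiff {

    static List<String> diff(String[] a, String[] b) {
        int n = a.length;
        int m = b.length;

        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = a[i].equals(b[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk the table from the start and emit kept, removed and added lines.
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (a[i].equals(b[j])) {
                out.add(" " + a[i]);
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("-" + a[i++]);
            } else {
                out.add("+" + b[j++]);
            }
        }
        while (i < n) {
            out.add("-" + a[i++]);
        }
        while (j < m) {
            out.add("+" + b[j++]);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] orig = "a\nb\nc".split("\n");
        String[] rev = "a\nx\nc".split("\n");
        for (String line : diff(orig, rev)) {
            System.out.println(line);
        }
    }
}
```

The table needs memory proportional to the product of the two line counts, so this is only suitable for small inputs; real diff implementations use more economical algorithms.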
 
David Kensche

Gordon said:
Your code works when I run it, however there may be a race condition
when you create the fifos in a separate thread. Do you know that they
exist before diff attempts to open them? Create the fifos in the main
thread; write to them in the writer threads.

Also, in your example, you wait for diff to finish before getting the
output stream. To avoid deadlocking, you need to read from diff's
output while it runs. If diff's output stream fills and there is
nobody reading from it, diff is prevented from continuing (and you
block waiting for it).

/gordon

Thank you,
now it works fine :). The problem was my waiting for diff before
reading the result. This is my running code:

public PatchScript createPatch(final String orig, final String rev)
        throws DiffFailedException {
    PatchScript patch = null;

    try {
        logger.debug("Create named pipes to write input to.");
        Runtime.getRuntime().exec("mkfifo " + ORIG_PIPE_NAME);
        Runtime.getRuntime().exec("mkfifo " + REV_PIPE_NAME);

        logger.debug("Write input to named pipes.");
        new Thread() {
            public void run() {
                try {
                    logger.debug("Open pipe:" + ORIG_PIPE_NAME);
                    File oPipe = new File(ORIG_PIPE_NAME);
                    if(oPipe.exists()) {
                        FileWriter oWriter = new FileWriter(oPipe);
                        oWriter.write(orig);
                        oWriter.flush();
                        oWriter.close();
                        logger.debug("Revision written. rWriter closed.");
                    } else logger.warn("Could not find named pipe: " +
                            ORIG_PIPE_NAME);
                } catch(IOException e) {
                    logger.warn("Could not write 'orig' to named pipe.", e);
                }
            }
        }.start();
        new Thread() {
            public void run() {
                try {
                    logger.debug("Open pipe:" + REV_PIPE_NAME);
                    File rPipe = new File(REV_PIPE_NAME);
                    if(rPipe.exists()) {
                        FileWriter rWriter = new FileWriter(rPipe);
                        rWriter.write(rev);
                        rWriter.flush();
                        rWriter.close();
                        logger.debug("Revision written. rWriter closed.");
                    } else logger.warn("Could not find named pipe: " +
                            REV_PIPE_NAME);
                } catch(IOException e) {
                    logger.warn("Could not write 'rev' to named pipe.", e);
                }
            }
        }.start();

        logger.debug("Start 'diff' process.");
        Process diffProc = Runtime.getRuntime().exec("diff -u original revision");

        logger.debug("Read result.");
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(diffProc.getInputStream()));
        String line = reader.readLine();
        String patchString = null;
        if(line != null) patchString = line;
        while((line = reader.readLine()) != null) patchString += line + "\n";
        logger.debug("patch=\n" + patchString);

        logger.debug("Wait for 'diff' to finish.");
        diffProc.waitFor();
        logger.debug("diff-status="+diffProc.exitValue()+". Read patch script.");

        patch = parser.parse(patchString);
    } catch(Exception e) {
        throw new DiffFailedException("Could not create patch script!", e);
    }
    return patch;
}
 
David Kensche

Alan said:
David Kensche wrote:

Hello,
I want to call GNU diff from a java class with the following command
[...]

My problem is: how do I give the input? I assume there has to be a
way to pipe the Strings to the process but I can't make out how.

Just write them to a file, then give the file name as argument. No,
there is no other way, because a process can have only one input stream.

The other way would be to find and use a diff implementation in Java.


Like, for example, the diff algorithm that comes with Eclipse.

It is under org.eclipse.compare.

I've extracted it for use with a testing framework. It is very easy
to use, and it is cross-platform pure Java.

Hello,
my first implementation used JRCS, but it was prohibitively slow at
patching. This is why I decided to try GNU diff/patch. To be honest,
I thought about trying Eclipse instead, but I was sure that Eclipse
uses diff and patch as provided by the CVS installation. If there
is a Java implementation, I will try that, too.

Thanks,
David
 
David Kensche

David said:
Thank you,
now it works fine :). The problem was my waiting for diff before
reading the result. This is my running code:

Damn, I was wrong. The only reason it worked was that I forgot to
replace the filenames in the diff invocation with the newly introduced
constants (which have different values). Thus diff read from pipes
that were created in earlier tests, and the pipes created by the given
code were not read from at all! I have now corrected this, but
'original.fifo' cannot be found, even if I switch the order of the two
writer threads. Did I get something seriously wrong with the thread
programming?

public PatchScript createPatch(final String orig, final String rev)
        throws DiffFailedException {
    PatchScript patch = null;

    try {
        logger.debug("Create named pipes to write input to.");
        Runtime.getRuntime().exec("mkfifo " + ORIG_PIPE_NAME);
        Runtime.getRuntime().exec("mkfifo " + REV_PIPE_NAME);
        long millis = System.currentTimeMillis() + 5000;
        while(System.currentTimeMillis() < millis) {}

        logger.debug("Write input to named pipes.");
        new Thread() {
            public void run() {
                try {
                    logger.debug("Open pipe:" + ORIG_PIPE_NAME);
                    File oPipe = new File(ORIG_PIPE_NAME);
                    if(oPipe.exists()) {
                        FileWriter oWriter = new FileWriter(oPipe);
                        oWriter.write(orig);
                        oWriter.flush();
                        oWriter.close();
                        logger.debug("Original written. Writer closed.");
                    } else logger.warn("Could not find named pipe: " +
                            ORIG_PIPE_NAME);
                } catch(IOException e) {
                    logger.warn("Could not write 'orig' to named pipe.", e);
                }
            }
        }.start();
        new Thread() {
            public void run() {
                try {
                    logger.debug("Open pipe:" + REV_PIPE_NAME);
                    File rPipe = new File(REV_PIPE_NAME);
                    if(rPipe.exists()) {
                        FileWriter rWriter = new FileWriter(rPipe);
                        rWriter.write(rev);
                        rWriter.flush();
                        rWriter.close();
                        logger.debug("Revision written. Writer closed.");
                    } else logger.warn("Could not find named pipe: " +
                            REV_PIPE_NAME);
                } catch(IOException e) {
                    logger.warn("Could not write 'rev' to named pipe.", e);
                }
            }
        }.start();

        logger.debug("Start 'diff' process.");
        Process diffProc = Runtime.getRuntime().exec("diff -u " +
                ORIG_PIPE_NAME + " " + REV_PIPE_NAME);

        logger.debug("Read result.");
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(diffProc.getInputStream()));
        String line = reader.readLine();
        String patchString = null;
        if(line != null) patchString = line + "\n";
        while((line = reader.readLine()) != null) patchString += line + "\n";
        logger.debug("patch=\n" + patchString);

        logger.debug("Wait for 'diff' to finish.");
        diffProc.waitFor();
        logger.debug("diff-status="+diffProc.exitValue()+". Read patch script.");

        patch = parser.parse(patchString);
    } catch(Exception e) {
        throw new DiffFailedException("Could not create patch script!", e);
    }
    return patch;
}
 
Gordon Beaton

Thank you,
now it works fine :). The problem was my waiting for diff before
reading the result. This is my running code:

Good that you got it working, but after suggesting that you use fifos I
realized that using fifos and using temporary files involve virtually
the same steps. The difference is that the fifo solution doesn't require
(much) disk space, but it's the less portable alternative, and
the fifos additionally need to be created explicitly. In retrospect,
I would have used temporary files.

Note that you only needed to create one fifo, since diff will read one
of the "files" from stdin if you specify "-" as the filename.

/gordon
 
Alan Gutierrez

David Kensche wrote:
my first implementation used JRCS, but it was prohibitively slow at
patching. This is why I decided to try GNU diff/patch. To be honest,
I thought about trying Eclipse instead, but I was sure that Eclipse
uses diff and patch as provided by the CVS installation. If there
is a Java implementation, I will try that, too.

Please share your experiences with all these different
difference implementations. I'd like to know how the Eclipse compare
algorithm compares to GNU diff and JRCS.
 
David Kensche

Please share your experiences with all these different
difference implementations. I'd like to know how the Eclipse compare
algorithm compares to GNU diff and JRCS.

To be honest, I just tested my old implementation a few more times, and
it seems that I have an error in it, and _this_ is the reason why it
does not work correctly. My problem with JRCS was that I could not find
a parser for the patch format, and I am also not able to understand the
cryptic code. Consequently I had to wrap deltas and chunks in classes
that create an (inefficient) XML representation, so that I could write
an XML parser that is agnostic about the semantics of the variables
used. But the error has to be somewhere in there. I'll try to find it.

Greetings,
David
 
