problems getting stderr and stdout from process object

horos22

All,

I'm using the following idiom to get stderr and stdout from a java
process (iterating first over stderr, and then over stdout), and am
running into a world of hurt.

What is happening is that the input stream seems to be 'getting
stuck'; ie: my guess is some internal buffer associated with the
InputStream fills up for stdout which causes the first loop to hang.

So my question is - is there a way to 'interleave' the output from the
two input streams?

Ed

(By the way, what follows is the source for doing this. The code stalls on
the br_out.readLine() line.)


import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;

Process proc = Runtime.getRuntime().exec(_mycmd);

InputStream stderr = proc.getErrorStream();
InputStream stdout = proc.getInputStream();

BufferedReader br_err = new BufferedReader(new InputStreamReader(stderr));
BufferedReader br_out = new BufferedReader(new InputStreamReader(stdout));

String line;
while ((line = br_err.readLine()) != null) { .. }
while ((line = br_out.readLine()) != null) { .. }
 
Lothar Kimmeringer

horos22 said:
BufferedReader br_err = new BufferedReader(new InputStreamReader(stderr));
BufferedReader br_out = new BufferedReader(new InputStreamReader(stdout));

while ((line = br_err.readLine()) != null) { .. }
while ((line = br_out.readLine()) != null) { .. }

You have to put these two while-loops into two independent
threads. STDERR stays open until the process ends, so br_err.readLine()
will block until the process writes something to STDERR. If that
doesn't happen but the process keeps writing to STDOUT, the buffer
there will fill up sooner or later, blocking the process.
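
A minimal sketch of the two-thread approach described above, assuming Java 8 or later and a POSIX `sh` for the demo command. `StreamDrainer` and `drain` are illustrative names, not part of any library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

class StreamDrainer {
    // Starts a thread that hands every line read from 'in' to 'handler'.
    static Thread drain(InputStream in, Consumer<String> handler) {
        Thread t = new Thread(() -> {
            try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
                String line; // local to this thread, so nothing is shared
                while ((line = r.readLine()) != null) {
                    handler.accept(line);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        // Assumed demo command: writes one line to each stream.
        Process proc = Runtime.getRuntime().exec(
                new String[] {"sh", "-c", "echo to-stdout; echo to-stderr 1>&2"});
        Thread tOut = drain(proc.getInputStream(), System.out::println);
        Thread tErr = drain(proc.getErrorStream(), System.err::println);
        int exit = proc.waitFor(); // both pipes are being emptied, so the child cannot stall
        tOut.join();
        tErr.join();
        System.out.println("exit code: " + exit);
    }
}
```

Each stream has its own reader thread, so neither pipe can fill up while the other is being waited on.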

Another solution would be a polling-mechanism:

while (subProcessRunning()) {
    // BufferedReader has no available(); ready() is the non-blocking check
    if (br_err.ready()) {
        line = br_err.readLine();
        ...
    }
    if (br_out.ready()) {
        line = br_out.readLine();
        ...
    }
}

But I'm no friend of this kind of thing.


Regards, Lothar
--
Lothar Kimmeringer E-Mail: (e-mail address removed)
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)

Always remember: The answer is forty-two, there can only be wrong
questions!
 
horos11

Lothar,

Just curious, but why no polling?

To me, this seems a lot cleaner - with threads, I see getting
interleaved output (snippets of strings from stderr and stdout all
jumbled together)
Is it simply a question of performance?

Ed
 
Lew

horos11 said:
Just curious, but why no polling?

Polling spins a lot of CPU cycles with no work being done.

To me, this seems a lot cleaner - with threads, I see getting
interleaved output (snippets of strings from stderr and stdout all
jumbled together)

Nonsense. The suggestion is to use separate threads to pull stdout
and stderr - they don't have to mix what they receive if you don't
want them to.

Is it simply a question of performance?

Simple doesn't mean wrong, irrelevant or trivial.
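
One way to keep what the two threads receive apart is to have each reader return its own result. The following is a sketch only, assuming Java 8 or later and a POSIX `sh` for the demo command; `SeparateCapture`, `readAll` and `run` are made-up names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

class SeparateCapture {
    // Reads a stream to its end and returns the lines joined with '\n'.
    static String readAll(InputStream in) throws IOException {
        try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
            return r.lines().collect(Collectors.joining("\n"));
        }
    }

    // Runs 'cmd' and returns {stdout, stderr}, each captured by its own task.
    static String[] run(String... cmd)
            throws IOException, InterruptedException, ExecutionException {
        Process proc = new ProcessBuilder(cmd).start();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> out = pool.submit(() -> readAll(proc.getInputStream()));
            Future<String> err = pool.submit(() -> readAll(proc.getErrorStream()));
            proc.waitFor();
            return new String[] {out.get(), err.get()};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String[] result = run("sh", "-c", "echo to-stdout; echo to-stderr 1>&2");
        System.out.println("stdout: " + result[0]);
        System.out.println("stderr: " + result[1]);
    }
}
```

The two tasks share no variables, so there is nothing to lock; the caller decides afterwards how, or whether, to combine the two results.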
 
Daniel Pitts

horos22 said:
All,

I'm using the following idiom to get stderr and stdout from a java
process (iterating first over stderr, and then over stdout), and am
running into a world of hurt.

What is happening is that the input stream seems to be 'getting
stuck'; ie: my guess is some internal buffer associated with the
InputStream fills up for stdout which causes the first loop to hang.

Actually, what may be happening is that the process itself isn't
finishing because it is blocking on something else altogether. It is
also possible that the stderr buffer is full, so the process blocks
until *it* completes.

In general, you will need to read the stdout and stderr from separate
threads. This is a flaw in the API in my opinion.

So my question is - is there a way to 'interleave' the output from the
two input streams?

Look into ProcessBuilder, I seem to recall there is a way to ask for
interleaved output (you won't be able to tell which was which, however).
<http://java.sun.com/javase/6/docs/api/index.html?java/lang/ProcessBuilder.html>

Look at "redirectErrorStream".
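
A short sketch of what `redirectErrorStream(true)` looks like in use, assuming a POSIX `sh` for the demo command; `MergedOutput` and `run` are illustrative names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

class MergedOutput {
    // Runs 'cmd' with stderr folded into stdout and returns every line read.
    static List<String> run(String... cmd) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true); // stderr now arrives on getInputStream()
        Process proc = pb.start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(proc.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        proc.waitFor();
        return lines;
    }

    public static void main(String[] args) throws Exception {
        for (String line : run("sh", "-c", "echo to-stdout; echo to-stderr 1>&2")) {
            System.out.println(line);
        }
    }
}
```

With a single merged stream there is only one pipe to drain, so one loop in the calling thread is enough. The cost is the one already mentioned: the lines no longer say which stream they came from.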
 
horos11

Nonsense.  The suggestion is to use separate threads to pull stdout
and stderr - they don't have to mix what they receive if you don't
want them to.


Simple doesn't mean wrong, irrelevant or trivial.

Put it this way; suppose I have separate threads to do what you say,
but at the same time I don't want to do buffering, ie: I want to show
output *as it happens*, so that the loops look like

thread1: while (line = ... get stdout) { System.out.println(line) }
thread2: while (line = ... get stderr) { System.err.println(line) }

what guarantee do I have that the first thread doesn't interfere with
the second? I would need to put locks around the two threads, which to
me is logically equivalent to the polling.

Ed
 
Lew

horos11 said:
Put it this way; suppose I have separate threads to do what you say,
but at the same time I don't want to do buffering, ie: I want to show
output *as it happens*, so that the loops look like

thread1: while (line = ... get stdout) { System.out.println(line) }

Could you rephrase that in Java, please?

thread2: while (line = ... get stderr) { System.err.println(line) }

what guarantee do I have that the first thread doesn't interfere with
the second? I would need to put locks around the two threads, which to
me is logically equivalent to the polling.

Well, since you're 'println()'ing to two different streams I don't see the
need for locks. You would have to use different local variables to capture
the process output, rather than a common pointer 'line' for both, of course.

In fact, I have no clue as to what you mean by "putting locks around two
threads". What shared data do you plan to guard?

Based on the pseudocode you show here, I don't see any thread interference.

The question of buffering is moot.
 
Lew

Lew said:
Could you rephrase that in Java, please?


Well, since you're 'println()'ing to two different streams I don't see
the need for locks. You would have to use different local variables to
capture the process output, rather than a common pointer 'line' for
both, of course.

In fact, I have no clue as to what you mean by "putting locks around two
threads". What shared data do you plan to guard?

Based on the pseudocode you show here, I don't see any thread interference.

The question of buffering is moot.

Oh, and locking a common resource is not equivalent to polling, logically or
otherwise.
 
alexandre_paterson

I'm using the following idiom to get stderr and stdout from a java
process (iterating first over stderr, and then over stdout), and am
running into a world of hurt.

It may or may not apply to you but I did something quite
different in several Real-World [TM] projects (including
a Java webapp run on a Linux system) and in a client-side
"OS X only Java app" (it's an app written in Java
that makes sense only on OS X).

I do a little bit less on the Java side of things and
a little bit more on the Un*x side.

I wrap my external process in a shell script that
redirects every output to file(s). I then poll
these files from my Java app to get back what I need.

I'm calling an external process from Java that produces
nothing on stdout or on stderr. That process is a
Bash shell script that redirects stdout/stderr to files
and that itself calls an AppleScript/Osascript.

I call that Osascript repeatedly, only sleeping a split
second between calls.

I did it with an osascript but you can wrap and redirect
I/O from pretty much anything in a shell script.

From Java I then parse the temporary files created by
my Bash shell script repeatedly, only sleeping 125 ms
between each poll.

System load on an old Mac Mini as reported by OS X's
activity monitor (on a non-stop running script): 0.3 %



Typically I've found that when I'm working
with external processes I can greatly ease my
pain (and live in a world of less hurt ;) by
adding a few lines of non-Java code.

My actual scripts are more complicated than that
but the basic idea is to redirect the streams
directly from a shell script and then to poll
the output files from Java.
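
A rough sketch of that basic idea, not the actual scripts: it assumes Java 8 or later and a POSIX `sh`, folds both streams into a single temporary file for brevity, and uses made-up names (`FilePoller`, `readNew`, `runAndPoll`). The 125 ms interval is the one mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;

class FilePoller {
    // Returns the bytes appended to 'file' since byte position 'offset'.
    static byte[] readNew(Path file, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long len = raf.length();
            if (len <= offset) {
                return new byte[0];
            }
            byte[] chunk = new byte[(int) (len - offset)];
            raf.seek(offset);
            raf.readFully(chunk);
            return chunk;
        }
    }

    // Runs 'script' under sh with stdout and stderr sent to one temp file,
    // polling that file every 125 ms until the process exits.
    static String runAndPoll(String script) throws IOException, InterruptedException {
        Path log = Files.createTempFile("child", ".log");
        try {
            // "$0" expands to the argument after the script text, i.e. the log path.
            Process proc = new ProcessBuilder(
                    "sh", "-c", "{ " + script + "\n} > \"$0\" 2>&1", log.toString()).start();
            ByteArrayOutputStream seen = new ByteArrayOutputStream();
            boolean exited = false;
            while (!exited) {
                exited = proc.waitFor(125, TimeUnit.MILLISECONDS); // doubles as the sleep
                byte[] chunk = readNew(log, seen.size());
                seen.write(chunk, 0, chunk.length);
            }
            return new String(seen.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(log);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(runAndPoll("echo working; sleep 1; echo done 1>&2"));
    }
}
```

Because the child writes to a file rather than a pipe, it never blocks on a full buffer, and the Java side can read at whatever pace it likes.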

This is Real-World [TM] tried and tested, it
Just Works [TM] and it incurs an insignificant
system load.

That technique has always worked fine for me, for
years ;)

It may work fine for you too!?



PS: I have to call this Osascript because I use
a functionality of OS X that is only accessible
through Osascript, there's no public API available
yet doing the same so JNI/JNA are a big no-no.
 
Daniel Pitts

Peter said:
I'm curious: how would you fix it?

You cannot buffer the output for either stdout or stderr indefinitely.
It would not even be practical or robust to buffer until a memory
allocation simply fails. So the question becomes, what to do when the
buffer becomes filled? You have two obvious choices: discard data, or
block output until room is made.

Do you see some other practical alternative?

The problem I have isn't with the blocking-on-buffer-fill behavior, but
with the fact that you *must* have two threads running to use this API
successfully.

I can think of three alternatives off the top of my head:

1. Don't use two InputStream instances, but instead use a new kind of
IO class designed to handle interleaved data. It would allow better
correlation between events in each "stream", and it would allow you to
read the streams in the current thread, without spawning a new one.

2. Offer some sort of "select()" based waiting for the streams. This
allows one thread to handle multiple streams.

3. 1 and 2 combined.

It would be important for the select() to also support "OutputStream"
readiness, because you could otherwise end up with a deadlock (the
process is supposed to receive more input, but the output buffer is
full, so the whole thing might block)


As far as those two choices go, I'm quite pleased that the design choice
made was to block output, rather than to discard data.

I suppose that now, with NIO (which came well after Process), the API
could provide a SelectableChannel implementation, allowing a single
thread to process more than one stream. But, the main motivation for
that NIO feature is to avoid the creation of thousands of threads when
you have that many streams to deal with; a process is only going to have
at most these two output streams, so all the work to implement a
SelectableChannel just to avoid the creation of one extra thread seems
like overkill to me.

Creating one thread is more than just run-time overhead. There is a
development cost with multi-threading. You are more prone to deadlocks,
synchronization problems, and much more, when you create a new Thread.
Yes, those problems can be avoided, but it's *much* more to think about.

I think providing SelectableChannels and/or an InterleavedInputStream
would provide the least error-prone API.
 
Tom Anderson

The problem I have isn't with the blocking-on-buffer-fill behavior, but with
the fact that you *must* have two threads running to use this API
successfully.

I can think of three alternatives off the top of my head:

1. Don't use two InputStream instances, but instead use a new kind of IO
class designed to handle interleaved data. It would allow better correlation
between events in each "stream", and it would allow you to read the streams
in the current thread, without spawning a new one.

Does available() work on the stdout from a child process? If so, i think
you could implement single-thread interleaved (or at least interleaveish)
IO on top of the current API.
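
A possible shape for that single-threaded reader, offered as a sketch under assumptions: it relies on `available()` reporting pending bytes for the child's pipes, which the `InputStream` contract describes only as an estimate, and it assumes Java 8 or later and a POSIX `sh` for the demo command. `AvailablePoller`, `pump`, `drain` and `run` are illustrative names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.TimeUnit;

class AvailablePoller {
    // Copies only the bytes that available() reports, so this should not block.
    static void pump(InputStream in, ByteArrayOutputStream sink, byte[] buf) throws IOException {
        int ready;
        while ((ready = in.available()) > 0) {
            int n = in.read(buf, 0, Math.min(ready, buf.length));
            if (n < 0) {
                return;
            }
            sink.write(buf, 0, n);
        }
    }

    // Blocking read to end of stream; used once the child has exited.
    static void drain(InputStream in, ByteArrayOutputStream sink, byte[] buf) throws IOException {
        int n;
        while ((n = in.read(buf)) != -1) {
            sink.write(buf, 0, n);
        }
    }

    // Returns {stdout, stderr} of 'cmd', read by the calling thread alone.
    static String[] run(String... cmd) throws IOException, InterruptedException {
        Process proc = new ProcessBuilder(cmd).start();
        InputStream out = proc.getInputStream();
        InputStream err = proc.getErrorStream();
        ByteArrayOutputStream outBuf = new ByteArrayOutputStream();
        ByteArrayOutputStream errBuf = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        // waitFor with a timeout acts as the sleep between polls.
        while (!proc.waitFor(50, TimeUnit.MILLISECONDS)) {
            pump(out, outBuf, buf);
            pump(err, errBuf, buf);
        }
        // The child is gone, so reading to end of stream picks up the remainder.
        drain(out, outBuf, buf);
        drain(err, errBuf, buf);
        // Decoded with the platform default charset.
        return new String[] {outBuf.toString(), errBuf.toString()};
    }

    public static void main(String[] args) throws Exception {
        String[] result = run("sh", "-c", "echo to-stdout; echo to-stderr 1>&2");
        System.out.print("stdout: " + result[0]);
        System.out.print("stderr: " + result[1]);
    }
}
```

The final blocking reads happen only after the child has exited, so they do not depend on `available()` to catch the last bytes written.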
2. Offer some sort of "select()" based waiting for the streams. This allows
one thread to handle multiple streams.

This would be really useful.
3. 1 and 2 combined.

What would also be useful would be a way to redirect the child's input or
output from or to a file (or /dev/null or its equivalent). You could then
run a process, feed it input, wait for it to finish, then read the output
file. Only one thread needed, and no IO weirdness.
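
For readers on newer JDKs: Java 7 added `ProcessBuilder.redirectOutput(File)` and `redirectError(File)`, which cover this case. A minimal sketch, assuming Java 8 or later (for `Files.readAllLines(Path)`) and a POSIX `sh` for the demo command; `FileRedirect` and `run` are made-up names.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

class FileRedirect {
    // Runs 'cmd' with stdout and stderr written straight to temp files,
    // then returns {stdout lines, stderr lines} read back from those files.
    static List<List<String>> run(String... cmd) throws IOException, InterruptedException {
        File out = File.createTempFile("child-out", ".txt");
        File err = File.createTempFile("child-err", ".txt");
        try {
            Process proc = new ProcessBuilder(cmd)
                    .redirectOutput(out)
                    .redirectError(err)
                    .start();
            proc.waitFor(); // no pipes to fill, so waiting first cannot deadlock
            return Arrays.asList(
                    Files.readAllLines(out.toPath()),
                    Files.readAllLines(err.toPath()));
        } finally {
            out.delete();
            err.delete();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> result = run("sh", "-c", "echo to-stdout; echo to-stderr 1>&2");
        System.out.println("stdout: " + result.get(0));
        System.out.println("stderr: " + result.get(1));
    }
}
```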
It would be important for the select() to also support "OutputStream"
readiness, because you could otherwise end up with a deadlock (the
process is supposed to receive more input, but the output buffer is
full, so the whole thing might block)


Creating one thread is more than just run-time overhead. There is a
development cost with multi-threading. You are more prone to deadlocks,
synchronization problems, and much more, when you create a new Thread.
Yes, those problems can be avoided, but it's *much* more to think about.

If all your thread is doing is pumping data into the child process, there
is nothing more to think about. The above paragraph looks mostly like FUD
to me - no offence, i just think this is a manifestation of the
superstitious fear of threads that is commonplace in the java world.

tom
 
Daniel Pitts

Tom said:
Does available() work on the stdout from a child process? If so, i think
you could implement single-thread interleaved (or at least
interleaveish) IO on top of the current API.


This would be really useful.


What would also be useful would be a way to redirect the child's input
or output from or to a file (or /dev/null or its equivalent). You could
then run a process, feed it input, wait for it to finish, then read the
output file. Only one thread needed, and no IO weirdness.

That could be additional functionality that might be useful, although it
can currently be achieved multiple ways. Though you'd still have to deal
with IO, which is the ultimate weirdness in CS theory :)

It would be important for the select() to also support "OutputStream"
readiness, because you could otherwise end up with a deadlock (the
process is supposed to receive more input, but the output buffer is
full, so the whole thing might block)
[snip]
Creating one thread is more than just run-time overhead. There is a
development cost with multi-threading. You are more prone to
deadlocks, synchronization problems, and much more, when you create a
new Thread. Yes, those problems can be avoided, but it's *much* more to
think about.

If all your thread is doing is pumping data into the child process,
there is nothing more to think about. The above paragraph looks mostly
like FUD to me - no offence, i just think this is a manifestation of the
superstitious fear of threads that is commonplace in the java world.

tom

Actually, in this case, the extra thread is pumping the data *out of*
the external process, into the current JVM. The implementation isn't
actually that hard to get correct. I might have given in to hyperbole,
but I have seen many incorrect attempts, and only a few correct ones.
 
Tom Anderson

That could be additional functionality that might be useful, although it
can currently be achieved multiple ways. Though you'd still have to deal
with IO, which is the ultimate weirdness in CS theory :)

Nah, it's just a monad.

Huh. Yeah, good point.

Actually, in this case, the extra thread is pumping the data *out of*
the external process, into the current JVM. The implementation isn't
actually that hard to get correct. I might have given in to hyperbole,
but I have seen many incorrect attempts, and only a few correct ones.

Fair enough. You could probably say that about a truly depressing number
of problems, though!

tom
 
Daniel Pitts

Tom said:
Fair enough. You could probably say that about a truly depressing number
of problems, though!

Definitely true. As someone who creates APIs frequently, I consider it
one of my jobs to make it easy to come up with a correct solution, and
difficult to shoot yourself in the foot.

In my opinion, the ProcessBuilder and Runtime.exec() APIs don't fulfill
that constraint. Actually, there are a lot of APIs in Java that don't.
Of course, they can't easily be fixed without breaking a lot of
existing programs.
 
Lew

Daniel said:
As someone who creates APIs frequently, I consider it one of
my jobs to make it easy to come up with a correct solution,
and difficult to shoot yourself in the foot.

I wonder if you would assess whether a thesis concords with your observations:

From conversations with programmers I discern certain primary mental
frameworks, or grounds of being, among them. In one dimension you have
object-oriented vs. functional programmers, static/stodgy vs. dynamic/careless
styles and that sort of dichotomy. In a deeper dimension, you have coders,
API writers and maintenance programmers. A dragon shifts shape among these as
gives him advantage.

A coder writes software to a specific situation. They tend to favor terse
languages and structures, abbreviated variable names, chameleon-like
variables. They tend to be very productive in terms of code volume, which is
good because code re-use is less of a priority.

An API writer is an autocrat. They lock down the types, behaviors,
visibility, extensibility and screw-it-upility pretty tightly. They tend to
be productive in terms of functionality with lower code volume, organized into
libraries both internal and external.

The maintenance programmer, when writing as an originator rather than
modifying existing code, favors short routines and long variable names with
lots and lots of comments. They tend to provide complete Javadocs as they
write code. A simple getter has four code lines, including the opening brace
on its own line, of course, six Javadocs lines and a blank line. They tend to
be productive in terms of not requiring other people to spend a lot of time
fixing things.

The coder bitches at the API's inflexibility and the maintenance programmer's
verbosity. The API writer bitches at the coder's profligacy but merely shakes
their head over the Javadocs. The maintenance programmer doesn't bitch at
anybody, just tweaks their code.
 
