URLConnection weirdness

martijn

Hi,

When I download files over a URLConnection, I'm able to monitor
progress by counting the number of bytes read from the stream (see code
below). But when I download a .class file, I can't; it seems to
block. What's going on here?

import java.io.*;
import java.net.*;

URL url = new URL("http://localhost/test/test.class");
URLConnection c = url.openConnection();
c.setUseCaches(true);
// getContentLength() returns -1 when the server sends no Content-Length header
int contentLength = c.getContentLength();
InputStream f = c.getInputStream();
BufferedInputStream bis = new BufferedInputStream(f);
// Collect into a growable buffer rather than an array sized from contentLength
ByteArrayOutputStream out = new ByteArrayOutputStream(Math.max(contentLength, 32));
int realLength = 0, bytesRead;
// Fixed chunk size: available() may return 0, and read() with length 0 never reaches -1
byte[] tmpBuff = new byte[5120];

System.out.println("Downloading...");

while ((bytesRead = bis.read(tmpBuff, 0, tmpBuff.length)) != -1)
{
    out.write(tmpBuff, 0, bytesRead);
    realLength += bytesRead;
    System.out.println("read " + bytesRead + " bytes of data...");
}
bis.close();
byte[] byteData = out.toByteArray();
 
chris_k

Hi,

I've just run a test downloading .class (and other) files using your
code. It works fine.
Do you get any exceptions (timeout or anything else)?

chris
 
martijn

Well, there's indeed nothing wrong with the downloading itself. But if
you look at the console, compare downloading a file with an arbitrary
extension like .test against a file with a .class extension.

You'll notice that for the .test extension it outputs the progress
nicely to the console: 10 bytes read, 100 bytes read, 200 bytes read,
etc. But when I download a file with a .class extension it outputs 0
bytes read and then is suddenly finished downloading. It seems the
stream is blocking me from reading the progress it makes, only for
class files. Does it have something to do with the plugin caching this
class file in its plugin cache? Are there any workarounds for that?
 
chris_k

martijn said:
But when I download a file with a .class extension it outputs 0
bytes read and then is suddenly finished downloading. It seems the
stream is blocking me from reading the progress it makes, only for
class files. Does it have something to do with the plugin caching this
class file in its plugin cache? Are there any workarounds for that?

I really have no idea. The only thing I can suggest is to disable
plugin caching and retry, but you've probably already done that.
Anyway, the code should call c.setUseCaches(false) rather than
c.setUseCaches(true).
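
For reference, a minimal sketch of the same read loop with caching switched off and a fixed-size buffer. The names UncachedDownload and readAll are made up for illustration, and whether the plugin honours setUseCaches(false) for .class files is an assumption that would need checking in the actual environment.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

class UncachedDownload {
    // Reads the whole resource, printing one line per chunk received.
    static byte[] readAll(URL url, int chunkSize) throws IOException {
        URLConnection c = url.openConnection();
        c.setUseCaches(false); // ask the connection not to use a cache
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[chunkSize];
        int bytesRead;
        try (InputStream in = c.getInputStream()) {
            while ((bytesRead = in.read(chunk, 0, chunk.length)) != -1) {
                out.write(chunk, 0, bytesRead);
                System.out.println("read " + bytesRead + " bytes, total " + out.size());
            }
        }
        return out.toByteArray();
    }
}
```

It could be called as UncachedDownload.readAll(new URL("http://localhost/test/test.class"), 1024); a smaller chunk size gives more progress lines.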
 
Ingo R. Homann

Hi,
martijn said:
Well, there's indeed nothing wrong with the downloading itself. But if
you look at the console, compare downloading a file with an arbitrary
extension like .test against a file with a .class extension.

You'll notice that for the .test extension it outputs the progress
nicely to the console: 10 bytes read, 100 bytes read, 200 bytes read,
etc. But when I download a file with a .class extension it outputs 0
bytes read and then is suddenly finished downloading. It seems the
stream is blocking me from reading the progress it makes, only for
class files. Does it have something to do with the plugin caching this
class file in its plugin cache? Are there any workarounds for that?

Is it really a problem with the .class extension? Do the two files have
the same size? What happens if you swap only the extensions of the two
files?

Ciao,
Ingo
 
Roedy Green

martijn said:
You'll notice that for the .test extension it outputs the progress
nicely to the console: 10 bytes read, 100 bytes read, 200 bytes read,
etc. But when I download a file with a .class extension it outputs 0
bytes read and then is suddenly finished downloading. It seems the
stream is blocking me from reading the progress it makes, only for
class files. Does it have something to do with the plugin caching this
class file in its plugin cache? Are there any workarounds for that?

Try downloading from a slower site.

Keep in mind that class files are usually very short. If you want to
see multiple load messages use a smaller buffer.
 
martijn

I tried. What I did was this:
- Downloaded a 1 MB .zip file; this showed the progress on the
console like it should.
- Downloaded the same zip renamed to .class; this did not display
any download progress on the console until it was fully downloaded.

Note that the file with the .class extension was cached in the plugin
cache under the directory "files"; this was not the case with the .zip
extension.

When I call setUseCaches(false) it does display progress for the .class
file, but as a result the file is no longer cached. It looks like the JVM
hijacks the input stream when setUseCaches is enabled for a .class file??!
 
Chris Uppal

martijn said:
You'll notice that for the .test extension it outputs the progress
nicely to the console: 10 bytes read, 100 bytes read, 200 bytes read,
etc. But when I download a file with a .class extension it outputs 0
bytes read and then is suddenly finished downloading.

This is a guess, but on the face of it that's not surprising. If you are
downloading stuff from the network, then read() will return chunks of data as
it is supplied by the network/OS, so it'll arrive at your application in pieces
(of unpredictable sizes). If you are reading from a file, then the OS will
quite probably (but not necessarily) choose to fill the buffer you asked for in
one go. If the download is cached, then the implementation pretty much has to
be that it first downloads to a file, and then hands you a stream which is
reading that file. The end result is that you get no progress reports while
the download to cache is happening, and then get all the data in one fell swoop
at the end. Unless there's a progress-reporting API built into URLConnection
(there might be, but I haven't looked) the only way you can get the effect is
by turning off caching.

Speaking as a user, I'd rather not have to wait for unnecessarily repeated
downloads, even if there /is/ progress notification...

-- chris
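
To illustrate the difference described above, here is a small sketch that needs no network. TrickleInputStream is a hypothetical stand-in for a network stream that hands over at most a fixed number of bytes per read(); a plain in-memory stream plays the role of the already-cached file. Neither is what the plugin actually uses; the point is only how many times the read loop gets to report progress.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ChunkDemo {
    // Stand-in for a network stream: never returns more than maxPerRead bytes at once.
    static class TrickleInputStream extends FilterInputStream {
        private final int maxPerRead;

        TrickleInputStream(InputStream in, int maxPerRead) {
            super(in);
            this.maxPerRead = maxPerRead;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, maxPerRead));
        }
    }

    // Counts how many read() calls returned data before end of stream.
    static int countReads(InputStream in, int bufferSize) throws IOException {
        byte[] buf = new byte[bufferSize];
        int reads = 0;
        while (in.read(buf, 0, buf.length) != -1) {
            reads++;
        }
        return reads;
    }
}
```

With 1000 bytes and a 5120-byte buffer, the in-memory stream satisfies the loop in a single read, while a TrickleInputStream limited to 100 bytes per read gives the loop ten chances to print progress.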
 
Ingo R. Homann

Hi,
martijn said:
Note that the file with the .class extension was cached in the plugin
cache under the directory "files"; this was not the case with the .zip
extension.

Fine, so you've found the problem! Now the question is how to disable
the cache. Since you don't say whether it is a Java cache, an OS cache,
or something else, I cannot help you... sorry!

Ciao,
Ingo
 
martijn

That indeed sounds like a likely scenario for what's happening. So
the question is how I can monitor this stream to the cache file. I've
looked in the Javadocs, but as far as the URLConnection class is
concerned, I'm only allowed to get a plain InputStream from it.

Any idea when this downloading to file happens? When I call
urlconnection.getInputStream(), or when I try to read from that
InputStream?

I tried before to wrap the InputStream from the URLConnection in
a ProgressMonitorInputStream, but it didn't report any progress
either. It could just as well have been me misusing it, though. I'll
try that again, but maybe someone who knows more about it can comment on
wrapping in a ProgressMonitorInputStream?
(http://java.sun.com/j2se/1.5.0/docs/api/javax/swing/ProgressMonitorInputStream.html)
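
As a non-GUI illustration of what such a wrapper does, here is a minimal sketch of a byte-counting stream. CountingInputStream is a hypothetical class written for this example, not part of the JDK, and it does not count bytes passed over with skip(). Like any wrapper around the stream returned by getInputStream(), it can only see bytes at the moment they are handed to the application, so it would presumably show the same "nothing, then everything" pattern if the data is first written to a cache.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class CountingInputStream extends FilterInputStream {
    private long count;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++; // one byte delivered
        }
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        if (n > 0) {
            count += n; // n bytes delivered in this call
        }
        return n;
    }

    // Total number of bytes handed to the caller so far.
    long getCount() {
        return count;
    }
}
```

Usage would be along the lines of wrapping c.getInputStream() in a CountingInputStream and printing getCount() after each read.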
 
Chris Uppal

martijn said:
That indeed sounds like a likely scenario for what's happening... So
the question is how I can monitor this stream to cache file...

I'm not particularly knowledgeable about URLConnection and its friends, but I
suspect it'll be very hard to do if there isn't already a defined API for
tracking this progress.

The entire URL/protocol mechanism in java.net.* is pluggable, so in theory you
should be able to replace the HTTP URL handling with a version of your own
which allowed you to "see" more of what is going on. Presumably your handler
would contain an instance of the "normal" handler and let that do most of the
work (HTTP is somewhat less than trivial). The biggest problem would be that
HTTP has the ability to multiplex more than one request on the same network
connection, so it might be very difficult to monitor the progress of just one
request.

OTOH, you could use a different implementation of HTTP altogether, such as the
HTTP client in the Jakarta Commons library. The problem with /that/ is that it
wouldn't be using the cache when possible (which is presumably what you want to
do). Indeed, you could get much the same effect by turning caching off for the
download. I suppose you /could/ turn off the system caching and do it all
yourself, but that sounds like quite a lot of work too.

All in all, and unless someone else can suggest something that I don't know
about (very possible), I rather think this is just one of those things that are
Too Hard(tm) to be worthwhile.

martijn said:
Any idea when this downloading to file happens? When I query for
urlconnection.getInputStream() or when I try to read from that
inputstream?

I'd /guess/ it happens when you first ask for the input stream. If you want
more information (if only for background) then I'd find out what the actual
concrete subclass of URLConnection Java is giving you, and then see if you can
find that in the source. It won't be a "public" java.net.* class, but one of
Sun's "private" implementation classes.

-- chris
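
Finding out which concrete class you are given takes one line. A minimal sketch follows; the ConnectionClass wrapper and its method name are made up for illustration. The name it prints depends on the JVM and on whether the code runs inside the plugin, so no particular output is claimed here.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

class ConnectionClass {
    // Returns the runtime class name of the connection object for the given URL.
    static String implementationOf(URL url) throws IOException {
        // openConnection() only creates the object; nothing is fetched yet
        URLConnection c = url.openConnection();
        return c.getClass().getName();
    }
}
```

For example: System.out.println(ConnectionClass.implementationOf(new URL("http://localhost/test/test.class")));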
 
martijn

Using a different implementation of HTTP would indeed ruin the caching
feature, which is nice to have :)

You mention it might be possible to override Sun's implementation. I've
looked at the definition of the (abstract) URLConnection class, but
it's no use since it doesn't really contain an implementation. But this
was the public java.net class; where could I find the private ones you
mentioned? Are these the actual C++ source for Sun's JVM?
 
Chris Uppal

martijn said:
You mention it might be possible to override Sun's implementation. I've
looked at the definition of the (abstract) URLConnection class, but
it's no use since it doesn't really contain an implementation. But this
was the public java.net class; where could I find the private ones you
mentioned? Are these the actual C++ source for Sun's JVM?

Ah yes. Sorry. The source that Sun provide with the SDK does provide some of
the "private" classes' source, but not all -- I didn't think to check whether
the real protocol handlers are included in that distribution.

You can get the full source from Sun's website. Be cautious, it is available
under two licenses. One of them is absolutely abominable (I wouldn't accept it
even if I worked for Sun), the other /may/ be acceptable to you. The source
includes the C++ source for the JVM itself (interesting, but not relevant
here), the C++ source for the "native" methods (e.g. for much of AWT, but again
not relevant here), and the Java source for the "private" classes -- which is
what you might want to look at.

As for replacing Sun's protocol handlers; I used to know how to install a
protocol handler, but I've since forgotten... Googling for "protocol handler"
and "java" should turn up information for you as quickly as it would for me, so
I'll leave you to look for it yourself ;-)

-- chris
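
For what it's worth, one way to plug in a handler for a single URL, without touching the global configuration, is the URL constructor that accepts a URLStreamHandler. The sketch below is only an illustration of that idea: MonitoringHandler and bytesSeen are hypothetical names, only a few URLConnection methods are forwarded to the wrapped connection, and newer JDKs mark the URL constructors as deprecated. It delegates to the default handler and counts bytes as they are handed over, so if the default handler fills a cache first, the counter would still only move once the cached data is delivered.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.util.concurrent.atomic.AtomicLong;

class MonitoringHandler extends URLStreamHandler {
    // Bytes delivered so far through streams opened via this handler.
    final AtomicLong bytesSeen = new AtomicLong();

    @Override
    protected URLConnection openConnection(URL u) throws IOException {
        // Re-parse the URL so the JVM's default handler does the real work.
        final URLConnection inner = new URL(u.toExternalForm()).openConnection();
        return new URLConnection(u) {
            @Override
            public void connect() throws IOException {
                inner.connect();
            }

            @Override
            public void setUseCaches(boolean useCaches) {
                inner.setUseCaches(useCaches);
            }

            @Override
            public int getContentLength() {
                return inner.getContentLength();
            }

            @Override
            public InputStream getInputStream() throws IOException {
                return new FilterInputStream(inner.getInputStream()) {
                    @Override
                    public int read() throws IOException {
                        int b = super.read();
                        if (b != -1) bytesSeen.incrementAndGet();
                        return b;
                    }

                    @Override
                    public int read(byte[] b, int off, int len) throws IOException {
                        int n = super.read(b, off, len);
                        if (n > 0) bytesSeen.addAndGet(n);
                        return n;
                    }
                };
            }
        };
    }
}
```

A URL using it would be created with new URL(null, "http://localhost/test/test.class", handler), after which handler.bytesSeen can be polled while reading.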
 
martijn

Ok, thanks a million!

I'll look into these private classes, if I ever manage to get something
working I'll share it with this group. I hope it's possible to reuse
and extend the protocol handler for http/caching...
 
