Downloading a file in Linux

Grzesiek

Hi,

I use the following function to download a jar file from my website:

public synchronized boolean copyFileFromWeb() {
    try {
        URL url = new URL(sourceURL);
        URLConnection urlC = url.openConnection();
        InputStream is = url.openStream();
        System.out.print("Copying resource (type: " + urlC.getContentType());
        Date date = new Date(urlC.getLastModified());
        System.out.flush();
        FileOutputStream fos = null;
        fos = new FileOutputStream(destinationPath);
        int oneChar, count = 0;
        while ((oneChar = is.read()) != -1) {
            fos.write(oneChar);
            count++;
        }
        is.close();
        fos.close();
        System.out.println(count + " byte(s) copied");
        return true;
    } catch (Exception e) {
        System.err.println(e.toString());
    }
    return false;
}


In Windows XP it works perfectly, but in Linux it is very slow and
the downloaded file is corrupted! What is wrong?
 
Grzesiek

Hi,

[earlier post with copyFileFromWeb() quoted in full]

I wonder whether an HTTP proxy server is involved. I know that
some people put something like this in their code:

System.setProperty("http.proxyHost", "xyz.com");
System.setProperty("http.proxyPort", "8080");

Is that the case here?

I found a link about downloading a file in Linux:

http://linux.sys-con.com/read/39248.htm
 
Daniel Pitts

Grzesiek said:
I use the following function to download a jar file from my website:
[function copyFileFromWeb() quoted above]
In Windows XP it works perfectly, but in Linux it is very slow and
the downloaded file is corrupted! What is wrong?

It shouldn't be any different on Linux, unless there is something else
fundamentally different about your setup.

Is it the same machine, dual booted into one OS or the other? Is it
two similar machines on the same network subnet? Is it two very
different machines, or on different networks? There are a lot of
possibilities here.

One thing I would suggest, regardless of your machines, is that you
read into a byte[] (at least 1024 bytes, if not larger, probably
between 16k and 256k) instead of one byte at a time. It is extremely
inefficient to read/write one byte at a time.
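Daniel's buffered-read advice can be sketched as a small copy helper. The 16 kB buffer size and the in-memory demo streams are illustrative choices, not from the thread:

```java
import java.io.*;

public class StreamCopy {
    // Copy everything from in to out through a reusable buffer.
    // 16 kB sits in the 16k-256k range Daniel suggests.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[16 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n); // write only the n bytes actually read
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Demo with in-memory streams so the sketch is self-contained.
        byte[] data = new byte[100_000];
        new java.util.Random(42).nextBytes(data);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println(copied == data.length
                && java.util.Arrays.equals(data, sink.toByteArray()));
    }
}
```

The same helper works unchanged whether `in` comes from a URLConnection or a file.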

BTW, instead of System.setProperty, you can use
-Dhttp.proxyHost=xyz.com on the command line before the class name when
you execute your program. But I really don't think it's the proxy; I
think it's the byte-at-a-time reading.
 
Grzesiek

Daniel Pitts said:
Is it the same machine, dual booted into one OS or the other? Is it
two similar machines on the same network subnet? Is it two very
different machines, or on different networks? There are a lot of
possibilities here.

One thing I would suggest, regardless of your machines, is that you
read into a byte[] (at least 1024 bytes, if not larger, probably
between 16k and 256k) instead of one byte at a time. It is extremely
inefficient to read/write one byte at a time.

Hi Daniel,

It is two similar machines on the same local network. But I tried to
run the program on a completely different machine in another network
and it didn't work either. The program didn't work on Windows 2000
either. But I'm not sure whether the copyFromWeb() function was the
only problem that time.

I read one byte at a time because I download a JAR FILE, not an image.
No corrupted bytes are allowed here. In fact I tried reading into
byte[1024] and byte[4096], but then the downloaded file is 140 kB or
160 kB instead of 116 kB, which is the size of the file I want to
download. The too-large file is corrupted and cannot be run.
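As a guess (the buffered attempt isn't shown in the thread), a downloaded file that comes out *larger* than the original usually points to a loop that writes the whole buffer on every iteration instead of only the bytes the last read() returned. A self-contained illustration:

```java
import java.io.*;

public class BufferBugDemo {
    public static void main(String[] args) throws IOException {
        // 2500 bytes of input: not a multiple of the 1024-byte buffer,
        // just as a 116 kB jar is not a multiple of byte[1024].
        byte[] data = new byte[2500];

        // Buggy copy: ignores how many bytes read() returned and
        // writes the whole buffer every time.
        ByteArrayOutputStream buggy = new ByteArrayOutputStream();
        InputStream in = new ByteArrayInputStream(data);
        byte[] buf = new byte[1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            buggy.write(buf);             // wrong: writes buf.length bytes
        }

        // Correct copy: writes only the n bytes actually read.
        ByteArrayOutputStream good = new ByteArrayOutputStream();
        in = new ByteArrayInputStream(data);
        while ((n = in.read(buf)) != -1) {
            good.write(buf, 0, n);
        }

        System.out.println(buggy.size()); // oversized: 3 * 1024
        System.out.println(good.size());  // exact input size
    }
}
```

The buggy variant pads the output with stale buffer contents, which would corrupt a jar in exactly the "too large to run" way described above.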

According to the link:

http://linux.sys-con.com/read/39248.htm

I changed the function like this:

public synchronized boolean copyFileFromWeb2() {
    try {
        URL url = new URL(newConfigProgURL);
        URLConnection urlC = url.openConnection();
        InputStream is = url.openStream();
        System.out.print("Copying resource (type: " + urlC.getContentType());
        Date date = new Date(urlC.getLastModified());
        System.out.flush();
        FileOutputStream fos = null;
        fos = new FileOutputStream(tempConfigProgPath);
        DataOutputStream out = new DataOutputStream(fos);
        DataInputStream in = new DataInputStream(urlC.getInputStream());

        int oneChar, count = 0;
        while ((oneChar = in.read()) != -1) {
            fos.write(oneChar);
            count++;
        }
        is.close();
        fos.close();
        System.out.println(count + " byte(s) copied");
        return true;
    } catch (Exception e) {
        System.err.println(e);
    }
    return false;
}

Now it works! Still, downloading the 116 kB jar file in Linux takes
about 30 seconds, while in Windows XP it takes maybe 1 second.
 
Thomas Hawtin

Grzesiek said:
Hi,

I use the following function to download a jar file from my website:

public synchronized boolean copyFileFromWeb() {
    try {
        URL url = new URL(sourceURL);
        URLConnection urlC = url.openConnection();
        InputStream is = url.openStream();
        System.out.print("Copying resource (type: " + urlC.getContentType());
        Date date = new Date(urlC.getLastModified());
        System.out.flush();
        FileOutputStream fos = null;
        fos = new FileOutputStream(destinationPath);

Why assign null and then assign the real value in the very next statement?

        int oneChar, count = 0;
        while ((oneChar = is.read()) != -1) {

Copying one character at a time is liable to be relatively slow. At least
copy through a byte array.

            fos.write(oneChar);
            count++;
        }
        is.close();
        fos.close();

These should each be in the finally block of a try-finally.

        System.out.println(count + " byte(s) copied");
        return true;
    } catch (Exception e) {
        System.err.println(e.toString());
    }

It's not a great idea to catch Exception rather than the actual
exception type you wish to catch.

    return false;
}


In Windows XP it works perfectly, but in Linux it is very slow and
the downloaded file is corrupted! What is wrong?

When you say slowly, is it the first byte which is slow, or each
subsequent byte? If it is only up to the first byte, then an obvious
suspect is DNS misconfiguration (which happens more often on Windows).

When you say the file is corrupt, what do you actually get? Truncated?
Complete rubbish? Some bytes wrong? Something else?

You might want to try nc to see what the web server is actually doing.

Tom Hawtin
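Taken together, Thomas's suggestions (copy through a byte array, close the streams in a finally block, catch a narrower exception type) might look like the sketch below. The sourceUrl and destinationPath parameters stand in for the poster's fields, and the main method demonstrates the copy with a local file: URL so the example runs without a web server:

```java
import java.io.*;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.*;

public class CopyFromWeb {
    // Reworked copyFileFromWeb() along Thomas's lines: buffered copy,
    // streams closed in finally, IOException caught instead of Exception.
    static boolean copyFileFromWeb(String sourceUrl, String destinationPath) {
        InputStream is = null;
        FileOutputStream fos = null;
        try {
            URLConnection conn = new URL(sourceUrl).openConnection();
            is = conn.getInputStream();
            fos = new FileOutputStream(destinationPath);
            byte[] buf = new byte[16 * 1024];
            long count = 0;
            int n;
            while ((n = is.read(buf)) != -1) {
                fos.write(buf, 0, n);
                count += n;
            }
            System.out.println(count + " byte(s) copied");
            return true;
        } catch (IOException e) {
            System.err.println(e);
            return false;
        } finally {
            // Close both streams even if the copy fails part-way.
            try { if (is != null) is.close(); } catch (IOException ignored) {}
            try { if (fos != null) fos.close(); } catch (IOException ignored) {}
        }
    }

    public static void main(String[] args) throws Exception {
        // Self-contained demo: copy a 5-byte temp file via a file: URL.
        Path src = Files.createTempFile("src", ".jar");
        Files.write(src, new byte[] {1, 2, 3, 4, 5});
        Path dst = Files.createTempFile("dst", ".jar");
        boolean ok = copyFileFromWeb(src.toUri().toURL().toString(),
                                     dst.toString());
        System.out.println(ok && Files.size(dst) == 5);
    }
}
```

On Java 7+ the finally block could be replaced by try-with-resources, which closes the streams automatically.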
 
Grzesiek

Hi Tom,

Thomas Hawtin said:
When you say the file is corrupt, what do you actually get? Truncated?
Complete rubbish? Some bytes wrong? Something else?

When I used my first copyFromWeb() function and one-byte-at-a-time
reading, I got a truncated file. It was 73 kB instead of 116 kB and the
error was "Socket Error: connection reset", so I don't think it's a DNS
error. Each time the connection was reset at 73 kB!

But I updated copyFromWeb() and I wonder why it works now. And I still
wonder why it is by far slower on Linux than on Windows XP.
 
Arne Vajhøj

Grzesiek said:
I read one byte at a time because I download a JAR FILE, not an image.
No corrupted bytes are allowed here. In fact I tried reading into
byte[1024] and byte[4096], but then the downloaded file is 140 kB or
160 kB instead of 116 kB, which is the size of the file I want to
download. The too-large file is corrupted and cannot be run.

You can get any file by reading with large buffers - it only
affects performance, not functionality.

Code snippet:

URL url = new URL(urlstr);
HttpURLConnection con = (HttpURLConnection) url.openConnection();
con.connect();
if (con.getResponseCode() == HttpURLConnection.HTTP_OK) {
    InputStream is = con.getInputStream();
    OutputStream os = new FileOutputStream(fnm);
    byte[] b = new byte[100000];
    int n;
    while ((n = is.read(b)) >= 0) {
        os.write(b, 0, n);
    }
    os.close();
    is.close();
}
con.disconnect();

Arne
 
Grzesiek

Arne Vajhøj said:
You can get any file by reading with large buffers - it only
affects performance, not functionality.
[code snippet quoted above]

Thanks Arne,

I used your snippet and now my function works fine :) There is no
difference between Linux and Windows XP now. So reading one byte at a
time was the problem.

Thanks all :)
 
Lothar Kimmeringer

Grzesiek said:
When I used my first copyFromWeb() function and one-byte-at-a-time
reading, I got a truncated file. It was 73 kB instead of 116 kB and the
error was "Socket Error: connection reset", so I don't think it's a DNS
error. Each time the connection was reset at 73 kB!

Connection reset means that the connection has been closed by
the peer (or an intermediate proxy). Most operating systems, and
also the underlying framework you're using (HttpURLConnection
in your case), do some buffering as well, so what happens here is
that the connection reads 73 kB into a buffer that you then
read byte by byte.

It seems that the unbuffered write to the filesystem takes
much longer on Linux than on Windows. Writing to a filesystem is
OS-dependent and - in the case of Linux - also depends on the type
of filesystem. If it's network-based (NFS, SMB, ...),
writing one single byte (incl. sync etc.) might take some time.

Because it takes so long, and you're actually reading from a local
buffer rather than the network connection, the server gets bored
and closes the connection due to a timeout being reached. The
moment your internal buffer is empty and the connection tries
to fetch the next batch of data, it runs into a wall
(connection reset).
But I updated copyFromWeb() and I wonder why it works now. And I still
wonder why it is by far slower on Linux than on Windows XP.

Writing and reading blockwise (with an array) is much more efficient
than single bytes (which can be regarded as blocks of length 1),
because that's the way file and network operations are designed.
Because the internal buffer of the connection now empties much
faster, the timeout on the server side doesn't happen, and
therefore you receive the whole bunch of data.


Regards, Lothar
--
Lothar Kimmeringer E-Mail: (e-mail address removed)
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)

Always remember: The answer is forty-two, there can only be wrong
questions!
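Lothar's timeout scenario plays out on the server side, but the client can at least fail fast with a clear exception instead of hanging on a stalled transfer. A minimal sketch (the URL is hypothetical, and openConnection() performs no network I/O, so nothing is actually contacted here):

```java
import java.net.URL;
import java.net.URLConnection;

public class TimeoutSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical URL; openConnection() does not open a socket,
        // so this runs without reaching the server.
        URLConnection conn =
                new URL("http://example.com/file.jar").openConnection();
        conn.setConnectTimeout(10_000); // give up if no TCP connection in 10 s
        conn.setReadTimeout(30_000);    // give up if a single read stalls 30 s
        System.out.println(conn.getReadTimeout());
    }
}
```

With a read timeout set, a stalled connection throws SocketTimeoutException instead of surfacing later as a vague "connection reset".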
 
Daniel Pitts

Grzesiek said:
I used your snippet and now my function works fine :) There is no
difference between Linux and Windows XP now. So reading one byte at a
time was the problem.

Thanks all :)

Glad that worked for you. Something else I forgot to mention: reading
one character at a time is VERY different from reading one byte at a
time. Java does some conversions, which would explain your corrupt
data. Unlike C/C++, a char is 2 bytes, and chars are usually
encoded/decoded when written to or read from streams, so you end up
with unexpected values if you're trying to read non-character data.
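Daniel's byte-versus-char point can be demonstrated directly: round-tripping arbitrary binary data through byte streams is lossless, while pushing the same bytes through a Reader/Writer pair silently replaces invalid sequences. A sketch (UTF-8 is just the charset chosen for illustration):

```java
import java.io.*;

public class CharVsByte {
    public static void main(String[] args) throws IOException {
        // Bytes 0x80-0xFF are where charset decoding mangles binary data.
        byte[] data = new byte[256];
        for (int i = 0; i < 256; i++) data[i] = (byte) i;

        // Byte-oriented round trip: lossless.
        ByteArrayOutputStream byteCopy = new ByteArrayOutputStream();
        InputStream in = new ByteArrayInputStream(data);
        int b;
        while ((b = in.read()) != -1) byteCopy.write(b);
        System.out.println(java.util.Arrays.equals(data, byteCopy.toByteArray()));

        // Char-oriented round trip through UTF-8: malformed sequences are
        // replaced with U+FFFD, so the output no longer matches the input.
        Reader reader =
                new InputStreamReader(new ByteArrayInputStream(data), "UTF-8");
        ByteArrayOutputStream charCopy = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(charCopy, "UTF-8");
        int c;
        while ((c = reader.read()) != -1) writer.write(c);
        writer.flush();
        System.out.println(java.util.Arrays.equals(data, charCopy.toByteArray()));
    }
}
```

This is why a jar (binary data) must always travel through InputStream/OutputStream, never Reader/Writer.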
 
