The simplest way to download a file from an HTTP resource that needs authentication

Andrea Francia

I need to write a program that downloads many files from different web
sites. The web sites require basic authentication, and I want to spawn
many threads, each one downloading a list of files that need different
authentication credentials.

Does anyone know the simplest way to achieve this? I found many HTTP
libraries, but they all seem very complex.

I studied how to use the standard Java library to achieve this, but it
seems that it is not feasible with it.

For example, the following code downloads a file that does not require
authentication.

URL url = new URL("http://www.example.org/file.txt");
URLConnection con = url.openConnection();

BufferedInputStream in;
in = new BufferedInputStream(con.getInputStream());
OutputStream out = new FileOutputStream("C:\\file.txt");

int i = 0;
byte[] bytesIn = new byte[8096];
while ((i = in.read(bytesIn)) >= 0) {
    out.write(bytesIn, 0, i);
}
out.close();
in.close();

But problems arise when you try to download a file that needs
authentication in a threaded environment.
To provide the authentication credentials you should use the
Authenticator.setDefault() method which is a static method and therefore
not usable in a threaded environment.

I also tried embedding the username and password in the URL but these
were ignored.
url = new URL("http://username:password@www.example.org/file.txt");

Thanks
 
Lew

Andrea said:
Authenticator.setDefault() method which is a static method and therefore
not usable in a threaded environment.

Static methods can be used in a multi-threaded program.
 
Andrea Francia

Lew said:
Static methods can be used in a multi-threaded program.
Nooo, really?

There is a race condition. Here is an example:

We have two threads, t1 and t2, that execute the following code:

void download(URL url, final String username, final String password)
        throws IOException
{
    // Replaces the JVM-wide default authenticator; this is the shared state.
    Authenticator.setDefault(new Authenticator() {
        protected PasswordAuthentication getPasswordAuthentication() {
            return new PasswordAuthentication(username,
                    password.toCharArray());
        }
    });
    URLConnection con = url.openConnection();
    BufferedInputStream in;
    in = new BufferedInputStream(con.getInputStream());
    OutputStream out = new FileOutputStream("C:\\file.txt");

    int i = 0;
    byte[] bytesIn = new byte[8096];
    while ((i = in.read(bytesIn)) >= 0) {
        out.write(bytesIn, 0, i);
    }
    out.close();
    in.close();
}

Each thread should download a different URL using a different username and
password.
t1 uses "http://example.org/foo" as the URL and "foo","foo" as username,
password.

t2 uses "http://example2.org/bar" as the URL and "bar","bar" as username,
password.

Suppose this course of events:
t1 starts
t1 calls Authenticator.setDefault() using "foo","foo" as
username,password.
t1 is suspended by the scheduler
t2 starts
t2 calls Authenticator.setDefault() using "bar","bar" as
username,password.
t2 calls openConnection(), which will use the correct username and
password ("bar","bar")
t2 downloads the file.
t2 terminates.
t1 is resumed by the scheduler
t1 calls openConnection(), but the username and password were changed
from the correct values ("foo","foo") to the values used by t2
("bar","bar"). The openConnection fails.

The correctness of the program depends on the scheduling, hence there
is a race condition. The race condition stems from the fact that
Authenticator.setDefault() is static and hence the same data is shared
by all threads.
 
Lew

Andrea said:
Nooo, really?

Yes, really.
There is a race condition. ....
The correctness of the program depends on the scheduling, hence there
is a race condition. The race condition stems from the fact that
Authenticator.setDefault() is static and hence the same data is shared
by all threads.

So?

Non-static methods can have race conditions, too. Deadlocks, even. There's
no difference from static methods in that regard. Why do you single out
static methods?

Fortunately Java has a number of lovely built-in constructs to keep threads
synchronized, starting with the keyword "synchronized", which are notably
absent from the example you posted.

Of course if you don't synchronize your threads, there will be trouble, static
or non-static methods notwithstanding. If you think using instance methods
without synchronization will solve your threading problems, you're doomed.

For an introduction to the topic, read
<http://java.sun.com/docs/books/tutorial/essential/concurrency/index.html>
 
Andreas Leitgeb

Lew, sometimes I really wonder if you aren't actually trolling.

Lew said:
Non-static methods can have race conditions, too. Deadlocks, even. There's
no difference from static methods in that regard. Why do you single out
static methods?

It's perhaps not so much the static methods, but rather the static data
that gets set by the former, and which is supposed to be specific
to each thread. Having to synchronize the whole "set user-data and
fetch file"-block almost voids the whole point of parallelizing the task.

Perhaps it suffices to synchronize setting the user and opening the
connection, and leave the actual transfer unsynchronized, but I don't
feel very comfortable that way.
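
To make the narrower locking idea concrete, here is a minimal sketch. The class name SerializedAuthOpen, the AUTH_LOCK object, and the open() helper are hypothetical names introduced for illustration; they are not from the thread. One detail matters: openConnection() only creates the connection object, and for HTTP the request and any authentication challenge are handled when the response is first read, so the sketch assumes that getInputStream() has to sit inside the lock as well. Only the byte copying then runs unsynchronized.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.net.URL;
import java.net.URLConnection;

class SerializedAuthOpen {
    // Hypothetical lock guarding every use of the JVM-wide default Authenticator.
    private static final Object AUTH_LOCK = new Object();

    /** Opens the response stream under the lock; the caller copies it outside the lock. */
    static InputStream open(URL url, final String username, final String password)
            throws IOException {
        synchronized (AUTH_LOCK) {
            Authenticator.setDefault(new Authenticator() {
                @Override
                protected PasswordAuthentication getPasswordAuthentication() {
                    return new PasswordAuthentication(username, password.toCharArray());
                }
            });
            URLConnection con = url.openConnection();
            // Assumption: the authentication exchange completes inside this call,
            // so the default Authenticator may be replaced once it returns.
            return con.getInputStream();
        }
    }
}
```

Each worker thread would call SerializedAuthOpen.open(url, user, pass) and then run its usual read/write loop on the returned stream. The connection setup is still serialized across threads, so this trades some parallelism for correctness.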
 
Lew

Andreas said:
Lew, sometimes I really wonder if you aren't actually trolling.


It's perhaps not so much the static methods, but rather the static data
that gets set by the former, and which is supposed to be specific
to each thread. Having to synchronize the whole "set user-data and
fetch file"-block almost voids the whole point of parallelizing the task.

Perhaps it suffices to synchronize setting the user and opening the
connection, and leave the actual transfer unsynchronized, but I don't
feel very comfortable that way.

Andrea felt the same way, but I really don't understand the reaction. It is
true that static methods can be used in a multi-threaded program. The
statement to the contrary was not correct, and it is normal in Usenet to set
the record straight.

There are any number of programs that find it useful or convenient to share
static data and methods among threads. I stuck to the technical facts, and
provided correct information that should be useful to the OP and everyone else
reading. So why the hostility?
 
Andreas Leitgeb

Lew said:
Andrea felt the same way, but I really don't understand the reaction.

I think he (or she, can't deduce from name) quite clearly described the
problem: User-data is set statically. So despite the exact wording it
seemed obvious to me that the problem was the data, and not the method
by which it was set. The latter was merely what made the static storage
obvious.

We seem to differ on the level of obviousness ;-)

Lew said:
It is true that static methods can be used in a multi-threaded program. The
statement to the contrary was not correct, and it is normal in Usenet to set
the record straight.

It might have been worth a comment like: "Of course static methods are
not inherently problematic with threads, but ..." ideally followed
by a trick to solve the actual problem :)

Lew said:
There are any number of programs that find it useful or convenient to share
static data and methods among threads.

But these actually also share the value stored in those static variables.
The point here is that each thread needs a different value.

Lew said:
So why the hostility?

Because your answer not only focussed on some technical tidbit, but
thereby also refuted the mere existence of the actual problem.

Your answer *conveyed*: "static methods are not problematic with
multi-threaded usage, so your problem doesn't exist"

At least, it seems like both Andrea and I understood it that way.
 
Lew

Andreas said:
Because your answer not only focussed on some technical tidbit, but
thereby also refuted the mere existence of the actual problem.

Your answer *conveyed*: "static methods are not problematic with
multi-threaded usage, so your problem doesn't exist"

At least, it seems like both Andrea and I understood it that way.

OK - you guys were upset about something I didn't say, and blamed me for it.

I assure you I never said, meant or thought, "Your problem doesn't exist." I
said only what I said, and what I said was meant to be helpful. I didn't say
what I didn't say.
 
Andreas Leitgeb

I don't really see any hostility.

Lew said:
OK - you guys were upset about something I didn't say, and blamed me for it.

Helpfulness is a strange concept.

Correcting speling errors in a technical question is one example
of a "helpful" action that is only rarely appreciated.

Focussing on irrelevant details of a posting is another one.

Answering a detail that appears to be crucial at very first glance,
but really isn't, is often even explicitly un-appreciated. Probably
because it has a likely effect that future readers of the thread may
think it's already answered and skip it, even if they perhaps did know
the correct answer.

PS: Don't ask me why, but an initial phrase like "This doesn't really
help with the question, but [correction of the tidbit]" would
probably boost the acceptance of detail-corrections enormously.
 
Lew

Andreas said:
PS: Don't ask me why, but an initial phrase like "This doesn't really
help with the question, but [correction of the tidbit]" would
probably boost the acceptance of detail-corrections enormously.

Excellent advice, but bear in mind that this is a discussion group and
discussions can range over a wide range of topics.

My concern was that when people say here that something is "a static
method and therefore not usable in a threaded environment", the general
readership will believe such an inaccurate remark. Over the years I've
used Usenet, correction of such misinformation has not generally been
taken as an insult.

Furthermore, pruning the original post to a specific point and answering that
point alone should make it clear that the primary point is not under
discussion in such a post. Calling a person "trollish" for that was
completely out of line and downright insulting.

I don't know about you, but I think there is a distinct risk of bad practices
burgeoning if such misinformation is allowed to stand.

Suggesting that one coddle a respondent's feelings through the sort of
diplomacy you suggest is a good idea, but I suggest in return that people
focus on the facts under presentation and apply a little bit of reason and
logic to the information instead of getting all bent out of shape.

The fact is that static methods *are* suitable for multi-threaded programs,
just as much as instance methods are. No claim was made that that information
solved the OP's fundamental problem. OTOH, when such obvious
misinterpretation of the technology is evinced, it is possible that the
misunderstanding might indeed bear on the original problem.

Static methods are fully capable of managing distinct information per thread
if written to do so. It might not always be the best way, but it's often done
and quite safely. One might indeed reject a static method in that scenario,
but not because they cannot be used safely in multi-threaded programs.

For example, if the OP had followed my advice and used synchronization to
protect the static method call, they'd've been able to solve their problem.
It might not be the fastest way, but it certainly could work.

So everybody just take a chill pill and focus on the information provided.
Please stop the personal attacks.
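
As an illustration of static state that still keeps distinct information per thread, here is a minimal sketch built on ThreadLocal. The class ThreadLocalAuthenticator and its methods install(), useCredentials() and clearCredentials() are hypothetical names, not part of the JDK or of anything posted in the thread. It rests on an assumption that is worth verifying for the HTTP implementation in use: that the Authenticator callback runs on the same thread that reads the response.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

class ThreadLocalAuthenticator extends Authenticator {
    // One slot per thread; a thread only ever sees the value it stored itself.
    private static final ThreadLocal<PasswordAuthentication> CURRENT =
            new ThreadLocal<PasswordAuthentication>();

    /** Install once at startup, before any worker thread runs. */
    static void install() {
        Authenticator.setDefault(new ThreadLocalAuthenticator());
    }

    /** Each worker calls this before it opens its connection. */
    static void useCredentials(String username, String password) {
        CURRENT.set(new PasswordAuthentication(username, password.toCharArray()));
    }

    /** Call when the download is finished, especially with pooled threads. */
    static void clearCredentials() {
        CURRENT.remove();
    }

    @Override
    protected PasswordAuthentication getPasswordAuthentication() {
        // Assumption: this callback runs on the thread that performs the request.
        // Returns null when the calling thread registered no credentials.
        return CURRENT.get();
    }
}
```

With this arrangement Authenticator.setDefault() is called exactly once, so no thread overwrites another thread's credentials and the downloads need no shared lock.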
 
Arne Vajhøj

Andrea said:
There is a race condition. Here is an example:

We have two threads, t1 and t2, that execute the following code:
Authenticator.setDefault(new Authenticator() {
    protected PasswordAuthentication getPasswordAuthentication() {
        return new PasswordAuthentication(username,
                password.toCharArray());
    }
});
URLConnection con = url.openConnection();

Authenticator.setDefault is designed for proxy servers that
require authentication, and in that context all requests
need the same authenticator.

I think you will need to set HTTP headers manually.

Something like:

con.setRequestProperty("Authorization",
        "Basic " + basicauth("user", "pass"));

where:

// MimeUtility and MessagingException come from JavaMail
// (javax.mail.internet.MimeUtility, javax.mail.MessagingException).
public static String basicauth(String un, String pw)
        throws MessagingException, IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    OutputStream b64os = MimeUtility.encode(baos, "base64");
    b64os.write((un + ":" + pw).getBytes());
    b64os.close();
    return new String(baos.toByteArray());
}

Arne
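
For readers on Java 8 or later, the same header can be built without JavaMail. This is a minimal sketch, assuming java.util.Base64 is available; the class name BasicAuthHeader and its value() method are hypothetical, and the choice of UTF-8 for encoding the credentials is also an assumption (some servers expect ISO-8859-1).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuthHeader {
    /** Builds the value of the Authorization header for HTTP Basic authentication. */
    static String value(String username, String password) {
        // Basic credentials are "username:password", Base64-encoded.
        byte[] raw = (username + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }
}
```

Usage would be con.setRequestProperty("Authorization", BasicAuthHeader.value("user", "pass")) on each connection. Because the credentials travel with the individual connection object, nothing is shared between threads.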
 
Lew

Andrea said:
The correctness of the program depends on the scheduling, hence there
is a race condition. The race condition stems from the fact that
Authenticator.setDefault() is static and hence the same data is shared
by all threads.

Seems to me that the solution lies in having the registered Authenticator
itself be able to split up the logic according to the desired authentication,
rather than having multiple Authenticators. IOW, instead of splitting the
logic to choose an Authenticator, have the Authenticator implement the split.
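
One way this idea could look is sketched below: a single Authenticator, registered once, that picks credentials by the host being contacted. The class PerHostAuthenticator and its register() method are hypothetical names; getRequestingHost() is the standard accessor that java.net.Authenticator provides to subclasses. The sketch assumes that each set of credentials belongs to a distinct host, as in the example.org / example2.org scenario; it would not distinguish two users on the same host.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PerHostAuthenticator extends Authenticator {
    // Host name -> credentials; safe to read and update from several threads.
    private final Map<String, PasswordAuthentication> byHost =
            new ConcurrentHashMap<String, PasswordAuthentication>();

    /** Associates a username and password with one host name. */
    void register(String host, String username, String password) {
        byHost.put(host, new PasswordAuthentication(username, password.toCharArray()));
    }

    @Override
    protected PasswordAuthentication getPasswordAuthentication() {
        // The requesting host is filled in before this callback is invoked.
        String host = getRequestingHost();
        // Returning null signals that no credentials are available.
        return host == null ? null : byHost.get(host);
    }
}
```

The program would create one PerHostAuthenticator, register every host up front, and call Authenticator.setDefault() with it a single time before starting the download threads.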
 
