Advice/Help with Multithreading


DyslexicAnaboko

I wrote a method that takes a URL and returns the page's contents as a
String.

How long a download takes depends on which page is being visited.
Getting the contents of Google is not the same as getting Yahoo, since
the page sizes differ.

Since I have many pages to download, downloading them one at a time
takes forever. I just want to speed things up. I figured that
multithreading would be my answer, since I could create several threads
to download pages simultaneously. I am inexperienced with
multithreading, though, so I was hoping someone could give me
some pointers or advice on where to begin.

Basically I want to do the following:

1. I want to create X threads, let's just say 10 for argument's sake.

2. I want each thread to get its own assigned URL. Will there be a
problem with more than one thread accessing the same method?

3. After downloading the contents of the page I intend to put the
strings into a list. Will there be a problem with more than one thread
accessing the same object? If so, should I use semaphores?

I'm not asking anyone to write this for me, I just don't know where to
begin. If anyone can spare an example or any advice I am all ears.

Thanks,

Eli
 

Knute Johnson

You can run the same method in multiple threads, provided that you
synchronize access to any variables that are shared between
threads. So if you write a method getString(URL url), you can then
create a thread to run that method in as follows:

// url must be a final (or effectively final) local variable, or a field
Runnable r = new Runnable() {
    public void run() {
        getString(url);
    }
};
new Thread(r).start();

You will need some code after the call to getString() to put the result
somewhere, but that is really all there is to it.
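
To make the "put it somewhere" step concrete, here is a minimal sketch of one way to do it, not necessarily what the poster ended up with. It assumes a stand-in getString(String) that does no real network I/O (the real method takes a URL), and the class name ThreadedFetch and method fetchAll are made up for illustration. Each thread adds its result to a list wrapped by Collections.synchronizedList, and the caller waits for every thread with join().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ThreadedFetch {
    // Hypothetical stand-in for the poster's getString(URL) method.
    // It does no network I/O so the sketch runs anywhere.
    static String getString(String url) {
        return "contents of " + url;
    }

    // Starts one thread per URL, waits for all of them, and returns the pages.
    static List<String> fetchAll(List<String> urls) {
        // synchronizedList makes each individual add() call thread-safe.
        final List<String> pages =
                Collections.synchronizedList(new ArrayList<String>());
        List<Thread> threads = new ArrayList<Thread>();
        for (final String url : urls) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    pages.add(getString(url));
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for this download to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
                break;
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> pages =
                fetchAll(Arrays.asList("http://a.example", "http://b.example"));
        System.out.println(pages.size() + " pages downloaded");
    }
}
```

Note that the pages arrive in whatever order the threads finish, not in the order of the URL list. One thread per URL is fine for 10 URLs but would not scale to thousands.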

Start writing the program and post your progress.
 

DyslexicAnaboko

Will do, thank you. That was very helpful and is exactly what I needed
to get me started.

Eli
 

Daniel Pitts

Look at the java.util.concurrent package. It has helpful classes for
almost everything you're asking about:
<http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/package-summary.html>

Specifically, look at ThreadPoolExecutor and BlockingQueue.

You can submit download requests to the executor and have them stuff
the results into the blocking queue. You would have one or more
separate threads reading from the blocking queue and processing the
results. If you want all the results to end up in one List, then you
either need to synchronize on that list or have only one thread reading
from the BlockingQueue and writing to the list.
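
As a rough sketch of the second option (a single thread draining the queue into a plain list), something like the following could work. It is only an illustration under stated assumptions: getString(String) is a stand-in that does no network I/O, and PooledFetch, fetchAll, and the poolSize parameter are hypothetical names. The pool comes from Executors.newFixedThreadPool, which is backed by a ThreadPoolExecutor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class PooledFetch {
    // Hypothetical stand-in for the poster's getString(URL) method.
    static String getString(String url) {
        return "contents of " + url;
    }

    // Downloads all URLs on a pool of poolSize threads (poolSize must be > 0).
    static List<String> fetchAll(List<String> urls, int poolSize) {
        final BlockingQueue<String> results = new LinkedBlockingQueue<String>();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (final String url : urls) {
            pool.execute(new Runnable() {
                public void run() {
                    // A real download should enqueue something even on
                    // failure, or the take() loop below would wait forever.
                    results.add(getString(url));
                }
            });
        }
        pool.shutdown(); // accept no new tasks; already submitted ones still run

        // Only this thread touches the list, so it needs no synchronization.
        List<String> pages = new ArrayList<String>();
        try {
            for (int i = 0; i < urls.size(); i++) {
                pages.add(results.take()); // blocks until a result is ready
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> pages = fetchAll(
                Arrays.asList("http://a.example", "http://b.example"), 10);
        System.out.println(pages.size() + " pages downloaded");
    }
}
```

The pool size caps how many downloads run at once, so 10,000 URLs can be submitted without creating 10,000 threads. Results come out in completion order, not submission order.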

If you are writing a spider (or robot, or whatever), be sure to
follow good netiquette and respect robots.txt:
<http://www.robotstxt.org/>
 

DyslexicAnaboko

I never thought of my program as a robot, but I guess it could be
called that.

I was also worried about servers thinking that I may be attacking them
(DoS attacks), which is not my intention at all.
I will look through the link you provided. It never even crossed my
mind, so thanks for the heads up.

I am collecting anonymous information about random people on MySpace,
and my friend is using the information for statistics. Everything is
nameless and faceless; we are using people's MySpace IDs only. It is
really neat stuff. That is why I am trying to speed up the program:
it is really painful to sit and wait for one page to be
downloaded at a time, especially when you are waiting on a sample of
10,000 people or more. There are roughly 149,142,765 accounts.

I will try working with the concurrent package as suggested.

Thank you,

Eli
 

DyslexicAnaboko

I wanted to apologize for not posting a follow-up. The semester
started for me and I couldn't even think about the program after that.
I did, however, purchase a book on Java threads. Thanks again to
everyone for their help.
 
