How do I write code that reads a site, then takes the necessary information?

Dustinscape

I want to make a program that takes a character's stats from the
RuneScape hiscore website (the user enters their name) and then
stores each level in the corresponding int variable.

e.g.
int woodcutting = 50;
int firemaking = 50;

The site can be manipulated to show the levels of an individual
character by adding the user name at the end of the URL below:
http://hiscore.runescape.com/lang/en/aff/runescape/hiscorepersonal.ws?user1=usernamehere


Here is the pseudo-code version of my program:
1) The user inputs a name; the name is put into char value name
2) The name value is then placed at the end of this URL, which is read
by the program
http://hiscore.runescape.com/lang/en/aff/runescape/hiscorepersonal.ws?user1=dustin
3) The program searches through the text on the page and finds the
skills with their corresponding levels, probably using a
quicksort/mergesort/binary search method
int Overall
int Attack
int Defence
int Strength
int Hitpoints
int Ranged
int Prayer
int Magic
int Cooking
int Woodcutting
int Fletching
int Fishing
int Firemaking
int Crafting
int Smithing
int Mining
int Herblore
int Agility
int Thieving
int Slayer
int Farming
int Runecraft
4) Easy coding from there on: the program outputs those values



I was wondering if anyone knows of Java code that takes information from
other sites after manipulating the URL, preferably some code that
would match this concept for my site. However, all ideas are greatly
appreciated :)
 
raisenero

I don't know of any existing libraries that do precisely what you want
to do, but the Java API includes classes and methods to do most of
those individual steps.
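As a rough illustration of the "read the site" step, here is a minimal sketch using only java.net and java.io. The class and method names (PageFetcher, readAll, fetch) are made up for this example, UTF-8 is an assumed encoding, and the URL is the one from the original post, which may not resolve any more.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

final class PageFetcher {

    // Reads everything from the reader into one String, one '\n' per line.
    static String readAll(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    // Opens the address and returns the response body as text.
    // UTF-8 is an assumption; a real page may declare another charset.
    static String fetch(String address) throws IOException {
        try (InputStream stream = URI.create(address).toURL().openStream()) {
            return readAll(new InputStreamReader(stream, StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        // URL copied from the original post; it may no longer exist.
        System.out.println(fetch(
            "http://hiscore.runescape.com/lang/en/aff/runescape/hiscorepersonal.ws?user1=dustin"));
    }
}
```

Keeping readAll separate from fetch means the text-handling part can be tried on a String or a saved file without touching the network.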

A problem you might be facing is that the level of abstraction in your
pseudo-code may be too high. Once a solution is expressed in
pseudo-code, turning it into actual code should be a trivial matter.
Pseudo-code should express the solution, roughly line for line, the way
it will be expressed in the language. In my mind, the only difference
between pseudo-code and actual code is the use of human readable words
instead of language specific syntax.

As an example:

for (Employee employee : employeeList)
{
    namelist.add(employee.name);
    employee.increasePay(50000);
    employee.cancelBenefits();
}

For each employee in the list
    Store the employee's name
    Upgrade the employee's salary
    Cancel the employee's benefits
End Loop

You should spend some time breaking down each of your 4 steps into
specific algorithms, and then expressing those algorithms in
pseudo-code.
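For instance, steps 1 and 2 might break down into "read a line, trim it, URL-encode it, append it to the base address". A minimal sketch of that, assuming Java 10 or later; HiscoreUrl and forUser are hypothetical names, and the base address is simply the one from the original post:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

final class HiscoreUrl {

    // Base address copied from the original post.
    static final String BASE =
        "http://hiscore.runescape.com/lang/en/aff/runescape/hiscorepersonal.ws?user1=";

    // Step 2: append the encoded name to the base address.
    // URLEncoder turns a space into '+' and escapes other unsafe characters.
    static String forUser(String name) {
        return BASE + URLEncoder.encode(name.trim(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Step 1: read the user name from standard input.
        Scanner in = new Scanner(System.in);
        System.out.print("User name: ");
        if (in.hasNextLine()) {
            System.out.println(forUser(in.nextLine()));
        }
    }
}
```

Note that the name is held in a String here; a single char could only hold one letter of it.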
 
Simon

As regards point 3): Have you ever heard of "arrays"? SCNR
I don't see what you want to do with sorting algorithms here.
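To make the array idea concrete, here is one possible sketch: one array of skill names and a parallel array of levels, instead of 22 separate int variables. SkillTable and its methods are hypothetical names; the skill names are the ones listed in the original post.

```java
final class SkillTable {

    // Skill names in the order given in the original post.
    static final String[] SKILLS = {
        "Overall", "Attack", "Defence", "Strength", "Hitpoints", "Ranged",
        "Prayer", "Magic", "Cooking", "Woodcutting", "Fletching", "Fishing",
        "Firemaking", "Crafting", "Smithing", "Mining", "Herblore", "Agility",
        "Thieving", "Slayer", "Farming", "Runecraft"
    };

    // levels[i] is the level for SKILLS[i]; 0 means "not set yet".
    private final int[] levels = new int[SKILLS.length];

    // Plain linear scan: with 22 entries there is nothing to gain
    // from sorting or binary search.
    static int indexOf(String skill) {
        for (int i = 0; i < SKILLS.length; i++) {
            if (SKILLS[i].equalsIgnoreCase(skill)) {
                return i;
            }
        }
        return -1;
    }

    // Stores the level; returns false if the skill name is unknown.
    boolean set(String skill, int level) {
        int i = indexOf(skill);
        if (i < 0) {
            return false;
        }
        levels[i] = level;
        return true;
    }

    // Returns the stored level, or 0 for an unknown or unset skill.
    int get(String skill) {
        int i = indexOf(skill);
        return i < 0 ? 0 : levels[i];
    }

    public static void main(String[] args) {
        SkillTable table = new SkillTable();
        table.set("Woodcutting", 50);
        table.set("Firemaking", 50);
        for (String skill : SKILLS) {
            System.out.println(skill + " = " + table.get(skill));
        }
    }
}
```

A Map<String, Integer> would do the same job with less code; the array version is shown only because arrays are what the post suggests.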

If the HTML in your site is actually XML, you could try to use Java's XML
capabilities (org.w3c.dom, org.xml.sax) for parsing. However, I doubt that this
would be successful. It might be safer to use regular expressions
(java.util.regex) to match against something like

<td>Skillname</td><td>value</td>
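A minimal sketch of such a match with java.util.regex follows. It assumes the page really does put the skill name and the level in two adjacent td cells, which has not been checked against the real site; LevelRegex and parse are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LevelRegex {

    // Two adjacent cells: a word, then a number of at most 9 digits
    // (so Integer.parseInt cannot overflow). Attributes on the td tags
    // and whitespace around the values are tolerated.
    static final Pattern ROW = Pattern.compile(
        "<td[^>]*>\\s*([A-Za-z]+)\\s*</td>\\s*<td[^>]*>\\s*(\\d{1,9})\\s*</td>",
        Pattern.CASE_INSENSITIVE);

    // Returns skill name -> level for every matching pair of cells,
    // in the order they appear in the page text.
    static Map<String, Integer> parse(String html) {
        Map<String, Integer> levels = new LinkedHashMap<>();
        Matcher m = ROW.matcher(html);
        while (m.find()) {
            levels.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return levels;
    }

    public static void main(String[] args) {
        String sample = "<tr><td>Attack</td><td>70</td></tr>\n"
                      + "<tr><td>Woodcutting</td> <td> 50 </td></tr>";
        System.out.println(parse(sample)); // prints {Attack=70, Woodcutting=50}
    }
}
```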

Cheers,
Simon
 
AndrewTK

You'll have to delve into the site's techniques for user profiles, as
you need to be logged in to do anything:

"Dustin does not feature in the hiscores. You have to be in the top 1
million and have a minimum skill level of 30"

After that you can read the page into memory and either parse the XML
(using Xerces or SAX for instance) or use regular expressions
(www.regular-expressions.info).

In either case, you'll have to get past user login first.
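Whatever the cause of that message, the program would need to recognise it before trying to extract any levels. A small sketch, assuming the page contains the same wording as quoted above (the live text may differ); HiscoreCheck and isRanked are hypothetical names.

```java
import java.util.Locale;

final class HiscoreCheck {

    // Phrase taken from the message quoted above; the wording on the
    // live site is an assumption.
    static final String NOT_RANKED = "does not feature in the hiscores";

    // Returns false if the page text contains the "not ranked" message.
    // Whitespace is collapsed first so a line break inside the phrase
    // does not hide it.
    static boolean isRanked(String pageText) {
        String normalised = pageText.replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
        return !normalised.contains(NOT_RANKED);
    }

    public static void main(String[] args) {
        System.out.println(isRanked("Dustin does not feature in the hiscores.")); // prints false
        System.out.println(isRanked("<td>Attack</td><td>70</td>"));               // prints true
    }
}
```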
 
Alex Hunsley

Trying to interpret the HTML as if it were well-formed XML might be
a bit optimistic! (Given the laxity of most browsers in what they
accept as HTML.) I would go with a regular expression approach. Not a
satisfying solution, but one that might work.
 
Roedy Green

The big problem with that is that regexes don't understand nesting.
There are various HTML parsers around. They can at least deal with
a missing </li> etc.
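The nesting problem is easy to reproduce. In this small sketch (NestingDemo and firstCell are names invented for the example), a typical "cell" pattern works on a flat cell but stops at the first closing tag when one table sits inside another:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NestingDemo {

    // A common way to grab the contents of a table cell.
    static final Pattern CELL = Pattern.compile("<td>(.*?)</td>", Pattern.DOTALL);

    // Returns the contents of the first cell the pattern finds, or null.
    static String firstCell(String html) {
        Matcher m = CELL.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String flat = "<td>50</td>";
        String nested = "<td><table><tr><td>50</td></tr></table></td>";

        System.out.println(firstCell(flat));   // prints 50
        // The match ends at the inner </td>, so the outer cell is cut short.
        System.out.println(firstCell(nested)); // prints <table><tr><td>50
    }
}
```

Making the quantifier greedy does not help either: it then runs to the last closing tag in the input, which is wrong as soon as there are two sibling cells.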

The way I handle it is to keep my HTML code clean using HTMLValidator.
Then I can use quite simple-minded parsers on it that just give up if
they discover anything invalid.
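One way to get a parser that "just gives up" is to feed the markup to the JDK's XML parser, which rejects anything that is not well-formed. This is only a sketch of that idea, not the parser described above; StrictTableParser is a hypothetical name, the tr/td layout is assumed, and the doctype feature string is the Xerces one that the JDK's built-in parser understands.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class StrictTableParser {

    // Returns skill -> level for every row, or Optional.empty() as soon as
    // anything is not well-formed or not in the expected two-cell shape.
    static Optional<Map<String, Integer>> parse(String markup) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations when parsing untrusted input.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            doc = factory.newDocumentBuilder()
                         .parse(new InputSource(new StringReader(markup)));
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return Optional.empty(); // invalid markup: give up
        }

        Map<String, Integer> levels = new LinkedHashMap<>();
        NodeList rows = doc.getElementsByTagName("tr");
        for (int i = 0; i < rows.getLength(); i++) {
            NodeList cells = ((Element) rows.item(i)).getElementsByTagName("td");
            if (cells.getLength() != 2) {
                return Optional.empty(); // unexpected row shape: give up
            }
            try {
                levels.put(cells.item(0).getTextContent().trim(),
                           Integer.parseInt(cells.item(1).getTextContent().trim()));
            } catch (NumberFormatException e) {
                return Optional.empty(); // level is not a number: give up
            }
        }
        return Optional.of(levels);
    }

    public static void main(String[] args) {
        System.out.println(parse("<table><tr><td>Attack</td><td>70</td></tr></table>"));
        System.out.println(parse("<table><tr><td>Attack<td>70</tr></table>"));
    }
}
```

Most real-world HTML would be rejected by this, which is the point: it only works on pages that are kept clean.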

See http://mindprod.com/jgloss/htmlvalidator.html
 
