Java and HTML parsing.

  • Thread starter Mathias Mejborn

Mathias Mejborn

Hello.

I am trying to write my first HTML parser in Java, but I have run into some
problems that I can't figure out how to solve.

The interesting method in my program looks like this:

// s, ind, dr1Fundet and tid are fields of the surrounding class.
public void findHTML() {
    try {
        while (s != null) {
            if (s.indexOf("title=\"DR1\"") > -1) {
                System.out.println("DR1 fundet"); // "DR1 found"
                dr1Fundet = true;
                if (dr1Fundet) {
                    int start = s.indexOf("style=\"margin:0px;\">") + 20;
                    System.out.println("Udskriver start: " + start); // "Printing start"

                    tid = s.substring(start, 5);
                    System.out.println("Udskriver tid" + tid); // "Printing time"
                }
            }
            s = ind.readLine();
        }
    } catch (Exception e) {}
}

(I hope the code comes out right when I post this.)
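
A note on the method above, offered as a sketch rather than a definitive fix: in Java, String.substring(beginIndex, endIndex) takes an end index, not a length. Because start is always at least 19 here, s.substring(start,5) throws a StringIndexOutOfBoundsException, and the empty catch block swallows it, so the loop simply stops without printing the time. In addition, indexOf returns -1 when the marker is missing, and adding 20 before checking hides that. The class and method names below (SubstringDemo, timeAfterMarker) are hypothetical and only illustrate the index arithmetic.

```java
class SubstringDemo {
    // Marker copied from the question's code; it is 20 characters long,
    // which is where the "+20" in the original comes from.
    static final String MARKER = "style=\"margin:0px;\">";

    /** Returns the 5 characters right after the first MARKER, or null if unavailable. */
    static String timeAfterMarker(String line) {
        int at = line.indexOf(MARKER);
        if (at < 0) {
            return null; // marker not on this line; check BEFORE adding the offset
        }
        int start = at + MARKER.length();
        if (start + 5 > line.length()) {
            return null; // line too short to hold "hh.mm"
        }
        // The second argument is an END index, so it must be start + 5, not 5.
        return line.substring(start, start + 5);
    }

    public static void main(String[] args) {
        String line = "<td><p style=\"margin:0px;\">17.00:</p></td>";
        System.out.println(timeAfterMarker(line));
    }
}
```

Printing the exception in the catch block (for example with e.printStackTrace()) instead of leaving it empty would also make failures like this visible.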

What I am trying to achieve is this:

On the website http://ontv.dk/tv/1 I would like to parse the following HTML:

<p style="font-weight:bold; font-size:15px;">Senere i dag på
DR1</p><table cellspacing="0" style="width:100%;"><tr
style="background-color:#eeeeee;"><td style="width:40px;
text-align:right;"><p style="margin:0px;">17.00:</p></td><td><p
style="margin:0px;"><a href="/programinfo/11178550000">Troldspejlet

You can see this HTML block in the page source, starting on line 159 and
ending on line 171.

What I want to extract from the HTML is the time, 17.00, followed by the
programme title, Troldspejlet.

My problem is that I can't figure out how to do this. I hope
some of you can help me out.
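
One possible approach, sketched under assumptions: because the time and the title sit in different tags and may fall on different lines, it can be easier to read the whole page into one String and match both with a regular expression than to test line by line. The pattern below is modelled only on the HTML fragment quoted above; the class and method names (ListingExtractor, firstListing) are hypothetical, and the closing </a> in the sample string is an assumption, since the quoted fragment stops at the title.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ListingExtractor {
    // Time cell followed by the programme link, as laid out in the quoted fragment.
    // DOTALL lets "." cross the line breaks between the two table cells.
    private static final Pattern ROW = Pattern.compile(
        "<p\\s+style=\"margin:0px;\">(\\d{2}\\.\\d{2}):</p>"
            + ".*?<a\\s+href=\"/programinfo/\\d+\">([^<]+)",
        Pattern.DOTALL);

    /** Returns {time, title} for the first matching row, or null if nothing matches. */
    static String[] firstListing(String html) {
        Matcher m = ROW.matcher(html);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2).trim() };
    }

    public static void main(String[] args) {
        // Sample built from the fragment in the question (closing </a> assumed).
        String html = "<td style=\"width:40px;\ntext-align:right;\">"
            + "<p style=\"margin:0px;\">17.00:</p></td><td><p\n"
            + "style=\"margin:0px;\"><a href=\"/programinfo/11178550000\">Troldspejlet</a>";
        String[] row = firstListing(html);
        if (row != null) {
            System.out.println(row[0] + " " + row[1]);
        }
    }
}
```

To limit the search to the DR1 block, the page text could first be cut at the "Senere i dag på DR1" heading with indexOf and substring before matching. A pattern like this breaks as soon as the site changes its markup, so a dedicated HTML parser library is likely the more robust choice for anything beyond a first experiment.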
 
