Iterating over IMG tags in an HTML file

Mike Mimic

Hi!

I would like my program to print the path of every image in an
HTML file (that is, the SRC attribute of each IMG tag).

I have made this (code that prints the names of all attributes in
all IMG tags):

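HTMLEditorKit kit = new HTMLEditorKit();
HTMLDocument html = (HTMLDocument) kit.createDefaultDocument();
kit.read(new BufferedReader(new FileReader(file)), html, 0);

// Iterate over every occurrence of the given tag in the document.
HTMLDocument.Iterator it = html.getIterator(HTML.Tag.IMG);
while (it.isValid()) {
    AttributeSet attrs = it.getAttributes();
    if (attrs != null) {
        // Print the name of each attribute of the current tag.
        for (Enumeration<?> e = attrs.getAttributeNames();
                e.hasMoreElements();) {
            System.out.println(e.nextElement());
        }
    }
    it.next(); // advance to the next tag; without this the loop never ends
}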

and it does not work. But if I change HTML.Tag.IMG to HTML.Tag.A,
it works as it should (for links).

The HTML file has IMG tags (as well as A tags).


Mike
 
Mike Mimic

Hi!

Mike said:
I have made this (code that prints the names of all attributes in
all IMG tags):

The following works for IMG but not for A (the element iterator
never returns an A element). :)

ElementIterator elit = new ElementIterator(html);
Element elem;
while ((elem = elit.next()) != null) {
    if (elem.getName().equals("img")) {
        // do something with elem.getAttributes()
    }
}

So for every tag except A, use this. And for A tags, use the
previous code.

Why?
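
A likely explanation, assuming the usual Swing HTML document model: IMG becomes a leaf element of its own (its name attribute is HTML.Tag.IMG), while A never becomes an element. The anchor is stored as a character attribute, keyed by HTML.Tag.A, on the text runs it covers. That would explain why ElementIterator never sees an element named "a", and why an iterator that only checks for the tag as an attribute key finds A but not IMG. Under that assumption, one pass over the elements can handle both cases. The class and method names below are made up for illustration; this is a sketch, not a definitive implementation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import javax.swing.text.StyleConstants;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class, named only for this example.
class ImgAndLinkLister {

    /** Parses HTML text into an HTMLDocument. */
    public static HTMLDocument parse(String htmlText)
            throws IOException, BadLocationException {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        // Avoids ChangedCharSetException when the page has a charset <meta> tag.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        kit.read(new StringReader(htmlText), doc, 0);
        return doc;
    }

    /** Collects the SRC value of every IMG element. */
    public static List<String> imageSources(HTMLDocument doc) {
        List<String> result = new ArrayList<>();
        ElementIterator it = new ElementIterator(doc);
        for (Element elem = it.next(); elem != null; elem = it.next()) {
            AttributeSet attrs = elem.getAttributes();
            // An image is an element whose name attribute is HTML.Tag.IMG.
            if (attrs.getAttribute(StyleConstants.NameAttribute) == HTML.Tag.IMG) {
                Object src = attrs.getAttribute(HTML.Attribute.SRC);
                if (src != null) {
                    result.add(src.toString());
                }
            }
        }
        return result;
    }

    /** Collects the HREF of every leaf element that carries an anchor attribute. */
    public static List<String> linkTargets(HTMLDocument doc) {
        List<String> result = new ArrayList<>();
        ElementIterator it = new ElementIterator(doc);
        for (Element elem = it.next(); elem != null; elem = it.next()) {
            if (!elem.isLeaf()) {
                continue;
            }
            // The anchor is an attribute keyed by HTML.Tag.A, not an element.
            Object anchor = elem.getAttributes().getAttribute(HTML.Tag.A);
            if (anchor instanceof AttributeSet) {
                Object href = ((AttributeSet) anchor).getAttribute(HTML.Attribute.HREF);
                if (href != null) {
                    result.add(href.toString());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument doc = parse("<html><body><p><a href=\"page.html\">a link</a>"
                + " and <img src=\"pics/cat.png\" alt=\"cat\"></p></body></html>");
        System.out.println("images: " + imageSources(doc));
        System.out.println("links:  " + linkTargets(doc));
    }
}
```

Note that a link whose text is split into several runs (for example by a nested B tag) shows up once per run, so linkTargets may return the same HREF more than once.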


Mike
 
