lorean2007
Hello,
I would be interested in parsing an HTML file by its corresponding
opening and closing tags, while taking into account the class
attributes and their values:
<html>
<body>
....
<div class="one">
....
<div class="two">
</div>
....
</div>
....
<div class="one">...</div>
<a href="..." class="three">
</body>
</html>
In this example, I would need all the content inside the div with
class="two", or only the divs with class="one".
I am wondering whether I should use regular expressions, but I do not
think so, as I would have to jump past the inner closing div. The
alternative is a simple parser; I searched and found
http://www.diveintopython.org/html_processing/basehtmlprocessor.html
but I would like the parser not to change anything at all (no
lowercasing).
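To make the goal concrete, here is a rough sketch of the kind of thing I have in mind. It assumes Python 3's standard html.parser module rather than the BaseHTMLProcessor from the link; extract_divs and its parameters are names made up for illustration. The parser is only used to find where each matching div starts and ends, and the content is then sliced out of the original string, so nothing is lowercased or otherwise rewritten. Only the outermost matching div is returned when matches are nested.

```python
from html.parser import HTMLParser


def extract_divs(source, wanted_class):
    """Return the raw source text inside each <div> whose class list
    contains wanted_class (hypothetical helper, outermost matches only)."""

    # Absolute index of the start of every line, so that getpos()
    # (1-based line, 0-based column) can be mapped to a string index.
    line_starts = [0]
    for i, ch in enumerate(source):
        if ch == "\n":
            line_starts.append(i + 1)

    class Extractor(HTMLParser):
        def __init__(self):
            super().__init__()
            self.depth = 0      # div nesting depth inside the current match
            self.start = None   # index where the current match's content begins
            self.results = []

        def _offset(self):
            # Index in `source` of the tag currently being handled.
            line, col = self.getpos()
            return line_starts[line - 1] + col

        def handle_starttag(self, tag, attrs):
            if tag != "div":
                return
            if self.depth:
                # Nested div inside a match: only count it.
                self.depth += 1
                return
            classes = (dict(attrs).get("class") or "").split()
            if wanted_class in classes:
                self.depth = 1
                # Content begins right after the raw start tag text.
                self.start = self._offset() + len(self.get_starttag_text())

        def handle_endtag(self, tag):
            if tag != "div" or not self.depth:
                return
            self.depth -= 1
            if self.depth == 0:
                # Slice the untouched original text up to the closing tag.
                self.results.append(source[self.start:self._offset()])

    parser = Extractor()
    parser.feed(source)
    parser.close()
    return parser.results
```

Something like extract_divs(page, "two") would then give the raw text of each div with class="two", and extract_divs(page, "one") the raw text of each outer div with class="one", inner tags included exactly as written. This relies on the div tags being properly closed; badly nested markup would need more care.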
Can you help?
Best.