Hpricot - best way to parse based on comments

Jerome --- · Nov 20, 2006

I am trying to parse some files that contain comments like this:

<html>
<body>



images, text, etc...



Interesting text of site here.

</body>
</html>

I am wondering how to go about extracting the data within the comments
block using Hpricot. I am not aware of a way to refer to commented HTML
through CSS or XPath selectors.

Thanks for any ideas!

- Jerome

Keith Fahlgren · Nov 20, 2006

I am trying to parse some files that contain comments like this:
...
I am not aware of a way to refer to commented HTML
through CSS or XPath selectors.

The XPath comment() node test selects every comment node in the document.

For example, with XMLStarlet (the XPath expression follows the -m flag):
keith@devel ~ $ xml sel -t -m '//comment()' -v '.' -n simple.xml
one comment
two comment

keith@devel ~ $ cat simple.xml
<simple>
<!-- one comment -->
<foo/>
<!-- two comment -->
<bar/>
</simple>
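
The same comment() query can be run from Ruby. The sketch below uses REXML from the standard distribution rather than Hpricot, since the thread does not confirm that Hpricot's search accepts comment(). The XML literal is an assumed reconstruction of simple.xml, with two comments whose text matches the output above.

```ruby
require "rexml/document"

# Assumed reconstruction of simple.xml: two comments whose text matches
# the command output quoted above.
xml = <<~XML
  <simple>
  <!-- one comment -->
  <foo/>
  <!-- two comment -->
  <bar/>
  </simple>
XML

doc = REXML::Document.new(xml)

# comment() is the XPath node test for comment nodes; // searches the whole tree.
comments = REXML::XPath.match(doc, "//comment()").map { |node| node.string.strip }

comments.each { |text| puts text }
```

REXML expects well-formed XML, so this only helps directly when the pages are valid XHTML; for looser HTML a more forgiving parser would be needed.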

HTH,
Keith

Ken Bloom · Nov 21, 2006

I am trying to parse some files that contain comments like this:

<html>
<body>



images, text, etc...



Interesting text of site here.

</body>
</html>

I am wondering how to go about extracting the data within the comments
block using Hpricot. I am not aware of a way to refer to commented HTML
through CSS or XPath selectors.

Thanks for any ideas!

- Jerome

Why not gsub out the unwanted sections before parsing with Hpricot? Or,
if the data you want sits between two comments, use a regexp to narrow
the document down to just the text between them, and then parse that
with Hpricot.
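
A minimal sketch of both ideas in plain Ruby. The marker comments (`<!-- begin content -->` and `<!-- end content -->`) and the helper names are hypothetical, since the real comments were not shown in the thread; substitute whatever the pages actually contain.

```ruby
# Hypothetical marker comments; the real ones were not shown in the thread.
START_MARKER = "<!-- begin content -->"
END_MARKER   = "<!-- end content -->"

# Return the text found between each start/end marker pair.
# The /m flag lets "." match newlines; ".*?" keeps each match as short as possible.
def between_markers(html, start_marker = START_MARKER, end_marker = END_MARKER)
  pattern = /#{Regexp.escape(start_marker)}(.*?)#{Regexp.escape(end_marker)}/m
  html.scan(pattern).flatten.map(&:strip)
end

# Remove every HTML comment, leaving the rest of the markup alone.
def strip_comments(html)
  html.gsub(/<!--.*?-->/m, "")
end

html = <<~HTML
  <html>
  <body>
  <!-- begin content -->
  Interesting text of site here.
  <!-- end content -->
  </body>
  </html>
HTML

sections = between_markers(html)
sections.each { |section| puts section }
```

Each extracted section is an ordinary string, so it can be handed to Hpricot (or any other parser) afterwards if it contains markup of its own.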

--Ken Bloom
