Extracting part of a document

Une Bévue

The purpose:

Strip banners and other useless content from an HTML document: leave
everything from the start of the document to the body tag intact and,
inside the body, keep only the part where the user clicked (mousedown,
then possibly mousemove, then mouseup).

For example, given this schematic document as input:

<html><title>...<meta><link to css, javascript etc.>
<body...>
<div id="one">div one contents </div>
<div id="two">div two contents </div>
<div id="three">div three contents </div>
</body>
</html>

suppose the user pressed and released the mouse button inside div "two".
I want to transform the given document (in place) into:

<html><title>...<meta><link to css, javascript etc.>
<body...>
<div id="two">div two contents </div>
</body>
</html>

leaving only div "two" inside the body.

I've started working on this (not successfully so far),

following:
<http://www.quirksmode.org/js/events_mouse.html>
and:
<http://www.quirksmode.org/dom/getElementsByTagNames.html>

from which I took a function that is useful to me:

--- getElementsByTagNames(list,obj) ---
function getElementsByTagNames(list,obj) {
  nb_calls++; // debug counter (global)
  if (!obj) obj = document.body; // default to the whole body
  var tagNames = list.split(',');
  var resultArray = new Array();
  var tags;
  for (var i=0;i<tagNames.length;i++) {
    // look up one tag name at a time
    tags = obj.getElementsByTagName(tagNames[i]);
    nb_tags+=tags.length; // debug counter (global)
    for (var j=0;j<tags.length;j++) {
      resultArray.push(tags[j]);
      nb_loop++; // debug counter (global)
    }
  }
  return resultArray;
}
-----------------------------------------------------

Here are the problems I get:

with an HTML page like the one described above, having 5 divs inside the
body in order to simulate a "real life" document:
div banner, div left, div extract, div right and div footer,

the DOM structure of div extract being:

--- div#extract ----------------------------------------
<div id="extract">
<h3 id="click">Mousedown, mouseup, click</h3>

<p>...</p>

<ol>
<li><code>...</code>...</li>
<li><code>...</code>,...</li>
<li><code>...</code>...
<code>...</code>...<code>...</code>...</li>
</ol>

<p>... <code>...</code> ...<code>...</code>
....<code>...</code>...<code>...</code>....<code>...</code>...</p>

<p>...<code>...</code>... <code>...</code> ....</p>

<p>...<code>...</code>...<code>click</code>...</p>

<p>...</p>
</div>
-----------------------------------------------------------


if, on this div extract, I do:

extract=document.getElementById("extract")
then:
elts_extract=getElementsByTagNames(the_list,extract);
with:
var the_list="div, h1, h2, h3, h4, h5, p, img, ul, li, table, pre"

I get NO elements at all from the function
"getElementsByTagNames(list,obj)".

Note that the variables nb_calls, nb_tags and nb_loop are there only
for debugging.

In case someone can shed some light on this...
 
RobG

Une said:
<snip/>

I've started working on this (not successfully so far)

That's enough. Once you have a reference to div two, you can remove
all the body's child nodes, then re-attach div two:


function trimBody (htmlElement){
  var docBody = document.body;
  while ( docBody.firstChild ){
    docBody.removeChild( docBody.firstChild );
  }
  docBody.appendChild( htmlElement );
}

Your method of getting elements by tag name gives you a set of node
collections: the document structure is lost and you don't know how
to put it back. The function above keeps the structure intact. As a
complete page:

<script type="text/javascript">

function trimBody(htmlElement){
  var docBody = document.body;
  while (docBody.firstChild){
    docBody.removeChild(docBody.firstChild);
  }
  docBody.appendChild(htmlElement);
}

</script>
<body>
<div id="one" onclick="trimBody(this);">
<p>Click here to keep just div <b>one</b>
</div>
<div id="two" onclick="trimBody(this);">
<p>Click here to keep just div <b>two</b>
</div>
<div id="three" onclick="trimBody(this);">
<p>Click here to keep just div <b>three</b>
</div>
</body>
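
The example above wires trimBody to each div by hand. For an arbitrary page, the div to keep has to be found from the clicked node instead. A minimal sketch, assuming a single mouseup listener on the document is acceptable (the original question also mentions mousedown); topLevelChild is a hypothetical helper name, not something from the thread:

```javascript
// Walk up from a node until we reach the direct child of `container`.
// Returns null if `node` is not inside `container`.
function topLevelChild(node, container) {
  while (node && node.parentNode !== container) {
    node = node.parentNode;
  }
  return node || null;
}

// Browser-only wiring (skipped when there is no DOM, e.g. under Node).
if (typeof document !== "undefined") {
  document.addEventListener("mouseup", function (event) {
    var body = document.body;
    var keep = topLevelChild(event.target, body);
    if (keep) {
      // Same removal loop as trimBody
      while (body.firstChild) {
        body.removeChild(body.firstChild);
      }
      body.appendChild(keep);
    }
  });
}
```

Because the walk only follows parentNode, it works whether the click lands on the div itself or on something nested deep inside it.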
 
Une Bévue

RobG said:
That's enough. Once you have a reference to div two, you can remove
all the body's child nodes, then re-attach div two:
<snip/>

Yes, fine, thanks. I had found it; see my own answer earlier in this
thread...
 
RobG

Une said:
<snip/>

Yes, fine, thanks. I had found it; see my own answer earlier in this
thread...

OK, but don't believe the junk about a "Safari bug": any node that
supports the event interface can be an event target. It's just that
WebKit browsers have implemented it on text nodes where other browsers
haven't.
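
A handler that may receive a text node as its target can normalize it before doing anything else. A minimal sketch, assuming the handler always wants the enclosing element; elementFromTarget is a hypothetical name, not something from the thread:

```javascript
var TEXT_NODE = 3; // numeric value of Node.TEXT_NODE

// If the event target is a text node, use its parent element instead.
function elementFromTarget(target) {
  if (target && target.nodeType === TEXT_NODE) {
    return target.parentNode;
  }
  return target;
}
```

In a handler this would be called as elementFromTarget(event.target) before passing the result on.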

I think my method of removing child nodes is more efficient... you are
free to choose. :)
 
Une Bévue

RobG said:
I think my method of removing child nodes is more efficient... you are
free to choose. :)

Yes, fine, I'm able to change my mind ;-)

Yes, right, your while loop is cleverer than my for loop, I agree, but
I don't understand why my version would have "destroyed the structure"?

I think I noticed I did that in reverse order (from last to first)?

New version online:
<http://thoraval.yvon.free.fr/JavaScript/trim_body.html>
 
RobG

Une said:
Yes, fine, I'm able to change my mind ;-)

Yes, right, your while loop is cleverer than my for loop, I agree, but
I don't understand why my version would have "destroyed the structure"?

Using getElementsByTagNames created an array of elements that does not
have the same structure as the original document. I assumed you were
going to just put them back in the order of the array, not where they
started from.

It failed because your list of tag names was:

"div, h1, h2, h3, ..."

There were no divs inside div extract, and the other tag names have
leading spaces, so you were trying to match " h1" rather than "h1", etc.
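
The leading-space problem can be seen, and avoided, by trimming each name after the split. A minimal sketch; parseTagList is a hypothetical helper, not part of the quirksmode function:

```javascript
// Split a comma-separated tag list and strip surrounding whitespace,
// so "div, h1" yields "h1" rather than " h1".
function parseTagList(list) {
  return list
    .split(",")
    .map(function (name) { return name.trim(); })
    .filter(function (name) { return name.length > 0; });
}

var the_list = "div, h1, h2, h3, h4, h5, p, img, ul, li, table, pre";
console.log(the_list.split(",")[1]);   // " h1" (with the leading space)
console.log(parseTagList(the_list)[1]); // "h1"
```

Writing the list without spaces ("div,h1,h2,...") would work just as well with the function as posted.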

I think I noticed I did that in reverse order (from last to first)?

It seems you were going from 0 to the i'th, but that is not relevant.
The array of elements does not have the same structure as the original
HTML: the structure was destroyed by collecting all the elements with
the same tag name together.
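
To illustrate the point, here is a small sketch on a mock tree (plain objects standing in for DOM nodes, so all names and the tree itself are made up for the illustration). Collecting tag by tag groups the elements by name, while a normal walk keeps document order:

```javascript
// Depth-first, document-order walk over a mock tree of { tag, label, children }.
function walk(node, visit) {
  visit(node);
  (node.children || []).forEach(function (child) { walk(child, visit); });
}

// Mimics the tag-by-tag approach: all matches for the first tag name,
// then all matches for the second, and so on (the root itself is excluded).
function collectByTagNames(root, tagNames) {
  var result = [];
  tagNames.forEach(function (name) {
    walk(root, function (node) {
      if (node !== root && node.tag === name) result.push(node.label);
    });
  });
  return result;
}

// Every descendant, in document order, for comparison.
function collectInDocumentOrder(root) {
  var result = [];
  walk(root, function (node) {
    if (node !== root) result.push(node.label);
  });
  return result;
}

var extract = {
  tag: "div", label: "div#extract",
  children: [
    { tag: "h3", label: "h3" },
    { tag: "p", label: "p-1" },
    { tag: "ol", label: "ol", children: [{ tag: "li", label: "li-1" }] },
    { tag: "p", label: "p-2" }
  ]
};

console.log(collectByTagNames(extract, ["h3", "p", "li"])); // [ 'h3', 'p-1', 'p-2', 'li-1' ]
console.log(collectInDocumentOrder(extract));               // [ 'h3', 'p-1', 'ol', 'li-1', 'p-2' ]
```

In the first result p-2 comes before li-1 and the ol wrapper is gone, so the array alone cannot be used to rebuild the original markup.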

Works fine (even in Safari) :)
 
Une Bévue

RobG said:
It seems you were going from 0 to the i'th, but that is not relevant.
The array of elements does not have the same structure as the original
HTML: the structure was destroyed by collecting all the elements with
the same tag name together.

OK, now I understand what you mean.

Works fine (even in Safari) :)

Yes, I even tried it with the latest WebKit nightly build (it doesn't
give the same height for the divs...)

Right now I have to write a Ruby script to put this line
somewhere in the head:

<script type="text/javascript" src="js/trim_body.js"></script>

Not a big deal.

And then test it on any given page...
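
The insertion step itself is a small string operation. The poster plans to do it in Ruby; the sketch below shows the same idea in JavaScript to stay in the thread's language. injectScriptTag is a hypothetical name, and placing the tag just before the closing head tag is an assumption:

```javascript
// Insert a script tag just before </head>; return the input unchanged
// if no closing head tag is found.
function injectScriptTag(html, src) {
  var tag = '<script type="text/javascript" src="' + src + '"></script>';
  var index = html.search(/<\/head\s*>/i);
  if (index === -1) {
    return html;
  }
  return html.slice(0, index) + tag + "\n" + html.slice(index);
}

var page = "<html><head><title>t</title></head><body></body></html>";
console.log(injectScriptTag(page, "js/trim_body.js"));
```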
 
