rf said:
apostolos
Ah. You want to steal bits of other people's pages?
Please. I can think of at least two legitimate uses for this kind of
functionality.
One I saw as part of a smart personal spider: you could go to a site
(say, like Amazon, where the product detail pages all have a similar
structure and layout), select a representative sample of text, coax it
into a semantically rich XML structure, and then set the spider loose to
pull down and intelligently index similarly structured pages on the rest
of the site. You wouldn't say that Googlebot steals, so why would this
be any different?
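The spider idea above can be sketched in a few lines. This is a minimal illustration with made-up page markup and field names; a real spider would fetch live pages and tolerate much messier HTML:

```python
import re

# Hypothetical product-detail pages that share one template
# (assumption: on a real site the layout is stable but noisier).
PAGES = [
    '<div class="title">Widget A</div><span class="price">$9.99</span>',
    '<div class="title">Widget B</div><span class="price">$14.50</span>',
]

# A pattern derived once from a representative sample page,
# then reused across every similarly structured page on the site.
TEMPLATE = re.compile(
    r'<div class="title">(?P<title>.*?)</div>'
    r'<span class="price">\$(?P<price>[\d.]+)</span>'
)

def extract(html):
    """Coax one page into a semantically rich record."""
    m = TEMPLATE.search(html)
    return {"title": m.group("title"), "price": float(m.group("price"))}

records = [extract(p) for p in PAGES]
```

The point is that one sample teaches the structure, and the rest of the site indexes itself.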
Another is simply dealing with the fact that while more useful and
important information is being placed on the Web, the content providers
are still doing a horrible job of making sure that content doesn't move
or outright disappear. I've seen a number of articles in recent months
about the difficulty in using web content as a reference (e.g., in a
footnote for a scholarly article) due to this very problem. Just as it's
fair use for a researcher to photocopy pages out of a book for later
reference, it seems reasonable -- and frankly more of a necessity -- for
a researcher on the web to be able to snip and save parts of pages to
refer to later. After all, the odds are good that a month or two later,
the original page will be gone.
Oh, and then there's the OP's stated goal, which sounds like perfectly
legal use to me as well.
Have a nice day; it sounds like you need the encouragement.
--
Joel.
http://www.cv6.org/
"May she also say with just pride:
I have done the State some service."