About Web Linking

Victor

Dear all,
Could anyone tell me how I can extract the contents of an outside web site
into my HTML file? How do I write it?
Thanks
 
SpaceGirl

Victor said:
Dear all,
Could anyone tell me how I can extract the contents of an outside web site
into my HTML file? How do I write it?
Thanks

Can you rephrase that, hon? It's not very clear what you want.
 
Victor

I mean: how can I extract some information from another web site (e.g. weather)
and then display it on my web site?
 
SpaceGirl

Victor said:
I mean: how can I extract some information from another web site (e.g. weather)
and then display it on my web site?

Not easily. However, many web sites now offer something called RSS: free
feeds of data in XML format, which you can include in your web site at no
cost (and format to look just the way you want). Anything else is
technically copyright theft, unless you explicitly get permission from
the owner of the web site you are leeching from. There is no simple way to
automate stealing other sites' content and wrapping it up as if it
were your own on your web site, and for pretty obvious reasons.
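
To make the RSS route concrete, here is a minimal sketch in Python (an
arbitrary choice of language; nothing in this thread depends on it). It
assumes a plain RSS 2.0 feed. The sample feed text, the example.com links
and the helper names feed_items and items_to_html are all made up for
illustration; a real feed would be downloaded from whatever URL the site
publishes, for instance with urllib.request.urlopen.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical RSS 2.0 document standing in for a downloaded feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Weather</title>
    <item>
      <title>Monday: sunny, 21 C</title>
      <link>http://example.com/monday</link>
    </item>
    <item>
      <title>Tuesday: rain &amp; wind, 15 C</title>
      <link>http://example.com/tuesday</link>
    </item>
  </channel>
</rss>"""


def feed_items(rss_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


def items_to_html(items):
    """Render the items as an HTML list, escaping text taken from the feed."""
    rows = [
        '<li><a href="{}">{}</a></li>'.format(escape(link, quote=True), escape(title))
        for title, link in items
    ]
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


print(items_to_html(feed_items(SAMPLE_FEED)))
```

The resulting list can be styled with your own CSS, which is the "format it
the way you want" part. Escaping the feed text before putting it in your
page is a precaution, since the content comes from someone else's server.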
 
Matthias Gutfeldt

SpaceGirl said:
Not easily. However, many web sites now offer something called RSS: free
feeds of data in XML format, which you can include in your web site at no
cost (and format to look just the way you want). Anything else is
technically copyright theft, unless you explicitly get permission from
the owner of the web site you are leeching from. There is no simple way to
automate stealing other sites' content and wrapping it up as if it
were your own on your web site, and for pretty obvious reasons.

Depends on how you define "simple way". There are dozens of tools that
attempt to extract website content; a long list of PHP scripts is here:
<http://www.hotscripts.com/PHP/Scripts_and_Programs/Web_Fetching/index.html>.


Anybody can grab content with a bit of Regex. Usually it's easier to
grab content from a well-structured, W3C-compliant site than from a
tag-soup site.
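
As a rough illustration of the regex approach, here is a small Python
sketch (Python rather than PHP purely as an example language). The sample
page, its class names and the extract_weather helper are invented for the
example; a real page would need a pattern written against its actual
markup, and you would need permission to reuse the content.

```python
import re

# Hypothetical markup; real pages differ and each needs its own pattern.
SAMPLE_PAGE = """
<html><body>
<h1>Forecast</h1>
<div class="weather"><span class="city">Zurich</span> <span class="temp">18</span></div>
<div class="weather"><span class="city">Bern</span> <span class="temp">16</span></div>
</body></html>
"""

# Non-greedy group for the city name, optional minus sign plus digits
# for the temperature.
WEATHER_RE = re.compile(
    r'<span class="city">(.*?)</span>\s*<span class="temp">(-?\d+)</span>',
    re.DOTALL,
)


def extract_weather(html):
    """Return a list of (city, temperature) tuples found in the page."""
    return [(city.strip(), int(temp)) for city, temp in WEATHER_RE.findall(html)]


print(extract_weather(SAMPLE_PAGE))
```

The pattern only works because the two spans always appear in the same
order with the same attributes, which is why predictable markup is so much
easier to grab from.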

The copyright issue is something to consider, of course. And RSS feeds
are easier to grab.


Matthias
 
Andy Dingley

Victor said:
Could anyone tell me how I can extract the contents of an outside web site
into my HTML file? How do I write it?

I wouldn't do this. Try using RSS instead - it's much easier, and you
can find RSS feeds offering almost any content you might wish for.

You can "screen-scrape" HTML content, but it's difficult. And when
you've done it, the page designer might re-design their page and then
all of your code stops working.
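
A small Python sketch of that failure mode, under made-up assumptions: both
page snippets, the id="temp" cell and the scrape_temperature helper are
hypothetical. The point is only that a scraper written against one layout
finds nothing once the layout changes, so it is worth at least detecting
that case rather than displaying garbage.

```python
import re

# Pattern written against the original (hypothetical) table layout.
TEMP_RE = re.compile(r'<td id="temp">(-?\d+)</td>')

OLD_PAGE = '<table><tr><td id="temp">18</td></tr></table>'
# The same data after an imagined redesign: the table became a div.
NEW_PAGE = '<div class="temperature">18</div>'


def scrape_temperature(html):
    """Return the temperature as an int, or None if the markup is not recognised."""
    match = TEMP_RE.search(html)
    return int(match.group(1)) if match else None


for name, page in (("old", OLD_PAGE), ("new", NEW_PAGE)):
    value = scrape_temperature(page)
    if value is None:
        print(name, "layout: pattern no longer matches, scraper needs updating")
    else:
        print(name, "layout:", value)
```

Returning None instead of raising is just one possible design; either way
the calling code has to decide what to show when the scrape fails.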
 
Toby A Inkster

Matthias said:
Anybody can grab content with a bit of Regex. Usually it's easier to
grab content from a well-structured, W3C-compliant site than from a
tag-soup site.

I find the main determining factor in scraping content is simply
consistency. If all the pages use a common, predictable format (e.g. they
are all generated dynamically) then it's easy to scrape -- standards don't
come into it.
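
A sketch of what that looks like in practice, using Python's standard
html.parser module. The two pages, the "headline" class and the
HeadlineParser / headlines names are invented for the example. The markup
is deliberately sloppy (uppercase tags, an unquoted attribute, unclosed
paragraphs), but because every page follows the same template, one small
extractor handles all of them.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect the text of every <h2 class="headline"> element."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_headline = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "h2" and ("class", "headline") in attrs:
            self._in_headline = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_headline = False

    def handle_data(self, data):
        if self._in_headline:
            self.headlines[-1] += data


def headlines(html):
    """Return the stripped headline texts found in one page."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return [h.strip() for h in parser.headlines]


# Hypothetical pages: invalid HTML, but generated from one template.
PAGES = [
    "<BODY><p>intro<H2 class=headline>Rain on Monday</H2><p>text",
    "<BODY><p>intro<H2 class=headline>Sun on Tuesday</H2><p>text",
]

for page in PAGES:
    print(headlines(page))
```

Nothing here checks whether the pages validate; the extractor relies only
on the headline always sitting in the same kind of element.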
 
