How to avoid frames?


rf

Philip Ronan said:
On 25/7/04 10:36 am, rf wrote:

That isn't entirely true.

A lot of servers leave out headers like "Content-length" and "Last-modified"
from SSI pages. This makes them harder to cache, slows down the delivery of
your pages, and might even affect their search ranking.

Well then said servers are broken.

However, I don't know. I am merely responding from a logical point of view:
the UA does not (and should not) know how the server builds the page.

I have never used SSI, being more inclined to use real server-side stuff
like <shudder/> php.
 

Philip Ronan

Well then said servers are broken.

Boo hoo :-(
However, I don't know. I am merely responding from a logical point of view:
the UA does not (and should not) know how the server builds the page.

I have never used SSI, being more inclined to use real server-side stuff
like <shudder/> php.

If you're using a standard PHP build, then anyone looking at your pages will
be able to figure that out too:
 

rf

Philip said:
On 25/7/04 2:14 pm, rf wrote:
If you're using a standard PHP build, then anyone looking at your pages will
be able to figure that out too:

I guess I am not using a standard build then. The PHP pages that I have out
there look exactly like hand-coded HTML pages. No indication at all of how
they were built. I suppose my host is better than most :-)
 

Toby Inkster

Cogito said:
Does it mean that the only way to develop and test my HTML code is on
the server?

Yes.

Can't I test it on my PC?

Yes, you can. Download Apache and install it on your PC.
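To serve SSI pages from that local copy of Apache, the standard mod_include
setup is a few lines in httpd.conf. A minimal sketch (the directory path is
just an example):

# Parse .shtml files for SSI directives (Apache 2.x mod_include)
AddType text/html .shtml
AddOutputFilter INCLUDES .shtml
<Directory "/var/www/htdocs">
    Options +Includes
</Directory>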
 

Toby Inkster

Philip said:
If you're using a standard build, then anyone looking at your pages will be
able to figure that out too:

But the UA won't know if you used the include() function. :)
 

Philip Ronan

But the UA won't know if you used the include() function. :)

Is that what you used on your "contact me" page?
HTTP/1.1 200 OK
Date: Sun, 25 Jul 2004 20:25:03 GMT
Server: Apache-AdvancedExtranetServer/2.0.48
X-Powered-By: PHP/4.3.4
Expires: Sun, 1 Jun 1980 12:00:00 GMT
Last-Modified: Sun, 25 Jul 2004 20:25:03 GMT
Cache-Control: no-store, no-cache, must-revalidate
Cache-Control: post-check=0, pre-check=0
Pragma: no-cache
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8

Oooh, version 4.3.4!
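Headers like these are easy to check for yourself. A minimal PHP sketch
that sends a HEAD request over a raw socket and prints whatever comes back
(the host is a placeholder; this works even on PHP 4):

<?php
// Open a plain socket to the web server and issue a HEAD request.
$fp = fsockopen('www.example.com', 80, $errno, $errstr, 10);
if ($fp) {
    fwrite($fp, "HEAD / HTTP/1.1\r\n" .
                "Host: www.example.com\r\n" .
                "Connection: close\r\n\r\n");
    // Echo the status line and headers the server sends back.
    while (!feof($fp)) {
        echo fgets($fp, 1024);
    }
    fclose($fp);
}
?>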
 

Toby Inkster

Philip said:
Is that what you used on your "contact me" page?

Might have, might not have. You really can't tell from what gets sent to
the client. Looking at the headers, there's a good chance that PHP was
used to build the page, but you can't tell what it was used for.

I may have used it to include() another file. Or I may have used it to do
something like this:
Last modified: <?= date('r',filemtime($_SERVER['SCRIPT_FILENAME'])); ?>
(Which for those of you who don't speak PHP will tell you the date and
time that the file was last modified in RFC 822 format.)

As it happens, the real answer is that there is no "contact me" page. The
URL in my sig bears no relation to anything on my file system -- it's
rewritten by Apache to a script called "index.php" that has very little
content of its own, but include()s about 12 other PHP files, which in turn
pull in excessive amounts of data from a PostgreSQL database (making
about 50 SQL queries total!).
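(For anyone curious, a sketch of that sort of rewrite -- an .htaccess-style
example, not Toby's actual configuration:)

RewriteEngine On
# Map a friendly URL like /contact onto a single front controller script.
RewriteRule ^contact/?$ index.php?page=contact [L,QSA]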
 

Philip Ronan

Might have, might not have. You really can't tell from what gets sent to
the client. Looking at the headers, there's a good chance that PHP was
used to build the page, but you can't tell what it was used for.

I was responding to rf's assertion that the use of server-side
scripting/inclusion to generate dynamic web pages is completely transparent
and undetectable to robots or anyone else. The presence of a header saying
"X-Powered-By: PHP/4.3.4" is a dead giveaway.

But that's only part of it. Content generated dynamically by servers also
tends to lack other headers like "ETag" and "Content-Length". Web developers
like yourself (yes, I checked your home page) also forget to build in
responses to "HEAD" requests, conditional fetches, and so on. This causes
various problems:

* The dynamic pages are harder to cache, because it is impossible
to verify their "freshness". This slows down their delivery to
clients.

* The pages are unfriendly to search engines because they do not
support conditional fetches or HEAD requests.

* The pages are incompatible with persistent HTTP connections and
therefore take longer to load.

You can correct most of these problems with careful coding, but it does
involve quite a bit of effort. And you will still find it very difficult to
produce a page that looks and behaves in *exactly* the same way as an
ordinary HTML file.
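To illustrate the persistent-connection point: buffering the output lets
PHP send a Content-Length header, so the server doesn't have to close the
connection (or fall back to chunked encoding) to mark the end of the
response. A sketch, not taken from any particular site:

<?php
// Buffer the whole page so its size is known before anything is sent.
ob_start();

// ... generate the page here ...
echo '<html><body>Hello, world</body></html>';

// Announce the exact body size, then flush the buffer to the client.
header('Content-Length: ' . ob_get_length());
ob_end_flush();
?>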
I may have used it to include() another file. Or I may have used it to do
something like this:
Last modified: <?= date('r',filemtime($_SERVER['SCRIPT_FILENAME'])); ?>
(Which for those of you who don't speak PHP will tell you the date and
time that the file was last modified in RFC 822 format.)

Adding a header with an invalid date stamp won't help :)

Hint: Don't use date() -- use gmdate() with " GMT" tagged on the end. And
"getlastmod()" is tidier than "filemtime($_SERVER['SCRIPT_FILENAME'])".
 

Luigi Donatello Asero

Philip Ronan said:
I was responding to rf's assertion that the use of server-side
scripting/inclusion to generate dynamic web pages is completely transparent
and undetectable to robots or anyone else. The presence of a header saying
"X-Powered-By: PHP/4.3.4" is a dead giveaway.

[snip]

So, if I understand you properly, you would not recommend the use of dynamic
pages or SSI, at least as far as the accessibility of the page to robots is
concerned, and thus the possibility of getting a higher ranking in some
search engines.
 

Philip Ronan

So, if I understand you properly, you would not recommend the use of dynamic
pages or SSI, at least as far as the accessibility of the page to robots is
concerned, and thus the possibility of getting a higher ranking in some
search engines.

Yes, that's correct.

Of course dynamic pages are essential in some cases, but for your pages to
work successfully you need to handle HEAD and "If-Modified-Since" requests
properly, and you should make sure that your dynamic pages include header
information like "Content-Length", "Last-Modified" and "ETag".

Phil
 

SpaceGirl

Philip said:
I was responding to rf's assertion that the use of server-side
scripting/inclusion to generate dynamic web pages is completely transparent
and undetectable to robots or anyone else. The presence of a header saying
"X-Powered-By: PHP/4.3.4" is a dead giveaway.

The headers from my web site don't say that.
But that's only part of it. Content generated dynamically by servers also
tends to lack other headers like "ETag" and "Content-Length". Web developers
like yourself (yes, I checked your home page) also forget to build in
responses to "HEAD" requests, conditional fetches, and so on. This causes
various problems:

* The dynamic pages are harder to cache, because it is impossible
to verify their "freshness". This slows down their delivery to
clients.

One of the biggest problems with generated pages is that often they get
cached when you don't want them to. For example, a page that pulls back
data from a database. You'd expect it to always be the latest data, but
you sometimes get cached copies even when you send no-cache headers (not
meta tags, they never work). It can be quite a headache to work around.
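For reference, the usual belt-and-braces no-cache set in PHP looks like
this (a sketch; it's the same combination PHP's session handler emitted in
the headers quoted earlier in the thread):

<?php
header('Expires: Sun, 1 Jun 1980 12:00:00 GMT');              // date in the past
header('Cache-Control: no-store, no-cache, must-revalidate'); // HTTP/1.1 caches
header('Pragma: no-cache');                                   // HTTP/1.0 caches
?>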

Dynamic pages send out the same headers as a regular page - there is
utterly no difference between a generated page and a regular page as far
as a browser is concerned.

* The pages are unfriendly to search engines because they do not
support conditional fetches or HEAD requests.

To a search engine, an SSI'd page looks just the same as a normal page.
The search engine cannot tell the difference, and there is no difference
in headers.

If a search engine hits my (generated) index page, it'll see the page
has just been updated (because it is regenerated in response to the search
engine's request for the page). How you manage headers (using your server
to write headers) is up to you, giving you control over whether to
report the page as "updated", and to add any other headers. By default,
there are no issues at all.

* The pages are incompatible with persistent HTTP connections and
therefore take longer to load.

Makes no difference - no *noticeable* difference, anyway. And it's still
vastly faster than framed content.



--


x theSpaceGirl (miranda)

# lead designer @ http://www.dhnewmedia.com #
# remove NO SPAM to email, or use form on website #
 

Philip Ronan

The headers from my web site dont say that.

So you're not using PHP. So what?
One of the biggest problems with generated pages is that often they get
cached when you don't want them to. For example, a page that pulls back
data from a database. You'd expect it to always be the latest data, but
you sometimes get cached copies even when you send no-cache headers (not
meta tags, they never work). It can be quite a headache to work around.

Is that why you decided to add a "Cache-control: private" header to your
home page? Which bit of that came from a database?

As a rule, cacheable pages are friendlier to search engines. Yours are not.
To a search engine, an SSI'd page looks just the same as a normal page.
The search engine cannot tell the difference, and there is no difference
in headers.

*yawn*

Compare these:

(a) http://www.dhnewmedia.com/
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Mon, 26 Jul 2004 09:56:11 GMT
X-Powered-By: ASP.NET
Content-Length: 5460
Content-Type: text/html
Set-Cookie: ASPSESSIONIDSCASDRTA=FHLOEJLDLMEAFBHJFOCGPDLN; path=/
Cache-control: private

(b) http://www.dhnewmedia.com/blank.htm:
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
X-Powered-By: ASP.NET
Date: Mon, 26 Jul 2004 09:57:16 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Wed, 07 Jan 2004 13:23:05 GMT
ETag: "2e4f196221d5c31:9de"
Content-Length: 527

Your dynamic page (a) has no "Last-Modified" or "ETag" headers. This makes
it impossible for search engines to check for changes to your web page. So
instead of using "HEAD" requests or "If-Modified-Since" conditional fetches,
the only way a search engine can check this page is by reloading the whole
page and comparing it with a stored copy. That takes much longer. This page
is not "search engine friendly".

Your static page (b) does contain these headers, and it does support
conditional fetches. That's because you haven't meddled with it and the
server is able to deliver it in the usual way. Search engines and internet
caches can validate this page with a simple HEAD request or a conditional
fetch. This means it is friendlier to search engines and caches.
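On the wire, a conditional fetch of page (b) would look something like this
(abbreviated sketch, reusing the Last-Modified value above):

GET /blank.htm HTTP/1.1
Host: www.dhnewmedia.com
If-Modified-Since: Wed, 07 Jan 2004 13:23:05 GMT

HTTP/1.1 304 Not Modified

No body is sent at all -- the cache just keeps using its stored copy.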

Phil
 

Steve Pugh

SpaceGirl said:
The headers from my web site don't say that.

Which one of your web sites? Both http://www.dhnewmedia.com/ and
http://www.subhuman.net/ return "X-Powered-By: ASP.NET" amongst the
headers.
One of the biggest problems with generated pages is that often they get
cached when you don't want them to. For example, a page that pulls back
data from a database. You'd expect it to always be the latest data, but
you sometimes get cached copies even when you send no-cache headers (not
meta tags, they never work). It can be quite a headache to work around.

Frequently authors go too far in the opposite direction and prevent
the page being cached at all.

In your case (cache-control: private) you've gone for allowing
browsers to cache the page but not allowing proxies to cache the page.
But as you've provided no mechanism to communicate the freshness of
the page the browser has no way of telling whether the page in its
cache is up to date or not - so it has to fetch the page from the
server every time which wastes time and resources.

How many sites have content that changes so fast that two visits, five
minutes apart, always need to be served different versions? Sports
results, stock tickers, a few others.

Configuring the server to validate Last-Modified requests
(which it does automatically for static pages) saves a lot of repeated
fetching. But many people don't configure the server to do this -
neither of your sites that I checked does this.
Dynamic pages send out the same headers as a regular page - there is
utterly no difference between a generated page and a regular page as far
as a browser is concerned.

Often they don't send out the same headers. Your pages don't send out
Expires, Last-Modified or ETag headers.

This causes user agents to treat them as stale. So browsers will
always need to re-request the page from the server, even if they only
visited it five minutes ago, and search engines have no reason to
reindex the page regularly.
To a search engine, an SSI'd page looks just the same as a normal page.
The search engine cannot tell the difference, and there is no difference
in headers.

Not the case. As the headers are different the page is different.

By default PHP, ASP, etc. do not generate all the same headers that
the web server generates for static pages. It takes work on the
author's part to bring dynamic pages up to the same level of
user-friendliness as static pages, and many authors either don't bother
or, more likely, are ignorant of the issues involved.
If a search engine hits my (generated) index page, it'll see the page
has just been updated (because it is regenerated in response to the search
engine's request for the page).

How will it see that?
In neither of your sites that I checked are you supplying a
Last-Modified header.
How you manage headers (using your server
to write headers) is up to you, giving you control over whether to
report the page as "updated", and to add any other headers. By default,
there are no issues at all.

By default servers add all the correct headers to static pages but
they don't add them to dynamic pages.

Compare the headers for http://www.sfsfw.net/index.php where I've
added code to generate the correct headers and
http://www.sfsfw.net/a/index.php where I haven't yet modified the page
and you'll see quite a few missing headers.
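For instance, a sketch of the kind of header-generating code meant here
(not the actual code on sfsfw.net): derive an ETag from the page content so
caches can revalidate with If-None-Match.

<?php
// Buffer the page so we can fingerprint and measure it before sending.
ob_start();
// ... build the page ...
echo '<html><body>Example page</body></html>';

$etag = '"' . md5(ob_get_contents()) . '"';
header('ETag: ' . $etag);

if (isset($_SERVER['HTTP_IF_NONE_MATCH']) &&
    trim($_SERVER['HTTP_IF_NONE_MATCH']) == $etag) {
    // Client already has this exact version: send headers only.
    header('HTTP/1.1 304 Not Modified');
    ob_end_clean();
} else {
    header('Content-Length: ' . ob_get_length());
    ob_end_flush();
}
?>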

I'm sure you'll see the same on your own servers if you compare the
headers for a page generated with ASP and a static one.

Useful tools:
http://www.web-caching.com/cgi-web-caching/cacheability.py
http://www.delorie.com/web/headers.html

cheers,
Steve
 

SpaceGirl

Steve said:
Which one of your web sites? Both http://www.dhnewmedia.com/ and
http://www.subhuman.net/ return "X-Powered-By: ASP.NET" amongst the
headers.

Hehheh well, those would both be bad examples, but okay.
Frequently authors go too far in the opposite direction and prevent
the page being cached at all.

I know... :/
In your case (cache-control: private) you've gone for allowing
browsers to cache the page but not allowing proxies to cache the page.
But as you've provided no mechanism to communicate the freshness of
the page the browser has no way of telling whether the page in its
cache is up to date or not - so it has to fetch the page from the
server every time which wastes time and resources.

Hmm well, in the case of the two sites you picked, yes. But for the client
I'm working with now, it would be a NIGHTMARE if pages got cached... the
site is 100% data driven, and complex at that. Lots of complicated
forms. I'd quite happily lose bandwidth in exchange for getting the
right data.

How many sites have content that changes so fast that two visits, five
minutes apart, always need to be served different versions? Sports
results, stock tickers, a few others.

How about seconds apart?
Configuring the server to validate Last-Modified requests
(which it does automatically for static pages) saves a lot of repeated
fetching. But many people don't configure the server to do this -
neither of your sites that I checked does this.

I wasn't holding them up as an example. They are both fairly raw 'play'
sites.
Often they don't send out the same headers. Your pages don't send out
Expires, Last-Modified or ETag headers.

This causes user agents to treat them as stale. So browsers will
always need to re-request the page from the server, even if they only
visited it five minutes ago, and search engines have no reason to
reindex the page regularly.



Not the case. As the headers are different the page is different.

By default PHP, ASP, etc. do not generate all the same headers that
the web server generates for static pages. It takes work on the
author's part to bring dynamic pages up to the same level of
user-friendliness as static pages, and many authors either don't bother
or, more likely, are ignorant of the issues involved.

I will investigate further! This last week has been a learning
experience - I had a real problem trying to get Apache NOT to let
pages cache... now I seem to have the opposite problem on IIS lol!
How will it see that?
In neither of your sites that I checked are you supplying a
Last-Modified header.




By default servers add all the correct headers to static pages but
they don't add them to dynamic pages.

Compare the headers for http://www.sfsfw.net/index.php where I've
added code to generate the correct headers and
http://www.sfsfw.net/a/index.php where I haven't yet modified the page
and you'll see quite a few missing headers.

I'm sure you'll see the same on your own servers if you compare the
headers for a page generated with ASP and a static one.

Useful tools:
http://www.web-caching.com/cgi-web-caching/cacheability.py

GREAT page btw... very useful.

I will have a play :) And for now I should shut up on the matter cuz I'm
out of my depth (obviously).

--


x theSpaceGirl (miranda)

# lead designer @ http://www.dhnewmedia.com #
# remove NO SPAM to email, or use form on website #
 

SpaceGirl

Philip said:
[snip]


Yep I take back what I said after reading Steve's reply too! I'm reading
up on this now. Got a demo of subhuman.net 10 almost ready so I'll add
some header munging to it and see how it works, and learn as I go. For
now, I'll shut up!

Sorry! :/

--


x theSpaceGirl (miranda)

# lead designer @ http://www.dhnewmedia.com #
# remove NO SPAM to email, or use form on website #
 

Luigi Donatello Asero

Philip Ronan said:
Yes, that's correct.

Of course dynamic pages are essential in some cases, but for your pages to
work successfully you need to handle HEAD and "If-Modified-Since" requests
properly, and you should make sure that your dynamic pages include header
information like "Content-Length", "Last-Modified" and "ETag".

Phil

Do I need to add "If-Modified-Since" and "Last-Modified" even in the
headers of static pages?
 

SpaceGirl

Steve said:
Which one of your web sites? Both http://www.dhnewmedia.com/ and
http://www.subhuman.net/ return "X-Powered-By: ASP.NET" amongst the
headers.
[snip]

I had a play! http://www.subhuman.net/test.asp :)

--


x theSpaceGirl (miranda)

# lead designer @ http://www.dhnewmedia.com #
# remove NO SPAM to email, or use form on website #
 
