HTML source in Google


Oli Filth

Daniel said:
Hi all.

I have made a web page (www.brettshaping.com). When Google lists some of the
sub pages, the HTML source appears instead of a preview of the text in my
page. Click on the link below to see what I mean (scroll to the bottom of
the page):
http://www.google.no/search?q=+site:www.brettshaping.com+seilbrett&hl=no&lr=&start=0&sa=N&filter=0

Why does this happen? Is there some error in my source that causes this?
What can I do about it?

If you look at the pages that Google have cached (e.g.
http://216.239.59.104/search?q=cach...tm++site:www.brettshaping.com+seilbrett&hl=no),
you'll see that they're shown as things like:

< H T M L >
< H E A D >
< T I T L E > S h a p i n g ... etc. etc. etc.

i.e. a whitespace after every character.

Your page is encoded as UTF-16, i.e. two 8-bit bytes per character for the
characters you're using. As you're only using characters from the ASCII set,
every second byte is zero, and what you're seeing would tally with Google
interpreting your page as UTF-8 or ISO-8859-1, with each zero byte shown as
a space (I think. Someone may want to correct me on this! ;) ).
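
A minimal Python sketch of that effect, assuming the page was stored as little-endian UTF-16 and then read as ISO-8859-1. The function name `misread_utf16_as_latin1` is made up for illustration, and replacing the zero bytes with spaces is an assumption about how the cached copy ends up being displayed:

```python
def misread_utf16_as_latin1(text: str) -> str:
    """Encode text as UTF-16 LE, then decode the same bytes as ISO-8859-1."""
    raw = text.encode("utf-16-le")      # two bytes per ASCII character
    misread = raw.decode("iso-8859-1")  # every byte becomes one character
    # Assumption: the zero bytes end up displayed as spaces.
    return misread.replace("\x00", " ")


if __name__ == "__main__":
    print(misread_utf16_as_latin1("<HTML>"))
```

Each ASCII character comes back followed by one extra character, which would match the spaced-out tags in the cached copy.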

Is it possible that at some point your server was not configured to
output HTML with the correct charset? If Google cached the page at this
point, it may have defaulted to an 8-bit charset, hence the result
you're seeing.
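
One way to check what the server declares is to look at the charset parameter of the Content-Type response header. A rough sketch using only the Python standard library; the helper name `declared_charset` and the sample header values are hypothetical, not taken from the actual server:

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns None if absent.
    return msg.get_content_charset()


if __name__ == "__main__":
    print(declared_charset("text/html; charset=UTF-16"))
    print(declared_charset("text/html"))
```

If no charset is declared, a client has to guess, which is where an 8-bit default could come in.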
 

Dave Patton

Hi all.

I have made a web page (www.brettshaping.com). When Google lists some
of the sub pages, the HTML source appears instead of a preview of the
text in my page. Click on the link below to see what I mean (scroll to
the bottom of the page):
http://www.google.no/search?q=+site:www.brettshaping.com+seilbrett&hl=no&lr=&start=0&sa=N&filter=0

Why does this happen? Is there some error in my source that causes
this? What can I do about it?

Maybe either a glitch at Google's end, or maybe there
was a problem with your server. Look at Google's
cached version of your page(s) with the problem.
It tells you when Google retrieved the page.
Maybe you can then use your server logs to figure
out if it was a problem at your end.
 

Daniel Hjerholm

Oli Filth said:
If you look at the pages that Google have cached (e.g.
http://216.239.59.104/search?q=cache:rAIRSd86w3QJ:www.brettshaping.com/topplaminat.htm++site:www.brettshaping.com+seilbrett&hl=no),
you'll see that they're shown as things like:

< H T M L >
< H E A D >
< T I T L E > S h a p i n g ... etc. etc. etc.

i.e. a whitespace after every character.

Your page is encoded as UTF-16, i.e. two 8-bit bytes per symbol. As
you're only using characters from the ASCII set, what you're seeing
would tally with Google interpreting your page as UTF-8 or ISO-8859-1 (I
think. Someone may want to correct me on this! ;) ).

Is it possible that at some point your server was not configured to
output HTML with the correct charset? If Google cached the page at this
point, it may have defaulted to an 8-bit charset, hence the result
you're seeing.

Hi!

You were right! Notepad automatically saved the pages with Unicode encoding
because they contained some special characters. I have now saved them with
ANSI encoding. The special characters are still there, so I don't know why
Notepad chose Unicode in the first place.
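
For anyone who would rather script the re-save than do it by hand, here is a minimal sketch. It assumes that Notepad's "Unicode" means UTF-16 with a byte order mark and that "ANSI" means Windows-1252 (`cp1252`), which depends on the Windows locale; the function name `utf16_to_ansi` is invented for this example:

```python
import codecs


def utf16_to_ansi(data: bytes, target: str = "cp1252") -> bytes:
    """Re-encode UTF-16 bytes (with BOM) to an 8-bit code page.

    Data without a UTF-16 byte order mark is returned unchanged.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = data.decode("utf-16")  # the codec reads and drops the BOM
        return text.encode(target)    # assumption: target is the "ANSI" page
    return data


if __name__ == "__main__":
    sample = codecs.BOM_UTF16_LE + "blå".encode("utf-16-le")
    print(utf16_to_ansi(sample))
```

The `encode` call raises `UnicodeEncodeError` if the page contains a character that the target code page cannot represent, so nothing gets silently replaced.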

Anyway, thanks for your quick reply.

Daniel
 
