Mark up compound noun so that search engines see two words

Greg N.

I need to mark up a compound noun such that it looks like one word on
screen, but I want search bots to see two separate words.

Take, for example, the word "Kindergarten". I want a search for the word
"kinder" or the word "garten" to hit upon my page.

I was thinking of inserting a &nbsp; in a tiny font size between "kinder"
and "garten". Are there other, maybe more elegant ways?

My web sites are written in German. In this language, as you might
know, the problem described above is ubiquitous.
 
BootNic

Greg N. said:
I need to mark up a compound noun such that it looks like one word
on screen, but I want search bots to see two separate words.

Take, for example, the word "Kindergarten". I want a search for the
word "kinder" or the word "garten" to hit upon my page.

I was thinking to insert a &nbsp; in a tiny font size between
"kinder" and "garten". Are there other, maybe more elegant ways?

My web sites are written in German. In this language, as you might
know, the problem described above is ubiquitous.

Hide the space with css, no clue what a search bot will do with it.

kinder<span style="display:none;"> </span>garten

--
BootNic Saturday, February 11, 2006 11:38 AM

Imagination was given to man to compensate him for what he isn't. A
sense of humor was provided to console him for what he is.
*Horace Walpole English novelist*
 
Jonathan N. Little

BootNic said:
Hide the space with css, no clue what a search bot will do with it.

kinder<span style="display:none;"> </span>garten

Would this not be simpler?

<p>... kinder<span>garten</span> ... </p>
 
BootNic

Jonathan N. Little said:
Would this not be simpler?

<p>... kinder<span>garten</span> ... </p>

I give up.

Is it the same thing?

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
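<script type="text/javascript">
// Show the text of the page as a script sees it.
function blip() {
  // textContent is the DOM standard; innerText is the fallback for browsers without it.
  var tip = document.body.textContent ? document.body.textContent
                                      : document.body.innerText;
  alert(tip);
}
window.onload = blip;
</script>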
<title></title>
</head>
<body>
<p>kinder<span style="display:none;"> </span>garten</p>
<p>kinder<span>garten</span></p>
</body>
</html>
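
The question the test page asks can also be approximated outside a browser. The following is only a rough sketch: `extractText` is a hypothetical helper that strips tags with a crude regular expression and ignores CSS entirely, which is an assumption about how a simple text extractor might behave, not a description of any real search bot.

```javascript
// Hypothetical helper: drop every tag but keep all text nodes,
// including the ones a style sheet would hide on screen.
function extractText(html) {
  return html.replace(/<[^>]*>/g, "");
}

const hiddenSpace = 'kinder<span style="display:none;"> </span>garten';
const plainSpan = "kinder<span>garten</span>";

console.log(JSON.stringify(extractText(hiddenSpace))); // "kinder garten"
console.log(JSON.stringify(extractText(plainSpan)));   // "kindergarten"
```

Under that assumption the two variants are not the same thing: only the first leaves a space between the components. A style-aware extractor would presumably see "kindergarten" in both cases.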

--
BootNic Saturday, February 11, 2006 12:36 PM

One must learn by doing the thing, for though you think you know it,
you have no certainty until you try.
*Aristotle*
 
Jukka K. Korpela

BootNic said:
Hide the space with css, no clue what a search bot will do with it.

kinder<span style="display:none;"> </span>garten

Did you forget that CSS is for optional presentational suggestions?
 
Jukka K. Korpela

Greg N. said:
I need to mark up a compound noun such that it looks like one word
on screen, but I want search bots to see two separate words.

Won't work, but you might cause some damage in trying to achieve that.

Greg N. said:
Take, for example, the word "Kindergarten". I want a search for the
word "kinder" or the word "garten" to hit upon my page.

I don't think that's any of your real examples.

Greg N. said:
I was thinking to insert a &nbsp; in a tiny font size between
"kinder" and "garten".

And what happens when font size is forced by the user (or by the user
agent)? How will speech browsers read it?

Greg N. said:
My web sites are written in German. In this language, as you might
know, the problem described above is ubiquitous.

Surely there are situations where we would like to make components of a
compound word key words in searching. In that case,
a) use <meta name="keywords" content="kinder,garten">, which will
   not help much but won't cause damage either
b) formulate the texts so that the relevant words appear as separate
   words, too; this should be possible if they can be significant
   content words in discussing the topic of the page - and if they
   aren't, should the page really match them in searches?
c) hope that search engines will improve; did you actually check
   what Google finds when you use the search words "kinder garten"?
   (You'll be surprised. But unfortunately, it's still rather
   limited functionality.)
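
Option b) can be illustrated with a toy whole-word matcher. This is a sketch under loud assumptions: `tokenize` and `pageMatches` are made-up names, the sample sentences are invented, and real search engines may well do more (decompounding, stemming) than this.

```javascript
// Toy whole-word matcher, for illustration only.
function tokenize(text) {
  // Lowercase, then split on anything that is not a Unicode letter.
  return text.toLowerCase().split(/[^\p{L}]+/u).filter(Boolean);
}

function pageMatches(text, word) {
  return tokenize(text).includes(word.toLowerCase());
}

// Invented sample texts: one uses only the compound, one also uses its parts.
const compoundOnly = "Unser Kindergarten ist geöffnet.";
const withComponents = "Unser Kindergarten: ein Garten für Kinder.";

console.log(pageMatches(compoundOnly, "garten"));   // false
console.log(pageMatches(withComponents, "garten")); // true
console.log(pageMatches(withComponents, "kinder")); // true
```

With a matcher like this, only the text that uses the component words in their own right is found, and no markup trick is needed.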
 
Greg N.

Jukka said:
I don't think that's any of your real examples.

What are you trying to imply here? Of course it's not a real example, I
was just trying to use a compound word that works well as an example in
German and English.

Jukka said:
And what happens when font size is forced by the user (or by the user
agent)? How will speech browsers read it?

You tell me. You're the smart one around here.

Jukka said:
In that case,
a) use .... which will not help much...

Oh thanks.

Jukka said:
b) formulate the texts so that the relevant words appear as separate
words, too; this should be possible if they can be significant
content words in discussing the topic of the page - and if they
aren't, should the page really match them in searches?

Again, you seem to imply I'm trying something unethical here, maybe
keyword spamming? If I were as competent as you think you are, I'd only
imply that after spotting some evidence.

Jukka said:
c) ... did you actually check what Google finds when you use the search words
kinder garten? (You'll be surprised....

What do you think it finds? Not Kindergarten, anyways, unless that site
contains the words kinder and garten separately as well.
 
Greg N.

Jukka said:
Did you forget that CSS is for optional presentational suggestions?

Well, is there really a problem? Whether a screen reader says "kinder
garten" or "kindergarten" probably makes little or no audible
difference. And if the CSS is not properly rendered by some browser,
the user will see an extra space, which will appear as a minor spelling
quirk but not affect the meaning of the text.
 
Neredbojias

With neither quill nor qualm, Greg N. quothed:
I need to mark up a compound noun such that it looks like one word on
screen, but I want search bots to see two separate words.

Take, for example the word "Kindergarten". I want a search for the word
"kinder" or the word "garten" to hit upon my page.

I was thinking to insert a &nbsp; in a tiny font size between "kinder"
and "garten". Are there other, maybe more elegant ways?

My web sites are written in German. In this language, as you might
know, the problem described above is ubiquitous.

This seems to work well at several different sizes and fonts:

tel <span style="margin-left:-.25em;">star</span>

Of course, I didn't check _all_ the sizes and fonts nor letter
combinations nor how it breaks etc., etc.
 
Toby Inkster

Greg said:
I was thinking to insert a &nbsp; in a tiny font size between "kinder"
and "garten". Are there other, maybe more elegant ways?

kinder<small style="display:none">-</small>garten

kinder&#xFEFF;garten
 
Toby Inkster

Greg said:
What are you trying to imply here? Of course it's not a real example, I
was just trying to use a compound word that works well as an example in
German and English.

Although "kindergarten" is used in English, "kinder" and "garten" alone
aren't. Well, "kinder" (more kind) is spelt the same as "Kinder" (German
for children), but it's not really the same word.
 
chromatic_aberration

Jonathan said:
Would this not be simpler?

<p>... kinder<span>garten</span> ... </p>

or maybe:

<p> ... <span title="kinder, garten">kindergarten</span> ... </p>

Don't bots index the "title" attributes? You'll get a tooltip though...
 
Jukka K. Korpela

Greg N. said:
Well, is there really a problem?

Yes, the markup would indeed create a problem (probably without solving
any problem).

Greg N. said:
Whether a screen reader says
"kinder garten" or "kindergarten" makes probably little or no
audible difference.

Did you actually try it?

Greg N. said:
And if the CSS is not properly rendered by
some browser, the user will see an extra space which will appear as
a minor spelling quirk but not affect the meaning of the text.

If it does not matter, why don't you just misspell the word as
"kinder garten" and forget the complexities? Then you, as an author,
would not so easily fail to see the problem you have created.
 
Jukka K. Korpela

Greg N. said:
What are you trying to imply here?

That you did not present your real problem.

Greg N. said:
You tell me. You're the smart one around here.

I find it mildly disgusting that you make nasty notes about a named
person and hide yourself under an incomplete name.

Greg N. said:
again, you seem to imply I'm trying something unethical here, maybe
keyword spamming?

I am implying that you did not present your real problem, and you seem
to prove it by not explaining it, by not giving any URL or other real
examples that would illustrate it, and by trying to insult.

Greg N. said:
If I were as competent as you think you are, I'd
only imply that after spotting some evidence.

Is this your usual way of talking to people who are capable of helping
and were even willing to do that for free?
 
Jukka K. Korpela

Toby Inkster said:
kinder<small style="display:none">-</small>garten

That's not elegant at all, because the hyphen would appear in small
font when CSS is disabled but <small> markup is honored. Besides, the
way Google handles hyphenated compounds is somewhat mysterious.

Using kinder&shy;garten might be reasonable _if_ you think that it is
acceptable to spell the word as "kinder-garten", too.

Toby Inkster said:
kinder&#xFEFF;garten

The U+FEFF character is a space character. Though it has nominally no
width, it may be expanded in justification, and it also constitutes an
allowed line break point - a direct line break point, not a hyphenation
point. Would you like to have the word split as "kinder" (without a
hyphen) at the end of a line?
 
Greg N.

Jukka said:
That you did not present your real problem.

I think I did.

Jukka said:
I find it mildly disgusting that you make nasty notes about a named
person and hide yourself under an incomplete name.

Come on, there is more information about me on the web than about most
other netcitizens. I'm not hiding at all, and you _know_ that.

Jukka said:
I am implying that you did not present your real problem, and you seem
to prove it by not explaining it, by not giving any URL or other real
examples that would illustrate it, and by trying to insult.

I thought I described my problem very accurately, so that no URL was
required. When I opened the thread, there was just text, no HTML or
CSS (yet) pertaining to the problem.

All I have is a compound noun in German, and I want Google to treat both
parts of it separately. Do you really need a URL for such a simple
question?

But if you insist, OK. It is the site in my sig below. And it is not
the word "Kindergarten" but, for instance, "Motorradreisen". By now, I
have changed it in ways suggested here by others.

Jukka said:
Is this your usual way of talking to people who are capable of helping
and were even willing to do that for free?

Well, it's not always apparent to me that you're going out of your way
trying to help.

Many of your posts are of the "you're asking the wrong question", or
"you're not explaining all of it", or "what if a speech browser sees
this" kind. OK, sometimes this helps, thank you.

But sometimes, I get the impression you know a straightforward answer,
and you're hinting but not telling, just to rub our noses in the fact
that you know better.

You may say, this is all insulting and rude, and you may plonk me. Or
you may think about it.
 
dorayme

Jukka K. Korpela said:
That you did not present your real problem.


I find it mildly disgusting that you make nasty notes about a named
person and hide yourself under an incomplete name.

This is silly indeed. In general, if you had learnt from my
previous advice to you, you would see that there are far more
serious types of "personal" deceptions than the use of noms de
plume in email addresses for newsgroups. But in particular, this
is quite outlandishly bad: this GregN, for God's sake, has
published pictures of himself on motorbikes. There's not much
that is incomplete about this. How about a picture of you on a
motorbike for us to see? Or a camel?

Jukka K. Korpela said:
I am implying that you did not present your real problem, and you seem
to prove it by not explaining it, by not giving any URL or other real
examples that would illustrate it, and by trying to insult.


Is this your usual way of talking to people who are capable of helping
and were even willing to do that for free?

Once again, you have obviously not learnt from my previous
lessons to you about this sentiment. In case I never put it this
way: getting advice from you is rarely free; it so often comes at
the price of humiliation.
 
Toby Inkster

Jukka said:
The U+FEFF character is a space character. Though it has nominally no
width, it may be expanded in justification, and it also constitutes an
allowed line break point - a direct line break point, not a hyphenation
point. Would you like to have the word split as "kinder" (without a
hyphen) at the end of a line?

U+FEFF is the zero-width NON-BREAKING space.
 
Jukka K. Korpela

Greg N. said:
I think I did.

You can keep thinking that way, but it won't help you.

Greg N. said:
Come on, there is more information about me on the web than about
most other netcitizens. I'm not hiding at all, and you _know_
that.

I know that you are using a protocol-incorrect From field; "Greg N."
cannot be your full name. Whether you indirectly reveal your identity
is immaterial.

Greg N. said:
I thought I described my problem very accurately, so that no URL
was required.

People generally ask for help when they expect that someone else knows
the issue at hand better. Then it is reasonable to assume that if
others ask for further information, they _know_ that more information
is needed.

Greg N. said:
All I have is a compound noun in German that I want Google to treat
both parts of separately. Do you really need an URL for such a
simple question?

No, but you would have needed to post it. Whether you believe it or
not, I know the topic better than you, and I know that the problem
cannot be solved in such an abstract form and in isolation.

Greg N. said:
But if you insist, ok. It is the site in my sig below.

Sigs are for fun and for "nice to know" stuff. Don't rely on sigs in
any way. For all that you can know, I may have configured my newsreader
to suppress sigs, since they are mostly just boring or worse. And
mentioning a site isn't sufficient; you haven't told _how_ the issue
would be relevant there.

Greg N. said:
And it is not the word "Kindergarten" but, for instance,
"Motorradreisen".

"For instance"? Was there _any_ other word you worried about?
(And you probably worried about a split to main components only, not to
"Motor", "Rad", and "Reisen". So it _was_ rather specific.)

Greg N. said:
By now, I have changed it in ways suggested here by others.

That's your prerogative. You can mistype words in a misguided attempt
at boosting your page in Google. After all, Usenet is an almost
infinite supply of wrong advice, especially to those who fail to
formulate their questions reasonably even after continued attempts at
helping them.

Now your main heading says you don't care about elementary rules of
orthography of the language you are writing in. Believe it or not, the
trick _is_ visible, even on graphic browsers.

Greg N. said:
You may say, this is all insulting and rude,

No, this time you just exercised pointless babbling.

Greg N. said:
and you may plonk me.

That's irrelevant to you, actually; you seem to be destined to use only
the advice that looks pleasant to you.
 
Jukka K. Korpela

Toby Inkster said:
U+FEFF is the zero-width NON-BREAKING space.

You're right; I should try to remember that I don't remember all
Unicode characters yet. (And I really _should_ remember correctly what
U+FEFF is. [Slaps himself.])

The defined meaning of U+FEFF is that it is a) a byte order mark (BOM),
b) an invisible control character for preventing a line break, and in
the latter role, U+2060 WORD JOINER is preferred. This means, in
effect, that by Unicode recommendations, U+FEFF should only be used at
the start of a text file as BOM.

This is somewhat theoretical, of course, since U+2060 is poorly supported.
Besides, HTML specifications do not require that Unicode semantics be
obeyed; on the other hand, this means that the effect of U+FEFF in an
HTML document is _undefined_.

What you are really saying by using kinder&#xFEFF;garten is that the
word "kindergarten" should not be divided into its components in word
division. This has little effect at present, since browsers don't do
word division.

So in that sense, it might be a harmless trick in an attempt to make
indexing robots treat the construct as two words. However, we have no
guarantee that this actually happens (after all, search engines _could_
be Unicode-aware and treat a word with prevented line break inside as
very much a single word).
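
How much the outcome depends on the indexer can be sketched with two hypothetical tokenizers. Both functions below are assumptions made up for illustration, not the behaviour of any real search engine. The first splits on whatever JavaScript's `\s` class treats as white space, which happens to include U+00A0 and U+FEFF but not U+00AD or U+2060; the second strips invisible format characters before splitting.

```javascript
// Candidate separators from this thread, written as escapes so they stay visible.
const variants = {
  "no-break space U+00A0": "kinder\u00A0garten",
  "soft hyphen U+00AD": "kinder\u00ADgarten",
  "word joiner U+2060": "kinder\u2060garten",
  "zero width no-break space U+FEFF": "kinder\uFEFFgarten",
};

// Hypothetical indexer A: split on JavaScript's notion of white space.
function naiveTokens(text) {
  return text.split(/\s+/).filter(Boolean);
}

// Hypothetical indexer B: drop invisible format characters first, then split.
function unicodeAwareTokens(text) {
  return naiveTokens(text.replace(/[\u00AD\u2060\uFEFF]/g, ""));
}

for (const [name, text] of Object.entries(variants)) {
  // Prints the variant name and the token counts seen by indexer A and B.
  console.log(name, naiveTokens(text).length, unicodeAwareTokens(text).length);
}
```

In this sketch, indexer A sees two words for U+FEFF while indexer B sees one. That is the sense in which the trick carries no guarantee.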

Some user agents will choke on U+FEFF. Such user agents are rare
these days, but before taking a risk, I would like to see that
something can possibly be gained. If the split into components is
natural (and "kinder" and "garten" is not, for English text), then it
would be better to _use_ the component words in natural sentences as
healthy, natural food for search engines. If it isn't, the whole trick
is probably quite pointless; nobody is going to search for "kinder" and
"garten" if he wants to find info on kindergartens.
 
