email address obfuscation

dorayme

Is anyone here using methods to make it more difficult for spammers
to garner email addresses from web pages? I am mostly interested to
hear from anyone using specific methods (rather than anything
else like further reviews, analyses of the ultimate effectiveness
etc, having things like "removeThis" inside the email address
that is in the "mailto:").

I had a client recently ask me to "do something" about the spam
coming from his website. I want to do better than tell him to get
the best spam filter he can, both on his local and on his server
end via his host. There is a javascript thing I used to use but
these days I am interested in being able to get by without it.
So, any suggestions will be welcome, especially if they are
actually being used by the suggester (not someone else they know
or have heard of... take this as a compliment).
 
Joe

Anyone here using methods to make it more difficult for spammers
to garner email addresses from web pages.
...
I've been using the 'hash entity' method for years. Seems to work, but
as it's used in harness with spam filter on my email proggy and isp, I
don't really know. I replace about half the addy with entities,
especially the . and @ , but how far you go is up to you. It can't hurt
to try.
Anyway, check it at http://graspages.cjb.cc/emailme.php
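
For anyone who wants to see the idea spelled out, here is a minimal sketch of partial entity encoding. It is not Joe's actual code: the function name encode_partial, the "every second character" rule and the address user@example.com are all assumptions made for illustration.

```python
import html


def encode_partial(address: str, always: str = ".@") -> str:
    """Replace '.' and '@', plus every second other character,
    with decimal character references (roughly half the address)."""
    out = []
    for i, ch in enumerate(address):
        if ch in always or i % 2 == 0:
            out.append(f"&#{ord(ch)};")
        else:
            out.append(ch)
    return "".join(out)


address = "user@example.com"  # placeholder address, not a real one
encoded = encode_partial(address)

# The encoded form goes into both the href and the visible link text.
link = f'<a href="mailto:{encoded}">{encoded}</a>'
print(link)

# A browser decodes the references, so visitors see the plain address.
print(html.unescape(encoded))
```

How many characters to encode is a matter of taste, as Joe says; the sketch only guarantees that the dot and the at sign never appear literally in the source.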
 
dorayme

Joe said:
I've been using the 'hash entity' method for years. Seems to work, but
as it's used in harness with spam filter on my email proggy and isp, I
don't really know. I replace about half the addy with entities,
especially the . and @ , but how far you go is up to you. It can't hurt
to try.
Anyway, check it at http://graspages.cjb.cc/emailme.php

Thanks Joe. I found something to make your technique easier at
http://www.wbwip.com/wbw/emailencoder.html and have already used
it in anger just now and it is working on a client's site. I am
imagining it already side-swiping all attempts by vicious bots to
"harvest" it, I have the image of a rugby player at full bore
with the ball, fending off all attempts to tackle him. You know,
his arm and hand stretched out to push all comers away as Rugby
players do... (I know how analogies tickle you pink...)

Just what I wanted, someone to say something to get me going!
 
Nikita the Spider

dorayme said:
Anyone here using methods to make it more difficult for spammers
to garner email addresses from web pages. Mostly interested to
hear from anyone using specific methods (rather than anything
else like further reviews, analyses of the ultimate effectiveness
etc, having things like "removeThis" inside the email address
that is in the "mailto:").

I've set up several spamtrap addresses to study this. Eventually I'll
write a short article about my findings, but in the meantime I'll
summarize here. I have three email addresses all on the same page. One
is naked (i.e. just (e-mail address removed)), one is entity encoded (i.e.
&#102;&#111;&#111; etc.) and one is added to the page by Javascript.
The number of spams each has gotten to date is as follows:

naked - 715
entities - 2
javascript - 1

In short, the entities look pretty effective to me. They're nice because
they don't disturb one's visitors at all and you don't have to mess
around with any Javascript.

But another way of looking at it is to say that Javascript protection is
twice as effective as entity protection. =) (Thanks to Huff's "How to
Lie with Statistics")
 
dorayme

Nikita the Spider said:
I've set up several spamtrap addresses to study this. Eventually I'll
write a short article about my findings, but in the meantime I'll
summarize here. I have three email addresses all on the same page. One
is naked (i.e. just (e-mail address removed)), one is entity encoded (i.e.
foo etc.) and one is added to the page by Javascript.
The number of spams each has gotten to date is as follows:

naked - 715
entities - 2
javascript - 1

In short, the entities look pretty effective to me. They're nice because
they don't disturb one's visitors at all and you don't have to mess
around with any Javascript.

Yes, excellent. My feelings too on this one.

But another way of looking at it is to say that Javascript protection is
twice as effective as entity protection. =) (Thanks to Huff's "How to
Lie with Statistics")

People can and do look at things as they like! But the truth is
another matter.

It would be nice to actually know how the 2 and 1 got through...
This brings up this issue: just this morning, there was some post
here at alt.html re a facility to somehow capture material on a
screen (it is gone from my newsreader now). Though the email is
veiled in the source, it is not in the browser as expressed. It
is commonly just printed as normal on the screen. Sure, this bit
can be avoided by simple techniques like making the visible link
something like ...>email us</a>? To avoid any "on screen
harvesting"?

But, this is not always acceptable. I have no idea how the robots
work, how clever they are, whether they in fact look at source or
output or both. Your stats would be more meaningful if you could
say more about the implementation. Interesting experiment though,
Spider. Look forward to your article.
 
Beauregard T. Shagnasty

Nikita said:
I've set up several spamtrap addresses to study this. Eventually I'll
write a short article about my findings, but in the meantime I'll
summarize here. I have three email addresses all on the same page.
One is naked (i.e. just (e-mail address removed)), one is entity encoded (i.e.
foo etc.) and one is added to the page by Javascript.
The number of spams each has gotten to date is as follows:

naked - 715
entities - 2
javascript - 1

I'll agree that using entities works. I have one address on a web site
that began life in this form. Never got any spam in about six years.

Then one day, I started getting bounces from emails containing viruses.
I found out that someone who added my address to his address book got
infected. My address was used as a forged FROM: by this virus. Shortly
after that, I started to get spam and it's hovering around 200-250 per
day now. :-(
 
jojo

Joe said:
I've been using the 'hash entity' method for years. Seems to work, but
as it's used in harness with spam filter on my email proggy and isp, I
don't really know. I replace about half the addy with entities,
especially the . and @ , but how far you go is up to you. It can't hurt
to try.
Anyway, check it at http://graspages.cjb.cc/emailme.php

You can improve that: use HTML-Entities for "mailto:" and hex-entities
(%41 for A) for the email address itself.
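
A rough sketch of what this combination could look like, assuming "hex-entities" means percent-encoding in the URL sense. The helper names entity_encode and percent_encode and the address user@example.com are made up for the example, and whether every mail client copes with a fully percent-encoded mailto: address is not something the sketch checks.

```python
import html
from urllib.parse import unquote


def entity_encode(text: str) -> str:
    """Turn every character into a decimal character reference."""
    return "".join(f"&#{ord(c)};" for c in text)


def percent_encode(text: str) -> str:
    """Percent-encode every byte, including ones that would be legal as-is."""
    return "".join(f"%{b:02X}" for b in text.encode("utf-8"))


address = "user@example.com"  # placeholder address
href = entity_encode("mailto:") + percent_encode(address)
print(f'<a href="{href}">email us</a>')

# Round trip: undo the character references, then the percent-encoding.
print(unquote(html.unescape(href)))
```

The round trip at the end recovers the plain "mailto:" URL, which is the same work a browser does before handing the address to the mail program.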
 
John Dunlop

[re e-mail address obfuscation]

jojo:
You can improve that: use HTML-Entities for "mailto:" and hex-entities
(%41 for A) for the email address itself.

....the one going against if not the word then the spirit of HTML4.01,
the other against the spirit of RFC3986. Character references were
made for when it is inconvenient or impossible to enter a character
directly, for example, when there is no key for it on the keyboard or
the character isn't displayable.

| A given character encoding may not be able to express
| all characters of the document character set. For such
| encodings, or when hardware or software configurations
| do not allow users to input some document characters
| directly, authors may use SGML character references.

(HTML4.01 sec.5.3)

Percent-encoding characters that are allowed as data in a URL part
hinders transcription because characters that could otherwise be
recognisable and rememberable have been, unless you're familiar with
US-ASCII and hexadecimal notation, turned into unrecognisable and
harder-to-remember three-character sequences. That your browser
silently decodes percent-encodings and presents you with a more
human-friendly URL suggests that e-mail address harvesters can do a
similar job.

Principles of URL design take into consideration human factors because
URLs are part of the user-interface. Obfuscating URLs with
percent-encodings makes things harder for humans while barely
increasing the hardship on e-mail address harvesters.

Obfuscation of e-mail addresses is just that: obfuscation. It does
nothing to help the genuine user find and use your e-mail address.
Attempts at obfuscating e-mail addresses - likewise attempts at
obfuscating markup - are trivial to bypass, even by e-mail address
harvesters. I should emphasize that I'm not saying that attempts at
obfuscation will universally fail, only that it takes little effort to
overcome them.
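
To make the "little effort" concrete, here is a hedged sketch of the decoding step. It says nothing about what real harvesters actually do; the function find_addresses, the deliberately simple regular expression and the sample markup are assumptions chosen only to show that two standard-library calls undo both kinds of encoding.

```python
import html
import re
from urllib.parse import unquote

# Deliberately simple pattern; real address syntax is far more permissive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_addresses(source: str) -> list:
    """Decode character references and percent-encodings, then search."""
    decoded = unquote(html.unescape(source))
    return EMAIL_RE.findall(decoded)


# "mailto:" as character references, the address percent-encoded.
page = (
    '<a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;'
    '%75%73%65%72%40%65%78%61%6D%70%6C%65%2E%63%6F%6D">email us</a>'
)

print(EMAIL_RE.findall(page))  # nothing found in the raw source
print(find_addresses(page))    # the address appears after decoding
```

The same pattern finds nothing in the raw source, so the encoding only stops a program that never bothers to decode.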

My advice, if you're not keen on actively fighting spam, would be to
either set up junk mail filters both at your server and at your MUA, or
remove the address from the public eye altogether.
 
Brian Cryer

Nikita the Spider said:
I've set up several spamtrap addresses to study this. Eventually I'll
write a short article about my findings, but in the meantime I'll
summarize here. I have three email addresses all on the same page. One
is naked (i.e. just (e-mail address removed)), one is entity encoded (i.e.
foo etc.) and one is added to the page by Javascript.
The number of spams each has gotten to date is as follows:

naked - 715
entities - 2
javascript - 1

Given how easy it is to translate I'm amazed that the encoded version is so
effective. Just goes to show that spammers are stupid as well as sad.
 
Brian Cryer

dorayme said:
Anyone here using methods to make it more difficult for spammers
to garner email addresses from web pages. Mostly interested to
hear from anyone using specific methods (rather than anything
else like further reviews, analyses of the ultimate effectiveness
etc, having things like "removeThis" inside the email address
that is in the "mailto:").

I had a client recently ask me to "do something" about the spam
coming from his website. I want to do better than tell him to get
the best spam filter he can, both on his local and on his server
end via his host. There is a javascript thing I used to use but
these days I am interested in being able to get by without it.
so, any suggestions will be welcome, especially if they are
actually being used by the suggester (not someone else they know
or have heard of... take this as a compliment)

I'm sure you already know this, but: Whatever technique you decide to use
(unless you go the route of a better spam filter) be sure to ditch the
existing email address. Once you are on a spammer's mailing list it's unlikely
that you will ever get off it. So there is no point deploying a
"super-anti-spam" technique with an email address that already gets tons of
spam.
 
cwdjrxyz

dorayme said:
Anyone here using methods to make it more difficult for spammers
to garner email addresses from web pages. Mostly interested to
hear from anyone using specific methods (rather than anything
else like further reviews, analyses of the ultimate effectiveness
etc, having things like "removeThis" inside the email address
that is in the "mailto:").

I had a client recently ask me to "do something" about the spam
coming from his website. I want to do better than tell him to get
the best spam filter he can, both on his local and on his server
end via his host. There is a javascript thing I used to use but
these days I am interested in being able to get by without it.
so, any suggestions will be welcome, especially if they are
actually being used by the suggester (not someone else they know
or have heard of... take this as a compliment)

Several methods that work at least somewhat have been mentioned. Most
of us likely need several email addresses. I have noticed that many
large companies use addresses that can not be answered for contacting
people. All questions have to go to the main address. Some like to use
CGI feedback forms without a mention of a specific address. However
this is not without risk, since a virus can be fed to a server in this
way unless the CGI feedback is very carefully constructed. There
are people who will put a scripted virus in the feedback box. Limiting
the size of the feedback and not allowing it to contain script helps in
this respect. And of course, do not use a good address on Usenet posts.
I use one at my domain for posting that does not allow any response -
everything is dumped. Then I have addresses used only for friends,
finance, etc. These seldom get spam, so I usually do not have to
configure to allow only mail from those on a list.
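
The two precautions mentioned for feedback forms (limit the size, do not allow script) might be sketched as follows. This is only an illustration: the limit of 2000 characters and the name clean_feedback are assumptions, and a real form handler would need more than this.

```python
import html

MAX_FEEDBACK_CHARS = 2000  # assumed limit, pick whatever suits the form


def clean_feedback(raw: str, limit: int = MAX_FEEDBACK_CHARS) -> str:
    """Cut the message to a maximum length, then escape markup so that
    any embedded script arrives as inert text rather than as HTML."""
    trimmed = raw[:limit]  # truncate first so no escape sequence is cut in half
    return html.escape(trimmed, quote=True)


sample = '<script>alert("x")</script>Nice site!'
print(clean_feedback(sample))
```

Escaping rather than deleting the markup keeps the message readable for whoever receives it while making sure nothing in it is treated as script.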
 
Nikita the Spider

Brian Cryer said:

Given how easy it is to translate I'm amazed that the encoded version is so
effective. Just goes to show that spammers are stupid as well as sad.

I was also surprised by this result, but I can think of two reasons why
harvesting bots might ignore any non-naked addresses, even if they're
easy to translate. First, the harvesters might feel that anyone who is
savvy enough to obfuscate his email address isn't likely to respond to
spam anyway. Second, the harvesters might see no shortage of
un-obfuscated addresses, so why go to the trouble of harvesting the
small number of obfuscated ones? It's this latter theory that I prefer
because laziness is a powerful (and common) motivator.
 
Nikita the Spider

dorayme said:

It would be nice to actually know how the 2 and 1 got through...

One of the two was a standard 419 scam (see http://www.419eater.com/ if
you're not familiar with these) so I could believe that an actual human
clicked on the link. But the one that got through to both the
Javascript- and entity-protected one was a garden variety spam. It
really surprises me that I got only one. I figured that once I was on
the list, the floodgates would open.

But, this is not always acceptable. I have no idea how the robots
work, how clever they are, whether they in fact look at source or
output or both.

I'd be surprised if any do more than look through the source.

Your stats would be more meaningful if you could
say more about the implementation. Interesting experiment though,
Spider. Look forward to your article.

Thanks, will explain methodology, implementation, etc. and post a link
to the article here eventually.
 
dorayme

I'm sure you already know this, but: Whatever technique you decide to use
(unless you go the route of a better spam filter) be sure to ditch the
existing email address. Once you are on a spammer's mailing list it's unlikely
that you will ever get off it. So there is no point deploying a
"super-anti-spam" technique with an email address that already gets tons of
spam.

I know what you mean. Looking on the bright side though, after a
while, without any response, without fresh harvesting, there
would start to be a reduction perhaps... after the point of
encoding provisions being made.
 
dorayme

John Dunlop said:
[re e-mail address obfuscation]

jojo:
You can improve that: use HTML-Entities for "mailto:" and hex-entities
(%41 for A) for the email address itself.

...the one going against if not the word then the spirit of HTML4.01,
the other against the spirit of RFC3986. Character references were
made for when it is inconvenient or impossible to enter a character
directly, for example, when there is no key for it on the keyboard or
the character isn't displayable.

Ah, but you see, it is like this, Jock. Recall, for example,
Mississippi Burning. Gene Hackman, second in command of an FBI
hunt, is raring to bring in his team of ex-crim
mission-impossible not-totally-law-abiding but
now-on-the-side-of-the-good-guys to break the back of the
low-down no-good scumbag-leadership of the KKK responsible for a
triple murder. The FBI leader, Agent Alan Ward, makes your sort
of speech, and holds out for high principles and gets bloody
nowhere! Things start to happen as soon as the fabulously
charismatic Hackman is allowed to follow his instincts.

Attempts at obfuscating e-mail addresses - likewise attempts at
obfuscating markup - are trivial to bypass, even by e-mail address
harvesters. I should emphasize that I'm not saying that attempts at
obfuscation will universally fail, only that it takes little effort to
overcome them.

If it is so little effort, what is your theory about why it is so
effective (if it is as recent indications suggest)? Perhaps I can
help you:

Similar speeches are made like yours about the value of security
bars on windows and doors. "Ha", says my neighbour opposite, "I
could get through with a good crowbar in 15 secs!".

Sure he could - if he wants to die by the claws of my specially
and lovingly trained 16 year old cat.

The point is this though: robbers tend to go for the low lying
fruit first and there is plenty enough of that to go around. Do
you understand what I am saying? No need to crash through even
slightly heavier security.
 
Jukka K. Korpela

Scripsit dorayme:
Anyone here using methods to make it more difficult for spammers
to garner email addresses from web pages.

Removing all of one's web pages is sometimes suggested as the only sure
method, but even it isn't sure at all, of course. Think about
www.archive.org.

I had a client recently ask me to "do something" about the spam
coming from his website.

Tell them to contact a specialist on such matters if they can't handle it.
Spam isn't an HTML problem any more than terrorism, lack of good sex, or poverty
is.

I want to do better than tell him to get
the best spam filter he can,

Why would you want to do better than the real thing? I guess you are
thinking of suggesting something _else_, like "email address protection"
snake oil. I hope you now realize how ridiculous the idea is.

Either they do some spam filtering, or they don't. Either way, email address
obfuscation does not protect them from spam but _will_ damage their
business by damaging communication, style, and impression.
 
dorayme

Jukka K. Korpela said:
Tell them to contact a specialist on such matters if they can't handle it.
Spam isn't an HTML problem any more than terrorism, lack of good sex, or poverty
is.

I have already said to do the spam filtering. It is the other bit
of what you say that I don't want to communicate. I don't
honestly. I know, you are right about an ideal world. If there is
something a little impure that helps, I will use it if all I see
are mainly theoretical objections.

Why would you want to do better than the real thing? I guess you are
thinking of suggesting something _else_, like "email address protection"
snake oil. I hope you now realize how ridiculous the idea is.

Well, yes actually. But it really does not seem to me ridiculous,
even though it is not really kosher. What I do find ridiculous is
the idea of being purer than the practicalities dictate. When a
pedestrian stop light is on, Australians will tend to wait till
it goes green, even if there is not a car in sight. French people
are not so ridiculous and express surprise at this behaviour when
visiting here.

Either they do some spam filtering, or they don't. Either way, email address
obfuscation does not protect them from spam but _will_ damage their
business by damaging communication, style, and impression.

Well, I would like to see the evidence for this as it might
relate to various cases in my patch. If you were right, it would
indeed be a reason not to.

I was aware of this response when I posted. And was not looking
forward to it. But I think you are right to have expressed it so
as to dampen any ideas that it is a wholesome thing to do. I have
no illusions: I am a fallen being.

As often though, I do think about what you say and will probably
end up further emphasising the proper way to go, ie. to put in
the best spam filters/blockers they can and to point them to
resources to do this... So, thank you.
 
Joe (GKF)

Thanks Joe. I found something to make your technique easier at
http://www.wbwip.com/wbw/emailencoder.html and have already used

shiny! and arguably better than my usual 'back of an envelope'
technique, which involves memorising "At 64 dot 46". Then I normally
have to look up 'a'.
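
The numbers in the mnemonic are just the decimal character codes, so the lookup for 'a' (or anything else) can be done in a couple of lines. A small sketch, with the helper name entity invented for the example:

```python
def entity(ch: str) -> str:
    """Return the decimal character reference for a single character."""
    return f"&#{ord(ch)};"


# "At 64 dot 46", plus the one that usually has to be looked up.
for ch in "@.a":
    print(repr(ch), ord(ch), entity(ch))
```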

his arm and hand stretched out to push all comers away as Rugby
players do... (I know how analogies tickle you pink...)

pinking up nicely, ta.

Just what I wanted, someone to say something to get me going!

my pleasure.
 
John Dunlop

dorayme:

[re overcoming e-mail address obfuscation]
If it is so little effort, what is your theory about why it is so
effective (if it is as recent indications suggest)? Perhaps I can
help you:

No help needed, dorayme, thank you. Someone in this thread has already
advanced a plausible theory: laziness. Even the slightest extra
effort is too much because unobfuscated e-mail addresses are plentiful,
easy pickings even. No need to stretch.

The point is this though: robbers tend to go for the low lying
fruit first and there is plenty enough of that to go around. Do
you understand what I am saying? No need to crash through even
slightly heavier security.

Yes, but I am merely pointing out that obfuscating e-mail addresses is
inferior to real security; I am not claiming to know what harvesters
actually do!

Mind that old axiom 'security by obscurity gives a false sense of
security'?

And, as I've explained, the techniques to obfuscate e-mail addresses
proposed in this thread run contrary to the spirit of Internet
specifications. That a construct is included in a specification is
hardly license to exploit it.

Deal with spam at your end; don't pass the buck.
 
