[OT] Map of email origins to Python list

  • Thread starter Claire McLister

Claire McLister

We've been working with Google Maps, and have created a web service to
map origins of emails to a group. As a trial, we've developed a map of
emails to this group at:

http://www.zeesource.net/maps/map.do?group=668

This represents emails sent to the group since October 27.

Would like to hear what you think of it.

Thanks for listening.

Claire

--
Claire McLister                       (e-mail address removed)
1684 Nightingale Avenue     Suite 201
Sunnyvale, CA 94087        408-733-2737(fax)

http://www.zeemaps.com
 
Paul McGuire

We've been working with Google Maps, and have created a web service to
map origins of emails to a group. As a trial, we've developed a map of
emails to this group at:

http://www.zeesource.net/maps/map.do?group=668

This represents emails sent to the group since October 27.

Would like to hear what you think of it.
------------------------------

<sigh>
Another sleepless camera pointed at the fishbowl that is my online life.

I guess it's a great way to find where there might be Python jobs to be
found, or at least kindred souls (or dissident Python posters in countries
where Internet activity is closely monitored...)

To me, it's either cool in a creepy sort of way, or creepy in a cool sort of
way.

-- Paul
 
Rocco Moretti

Paul said:
We've been working with Google Maps, and have created a web service to
map origins of emails to a group. As a trial, we've developed a map of
emails to this group at:

http://www.zeesource.net/maps/map.do?group=668

This represents emails sent to the group since October 27.

Would like to hear what you think of it.

It's also a testament to the limited value of physically locating people
by internet addresses - If you zoom in on the San Francisco bay area, and
click on the southernmost bubble (south of San Jose), you'll see the
entry for the Mountain View postal code (94043) - a massive list which
contains mostly gmail.com accounts, but also contains accounts with .de
.ca .uk .pl .it .tw and .za domains. I doubt all of the people in that
list live in sunny California, let alone in Mountain View proper.
 
mensanator

Rocco said:
It's also a testament to the limited value of physically locating people
by internet addresses - If you zoom in on the San Francisco bay area, and
click on the southernmost bubble (south of San Jose), you'll see the
entry for the Mountain View postal code (94043) - a massive list which
contains mostly gmail.com accounts, but also contains accounts with .de
.ca .uk .pl .it .tw and .za domains. I doubt all of the people in that
list live in sunny California, let alone in Mountain View proper.

North of that bubble is a second massive list also labeled Mountain View
94043. I found my name on that list and I live in the Chicago area.
Mountain View is, perhaps, where aol.com is located? These bubbles are
showing the location of the server that's registered under the domain
name?
 
Rocco Moretti

North of that bubble is a second massive list also labeled Mountain View
94043. I found my name on that list and I live in the Chicago area.
Mountain View is, perhaps, where aol.com is located? These bubbles are
showing the location of the server that's registered under the domain
name?

Actually, it looks like they are the *same* list. I haven't gone through
all of the names, but I spot checked a few, and it looks like yours,
among others, are listed in both spots. (The southern one looks like it
is a mislocated duplicate, as it is nowhere close to Mountain View, and
is stuck in the middle of a golf course.)
 
Robert Kern

North of that bubble is a second massive list also labeled Mountain View
94043. I found my name on that list and I live in the Chicago area.
Mountain View is, perhaps, where aol.com is located? These bubbles are
showing the location of the server that's registered under the domain
name?

Most of AOL's offices are in Dulles, VA. Google's headquarters are in
Mountain View, CA.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
mensanator

Robert said:
Most of AOL's offices are in Dulles, VA. Google's headquarters are in
Mountain View, CA.

Aha, I post to the usenet through Google. Makes the map application
all the more stupid, doesn't it?
 
Alan Kennedy

[Robert Kern]
Most of AOL's offices are in Dulles, VA. Google's headquarters are in
Mountain View, CA.
[mensanator]
Aha, I post to the usenet through Google. Makes the map application
all the more stupid, doesn't it?

Actually, no, because Google Groups sets the NNTP-Posting-Host header to
the IP address from which the user connected to Google. So your post to
which I'm replying came from IP address "68.73.244.37", which reverses
to "adsl-68-73-244-37.dsl.chcgil.ameritech.net".

http://groups.google.com/group/comp.lang.python/msg/ca06957210fe12ae?dmode=source

So presumably "chcgil" indicates you're in Chicago, Illinois?

Although I do have to point out that the map makes it appear as if I've
been busy posting from all over Dublin's Southside, which, as anyone who
has seen "The Commitments" can attest, is a deep insult to a born-and-bred
Northsider such as myself ;-)
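
For anyone who wants to try the header lookup described above, here is a minimal sketch. It assumes you have the raw article text to hand. The sample article and the function names `posting_host` and `reverse_name` are made up for illustration; only the IP address comes from the post. The reverse lookup uses `socket.gethostbyaddr`, which needs network access, and what it returns today may well differ from the hostname quoted above.

```python
import socket
from email import message_from_string

# Illustrative article; only the NNTP-Posting-Host value is taken from the thread.
SAMPLE_ARTICLE = """\
From: poster@example.invalid
Newsgroups: comp.lang.python
Subject: Re: [OT] Map of email origins to Python list
NNTP-Posting-Host: 68.73.244.37

Body text.
"""

def posting_host(raw_article):
    """Return the NNTP-Posting-Host value, or None if the header is absent."""
    msg = message_from_string(raw_article)
    value = msg.get("NNTP-Posting-Host")  # header lookup is case-insensitive
    return value.strip() if value else None

def reverse_name(ip):
    """Reverse-resolve an IP address; returns None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None

if __name__ == "__main__":
    print(posting_host(SAMPLE_ARTICLE))
```

Not every news server sets this header, so a caller would need some fallback for articles where `posting_host` returns None.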
 
Jorge Godoy

Claire McLister said:
We've been working with Google Maps, and have created a web service to map
origins of emails to a group. As a trial, we've developed a map of emails to
this group at:

http://www.zeesource.net/maps/map.do?group=668

This represents emails sent to the group since October 27.

Would like to hear what you think of it.

Hmmmm... I don't see mine listed there: I'm in South America, Brasil. More
specifically in Curitiba, Paraná, Brasil. :)
 
mensanator

Alan said:
[Robert Kern]
Most of AOL's offices are in Dulles, VA. Google's headquarters are in
Mountain View, CA.
[mensanator]
Aha, I post to the usenet through Google. Makes the map application
all the more stupid, doesn't it?

Actually, no, because Google Groups sets the NNTP-Posting-Host header to
the IP address from which the user connected to Google. So your post to
which I'm replying came from IP address "68.73.244.37", which reverses
to "adsl-68-73-244-37.dsl.chcgil.ameritech.net".

http://groups.google.com/group/comp.lang.python/msg/ca06957210fe12ae?dmode=source

So presumably "chcgil" indicates you're in Chicago, Illinois?

Yes, but why, then, is my name logged into Mountain View, CA?

That justifies my claim of "all the more stupid", doesn't it?
 
George Sakkis

Jorge Godoy said:
Hmmmm... I don't see mine listed there: I'm in South America, Brasil. More
specifically in Curitiba, Paraná, Brasil. :)

That's funny; I was looking for mine and I stumbled across yours at
Piscataway, NJ, US. :)

George
 
Claire McLister

I guess it's a great way to find where there might be Python jobs to be
found, or at least kindred souls (or dissident Python posters in countries
where Internet activity is closely monitored...)

Possibly. But there are so many inaccuracies that this is a rough guide at
best.

To me, it's either cool in a creepy sort of way, or creepy in a cool sort of
way.

An interesting perspective. Not to increase your sense of 'creepy', but
a lot of big corporations now have access to this kind of information
and more.
 
Alan Kennedy

[Alan Kennedy]
So presumably "chcgil" indicates you're in Chicago, Illinois?
[mensanator]
Yes, but why, then, is my name logged into Mountain View, CA?

Presumably the creators of the map have chosen to use a mechanism other
than NNTP-Posting-Host IP address to geolocate posters.

Claire, what mechanism did you use?

That justifies my claim of "all the more stupid", doesn't it?

Well, to me it just says that the map creation software has some bugs
that need fixing.
 
Claire McLister

It's also a testament to the limited value of physically locating people
by internet addresses - If you zoom in on the San Francisco bay area, and
click on the southernmost bubble (south of San Jose), you'll see the
entry for the Mountain View postal code (94043) - a massive list which
contains mostly gmail.com accounts, but also contains accounts with .de
.ca .uk .pl .it .tw and .za domains. I doubt all of the people in that
list live in sunny California, let alone in Mountain View proper.

Indeed, locating people by IP is neither easy nor accurate. We are,
however, not trying to suggest that we can find people's locations this
way. We are just trying to pin-point the origins of emails to a group.

The flaw that you point out is due to problems in our approach of how
we find the 'origin' IP.

We try to get a best-guess estimate of the originating IP and its
location. If we cannot find that, we fall back on the earliest server
that has location information. Clearly this marks quite a few email
origins in the wrong place. Not ALL gmail addresses are handled this
way, however. If you filter on 'gmail' in the 'Name' filter, you'll see
a lot of gmail addresses all over the world. So, we need to do a better
job of guessing the originating IP, and not try to go too far forward.
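
To make the fallback concrete, here is a rough sketch of one way such a lookup could work. This is not our actual script, just an illustration under some assumptions: it walks the Received headers from the earliest hop forward and returns the first bracketed IPv4 address that is publicly routable, using "publicly routable" as a stand-in for "has location information". The names `guess_origin_ip` and `is_public` and the sample message are hypothetical.

```python
import ipaddress
import re
from email import message_from_string

# Matches a bracketed IPv4 literal such as [192.168.1.5] in a Received line.
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

# Hypothetical message: the sender sits on a private network behind a relay.
SAMPLE_MESSAGE = """\
Received: from relay.example.net ([8.8.8.8]) by list.example.org; Mon, 7 Nov 2005 10:00:02 +0000
Received: from [192.168.1.5] by relay.example.net; Mon, 7 Nov 2005 10:00:01 +0000
From: poster@example.invalid
Subject: test

body
"""

def is_public(ip):
    """True if the string is a valid, globally routable IP address."""
    try:
        return ipaddress.ip_address(ip).is_global
    except ValueError:
        return False

def guess_origin_ip(raw_message):
    """Return the earliest public IP found in the Received chain, or None."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    # Each relay prepends its own Received line, so the last one is the earliest hop.
    for header in reversed(received):
        for ip in BRACKETED_IPV4.findall(header):
            if is_public(ip):
                return ip
    return None

if __name__ == "__main__":
    print(guess_origin_ip(SAMPLE_MESSAGE))
```

In the sample, the earliest hop carries only a private address, so the sketch moves forward and reports the relay instead of the sender. That is the same kind of error as the Mountain View cluster.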
 
Jorge Godoy

George Sakkis said:
That's funny; I was looking for mine and I stumbled across yours at
Piscataway, NJ, US. :)

Phew! Thanks for finding me. I was feeling a bit lost... :)


Be seeing you,
 
Claire McLister

Thanks, Alan. You are absolutely right, we are not using the
NNTP-Posting-Host header for obtaining the IP address.

The Python list is unique among the lists that we have handled so far,
in that it has a cross-posting mechanism with net news. Hence, it
seems we are getting many more wrong locations here than on any other
email list maps we've done so far. We've done them for Linux kernel,
postgresql, apache, tomcat, etc. You can find them by searching their
names in the 'find' box. Not many people reported wrong locations on
those maps.

So, we'll have to go back and fix the script that is extracting the IP
address (which is written in Python, btw). Let me know if someone is
interested in taking a look at it and I can post it somewhere.

[Alan Kennedy]
So presumably "chcgil" indicates you're in Chicago, Illinois?
[mensanator]
Yes, but why, then, is my name logged into Mountain View, CA?

Presumably the creators of the map have chosen to use a mechanism other
than NNTP-Posting-Host IP address to geolocate posters.

Claire, what mechanism did you use?

That justifies my claim of "all the more stupid", doesn't it?

Well, to me it just says that the map creation software has some bugs
that need fixing.
 
Neil Hodgson

Claire McLister:
We try to get a best-guess estimate of the originating IP and its
location. If we cannot find that, we fall back on the earliest server
that has location information. Clearly this marks quite a few email
origins in the wrong place. Not ALL gmail addresses are handled this
way, however. If you filter on 'gmail' in the 'Name' filter, you'll see
a lot of gmail addresses all over the world. So, we need to do a better
job of guessing the originating IP, and not try to go too far forward.

The points are labelled with the email address which won't always be
the account posted from. I'm listed in both Sydney (correct) and
Melbourne with my gmail account (actually a subaddress,
(e-mail address removed), only used for news posting) but I post to
comp.lang.python through Thunderbird on my local machine through my
ISP's news server. I expect the marked locations are for the ISP's news
hubs. Gmail only comes into the picture when I'm sent spam in response
to a post.

Multiple locations for gmail don't imply discovery of real origins
of traffic through gmail.

Neil
 
Alan Kennedy

[Claire McLister]
Thanks, Alan. You are absolutely right, we are not using the
NNTP-Posting-Host header for obtaining the IP address.

Aha, that would explain the lack of precision in many cases. A lot of
posters in this list/group go through NNTP (either with an NNTP client
or through NNTP-aware services like Google Groups) which should give
very good results, when available.

So, we'll have to go back and fix the script that is extracting the IP
address (which is written in Python, btw).

What better language to write in :)

Let me know if someone is
interested in taking a look at it and I can post it somewhere.

Sure, please do make it available, or at least the geolocation component
anyway. I'm sure you'll get lots of useful comments from the many clever
and experienced folk who frequent this group.

Don't be aggrieved at the negative comments you've received: I think what
you're doing is fascinating.

But don't forget that a lot of people are not aware that this kind of
geolocation can be done, along with the many other inferences that can
be drawn from message and browser headers. So don't be surprised if some
of them try to "shoot the messenger".

I look forward to the map with updated precision :)
 
Mike Meyer

Claire McLister said:
Thanks, Alan. You are absolutely right, we are not using the
NNTP-Posting-Host header for obtaining the IP address.

Yes, but what are you using?

The Python list is unique among the lists that we have handled so far,
in that it has a cross-posting mechanism with net news. Hence, it
seems we are getting many more wrong locations here than on any other
email list maps we've done so far. We've done them for Linux kernel,
postgresql, apache, tomcat, etc. You can find them by searching their
names in the 'find' box. Not many people reported wrong locations on
those maps.

Hmm. Are you using a different method than you used for the mail
lists? Because my mail and news follows the same path, using the same
host name. The only difference is that my ISP uses supernews.com news
servers, so my postings appear to go direct from my domain to
supernews - but the only place this shows up is in the Path: header.

For the record - I (and my servers) are in Virginia, the domain name I
use is registered to an address in Oklahoma, and everything is relayed
through my ISP in Berkeley. Your map has me in San Francisco. Ok, you
nearly got my ISP's hardware.

So, we'll have to go back and fix the script that is extracting the IP
address (which is written in Python, btw). Let me know if someone is
interested in taking a look at it and I can post it somewhere.

What IP address is it extracting? Well, if you post it, I'll look at
it and figure it out from that.

<mike
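
For the curious, the Path: header mentioned above is easy enough to pick apart. This is only a small sketch, assuming the usual netnews convention that each relay prepends its own name with "!" as the separator, so the rightmost entries are closest to the poster. The function name `path_hops` and the hostnames in the sample are invented.

```python
def path_hops(path_header):
    """Split a Usenet Path header into its entries, oldest (nearest the poster) first."""
    hops = [hop.strip() for hop in path_header.split("!") if hop.strip()]
    # Relays prepend their names, so reverse to get the order of travel.
    return list(reversed(hops))

# Invented example value; real Path headers are usually much longer.
SAMPLE_PATH = "news.example.net!supernews.example!poster.example!not-for-mail"

if __name__ == "__main__":
    for hop in path_hops(SAMPLE_PATH):
        print(hop)
```

The tail entry is often a placeholder such as "not-for-mail" and not a hostname, so anything trying to geolocate from Path would have to skip it.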
 
Mike Meyer

Claire McLister said:
An interesting perspective. Not to increase your sense of 'creepy',
but a lot of big corporations now have access to this kind of
information and more.

You mean my creditors are going to be looking for me in San Francisco,
even though I'm in Virginia? Cool.

<mike
 
