Just out of curiosity: Which languages are they using at Google and what for?

alainpoint

I know Google are using Python for testing purposes.
But for the rest?
Is it PHP, Java or .NET?
Which technology is rendering the google main page?

And of course the obvious question: why not Python?

Alain
 
bruno at modulix

alainpoint said:
I know Google are using Python for testing purposes.

Not only:
"""
Where is Python used?

* The Google build system is written in python. All of Google's
corporate code is checked into a repository and the dependency and
building of this code is managed by python. Greg mentioned that to
create code.google.com took about 100 lines of python code. But since
it has so many dependencies, the build system generated a 3 megabyte
makefile for it!
* Packaging. Google has an internal packaging format like RPM.
These packages are created using python.
* Binary Data Pusher. This is the area where Alex Martelli is
working, on optimizing pushing bits between thousands of servers
* Production servers. All monitoring, restarting and data
collection functionality is done with python
* Reporting. Logs are analyzed and reports are generated using Python.
* A few services including code.google.com and google groups. Most
other front ends are in C++ (google.com) and Java (gmail). All web
services are built on top of a highly optimizing http server wrapped
with SWIG.
"""

http://panela.blog-city.com/python_at_google_greg_stein__sdforum.htm
 
Alex Martelli

alainpoint said:
I know Google are using Python for testing purposes.

...and many more besides, as per the other response.

alainpoint said:
But for the rest?
Is it PHP, Java or .NET?
Which technology is rendering the google main page?

I think you can get a reasonable idea by perusing the 800+ job offers we
currently have open;-), and eyeballing the several papers published by
Googlers -- between one and the other, it becomes pretty clear that we
use mostly Python, C++ and Java, plus of course a host of others for
special purposes (JavaScript for AJAX purposes, C and Assembly for
kernel-hacking of various sorts, SQL for relational databases, etc, etc),
including some highly specialized ones invented within Google for highly
specialized purposes (e.g., Rob Pike's "Sawzall" for log-processing).

alainpoint said:
And of course the obvious question: why not Python?

You can read Pike et al.'s paper on Sawzall to see why we would want a
special-purpose language for that specialized, very-high-volume task,
for example; I hope the reason for the other specialized ones, from
JavaScript to Assembly to SQL, is pretty obvious in each case;-).

As among the "three big ones" -- Python, C++, Java -- there are good
reasons why the overall job is best done by a mix of them. I won't
address Java (we don't use any in my group, nor any of the groups we
interact with intensely), but the tradeoffs between Python and C++
should, again, be pretty clear. For example, C++ allows (and demands)
close control of where all your memory is going -- much harder to
achieve in garbage collected languages such as Python or Java (managing
memory IS a chore, but, under potentially heavy load, an important one).

Also, a point I made at SDForum: at Google's volumes of
traffic, we need load-balancing among many machines, of course, but we
_also_ are unwilling to let user experience suffer from high latencies.
Now, you can scale the "bandwidth" of a cluster by throwing more servers
at the problem -- but latency does not work the same way: you can't make
a baby in 1 month by load-balancing among 9 mothers, as the saying goes.
So, having as few machine-instructions as feasible on the critical paths
that determine the user-perceived latency is important; even the very
first paper by Page and Brin describing what would later become Google
made the point indirectly -- they describe the "crawling" (where latency
is no big deal) as being implemented in Python, but the processing of
queries (where latency is crucial) in C++.
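
To put rough numbers on the bandwidth-versus-latency point above, here is a
minimal Python sketch. The figures (250 ms of work per request, perfect load
balancing) and the function names are made up for illustration; they are not
Google's numbers or code.

```python
# Hypothetical model: every request needs a fixed amount of work on one server.

def cluster_throughput(servers, seconds_per_request):
    """Requests per second the cluster sustains, assuming perfect load balancing."""
    return servers / seconds_per_request


def request_latency(servers, seconds_per_request):
    """Seconds a single user waits. One request runs on one server,
    so the number of servers does not appear in the result."""
    return seconds_per_request


# Assumed cost of one request on one machine (illustrative only).
SECONDS_PER_REQUEST = 0.25

for servers in (1, 9, 90):
    print(
        servers, "servers:",
        cluster_throughput(servers, SECONDS_PER_REQUEST), "requests/s,",
        request_latency(servers, SECONDS_PER_REQUEST), "s per request",
    )
```

In this toy model, throughput grows linearly with the number of servers while
the per-request wait stays the same; the only way to shrink the wait is to
reduce the work done on the critical path, which is the argument for C++ there.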


Alex
 
Alex Martelli

bruno at modulix said:
* Packaging. Google has an internal packaging format like RPM.
These packages are created using python.
* Binary Data Pusher. This is the area where Alex Martelli is
working, on optimizing pushing bits between thousands of servers
* Production servers. All monitoring, restarting and data
collection functionality is done with python

Yep, Greg did say that at SDForum, but in fact I'm working on a much
wider range of problems than just the datapush -- I lead Production
Systems, which includes parts of all of the above and yet more stuff
(account management, network verification, etc etc) -- done mostly in
Python, but with substantial helpers in C++ as well (not so much for CPU
efficiency reasons, as for keeping memory use under strict control when
necessary, the area where C/C++ really shines;-).


Alex
 
