Python Web Programming - looking for examples of solid high-traffic sites

John Nagle

Alex said:
Yeah, but I don't know why it's configured that way. A good example
of a question that looks perfectly appropriate for YouTube's OSCON
session.

YouTube's home page is PHP. Try "www.youtube.com/index.php".
That works, while the obvious alternatives don't.
If you look at the page HTML, you'll see things like

<a href="/login?next=/index.php"
onclick="_hbLink('LogIn','UtilityLinks');">Log In</a>

So there's definitely PHP inside YouTube.

If you look at the HTML for YouTube pages, there seem to be two
drastically different styles. Some pages begin with "<!-- machid: 169 -->",
and have their CSS stored in external files. Those seem to be generated
by PHP. Other pages start with
"<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">",
with no "machine ID". It looks like the stuff associated with
accounts and logging in is on the second system (Python?) while the
search and view related functions are on the PHP system.

Shortly after Google bought YouTube, they replaced YouTube's search
engine (which was terrible) with one of their own. At that time,
Google search syntax, like "-", started working. That's probably
when the shift to PHP happened.

John Nagle
 
Michele Simionato

Our main requirement for tools we're going to use is rock-solid
stability. As one of our team-members puts it, "We want to use tools
that are stable, have many developer-years and thousands of user-years
behind them, and whose _versions_ we shouldn't have to worry about." The
main reason for that is that we want to debug our own bugs, but not
the bugs in our tools.

Our problem is - we have yet to find any example of a high-traffic,
scalable web-site written entirely in Python. We know that YouTube is
a suspect, but we don't know what specific Python web solution was
used there.

TurboGears, Django and Pylons are all nice, and provide rich features
- probably too many for us - but, as far as we understand, they don't
satisfy the stability requirement - Pylons and Django haven't even
reached version 1.0 yet. And they provide too thick a layer - we want
something 'closer to metal', probably similar to web.py -
unfortunately, web.py doesn't satisfy the stability requirement
either, or so it seems.

So the question is: what is a solid way to serve dynamic web pages in
Python? Our initial thought was something like Python + mod_python +
Apache, but we're told that mod_python is 'scary and doesn't work very
well'.

AFAIK mod_python is solid and works well, but YMMV of course.
If you want rock-solid stability, you want a framework where there is
little development going on. In that case, I have a perfect match for your
requirements: Quixote. It has been around for ages, it is the most bug-free
framework I have seen, and it is *very* scalable. For instance,
http://www.douban.com is a Quixote-powered Chinese site with more than
2 million pages served per day. To quote from a message on the Quixote
mailing list:

"""
Just to report-in the progress we're making with a real-world Quixote
installation: yesterday douban.com celebrated its first 2 million-
pageview day. Quixote generated 2,058,207 page views. In addition,
there're about 640,000 search-engine requests. These put the combined
requests at around 2.7 millions. All of our content pages are
dynamic, including the help and about-us pages.

We're still wondering if we're the busiest one of all the python/ruby
supported websites in the world.

Quixote runs on one dual-core home-made server (costed us US$1500).
We have three additional servers dedicated to lighttpd and mysql. We
use memcached extensively as well.

Douban.com is the most visible python establishment on the Chinese
web, so there's been quite a few django vs. quixote threads in the
Chinese language python user mailing lists.
"""

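As a rough illustration of the kind of caching that message alludes to, here is a minimal cache-aside sketch in Python. The message does not say how douban.com actually uses memcached, so everything here is an assumption: `TinyCache` is a hypothetical in-process stand-in for a memcached client, and `render_page` and `build_page` are made-up names.

```python
import time


class TinyCache:
    """In-process stand-in for a memcached client (hypothetical helper)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        """Return the cached value, or None if missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        """Store a value together with its expiry time."""
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def render_page(page_id, cache, build_page, ttl_seconds=60):
    """Return HTML for page_id, building and caching it only on a miss."""
    key = "page:%s" % page_id
    html = cache.get(key)
    if html is None:
        # Cache miss: do the expensive dynamic rendering once.
        html = build_page(page_id)
        cache.set(key, html, ttl_seconds)
    return html
```

The idea is simply that a fully dynamic page is only regenerated when its cached copy is missing or has expired; with a real memcached client the `get`/`set` calls would go over the network instead of into a dict.
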
Michele Simionato
 
John Nagle

John said:
YouTube's home page is PHP. Try "www.youtube.com/index.php".
That works, while the obvious alternatives don't.
If you look at the page HTML, you'll see things like

<a href="/login?next=/index.php"
onclick="_hbLink('LogIn','UtilityLinks');">Log In</a>

So there's definitely PHP inside YouTube.

Not sure; that "next" field is just the URL of the page you're on,
inserted into the output HTML. It's "index.php" because the page was
"index.php".

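To make that concrete, here is a small sketch of how a login link with a "next" parameter could be generated on the server. This is an assumption about the general technique, not YouTube's actual code, and `login_link` is a hypothetical helper. Whatever path the visitor requested is echoed back, regardless of the language that produced the page.

```python
from html import escape
from urllib.parse import urlencode


def login_link(current_path):
    """Build a login link that returns the user to current_path afterwards."""
    # Keep "/" unescaped so the path stays readable in the query string.
    query = urlencode({"next": current_path}, safe="/")
    return '<a href="%s">Log In</a>' % escape("/login?" + query)
```

Called with "/index.php", this yields a link to "/login?next=/index.php"; called with "/watch", it yields "/login?next=/watch".
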
But it's an Apache server, with all the usual Apache messages.

John Nagle
 
Jorge Godoy

John Nagle said:
As a direct result of this, neither the Linux distro builders like
Red Hat nor major hosting providers provide Python environments that
just work. That's reality.

Try SuSE, OpenSUSE, Ubuntu... They "just work". I've never had any
problem installing any library or module for Python. Even the ones that
require huge libraries or compiling something.
 
Josiah Carlson

John said:
Many of the basic libraries for web related functions do have
problems. Even standard modules like "urllib" and "SSL" are buggy,
and have been for years. Outside the standard modules, it gets
worse, especially for ones with C components. Version incompatibility
for extensions is a serious problem. That's reality.

It's a good language, but the library situation is poor. Python as
a language is better than Perl, but CPAN is better run than Cheese Shop.

You know, submitting bug reports, patches, etc., can help make Python
better. And with setuptools' easy_install, getting modules and packages
installed from the Cheese Shop is pretty painless.

- Josiah
 
Bruno Desthuilliers

John Nagle wrote:
Denying the existence of the problem won't fix it.

Neither will systematically criticizing on this newsgroup
instead of providing bug reports and patches.

As a direct result of this, neither the Linux distro builders like
Red Hat nor major hosting providers provide Python environments that
just work. That's reality.

I've been using Python for web applications (Zope, mod_python, FastCGI,
etc.) on Gentoo and Debian for the past 4 or 5 years, and it works just
fine. So far, I've had many more bugs and compatibility problems with
PHP (4 and 5) than with Python.
 
Michael Bayer

Our main requirement for tools we're going to use is rock-solid
stability. As one of our team-members puts it, "We want to use tools
that are stable, have many developer-years and thousands of user-years
behind them, and whose _versions_ we shouldn't have to worry about." The
main reason for that is that we want to debug our own bugs, but not
the bugs in our tools.

You're not going to find a web development platform in any language at
all where you will not come across bugs. You will always have to
"worry" about versions. I have a day job where we worry about bugs
and versions in Struts, Hibernate, MySQL, and Spring all day long, and
each of those products probably has more users than all Python
frameworks combined (anyone should feel free to bring out numbers; I'd be
glad to be proven wrong).

The web platform for Python which has the longest running time and the
most thousands-of-whatever hours is Zope (and Plone). All the others
which are popular today have only a tiny fraction of the in-production
time that Zope has. So if that's your criterion, then Zope is what
you'd probably have to use.

TurboGears, Django and Pylons are all nice, and provide rich features
- probably too many for us - but, as far as we understand, they don't
satisfy the stability requirement - Pylons and Django haven't even
reached version 1.0 yet. And they provide too thick a layer - we want
something 'closer to metal', probably similar to web.py -
unfortunately, web.py doesn't satisfy the stability requirement
either, or so it seems.

I would seriously reconsider the notion that Pylons is "too thick" of
a layer. Pylons is quite open-ended and non-opinionated. The
approaches of Pylons and Django couldn't be more different, so I would
suggest digging a little deeper into the various frameworks before
dismissing them based on shallow judgments. Also, I understand
reddit is built on web.py, which is pretty high-traffic/proven/etc.

So the question is: what is a solid way to serve dynamic web pages in
Python? Our initial thought was something like Python + mod_python +
Apache, but we're told that mod_python is 'scary and doesn't work very
well'.

mod_python works fantastically in my experience. That would satisfy
your requirement of stability as well as "close to the metal". But
you're going to have to roll your own pretty much everything... there's
only the most rudimentary controller layer, not much of an idea of URL
resolution, and of course you'd still have to figure out database/
templating. If you built a whole lot of custom mod_python handlers,
you'd be tied to a very specific kind of process model and couldn't
really branch out into something like fcgi/mod_proxy->WSGI etc.
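
For a sense of what "rolling your own" URL resolution looks like without tying yourself to mod_python's process model, here is a minimal sketch written against the WSGI interface (PEP 333) instead. The `route` decorator, the `ROUTES` table and the two example handlers are hypothetical names made up for illustration; this is not code from any of the frameworks mentioned.

```python
import re

# Ordered table of (compiled pattern, handler) pairs, filled in by @route.
ROUTES = []


def route(pattern):
    """Register a handler for paths that fully match the given regex."""
    def register(handler):
        ROUTES.append((re.compile("^" + pattern + "$"), handler))
        return handler
    return register


@route(r"/")
def index(environ):
    return "200 OK", "front page\n"


@route(r"/user/(?P<name>\w+)")
def user(environ, name):
    return "200 OK", "profile of %s\n" % name


def application(environ, start_response):
    """WSGI entry point: dispatch on PATH_INFO using the route table."""
    path = environ.get("PATH_INFO", "/")
    for regex, handler in ROUTES:
        match = regex.match(path)
        if match:
            # Named regex groups become keyword arguments of the handler.
            status, text = handler(environ, **match.groupdict())
            break
    else:
        status, text = "404 Not Found", "not found\n"
    body = text.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Because `application` is a plain WSGI callable, the same code can be tried locally with the standard library's `wsgiref.simple_server.make_server("", 8000, application).serve_forever()` and later moved behind a different server without changing the handlers.
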

I think you guys have to be less rigid about your requirements
and be willing to get your hands a little dirtier... web.py does seem
to be the kind of thing you guys would like, but if it has some issues
then you'd just have to ....*shudder*....*contribute!* to that project
a little bit. It's sort of par for the course in the field of open
source that you're going to have to be willing to contribute, if not
patches, then at least feedback and test cases to the developers for
issues found. If you're not willing to do that, you might have to
stick with J2EE for now.
 
Rico

Hello list,

our team is going to rewrite our existing web-site, which has a lot of
dynamic content and was quickly prototyped some time ago.

Today, as we have a better idea of what we need, we're going to re-write
everything from scratch. Python is an obvious candidate for our team:
everybody knows it, everybody likes it, it has *real* objects, nice
clean syntax etc.

Our main requirement for tools we're going to use is rock-solid
stability. As one of our team-members puts it, "We want to use tools
that are stable, have many developer-years and thousands of user-years
behind them, and whose _versions_ we shouldn't have to worry about." The
main reason for that is that we want to debug our own bugs, but not
the bugs in our tools.

Our problem is - we have yet to find any example of a high-traffic,
scalable web-site written entirely in Python. We know that YouTube is
a suspect, but we don't know what specific Python web solution was
used there.

TurboGears, Django and Pylons are all nice, and provide rich features
- probably too many for us - but, as far as we understand, they don't
satisfy the stability requirement - Pylons and Django haven't even
reached version 1.0 yet. And they provide too thick a layer - we want
something 'closer to metal', probably similar to web.py -
unfortunately, web.py doesn't satisfy the stability requirement
either, or so it seems.

So the question is: what is a solid way to serve dynamic web pages in
Python? Our initial thought was something like Python + mod_python +
Apache, but we're told that mod_python is 'scary and doesn't work very
well'.

And although http://www.python.org/about/quotes/ lists many big names
and wonderful examples, we want more details. E.g. our understanding
is that Google uses Python mostly for internal web-sites, and
performance is far from perfect there. YouTube is an interesting
example - does anybody know more details about that?

Your suggestions and comments are highly welcome!

Best Regards,
Victor.

Teenwag runs on Python, with a hacked-up framework, and receives about
2 million visitors a day, a number that is constantly increasing
http://teenwag.com/profile?friendid=326
 
Michael Ströder

John said:
Sure they do. I have a complex web site, "http://www.downside.com",
that's implemented with Perl, Apache, and MySQL. It automatically reads
SEC filings and parses them to produce financial analyses. It's been
running for seven years, and hasn't been modified in five, except [..]

Well, how can you then be sure that you don't have any security holes in
there?

Ciao, Michael.
 
Ivan Tikhonov

Use PHP. I am the lead programmer/maintainer of a big website with a lot of
interactive stuff in the user's back office and with a lot of interaction
with our non-web apps.

PHP is crap, but programming for the web in Python is a pain in the ass.
And PHP programmers are cheaper. Especially avoid mod_python.

IMHO.
 
Matthew Nuzum

Hello list,

our team is going to rewrite our existing web-site, which has a lot of
dynamic content and was quickly prototyped some time ago.
See #3 below
Our main requirement for tools we're going to use is rock-solid
stability. As one of our team-members puts it, "We want to use tools
that are stable, have many developer-years and thousands of user-years
behind them, and whose _versions_ we shouldn't have to worry about."
TurboGears, Django and Pylons are all nice, and provide rich features
- probably too many for us - but, as far as we understand, they don't
satisfy the stability requirement - Pylons and Django haven't even
reached version 1.0 yet.
See #3 below
And they provide too thick a layer - we want
something 'closer to metal', probably similar to web.py -
See #1 below
Your suggestions and comments are highly welcome!

Victor et al,

I would propose that you ask some different questions. I propose these
out of personal experience, much of it from making poor decisions and
learning the hard way. Sorry if these sound gruff; I think it's hard to
avoid when using bullet points; in my mind, these are all being
phrased as kindly and gently as I can.

#1
Seek flexibility over being "closer to metal." I say this because
whenever I've thought I wanted to be closer to metal, it's because I
didn't want to be constrained by a framework's assumptions. I guess
another reason would be the need for raw performance with
computations, but I think you would have said that if that was your
goal. Still, "more flexible" is not always better. "flexible and well
integrated" is slightly better than "flexible to the Nth degree."

#2
Look for projects that deal with updates and security fixes in a way
that is sensitive to users of critical applications. This is better
than programs that change little. For example, does the "vendor"
provide patches containing just critical updates? Do they have good,
clear communication about changes that may break compatibility? How
long are older versions maintained? Asking these questions will help
you find a thriving project that is actively maintained and supported.
(contrast to abandonware, which hasn't changed in ages)

#3
Why haven't you mentioned maintainability or scalability? It sounds
like you're coming from a platform that you have outgrown, either
because your app can't keep up with its load, or because you can't
enhance it with the features you want. You're not simply refactoring
it, you're starting over from the beginning. How often do you want to
start from scratch? If the answer is, "this is the last time," then
I'd worry *way* more about this and point #2 than anything else you
mentioned.

#4 (optional)
Has your dev team built many python web-apps? I'm guessing no, or
you'd already have picked a platform because of experience. If they
have not, I'd personally also ask for a platform that is easy to
learn, works well with the dev tools (IDE, debugger, version control)
you're familiar with, and has good documentation (dead tree or
online).

The unfortunate news, I'm afraid to say, is that because of the python
culture, you're still going to face tough decisions as there are
several mature products that score high marks in the areas I've listed
above. It seems the python community at large is insanely bent on
doing things the "right" way, so there may just be too many good
options to choose from.

But it's better to ask the right questions.
 
João Santos

Please have a look at Plone and Zope.

"During the month of January 2006, we've had approx. 167 million hits"
plone.org
 
Bruno Desthuilliers

Ivan Tikhonov wrote:
Use PHP. I am the lead programmer/maintainer of a big website with a lot of
interactive stuff in the user's back office and with a lot of interaction
with our non-web apps.

PHP is crap, but programming for the web in Python is a pain in the ass.

Strangely enough, MVHO on this is that PHP is crap *and* that programming
"for the web" with it is a king-size PITA, *especially* compared to Python.

And PHP programmers are cheaper.

"Cheaper", yes. On an hourly basis. TCO is another problem...

Especially avoid mod_python.

Unless you have to deeply integrate with (and are willing to be forever
tied to) Apache, I totally agree on this last one.
 
Bruno Desthuilliers

John Nagle wrote:
(snip)
YouTube's home page is PHP. Try "www.youtube.com/index.php".
That works, while the obvious alternatives don't.
If you look at the page HTML, you'll see things like

<a href="/login?next=/index.php"
onclick="_hbLink('LogIn','UtilityLinks');">Log In</a>

So there's definitely PHP inside YouTube.

What about learning more about web server configuration?
http://zope.alostnet.eu/index.php

"Definitely PHP", huh?
 
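The same point can be shown in a few lines of Python: a URL ending in ".php" only says what path the server accepts, not what language runs behind it. The sketch below is a hypothetical WSGI application (an illustration only, not how the site linked above is actually set up) that answers "/index.php" entirely in Python.

```python
def application(environ, start_response):
    """WSGI app that serves the front page at both "/" and "/index.php"."""
    path = environ.get("PATH_INFO", "/")
    if path in ("/", "/index.php"):
        status = "200 OK"
        text = "<html><body>Generated by Python, no PHP involved.</body></html>"
    else:
        status = "404 Not Found"
        text = "<html><body>Not found.</body></html>"
    body = text.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Any WSGI-capable server can host this callable, and an Apache rewrite or alias rule can map a ".php" path onto a non-PHP backend just as easily.
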
Victor Kryukov

Hello list,

thanks a lot to everybody for their suggestions. We have yet to make our
final decision, and the information you've provided is really helpful.

Best,
Victor.
 
