OT: why do web BBS's and blogs get so slow?

Paul Rubin

Lots of times, when a blog (that supports user comments) or a web BBS
gets heavily loaded, it slows down horribly, even when it's hosted at
an ISP on a fast computer with plenty of net bandwidth. I'm wondering
what those programs are doing that makes them bog down so badly.
Anyone know what the main bottlenecks are? I'm just imagining them
doing a bunch of really dumb things.

I'm asking this on clpy because thinking about the problem naturally
made me wonder how I'd write a program like that myself, which of
course would mean using Python.

FWIW, here's how I'd do it:

1) It would be a single-threaded web server (asyncore, twistedmatrix)
with a select loop talking to a socket, either on port 80 directly or
to a proxy web server running mod_gzip, SSL, and so forth.
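
As a rough sketch of (1), a toy asyncore server (the request handling
is a stub; everything here is illustrative, not a real BBS):

    import asyncore, socket

    class Handler(asyncore.dispatcher_with_send):
        def handle_read(self):
            request = self.recv(8192)
            if request:
                body = b"<html>...page built from in-ram data...</html>"
                self.send(b"HTTP/1.0 200 OK\r\n"
                          b"Content-Type: text/html\r\n\r\n" + body)
                self.close()

    class Server(asyncore.dispatcher):
        def __init__(self, port):
            asyncore.dispatcher.__init__(self)
            self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
            self.set_reuse_addr()
            self.bind(("", port))
            self.listen(5)

        def handle_accept(self):
            pair = self.accept()
            if pair is not None:
                Handler(pair[0])

    Server(8080)
    asyncore.loop()   # one select() loop services every connection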

2) It might use MySQL for infrequent operations like user info lookup
at login time or preference updates, but not for frequent operations
like reading and posting messages. User session info and preferences
would be in ram during a session, in a python dict indexed by a
browser session cookie.
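
In code, the session table idea is about this simple (load_prefs_from_db
is a hypothetical stand-in for the rare MySQL lookup):

    sessions = {}   # session cookie -> user preference dict

    def get_session(cookie):
        try:
            return sessions[cookie]
        except KeyError:
            prefs = load_prefs_from_db(cookie)  # hypothetical; rare SQL hit
            sessions[cookie] = prefs
            return prefs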

3) The message store would be two files, one for metadata and one for
message text. Both of these would be mmap'd into memory. There would
be a fixed length of metadata for each message, so getting the
metadata for message #N would be a single array lookup. The metadata
would contain the location in the text file where the message text is
and its length, so getting the text would take just one memcpy. The
box would have enough ram to hold all recently active messages almost
all the time. Paging to disk is left to the host OS's virtual
memory system. From the application's point of view everything is
always in ram. Digging up old messages might take some page faults,
but doing that should be relatively rare. New messages are always
appended to the files, keeping memory and paging behavior fairly
localized. There might be a third file for an activity log, which is
append-only (serial access). Ideally that would be on a separate disk
from the message disk, to reduce head contention.
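
A minimal sketch of that lookup, assuming a record layout of two 32-bit
ints (text offset, text length); the real record would carry more fields:

    import mmap, os, struct

    META_FMT = "II"                    # assumed layout: (offset, length)
    RECLEN = struct.calcsize(META_FMT)

    meta_fd = os.open("messages.meta", os.O_RDONLY)
    meta = mmap.mmap(meta_fd, os.fstat(meta_fd).st_size,
                     access=mmap.ACCESS_READ)
    text_fd = os.open("messages.text", os.O_RDONLY)
    text = mmap.mmap(text_fd, os.fstat(text_fd).st_size,
                     access=mmap.ACCESS_READ)

    def get_message(n):
        # one array lookup in the metadata, one copy out of the text file
        offset, length = struct.unpack(
            META_FMT, meta[n * RECLEN : (n + 1) * RECLEN])
        return text[offset : offset + length]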

4) Finding all N messages in an active discussion thread might require
chasing N pointers in the metadata file, but that would usually be at
most a few thousand small lookups, all in ram, and the thread info
would be cached in ram in the application once found. Fancier disk
structures could speed this up but probably aren't needed.

A site like Slashdot gets maybe 20(?) page views per second at busy
times, and around 10,000 new messages a day of maybe 1 kbyte each, a
mere 10 MB per day of message text, no problem to keep in ram. I'd
like to think that the above scheme could handle Slashdot's traffic
level pretty easily on one or two typical current PCs with IDE disks,
one PC for the BBS application and the other (if needed) for a proxy
server running gzip and caching static pages. Am I being naive and/or
missing something important? Slashdot itself uses a tremendous amount
of hardware by comparison.
 
Jay O'Connor

Paul said:
> FWIW, here's how I'd do it:
>
> 1) It would be a single-threaded web server (asyncore, twistedmatrix)
> with a select loop talking to a socket, either on port 80 directly or
> to a proxy web server running mod_gzip, SSL, and so forth.
>
> 2) It might use MySQL for infrequent operations like user info lookup
> at login time or preference updates, but not for frequent operations
> like reading and posting messages. User session info and preferences
> would be in ram during a session, in a python dict indexed by a
> browser session cookie.


This is similar to what I wrote in Smalltalk several years ago; it's
now running at http://www.ezboard.com
 
A.M. Kuchling

Paul Rubin said:
> I'm wondering what those programs are doing that makes them bog down
> so badly. Anyone know what the main bottlenecks are? I'm just
> imagining them doing a bunch of really dumb things.

Oh, interesting! I'm sporadically working on a Slashdot clone, so this sort
of thing is a concern. As a result I've poked around in the Slashdot SQL
schema and page design a bit.

Skipping ahead:

> Am I being naive and/or missing something important? Slashdot itself
> uses a tremendous amount of hardware by comparison.

Additional points I can think of:

* Some of that slowness may be latency on the client side, not the server. A
heavily table-based layout may require that the client get most or all of
the HTML before rendering it. Slashdot's HTML is a nightmare of tables;
some weblogs have CSS-based designs that are much lighter-weight.

* To build the top page, Slashdot requires a lot of SQL queries. There's
the list of stories itself, but there are also lists of subsections (Apache,
Apple, ...), lists of stories in some subsections (YRO, Book reviews, older
stories), icons for the recent stories, etc. All of these may need an SQL
query, or at least a lookup in some kind of cache.

It also displays counts of posts to each story (206 of 319 comments),
but I don't think it's doing queries for these numbers; instead there
are extra columns in various SQL tables that cache this information
and get updated somewhere else.

* I suspect the complicated moderation features chew up a lot of time.
You take +1 or -1 votes from people, and then have to look up
information about the person, and then look at how people assess this
person's moderation... It's not doing this on every hit, but the
feature probably has *some* cost.

* There are lots of anti-abuse features, because Slashdot takes a lot
of punishment from bozos. Perhaps the daily traffic is 10,000 messages
that get displayed plus another 10,000 messages that need to be
filtered out but consume database space nonetheless.

* Slashcode actually implements a pretty generic web application system that
runs various templates and stitches together the output. A Slashcode
"theme" consists of the templates, DB queries, and cron jobs that make up
a site; you could write a Slashcode theme that was amazon.com or any other
web application, in theory. However, only one theme has ever been
written, AFAICT: the one used to run Slashdot. (Some people have taken
this theme and tweaked it in small stylistic ways, but that's a matter of
editing this one theme, not creating a whole new one.) This adds an
extra level of interpretation because the site is running these templates
all the time.

> 3) The message store would be two files, one for metadata and one for
> message text. Both of these would be mmap'd into memory. There would
> be a fixed length of metadata for each message, so getting the
> metadata for message #N would be a single array lookup.

I like this approach, though I think you'd need more files of metadata, e.g.
the discussion of story #X starts with message #Y.

(Note that this is basically how Metakit works: it mmaps a region of memory
and copies data around, providing a table-like API and letting you add and
remove columns easily. It might be easier to use Metakit than to reinvent a
similar system from scratch. Anyone know if this is also how SQLite works?)

Maybe threading would be a problem with fixed-length metadata records.
The records stay fixed-length if you store a pointer in each message
to its parent, but to display a message thread you really want to
chase pointers in the opposite direction, from message to children. A
message can have an arbitrary number of children, though, so you can't
store such pointers and still have fixed-length records.

In my project, discussions haven't been implemented yet, so I have no
figures to present.

--amk
 
Paul Rubin

A.M. Kuchling said:
> * Some of that slowness may be latency on the client side, not the
> server. A heavily table-based layout may require that the client get
> most or all of the HTML before rendering it. Slashdot's HTML is a
> nightmare of tables; some weblogs have CSS-based designs that are
> much lighter-weight.

Yeah, I'm specifically concerned about how servers get overloaded.

> * To build the top page, Slashdot requires a lot of SQL queries.
> There's the list of stories itself, but there are also lists of
> subsections (Apache, Apple, ...), lists of stories in some
> subsections (YRO, Book reviews, older stories), icons for the
> recent stories, etc. All of these may need an SQL query, or at
> least a lookup in some kind of cache.

That's what I'm saying--in a decently designed system, a common
operation like a front page load should take at most one SQL query, to
get the user preferences. The rest should be in memory. The user's
preferences can also be cached in memory so they'd be available with
no SQL queries if the user has connected recently. On Slashdot, an
LRU cache of preferences for 10,000 or so users would probably
eliminate at least half those lookups, since those are the ones who
keep hitting reload.
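
As a sketch, the kind of bounded cache I mean (fetch_prefs stands in
for the real SQL lookup; a production version would track recency with
a linked list instead of this O(n) eviction scan):

    class PrefCache:
        def __init__(self, maxsize=10000):
            self.maxsize = maxsize
            self.entries = {}   # userid -> [prefs, last-use tick]
            self.clock = 0

        def get(self, userid):
            self.clock += 1
            if userid not in self.entries:
                if len(self.entries) >= self.maxsize:
                    # evict the least recently used user
                    victim = min(self.entries,
                                 key=lambda u: self.entries[u][1])
                    del self.entries[victim]
                self.entries[userid] = [fetch_prefs(userid), self.clock]
            else:
                self.entries[userid][1] = self.clock
            return self.entries[userid][0]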

> It also displays counts of posts to each story (206 of 319 comments),
> but I don't think it's doing queries for these numbers; instead there
> are extra columns in various SQL tables that cache this information
> and get updated somewhere else.

That stuff would just be in memory.

> * I suspect the complicated moderation features chew up a lot of
> time. You take +1 or -1 votes from people, and then have to look up
> information about the person, and then look at how people assess this
> person's moderation... It's not doing this on every hit, but the
> feature probably has *some* cost.

Nah, that's insignificant. Of 10k messages a day, maybe 1/3 get
moderated at all, but some get moderated more than once, so maybe
there's 5k moderations a day. Each is just an update to the metadata
for the article, cost close to zero.

> * There are lots of anti-abuse features, because Slashdot takes a lot
> of punishment from bozos. Perhaps the daily traffic is 10,000 messages
> that get displayed plus another 10,000 messages that need to be
> filtered out but consume database space nonetheless.

I think 10,000 total is on the high side, just adding up the number of
comments on a typical day.

> * Slashcode actually implements a pretty generic web application
> system that runs various templates and stitches together the output.

Yeah, I think it's doing way too much abstraction, too much SQL, etc.

> I like this approach, though I think you'd need more files of
> metadata, e.g. the discussion of story #X starts with message #Y.

Basically, a story would just be a special type of message, indicated
by some field in its metadata. Another field for each message would
say where the next message in that story was (or a special marker if
it's the last message). So the messages would be in a linked list.
You'd remember in memory where the last message of each story is, so
you could append easily. You'd also make an SQL record for each story
so you can find the stories again on server restart.
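
A sketch of that bookkeeping (the record class and field names are made
up for illustration; in the real thing they'd be fields in the mmap'd
metadata file):

    END = -1                 # the "special marker" for the last message

    class MetaRecord:
        def __init__(self, parent):
            self.parent = parent         # message this one replies to
            self.next_in_story = END     # chronologically next in story

    records = []             # stands in for the mmap'd metadata file
    last_message = {}        # story id -> index of its last message

    def append_message(story, parent):
        n = len(records)
        records.append(MetaRecord(parent))
        tail = last_message.get(story)
        if tail is not None:
            records[tail].next_in_story = n    # patch the old tail
        last_message[story] = n
        return n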

> (Note that this is basically how Metakit works: it mmaps a region of
> memory and copies data around, providing a table-like API and letting
> you add and remove columns easily. It might be easier to use
> Metakit than to reinvent a similar system from scratch. Anyone know
> if this is also how SQLite works?)

Wow cool, I should look at Metakit.

> Maybe threading would be a problem with fixed-length metadata
> records. The records stay fixed-length if you store a pointer in each
> message to its parent, but to display a message thread you really
> want to chase pointers in the opposite direction, from message to
> children. A message can have an arbitrary number of children, though,
> so you can't store such pointers and still have fixed-length records.

Each metadata record would have a pointer to its parent, and another
pointer to the chronologically next record in that story. So you'd
read in a story by scanning down the linear chronological list, using
the parent pointers to build tree structure in memory as you go. If
you cache a few dozen of these trees, you shouldn't have to do those
scans very often (you'd do one if a user visits a very old story whose
tree is not in cache).
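
Continuing the illustrative sketch from above, rebuilding one story's
tree is a single pass down the chain:

    def build_tree(first):
        # walk the story's chronological chain once, hanging each
        # message under its parent; cache the resulting dict per story
        children = {}        # message index -> list of reply indexes
        n = first
        while n != END:
            children.setdefault(records[n].parent, []).append(n)
            n = records[n].next_in_story
        return children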
 
Lothar Scholz

Paul Rubin said:
> Lots of times, when a blog (that supports user comments) or a web BBS
> gets heavily loaded, it slows down horribly, even when it's hosted at
> an ISP on a fast computer with plenty of net bandwidth. I'm wondering
> what those programs are doing that makes them bog down so badly.
> Anyone know what the main bottlenecks are? I'm just imagining them
> doing a bunch of really dumb things.

The problem is that most of them don't use an application server, and
most of them use their databases in a very unoptimized way; for
example, phpBB runs about 50 SQL queries per page for medium-size
thread views.

The site www.asp-shareware.org has a nice framework that lets you view
the messages on an NNTP server. It's amazingly fast, because everything
is held in memory. Optimizing this by a factor of 10 is not difficult.

The problem may come with Slashdot or, for example, the German website
"www.heise.de", which gets up to 100 requests per second (it's among
the top ten in Germany). They also use an NNTP backend. For sites like
this you need multiple servers, and then it's not so easy to maintain
cache coherence. But there are only very few websites that have this
much traffic.

At the moment there seems to be absolutely no evolution in the area of
BBSes. phpBB has set a standard, and it seems that nobody is able to
think about more or different functionality. Maybe if more and more
people get virtual servers and have more control over what they can
install, then it may get better. We shall see.
 
Aahz

Paul Rubin said:
> Lots of times, when a blog (that supports user comments) or a web BBS
> gets heavily loaded, it slows down horribly, even when it's hosted at
> an ISP on a fast computer with plenty of net bandwidth. I'm wondering
> what those programs are doing that makes them bog down so badly.
> Anyone know what the main bottlenecks are? I'm just imagining them
> doing a bunch of really dumb things.

Much as I'm loath to mention it, you might also want to take a look
at LiveJournal to see how they do it.
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
 
Max M

Paul said:
> Lots of times, when a blog (that supports user comments) or a web BBS
> gets heavily loaded, it slows down horribly, even when it's hosted at
> an ISP on a fast computer with plenty of net bandwidth. I'm wondering
> what those programs are doing that makes them bog down so badly.
> Anyone know what the main bottlenecks are? I'm just imagining them
> doing a bunch of really dumb things.


Lots of SQL queries are one culprit.

Bad programming practices are another, with code showing quadratic
behaviour.

But the biggest problem is without a doubt a lack of caching.

A subproblem here is that the sites are designed without caching
considerations in mind.

A Pentium 133 can easily fill up a 10 Mbit line. That is a lot of
requests.

So if you have a system that can generate static pages that Apache can
serve, you can have a really fast system.


regards Max m
 
David Steuber

Paul Rubin said:
> 1) It would be a single-threaded web server (asyncore, twistedmatrix)
> with a select loop talking to a socket, either on port 80 directly or
> to a proxy web server running mod_gzip, SSL, and so forth.

Is this how you would handle concurrency?

Also, how bad is using files really? A decent Unix will use memory
to cache files, abstracting disk I/O behind the scenes for you.
 
Paul Rubin

David Steuber said:
> Is this how you would handle concurrency?

Yes, it's a BBS, not a data mining system. There should be few if any
long-running transactions. Maybe some occasional complicated requests
would get spun off to a new thread or even just forked off. The
server would run normal requests to completion before going on to the
next request.

> Also, how bad is using files really? A decent Unix will use memory
> to cache files, abstracting disk I/O behind the scenes for you.

Maybe. Using mmap does avoid a lot of system calls/context switches.
 
Paul Rubin

Aahz said:
> Much as I'm loath to mention it, you might also want to take a look
> at LiveJournal to see how they do it.

But LiveJournal is another one of those sites that slows to a crawl
when loaded. I want to know how the FAST sites do it.
 
Paul Rubin

Max M said:
> But the biggest problem is without a doubt a lack of caching.
>
> A subproblem here is that the sites are designed without caching
> considerations in mind.
>
> A Pentium 133 can easily fill up a 10 Mbit line. That is a lot of
> requests.
>
> So if you have a system that can generate static pages that Apache
> can serve, you can have a really fast system.

I'd like to think that a well designed system can serve dynamic pages
almost as fast as static ones. If the page says "Hi <username>" at
the top, all it really needs to be doing is concatenating a few
in-memory strings together before serving the page.

The one way static might win is if mod_gzip is a big CPU load. You
could cache pages in compressed form. However, even that may not
matter much. I believe the gzip format lets you concatenate compressed
strings or files together and still have them uncompress properly.
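
That belief checks out: RFC 1952 allows a gzip stream to hold several
members, so cached compressed fragments can be glued together. A quick
check with the modern gzip convenience functions:

    import gzip

    part1 = gzip.compress(b"Hi <username>\n")        # cached per-user bit
    part2 = gzip.compress(b"...rest of the page...") # cached static bit
    page = part1 + part2                             # served as one response
    assert gzip.decompress(page) == b"Hi <username>\n...rest of the page..."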
 
Paul Rubin

Lothar Scholz said:
> The problem is that most of them don't use an application server, and
> most of them use their databases in a very unoptimized way; for
> example, phpBB runs about 50 SQL queries per page for medium-size
> thread views.

Ugh! You mean 50 separate queries, rather than getting a cursor that
you read from 50 times? I think that could be improved a lot!

> The site www.asp-shareware.org has a nice framework that lets you
> view the messages on an NNTP server. It's amazingly fast, because
> everything is held in memory. Optimizing this by a factor of 10 is
> not difficult.

Is source code available? I took a quick look there and didn't see
anything about it. Is the idea of NNTP that you leverage off the
existing NNTP servers that are written in C and already designed to
handle very heavy traffic?

> The problem may come with Slashdot or, for example, the German
> website "www.heise.de", which gets up to 100 requests per second
> (it's among the top ten in Germany). They also use an NNTP backend.
> For sites like this you need multiple servers, and then it's not so
> easy to maintain cache coherence. But there are only very few
> websites that have this much traffic.

Are multiple servers really needed? Is a multi-CPU box (4-way Athlon
MP or whatever) with shared memory not enough? 100 requests per second
doesn't sound like all THAT much.

For multiple servers, I've had the idea for a while of using a
Beowulf-type architecture, i.e. using fast LAN connections and special
kernel modules (I think that means message passing directly over
Ethernet or Myrinet, bypassing the TCP/IP stack) between the servers
to get the next best thing to shared memory. Another way is to just
process all updates (new posts) through a single server. Every few
seconds, the other servers would poll the update server (through a
fast LAN) for new posts to cache. I think it would be hard to have new
posts coming in fast enough to swamp a server. Most requests will be
reads, not new posts.
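
A sketch of that polling loop as each read-only server would run it
(fetch_since and cache_message are hypothetical stand-ins for the LAN
RPC and the local cache):

    import time

    highest_seen = 0      # index of the newest message this server has

    def replicate_forever():
        global highest_seen
        while True:
            # hypothetical RPC to the single update server
            for n, msg in fetch_since(highest_seen):
                cache_message(n, msg)      # hypothetical cache insert
                highest_seen = max(highest_seen, n)
            time.sleep(2)                  # "every few seconds"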

> At the moment there seems to be absolutely no evolution in the area
> of BBSes. phpBB has set a standard, and it seems that nobody is able
> to think about more or different functionality. Maybe if more and
> more people get virtual servers and have more control over what they
> can install, then it may get better. We shall see.

I think there's new development happening in blogging software, which
is really pretty similar to BBS software, in terms of supporting
comment threads, user diaries, etc. I'm thinking of Scoop and so forth.
 
Aahz

Paul Rubin said:
> But LiveJournal is another one of those sites that slows to a crawl
> when loaded. I want to know how the FAST sites do it.

Considering just how loaded it needs to be before slowing down, and
considering how much processing LJ needs to do to display each page, I'd
have to say that it's pretty damn fast and you can probably learn a lot
from its design.
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
 
Paul Rubin

Aahz said:
> Considering just how loaded it needs to be before slowing down, and
> considering how much processing LJ needs to do to display each page,
> I'd have to say that it's pretty damn fast and you can probably learn
> a lot from its design.

OK, I looked at a few typical user pages. I might look at the code
sometime (I think it's in Perl though, yecch). But I don't understand
about needing to do a lot of processing for each page. The ones I
looked at looked like they could have been served static entirely from
cache, updated only when someone actually posts a new entry or comment.
Maybe it's doing that, which would certainly make it fast.
 
Josiah Carlson

Paul Rubin said:
> OK, I looked at a few typical user pages. I might look at the code
> sometime (I think it's in Perl though, yecch). But I don't understand
> about needing to do a lot of processing for each page. The ones I
> looked at looked like they could have been served static entirely
> from cache, updated only when someone actually posts a new entry or
> comment. Maybe it's doing that, which would certainly make it fast.

Quoting the latest news post (http://www.livejournal.com/users/news/)
"We now host over 2,000,000 users, roughly half of which are active in
some way."

'Active in some way' means those who visit and check their friends
page, those who post updates to their own LiveJournal, or those who
post replies to entries/replies on LiveJournals -- I believe within
the last month.

30k visitors/day, many doing more than one thing when they visit (check
their friends page, post some replies, update their own livejournal,
etc.), everyone hitting different parts of the database, many causing
database updates (without updates, livejournal is worthless), etc.

No offense, but I'd love to see you write a BBS/Blogging software that
does that - in any language.

- Josiah
 
Paul Rubin

Josiah Carlson said:
> Quoting the latest news post (http://www.livejournal.com/users/news/)
> "We now host over 2,000,000 users, roughly half of which are active
> in some way."
>
> 'Active in some way' means those who visit and check their friends
> page, those who post updates to their own LiveJournal, or those who
> post replies to entries/replies on LiveJournals -- I believe within
> the last month.

OK, "active in some way" can mean read-only users who neer do anything
that causes any DB updates. It can also mean someone who posts a
comment once a month. Also, that doesn't say how much hardware they
use.

> 30k visitors/day,

Another informative number (about 1/10th as many visitors/day as
Slashdot gets).

> many doing more than one thing when they visit (check their friends
> page, post some replies, update their own livejournal, etc.),
> everyone hitting different parts of the database, many causing
> database updates (without updates, livejournal is worthless), etc.

OK, I'm not familiar with all LJ features, but so far this doesn't sound
too bad.

> No offense, but I'd love to see you write a BBS/Blogging software
> that does that - in any language.

I have some curiosity about how to go about that (hence this thread)
but at the moment I don't have powerful enough motivation to actually
want to do the work.
 
Max M

Paul Rubin said:
> I'd like to think that a well designed system can serve dynamic pages
> almost as fast as static ones. If the page says "Hi <username>" at
> the top, all it really needs to be doing is concatenating a few
> in-memory strings together before serving the page.

Yes. Or it could be even faster if "Hi <username>" was one of two GIFs
that could be loaded, depending on whether you are logged on or not:

"Hi <username>" or "Log On"

Or the "personalised" information could be presented in an iframe, or
by a server-side include.

regards Max M
 
Josiah Carlson

OK, "active in some way" can mean read-only users who neer do anything
that causes any DB updates. It can also mean someone who posts a
comment once a month. Also, that doesn't say how much hardware they
use.

Hrm... I'll actually check the page:
http://www.livejournal.com/stats.bml

As of right now, it says that > 217,000 users have updated their
LiveJournal in the last 24 hours. That is 217,000 database writes in
the last 24 hours. Nothing to shake a stick at.

However, most every post has the option of people commenting on it.
Those stats don't count comments, which all require a DB write. I would
be willing to bet that the number of replies to posts exceeds the number
of posts, but replies don't seem to count.

Checking on my own counts, I've posted 553 entries, received 1,110
comments to my posts, and posted 1,767 replies to other entries. Now,
if we assume I'm a lower bound, then for every entry in someone's
livejournal, there are at least two comments, which means that there
were an additional ~400,000 database writes that involved comments.

> Another informative number (about 1/10th as many visitors/day as
> Slashdot gets).

Or really, from my above numbers, > 200,000 visitors/day.

> OK, I'm not familiar with all LJ features, but so far this doesn't
> sound too bad.

With the new numbers, perhaps you'll change your mind ;).

> I have some curiosity about how to go about that (hence this thread)
> but at the moment I don't have powerful enough motivation to actually
> want to do the work.

Yeah. The trick is that most anything that is really intellectually
stimulating is difficult to do. Solving the dynamic-page-generation
problem can be intellectually stimulating, but it is not easy. Add in
the page formatting with templates, the database load, the bandwidth
problem, etc., and it starts getting unwieldy very quickly.

- Josiah
 
