Recent Criticism about Ruby (Scalability, etc.)

Chad Perrin

2 years x 1 developer @ $70k = 58x Dell PowerEdge 860 Quad Core Xeon X3210s
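
Spelled out, with the per-box price that those figures imply (my own
back-of-the-envelope inference, not a quoted price):

  developer_cost   = 2 * 70_000            # two years at $70k
  price_per_server = developer_cost / 58.0
  puts price_per_server.round              # => 2414, roughly one such PowerEdge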

Job Security Rocks!

Oops. I should have read the responses before I posted my own cost
comparison.
 
Chad Perrin

I don't think it was a matter of not getting something working -- IIRC
CD Baby did *work* when it was in Rails. In reality, I think it was that
he didn't understand MVC, Ruby or Rails when he started the migration --
it just looked cool, so he went out and hired a Rails programmer to do it.

I'm confused. If it worked . . . why did he throw it away and redo it in
PHP?
 
Brad Phelan

M. Edward (Ed) Borasky said:
"Complex scheduling algorithm" means different things to different
people. Is it slow because the algorithm sucks or slow because it's not
written in C/C++? What kind of scheduling is it -- combinatorial?

I have never looked at the algorithm myself. It is a TDMA message
scheduling algorithm: given a set of messages on a time-partitioned bus,
with multiple transmitters and receivers, find the optimal message
schedule. It is not too different from classic M$ Project scheduling,
really, except that you have thousands of messages to schedule and many
constraints. Still, I am not sure that the wait of 2 minutes to several
hours for a schedule to complete would still occur if exactly the same
algorithm were written in C rather than Python. This is pure speculation,
however, as I've never had time to look over it myself and make a proper
judgment.
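
Just to make the shape of the problem concrete, here is a deliberately
toy Ruby sketch -- a first-fit greedy assignment, nothing like the real
tool, and every name and number below is invented:

  Message = Struct.new(:name, :slots_needed)

  # Assign each message to the earliest free slots on a time-partitioned bus.
  def greedy_schedule(messages, total_slots)
    free     = (0...total_slots).to_a
    schedule = {}
    messages.sort_by { |m| -m.slots_needed }.each do |msg|
      taken = free.shift(msg.slots_needed)
      raise "bus is full" if taken.size < msg.slots_needed
      schedule[msg.name] = taken
    end
    schedule
  end

  msgs = [Message.new("nav", 3), Message.new("status", 1), Message.new("telemetry", 2)]
  p greedy_schedule(msgs, 8)
  # => {"nav"=>[0, 1, 2], "telemetry"=>[3, 4], "status"=>[5]}

With thousands of messages and many cross-cutting constraints the search
space obviously gets much nastier than this.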

However the argument from the developers is that real customers of the
tool rarely run a schedule. Once the schedule has been fixed it is
rarely changed. Therefore the pain of waiting for 5 minutes or even a
few hours in the case of a large schedule is little pain in the overall
scheme of a project that may last many years.

The only reason it bugs me is that I am constantly running schedules to
generate test cases for some other downstream code I am running. It
therefore causes me pain in ways it would not necessarily do for a real
customer. This brings us round to the original argument I guess. Perhaps
in this case Python is suitable. It is easy to code, and easy to
analyze, debug and maintain. The slowness is not a huge factor because
the customer just doesn't require it to be "fast".

B
 
Chad Perrin

Because he was able to do it himself, and then both _read_ the code
and _rewrite_ it.
Please read http://www.zedshaw.com/essays/c2i2_hypothesis.html (the
Gadfly Festival section) and everything Chad Fowler has written on the
Big Rewrite. In this case PHP was a (rough, and not at all harshly
meant) alternative to an Excel spreadsheet with VBA macros - the issue
of ownership, I guess. Nothing critical.

How do you know that was the reason? The impression I got from the
lengthy, slashdotted explanation was that the project was unfinished
after two years, so he decided to junk it and start over in PHP.
 
Chad Perrin

He decided to write it _himself_. That's the main piece.

Yes, he decided to write it himself -- after giving up on Rails, for
reasons that, as far as I'm aware, relate to the fact that it wasn't done
in Rails after two years.
 
Lloyd Linklater

Chad said:
Yes, he decided to write it himself -- after giving up on Rails, for
reasons that, as far as I'm aware, relate to the fact that it wasn't
done in Rails after two years.

Actually, if I can be allowed to read between the lines, he went back
when his Ruby mentor left and he realized that he was not able to do it
in Ruby. He went back to what he knew when he was left on his own. He
sort of says that in the post.
 
Chad Perrin

Actually, if I can be allowed to read between the lines, he went back
when his Ruby mentor left and he realized that he was not able to do it
in Ruby. He went back to what he knew when he was left on his own. He
sort of says that in the post.

Well . . . yes, but judging by the phrasing he considers the main reason
for switching back to PHP to be that Rails didn't get the job done in two
years of development. He may well have stuck with Rails longer if the
"Ruby mentor" hadn't departed for greener pastures, but a lot of effort
seems to be spent in that piece on pointing out that he got done in a
very short time what Rails hadn't allowed in two years.

Note that I'm speaking of what the "analysis" seems to be saying, and not
my own opinions of what Rails can or cannot do. It seems ridiculous to
me that someone couldn't build a working web app in two years, regardless
of the tool used in the development effort (as I've already said).
 
Robert Klemme


Same here.
I go with Dave Thomas's verbiage "Ruby stays out of your way". That says
it all - dynamic typing, clear simple statements, endless extensibility,
and realistic scaling, all in a nutshell.


That's not scaling! (Okaaay, that's only one aspect of scaling!)

It definitively is. One aspect of Ruby that hinders scaling is the
absence of native threads IMHO. On the other hand, mechanisms are
provided for IPC (DRb for example) which are easy to use and thus may be
counted as compensating at least partially for the lack of native threading.
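
(For the curious, DRb really is only a few lines to use; the URI and the
worker object below are just invented for illustration:)

  require 'drb/drb'

  # Server process: expose a plain Ruby object over the wire.
  class Worker
    def heavy_lifting(n)
      n * n   # stand-in for real work
    end
  end

  DRb.start_service('druby://localhost:8787', Worker.new)
  DRb.thread.join

  # Client, in a separate interpreter (and hence a separate native process):
  #   require 'drb/drb'
  #   DRb.start_service
  #   worker = DRbObject.new_with_uri('druby://localhost:8787')
  #   worker.heavy_lifting(6)   # => 36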
How did your Java design itself scale? The rate you add new features -
did it go up or down over time? _That's_ scaling. If the rate doesn't
slow down, you have time to tune your code to speed it up and handle
more users...

IMHO this is not scaling (well, at least not if you follow common usage)
but extensibility or flexibility of the design, which translates into
developer efficiency. Which does not say this is superfluous or the
wrong measure, not at all. I just don't think "scalability" is the
right term here.

Kind regards

robert
 
Robert Klemme

2 years to rebuild in Rails?! How?!
Simple. You can't force an existing database structure onto a framework
that has an ORM. Doesn't work well if at all.

I think this statement, at this level of generality, is wrong. How well
an existing schema fits a particular ORM tool depends on the schema.
You can migrate the data. easy.

That also depends on the schemas involved. The complexity of
translating one data or object model into another is mainly governed by
how similar the schemas are.
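
For what it's worth, ActiveRecord does have some knobs for schemas that
don't follow its conventions; a hedged sketch (table and column names
invented, using the Rails 1.x/2.x class methods):

  class LegacyOrder < ActiveRecord::Base
    set_table_name  'TBL_ORDERS'    # legacy table name
    set_primary_key 'ORDER_NO'      # non-conventional primary key
    belongs_to :customer, :foreign_key => 'CUST_ID'
  end

Whether that is enough depends, as above, on how far the schema strays
from what the ORM expects.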

Kind regards

robert
 
Charles Oliver Nutter

Chad said:
Assuming about an 80k salary and a 2,000 dollar server, a server is worth
about 50 hours of programmer time.

I just figured I'd provide a simple starting place for comparing the cost
of software development with that of hardware upgrades.

I find this perspective puzzling. In most large datacenters, the big
cost of operation is neither the cost of the servers nor the cost of the
development time to put code on them; it's the peripheral electricity,
administration, and cooling costs once the written application must be
deployed to thousands of users.

An application that scales poorly will require more hardware. Hardware
is cheap, but power and administrative resources are not. If you need 10
servers to run a poorly-scaling language/platform versus some smaller
number of servers to run other "faster/more scalable"
languages/platforms, you're paying a continuously higher cost to keep
those servers running. Better scaling means fewer servers and lower
continuous costs.
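
A toy way to see the continuous cost difference (every figure below is
invented purely for illustration):

  POWER_AND_COOLING_PER_SERVER = 150   # dollars per month, assumption
  ADMIN_PER_SERVER             = 300   # dollars per month, assumption

  def monthly_cost(servers)
    servers * (POWER_AND_COOLING_PER_SERVER + ADMIN_PER_SERVER)
  end

  puts monthly_cost(10) - monthly_cost(3)   # => 3150 extra dollars, every month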

Even the most inexpensive and quickly-developed application's savings
will be completely overshadowed if deployment to a large datacenter
results in unreasonably high month-on-month expenses.

- Charlie
 
Phlip

2 years to rebuild in Rails?! How?!

Big companies get in this trouble when they practice "Big Requirements Up
Front". If you schedule a huge number of requirements, then try to do them
all at the same time, you make development extremely hard. This is how the
huge multi-billion dollar software project failures happen in the news.
Rails is not immune.

http://www.oreillynet.com/onlamp/blog/2007/09/big_requirements_up_front.html

The correct way to replace an old project is a process pattern called
"Strangler Fig". You ask the client what's one tiny feature to add to the
old system - what are they waiting for - and you implement it using the new
technology. You link the old system to it, and you put it online as soon as
possible.

A project that succeeds early cannot fail in 2 years. (It can be
cancelled at any time, of course, but then the money spent stays in
line with the features already deployed.)

Then you ask the client what's the next feature, and you implement this, and
use it as an excuse to convert a little more of the old system into the new
system. And if there's no reason to completely retire the old system, you
don't. You could save 1 year like that!
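
In Rack terms the "front door" for a strangler fig can be tiny. This is
only a sketch of the idea -- the path, host, and responses are all
invented:

  # config.ru
  new_feature = lambda do |env|
    [200, { 'Content-Type' => 'text/plain' }, ["the one new feature\n"]]
  end

  front_door = lambda do |env|
    if env['PATH_INFO'].start_with?('/reports')    # the strangler's first bite
      new_feature.call(env)
    else
      # everything else still lives in the old system
      [302, { 'Location' => "http://legacy.example.com#{env['PATH_INFO']}" }, []]
    end
  end

  run front_door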

Someone got ambitious, and actually believed Rails's hype, and was
over-confident.
 
M. Edward (Ed) Borasky

Charles said:
I find this perspective puzzling. In most large datacenters, the big
cost of operation is neither the cost of the servers nor the cost of the
development time to put code on them; it's the peripheral electricity,
administration, and cooling costs once the written application must be
deployed to thousands of users.

An application that scales poorly will require more hardware. Hardware
is cheap, but power and administrative resources are not. If you need 10
servers to run a poorly-scaling language/platform versus some smaller
number of servers to run other "faster/more scalable"
languages/platforms, you're paying a continuously higher cost to keep
those servers running. Better scaling means fewer servers and lower
continuous costs.

Even the most inexpensive and quickly-developed application's savings
will be completely overshadowed if deployment to a large datacenter
results in unreasonably high month-on-month expenses.

- Charlie
Thank you!! It's about time somebody put a dollar figure on the cost of
poor scalability and highlighted the nonsense of "adding servers is
cheaper than hiring programmers." They are two entirely different
economic propositions.
 
benjohn

Thank you!! It's about time somebody put a dollar figure on the cost of
poor scalability and highlighted the nonsense of "adding servers is
cheaper than hiring programmers." They are two entirely different
economic propositions.

That's true. However, very roughly, compute resource can scale about
linearly with compute requirement.

Alternatively, you can reduce the compute requirement by having a more
complex software system. However, the amount of programmer effort needed
to build and maintain a given complexity of software certainly doesn't
scale linearly with the system complexity (see The Mythical Man-Month).
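
One well-known reason for that non-linearity is the communication
overhead Brooks describes: n programmers have n(n-1)/2 pairwise
communication paths.

  # Pairwise communication paths for a team of n programmers.
  [2, 5, 10, 20].each { |n| puts "#{n} programmers -> #{n * (n - 1) / 2} paths" }
  # prints 1, 10, 45 and 190 paths respectively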

I'm also not a fan of throwing hardware at a problem as a cure-all, and
I don't like dumb "enterprise" grade solutions when there are much more
powerful alternatives. But I'm also not a fan of automatically assuming
that you need to build lots of clever and complex things, and that
existing components just won't do the job. Pragmatism seems like the best
approach :)
 
MenTaLguY

That's true. However, very roughly, compute resource can scale about
linearly with compute requirement.

What about Amdahl's law?
Alternatively, you can reduce the compute requirement by having a more
complex software system.

While it's true that very simple systems can perform badly because
they use poor algorithms and/or do not make dynamic optimizations,
more complex software generally means increased computational
requirements.

-mental
 
Chad Perrin

What about Amdahl's law?

What about it? Unless you're writing software that doesn't scale with
the hardware, more hardware means linear scaling, assuming bandwidth
upgrades. If bandwidth upgrades top out, you've got a bottleneck no
amount of hardware purchasing or programmer time will ever solve.

While it's true that very simple systems can perform badly because
they use poor algorithms and/or do not make dynamic optimizations,
more complex software generally means increased computational
requirements.

I thought "complex" was a poor choice of term here, for the most part.
It was probably meant as a stand-in for "more work at streamlining
design, combined with greater code cleverness needs to scale without
throwing hardware at the problem."
 
Brian Adkins

I find this perspective puzzling. In most large datacenters, the big
cost of operation is neither the cost of the servers nor the cost of the
development time to put code on them; it's the peripheral electricity,
administration, and cooling costs once the written application must be
deployed to thousands of users.

"Most"? If you define "large data center" as the very top echelon,
then maybe, but even then I'd like to see some data. I expect the vast
majority (all?) of readers of this ng will be involved in scenarios in
which the cost of development time far exceeds electricity or server
costs for their deployed applications.

Part of what kept me from getting involved in Ruby sooner than I did
was my erroneous view that I wanted to be using technology that would
be sufficient to run Amazon, Ebay, etc. Little did it matter that I
wasn't pursuing that type of project - analogous to the fact that
most, if not all, Hummer drivers will never encounter serious off road
or combat situations :)

I'm all for increasing the performance and scalability of Ruby, but I
think the productivity gains still outweigh the extra runtime costs
for most projects.
 
Chad Perrin

I find this perspective puzzling. In most large datacenters, the big
cost of operation is neither the cost of the servers nor the cost of the
development time to put code on them; it's the peripheral electricity,
administration, and cooling costs once the written application must be
deployed to thousands of users.

For very small operations, this is true.

An application that scales poorly will require more hardware. Hardware
is cheap, but power and administrative resources are not. If you need 10
servers to run a poorly-scaling language/platform versus some smaller
number of servers to run other "faster/more scalable"
languages/platforms, you're paying a continuously higher cost to keep
those servers running. Better scaling means fewer servers and lower
continuous costs.

Actually, when people talk about something scaling well or poorly,
they're usually talking about whether it scales linearly or requires an
ever-increasing inclusion of some resource. Something that scales very
well requires the addition of one more unit of a given resource to
achieve an increase in capability that matches up pretty much exactly
with the amount of capability per unit of resource already employed.
This is usually counted starting after an initial base resource cost.
For instance, if you have minimal electricity needs for lighting, air
conditioning, and a security system, plus your network infrastructure,
and none of that will need to be upgraded within the foreseeable future,
you start counting your electricity resource usage when you start
throwing webservers into the mix (for a somewhat simplified example). If
you simply add one more webserver to increase load handling by a static
quantity of concurrent connections, you have linear (good) scaling.
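
A sketch of that distinction, with invented numbers: a fixed per-box
capacity versus a made-up coordination overhead that eats a bit of every
additional box.

  CONNS_PER_SERVER = 500   # assumed capacity of one webserver

  def linear_capacity(servers)
    servers * CONNS_PER_SERVER
  end

  def sublinear_capacity(servers, overhead = 0.05)   # invented 5% penalty per extra box
    servers * CONNS_PER_SERVER * (1 - overhead) ** (servers - 1)
  end

  [1, 4, 16].each do |n|
    puts "#{n} servers: linear #{linear_capacity(n)}, with overhead #{sublinear_capacity(n).round}"
  end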

On the other hand, if you have a system plagued by interdependencies and
other issues that make your scaling needs non-linear, that kind of
resource cost can get *very* expensive. Obviously, some software design
needs are part of determining the linearity of your scaling
capabilities, but such needs often involve factors like choosing a
language that makes development easier, a framework that is already
well-designed for scaling, and so on. A language that compiles to
relatively high-performance binaries, or one that is compiled to bytecode
and executed by an optimizing VM, can help -- but that doesn't magically
make your software scale linearly. That's dependent upon how the
software was designed in the first place.

Throwing more programmers at the problem certainly won't result in a
system that scales linearly either. What a larger number of programmers
on a single project often does, in fact, is ensure that scaling
characteristics across the project are less consistent. You may end up
with one particular part of the overall software serving as a scaling
bottleneck because its design characteristics are sufficiently different
from the rest that it requires either a refactor or ever-increasing
resources as scaling needs get more extreme. Oh, and there's one more
thing . . .

Even the most inexpensive and quickly-developed application's savings
will be completely overshadowed if deployment to a large datacenter
results in unreasonably high month-on-month expenses.

Even the cheapest hardware and energy requirement will quickly become
astronomically expensive if you have to throw more programmers at it.
The more difficult a system is to maintain, the faster the needed
programmer resources grow. That's the key: programming resources don't
tend to scale linearly. Hardware resources, except in very poor examples
of software design, usually do.
 
Chad Perrin

It definitively is. One aspect of Ruby that hinders scaling is the
absence of native threads IMHO. On the other hand, mechanisms are
provided for IPC (DRb for example) which are easy to use and thus may be
counted as compensating at least partially for the lack of native threading.

Agreed. This is a far more useful argument against Ruby's ability to
scale than benchmarks from some website.

IMHO this is not scaling (well, at least not if you follow common usage)
but extensibility or flexibility of the design, which translates into
developer efficiency. Which does not say this is superfluous or the
wrong measure, not at all. I just don't think "scalability" is the
right term here.

It's scalability of code, but not of system load handling. There are
different possible uses of the term "scale", and I think some of us are
running up against that discrepancy. You're talking about the
scalability of an application, and Phlip is talking about the scalability
of the development project behind the app, from what I see.
 
MenTaLguY

What about it? Unless you're writing software that doesn't scale with
the hardware, more hardware means linear scaling, assuming bandwidth
upgrades. If bandwidth upgrades top out, you've got a bottleneck no
amount of hardware purchasing or programmer time will ever solve.

Amdahl's law is relevant because most software _can't_ be written to
scale entirely linearly with the hardware, because most computational
problems are limited in the amount of parallelism they admit. You may
have been fortunate enough to have been presented with a lot of
embarrassingly parallel problems to solve, but that isn't the norm.
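
For reference, the law itself: with a parallelizable fraction p and n
processors, speedup is 1 / ((1 - p) + p / n), so the serial fraction
sets a hard ceiling.

  # Amdahl's law: ideal speedup for parallel fraction p on n processors.
  def amdahl_speedup(p, n)
    1.0 / ((1 - p) + p / n.to_f)
  end

  # Even a 95%-parallel program can never beat 20x, however many boxes you add.
  [10, 100, 10_000].each { |n| puts "n=#{n}: #{amdahl_speedup(0.95, n).round(2)}x" }
  # n=10: 6.9x,  n=100: 16.81x,  n=10000: 19.96x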
I thought "complex" was a poor choice of term here, for the most part.
It was probably meant as a stand-in for "more work at streamlining
design, combined with greater code cleverness needs to scale without
throwing hardware at the problem."

No argument there, as long as it's understood that there are limits to
what can be achieved. I don't want to discourage anyone from seeking
linear scalability as an ideal, but it's not a realistic thing to
promise or assume.

-mental
 
MenTaLguY

Amdahl's law is relevant because most software _can't_ be written to
scale entirely linearly with the hardware, because most computational
problems are limited in the amount of parallelism they admit. You may
have been fortunate enough to have been presented with a lot of
embarrassingly parallel problems to solve, but that isn't the norm.

Actually, this is not entirely true. Amdahl's law only applies to
optimizing a fixed workload. I need to rethink this argument.

-mental
 
