scalability and manageability

Arne Vajhøj

I agree with you, Arne.


"Scalability" is usually understood as the ability to get (reasonably)
close to n times performance increase of a component by increasing its
resources by n. More generally, it's the ability to adjust the
performance of a software system by adjusting resources without changing
code. There is no concept of "high-end" or "low-end" scalability that
I've ever encountered. There's only "scalability".

We got an explanation of what was meant.

If we say that:

f = bang/buck

then scalability is f'=1.

The "high-end" "low-end" problems is if f' does
not converge to the same value from both sides.

Not a very realistic scenario in my opinion.
If a system is scalable, it can be adjusted readily for high or low
workloads and achieve its performance goals.

That is a completely different use of "high" and "low".
The number of tiers in the system is no determinant of scalability, nor
of manageability, at any level.

Oh - it is.

It is a necessary requirement for, but not sufficient to guarantee,
scalability.
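
To put the bang/buck idea in concrete terms, here is a minimal sketch
(Java; the node counts and throughput figures are invented) of what
comparing f' from both sides amounts to:

// Minimal sketch: compare marginal bang/buck when scaling up vs. scaling down.
// The node counts and throughput figures below are invented for illustration.
public class ScalingCheck {
    public static void main(String[] args) {
        int[] nodes   = { 2, 4, 8 };            // resource levels (e.g. cluster nodes)
        double[] tput = { 900, 1800, 3500 };    // measured throughput in requests/s

        // marginal gain per extra node when going up from 4 to 8 nodes ...
        double fPrimeUp   = (tput[2] - tput[1]) / (nodes[2] - nodes[1]);
        // ... and per removed node when going down from 4 to 2 nodes
        double fPrimeDown = (tput[1] - tput[0]) / (nodes[1] - nodes[0]);

        System.out.printf("f' upwards:   %.1f req/s per node%n", fPrimeUp);
        System.out.printf("f' downwards: %.1f req/s per node%n", fPrimeDown);
        // If the two values differ markedly, the system behaves differently at
        // the "high end" than at the "low end" of its resource range.
    }
}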

Arne
 
Daniele Futtorovic

I agree with you, Arne.


"Scalability" is usually understood as the ability to get (reasonably)
close to n times performance increase of a component by increasing its
resources by n. More generally, it's the ability to adjust the
performance of a software system by adjusting resources without changing
code. There is no concept of "high-end" or "low-end" scalability that
I've ever encountered. There's only "scalability".

If a system is scalable, it can be adjusted readily for high or low
workloads and achieve its performance goals.

The number of tiers in the system is no determinant of scalability, nor
of manageability, at any level.

Thanks for shedding some light, Arne and Lew.

So scalability is commonly used for measurements of greater N, and not
of lower. But wouldn't the study of the latter be relevant, too?

Let's take the following simple and perhaps somewhat contrived example.
Assume a component in a piece of software that serves some function
whose memory usage is linear and that, due to considerations such as, say,
the minimum load it expects to encounter, allocates itself a
corresponding amount of memory. Now put that component in a production
environment in which it only ever sees at most half the load it had
expected. Clearly, half of the memory it is hogging would be wasted. As
such, this component does not scale for lower N: as soon as it reaches
the minimum load it expects, it ceases to scale.
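
A toy sketch of such a component (the class name and the numbers are made
up, purely to illustrate the waste) might look like this:

import java.util.ArrayList;
import java.util.List;

// Toy example: a component that pre-allocates for an expected minimum load.
// The capacity is fixed from a design-time assumption, not from actual demand.
class PreSizedCache {
    private static final int EXPECTED_MIN_LOAD = 100_000;   // design-time assumption
    private final List<byte[]> slots = new ArrayList<>(EXPECTED_MIN_LOAD);

    PreSizedCache() {
        for (int i = 0; i < EXPECTED_MIN_LOAD; i++) {
            slots.add(new byte[1024]);                       // ~100 MB claimed up front
        }
    }

    byte[] slot(int i) {
        return slots.get(i);
    }
}
// If production traffic only ever touches 50,000 slots, roughly half of the
// memory is held for nothing: the component does not scale downwards.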

There's a similar and more realistic example with complexity. Greater
complexity of a piece of software is generally expressed through more
code, more data structures, more classes. Put a complex system in a
situation where only a fraction of its capabilities are used, and if the
capabilities are not lazily loaded, you've got waste. This would suggest
that the more features something has, the lesser its low-end scalability
is (I'll still use that term, for want of a better one).
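
A common way to avoid that waste is lazy initialisation, where a feature's
data structures are only built the first time the feature is actually used.
A minimal sketch (the feature and the allocation size are made up):

// Minimal lazy-initialisation sketch (hypothetical feature): the expensive
// structure is only built the first time the feature is actually used, so a
// deployment that never touches it pays (nearly) nothing for it.
class ReportingModule {
    private int[] statsTable;                 // stays null if reporting is never used

    synchronized int[] statsTable() {
        if (statsTable == null) {
            statsTable = new int[10_000_000]; // expensive allocation, invented size
        }
        return statsTable;
    }
}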

Lastly, so as not to be speaking only about software, imagine a piece of
hardware that, being designed as a workhorse, consumes a certain minimal
amount of electricity, but gets put in an environment where it only sees
sporadic use.

In all these cases, you could say: well, you've got the wrong tool for
the job. And that is certainly right. However, standardisation is
effective. In a capitalist system, out of two producers of the same IT
good, the one whose product would match the need of the most consumers
would win over the other -- all other things being equal. One way for
the product to match many needs would be to be able both to increase
and to decrease its scale smoothly. Consequently, we as the people
who are designing, building or using these things are likely to be
confronted with this issue.

df.
 
Lew

Daniele said:
Thanks for shedding some light, Arne and Lew.

So scalability is commonly used for measurements of greater N, and not of
lower. But wouldn't the study of the latter be relevant, too?

How do you get that from what we said?

We never said anything about an "N". If by "N" you mean the size of the
workload, you quoted my post:
"If a system is scalable, it can be adjusted readily for high or low workloads
and achieve its performance goals."

Doesn't "high or low" address your "of lower [N]" comment?

Anyway, this whole concept of "high-end/low-end" or "greater/lower 'N'" is
not what scalability is. Scalability is the ability to control throughput
levels through the allocation of resources. Control means adjustment for
various circumstances, not just whatever the heck you mean by "greater N".
 
Arne Vajhøj

Thanks for shedding some light, Arne and Lew.

So scalability is commonly used for measurements of greater N, and not
of lower.

Not really.

In theory for both bigger and smaller.

But:
1) the expectation is that for small increases or small
decreases the impact would be the same. If f=bang/buck,
then we assume that f' is continuous.
2) the financial importance of scalability for increases
is much greater than for decreases.
> But wouldn't the study of the latter be relevant, too?

See arguments above.
Let's take the following simple and perhaps somewhat contrived example.
Assume a component in a piece of software that serves some function
whose memory usage is linear and that, due to considerations as to, say,
the minimum load it expects to encounter, allocates itself a
corresponding amount of memory. Now put that component in a production
environment in which it only ever sees at most half the load it had
expected. Clearly, half of the memory it is hogging would be wasted. As
such, this component does not scale for lower N: as soon as it reaches
the minimum load it expects, it ceases to scale.

But given that the hardware is already paid for, no one really cares.

Arne
 
Martin Gregorie

Let's take the following simple and perhaps somewhat contrived example.

'Contrived' is the right term for it.

If any process has a fixed, load-dependent memory requirement, it's badly
designed if that requirement isn't configurable.

However, in practice I've never seen this requirement: memory is
usually dynamically assigned and will typically vary with the amount of
data that a program is storing at any given time. For instance:

- if the process is doing lookups on a large hash table or binary
tree the storage requirement is determined by the number of items
being scanned. That has nothing to do with throughput: it is
entirely determined by the number of items in the database and their
representation in memory.

- if the memory is used to store a queue then it's limited by the
maximum queue size. This must be configurable.

The only valid exception is when queued items have variable sizes:
in this case the memory should be claimed when a queued item is
created and released when it is destroyed. The maximum size of this
pool should be dynamic and configurable. The maximum size of the
queue should be limited by both the number of item slots in the queue
and by the number of queued items that will fit into the queue's memory
pool: reaching either limit will cause the queue to report 'queue full'
to any process attempting to add another item.

Queues must have a maximum size if the application is to be well-
behaved. This way back-pressure is exerted to limit upstream throughput
to what the slowest component can handle during a momentary spike. If
there is no such mechanism the application can overfill memory and
crash.
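
For what it's worth, java.util.concurrent.ArrayBlockingQueue gives exactly
this behaviour out of the box: a fixed capacity set at construction, offer()
reporting "queue full", and put() blocking the producer so back-pressure
propagates upstream. A small sketch (the capacity of 1000 is an invented
figure; it would normally come from configuration):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of a bounded queue exerting back-pressure. The capacity would
// normally come from configuration; 1000 is just an invented value.
public class BoundedStage {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1000);

    // Non-blocking variant: the caller is told the queue is full and must back off.
    boolean tryEnqueue(String item) {
        return queue.offer(item);             // false == "queue full"
    }

    // Blocking variant: the producer simply waits, so pressure propagates
    // upstream until the slowest downstream component catches up.
    void enqueue(String item) throws InterruptedException {
        queue.put(item);
    }

    String dequeue() throws InterruptedException {
        return queue.take();
    }
}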

As I've said before, a well-designed application must have such feedback
mechanisms, enough instrumentation to allow its performance to be
monitored, and sufficient configurability to match its throughput and
memory requirements to the hardware capabilities. Omission of any one of
these three elements counts as a failure by its designers.
 
Arne Vajhøj

If any process has a fixed, load-dependent memory requirement, it's badly
designed if that requirement isn't configurable.

However, in practice I've never seen this requirement: memory is
usually dynamically assigned and will typically vary with the amount of
data that a program is storing at any given time.

What the app is using for user data is dynamic and proportional
to problem size.

The memory used by the JVM, OS etc. tends to be more fixed.

This can make the complete system look not so downwards scalable.

Arne
 
Martin Gregorie

What the app is using for user data is dynamic and proportional to
problem size.

The memory used by the JVM, OS etc. tends to be more fixed.

This can make the complete system look not so downwards scalable.

Once you exceed the JVM defaults, memory usage is fairly dynamic - or was
back in the days of Java 5. The garbage collector will shrink the JVM's
memory allocation whenever it can. Back then I was processing a fair
number of e-mails (40,000), some quite large, by reading them into
JavaMail from mbox files, parsing them to extract the date, subject,
senders, recipients and plain text and then inserting the lot into a
Postgres database. I had to set -Xmx to 450 MB to handle some of the
larger messages, but memory use was quite dynamic, hovering around 250 MB
for much of the run and only getting up to the maximum for a short while
before dropping back down again.
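
That kind of behaviour is easy to watch from inside the program itself; a
minimal sketch (the class name and sampling interval are made up, run it
with whatever -Xmx you like):

// Minimal sketch for watching how much heap the JVM currently holds versus
// what -Xmx allows. Class name and sampling interval are invented.
public class HeapWatch {
    public static void main(String[] args) throws InterruptedException {
        Runtime rt = Runtime.getRuntime();
        while (true) {
            long used  = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
            long total = rt.totalMemory() / (1024 * 1024);
            long max   = rt.maxMemory()   / (1024 * 1024);   // roughly the -Xmx value
            System.out.printf("used %d MB, committed %d MB, max %d MB%n",
                              used, total, max);
            Thread.sleep(5000);
        }
    }
}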

I was primarily attempting to make the point that to a first
approximation the OP is on entirely the wrong track by assuming that
anything other than CPU usage is dependent on throughput. Unless, of
course, you're increasing throughput by configuring multiple parallel
copies of some processes.
 
Arne Vajhøj

Once you exceed the JVM defaults, memory usage is fairly dynamic - or was
back in the days of Java 5. The garbage collector will shrink the JVM's
memory allocation whenever it can. Back then I was processing a fair
number of e-mails (40,000), some quite large, by reading them into
JavaMail from mbox files, parsing them to extract the date, subject,
senders, recipients and plain text and then inserting the lot into a
Postgres database. I had to set -Xmx to 450 MB to handle some of the
larger messages, but memory use was quite dynamic, hovering around 250 MB
for much of the run and only getting up to the maximum for a short while
before dropping back down again.

-Xmx is app not the JVM itself.

I would not expect the JVM itself to be very dynamic.

Arne
 
Lew

Arne said:
-Xmx is app not the JVM itself.

I would not expect the JVM itself to be very dynamic.

/Au contraire/, that is the JVM. It's a parameter to the "java" command,
which instantiates and runs the JVM.

It is, as I suspect you meant to say, only a parameter that affects the heap.
Other parameters affect other areas of memory.
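
That split is visible through the standard management API; a short sketch
using nothing beyond java.lang.management:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Shows the heap (bounded by -Xmx) separately from the JVM's non-heap areas
// (method area / metaspace, etc.), which are controlled by other flags.
public class MemoryAreas {
    public static void main(String[] args) {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap    = bean.getHeapMemoryUsage();
        MemoryUsage nonHeap = bean.getNonHeapMemoryUsage();
        System.out.println("heap:     " + heap);
        System.out.println("non-heap: " + nonHeap);
    }
}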

The notion of scalability applies usually to performance, either response time
or throughput. Even when applied to memory, there's a notion of a minimum
requirement. I don't think that has anything to do with whatever was meant by
"downward scalability".

Given that things can't happen infinitely fast, nor take less than zero
memory, it's trivially obvious that there's a boundary to speed and memory
parsimony. The point hardly seems worth the effort we're giving it.
 
Daniele Futtorovic

I was primarily attempting to make the point that to a first
approximation the OP is on entirely the wrong track by assuming that
anything other than CPU usage is dependent on throughput.

I beg to differ. Again, think about a business software system with
a plethora of functionalities, of which only a few are used. If this
system isn't carefully designed, then it is likely to hog resources
unnecessarily. This is one of the things I was getting at and which you have
addressed in another post, namely that downwards scalability could be a
design consideration (the importance of which depends on the
circumstances, of course).

No, I don't think only CPU usage is affected by this. Static and dynamic
memory usage enter into it. Complexity of installation, maintenance and
of the upgrade process enter into it. These in turn affect what kind of
hardware infrastructure gets put in place, with its own costs of
installation, maintenance and upgrade.

Arne said that if f=bang/buck, the goal would be to get f'=1. In my
experience, f is more often designed to be logarithmic. In other words,
more importance is given to upwards than to downwards scalability.
 
Martin Gregorie

I beg to differ. Again, think about a business software system with
a plethora of functionalities, of which only a few are used. If this
system isn't carefully designed, then it is likely to hog resources
unnecessarily.

You're making a different point, that disk and memory resource usage are
both affected by unused functionality, which is of course true. However
that doesn't affect my point that CPU usage is dependent entirely on the
use made of a module.

If no data is flowing through it, no CPU will be used unless it contains
a timer triggered thread. Of course, the CPU usage isn't necessarily
linear w.r.t. throughput volume. At very low input volumes on a heavily
loaded machine there may be an overhead as the process is swapped in when
input arrives and eventually gets swapped out again due to inactivity. At
busy times a misconfigured system might thrash.
 
Arne Vajhøj

/Au contraire/, that is the JVM. It's a parameter to the "java" command,
which instantiates and runs the JVM.

It is, as I suspect you meant to say, only a parameter that affects the
heap.

I said that the memory limited by -Xmx was app memory, not JVM-internal
memory.

Should be pretty obvious from the context.

If you had kept it.

The notion of scalability applies usually to performance, either
response time or throughput. Even when applied to memory, there's a
notion of a minimum requirement. I don't think that has anything to do
with whatever was meant by "downward scalability".

Given that things can't happen infinitely fast, nor take less than zero
memory, it's trivially obvious that there's a boundary to speed and
memory parsimony. The point hardly seems worth the effort we're giving it.

I don't think that applies to the point I was trying
to make.

Arne
 
