Ruby 1.8 vs 1.9

D

David Masover

Because that's how the other applications written on the mainframe the
company bought 20, 30, 40 years ago expect their data, and the same
code *still runs*.

In other words, not _quite_ greenfield, or at least, a somewhat different
sense of greenfield.

But I guess that explains why you're on a mainframe at all. Someone put their
data there 20, 30, 40 years ago, and you need to get at that data, right?
Legacy systems like that have so much money invested in them, with
code poorly understood (not necessarily because it's *bad* code, but
because the original author retired 20 years ago),

Which implies bad code, bad documentation, or both. Yes, having the original
author available tends to make things easier, but I'm not sure I'd know what
to do with the code I wrote 1 year ago, let alone 20, unless I document the
hell out of it.
Want perpetual job security? Learn COBOL.

I considered that...

It'd have to be job security plus a large enough paycheck I could either work
very part-time, or retire in under a decade. Neither of these seems likely, so
I'd rather work with something that gives me job satisfaction, which is why
I'm doing Ruby.
 
P

Phillip Gawlowski

In other words, not _quite_ greenfield, or at least, a somewhat different
sense of greenfield.

You don't expect anyone to throw their older mainframes away, do you? ;)
But I guess that explains why you're on a mainframe at all. Someone put their
data there 20, 30, 40 years ago, and you need to get at that data, right?

Oh, don't discard mainframes. For a corporation the size of SAP (or
needing SAP software), a mainframe is still the ideal hardware to
manage the enormous databases collected over the years.

And mainframes with vector CPUs are ideal for all sorts of simulations
engineers have to do (like aerodynamics), or weather research.
Which implies bad code, bad documentation, or both. Yes, having the original
author available tends to make things easier, but I'm not sure I'd know what
to do with the code I wrote 1 year ago, let alone 20, unless I document the
hell out of it.

It gets worse 20 years down the line: the techniques used and the state of
the art back then are forgotten now. Nobody uses GOTO any more (or should
use it, anyway), and error handling is done with exceptions these days
instead of error codes. And TDD didn't even *exist* as a technique.

Together with a very, very conservative attitude, changes are
difficult to deal with, if they can be implemented at all.

Assuming the source code still exists, anyway.

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.
 
D

David Masover

You don't expect anyone to throw their older mainframes away, do you? ;)

I suppose I expected people to be developing modern Linux apps that just
happen to compile on that hardware.
Oh, don't discard mainframes. For a corporation the size of SAP (or
needing SAP software), a mainframe is still the ideal hardware to
manage the enormous databases collected over the years.

Well, now that it's been collected, sure -- migrations are painful.

But then, corporations the size of Google tend to store their information
distributed on cheap PC hardware.
And mainframes with vector CPUs are ideal for all sorts of simulations
engineers have to do (like aerodynamics), or weather research.

When you say "ideal", do you mean they actually beat out the cluster of
commodity hardware I could buy for the same price?
It gets worse 20 years down the line: the techniques used and the state of
the art back then are forgotten now. Nobody uses GOTO any more (or should
use it, anyway), and error handling is done with exceptions these days
instead of error codes. And TDD didn't even *exist* as a technique.

Together with a very, very conservative attitude, changes are
difficult to deal with, if they can be implemented at all.

Assuming the source code still exists, anyway.

All three of which suggest to me that in many cases, an actual greenfield
project would be worth it. IIRC, there was a change to the California minimum
wage that would take 6 months to implement and 9 months to revert because it
was written in COBOL -- but could the same team really write a new payroll
system in 15 months? Maybe, but doubtful.

But it's still absurdly wasteful. A rewrite would pay for itself with only a
few minor changes that'd be trivial in a sane system, but major year-long
projects with the legacy system.

So, yeah, job security. I'd just hate my job.
 
P

Phillip Gawlowski

I suppose I expected people to be developing modern Linux apps that just
happen to compile on that hardware.

Linux is usually not the OS the vendor supports. Keep in mind, a day
of lost productivity on this kind of systems means losses in the
millions of dollars area.
But then, corporations the size of Google tend to store their information
distributed on cheap PC hardware.

If they were incorporated when there was such a thing as "cheap PC
hardware". Google is a young corporation, even in IT. And they need
loads of custom code to make their search engine and datacenters
perform and scale, too.
When you say "ideal", do you mean they actually beat out the cluster of
commodity hardware I could buy for the same price?

Sure, if you can shell out for about 14 000 Xeon CPUs and 7 000 Tesla
GPGPUs (Source: http://en.wikipedia.org/wiki/Tianhe-I ).
All three of which suggest to me that in many cases, an actual greenfield
project would be worth it. IIRC, there was a change to the California minimum
wage that would take 6 months to implement and 9 months to revert because it
was written in COBOL -- but could the same team really write a new payroll
system in 15 months? Maybe, but doubtful.

So, you'd bet the corporation the size of Exxon Mobil, Johnson &
Johnson, General Electric and similar, just because you *think* it is
easier to do changes 40 years later in an unproven, unused, upstart
language?

The clocks in the sort of shops that still run mainframes tick very
differently from what you or I are used to.
But it's still absurdly wasteful. A rewrite would pay for itself with only a
few minor changes that'd be trivial in a sane system, but major year-long
projects with the legacy system.

If the rewrite would pay for itself in the short term, then why hasn't
it been done?

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.
 
B

Brian Candler

Robert Klemme wrote in post #963807:
Checking input and ensuring that data reaches the program in proper
ways is generally good practice for robust software.

But that's not what Ruby does!

If you do
s1 = File.open("foo","r:UTF-8").gets
it does *not* check that the data is UTF-8. It just adds a tag saying
that it is.

Then later, when you get s2 from somewhere else, and have a line like s3
= s1 + s2, it *might* raise an exception if the encodings are different.
Or it might not, depending on the actual content of the strings at that
time.

Say s2 is a string read from a template. It may work just fine, as long
as s2 contains only ASCII characters. But later, when you decide to
translate the program and add some non-ASCII characters into the
template, it may blow up.

If it blew up on the invalid data, I'd accept that. If it blew up
whenever two strings of different encodings encounter, I'd accept that.
But to have your program work through sheer chance, only to blow up some
time later when it encounters a different input stream - no, that sucks.
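
To make that concrete, here's a minimal sketch (hypothetical strings, Ruby
1.9): concatenation only raises once *both* operands actually contain
non-ASCII data.

  utf8   = "caf\u00e9"                            # UTF-8, non-ASCII content
  ascii  = "abc".force_encoding("ISO-8859-1")     # different encoding, but ASCII-only
  utf8 + ascii        # works -- the ISO-8859-1 string is ASCII-compatible

  latin1 = "gr\xFC\xDFe".force_encoding("ISO-8859-1")  # non-ASCII Latin-1 bytes
  utf8 + latin1       # raises Encoding::CompatibilityError -- only now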

In that case, I would much rather the program didn't crash, but at least
carried on working (even in the garbage-in-garbage-out sense).
Brian, it seems you want to avoid the complex matter of i18n - by
ignoring it. But if you work in a situation where multiple encodings
are mixed you will be forced to deal with it - sooner or later.

But you're never going to want to combine two strings of different
encodings without transcoding them to a common encoding, as that
wouldn't make sense.

So either:

1. Your program deals with the same encoding from input through to
output, in which case there's nothing to do

2. You transcode at the edges into and out of your desired common
encoding (see the sketch below)

Neither approach requires each individual string to carry its encoding
along with it.
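
As a minimal sketch of option 2 (hypothetical file names, Ruby 1.9): declare
the external encoding when reading, transcode once at the edge, and work in
one common encoding internally.

  input = File.open("legacy.txt", "r:ISO-8859-1") { |f| f.read }  # declared external encoding
  text  = input.encode("UTF-8")                      # transcoded once, at the edge
  # ... all internal processing happens in UTF-8 ...
  File.open("out.txt", "w:UTF-8") { |f| f << text }  # written out in the common encoding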
 
G

Garance A Drosehn

Thank you for being the voice of reason.

I've fought against Brian enough in the past over this issue, that I try
to stay out of it these days. However, his arguments always strike me as
wanting to unlearn what we have learned about encodings.
We can't go back. Different encodings exist. At least Ruby 1.9 allows
us to work with them.


My experience with 1.9 so far is that some of my ruby scripts have
become much faster. I have other scripts which have needed to deal
with a much wider range of characters than "standard ascii". I got
those string-related scripts working fine in 1.8. They all seem to
break in 1.9.

In my own opinion, the problem isn't 1.9, it's that I wrote these
string-handling scripts in ruby before ruby really supported all the
characters I had to deal with. I look forward to getting my scripts
switched over to 1.9, but there's no question that *getting* to 1.9 is
going to require a bunch of work from me. That's just the way it is.
Not the fault of ruby 1.9, but it's still some work to fix the
scripts.

--
Garance Alistair Drosehn     =     drosihn@gmail.com
Senior Systems Programmer
Rensselaer Polytechnic Institute;            Troy, NY;  USA
 
D

David Masover

Linux is usually not the OS the vendor supports. Keep in mind, a day
of lost productivity on this kind of systems means losses in the
millions of dollars area.

In other words, you need someone who will support it, and maybe someone
who'll accept that kind of risk. None of the Linux vendors are solid enough?
Or is it that they don't support mainframes?

Sure, if you can shell out for about 14 000 Xeon CPUs and 7 000 Tesla
GPGPUs (Source: http://en.wikipedia.org/wiki/Tianhe-I ).

From that page:

"Both the original Tianhe-1 and Tianhe-1A use a Linux-based operating
system... Each blade is composed of two compute nodes, with each compute node
containing two Xeon X5670 6-core processors and one Nvidia M2050 GPU
processor."

I'm not really seeing a difference in terms of hardware.

So, you'd bet the corporation

Nope, which is why I said "doubtful."
just because you *think* it is
easier to do changes 40 years later in an unproven, unused, upstart
language?

Sorry, "unproven, unused, upstart"? Which language are you talking about?

If the rewrite would pay for itself in the short term, then why hasn't
it been done?

The problem is that it doesn't. What happens is that those "few minor
changes" get written off as "too expensive", so they don't happen. Every
now and then, it's actually worth the expense to make a "drastic" change
anyway, but at that point, again, 15 months versus a greenfield rewrite --
the 15 months wins.

So it very likely does pay off in the long run -- being flexible makes good
business sense, and sooner or later, you're going to have to push another of
those 15-month changes. But it doesn't pay off in the short run, and it's
hard to predict how long it will be until it does pay off. The best you can
do is say that it's very likely to pay off someday, but modern CEOs get
rewarded in the short term, then take their pensions and let the next guy
clean up the mess, so there isn't nearly enough incentive for long-term
thinking.

And I'm not sure I could make a solid case that it'd pay for itself
eventually. I certainly couldn't do so without looking at the individual
situation. Still wasteful, but maybe not worth fixing.

Also, think about the argument you're using here. Why hasn't it been done? I
can think of a few reasons, some saner than others, but sometimes the answer
to "Why hasn't it been done?" is "Everybody was wrong." Example: "If it was
possible to give people gigabytes of email storage for free, why hasn't it
been done?" Then Gmail did, and the question became "Clearly it's possible to
give people gigabytes of email storage for free. Why isn't Hotmail doing it?"
 
P

Phillip Gawlowski

In other words, you need someone who will support it, and maybe someone who'll
accept that kind of risk. None of the Linux vendors are solid enough? Or is it
that they don't support mainframes?

Both, and the Linux variant you use has to be certified by the
hardware vendor, too. Essentially, a throwback to the UNIX
workstations of yore: if you run something uncertified, you don't get
the support you paid for in the first place.
"Both the original Tianhe-1 and Tianhe-1A use a Linux-based operating
system... Each blade is composed of two compute nodes, with each compute node
containing two Xeon X5670 6-core processors and one Nvidia M2050 GPU
processor."

I'm not really seeing a difference in terms of hardware.

We are probably talking at cross purposes here:
You *can* build a vector CPU cluster out of commodity hardware, but it
involves a) a lot of hardware and b) a lot of customization work to
get them to play well with each other (like concurrency, and avoiding
bottlenecks that lead to a hold-up in several nodes of your cluster).
Sorry, "unproven, unused, upstart"? Which language are you talking about?

Anything that isn't C, ADA or COBOL. Or even older. This is a very,
very conservative mindset, where not even Java has a chance.
So it very likely does pay off in the long run -- being flexible makes good
business sense, and sooner or later, you're going to have to push another of
those 15-month changes. But it doesn't pay off in the short run, and it's hard
to predict how long it will be until it does pay off. The best you can do is
say that it's very likely to pay off someday, but modern CEOs get rewarded in
the short term, then take their pensions and let the next guy clean up the
mess, so there isn't nearly enough incentive for long-term thinking.

Don't forget the engineering challenge. Doing the Great Rewrite for
software that's 20 years in use (or even longer), isn't something that
is done on a whim, or because this new-fangled "agile movement" is
something the programmers like.

Unless there is a very solid business case (something on the level of
"if we don't do this, we will go bankrupt in 10 days" or similarly
drastic), there is no incentive to fix what ain't broke (for certain
values of "ain't broke", anyway).
Also, think about the argument you're using here. Why hasn't it been done? I
can think of a few reasons, some saner than others, but sometimes the answer
to "Why hasn't it been done?" is "Everybody was wrong." Example: "If it was
possible to give people gigabytes of email storage for free, why hasn't it
been done?" Then Gmail did, and the question became "Clearly it's possible to
give people gigabytes of email storage for free. Why isn't Hotmail doing it?"

Google has a big incentive, and a big benefit going for it:
a) Google wants your data, so they can sell you more and better ads.
b) The per MB cost of hard drives came down *significantly* in the
last 10 years. For my external 1TB HD I paid about 50 bucks, and for
my internal 500GB 2.5" HD I paid about 50 bucks. For that kind of
money, you couldn't buy a 500 GB HD 5 years ago.

Without cheap storage, free email accounts with Gigabytes of storage
are pretty much impossible.

CUDA and GPGPUs have become available only in the last few years, and
only because GPUs have become insanely powerful and insanely cheap at
the same time.

If you were building the architecture that requires mainframes today,
I doubt anyone would buy a Cray without some very serious
considerations (power consumption, ease of maintenance, etc) in favor
of the Cray.

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.
 
D

David Masover

Both, and the Linux variant you use has to be certified by the
hardware vendor, too. Essentially, a throwback to the UNIX
workstations of yore: if you run something uncertified, you don't get
the support you paid for in the first place.

Must be some specific legacy systems, because IBM does seem to be supporting,
or at least advertising, Linux on System Z.
We are probably talking at cross purposes here:
You *can* build a vector CPU cluster out of commodity hardware, but it
involves a) a lot of hardware and b) a lot of customization work to
get them to play well with each other (like concurrency, and avoiding
bottlenecks that lead to a hold-up in several nodes of your cluster).

Probably. You originally called this a "Mainframe", and that's what confused
me -- it definitely seems to be more a cluster than a mainframe, in terms of
hardware and software.
Anything that isn't C, ADA or COBOL. Or even older.

Lisp, then?
This is a very,
very conservative mindset, where not even Java has a chance.

If age is the only consideration, Java is only older than Ruby by a few
months, depending how you count.

I'm not having a problem with it being a conservative mindset, but it seems
irrationally so. Building a mission-critical system which is not allowed to
fail out of a language like C, where an errant pointer can corrupt data in an
entirely different part of the program (let alone expose vulnerabilities),
seems much riskier than the alternatives.

About the strongest argument I can see in favor of something like C over
something like Lisp for a greenfield project is that it's what everyone knows,
it's what the schools are teaching, etc. Of course, the entire reason the
schools are teaching COBOL is that the industry demands it.
Don't forget the engineering challenge. Doing the Great Rewrite for
software that's 20 years in use (or even longer), isn't something that
is done on a whim, or because this new-fangled "agile movement" is
something the programmers like.

I'm not disputing that.
Unless there is a very solid business case (something on the level of
"if we don't do this, we will go bankrupt in 10 days" or similarly
drastic), there is no incentive to fix what ain't broke (for certain
values of "ain't broke", anyway).

This is what I'm disputing. This kind of thinking is what allows companies
like IBM to be completely blindsided by companies like Microsoft.
Google has a big incentive, and a big benefit going for it:

Which doesn't change my core point. After all:
a) Google wants your data, so they can sell you more and better ads.

What's Microsoft's incentive for running Hotmail at all? I have to imagine
it's a similar business model.
b) The per MB cost of hard drives came down *significantly* in the
last 10 years.

Yes, but Google was the first to offer this. And while it makes sense in
hindsight, when it first came out, people were astonished. No one immediately
said "Oh, this makes business sense." They were too busy rushing to figure out
how they could use this for their personal backup, since gigabytes of online
storage for free was unprecedented.

Then, relatively quickly, everyone else did the same thing, because people
were leaving Hotmail for Gmail for the storage alone, and no one wanted to be
the "10 mb free" service when everyone else was offering over a hundred times
as much.

I'm certainly not saying people should do things just because they're cool, or
because programmers like them. Clearly, there has to be a business reason. But
the fact that no one's doing it isn't a reason to assume it's a bad idea.
 
R

Robert Klemme

Whoops, my mistake. I guess now I'm confused as to why they went with UTF-16
-- I always assumed it simply truncated things which can't be represented in
16 bits.

The JLS is a bit difficult to read IMHO. Characters are 16 bit and a
single character covers the range of code points 0000 to FFFF.

http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.2.1

Characters with code points greater than FFFF are called "supplementary
characters" and while UTF-16 provides encodings for them as well, these
need two code units (four bytes). They write "The Java programming
language represents text in sequences of 16-bit code units, using the
UTF-16 encoding.":

http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#95413

IMHO this is not very precise: all calculations based on char cannot
directly represent the supplementary characters. These use just a
subset of UTF-16. If you want to work with supplementary characters
things get really awful. Then you need methods like this one

http://download.oracle.com/javase/6/docs/api/java/lang/Character.html#toChars(int)

And if you stuff this sequence into a String, all of a sudden
String.length() no longer returns the length in characters, which is
in line with what the JavaDoc states:

http://download.oracle.com/javase/6/docs/api/java/lang/String.html#length()

Unfortunately the majority of programs I have seen never takes this into
account and uses String.length() as "length in characters". This awful
mixture becomes apparent in the JavaDoc of class Character, which
explicitly states that there are two ways to deal with characters:

1. type char (no supplementary supported)
2. type int (with supplementary)

http://download.oracle.com/javase/6/docs/api/java/lang/Character.html#unicode
Wait, how?

You can convert a code point above FFFF via Character.toChars() (which
returns a char[] of length 2) and truncate it to 1. But: the resulting
sequence isn't actually invalid since all values in the range 0000 to
FFFF are valid characters. This isn't really robust. Even though the
docs say that the longest matching sequence is to be considered during
decoding, there is no reliable way to determine whether d80d dd53
represents a single character (code point 013553) or two separate
characters (code points d80d and dd53).

If you like you can play around a bit with this:
https://gist.github.com/719100
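
For what it's worth, the same surrogate pair can be seen from Ruby as well;
a quick sketch (not part of the gist above):

  ch = [0x13553].pack("U")                  # one character, UTF-8 encoded
  ch.length                                 # => 1
  ch.encode("UTF-16BE").unpack("n*").map { |u| "%04x" % u }
                                            # => ["d80d", "dd53"] -- two UTF-16 code units

A Java String holding this one character would report length() == 2.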
I mean, yes, you can deliberately build strings out of corrupt data, but if
you actually work with complete strings and string concatenation, and you
aren't doing crazy JNI stuff, and you aren't digging into the actual bits of
the string, I don't see how you can create a truncated string.

Well, you can (see above) but unfortunately it is still valid. It just
happens to represent a different sequence.

Kind regards

robert
 
P

Phillip Gawlowski

Must be some specific legacy systems, because IBM does seem to be supporting,
or at least advertising, Linux on System Z.

Oh, they do. But it's this specific Linux, and you get locked into it.
Compile the kernel yourself, and you lose support.

And, of course, IBM does that to keep their customers locked in. While
Linux is open source, it's another angle for IBM to stay in the game.
Not all that successful, considering that mainframes are pretty much a
dying breed, but it keeps this whole sector on life support.
Probably. You originally called this a "Mainframe", and that's what confused
me -- it definitely seems to be more a cluster than a mainframe, in terms of
hardware and software.

Oh, it is. You can't build a proper mainframe out of off the shelf
components, but a mainframe is a cluster of CPUs and memory, anyway,
so you can "mimic" the architecture.
Lisp, then?

If there's commercial support, then, yes. The environment LISP comes
from is the AI research at MIT, which was done on mainframes, way back
when.
If age is the only consideration, Java is only older than Ruby by a few
months, depending how you count.

It isn't. Usage on mainframes is a component, too. And perceived
stability and roadmap safety (a clear upgrade path is desired quite a
bit, I wager).

And, well, Java and Ruby are young languages, all told. Mainframes
have existed since the 1940s at the very least, and that's the perspective
that enabled "Nobody ever got fired for buying IBM [mainframes]".
I'm not having a problem with it being a conservative mindset, but it seems
irrationally so. Building a mission-critical system which is not allowed to
fail out of a language like C, where an errant pointer can corrupt data in an
entirely different part of the program (let alone expose vulnerabilities),
seems much riskier than the alternatives.

That is a problem of coding standards and practices. Another reason
why change in these sorts of systems is difficult to achieve. Now
imagine a language like Ruby that comes with things like reflection,
duck typing, and dynamic typing.
About the strongest argument I can see in favor of something like C over
something like Lisp for a greenfield project is that it's what everyone knows,
it's what the schools are teaching, etc. Of course, the entire reason the
schools are teaching COBOL is that the industry demands it.

A vicious cycle, indeed. Mind, for system level stuff C is still the
goto language, but not for anything that sits above that. At least,
IMO.
This is what I'm disputing. This kind of thinking is what allows companies
like IBM to be completely blindsided by companies like Microsoft.

Assuming that the corporation is actually an IT shop. Procter &
Gamble, or ThyssenKrupp aren't. For them, IT is supporting the actual
business, and is much more of a cost center than a way to stay
competitive.

Or do you care if the steel beams you buy by the ton, or the cleaner
you buy are produced by a company that does its ERP on a mainframe or
a beowulf cluster?
Which doesn't change my core point. After all:


What's Microsoft's incentive for running Hotmail at all? I have to imagine
it's a similar business model.

Since MS doesn't seem to have a clue, either...

Historically, MS bought Hotmail because everybody else started
offering free email accounts, and not just ISPs.

And Hotmail still smells of "me, too"-ism.
Yes, but Google was the first to offer this. And while it makes sense in
hindsight, when it first came out, people were astonished. No one immediately
said "Oh, this makes business sense." They were too busy rushing to figure out
how they could use this for their personal backup, since gigabytes of online
storage for free was unprecedented.

Absolutely. And Google managed to give possible AdWords customers
another reason to use AdSense: "Look, there's a million affluent,
tech-savvy people using our mail service, which allows us to mine the
data and to show your ads that much more effectively!"
Then, relatively quickly, everyone else did the same thing, because people
were leaving Hotmail for Gmail for the storage alone, and no one wanted to be
the "10 mb free" service when everyone else was offering over a hundred times
as much.

That, and Google was the cool kid on the block back then. Which counts
for quite a bit, too. And the market of freemail offerings was rather
stale, until GMail shook it up, and got lots of mind share really
fast.

But most people stuck with their AOL mail addresses, since they didn't
care about storage, but cared about stuff working. The technorati
quickly switched (I'm guilty as charged), but aunts, and granddads
kept their AOL, EarthLink, or Yahoo! accounts.
I'm certainly not saying people should do things just because they're cool, or
because programmers like them. Clearly, there has to be a business reason. But
the fact that no one's doing it isn't a reason to assume it's a bad idea.

Of course. But if a whole sector, a whole user base, says "Thanks, but
no thanks", it has its reasons, too. Cost is one, and the human nature
of liking stability and disliking change plays into it, as well.

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.
 
D

David Masover

Oh, it is. You can't build a proper mainframe out of off the shelf
components, but a mainframe is a cluster of CPUs and memory, anyway,
so you can "mimic" the architecture.

When I hear "mainframe", I think of a combination of hardware and software
(zOS) which you actually can't get anywhere else, short of an emulator (like
Hercules).
It isn't. Usage on mainframes is a component, too.

IBM does seem to be aggressively promoting not just Linux on mainframes, but a
Unix subsystem and support for things like Java.
And perceived
stability and roadmap safety (a clear upgrade path is desired quite a
bit, I wager).

Is there "roadmap safety" in C, though?
And, well, Java and Ruby are young languages, all told. Mainframes
exist since the 1940s at the very least, and that's the perspective
that enabled "Nobody ever got fired for buying IBM [mainframes]".

Right, that's why I mentioned Lisp. They're old enough that I'd argue the time
to be adopting is now, but I can see someone with a mainframe several times
older wanting to wait and see.
That is a problem of coding standards and practices.

There's a limit to what you can do with that, though.
Another reason
why change in these sorts of systems is difficult to achieve. Now
imagine a language like Ruby that comes with things like reflection,
duck typing, and dynamic typing.

In practice, it doesn't seem like any of these are as much of a problem as the
static-typing people fear. Am I wrong?

Given the same level of test coverage, a bug that escapes through a Ruby test
suite (particularly unit tests) might lead to something like an "undefined
method" exception from a nil -- relatively easy to track down. In Java, it
might lead to NullPointerExceptions and the like. In C, it could lead to
_anything_, including silently corrupting other parts of the program.

Technically, it's _possible_ Ruby could do anything to any other part of the
program via things like reflection -- but this is trivial to enforce. People
generally don't monkey-patch core stuff, and monkey-patching is easy to avoid,
easy to catch, and relatively easy to do safely in one place, and avoid
throughout the rest of your program.

Contrast to C -- it's not like you can avoid pointers, arrays, pointer
arithmetic, etc. And Ruby at least has encapsulation and namespacing -- I
really wouldn't want to manage a large project in C.
A vicious cycle, indeed.

I have to wonder if it would be worth it for any of these companies to start
demanding Lisp. Ah, well.
Mind, for system level stuff C is still the
goto language, but not for anything that sits above that. At least,
IMO.

For greenfield system-level stuff, I'd be seriously considering something like
Google's Go. But my opinion probably isn't worth much here, as I don't really
do system-level stuff if I can avoid it (which is almost always). If I had to,
I'd pass as much off to userland as I could get away with.
This is what I'm disputing. This kind of thinking is what allows
companies like IBM to be completely blindsided by companies like
Microsoft.

[snip]

Or do you care if the steel beams you buy by the ton, or the cleaner
you buy are produced by a company that does its ERP on a mainframe or
a beowulf cluster?

Not particularly, but I do care if someone else can sell me those beams
cheaper. Even just as a cost center, it matters how much it costs.

And who knows? Maybe someone else just implemented a feature that actually
does matter to me. Maybe they anticipate when their customers need more steel
and make them an offer then, or maybe they provide better and tighter
estimates as to when it'll be ready and how long it'll take to ship -- maybe
it's an emergency, I need several tons RIGHT NOW, and someone else manages
their inventory just a bit better, so they can get it to me days earlier.

Granted, it's a slower industry, so maybe spending years (or decades!) on
changes like the above makes sense. Maybe no one is offering or asking for the
features I've suggested -- I honestly don't know. But this is why it can
matter than one organization can implement a change in a few weeks, even a few
months, while another would take years and will likely just give up.
That, and Google was the cool kid on the block back then. Which counts
for quite a bit, too. And the market of freemail offerings was rather
stale, until GMail shook it up, and got lots of mind share really
fast.

But most people stuck with their AOL mail addresses, since they didn't
care about storage, but cared about stuff working. The technorati
quickly switched (I'm guilty as charged), but aunts, and granddads
kept their AOL, EarthLink, or Yahoo! accounts.

Most of them, for a while.

But even granddads have grandkids emailing them photos, so there goes that 10
megs. Now they have to delete stuff, possibly download it and then delete it.
A grandkid hears them complaining and suggests switching to Gmail.
 
P

Phillip Gawlowski

Is there "roadmap safety" in C, though?

Since it is, technically, a standardized language, with defined
behavior in all cases (as if), it is.

Though, considering C++0x was supposed to be finished two years ago...
There's a limit to what you can do with that, though.

One of cost. Nobody wants to spend the amount of money that NASA
spends on the source for the Space Shuttle, but that code is
guaranteed bug-free. Not sure which language is used, though I
guess it's ADA.

In practice, it doesn't seem like any of these are as much of a problem as the
static-typing people fear. Am I wrong?

Nope. But perceived risk outweighs actual risk. See also: US policy
since 2001 vis a vis terrorism.
Given the same level of test coverage, a bug that escapes through a Ruby test
suite (particularly unit tests) might lead to something like an "undefined
method" exception from a nil -- relatively easy to track down. In Java, it
might lead to NullPointerExceptions and the like. In C, it could lead to
_anything_, including silently corrupting other parts of the program.

Technically, it's _possible_ Ruby could do anything to any other part of the
program via things like reflection -- but this is trivial to enforce. People
generally don't monkey-patch core stuff, and monkey-patching is easy to avoid,
easy to catch, and relatively easy to do safely in one place, and avoid
throughout the rest of your program.

You know that, I know that, but the CTO of Johnson and Johnson
doesn't, and probably doesn't care. Together with the usual
bureaucratic infighting and processes to change *anything*, you'll be
SOL most of the time. Alas.
Contrast to C -- it's not like you can avoid pointers, arrays, pointer
arithmetic, etc. And Ruby at least has encapsulation and namespacing -- I
really wouldn't want to manage a large project in C.

Neither would I. But then again, there's a lot of knowledge for
managing large C code bases. Just look at the Linux kernel, or Windows
NT.
Not particularly, but I do care if someone else can sell me those beams
cheaper. Even just as a cost center, it matters how much it costs.

And who knows? Maybe someone else just implemented a feature that actually
does matter to me. Maybe they anticipate when their customers need more steel
and make them an offer then, or maybe they provide better and tighter
estimates as to when it'll be ready and how long it'll take to ship -- maybe
it's an emergency, I need several tons RIGHT NOW, and someone else manages
their inventory just a bit better, so they can get it to me days earlier.

Production, these days, is Just In Time. To stay with our steel
example: Long before the local county got around to nodding your
project through so that you can begin building, you already know what
components you need, and when (since *you* want to be under budget,
and on time, too), so you order 100 beams of several kinds of steel,
and your aggregates, and bribe the local customs people, long before
you actually need the hardware.

There's (possibly) prototyping, testing (few 100MW turbines can be
built in series, because demands change with every application), and
nobody keeps items like steel beams (or even cars!) in storage
anymore. ;)

Similar with just about anything that is bought in large quantities
and / or with loads of lead time (like the 787, or A380).

In a nutshell: being a day early, or even a month, doesn't pay off
enough to make it worthwhile to restructure the whole company's
production processes, just because J. Junior Developer found a way to
shave a couple of seconds off of the DB query to send off ordering
iron ore. ;)
Granted, it's a slower industry, so maybe spending years (or decades!) on
changes like the above makes sense. Maybe no one is offering or asking for the
features I've suggested -- I honestly don't know. But this is why it can
matter that one organization can implement a change in a few weeks, even a few
months, while another would take years and will likely just give up.

Since it takes *years* to build a modern production facility, it *is*
a slower industry, all around. IT is special in that it iterates through
hardware, software, and techniques much faster than the rest of the
world.

And an anecdote:
A large-ish steel works corp introduced a PLC system to monitor their
furnaces down to the centidegree Celsius, and the "recipe" down to the
gram. After a week, they deactivated the stuff, since the steel
produced wasn't up to spec, and the veteran cookers created much
better steel, and cheaper.
Most of them, for a while.

But even granddads have grandkids emailing them photos, so there goes that 10
megs. Now they have to delete stuff, possibly download it and then delete it.
A grandkid hears them complaining and suggests switching to Gmail.

Except that AOhoo! upgraded their storage. And you'd be surprised
how... stubborn non-techies can be. One reason why I don't do family
support anymore. :p

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.
 
R

Robert Klemme

Well, you can (see above) but unfortunately it is still valid. It just
happens to represent a different sequence.

After reading http://tools.ietf.org/html/rfc2781#section-2.2 I am not
sure any more whether the last statement still holds. It seems the
presented algorithm can only work reliably if certain code points are
unused. And indeed checking with
http://www.unicode.org/charts/charindex.html shows that D800 and DC00
are indeed reserved. Interestingly enough Java's
Character.isDefined() returns true for D800 and DC00:

https://gist.github.com/719100
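
(A quick Ruby comparison, for what it's worth: Ruby refuses to treat the
surrogate range as characters at all. A sketch:

  0xD800.chr(Encoding::UTF_8)   # raises RangeError -- surrogate code points
                                # cannot be expressed as UTF-8 characters
  # (A "\u{D800}" string literal is not even accepted by the parser.)

So the d800/dc00 pair really only has meaning as UTF-16 code units, not as
characters in their own right.)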

Cheers

robert


--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
D

David Masover

Nope. But perceived risk outweighs actual risk. See also: US policy
since 2001 vis a vis terrorism.

Sounds like we don't actually disagree.
You know that, I know that, but the CTO of Johnson and Johnson
doesn't,

Then why the fsck is he CTO of anything?
and probably doesn't care.

This is the part I don't get.
How do you get to be CTO by not caring about technology?
Together with the usual
bureaucratic infighting and processes to change *anything*, you'll be
SOL most of the time. Alas.

Which is, again, a point I'd hope the free market would resolve. If there's a
way to build a relatively large corporation without bureaucracy and process
crippling actual progress, you'd think that'd be a competitive advantage.
Neither would I. But then again, there's a lot of knowledge for
managing large C code bases. Just look at the Linux kernel, or Windows
NT.

In each case, there wasn't really a better option, and likely still isn't.

Still, I don't know about Windows, but on Linux, there seems to be a push to
keep the kernel as small as it can be without losing speed or functionality.
There were all sorts of interesting ideas in filesystems, but now we have
fuse, so there's no need for ftpfs in the kernel. Once upon a time, there was
a static HTTP server in the kernel, but even a full apache in userspace is
fast enough.

And the reason is clear: Something blows up in a C program, it can affect
anything else in that program, or any memory it's connected to. Something
blows up in the kernel, it can affect _anything_.

I'm also not sure how much of that knowledge really translates. After all, if
an organization is choosing C because it's the "safe" choice, what are the
chances they'll use Git, or open development, or any of the other ways the
Linux kernel is managed?
Production, these days, is Just In Time. To stay with our steel
example: Long before the local county got around to nodding your
project through so that you can begin building, you already know what
components you need, and when (since you want to be under budget,
and on time, too), so you order 100 beams of several kinds of steel,

So what happens if they cancel your project?
In a nutshell: being a day early, or even a month, doesn't pay off
enough to make it worthwhile to restructure the whole company's
production processes, just because J. Junior Developer found a way to
shave a couple of seconds off of the DB query to send off ordering
iron ore. ;)

Shaving a couple seconds off is beside the point. The question is whether
there's some fundamental way in which the process can be improved -- something
which can be automated which actually costs a large amount of time, or some
minor shift in process, or small amount of knowledge...

Another contrived example: Suppose financial records were kept as text fields
and balanced by hand. The computer still helps, because you have all the data
in one place, easily backed up, multiple people can be looking at the same
data simultaneously, and every record is available to everyone who needs it
instantly.

But as soon as you want to analyze any sort of financial trend, as soon as you
want to mine that data in any meaningful way, you have a huge problem. The
query running slowly because it's text is probably minor enough. The problem
is that your data is mangled -- it's got points where there should be commas,
commas where there should be points, typo after typo, plus a few "creative"
entries like "a hundred dollars." None of these were issues before -- the
system did work, and had no bugs. But clearly, you want to at least start
validating new data entered, even if you don't change how it's stored or
processed just yet.

In a modern system, adding a validation is a one-liner. Some places, that
could take a week to go through the process. Some places, it could be pushed
to production the same day. (And some places arguably don't have enough
process, and could see that one-liner in production thirty seconds after
someone thought of it.)
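
Just to illustrate what "one-liner" means here, a hypothetical Rails-style
sketch (made-up class and field names, using ActiveModel):

  require "active_model"

  class LedgerEntry
    include ActiveModel::Validations
    attr_accessor :amount

    def initialize(amount)
      @amount = amount
    end

    # The one-liner: reject "a hundred dollars", stray commas, and other junk.
    validates :amount, numericality: true
  end

  LedgerEntry.new("100.00").valid?             # => true
  LedgerEntry.new("a hundred dollars").valid?  # => false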

To retrofit that onto an ancient COBOL app could take a lot more work.

I don't know enough about steel to say whether it's relevant here, but I have
to imagine that even here, there are opportunities to dramatically improve
things. Given an opportunity to make the change, in choosing whether to
rewrite or not, I'd have to consider that this isn't likely to be the last
change anyone ever makes.

The depressing thing is that in a modern corporation, this sort of discussion
would be killed reflexively by that conservative-yet-short-term mentality. A
rewrite may or may not be a sound investment down the road, but if it costs
money and doesn't pay off pretty immediately, it's not worth the risk at
pretty much any level of the company. Not so much because it might not pay off
ever, but more because investors will see you've cost the company money (if
only in the short term) and want you gone.
And an anecdote:
A large-ish steel works corp introduced a PLC system to monitor their
furnaces down to the centidegree Celsius, and the "recipe" down to the
gram. After a week, they deactivated the stuff, since the steel
produced wasn't up to spec, and the veteran cookers created much
better steel, and cheaper.

Cool.

I can only wonder how well that works when the veteran cookers retire. Does
that knowledge translate?

I've definitely learned something about steel today, though. Interesting
stuff. Also good to know what I want to avoid...
Except that AOhoo! upgraded their storage.

Which is kind of my point. Why did they upgrade? While it's true that it's
relatively cheap, and they may also be monetizing their customer's data, I
have to imagine at least part of the reason is that they were feeling the
pressure from Gmail.
And you'd be surprised
how... stubborn non-techies can be.

Not terribly. I'm more surprised how stubborn techies can be.
 
B

Brian Candler

And you'd be surprised
Not terribly. I'm more surprised how stubborn techies can be.

IME the main problems are:

* Operational. You have a whole workforce trained up to use
mainframe-based system A; getting them all to change to working with new
system B can be expensive. This is in addition to their "business as
usual" work.

* Change resistance. If system B makes even minor aspects of life for
some of the users more difficult than it was before, those users will
complain very loudly.

* Functional. System A embodies in its code a whole load of knowledge
about business processes, some of which is probably obsolete, but much
is still current. It's probably either not documented, or there are
errors and omissions in the documentation. Re-implementing A as B needs
to reverse-engineer the behaviour *and* decide which is current and
which is obsolete, or else re-specify it from scratch.

And to be honest, over time new System B is likely to become as
undocumented and hard to maintain as System A was, unless you have a
highly skilled and strongly directed development team.

So, unless System B delivers some killer feature which could not instead
be implemented as new system C alongside existing system A, it's hard to
make a business case for reimplementing A as B.

The market ensures that IBM prices their mainframe solutions just at the
level where the potential cost saving of moving away from A is
outweighed by the development and rollout cost of B, for most users
(i.e. those who have not migrated away already).
 
D

David Masover

IME the main problems are:

* Operational. You have a whole workforce trained up to use
mainframe-based system A; getting them all to change to working with new
system B can be expensive. This is in addition to their "business as
usual" work.

This is what I was talking about, mostly. I'm not even talking about stuff
like switching to Linux or Dvorak, but I'm constantly surprised by techies who
use IE because it's there and they can't be bothered to change, or C++ because
it's what they know and they don't want to learn a new language -- yet they're
perfectly willing to learn a new framework, which is a lot more work.
* Functional. System A embodies in its code a whole load of knowledge
about business processes, some of which is probably obsolete, but much
is still current. It's probably either not documented, or there are
errors and omissions in the documentation. Re-implementing A as B needs
to reverse-engineer the behaviour *and* decide which is current and
which is obsolete, or else re-specify it from scratch.

This is probably the largest legitimate reason not to rewrite. In fact, if
it's just a bad design on otherwise good technology, an iterative approach is
slow, torturous, but safe.
And to be honest, over time new System B is likely to become as
undocumented and hard to maintain as System A was, unless you have a
highly skilled and strongly directed development team.

Well, technologies _do_ improve. I'd much rather have an undocumented and hard
to maintain Ruby script than C program any day, let alone COBOL.
 
P

Phillip Gawlowski

Then why the fsck is he CTO of anything?


This is the part I don't get.
How do you get to be CTO by not caring about technology?

Because C-level execs working for any of the S&P 500 don't deal with
minutiae and details. They set *policy*. Whether or not to even look
into the cloud services, if and how to centralize IT support, etc.

The CTO supports the CEO, and you'd hardly expect the CEO to be
well-versed with a tiny customer, either, would you?

Oh, and he's the fall guy in case the database gets deleted. :p
Which is, again, a point I'd hope the free market would resolve. If there's a
way to build a relatively large corporation without bureaucracy and process
crippling actual progress, you'd think that'd be a competitive advantage.

There isn't. The bureaucratic overhead is a result of a) keeping a
distributed workforce on the same page, b) providing consistent
results, and c) keeping the business running even if the original
first five employees have long since quit.

It's why McD and BK can scale, but a Michelin star restaurant can't.
And the reason is clear: Something blows up in a C program, it can affect
anything else in that program, or any memory it's connected to. Something
blows up in the kernel, it can affect _anything_.

I'm also not sure how much of that knowledge really translates. After all, if
an organization is choosing C because it's the "safe" choice, what are the
chances they'll use Git, or open development, or any of the other ways the
Linux kernel is managed?

None to zero. But C is older than Linux or Git, too. It's been around
for quite a few years now, and is well understood.
So what happens if they cancel your project?

At that late a stage, a project doesn't get canceled anymore. It can
be postponed, or paused, but it rarely gets canceled.

You don't order a power plant or a skyscraper on a whim, but because
it is something that is *necessary*.

And the postponing (or cancelling, as rarely as it happens) has
extreme repercussions. But that's why there are breach-of-contract fees
and such included, to cover the work already done.

Shaving a couple seconds off is beside the point. The question is whether
there's some fundamental way in which the process can be improved -- something
which can be automated which actually costs a large amount of time, or some
minor shift in process, or small amount of knowledge...

That assumes that anything *can* be optimized. Considering the
accounting standards and practices that are needed, the ISO 900x
certifications, etc., there is little in the way of optimizing the
actual processes of selling goods. Keep in mind that IT isn't the
lifeblood of any non-IT corporation, but a means to an end.
Another contrived example: Suppose financial records were kept as text fields
and balanced by hand. The computer still helps, because you have all the data
in one place, easily backed up, multiple people can be looking at the same
data simultaneously, and every record is available to everyone who needs it
instantly.

But as soon as you want to analyze any sort of financial trend, as soon as you
want to mine that data in any meaningful way, you have a huge problem. The
query running slowly because it's text is probably minor enough. The problem
is that your data is mangled -- it's got points where there should be commas,
commas where there should be points, typo after typo, plus a few "creative"
entries like "a hundred dollars." None of these were issues before -- the
system did work, and had no bugs. But clearly, you want to at least start
validating new data entered, even if you don't change how it's stored or
processed just yet.

In a modern system, adding a validation is a one-liner. Some places, that
could take a week to go through the process. Some places, it could be pushed
to production the same day. (And some places arguably don't have enough
process, and could see that one-liner in production thirty seconds after
someone thought of it.)

To retrofit that onto an ancient COBOL app could take a lot more work.

Why do you think the Waterfall Process was invented? Or IT processes
in the first place? To discover and deliver the features required.

That's also why new software generally is preferred to changing existing
software: it's easier to implement changes that way, and to plug into
the ERP systems that already exist.
I don't know enough about steel to say whether it's relevant here, but I have
to imagine that even here, there are opportunities to dramatically improve
things. Given an opportunity to make the change, in choosing whether to
rewrite or not, I'd have to consider that this isn't likely to be the last
change anyone ever makes.

If a steel cooker goes down, it takes 24 to 48 hours to get it going
again. It takes about a week for the ore to smelt, and to produce
iron. Adding in carbon to create steel makes this process take even
longer.

So, what'd be the point of improving a detail, when it doesn't speed
up the whole process *significantly*?
The depressing thing is that in a modern corporation, this sort of discussion
would be killed reflexively by that conservative-yet-short-term mentality. A
rewrite may or may not be a sound investment down the road, but if it costs
money and doesn't pay off pretty immediately, it's not worth the risk at
pretty much any level of the company. Not so much because it might not pay off
ever, but more because investors will see you've cost the company money (if
only in the short term) and want you gone.
Agreed.


I can only wonder how well that works when the veteran cookers retire. Does
that knowledge translate?

Yup. Their subordinates acquire the knowledge. That's how trades are
taught in Europe (in general): in a master-apprentice system, where an
accomplished tradesman teaches their apprentice what they know. (It used
to be that a freshly minted "Geselle", as we call non-Masters,
non-apprentices in Germany, went on a long walk through Europe to
acquire new skills and refine existing ones, before settling down and
having their own apprentices; that's how the French style of cathedral
building came to England, for example.)
I've definitely learned something about steel today, though. Interesting
stuff. Also good to know what I want to avoid...

Just take what I say with a grain of salt. The closest I got to an
iron smelter was being 500 yards away from one when it had to do an
emergency shutdown because the power cable broke.
Which is kind of my point. Why did they upgrade? While it's true that it's
relatively cheap, and they may also be monetizing their customer's data, I
have to imagine at least part of the reason is that they were feeling the
pressure from Gmail.

Absolutely! GMail did lots of good at reinvigorating a stagnant market.

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.
 
B

Brian Candler

David Masover wrote in post #964917:
Well, technologies _do_ improve. I'd much rather have an undocumented
and hard
to maintain Ruby script than C program any day, let alone COBOL.

But rewriting COBOL in Perl may be a bad idea :) Even rewriting it in
Ruby may be a bad idea if the programmers concerned don't have a lot of
experience of Ruby.

(There is some truly awful legacy Ruby code that I have to look at from
time to time. It is jammed full of @@class variables, has no tests, and
the logic is a tortuous maze. We're aiming to retire the whole system)
 
