Getting Started in Programming & Scripting

apg

Steve said:
On Thu, 15 Sep 2005 08:47:50 +0000 (UTC)



Absolutely not - but those who learn their road sense on a bicycle
(dodgy brakes, no airbags, no seatbelts and worse) are probably safer
drivers than those who learned it in a Volvo. Maybe...


Yes, but there's a lot to be said for learning the art of
programming with something dangerous like BCPL or C, where careless
coding gets rewarded with core dumps and other program crashes.
Why stop there? Learn with assembler; I did. You've not lived until
you've patched the VTOC in hex or keyed in a bootstrap loader in octal.
 

Walter Roberson

There is nothing wrong with using pointers if they are used properly.
Pointers (indexes) are used frequently in assembler. There should be no
need for pointers in a high-level language other than as an index to a
record or similar requirement. Strong typing, I agree, is a pain at times,
but if you can't express what you are attempting to achieve without
breaking the rules, then there is something wrong, and it's probably not
the rules.

We had a prototype research program that was in a mix of C
(computation section) and IDL (GUI). It went a lot further than
proof of concept, and should have been broken down and rewritten
long, long before the point it got to, but you know what they say:
Programming is what happens while you're busy making other plans.

Eventually a decision was taken to generalize some important parts of it,
and modernize it all, and a fresh team was put together. As Java was
The Way Of The Future at the time, the team decided to write the
program in Java as much as possible, dropping into C++ for those
portions that were -measurably- too slow in Java.

You are, I am sure, familiar with the reasons to abandon C and move
to Java: Java is a clean new language design, object oriented,
without those ugly buffer overruns, and without all those pointer
problems C has, and any "well-written" Java program doesn't -need-
anything as gauche as type-punning. And with the IBM JIT compilers,
Java is even faster than C.


Well, about a person-decade of programming effort later, I chatted
with the team leader, and found that they had given up on that
approach, and for their next generation of the project, they were
rewriting everything into as pure C++ as they could, with just a
small graphics library interface for the GUI.

The reason? Because {I was told} Java's "reference" mechanisms, in
concert with Java's typing rules, are such that every time they sliced
up a multidimensional array into 1-D vectors for the mathematical
analysis we do, Java insisted on taking a -copy- of the data instead
of just passing along the address of the data. And since the
datasets could be a gigabyte, and since the analysis portion might
iterate through ten thousand or more possibilities, all of the
data copying that was going on was just killing their performance.

but if you can't express what you are attempting to achieve without
breaking the rules, then there is something wrong, and it's probably not
the rules.

On the other hand, as hinted at in my anecdote, a strong typing
system can become a millstone if there are legitimate reasons to
want to logically decompose the types. Even just the layer
change from double[x][y] to double[x*y] can crush you -- a layer change
that is trivial in C but not in a strongly typed system.
 

apg

Walter said:
There is nothing wrong with using pointers if they are used properly.
Pointers (indexes) are used frequently in assembler. There should be no
need for pointers in a high-level language other than as an index to a
record or similar requirement. Strong typing, I agree, is a pain at times,
but if you can't express what you are attempting to achieve without
breaking the rules, then there is something wrong, and it's probably not
the rules.


We had a prototype research program that was in a mix of C
(computation section) and IDL (GUI). It went a lot further than
proof of concept, and should have been broken down and rewritten
long, long before the point it got to, but you know what they say:
Programming is what happens while you're busy making other plans.

Eventually a decision was taken to generalize some important parts of it,
and modernize it all, and a fresh team was put together. As Java was
The Way Of The Future at the time, the team decided to write the
program in Java as much as possible, dropping into C++ for those
portions that were -measurably- too slow in Java.

You are, I am sure, familiar with the reasons to abandon C and move
to Java: Java is a clean new language design, object oriented,
without those ugly buffer overruns, and without all those pointer
problems C has, and any "well-written" Java program doesn't -need-
anything as gauche as type-punning. And with the IBM JIT compilers,
Java is even faster than C.


Well, about a person-decade of programming effort later, I chatted
with the team leader, and found that they had given up on that
approach, and for their next generation of the project, they were
rewriting everything into as pure C++ as they could, with just a
small graphics library interface for the GUI.

The reason? Because {I was told} Java's "reference" mechanisms, in
concert with Java's typing rules, are such that every time they sliced
up a multidimensional array into 1-D vectors for the mathematical
analysis we do, Java insisted on taking a -copy- of the data instead
of just passing along the address of the data. And since the
datasets could be a gigabyte, and since the analysis portion might
iterate through ten thousand or more possibilities, all of the
data copying that was going on was just killing their performance.


but if you can't express what you are attempting to achieve without
breaking the rules, then there is something wrong, and it's probably not
the rules.


On the other hand, as hinted at in my anecdote, a strong typing
system can become a millstone if there are legitimate reasons to
want to logically decompose the types. Even just the layer
change from double[x][y] to double[x*y] can crush you -- a layer change
that is trivial in C but not in a strongly typed system.

In a strongly typed system you should still be able to access your data
by value or by reference. Your problem seems to be not with dealing with
strongly typed items, but with the way you are forced to access each item
of data. This is one of the problems I have seen in OOP before.

Could it be that this is more of a problem for people using OOP than for
those using one of the more traditional methodologies? I personally have
never been convinced that OOP is the best way to go, however elegant it
may be.

Your example appears to be more a shortcoming of the way Java handles
data than of strongly typed variables. Strong typing limits what
you can do to a variable: you can't assign an ASCII value to a strongly
typed integer variable, whereas languages which do not support strong
typing, such as Fortran IV, will allow this to happen. As pointers in C
are not strongly typed, you could perform the same kind of operation
with them. Pointers in Pascal, on the other hand, are strongly typed,
so such operations are not permitted.

I've never understood why some people must have the latest car off the
production line, without too much regard to how it performs and whether
it suits their needs.
 

Walter Roberson

Your example appears to be more a shortcoming of the way Java handles
data than of strongly typed variables. Strong typing limits what
you can do to a variable: you can't assign an ASCII value to a strongly
typed integer variable, whereas languages which do not support strong
typing, such as Fortran IV, will allow this to happen. As pointers in C
are not strongly typed, you could perform the same kind of operation
with them. Pointers in Pascal, on the other hand, are strongly typed,
so such operations are not permitted.

In C, if I have double trouble[1048576][7] then trouble[4][2] has type
double and there is no problem with us working arithmetically with that
single value.

But in C, trouble[4] also has a type -- it is double [7] which is to
say an array of 7 doubles. Hence, if one were using a strongly-typed C,
one would be able to access the doubles in the array "trouble" at most
7 at a time, unless the language were to be enhanced with array
operators (such as in IDL.) This isn't a matter of wanting to store
characters into the space used by the doubles, this is a matter of
the type system.

In any given language, there might not -be- a type system, or there
might be a type system that operates only with the primitive types --
but as soon as you start being able to build aggregate types that are
considered distinct from "just a convenient way to organize primitive
data types", then if you can refer to array sub-sections at all
in the language, those sub-sections have types of their own and
a "strongly typed" system would enforce those types.

Continuing the example, examine trouble[4][9] -- to a strongly typed
system, that's a type error, as the type of trouble[4] has only 7
elements, not the 10 that index 9 would require.

For the mathematical analysis we were doing, we often needed to
pass a subsection of a large array. In C as it is now, that's
trivial to do efficiently, as we can make use of the synonymy
between the address of an array (&trouble), the address of the
first element of its first dimension (&trouble[0]), and the
address of the first element of the first element of its first
dimension (&trouble[0][0]). C allows all of these to be passed
into a routine that has its corresponding parameter declared
as any of double* or double[] or double[][7]. C's type laxity allows
us to access what we know to be a block of consecutive memory
in any way that is convenient to us -- but in a strongly typed system,
we would be constrained to access the memory only in the same way
it was declared.


There are languages in which multidimensional arrays have unspecified
internal structure -- allowing the implementation to transparently
move between straight blocks of storage, or vectors of pointers to
vectors of storage, or various sparse representation techniques.
(e.g., Mathematica.) For those languages, one must enforce strong
typing of array subsections (or deny the possibility of those),
or the language must provide a mandatory accessor function along with
the data pointer, or the language must provide a way to allow the current
storage arrangement to be examined.

If one were hoping for efficiency by memcpy()'ing an array area larger
than the fastest-varying dimension, and one cannot force a particular
representation, then only the last of those three possibilities
(the ability to examine the internal representation) offers any hope of
that at all: one cannot get memory-block efficiencies without
escaping from strong typing.
 

apg

Walter said:
Your example appears to be more a shortcoming of the way Java handles
data than that of strongly typed variables. Strong typing limits what
you can do to a variable, you can't assign an ascii value to a strongly
typed integer variable however languages which do not support strong
typing such as Fortran IV will allow this to happen. As pointers in C
are not strongly typed so you could perform the same kind of operation.
Where as pointers in Pascal are strongly typed and so such operations
are not permitted.


In C, if I have double trouble[1048576][7] then trouble[4][2] has type
double and there is no problem with us working arithmetically with that
single value.

But in C, trouble[4] also has a type -- it is double [7] which is to
say an array of 7 doubles. Hence, if one were using a strongly-typed C,
one would be able to access the doubles in the array "trouble" at most
7 at a time, unless the language were to be enhanced with array
operators (such as in IDL.) This isn't a matter of wanting to store
characters into the space used by the doubles, this is a matter of
the type system.

In any given language, there might not -be- a type system, or there
might be a type system that operates only with the primitive types --
but as soon as you start being able to build aggregate types that are
considered distinct from "just a convenient way to organize primitive
data types", then if you can refer to array sub-sections at all
in the language, those sub-sections have types of their own and
a "strongly typed" system would enforce those types.

Continuing the example, examine trouble[4][9] -- to a strongly typed
system, that's a type error, as the type of trouble[4] has only 7
elements, not the 10 that index 9 would require.

For the mathematical analysis we were doing, we often needed to
pass a subsection of a large array. In C as it is now, that's
trivial to do efficiently, as we can make use of the synonymy
between the address of an array (&trouble), the address of the
first element of its first dimension (&trouble[0]), and the
address of the first element of the first element of its first
dimension (&trouble[0][0]). C allows all of these to be passed
into a routine that has its corresponding parameter declared
as any of double* or double[] or double[][7]. C's type laxity allows
us to access what we know to be a block of consecutive memory
in any way that is convenient to us -- but in a strongly typed system,
we would be constrained to access the memory only in the same way
it was declared.


There are languages in which multidimensional arrays have unspecified
internal structure -- allowing the implementation to transparently
move between straight blocks of storage, or vectors of pointers to
vectors of storage, or various sparse representation techniques.
(e.g., Mathematica.) For those languages, one must enforce strong
typing of array subsections (or deny the possibility of those),
or the language must provide a mandatory accessor function along with
the data pointer, or the language must provide a way to allow the current
storage arrangement to be examined.

If one were hoping for efficiency by memcpy()'ing an array area larger
than the fastest-varying dimension, and one cannot force a particular
representation, then only the last of those three possibilities
(the ability to examine the internal representation) offers any hope of
that at all: one cannot get memory-block efficiencies without
escaping from strong typing.

First, if you were using a strongly typed system and wanted, as you state,
to pass whole rows of data from your array, then you would make the array
an array of records, each containing an array of double with the required
number of elements. Reading and writing the master array is no more
difficult, and you are able to pass all seven double elements with one
reference. There are many other ways; I'm sure you can think of them.

You would be better off using a different language for array
manipulation. PL/I has a whole set of array manipulation methods, and I
believe that the later versions of Fortran also support array
manipulation. Both of these are strongly typed languages. I know there
are several others, but I can't remember them at the moment. I know that
in PL/I you could do all you wish and more with an array.

Why are you copying the arrays when you only have to reference them? You
can still either write back your changes or write them elsewhere. I
believe that even in Pascal, arrays are passed by reference, not by
copying them. You don't get much more strongly typed than Pascal.

I agree the way you are doing it will get the job done; it is all right
if you are using it like some kind of programmable calculator, but not
suitable for production software which will be used by others who may
not understand its limitations. I wouldn't want similar methods used in
the autopilot of a plane I was a passenger in, or controlling the local
nuclear power plant. I really wouldn't want similar methods in my OS,
but who knows; garbage in BSD.

The only reason for not checking types and bounds is speed, but not
doing so is like riding fast downhill without checking that the
brakes work...
 

Dave Thompson

<Groups clipped a little>

And we're actually OT in clc but I don't read c.prog.
Dave Thompson wrote:

No, not quite. CAR refers to the first element of a pair, CDR to the
second element of that pair. A list is formed from a chain of pairs: the
data is stored in the CARs, and the link to the next pair is stored in
the CDR. This leads to CAR and CDR also being called FIRST and REST.
That's what I said. Each cell is (at least logically) a pair; the
first cell's left/CAR is the first element and its right/CDR "is"
(points to) the second cell, and so on down the list. (I only gave the
example to length 3, which IMO is enough to make the pattern clear,
though for programmers it might be better to just state the rule.)
In the above code if A is (1 2 3):
(setq A '(1 2 3))

(car A) is 1
(cdr A) is the list (2 3), *not* 2
(cdr a) is the list (2 3); (car (cdr a)) aka (cadr a) is 2
If I do:
(setq b (cdr A)) b is set to (2 3)
then:
(rplaca b 9) b is set to (9 3) because rplaca is a destructive
operation.
Right. And (cdr a) is still EQ (the same cell as) b, so a is (1 9 3).
Hence, at least as seen by the programmer, these are pointers.
These functions can be thought of as implying pointer operations, but
the concept isn't really necessary. Conceptually CAR and CDR access
the elements of a pair, which the lisp implementation deals with.
CAR and CDR access the halves of a pair, a.k.a. a cell, but a pair has
identity and is referenceable and mutable.
They need not be implemented quite as the programmer may expect either.
There have been lisp implementations where pairs & lists are
represented in quite subtle ways.
True; I should have said 'vanilla' or 'canonical' Lisp. I was trying
to be brief, especially with my (recently worse) off-line delay.
I think programmers of lower level languages like C are more
comfortable thinking of this as pointer manipulation, which is OK.


Yep, and use the few modifying operations that exist for efficiency.

- David.Thompson1 at worldnet.att.net
 
