George said:
Sounds interesting. Will it be related at all to OLAP or the
Multi-Dimensional eXpressions language
(http://msdn2.microsoft.com/en-us/library/ms145506.aspx)?
Thanks for the reference! I didn't know about either of these. It will
probably be interesting to learn from them. From a brief look at OLAP
on Wikipedia, my project may have similarities to it. I don't think it
will be related to Microsoft's language, because the language will simply
be Python, hopefully making it very easy to do whatever you like with the
data.
I posted to python-dev a message that (hopefully) better explains my
use for x[]. Here it is - I think that it also gives an idea of what it
will look like.
I'm talking about something similar to a spreadsheet in that it saves
data, calculation results, and the way to produce the results.
However, it is not similar to a spreadsheet in that the data isn't
saved in an infinite two-dimensional array with numerical indices.
Instead, the data is saved in a few "tables", each storing a different
kind of data. The tables may have any desired number of dimensions,
and are indexed by meaningful indices, instead of by natural numbers.
For example, you may have a table called sales_data. It will store the
sales data in years from set([2003, 2004, 2005]), for car models from
set(['Subaru', 'Toyota', 'Ford']), for cities from set(['Jerusalem',
'Tel Aviv', 'Haifa']). To refer to the sales of Ford in Haifa in 2004,
you will simply write: sales_data[2004, 'Ford', 'Haifa']. If the table
is a source of data (that is, not calculated), you will be able to set
values by writing: sales_data[2004, 'Ford', 'Haifa'] = 1500.
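To make the indexing concrete, here is a minimal sketch in plain Python. The Table class and its constructor arguments are hypothetical names made up for illustration, not an existing library. It only relies on the fact that Python passes sales_data[2004, 'Ford', 'Haifa'] to __getitem__ / __setitem__ as a single tuple.

```python
# Hypothetical sketch: Table is an illustrative name, not an existing library.
class Table:
    """A table indexed by meaningful keys, with one set of allowed keys per dimension."""

    def __init__(self, *dimensions):
        self.dimensions = [set(d) for d in dimensions]
        self._cells = {}

    def _check(self, key):
        # table[a, b, c] arrives here as the tuple (a, b, c);
        # table[a] arrives as the bare value a, so wrap it.
        if not isinstance(key, tuple):
            key = (key,)
        if len(key) != len(self.dimensions):
            raise KeyError(f"expected {len(self.dimensions)} indices, got {len(key)}")
        for index, allowed in zip(key, self.dimensions):
            if index not in allowed:
                raise KeyError(index)
        return key

    def __getitem__(self, key):
        return self._cells[self._check(key)]

    def __setitem__(self, key, value):
        self._cells[self._check(key)] = value


years = {2003, 2004, 2005}
models = {'Subaru', 'Toyota', 'Ford'}
cities = {'Jerusalem', 'Tel Aviv', 'Haifa'}

sales_data = Table(years, models, cities)
sales_data[2004, 'Ford', 'Haifa'] = 1500
print(sales_data[2004, 'Ford', 'Haifa'])
```

An index outside the declared sets, such as sales_data[1999, 'Ford', 'Haifa'], raises KeyError in this sketch.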
Tables may be computed tables. For example, you may have a table which
holds for each year the total sales in that year, with the income tax
subtracted. It may be defined by a function like this:
lambda year: sum(sales_data[year, model, city]
                 for model in models for city in cities) / (1 + income_tax_rate)
Now, as in a spreadsheet, the function is kept, so that if you
change the data, the result will be automatically recalculated. So, if
you discover a mistake in your data, you will be able to write:
sales_data[2004, 'Ford', 'Haifa'] = 2000
and total_sales[2004] will be automatically recalculated.
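One possible way to get this behaviour, as a rough sketch only: the computed table caches its results and throws the cache away whenever a source table it depends on is written to. SourceTable, ComputedTable, add_dependent and invalidate are hypothetical names, the sales figures and the 0.25 rate are made up, and the dependencies are declared by hand here rather than discovered automatically.

```python
# Hypothetical sketch of recalculation by cache invalidation.
class SourceTable:
    """Holds raw data and notifies dependent tables on every write."""

    def __init__(self):
        self._cells = {}
        self._dependents = []

    def add_dependent(self, table):
        self._dependents.append(table)

    def __getitem__(self, key):
        return self._cells[key]

    def __setitem__(self, key, value):
        self._cells[key] = value
        for table in self._dependents:
            table.invalidate()


class ComputedTable:
    """Keeps the defining function and caches its results until a source changes."""

    def __init__(self, func, sources):
        self._func = func
        self._cache = {}
        for source in sources:
            source.add_dependent(self)

    def invalidate(self):
        self._cache.clear()

    def __getitem__(self, key):
        if key not in self._cache:
            self._cache[key] = self._func(key)
        return self._cache[key]


models = ['Subaru', 'Toyota', 'Ford']
cities = ['Jerusalem', 'Tel Aviv', 'Haifa']
income_tax_rate = 0.25  # made-up value; still a plain variable at this point

sales_data = SourceTable()
for model in models:
    for city in cities:
        sales_data[2004, model, city] = 1000  # made-up figures
sales_data[2004, 'Ford', 'Haifa'] = 1500

total_sales = ComputedTable(
    lambda year: sum(sales_data[year, model, city]
                     for model in models for city in cities) / (1 + income_tax_rate),
    sources=[sales_data],
)

before = total_sales[2004]
sales_data[2004, 'Ford', 'Haifa'] = 2000  # fix the mistake
after = total_sales[2004]                 # recomputed on the next access
print(before, after)
```

Clearing the whole cache is the crudest policy that works; a real system would presumably track which cells depend on which.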
Now, note that the total_sales table depends also on the
income_tax_rate. This is a variable, just like sales_data. Unlike
sales_data, it's a single value. We should be able to change it, and
have all the cells of the total_sales table recalculated. But
how will we do it? We can write
income_tax_rate = 0.18
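but it will have a completely different meaning: it only rebinds the
name income_tax_rate to a new object, so the system never learns that
anything changed. The way to make the
income_tax_rate changeable is to think of it as a 0-dimensional table.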
It makes sense: sales_data depends on 3 parameters (year, model,
city), total_sales depends on 1 parameter (year), and income_tax_rate
depends on 0 parameters. That's the only difference. So, thinking of
it like this, we will simply write:
income_tax_rate[] = 0.18
Now the system can know that the income tax rate has changed, and
recalculate what's needed. We will also have to change the previous
function a tiny bit, to:
lambda year: sum(sales_data[year, model, city]
                 for model in models for city in cities) / (1 + income_tax_rate[])
But it's fine - it just makes it clearer that income_tax_rate[] is a
part of the model that may change its value.
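In Python as it stands, x[] is a SyntaxError, so the closest runnable spelling is x[()], which passes the empty tuple as the key. The sketch below uses that as a stand-in for the proposed income_tax_rate[]. The Table class and its listeners list are hypothetical names for illustration, and the listener here just records which keys were written.

```python
# Hypothetical sketch: a 0-dimensional table, using x[()] in place of the
# proposed x[] syntax (which current Python rejects as a SyntaxError).
class Table:
    """Source table that calls every listener with the key that was written."""

    def __init__(self):
        self._cells = {}
        self.listeners = []

    def __getitem__(self, key):
        return self._cells[key]

    def __setitem__(self, key, value):
        self._cells[key] = value
        for listener in self.listeners:
            listener(key)


changes = []  # keys the "system" has been told about
income_tax_rate = Table()
income_tax_rate.listeners.append(changes.append)

# Stand-in for: income_tax_rate[] = 0.18
income_tax_rate[()] = 0.18

# Stand-in for reading income_tax_rate[] inside a computed function.
def net(gross):
    return gross / (1 + income_tax_rate[()])

# By contrast, plain assignment only rebinds a name; no listener is called.
rate = income_tax_rate
rate = 0.18

print(changes)      # the write through [()] was seen; the rebinding was not
print(net(1180))
```

The empty tuple is the natural key for zero parameters, in the same way that (year, model, city) is the key for three.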
Have a good day,
Noam