# Re: create global variables?-the full story

Discussion in 'Python' started by Alistair King, Oct 31, 2006.

1. ### Alistair King (Guest)

Steve Holden wrote:
> Alistair King wrote:
>> Steve Holden wrote:
>>> wrote:
>>>> J. Clifford Dyer wrote:
>>>>> Alistair King wrote:
>>> [... advice and help ...]
>>>
>>>> this worked a treat:
>>>>
>>>> def monoVarcalc(atom):
>>>>     a = atom + 'aa'
>>>>     Xaa = a.strip('\'')
>>>>     m = atom + 'ma'
>>>>     Xma = m.strip('\'')
>>>>     Xaa = DS1v.get(atom)
>>>>     Xma = pt.get(atom)
>>>>     return Xaa, Xma
>>>>
>>>> Caa, Cma = monoVarcalc('C')
>>>>
>>> In which case I suspect you will find that this works just as well:
>>>
>>> def monoVarcalc(atom):
>>>     Xaa = DS1v.get(atom)
>>>     Xma = pt.get(atom)
>>>     return Xaa, Xma
>>>
>>> Unless there is something decidedly odd about the side-effects of the
>>> statements I've removed, since you never appear to use the values of a,
>>> m, Xaa and Xma there seems little point in calculating them.
>>>
>>> regards
>>> Steve
>>>

>> Yup... it works, but now I have to create a dictionary of 'a' and 'm',
>> i.e. "Xaa" and "Xma" strings as key:value pairs, so I can use other
>> functions on the Xaa, Xma variables by iterating over them and
>> retrieving the values from the variables. I think if I just input Xaa
>> and Xma, only the values associated with those variables will go into
>> the dictionary and I'll just be iterating over nonsense...
>>
>> atomsmasses = {}
>>
>> def monoVarcalc(atom):
>>     a = atom + 'aa'
>>     m = atom + 'ma'
>>     atomsmasses[a] = m
>>     Xaa = a.strip('\'')
>>     Xma = m.strip('\'')
>>     Xma = pt.get(atom)
>>     if DS1v.get(atom) is not None:
>>         Xaa = DS1v.get(atom)
>>     else:
>>         Xaa = 0
>>     return Xaa, Xma
>>
>> Caa, Cma = monoVarcalc('C')
>> Oaa, Oma = monoVarcalc('O')
>> Haa, Hma = monoVarcalc('H')
>> Naa, Nma = monoVarcalc('N')
>> Saa, Sma = monoVarcalc('S')
>> Claa, Clma = monoVarcalc('Cl')
>> Braa, Brma = monoVarcalc('Br')
>> Znaa, Znma = monoVarcalc('Zn')
>>
>> I think?
>> thanks
>>
>> a
>>
>>

> No fair: you only just added atomsmasses! ;-)
>
> However, it seems to me that your atomsmasses dictionary is going to be
> entirely predictable, and you are still focusing on storing the *names*
> of things rather than building up a usable data structure. Indeed I
> suspect that your problem can be solved using only the names of the
> elements, and not the names of the variables that hold various
> attributes of the elements.
>
> Perhaps if you explain in plain English what you *really* want to do we
> can help you find a more Pythonic solution. It'll probably end up
> something like this:
>
> mass = {}
> for element in ['C', 'O', ..., 'Zn']:
>     mass[element] = monoVarcalc(element)
>
> But I could, of course, be completely wrong ... it wouldn't be the first
> time. Do you understand what I'm saying?
>
> regards
> Steve
>

OK...

from the start.

I'm trying to develop a simple command line application for determining
the degree of substitution (DS) on a polymer backbone from elemental
analysis, i.e., the % weights of different elements in the
monomer-substituent compound (I want each element to give a result, and
the heaviest atoms give the most accurate results).

Most basic comp chem programs use input files, but I don't know anything
about file iteration yet and want the program to be as user friendly as
possible, i.e. a command line prompt. A GUI would be great but is too
much for me at this stage.

At the start of the script I have 2 dictionaries: 1) containing every
atom in the periodic table with associated isotopic average masses, 2)
containing the molecular formula of the monomer unit, e.g. for cellulose
AGU {'C': 6, 'H': 10, 'O': 5}.

The basic steps are:

1. calculate the weight percentage values for each atom in the monomer
2. iterate into dictionaries, from DS=0 to DS=15 (0.00005 step), the
projected % values for the monomer plus substituent, for EACH atom in
the compound.
3. find the (local) minimum from each dictionary/atom to give the
appropriate DS value.

*Note* I have to iterate ALL the values as there is a non-linear
relationship between % values and DS due to the different atomic weights.
The computer seems to cope with this in about 10 seconds with the above
parameters and about 8 elements for the iteration step.
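
The three steps above can be sketched roughly as follows. This is only a minimal illustration, not the script described in the post: the function names are made up, the model simply adds DS times a net atom change to the monomer formula, and the net change of C2H2O per substitution (an acetyl group replacing a hydrogen) is an illustrative assumption. The default scan is much coarser than the 0.00005 step over DS 0 to 15 so that it runs quickly.

```python
# Average atomic masses (g/mol) for the elements used in this sketch.
MASSES = {"C": 12.011, "H": 1.008, "O": 15.999}

# Cellulose anhydroglucose unit, as given in the post.
AGU = {"C": 6, "H": 10, "O": 5}

# ASSUMPTION: net atoms gained per substitution (acetyl replacing H).
ACETYL_NET = {"C": 2, "H": 2, "O": 1}


def weight_percent(formula, masses):
    """Step 1: return {element: weight %} for a {element: count} formula."""
    total = sum(count * masses[el] for el, count in formula.items())
    return {el: 100.0 * count * masses[el] / total
            for el, count in formula.items()}


def substituted(monomer, net_change, ds):
    """Formula of the monomer carrying `ds` substituents (may be fractional)."""
    formula = dict(monomer)
    for el, count in net_change.items():
        formula[el] = formula.get(el, 0) + ds * count
    return formula


def best_ds(monomer, net_change, element, measured, masses,
            ds_max=3.0, step=0.001):
    """Steps 2 and 3: scan DS and return the value whose predicted %
    for `element` is closest to the measured %."""
    n_steps = int(round(ds_max / step))
    best_error, best_value = None, None
    for i in range(n_steps + 1):
        ds = i * step
        predicted = weight_percent(
            substituted(monomer, net_change, ds), masses)[element]
        error = abs(predicted - measured)
        if best_error is None or error < best_error:
            best_error, best_value = error, ds
    return best_value
```

Calling `best_ds(AGU, ACETYL_NET, "C", measured_percent, MASSES)` once per element gives one DS estimate per element, without needing a separate named variable for each atom.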

I have a script from some time ago which works perfectly, but the problem
is that some of the samples contain excess water. So I was going to use the
heaviest atom to give the most accurate DS value I could and then adjust
the polymer backbone dictionary to include the initial substituent from
step 3, then do the calculation again with water {'H': 2, 'O': 1} as the
new substituent. Once I determine the new 'DS' value of water it's simple
to recalculate the % values, iterate, find the (local) minima and get more
accurate DS values for all the heavy atoms (not including hydrogen or
oxygen).
............................................................................................................

I would like to get the script running now to get some results from our
experimental data, which shouldn't take too long, but eventually I don't
want to rely on the monomer dictionary: just input the elements and number
of atoms of each element in the starting monomer and further
substituents, thereby adapting the program not only to cellulose as a
polymer but to all polymers and small organic molecules, perhaps even
proteins of known elemental analysis.

For this I'm taking the basic script that I started with, as a beginner to
Python and all programming languages, and trying to make functions of
all the calculations I have to do. I guess that I could eventually use
just the elements as references in data structures, i.e. 'C', 'H', etc.,
but would like to have many variables bound to those data structures.

One other thing is that it would be nice to create a 3D iterated 'array'
of the % values to find the local minima, but I really don't have a clue
how to do this.
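
One plain-Python way to get an array-like structure is a list of lists. The sketch below is a hedged illustration only: `make_grid` and `argmin_2d` are hypothetical helper names, and it shows two dimensions (say, water step by DS step); a third dimension would nest one more level of lists in the same way.

```python
def make_grid(n_rows, n_cols, fill=0.0):
    """Build an n_rows x n_cols grid as a list of independent row lists."""
    return [[fill for _ in range(n_cols)] for _ in range(n_rows)]


def argmin_2d(grid):
    """Return the (row, col) index of the smallest value in the grid."""
    best_row, best_col = 0, 0
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value < grid[best_row][best_col]:
                best_row, best_col = r, c
    return best_row, best_col


# Example: rows could be water steps, columns DS steps.
grid = make_grid(3, 4)
grid[1][2] = -1.0            # pretend this cell holds the smallest error
row, col = argmin_2d(grid)   # index of the minimum
```

Building the rows with a comprehension matters here: `[[0.0] * n_cols] * n_rows` would make every row the same list object, so a write to one row would show up in all of them.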

Any comments would be appreciated as I am a beginner and just went out
and did this how I thought it should be done. Hope this is not too much
for peeps.

thanks

Ali

--
Dr. Alistair King
Research Chemist,
Laboratory of Organic Chemistry,
Department of Chemistry,
Faculty of Science
P.O. Box 55 (A.I. Virtasen aukio 1)
FIN-00014 University of Helsinki
Tel. +358 9 191 50392, Mobile +358 (0)50 5279446
Fax +358 9 191 50366

Alistair King, Oct 31, 2006

2. ### J. Clifford Dyer (Guest)

> OK...
>
> from the start.
>
> I'm trying to develop a simple command line application for determining
> the degree of substitution (DS) on a polymer backbone from elemental
> analysis, i.e., the % weights of different elements in the
> monomer-substituent compound (I want each element to give a result, and
> the heaviest atoms give the most accurate results).
>
> Most basic comp chem programs use input files, but I don't know anything
> about file iteration yet and want the program to be as user friendly as
> possible, i.e. a command line prompt. A GUI would be great but is too
> much for me at this stage.
>
> At the start of the script I have 2 dictionaries: 1) containing every
> atom in the periodic table with associated isotopic average masses, 2)
> containing the molecular formula of the monomer unit, e.g. for cellulose
> AGU {'C': 6, 'H': 10, 'O': 5}.
>
> The basic steps are:
>
> 1. calculate the weight percentage values for each atom in the monomer
> 2. iterate into dictionaries, from DS=0 to DS=15 (0.00005 step), the
> projected % values for the monomer plus substituent, for EACH atom in
> the compound.
> 3. find the (local) minimum from each dictionary/atom to give the
> appropriate DS value.
>
> *Note* I have to iterate ALL the values as there is a non-linear
> relationship between % values and DS due to the different atomic weights.
> The computer seems to cope with this in about 10 seconds with the above
> parameters and about 8 elements for the iteration step.
>

Since you have a parallel structure for each element, consider using a
dictionary with the element names as keys:

>>> atomicdata = {}
>>> for element in 'C', 'H', 'U':
...     atomicdata[element] = getAtomVars(element)
...
>>> print(atomicdata)
{'C': (1, 2), 'H': (4, 5), 'U': (78, 20)}

The first value of each tuple will be your Xaa, and the second value
will be Xma. Do you really need to keep the names Caa, Cma, Haa, Hma
around? Instead of Caa, you have atomicdata['C'][0] and Cma becomes
atomicdata['C'][1]. Completely unambiguous. A bit more verbose,
perhaps, but you don't have to try to sneak around the back side of the
language to find the data you are looking for. That's very much against
the tao. If you really want the names, nest dicts, but don't try to get
the element name into the keys, because you already have that:

>>> atomicdata = { 'C': { 'aa': 1,
...                       'ma': 2},
...                'H': { 'aa': 4,
...                       'ma': 5},
...                'U': { 'aa': 78,
...                       'ma': 20} }

And to get from there to storing all your data for however many
steps, change the value of each entry in atomicdata from a tuple (or
dict) to a list of tuples (or dicts).

>>> atomicdata = { 'C': [ (1,2), (4,6), (7,8), (20,19) ],
...                'H': [ (5,7), (2,986), (3,4) ] }
>>> atomicdata['H'].append((5,9))
>>> atomicdata
{'C': [(1, 2), (4, 6), (7, 8), (20, 19)], 'H': [(5, 7), (2, 986), (3, 4), (5, 9)]}

You can build up those lists with nested for loops. (one tells you which
element you're working on, the other which iteration).

The indexes of your lists, of course, will not correspond to the DS
values, but to the step number. To get back to the DS number, let the
index number be i, and calculate DS = i * 0.00005
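
As a rough sketch of those nested loops and the index-to-DS conversion (the names `build_table`, `index_to_ds` and `fake_calc` are invented for illustration, and `fake_calc` is only a stand-in for whatever the real per-element calculation returns):

```python
DS_STEP = 0.00005  # step size quoted in the thread


def build_table(elements, n_steps, calc):
    """Return {element: [calc(element, ds), ...]} with one entry per step."""
    table = {}
    for element in elements:          # outer loop: which element
        values = []
        for i in range(n_steps):      # inner loop: which iteration
            values.append(calc(element, i * DS_STEP))
        table[element] = values
    return table


def index_to_ds(i, step=DS_STEP):
    """Convert a list index back to the DS value it was computed for."""
    return i * step


def fake_calc(element, ds):
    """Hypothetical stand-in for the real (aa, ma)-style calculation."""
    return (len(element), ds)


table = build_table(["C", "H", "Cl"], 4, fake_calc)
```

Here `table["Cl"][2]` holds the result for the third step, and `index_to_ds(2)` recovers the DS value that step was computed for.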

That should get you off and running now. Happy pythoning!

Cheers,
Cliff

J. Clifford Dyer, Nov 2, 2006

3. ### Alistair King (Guest)

J. Clifford Dyer wrote:
> [... full quote of the previous post snipped ...]

Thanks Cliff,

this is what I need; it seems to make much more sense to do it this way. I
think once I learn to include data structures within each other I'm gonna
try to make up this 3D 'array' (I think this is a word from C; is there
a Python equivalent?) of further iterations within each iteration to
take into account the excess water. For this I think I'll have to add
another 100 values onto the values I already have, i.e. 1.8x10**8
entries in total and god knows how many calculations in potentially one
data structure. Could my computer cope with this or should I try a series
of refinement iterations? Does anyone know of a simple way of
determining how much processor time it takes to do these calculations?
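
On the idea of refinement iterations, one possible shape is a coarse-to-fine grid search: scan a few points, then rescan a narrower window around the best one. The sketch below is an illustration under assumptions, not a drop-in solution: `refine_minimum` is a made-up name, `f` stands for whatever error function is being minimised, and the approach only works if the error has a single minimum in the scanned range (otherwise the coarse pass can settle in the wrong valley).

```python
def refine_minimum(f, lo, hi, points=101, rounds=5):
    """Coarse-to-fine grid search for the x in [lo, hi] that minimises f(x)."""
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        xs = [lo + i * step for i in range(points)]
        best = min(xs, key=f)
        # Narrow the window to one grid step either side of the best point.
        lo, hi = max(lo, best - step), min(hi, best + step)
    return best


# Toy error function with its minimum at DS = 2.3456 (illustrative only).
found = refine_minimum(lambda ds: (ds - 2.3456) ** 2, 0.0, 15.0)
```

With the defaults this evaluates `f` about 500 times in total and keeps only one short list in memory at a time, instead of storing every value of a full fine-grained scan.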

thanks

a


Alistair King, Nov 7, 2006

4. ### Steven D'Aprano (Guest)

On Tue, 07 Nov 2006 12:54:36 +0200, Alistair King wrote:

> Does anyone know of a simple way of
> determining how much processor time it takes to do these calculations?

See the timeit module.
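
A minimal sketch of how that might look, assuming the work to be timed can be wrapped in a function (`scan` here is a hypothetical stand-in, not the real calculation). Note that `timeit` measures elapsed time by default; `time.process_time()` is one option if CPU time specifically is wanted.

```python
import timeit


def scan(n_steps=10000):
    """Hypothetical stand-in for one element's DS scan."""
    return min((i * 0.00005 - 0.2) ** 2 for i in range(n_steps))


# Call scan() 5 times and work out the average time per call.
runs = 5
total = timeit.timeit(scan, number=runs)
per_call = total / runs
print("%.6f s per scan" % per_call)
```

Timing a small number of steps and scaling up gives a rough estimate of how long a full scan would take before committing to it.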

--
Steven.

Steven D'Aprano, Nov 7, 2006