Little direction please Python MySQL

len

Hi all;

I am looking for a little direction in moving from novice Python/MySQL to real-world processing.

I can connect to MySQL databases and have performed most of the various select, create, update, insert, etc. operations given the examples in the various books and internet tutorials, without too many problems.

Here is my problem. I need to start doing this in the real world at my company, converting some older COBOL systems and data to Python programs and MySQL. I have gotten past packed decimal fields and various other little tidbits. My problem is the data files aren't little three- or four-field files but a Customer File with 98 fields, etc. I understand building dictionaries and building with zip, and I have even seen a reference to using __setattr__ in an empty class, but I'm having a hard time moving past the little code snippets to real code.

As you can imagine, the field names coming from old COBOL files are generally very long, and tons of code that might start looking like

order['ordhdr_sales_amount'] += order['ordlin_sales_qty'] * order['ordlin_sales_price']

could start becoming quite cumbersome.

I'm sure that there is some way to make all of this less verbose using classes and such, but I need some direction.

Could someone recommend books, websites, or free packages whose code I could look at to move from student-type programs to "gee, I work in the real world now" programs?

Thanks Len
 

Steve Holden

len said:
Hi all;

I am looking for a little direction in moving from novice Python/MySQL to real-world processing.
[snip]

Could someone recommend books, websites, or free packages whose code I could look at to move from student-type programs to "gee, I work in the real world now" programs?
There may be some usable ideas in

http://holdenweb.com/PyConTX2007/dbPythonIntro.pdf

though it sounds like you might have already come across most of them.
It's mostly just a matter of packaging things in a usable way so you can
write simple code. For example

order.sales_amount += ordlin.sales_qty * ordlin.sales_price
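A minimal sketch of one way such packaging might look; the Record class and the field values below are illustrative, not from any real schema:

```python
# A tiny wrapper giving attribute-style access to a dict of fields,
# so code can say order.sales_amount instead of order['ordhdr_sales_amount'].
class Record:
    def __init__(self, fields):
        # Each key in `fields` becomes an instance attribute.
        self.__dict__.update(fields)

# Illustrative values only.
order = Record({"sales_amount": 10.00})
ordlin = Record({"sales_qty": 3, "sales_price": 2.50})

order.sales_amount += ordlin.sales_qty * ordlin.sales_price
# order.sales_amount is now 17.5
```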

By all means post specific questions when the code starts to get tacky
or when you can't see your way through to a solution. People on this
list are usually incredibly helpful.

regards
Steve
 

Ethan Furman

len said:
Hi all;
[snip]

Here is my problem. I need to start doing this in the real world at my company, converting some older COBOL systems and data to Python programs and MySQL. I have gotten past packed decimal fields and various other little tidbits. My problem is the data files aren't little three- or four-field files but a Customer File with 98 fields, etc. I understand building dictionaries and building with zip, and I have even seen a reference to using __setattr__ in an empty class, but I'm having a hard time moving past the little code snippets to real code.
code.
[snip]

Thanks Len

I've never had the (mis?)fortune to work with COBOL -- what are the
files like? Fixed format, or something like a dBase III style? I
presume also that you only need access to them in COBOL format long
enough to transfer them into MySQL -- true?

~ethan~
 

Bruno Desthuilliers

len wrote:
Hi all;

I am looking for a little direction in moving from novice Python/MySQL to real-world processing.
[snip]

I'm sure that there is some way to make all of this less verbose using classes and such, but I need some direction.

You may want to have a look at SQLAlchemy. It will require some additional learning, but chances are you'll waste less time than trying to roll your own half-baked ORM-like system.

My 2 cents...
 

len

len said:
Hi all;
[snip]

Here is my problem. I need to start doing this in the real world at my company, converting some older COBOL systems and data to Python programs and MySQL. I have gotten past packed decimal fields and various other little tidbits. My problem is the data files aren't little three- or four-field files but a Customer File with 98 fields, etc. I understand building dictionaries and building with zip, and I have even seen a reference to using __setattr__ in an empty class, but I'm having a hard time moving past the little code snippets to real code.
[snip]

Thanks Len

I've never had the (mis?)fortune to work with COBOL -- what are the
files like?  Fixed format, or something like a dBase III style?  I
presume also that you only need access to them in COBOL format long
enough to transfer them into MySQL -- true?

~ethan~

Files are fixed format, no field delimiters; fields are position and length, and records are terminated by a newline. In COBOL, the read statement that reads a record from the file automatically mapped the data to the field names in the COBOL file definition.

In Python you as the programmer have to do the mapping of data to field names, whether that is using lists and numeric indexing (list[n]), dictionaries (file['fieldname'] = value), or attributes (self.fieldname = value through some class). Now in my case I literally have a couple of hundred files, and each file may have 20 or 30 field names, and in several cases 100 to 150 fields (the customer file alone has 98). So as you can imagine, standardizing the mapping is a big deal to me.

Now all of the sample code you find (understandably) usually shows SQL and Python code manipulating 3 or 4 fields at most and 1 or 2 tables at a time. In the real world I have programs that will need to work on 5, 10, and 15 files at a time and hundreds of fields. Basically it is the difference between writing your Java, C++, or Python program to complete your programming-language assignment for your college class, and then graduating, getting a job, and being told to write the company's new CRM or ERP system. You can find plenty of beginner tutorials and code snippets, or esoteric code for landing the lunar lander, but where is the middle ground?

That is the stuff I'm looking for.

Please understand this is not a rant against SQL or Python or their communities, but at my own progress in these toward becoming a competent programmer, and, as I'm sure every programmer in the world has experienced, it just never happens fast enough.

Len
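One way to standardize the mapping described above is to declare each file's fields once as (name, start, length) tuples and drive the parsing from that table. A sketch under assumed field positions; the layout below is invented, not the actual customer file:

```python
# Invented layout: declare each field once; the same table drives parsing
# whether the record has 3 fields or 98.
CUSTOMER_LAYOUT = [
    ("cust_number", 0, 6),
    ("cust_name", 6, 30),
    ("cust_city", 36, 20),
    # ...one line per field, however many the file defines
]

def parse_record(line, layout):
    """Map one fixed-width line to a {fieldname: value} dict."""
    return {name: line[start:start + length].rstrip()
            for name, start, length in layout}

line = "000123" + "Acme Widgets".ljust(30) + "Springfield".ljust(20)
customer = parse_record(line, CUSTOMER_LAYOUT)
# customer["cust_name"] == "Acme Widgets"
```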
 

len

len wrote:
Hi all;
[snip]

I'm sure that there is some way to make all of this less verbose using classes and such, but I need some direction.

You may want to have a look at SQLAlchemy. It will require some additional learning, but chances are you'll waste less time than trying to roll your own half-baked ORM-like system.

My 2 cents...

I have looked at SQLAlchemy and will start using it. I do feel that I need to hand-code a few things just to get an understanding of what's involved. I find that this usually makes the package a little easier to use and generally makes me a better user of the package, or at least it has in the past.

Len
 

Ethan Furman

len said:
len said:
Hi all;
[snip]

Files are fixed format, no field delimiters; fields are position and length, and records are terminated by a newline. In COBOL, the read statement that reads a record from the file automatically mapped the data to the field names in the COBOL file definition.
[snip]

Len

Are the COBOL file definitions available in a file that can be parsed, or are they buried in the source code?

What type of data is in the files? Integer, float, character, date, etc.

Once you have the data out, will you need to access these same COBOL files in the future? (i.e., is more data being added to them that you will need to migrate?)

~ethan~
 

len

Sounds like standard COBOL record definitions. The next factor would be whether they are text format (human readable) or COBOL binary format (and if so, whether they are using COMP-1 integers or COBOL standard packed decimal)... Given the mention of new-line termination, probably not binary (though technically, COBOL's fixed-width files don't even require a new-line).

In either event, use of the struct module to break the input record into a cluster of Python strings is probably useful, and may be more efficient than a series of string-slicing operations.
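A sketch of the struct approach, using the 3/4/5-character layout from elsewhere in the thread; the sample record and field meanings are made up:

```python
import struct

# One format string describes the whole record: three text fields of
# widths 3, 4, and 5 characters (12 bytes total).
record_struct = struct.Struct("3s4s5s")

raw = b"ABC0042123.5"  # invented sample record
code, qty, price = (b.decode("ascii") for b in record_struct.unpack(raw))
# code == "ABC", int(qty) == 42, float(price) == 123.5
```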

Also, if the conversion is from file direct to database, it is likely safe to leave most of the fields in text format, since MySQLdb passes everything as delimited strings in the INSERT statement -- converting from "123.5" to float("123.5") -> 123.5 only to have cursor.execute() convert it back to "123.5".

Exception: you might want to convert date/time fields into Python date/time objects and let MySQLdb handle conversion to/from MySQL datetime formats.




        Hmmm, ever seen COBOL source? <G>

Nothing is buried in COBOL -- the data division should have nicely laid out record representations... (it's been some time, so this is pseudo-COBOL)

01  MYRECORD
    03  NAME        PIC A(50)
    03  DATE
        05  MONTH   PIC 99
        05  DAY     PIC 99
        05  YEAR    PIC 9999
    03  AGE         PIC 999
    03  ADDRESS
        05  STREET  PIC X(50)
        05  CITY    PIC A(50)
        05  STATE   PIC A(50)
        05  ZIP     PIC 99999-9999


If new-line terminated, likely all is human-readable text -- see my above comment re: numeric conversions and MySQL.


        That is what I considered key also...

Best would be a one-time conversion -- once the new applications have been checked out -- meaning the converter may be used multiple times during development and testing of the new applications (to refresh the development database with production data), but in the end the files become defunct and the new input process loads directly to the production database.

        No indication of what type of processes the existing COBOL
application is performing, but I can easily visualize a pre-database
processing style, using sorted input files, with parallel readings

read EMPLOYEE (with salary rate)
read TIMECARD (with hours)

while EMPLOYEE.ID < TIMECARD.ID
    write EXCEPTION No timecard for EMPLOYEE
    read EMPLOYEE
while TIMECARD.ID < EMPLOYEE.ID
    write EXCEPTION No employee for TIMECARD
    read TIMECARD

compute and write paycheck

repeat until EOF on both EMPLOYEE and TIMECARD
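The matched-read loop above, roughly transcribed into Python over two sorted lists of (id, value) pairs; all names and sample data are illustrative:

```python
def match_records(employees, timecards):
    """Pair sorted (id, rate) and (id, hours) records; report strays."""
    paychecks, exceptions = [], []
    e = t = 0
    while e < len(employees) and t < len(timecards):
        emp_id, rate = employees[e]
        tc_id, hours = timecards[t]
        if emp_id < tc_id:
            exceptions.append("No timecard for employee %s" % emp_id)
            e += 1
        elif tc_id < emp_id:
            exceptions.append("No employee for timecard %s" % tc_id)
            t += 1
        else:
            paychecks.append((emp_id, rate * hours))
            e += 1
            t += 1
    # Whatever is left on either side has no match.
    exceptions += ["No timecard for employee %s" % i for i, _ in employees[e:]]
    exceptions += ["No employee for timecard %s" % i for i, _ in timecards[t:]]
    return paychecks, exceptions

pay, errs = match_records([(1, 10.0), (2, 12.0)], [(2, 40), (3, 8)])
# pay pairs employee 2 with timecard 2; errs flags employee 1 and timecard 3
```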

{side note: apologies for piggy-backing -- the original poster is using
an address that my filters are set to kill; as most of the spam on this
group has the same domain}
--
        Wulfraed        Dennis Lee Bieber               KD6MOG
        (e-mail address removed)              (e-mail address removed)
                HTTP://wlfraed.home.netcom.com/
        (Bestiaria Support Staff:               (e-mail address removed))
                HTTP://www.bestiaria.com/

If anyone is interested, I have just posted on the group under the title

'Newbie code review of parsing program Please'

Len
 

Lawrence D'Oliveiro

len said:
Files are fixed format, no field delimiters; fields are position and length, and records are terminated by a newline.

Assuming no COMPUTATIONAL fields, it should be easy enough to split each line up into fixed-length pieces, e.g. assuming a simple example

01 Sample-Record.
02 Field-1 pic XXX.
02 Field-2 pic XXXX.
02 Field-3 pic XXXXX.

then a Python sequence that reads one line's worth of fields might be

line = infile.readline()
(field_1, field_2, field_3) = (line[0:3], line[3:7], line[7:12])
 

Tim Chase

Files are fixed format, no field delimiters; fields are position and length, and records are terminated by a newline.

Assuming no COMPUTATIONAL fields, it should be easy enough to split each line up into fixed-length pieces, e.g. assuming a simple example

01 Sample-Record.
02 Field-1 pic XXX.
02 Field-2 pic XXXX.
02 Field-3 pic XXXXX.

then a Python sequence that read one line's worth of fields might be

line = infile.read()
(field_1, field_2, field_3) = (line[0:3], line[3:7], line[7:12])


A recent posting on the list offered an elegant solution to this
with a function something like

def splitter(s, *lens):
    offset = 0
    pieces = []
    for length in lens:
        pieces.append(s[offset:offset + length])
        offset += length
    return pieces


which could then be used to simplify that to

(f1, f2, f3) = splitter(line, 3, 4, 5)

It may not be quite so significant with just 3 items, but the OP
mentioned having a large number of items in each record. One
could even use something like a mapping for this. Something like:

field_lens = [
    ("field1", 3),
    ("field2", 4),
    ("field3", 5),
    # stacks more
]
fields = dict(zip(
    (name for name, _ in field_lens),
    splitter(line, *[length for _, length in field_lens])
))
something = "Whatever %s you want" % fields["field2"]

If you like this method, you can even make a more targeted
splitter() function and add some function mappings like

field_lens = [  # somewhat similar to your COBOL masks
    ("field1", 3, str),
    ("field2", 4, int),
    ("field3", 5, float),
    ("field4", 17, lambda s: s.strip().upper()),
    # stacks more
]


def splitter(s, field_lens):
    pieces = {}
    offset = 0
    for name, length, fn in field_lens:
        pieces[name] = fn(s[offset:offset + length])
        offset += length
    return pieces

bits = splitter(line, field_lens)
# do addition of int+float rather than string concat
print bits["field2"] + bits["field3"]

-tkc
 

len

len said:
Files are fixed format, no field delimiters; fields are position and length, and records are terminated by a newline.
[snip]

    line = infile.readline()
    (field_1, field_2, field_3) = (line[0:3], line[3:7], line[7:12])

Thank you for the suggestion. I have done what you have suggested, and if you're interested you can take a look at the code in my post titled

'Newbie code review of parsing program Please'

Just shows great minds run in the same gutter ;)

Len
 

len

Tim Chase said:
A recent posting on the list offered an elegant solution to this with a function something like

   def splitter(s, *lens):
       offset = 0
       pieces = []
       for length in lens:
           pieces.append(s[offset:offset + length])
           offset += length
       return pieces

[snip]

-tkc

Thanks for your reply. I will have to study your code and see if I can improve mine.

Len
 
