Best python module for Oracle, but portable to other RDBMSes

D

dananrg

I'm a little confused about what's out there for database modules at:

http://python.org/topics/database/modules.html

What I'd like to do is use Python to access an Oracle 9.X database for
exporting a series of tables into one aggregated table as a text file,
for import into a mainframe database. The catch is that I'd like to
write code that wouldn't need to be changed (much) if we switched from
Oracle to some other RDBMS (still need to export out as a text file for
import to the mainframe database.

Looks like I'd probably want to use something that conforms to the
DB-API 2.0. On the module list, I see DCOracle2, but it hasn't been
updated since 2001. Anyone here use this module, and does it work with
both Oracle 9.2 and Oracle 10g? Are there other DB-API 2.0 compliant
modules I should be looking at? What are the advantages and
disadvantages of the ones that can access Oracle 9.2 and Oracle 10g
databases?

Thanks in advance.
 
O

Olivier

(e-mail address removed) a écrit :
I'm a little confused about what's out there for database modules at:

http://python.org/topics/database/modules.html

What I'd like to do is use Python to access an Oracle 9.X database for
exporting a series of tables into one aggregated table as a text file,
for import into a mainframe database. The catch is that I'd like to
write code that wouldn't need to be changed (much) if we switched from
Oracle to some other RDBMS (still need to export out as a text file for
import to the mainframe database.

Looks like I'd probably want to use something that conforms to the
DB-API 2.0.


You want to use cx_Oracle :

http://www.python.net/crew/atuining/cx_Oracle/index.html

which is nowadays the very best python Oracle driver.

Olivier
 
J

Jonathan Gardner

On database portability...

While it is noble to try to have a generic interface to these
libraries, the end result is that the databases are always different
enough that the interface just has to work differently. My experience
in going from one database to another is that you should revisit your
entire database interface anyway. (I remember going from Sybase to
Oracle when we had Perl's DBI way back when. It was still very, very
messy.)

So, pick a good module, learn it inside and out, and plan on using a
completely different module if you use a different database, with
perhaps at least a slightly different interface.
 
D

dananrg

Thanks Olivier and Jonathan.

Do either of you, or anyone else, know of a good open source data
modeling / ER-diagram / CASE tools? I'd like to be able to build
relatively simple schemas in one open source tool and be able to create
a database on different platforms as needed (e.g. MySQL, PostgreSQL,
Oracle, etc).

Just wondering what's out there.

Thanks.
 
J

Jonathan Gardner

I've never seen the points of those tools. Just lay it out on paper or
document it somewhere. Be consistant with your naming scheme and it
shouldn't be hard to see the relations. If found that the people who
don't understand how tables should relate to one another are also the
same people who don't understand the special arrows DBAs like to use.
 
D

dananrg

What would be the next best Oracle database module for Python next to
cx_oracle? I'd like to compare two and choose one, just for the sake of
seeing how two modules doing the same thing operate.

Also, does installing cx_oracle create registry entries or require
admin privs on a Windows XP machine? I see that cx_oracle is
distributed as an EXE.

Thanks.
 
D

dananrg

Also, what's the difference between something like cx_oracle and an
ODBC module? If I were to use an ODBC module (not trying to torture
myself here, I promise, but I just want to see what alternatives exist
and how they work).
 
?

=?ISO-8859-1?Q?Gerhard_H=E4ring?=

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

What would be the next best Oracle database module for Python next to
cx_oracle?

That would probably be DCOracle2.
I'd like to compare two and choose one, just for the sake of
seeing how two modules doing the same thing operate.

Also, does installing cx_oracle create registry entries or require
admin privs on a Windows XP machine? I see that cx_oracle is
distributed as an EXE.

It's most probably created using distutils and "python setup.py
bdist_wininst". These installers only use the registry to look up the
path Python where is installed. Of course it will need a correctly
working Oracle client to operate.

- -- Gerhard
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFEBJxBdIO4ozGCH14RAqXCAJ9Vq6L8SLvnhlBCDc4EzwloJYp28ACfVt8J
TNN+XgNxFLmQscu9wpPIK4M=
=txAA
-----END PGP SIGNATURE-----
 
M

Magnus Lycka

Jonathan said:
On database portability...

While it is noble to try to have a generic interface to these
libraries, the end result is that the databases are always different
enough that the interface just has to work differently.

Considering the use case in question...

"What I'd like to do is use Python to access an Oracle 9.X database for
exporting a series of tables into one aggregated table as a text file,
for import into a mainframe database."

....it certainly seems reasonable to achieve this without too much
modifications between database engines. The problem I see directly
is if it uses the system tables to figure out what to export, but
if it doesn't, I don't forsee any big problems. There are even
ebcdic codecs in Python! :)

Read the DB-API 2 spec well. You might also want to look at
http://www.thinkware.se/cgi-bin/thinki.cgi/DatabaseProgrammingWithPython

I guess the main differences would be the connect string and
parameter passing.

Using ODBC (e.g. via mxODBC) should remove these problems.
Assuming that you use a subset of SQL which is supported by all
your engines (seems reasonable for this use case) it should be
enough to change ODBC data source to select data from either
Oracle or some other server.

Once upon a time, ODBC meant a significant performance penalty.
I don't know if that is still true.

Note that mxODBC has a licence that doesn't allow it to be used
freely in commercial contexts.

There is also an ODBC driver in the Python Windows extensions,
but I don't think it's been actively developed for many years.
It's not complient with DB API 2. There is another DB API 2
driver for ODBC sources on Windows called adodbapi, but in my
experience, it's slow and has problems with unicode strings in
its error handling.
 
D

dananrg

Thanks Gerhard and Magnus. Magnus, thanks for the references. I will
follow up on those.

I was messing around with the native ODBC module you mentioned (I am
using Python in a Win32 environment), e.g:

import dbi, odbc

....and it seems to meet my needs. The only issue I've had so far is
retrieving data from Oracle when an integer has been defined like:

number(p) [same thing as number(p,0) evidently

This is from a database I didn't design and can't change. The problem
is that the ODBC module suffixes an "L" to any integer returned that
was defined as data type number(p). For example, an integer stored as:
56 will be returned as 56L. Numbers that were specified as
number(p,s), the module has no problem with.

Anyone know why this would happen?

Incidentally, performance isn't an issue for this particular use case.
 
D

dananrg

The other thing I didn't do a good job of explaining is that I want to
have a layer of abstraction between the underlying RDBMS and the
business logic. It's the business logic I want to use Python for, so
that would stay roughly the same between RDBMS changes, if we ever have
an RDBMS change. I agree that I probably have more things to worry if I
was to change RDBMS vendors than what I'm describing here.
 
M

Magnus Lycka

The other thing I didn't do a good job of explaining is that I want to
have a layer of abstraction between the underlying RDBMS and the
business logic. It's the business logic I want to use Python for, so
that would stay roughly the same between RDBMS changes, if we ever have
an RDBMS change. I agree that I probably have more things to worry if I
was to change RDBMS vendors than what I'm describing here.

Have a look at SQLAlchemy.
 
M

Magnus Lycka

This is from a database I didn't design and can't change. The problem
is that the ODBC module suffixes an "L" to any integer returned that
was defined as data type number(p). For example, an integer stored as:
56 will be returned as 56L. Numbers that were specified as
number(p,s), the module has no problem with.

Anyone know why this would happen?

I'm sure the Python tutorial explains the difference between integer
and long types. Fields of type NUMBER or DECIMAL might well be larger
than sys.maxint, so you always get longs back when you fetch data
from such a column. This is as it should be.

What seems to be the problem?

If you actually get a suffixed L in the resulting text file, you
are using a strange way to convert your data to text. You aren't
simply printing lists or tuples are you? Then other types, such as
datetime objects will also look bizarre. (Not that the ancient
odbc would support that...)

You might want to look at the csv module for text export.
 
D

dananrg

If you actually get a suffixed L in the resulting text file, you
are using a strange way to convert your data to text. You aren't
simply printing lists or tuples are you? Then other types, such as
datetime objects will also look bizarre. (Not that the ancient
odbc would support that...)
You might want to look at the csv module for text export.

Thanks Magnus. I didn't know there was a csv module.

Here's how I'm getting the suffixed "L"

import dbi, odbc # Import ODBC modules
connectString = 'odbc_con_name/username/password'
dbc = odbc.odbc(connectString) # Connect to Oracle
cursor = dbc.cursor() # Create cursor
sql = "select statement here..." # Define SQL statement
cursor.execute(sql) # Execute sql statement
allRecords = cursor.fetchall() # Fetch all returned records
into a list of tuples
numRecords = len(allRecords) # Get num of records returned by
the query

# Note: I'm leaving out the for loop for this example...

# Print first record:
print allRecords[0]

# Convert first tuple to a list so I have a mutable object
recordList = list(allRecords[0])

# Print new list
print recordList
[872L, 'ACTIVE', <DbiDate object at 011F6000>, <DbiDate object at 00EA1428>, None, '1.0.0.0', None, None, None]

# Convert long integer to short integer (int) to get rid of the "L"
recordList[0] = int(recordList[0])

# Print list with changed item. No more "L"
print recordList[0]
[872, 'ACTIVE', <DbiDate object at 011F6000>, <DbiDate object at 00EA1428>, None, '1.0.0.0', None, None, None]
# The End

Are you saying I'm getting the "L" as an artifact of printing?
 
S

skip

dananrg> Are you saying I'm getting the "L" as an artifact of printing?

No, you're getting the "L" because you're printing a long integer. If you
execute

x = 872L
y = 872

at a Python prompt, x will be a long integer and y will be an integer. Long
integers can represent arbitrarily large numbers (subject only to memory
limitations). Integers are signed objects that are generally the same size
as the C long int type. They are currently two more-or-less distinct types.
As time goes on they are converging though. By the time Python 3.0 is
released I suspect there will be no difference.

If passing a long integer to some other routine is a problem (because it can
only accept regular integers) you can always convert it explicitly:

z = int(x)

Skip
 
M

Magnus Lycka

# Print new list
print recordList

[872L, 'ACTIVE', <DbiDate object at 011F6000>, <DbiDate object at 00EA1428>, None, '1.0.0.0', None, None, None]

Read the Python library manual chapter 2. Read all of it, it's all
very useful information, but take a particular look at str() and
repr(). All Python object can be "viewed" in two standard ways, via
the str() or repr() functions. In short, the str() stringification
is typically to be more end-user friendly, while the repr() stringi-
fication is more intended to properly identify exactly what kind of
an object we see: what type it is, and often the value too. (Above,
you don't see any reasonable value at all in the DbiDate objects,
but for some reason that didn't seem to bother you as much as the
suffixed L on the long ints.)

When you just print a Python object x of some kind, i.e.

print x

it will be equivalent of

print str(x)

To see the other representation, use

print repr(x)

Python collections, such as lists, tuples and dicts, aren't really
intended to be printed as is to end users. If recordList is a list,
and there is a statement "print recordList", it's probable that it
is intended as a diagnostic help to a programmer during development,
rather than to an end user. So, it's rather clever to use the repr()
stringification, so that it's clear exactly what we see, e.g. all
strings are quoted, so you clearly see things as trailing spaces,
can differentiate between tabs and sequences of spaces, and aren't
confused by commas inside the strings. Also, for longs, you get a
trailing L to indicate that this isn't simply a normal integer, but
an arbitrarily long one.
 
D

dananrg

dananrg> Are you saying I'm getting the "L" as an artifact of printing?

No, you're getting the "L" because you're printing a long integer. If you
execute

x = 872L
y = 872

at a Python prompt, x will be a long integer and y will be an integer. Long

Thanks Skip. Much appreciated.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,756
Messages
2,569,535
Members
45,007
Latest member
OrderFitnessKetoCapsules

Latest Threads

Top