About Databases...

A

andrea_gavana

Hello NG,

I am still quite a newbie with Python (I intensely use wxPython, anyway).
I would like to know what are, in your opinions, the best/faster databases
that I could use in Python (and, of course, I should be able to "link" everything
with a wxPython GUI)? Specifically, I work on Reservoir Simulation, and
usually I have to store a discrete/huge amount of data (it depends on the
oil field). As you may have understood, I know almost NOTHING about databases
;-)
In general, my data will be numeric (floats, integers). Will a binary storage
(if it is possible) reduce the size of the DB? And what about speed in storing/retrieving
data?

Thank you a lot for every suggestion.

Andrea.
 
D

Dennis Lee Bieber

oil field). As you may have understood, I know almost NOTHING about databases

Then your first approach should be a book on relational database
design, normalization (and when to violate normalization), and SQL.
Preferably a book that is NOT specific to ANY Relational DBMS. Though
I'd probably avoid any book that has chapters on Hierarchical DBMS,
CODASYL/DBTG Network DBMS, or Relational Algebra and/or Relational
Calculus -- Unless you're running on 1970s "big iron", but in that case,
you may not have Python by which to even attempt to access them... <G>

After you've gotten the generics down, then you can consider the
specifics of any available RDBMS...

Python modules likely exist for: M$ JET (I'm presuming you're on
a Windows box, including the server), M$ SQL Server/MSDE, MySQL, MaxDB
by MySQL (their name for the repackaged SAP DB), PostgreSQL, Firebird
(the descendent of Interbase). I'm pretty sure all of the above run on
WinNT, and most are even available for W9x.
In general, my data will be numeric (floats, integers). Will a binary storage
(if it is possible) reduce the size of the DB? And what about speed in storing/retrieving
data?
I don't know of any RDBMS that /doesn't/ support binary
numerics, though they may do so with 32-bit-only integer and 64-bit-only
floats (many though have short and long ints, and single or double
floats). Many also have either a decimal or currency data type. However,
IN the SQL query, your data will be formatted as text during the
insertion, and converted back to binary by the RDBMS server.

If you need to store what are essentially large arrays, which
will never by accessed via the RDBMS piecemeal (IOW: you will retrieve
or store the entire array at once) you'll want to check on the
capabilities of BLOBs (Binary Large OBjects, as I recall) in each
candidate RDBMS.

--
 
R

Robert Kern

Hello NG,

I am still quite a newbie with Python (I intensely use wxPython, anyway).
I would like to know what are, in your opinions, the best/faster databases
that I could use in Python (and, of course, I should be able to "link" everything
with a wxPython GUI)? Specifically, I work on Reservoir Simulation, and
usually I have to store a discrete/huge amount of data (it depends on the
oil field). As you may have understood, I know almost NOTHING about databases
;-)
In general, my data will be numeric (floats, integers). Will a binary storage
(if it is possible) reduce the size of the DB? And what about speed in storing/retrieving
data?

If your data is numeric, you'll probably want to use PyTables.

http://pytables.sourceforge.net/html/WelcomePage.html

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
L

Lee Harr

Hello NG,

I am still quite a newbie with Python (I intensely use wxPython, anyway).
I would like to know what are, in your opinions, the best/faster databases
that I could use in Python (and, of course, I should be able to "link" everything
with a wxPython GUI)? Specifically, I work on Reservoir Simulation, and
usually I have to store a discrete/huge amount of data (it depends on the
oil field). As you may have understood, I know almost NOTHING about databases
;-)
In general, my data will be numeric (floats, integers). Will a binary storage
(if it is possible) reduce the size of the DB? And what about speed in storing/retrieving
data?

Thank you a lot for every suggestion.

Andrea.


There is also the PostGIS extension for PostgreSQL
which may be of use in your situation:

http://postgis.refractions.net/


"""
PostGIS adds support for geographic objects to the
PostgreSQL object-relational database. In effect,
PostGIS "spatially enables" the PostgreSQL server,
allowing it to be used as a backend spatial database
for geographic information systems (GIS), much like
ESRI's SDE or Oracle's Spatial extension.
"""
 
R

Ruben Baumann

Hello NG,

I am still quite a newbie with Python (I intensely use wxPython, anyway).
I would like to know what are, in your opinions, the best/faster databases
that I could use in Python (and, of course, I should be able to "link"
everything
with a wxPython GUI)? Specifically, I work on Reservoir Simulation, and
usually I have to store a discrete/huge amount of data (it depends on the
oil field). As you may have understood, I know almost NOTHING about
databases
;-)
In general, my data will be numeric (floats, integers). Will a binary
storage
(if it is possible) reduce the size of the DB? And what about speed in
storing/retrieving
data?

Thank you a lot for every suggestion.

Andrea.
==============================================================================
Hello Andrea.

One thing you should consider, maybe the MOST important - What kinds of
reports are you/your bosses expecting to get out of your application? Are
you going to write the reports by hand? They can be a pretty expensive
proposition in terms of development time and energy.

Most RDBMS's meet your criteria of speed, numeric data, etc. I would
strongly encourage you to look into a database that can be accessed easily
by some kind of report writer such as Crystal Reports, or MS-Access, or some
of the other fine report writers out there. Unless of course you plan to do
all the heavy work of hand writing all your reports. Your bosses may ask
you to graph the data as well. Oracle(if you can afford it), MS-SQLServer,
MySQL, PostgreSQL, Even the Free MS-MSDE/SQL Express would work for you.
Most report writers have hooks to these DBMS's, or can use ODBC to get to
the data.

Granted, you have said that you know almost NOTHING about databases, but it
sounds to me like your app is a prime candidate for one, so maybe it's time
to learn. ;-) And, in my experience, sometimes the smallest project can
wind up a major project depending on how important or critical the need for
the data and it's presentation is. And no matter what reports you produce,
you'll always be called upon to produce more!

Good Luck!

Ruben
 
P

Peter A. Schott

Have to agree with others here - get a good database backend.

MSSQL/Oracle - good choices commercially.
Both of these offer good feature sets and have a lot of support from users.
However, they are commercial apps and not necessarily free. (You can play with
MS SQL Server in the Developer edition for $50 if you're interested - full
feature set with minor throttling to keep from being used in production.)

MySQL - fast, but not completely relational. This may or may not matter to you.
For what it does, it is very fast.

PostGreSQL - good feature set, open source. Not sure on speed, but it should
run natively on Unix, Linux, and Win32 (not sure about BSD or others).

There are quite a few others - these just come to mind off the top of my head.

I'd tend to stay away from a file-based system because people tend to want some
form of report off of data like this pretty quickly. Files may be fast, but
they're not easy to get data from.


As others have suggested, take a little time or pull in a consultant if
necessary to learn SQL and build a good database to store this. You may not
even need heavy normalization at first, though I'd keep in mind that repetitive
data may be best stored in lookup tables.

Putting this into a database will also give you access to some of the advanced
set-based operations that database engines tend to be very good with.



As for your questions: Numerics tend to be base types in most DB platforms.
They tend to take up less space than character values and can be operated on
mathematically without much trouble. I'd tend towards numeric as opposed to
float if you need accuracy.

With the right code in the backend modules/classes, you should be able to link
anything you need to objects in the database.

Best of luck,
-Pete Schott
 
D

Dennis Lee Bieber

However, they are commercial apps and not necessarily free. (You can play with
MS SQL Server in the Developer edition for $50 if you're interested - full
feature set with minor throttling to keep from being used in production.)
Micro$loth is charging for MSDE now?
MySQL - fast, but not completely relational. This may or may not matter to you.
For what it does, it is very fast.
And getting closer to full compliance...
PostGreSQL - good feature set, open source. Not sure on speed, but it should
run natively on Unix, Linux, and Win32 (not sure about BSD or others).
Isn't the "native" W32 a recent capability, or do you mean a
CYGWIN build?
There are quite a few others - these just come to mind off the top of my head.
Firebird -- developed from Interbase 6, released last summer.

Maybe "MaxDB by MySQL", developed from SAS DB (and likely
available with similar licenses as MySQL -- I don't know if MySQL AB
plan to merge SAS DB features into MySQL or the other way around <G>)


--
 
O

Ola Natvig

Dennis said:
Firebird -- developed from Interbase 6, released last summer.

Maybe "MaxDB by MySQL", developed from SAS DB (and likely
available with similar licenses as MySQL -- I don't know if MySQL AB
plan to merge SAS DB features into MySQL or the other way around <G>)

It was called SAP DB, I can recomend that one, the only trouble I've had
with it is the Python drives in a heavily threaded environment, even
with locking.
 
L

Leif B. Kristensen

Peter said:
PostGreSQL - good feature set, open source. Not sure on speed, but it
should run natively on Unix, Linux, and Win32 (not sure about BSD or
others).

A lot of the core developers of PostgreSQL are apparently running *BSD.
On the PostgreSQL-General mailing list, I've noted that people run it
on Solaris, AIX, HP-UX and all kinds of 'nixes. I think that the
keyword is POSIX compliance.

The native Win32 version (8.x) is very recent, and probably not a good
bet for a production environment for the time being. Earlier versions
may be run under Cygwin.

The "speed" of a database engine is notoriously hard to assess by
objective means. It will depend on tuning, indexing, load, read/write
ratio, number of concurrent users, and a lot of other variables. Some
RDBMSs also performs terribly in situations where others shine.

Anyway, I'd recommend a closer study of PostgreSQL for anyone interested
in a good RDBMS. I have been using MySQL for several years, but have
recently come to appreciate all the advanced features of PostgreSQL
that still is kind of "vaporware" in MySQL.
 
P

PA

Anyway, I'd recommend a closer study of PostgreSQL for anyone
interested
in a good RDBMS.

Talking of which, I would recommend to take a serious look at FrontBase:

http://www.frontbase.com/

Excellent database, running on pretty much anything. While not open
source, they do offer a free, unlimited license (E-Starter).

Cheers
 
D

Dennis Lee Bieber

It was called SAP DB, I can recomend that one, the only trouble I've had

Whoops... I did have that correct two days ago... <G>

SAS made the C compiler for my old Amiga (the only architecture
for which they /sold/ the compiler and not some end product built with
it).

--
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,774
Messages
2,569,599
Members
45,175
Latest member
Vinay Kumar_ Nevatia
Top