Python choice of database

Philippe C. Martin

Yes, I agree, but as most of the customer base I target uses the O/S that
cannot be named ;-), file names could become a problem, just as 'ln -s' is
out of the question.

Yet, this might be the best trade-off.

Regards,

Philippe
 
EP

Oren suggested:
How about using the filesystem as a database? For the number of records
you describe it may work surprisingly well. A bonus is that the
database is easy to manage manually.

I tried this for one application under the Windows OS and it worked fine...

until my records (text - maybe 50KB average) unexpectedly blossomed into
the 10,000-1,000,000 ranges. If I or someone else (who innocently doesn't
know better) opens up one of the directories with ~150,000 files in it,
the machine's personality gets a little ugly (it seems buggy but is just
very busy; no crashing). Under 10,000 files per directory seems to work
just fine, though.
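For reference, a minimal sketch of the one-file-per-record scheme being
discussed; the directory name and helper functions here are illustrative,
not from the thread:

import os

DB_DIR = 'db'  # illustrative location; one file per record, keyed by name

def put(key, text):
    with open(os.path.join(DB_DIR, key), 'w') as f:
        f.write(text)

def get(key):
    with open(os.path.join(DB_DIR, key)) as f:
        return f.read()

if not os.path.isdir(DB_DIR):
    os.mkdir(DB_DIR)
put('0001', 'some record contents')
print(get('0001'))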

For less expansive (and more structured) data, cPickle is a favorite.
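A small sketch of that pattern (cPickle is the Python 2 name; plain pickle
in Python 3; the record shown is made up):

try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3

record = {'name': 'example', 'size': 42}

# Serialize any (picklable) Python object straight to a file...
with open('record.pkl', 'wb') as f:
    pickle.dump(record, f, pickle.HIGHEST_PROTOCOL)

# ...and get the same structure back later.
with open('record.pkl', 'rb') as f:
    print(pickle.load(f))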
 
Jeremy Sanders

EP wrote:
until my records (text - maybe 50KB average) unexpectedly blossomed into
the 10,000-1,000,000 ranges. If I or someone else (who innocently doesn't
know better) opens up one of the directories with ~150,000 files in it,
the machine's personality gets a little ugly (it seems buggy but is just
very busy; no crashing). Under 10,000 files per directory seems to work
just fine, though.

Yes. Programs like "squid" use subdirectories to avoid this problem. If
your key is a surname, for instance, you can use its first letter to
divide the names up, or part of the key's hash value.

Many Linux FSs can cope with lots of files, but it doesn't hurt to try to
avoid this.
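A sketch of that squid-style fan-out, using a hash prefix as the bucket
(the 'db' directory and two-character fan-out are illustrative choices):

import hashlib
import os

def record_path(key):
    # The first two hex digits of the key's hash pick one of 256
    # buckets, so no single directory accumulates too many files.
    bucket = hashlib.md5(key.encode()).hexdigest()[:2]
    return os.path.join('db', bucket, key)

path = record_path('martin')
os.makedirs(os.path.dirname(path), exist_ok=True)
with open(path, 'w') as f:
    f.write('record contents')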

Jeremy
 
Charles Krug

EP wrote:

Oren suggested:
How about using the filesystem as a database? For the number of records
you describe it may work surprisingly well. A bonus is that the
database is easy to manage manually.
I tried this for one application under the Windows OS and it worked fine...

until my records (text - maybe 50KB average) unexpectedly blossomed
into the 10,000-1,000,000 ranges. If I or someone else (who
innocently doesn't know better) opens up one of the directories with
~150,000 files in it, the machine's personality gets a little ugly (it
seems buggy but is just very busy; no crashing). Under 10,000 files
per directory seems to work just fine, though.

For less expansive (and more structured) data, cPickle is a favorite.

Related question:

What if I need to create/modify MS-Access or SQL Server dbs?
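One common route (the thread doesn't name a library) is ODBC, for example
via the third-party pyodbc module; the driver string, file path, and table
below are made up for illustration:

import pyodbc  # third-party: pip install pyodbc

conn = pyodbc.connect(
    r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};'
    r'DBQ=C:\data\example.mdb')
cur = conn.cursor()
cur.execute('SELECT fName, lName FROM users')  # hypothetical table
for row in cur.fetchall():
    print(row.fName, row.lName)
conn.close()

Swapping the driver string, e.g.
'DRIVER={SQL Server};SERVER=myhost;DATABASE=mydb;Trusted_Connection=yes',
points the same code at SQL Server.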
 
GMane Python

For my database, I have a table of user information with a unique
identifier, and then I save my bitmap files to the filesystem, placing the
unique identifier and the date and time information into the filename. Why
stick a photo into a database?

For instance:

User Table:
uniqueID: 0001
lName: Rose
fName: Dave

Then save the bitmap with filename:
0001_13:00:00_06-21-2005.bmp

To make things faster, I also have a table of filenames saved, so I can know
exactly which files I want to read in.
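A sketch of that naming scheme (note that ':' is not legal in Windows
filenames, so this version substitutes '.' in the time part; the helper
name is illustrative):

import datetime

def bitmap_filename(unique_id):
    now = datetime.datetime.now()
    # e.g. 0001_13.00.00_06-21-2005.bmp
    return '%s_%s_%s.bmp' % (
        unique_id, now.strftime('%H.%M.%S'), now.strftime('%m-%d-%Y'))

print(bitmap_filename('0001'))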

-Dave
 
Peter Hansen

GMane said:
For my database, I have a table of user information with a unique
identifier, and then I save my bitmap files to the filesystem, placing the
unique identifier and the date and time information into the filename. Why
stick a photo into a database?

There are various possible reasons, depending on one's specific situation.

A database allows you to store the date and time info, or other
attributes, as separate fields so you can use standard SQL (or whatever
your favourite DB supports) to sort, query, and retrieve.

A database makes it possible to update or remove the photo in the same
manner you use to access all your other data, rather than requiring you
to deal with filesystem idiosyncrasies and exceptional conditions.

A database such as SQLite will store *all* your data in a single file,
making it much easier to copy for archival purposes, to send to someone
else, or to move to another machine.

Then save the bitmap with filename:
0001_13:00:00_06-21-2005.bmp

A database shouldn't make you jump through hoops to create "interesting"
file names just to store your data. :)

To make things faster, I also have a table of filenames saved, so I can know
exactly which files I want to read in.

Oh yeah, databases can have indexes (or indices, if you will) which let
you get that sort of speedup without having to resort to still more
custom programming. :)

Not that a database is always the best way to store an image (or to do
anything else, for that matter), but there are definitely times when it
can be a better approach than simple files. (There are disadvantages
too, of course, such as making it harder to "get at" the data from
outside the application which created it. In the case of bitmaps, this
might well be a deciding factor, but each case should be addressed on
its own merits.)
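A minimal sketch of the SQLite point above, using the standard sqlite3
module (the table, column names, and file paths are illustrative):

import sqlite3

conn = sqlite3.connect('photos.db')  # the whole database is this one file
conn.execute('''CREATE TABLE IF NOT EXISTS photos
                (user_id TEXT, taken TEXT, image BLOB)''')

# Assumes a bitmap file exists to load; names here are made up.
with open('0001.bmp', 'rb') as f:
    conn.execute('INSERT INTO photos VALUES (?, ?, ?)',
                 ('0001', '2005-06-21 13:00:00', sqlite3.Binary(f.read())))
conn.commit()

# Sorting and filtering by date need no filename parsing:
for user_id, taken in conn.execute(
        'SELECT user_id, taken FROM photos ORDER BY taken'):
    print(user_id, taken)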

-Peter
 
Philippe C. Martin

I guess I use databases to store .... data ;-) and I do not wish to worry
about the type of data I'm storing. That's why I love to pickle.

I understand that during an optimization phase, decisions might be taken to
handle data otherwise.

Regards,

Philippe
 
Christos TZOTZIOY Georgiou

For very short keys and records (e.g. email addresses) you can use
symbolic links instead of files. The advantage is that you have a
single system call (readlink) to retrieve the contents of a link. No
need to open, read and close.

readlink does open, read and close too. And why go through
indirection? Why not make indexes into subdirectories, say, and
hard-link the records under different filenames?

This works only on posix systems, of course.

There aren't any non-posix-conformant --or, at least, any
non-self-described-as-posix-conformant :)-- operating systems in wide
use today.

Hint: win32file.CreateHardLink
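A sketch of the symlink trick quoted above (POSIX only; the 'db'
directory and helper names are illustrative):

import os

def put(key, value):
    # The record's contents live in the link *target*, so no file body
    # is ever written; one readlink() call retrieves them.
    path = os.path.join('db', key)
    if os.path.islink(path):
        os.remove(path)
    os.symlink(value, path)

def get(key):
    return os.readlink(os.path.join('db', key))

os.makedirs('db', exist_ok=True)
put('philippe', 'pcm@example.com')
print(get('philippe'))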
 
Christos TZOTZIOY Georgiou

EP wrote:
I tried this for one application under the Windows OS and it worked fine...

until my records (text - maybe 50KB average) unexpectedly blossomed into
the 10,000-1,000,000 ranges. If I or someone else (who innocently doesn't
know better) opens up one of the directories with ~150,000 files in it,
the machine's personality gets a little ugly (it seems buggy but is just
very busy; no crashing). Under 10,000 files per directory seems to work
just fine, though.

Although I am not a pro-Windows person, I have to say that directories
containing more than 10,000 files are not a problem for NTFS (at least
the NTFS of Win2000 and WinXP, in my experience), since AFAIK
directories are stored in B-tree format; the problem arises if one tries
to *view* the directory contents using Explorer. A command-line dir had
no problem on a directory with >15,000 files.
 
