beta test release - catchmail for Python - new open source utility

Discussion in 'Python' started by Andrew Stuart, Oct 3, 2004.

  1. Hello all,

    I am releasing a utility called catchmail as open source under a BSD-style
    license.

    Here is the catchmail homepage:

    Catchmail is a Python utility that writes emails into a Postgres database.

    Catchmail's SQL schema is based on an extended version of the Yukatan data
    model (a SQL schema for relational storage of email RFC822 messages).

    However, catchmail needs real-world testing and feedback before it can
    progress beyond beta. It needs more people to try it and check it out
    before a full-scale public release.

    If anyone has the time or the inclination, I would value a code review and
    advice on how to do things differently or better.

    I'm no great Python programmer, so volunteers interested in helping to
    enhance and support catchmail would be much appreciated. I
    have set up a newsgroup at

    There is also one remaining known problem that I would value advice on.
    Everything seems to be working fine except one thing - Unicode.

    If I create the Postgres database with this command, everything seems to
    run fine and I can import 4000 emails:
    createdb -U postgres catchmail;

    If I create the Postgres database with this command instead, Postgres
    starts to come back with Unicode errors when I do the import:
    createdb --encoding=UNICODE -U postgres test

    The import process starts to fail on lots of messages with this error:
    Database error: ERROR: invalid byte sequence for encoding "UNICODE":

    The objective is to have the database in Unicode, so I suppose it's quite
    an important problem to resolve. It looks to me like some sort of
    encoding/decoding requirement, but although I had a good look I couldn't
    sort it out.
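
    A minimal sketch of what this kind of error usually means, assuming
    (this is a guess, not something confirmed from the failing messages) that
    some emails contain 8-bit text in a charset such as ISO-8859-1. Those bytes
    are not valid UTF-8, so a Unicode database rejects them unless they are
    decoded with their real charset and re-encoded first. The sample bytes and
    the helper name `is_valid_utf8` are made up for illustration, and the code
    assumes Python 3.

```python
# Assumption: the rejected messages carry non-UTF-8 bytes such as Latin-1.
raw = b"caf\xe9"  # "cafe" with an acute accent, encoded as ISO-8859-1


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Decode with the charset the bytes were really written in,
# then re-encode as UTF-8 before handing them to the database.
fixed = raw.decode("iso-8859-1").encode("utf-8")
```

    Here `raw` fails the UTF-8 check while `fixed` passes it, which mirrors
    the difference between the two databases.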

    I'm afraid I don't much understand how unicode is meant to be used in this
    sort of application - if you can throw any light on it for me it would be

    How SHOULD unicode be implemented for a utility such as this? I'd like
    catchmail to be as flexible as possible and to lose as little data as
    possible through things like character set conversions.
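
    One possible approach, sketched under assumptions: decode each text part
    with the charset the message itself declares, fall back to a charset that
    can map every byte, and keep the original bytes next to the decoded text
    so nothing is lost. This uses the standard library `email` package on
    Python 3. The names `decode_text_part`, `extract_text_parts` and
    `FALLBACK_CHARSET` are hypothetical and do not describe how catchmail
    currently works.

```python
import email

# Assumption: ISO-8859-1 maps every byte value, so this fallback never fails.
FALLBACK_CHARSET = "iso-8859-1"


def decode_text_part(part):
    """Return (text, raw_bytes) for one non-multipart message part."""
    # decode=True undoes base64 / quoted-printable and returns bytes.
    raw = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or FALLBACK_CHARSET
    try:
        text = raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrongly declared charset: fall back rather than fail.
        text = raw.decode(FALLBACK_CHARSET)
    return text, raw


def extract_text_parts(message_bytes):
    """Parse an RFC 822 message and decode all of its text/* parts."""
    msg = email.message_from_bytes(message_bytes)
    return [
        decode_text_part(part)
        for part in msg.walk()
        if part.get_content_maintype() == "text"
    ]
```

    The decoded text could go into a Unicode text column and the raw bytes
    into a binary column, so a wrong charset guess can be corrected later.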

    I found some references to client encoding and multibyte in the postgres
    docs here - but maybe it should be fixed in the Python code?
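
    For the client encoding route, PostgreSQL has a `SET client_encoding`
    statement that tells the server which encoding the client is sending, so
    the server converts it to the database encoding. A minimal sketch,
    assuming a DB-API 2.0 style connection object; the function name
    `set_client_encoding` is hypothetical. This only helps if every message
    really is in the named encoding, which mixed real-world mail may not be,
    so decoding in the Python code is probably still needed.

```python
import re

# Only allow simple encoding names such as LATIN1 or UTF8 in the statement.
_ENCODING_NAME = re.compile(r"[A-Za-z0-9_]+")


def set_client_encoding(conn, encoding="LATIN1"):
    """Issue SET client_encoding on a DB-API 2.0 style connection."""
    if not _ENCODING_NAME.fullmatch(encoding):
        raise ValueError("unexpected encoding name: %r" % (encoding,))
    cur = conn.cursor()
    try:
        # The name was validated above, so interpolating it is safe here.
        cur.execute("SET client_encoding TO '%s'" % encoding)
    finally:
        cur.close()
```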


    The latest version of catchmail is the one found on the website at

    Any feedback on catchmail or your experience with it is valued.

    Thanks to the great work of Mark Hammond and Jukka Zitting!

    Andrew Stuart
    a n d r e w . s t u a r t @ x s e . c o m . a u