python daemon - compress data and load it into MySQL via pyodbc

Discussion in 'Python' started by MacRules, Sep 3, 2009.

  1. MacRules

    MacRules Guest

    Hi,

    I installed the Python daemon and pyodbc modules to access the back-end DB server.

    My setup is like this

    load data job -> Python Daemon A, port 6000 -> Python Daemon B, port
    7000 -> MySQL

    Daemon A will perform data compression, such as GZIP, and send over data
    to Daemon B.
    Daemon B will perform data uncompression, GUNZIP, and insert records to
    MySQL or MSSQL or Oracle.
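
A minimal sketch of the compress/decompress pair for the Daemon A -> Daemon B leg (only the gzip round trip is shown; the host name in the comment is a placeholder):

```python
import gzip

def compress_records(lines):
    """Daemon A side: join the records and gzip them for the wire."""
    return gzip.compress("\n".join(lines).encode("utf-8"))

def decompress_records(payload):
    """Daemon B side: gunzip the payload back into individual records."""
    return gzip.decompress(payload).decode("utf-8").split("\n")

# Daemon A would then push the payload to Daemon B, e.g.:
#   import socket
#   with socket.create_connection(("daemon-b-host", 7000)) as s:
#       s.sendall(compress_records(rows))
```

Daemon B would run the inverse: read the bytes off its listening socket, call decompress_records(), and hand the rows to its database insert step.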

    Where should I start to code this?
    Can someone give me a hint, in as much detail as possible?

    I am a python newbie.

    Thanks for all the help I can get,
    MacRules, Sep 3, 2009
    #1

  2. Sean DiZazzo

    Sean DiZazzo Guest

    On Sep 2, 8:36 pm, MacRules <> wrote:
    > Hi,
    >
    > I installed Python daemon, pyodbc module to access the back-end DB server.
    >
    > My setup is like this
    >
    > load data job -> Python Daemon A, port 6000 -> Python Daemon B, port
    > 7000 -> MySQL
    >
    > Daemon A will perform data compression, such as GZIP, and send over data
    > to Daemon B.
    > Daemon B will perform data uncompression, GUNZIP, and insert records to
    > MySQL or MSSQL or Oracle.
    >
    > Where should I start to code this?
    > Can someone give me a hint, in as much detail as possible?
    >
    > I am a python newbie.
    >
    > Thanks for all the help I can get,


    Start by reading the tutorial. http://docs.python.org/tutorial/

    ~Sean
    Sean DiZazzo, Sep 3, 2009
    #2

  3. MacRules

    MacRules Guest

    Sean DiZazzo wrote:
    > On Sep 2, 8:36 pm, MacRules <> wrote:
    >> Hi,
    >>
    >> I installed Python daemon, pyodbc module to access the back-end DB server.
    >>
    >> My setup is like this
    >>
    >> load data job -> Python Daemon A, port 6000 -> Python Daemon B, port
    >> 7000 -> MySQL
    >>
    >> Daemon A will perform data compression, such as GZIP, and send over data
    >> to Daemon B.
    >> Daemon B will perform data uncompression, GUNZIP, and insert records to
    >> MySQL or MSSQL or Oracle.
    >>
    >> Where should I start to code this?
    >> Can someone give me a hint, in as much detail as possible?
    >>
    >> I am a python newbie.
    >>
    >> Thanks for all the help I can get,

    >
    > Start by reading the tutorial. http://docs.python.org/tutorial/
    >
    > ~Sean

    Are you a Python expert?
    Can you show me the basic code to get a daemon (pass-through insert
    data) up and inserting into the backend MySQL?
    MacRules, Sep 3, 2009
    #3
  4. David Smith

    David Smith Guest

    MacRules wrote:
    > Sean DiZazzo wrote:
    >> On Sep 2, 8:36 pm, MacRules <> wrote:
    >>> Hi,
    >>>
    >>> I installed Python daemon, pyodbc module to access the back-end DB
    >>> server.
    >>>
    >>> My setup is like this
    >>>
    >>> load data job -> Python Daemon A, port 6000 -> Python Daemon B, port
    >>> 7000 -> MySQL
    >>>
    >>> Daemon A will perform data compression, such as GZIP, and send over data
    >>> to Daemon B.
    >>> Daemon B will perform data uncompression, GUNZIP, and insert records to
    >>> MySQL or MSSQL or Oracle.
    >>>
    >>> Where should I start to code this?
    >>> Can someone give me a hint, in as much detail as possible?
    >>>
    >>> I am a python newbie.
    >>>
    >>> Thanks for all the help I can get,

    >>
    >> Start by reading the tutorial. http://docs.python.org/tutorial/
    >>
    >> ~Sean

    > Are you a Python expert?
    > Can you show me the basic coding to get a Daemon (pass through insert
    > data) up and insert to the backend MySQL?
    >


    You've asked a rather large and non-specific question. What avenues
    have you explored so far? Can you describe the problem this is designed
    to solve?

    --David
    David Smith, Sep 3, 2009
    #4
  5. MacRules

    MacRules Guest

    Ben Finney wrote:
    > MacRules <> writes:
    >
    >> Are you a Python expert?

    >
    > This group has many Python experts, and even more people who can no
    > doubt help you if you are able to articulate what you need help with.
    >
    > To help dispel a possible misunderstanding: Don't expect to have a
    > one-on-one conversation with a single devoted correspondent here. We
    > prefer to have these discussions in a manner open for contributions from
    > everyone, to maximise the benefit for future readers and the chance of
    > correcting errors.
    >
    >> Can you show me the basic coding to get a Daemon (pass through insert
    >> data) up and insert to the backend MySQL?

    >
    > This doesn't make a whole lot of sense to me; it sounds like a
    > conflation of two separate problems that would be best tackled
    > separately. Can you re-phrase it, perhaps by describing the problem
    > you're actually trying to solve?
    >
    > On the topic of implementing a daemon process in Python, there is the
    > ‘python-daemon’ library <URL:http://pypi.python.org/pypi/python-daemon>
    > which I hope you will find useful. If you can describe what you're
    > actually trying to do, perhaps it will be clearer whether this library
    > is a good fit.
    >


    What I am looking for is this.

    Oracle DB in data center 1 (LA, west coast)
    MSSQL DB in data center 2 (DC, east coast)
    So network bandwidth is an issue; I prefer to gzip first and then
    deliver the data.

    I need two Python daemons, or a web service, here in the future.
    I will enter the Oracle table name, user id and password.
    So the task is to dump out the Oracle data (on Linux) and insert it into MSSQL.


    I can try first with one Python daemon: take the Oracle data file, and let
    the daemon connect to MSSQL (with pyodbc) and load the data in.
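
The single-daemon variant could be sketched with the DB-API pattern that pyodbc follows; sqlite3 stands in here so the sketch runs anywhere, and the table/column names and DSN are hypothetical -- with a real target you would swap the connection for pyodbc.connect(...):

```python
import csv
import sqlite3  # stand-in; pyodbc.connect(...) yields the same DB-API interface

def load_dump(conn, dump_path):
    """Read a comma-delimited dump file and bulk-insert it over DB-API.

    Table and column names are hypothetical; pyodbc also uses '?' markers.
    """
    with open(dump_path, newline="") as f:
        rows = list(csv.reader(f))
    cur = conn.cursor()
    cur.executemany("INSERT INTO target (id, name) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)

# With the real back end the connection would come from something like:
#   conn = pyodbc.connect("DSN=mssql_dc2;UID=user;PWD=secret")  # hypothetical DSN
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, name TEXT)")
```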
    MacRules, Sep 3, 2009
    #5
  6. MacRules

    MacRules Guest

    Martin P. Hellwig wrote:
    > MacRules wrote:
    > <cut>
    >> What I am looking for is this.
    >>
    >> Oracle DB in data center 1 (LA, west coast)
    >> MSSQL DB in data center 2 (DC, east coast)
    >> So network bandwidth is an issue, I prefer to have gzip first and
    >> deliver the data.

    >
    > If bandwidth is really an issue, you should send compressed deltas.
    >
    >>
    >> I need 2 python daemons or a web service here in the future.
    >> I will enter the Oracle table name, user id and password.
    >> So the task is dump out Oracle data (Linux) and insert that to MSSQL.

    >
    > That is assuming the table is the same and the columns of the table have
    > the same type with the same restrictions and don't even get me started
    > about encoding differences.
    >
    >>
    >>
    >> I can try first with 1 daemon python. Take the Oracle data file, and
    >> let the daemon connects to MSSQL (with pyodbc) and load the data in.

    >
    > I think that you are underestimating your task, I would recommend to
    > 'design' your application first using an UML like approach, whether you
    > need one, two or bazillion daemons should not be a design start but a
    > consequence of the design.
    >
    > Anyway here is a sum up and some further pointers to take in regard for
    > your design:
    > - Can I do deltas? (if yes, how do I calculate them)
    > - Are the tables comparable in design
    > Specifically, if there is a limit on the character width, do Oracle
    > and MS-SQL think the same about newlines? (one versus two characters)
    > - How about encoding, are you sure it works out right?
    > - Is ATOMIC an issue
    > - Is latency an issue, that is how long may the tables be out of sync
    > - Can it be manual or is it preferably automatic
    > How about a trigger in the database that sets the sync going, if that
    > is too much burden how about a trigger that set a flag on the disk and a
    > scheduled job that reads that flag first
    > - Is security an issue, do I need an encrypted channel
    >
    > In the past I've written a program that had the same starting point as
    > you had; over time it grew in design to be a data warehouse push/pull
    > central server with all other databases as an agency. The only reason I
    > wrote it was because the more standard approach like Business Objects
    > Data Integrator was just way too expensive and Oracle's solutions didn't
    > play nice with PostgreSQL (though over time this issue seemed to be resolved).
    >


    You understand my issue clearly. For now, I am only looking at a one-time
    event; I do not need push/pull yet.

    A good push/pull + compression + security + real-time solution is
    something you could sell for decent money.
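
The compressed-delta idea from the quoted reply could be sketched like this (the first column is assumed to be the primary key; the tab-separated serialization is just one illustrative choice):

```python
import gzip

def delta_rows(source_rows, target_keys):
    """Keep only the rows whose primary key (assumed first column) is
    not already present on the target side."""
    return [row for row in source_rows if row[0] not in target_keys]

def pack_delta(rows):
    """Serialize the delta as tab-separated lines and gzip it for the wire."""
    text = "\n".join("\t".join(map(str, row)) for row in rows)
    return gzip.compress(text.encode("utf-8"))
```

Computing target_keys still costs a round trip (e.g. SELECT the key column from the target first), which is the trade-off the quoted reply is pointing at.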
    MacRules, Sep 3, 2009
    #6
  7. Dennis Lee Bieber

    On Thu, 03 Sep 2009 14:43:40 -0400, MacRules <>
    declaimed the following in gmane.comp.python.general:

    >
    > Oracle DB in data center 1 (LA, west coast)
    > MSSQL DB in data center 2 (DC, east coast)


    Note that your thread subject line states MySQL... There is a big
    difference between MySQL and M$ SQL-Server.

    > So network bandwidth is an issue, I prefer to have gzip first and deliver
    > the data.
    >
    > I need 2 python daemons or a web service here in the future.
    > I will enter the Oracle table name, user id and password.
    > So the task is dump out Oracle data (Linux) and insert that to MSSQL.
    >

    One: daemon implies something running in the background,
    stand-alone, and not associated with any "user" type account. IOW,
    something like a server program...

    Do you really intend to create a server application that sits around
    waiting for anyone to connect to it and send a properly formatted
    request? What safeguards do you intend to use to keep this daemon from
    responding to hackers (after all, you're going to have to have an open
    port to accept requests from outside)?

    Second: said daemon will have to be running ON the end with the
    Oracle DBMS (or, at least, within its LAN which presumably has higher
    bandwidth than the long distance connection).

    >
    > I can try first with 1 daemon python. Take the Oracle data file, and let
    > the daemon connect to MSSQL (with pyodbc) and load the data in.


    I see three operations here:

    1 dumping data from Oracle
    2 transferring data from LA to DC
    3 importing data to whatever DBMS you are really using.

    I believe, in a later post, you mention this is a one-time task. If
    so, WHY worry about network bandwidth... Just let it run overnight...

    Question: does the Oracle server permit remote connections?
    (dangerous unless behind a VPN type connection)

    If it does, I'd suggest the initial program should just open a
    connection to both the Oracle server and the local server. Request the
    contents of whatever needs to be transferred, collect the data, massage
    it into whatever format is needed for the local side, and insert it into
    the local DBMS... Anything else will require being able to log into the
    remote machine and running programs ON it. In which case you could just
    FTP the data from one end to the other (assuming there is an FTP server
    active on one or the other end -- or a web server address you can stuff
    the files into and use HTTP to fetch them)
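
That fetch-and-insert suggestion might look like the following DB-API sketch; batching with fetchmany() keeps memory bounded, and the table and column names are hypothetical. In practice src could be a cx_Oracle connection and dst a pyodbc one:

```python
def transfer(src, dst, batch=500):
    """Copy rows from one DB-API connection to another in batches."""
    read = src.cursor()
    write = dst.cursor()
    read.execute("SELECT id, name FROM source_table")  # hypothetical table
    while True:
        rows = read.fetchmany(batch)
        if not rows:
            break  # cursor exhausted
        write.executemany(
            "INSERT INTO target_table (id, name) VALUES (?, ?)", rows)
    dst.commit()
```

Any per-row massaging (encodings, newline conventions, as raised earlier in the thread) would go between the fetchmany() and the executemany().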
    --
    Wulfraed Dennis Lee Bieber KD6MOG
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Sep 4, 2009
    #7
  8. MacRules

    MacRules Guest

    Dennis Lee Bieber wrote:
    > On Thu, 03 Sep 2009 14:43:40 -0400, MacRules <>
    > declaimed the following in gmane.comp.python.general:
    >
    >> Oracle DB in data center 1 (LA, west coast)
    >> MSSQL DB in data center 2 (DC, east coast)

    >
    > Note that your thread subject line states MySQL... There is a big
    > difference between MySQL and M$ SQL-Server.
    >
    >> So network bandwidth is an issue, I prefer to have gzip first and deliver
    >> the data.
    >>
    >> I need 2 python daemons or a web service here in the future.
    >> I will enter the Oracle table name, user id and password.
    >> So the task is dump out Oracle data (Linux) and insert that to MSSQL.
    >>

    > One: daemon implies something running in the background,
    > stand-alone, and not associated with any "user" type account. IOW,
    > something like a server program...
    >
    > Do you really intend to create a server application that sits around
    > waiting for anyone to connect to it and send a properly formatted
    > request? What safeguards do you intend to restrict this daemon from
    > responding to hackers (after all, you're going to have to have an open
    > port to accept requests from outside).
    >
    > Second: said daemon will have to be running ON the end with the
    > Oracle DBMS (or, at least, within its LAN which presumably has higher
    > bandwidth than the long distance connection).
    >
    >> I can try first with 1 daemon python. Take the Oracle data file, and let
    >> the daemon connect to MSSQL (with pyodbc) and load the data in.

    >
    > I see three operations here:
    >
    > 1 dumping data from Oracle
    > 2 transferring data from LA to DC
    > 3 importing data to whatever DBMS you are really using.
    >
    > I believe, in a later post, you mention this is a one-time task. If
    > so, WHY worry about network bandwidth... Just let it run overnight...
    >
    > Question: does the Oracle server permit remote connections?
    > (dangerous unless behind a VPN type connection)
    >
    > If it does, I'd suggest the initial program should just open a
    > connection to both the Oracle server and the local server. Request the
    > contents of whatever needs to be transferred, collect the data, massage
    > it into whatever format is needed for the local, and insert it to the
    > local... Anything else will require being able to log into the remote
    > machine and running programs ON it. In which case you could just FTP the
    > data from one end to the other (assuming there is an FTP server active
    > on one or the other end -- or a web server address you can stuff the
    > files into and use HTTP to fetch them)


    I have 1000 servers, one run each, so it is 1000 runs.
    What I am trying to say is that I do not need real-time push/pull here.

    I know the manual operations; I just want Python to do them automatically for me.

    I'd like to see good examples of how to code a daemon, signal handling,
    and maybe multi-threading.

    I wrote something like that in C way back, and I'm good at shell scripts.
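
A sketch of the signal + threading part of that request: SIGTERM sets an event that the worker checks between servers, and the queue holds the list of servers to process. Actual daemonization (fork, setsid, etc.) is left to something like daemon.DaemonContext from the python-daemon package mentioned earlier in the thread:

```python
import queue
import signal
import threading

stop = threading.Event()

def handle_sigterm(signum, frame):
    """Ask the workers to finish.  A real daemon would install this after
    detaching, e.g. via daemon.DaemonContext(signal_map={signal.SIGTERM: ...})."""
    stop.set()

def worker(jobs, done):
    """Pull server names off the queue until it is empty or we are told to stop."""
    while not stop.is_set():
        try:
            server = jobs.get(timeout=0.1)
        except queue.Empty:
            break  # no more servers queued
        # ... dump from `server`, transfer, insert into the target DBMS ...
        done.append(server)

signal.signal(signal.SIGTERM, handle_sigterm)
```

The worker body is where the per-server dump/transfer/insert steps from earlier in the thread would go.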
    MacRules, Sep 4, 2009
    #8
  10. Dennis Lee Bieber

    On Fri, 04 Sep 2009 09:46:13 -0400, MacRules <>
    declaimed the following in gmane.comp.python.general:

    > I have 1000 servers, and 1 time each, so it is 1000 times.
    > What I try to say, I do not need real-time push/pull here.
    >

    So you are now saying you want access to install a "one time use"
    daemon on EACH of 1000 servers (all in LA? What is this -- consolidation
    of systems installed in individual 7/11 stores?), just so some remote
    host can connect to the daemon's open port, request data, wait while the
    daemon queries the DBMS and packages the data for transfer [and you are
    going to have to write a protocol to transfer this data too], and then go
    back and delete the daemon process and code from each of the 1000 servers...

    Somehow I think THAT is going to take more time than just doing a
    direct connection to the database itself, fetch the records,
    disconnect... Insert into the local DBMS.

    > I know the manual operations; I just want Python to do them automatically for me.
    >

    Well... First... Itemize exactly what those steps are... then code
    an algorithm that does just that... Then modify it to take a list of
    servers and repeat the process on each.

    If you're hoping to get multiple processes pulling data in parallel,
    and inserting it all into a single DBMS, you're going to run into the
    possibility of transactional conflicts and the related need to do
    rollbacks and restarts of transactions.
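
The rollback-and-restart concern above could be handled with a small wrapper; this is only a sketch, and a real version would catch the driver's specific error type (e.g. pyodbc.Error, checking for deadlock/serialization codes) rather than bare Exception:

```python
import random
import time

def with_retries(txn, attempts=3):
    """Run a transactional callable, retrying after a conflict/rollback."""
    for attempt in range(attempts):
        try:
            return txn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; let the caller see the failure
            # jittered backoff so parallel workers don't collide again at once
            time.sleep(random.uniform(0, 0.05) * (attempt + 1))
```

Each parallel puller would wrap its insert-and-commit step in with_retries(), re-running the whole transaction after a rollback.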

    --
    Wulfraed Dennis Lee Bieber KD6MOG
    HTTP://wlfraed.home.netcom.com/
    Dennis Lee Bieber, Sep 5, 2009
    #10
