client/server design and advice

TonyM

I recently completed the general guidelines for a future project that I
would like to start developing, but I've sort of hit a wall with
respect to how to design it. In short, I want to run through
approximately 5 GB of financial data, all of which is stored in a
large number of text files. As far as formatting and data
integrity go, I would go through and ensure that each file has the
required layout, so that's not really the issue. The problem I am having
is with respect to speed.

The languages I knew best coming into this project are
C++ and PHP. However, I then thought about how long it would take one
PC to iterate through everything and figured it would probably take a
significant amount of time. As such, I started looking into various
languages, and Python caught my interest the most due to its power and
what seems to be ease of use. I was initially going to use Python just
as a means of creating various indicators (i.e. calculations that would
be performed on the data in the files). However, I am leaning towards
moving to Python entirely, mostly due to its GUI support.

First off, I was wondering if this is a reasonable setup: The entire
process would involve a server which manages which PC is processing
which set of data (which may be a given text file or the like), and a
client application which I would run on a few PCs locally when they
aren't in use. I would have a database (SQLite) holding all calculated
data of significance. Each client will basically log in/connect to
the server, request a time interval (i.e. does anything need to be
processed? If so, what data should I look at?), and then update its status
with the server, which would place a lock on that data set.

One thing I was wondering is whether it would be worth it to use C++ for the
actual iteration through the text files, or should I simply use Python?
While I'm sure that C++ would be faster, I am not entirely sure it's
worth the headache if it's not going to save me significant processing
time. Another thing: if I were going to work with Python instead of
C++, would it be worth it to import all of the data into an SQLite
database beforehand (for speed)?

Lastly, as far as the networking goes, I have seen posts about
something called Pyro (http://pyro.sourceforge.net) and wondered if
that was worth looking into for the client/server interaction.

I apologize if any of these questions are too basic; this is
simply the first client/server application I've created, and I am doing so
in a language I've never used before ;)

Thanks for the help
-Tony
 
Diez B. Roggisch

TonyM wrote:
First off, I was wondering if this is a reasonable setup: The entire
process would involve a server which manages which PC is processing
which set of data (which may be a given text file or the like), and a
client application which I would run on a few PCs locally when they
aren't in use. I would have a database (SQLite) holding all calculated
data of significance. Each client will basically log in/connect to
the server, request a time interval (i.e. does anything need to be
processed? If so, what data should I look at?), and then update its status
with the server, which would place a lock on that data set.

Don't use SQLite, use a "real" RDBMS. SQLite is cool, but not really suited
for large amounts of data, and the concurrent-access handling that you get
for free with an RDBMS is not to be underestimated.

TonyM wrote:
One thing I was wondering is whether it would be worth it to use C++ for the
actual iteration through the text files, or should I simply use Python?
While I'm sure that C++ would be faster, I am not entirely sure it's
worth the headache if it's not going to save me significant processing
time. Another thing: if I were going to work with Python instead of
C++, would it be worth it to import all of the data into an SQLite
database beforehand (for speed)?

I'd be putting them in the DB, yes.
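
As a rough illustration of what "putting them in the DB" could look like, here is a small sketch that loads delimited text into a table through Python's DB-API. The `date,symbol,close` line layout, the `prices` table, and the `load_prices` and `average_close` helpers are assumptions made up for this example, since the thread does not describe the real file format. `sqlite3` stands in for whichever database is finally chosen.

```python
import csv
import sqlite3


def load_prices(conn, lines):
    """Insert `date,symbol,close` text lines into a hypothetical prices table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices (day TEXT, symbol TEXT, close REAL)"
    )
    # Parse lazily so a large file is never held in memory all at once.
    rows = (
        (day, symbol, float(close))
        for day, symbol, close in csv.reader(lines)
    )
    with conn:  # a single transaction for the whole batch
        conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)


def average_close(conn, symbol):
    """Example indicator: mean closing price for one symbol."""
    return conn.execute(
        "SELECT AVG(close) FROM prices WHERE symbol = ?", (symbol,)
    ).fetchone()[0]
```

Since an open file is an iterable of lines, `load_prices(conn, open(path, newline=""))` would work for a real file with that layout. Once the data is loaded, an indicator becomes a query rather than a pass over the text files.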

TonyM wrote:
Lastly, as far as the networking goes, I have seen posts about
something called Pyro (http://pyro.sourceforge.net) and wondered if
that was worth looking into for the client/server interaction.

Pyro rocks for that.

Diez
 
TonyM

Diez B. Roggisch wrote:
Don't use SQLite, use a "real" RDBMS. SQLite is cool, but not really suited
for large amounts of data, and the concurrent-access handling that you get
for free with an RDBMS is not to be underestimated.

Would PostgreSQL be suitable in this situation? I hadn't even thought
about the possible problems that could arise with concurrency, but I do
recall it being an issue the last time I worked with SQLite. I have
also looked into MySQL given my extensive experience with it; however,
PostgreSQL seems to be faster from what I've read. Either way, I'll work
on writing something to convert and insert the data so that it can
run while I'm working on the GUI and client/server apps.

Diez B. Roggisch wrote:
Pyro rocks for that.

Awesome, I'll look into it in greater detail and will most likely use
it. Given what I've seen so far, it looks like it will make the
client/server interface fairly easy to write.

Now...if only I could master Python GUI programming and development ;)
I'm not entirely sure which GUI library I'm going to end up using, but as of
now I'm leaning towards Tkinter, as I know it will work where I need it
and it seems to be one of the better documented. I've looked at and used
wxPython a little but had trouble figuring out a few minor things while
playing around with it initially. I've also thought about PyGTK,
although I haven't taken the time to play with it yet, as I
assumed it was primarily for Linux (I'd be running the majority of these
on Windows PCs).

Thanks for the suggestions :)
Tony
 
Irmen de Jong

TonyM said:
Lastly, as far as the networking goes, I have seen posts about
something called Pyro (http://pyro.sourceforge.net) and wondered if
that was worth looking into for the client/server interaction.

I'm currently busy with a new version of Pyro (3.6), and it already
includes a new 'distributed computing' example, where there is
a single dispatcher service and one or more 'worker' clients.
The clients request work 'packets' from the dispatcher and
process them in parallel.
Maybe this is a good starting point for your system?
The current code is available from Pyro's CVS repository.
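
For readers without the Pyro source at hand, the dispatcher/worker shape described above can be sketched with nothing but the standard library. This is not Pyro's example and uses none of Pyro's API: a `queue.Queue` plays the dispatcher, and threads in one process stand in for the worker clients on separate PCs. The `run_workers` function and its parameters are hypothetical.

```python
import queue
import threading


def run_workers(packets, process, n_workers=3):
    """Hand out `packets` to `n_workers` threads; return (packet, result) pairs."""
    todo = queue.Queue()  # the "dispatcher": a thread-safe pool of pending work
    for packet in packets:
        todo.put(packet)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                packet = todo.get_nowait()  # request the next work packet
            except queue.Empty:
                return  # no work left, so this worker stops
            outcome = process(packet)
            with results_lock:  # protect the shared results list
                results.append((packet, outcome))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because each packet is removed from the queue when a worker takes it, no packet is processed twice, which is the same guarantee the "lock on that data set" in the original question is after. The order of the returned pairs depends on thread scheduling.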

--Irmen
 
John Henry

TonyM wrote:
Awesome, I'll look into it in greater detail and will most likely use
it. Given what I've seen so far, it looks like it will make the
client/server interface fairly easy to write.

Correction: not "fairly easy" - make that "incredibly easy". Even
Micky likes it. :=)

TonyM wrote:
Now...if only I could master Python GUI programming and development ;)
I'm not entirely sure which GUI library I'm going to end up using, but as of
now I'm leaning towards Tkinter, as I know it will work where I need it
and it seems to be one of the better documented. I've looked at and used
wxPython a little but had trouble figuring out a few minor things while
playing around with it initially. I've also thought about PyGTK,
although I haven't taken the time to play with it yet, as I
assumed it was primarily for Linux (I'd be running the majority of these
on Windows PCs).

You would shortchange yourself if you didn't check out the other
packages, such as PythonCard and Dabo.

The other thing I recommend for large scale applications:

http://www-128.ibm.com/developerworks/library/l-pythrd.html
 
Peter Decker

John Henry wrote:
You would shortchange yourself if you didn't check out the other
packages, such as PythonCard and Dabo.

FWIW, Dabo has all of the database connectivity stuff built-in. With
PythonCard, you have to roll your own.
 
