Taking Python to the enterprise level?...


simn_stv

Hello people, I have been reading posts on this group for quite some
time now, and many (though not all!) seem quite interesting.

I plan to build a network-based application that I estimate (and
seriously hope) would get as many as 100,000 hits a day (hehe... my
dad always told me to 'AIM HIGH' ;0). It's not some 'facebook' or
anything like it; it's mainly for financial transactions, which get
pretty busy...

So my question is this: does anyone have anything that would make
Python a little less of a serious candidate (because it already is
one)? The alternatives would be other languages (maybe Java, or C (oh
God))... I know a bit of PHP, and building APIs in PHP would not be
the hard part; what I am concerned about is scalability and
efficiency, as far as the 'core' is concerned.

Would Python be able to give me a solid 'core', and will I be able to
use Python to provide any API I would like to implement?...

I'm sorry if my subject line was not as clear as it probably should
be. I guess this is the best place to ask this sort of thing; I hope
I'm right.

Thanks
 

Steve Holden

simn_stv said:
Hello people, I have been reading posts on this group for quite some
time now, and many (though not all!) seem quite interesting.

I plan to build a network-based application that I estimate (and
seriously hope) would get as many as 100,000 hits a day (hehe... my
dad always told me to 'AIM HIGH' ;0). It's not some 'facebook' or
anything like it; it's mainly for financial transactions, which get
pretty busy...

So my question is this: does anyone have anything that would make
Python a little less of a serious candidate (because it already is
one)? The alternatives would be other languages (maybe Java, or C (oh
God))... I know a bit of PHP, and building APIs in PHP would not be
the hard part; what I am concerned about is scalability and
efficiency, as far as the 'core' is concerned.

Would Python be able to give me a solid 'core', and will I be able to
use Python to provide any API I would like to implement?...

I'm sorry if my subject line was not as clear as it probably should
be. I guess this is the best place to ask this sort of thing; I hope
I'm right.

Thanks

I'd suggest that if you are running an operation that gets 100,000 hits
a day, then your problems won't be with Python but with the
organizational aspects of your operation.

regards
Steve
 

Martin P. Hellwig

On 02/25/10 10:26, simn_stv wrote:
what I am concerned about is scalability and
efficiency, as far as the 'core' is concerned.

Would Python be able to give me a solid 'core', and will I be able to
use Python to provide any API I would like to implement?...
<cut>
Python isn't the most efficient language; the assembler provided by the
maker of your CPU is probably the best you can get, and everything
after that is a trade-off between performance and flexibility (flexible
in the most flexible sense of the word :)).

That being said, for me Python (well, actually any Turing-complete
programming language) is more like a box of Lego with an infinite
number of pieces. Scalability and API issues are like the shape and
function of the model you're building with the Lego.

Sure, some types of pieces might be more suitable than others, but
since you can simulate any type of piece with the ones already present,
you are more limited by your imagination than by the language.

So in short, I don't see any serious problems using Python. I have used
it in enterprise environments without any problems, but then again I
was careful not to use it for numerically intensive parts without
third-party libraries like numpy. For me that meant not doing the
compression of database deltas in pure Python, but offloading that to a
more suitable external program, still controlled from Python.
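
Something like this minimal sketch, for illustration (not my actual
code - plain gzip stands in for the external program, and the delta
bytes are made up):

    # Offload the CPU-heavy compression to an external process;
    # Python only coordinates and collects the result.
    import subprocess

    def compress_delta(delta_bytes):
        result = subprocess.run(
            ["gzip", "-9"],          # any suitable external tool
            input=delta_bytes,
            stdout=subprocess.PIPE,
            check=True,
        )
        return result.stdout

    packed = compress_delta(b"...database delta bytes...")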
 

Tim Wintle

I plan to build a network-based application that I estimate (and
seriously hope) would get as many as 100,000 hits a day (hehe... my
dad always told me to 'AIM HIGH' ;0). It's not some 'facebook' or
anything like it; it's mainly for financial transactions, which get
pretty busy...

I've got apps running that handle *well* over 100,000 hits / process /
day using Python - although some of the heavy lifting is off-loaded to
C and MySQL. Obviously, without actually looking at your requirements,
that doesn't mean much, as I don't know how much work each hit
requires.

Regarding financial transactions - you'll almost certainly want to
integrate with something that already has transactional support (SQL,
etc.) - so I expect that will bear the brunt of the load.

So my question is this: does anyone have anything that would make
Python a little less of a serious candidate (because it already is
one)? The alternatives would be other languages (maybe Java, or C (oh
God))...

I've avoided integrating Java with my Python (I'm not a big fan of
Java) - but I've integrated quite a bit of C. It's fairly easy to do,
and you can just port the inner loops if you see the need arise.
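
A minimal ctypes sketch of the idea (the library name and function are
made up for illustration - you'd compile your own C code first):

    # Call a C "inner loop" from Python via the standard library.
    import ctypes

    lib = ctypes.CDLL("./libinner.so")       # hypothetical shared lib
    lib.inner_loop.argtypes = [ctypes.c_int]
    lib.inner_loop.restype = ctypes.c_int

    print(lib.inner_loop(10_000_000))        # the hot loop runs at C speed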
... I know a bit of PHP, and building APIs in PHP would not be
the hard part; what I am concerned about is scalability and
efficiency, as far as the 'core' is concerned.

I've heard that PHP can be scaled well (by compiling it to bytecode /
C++) - but my preference would always be Python.

Tim
 

simn_stv

I'd suggest that if you are running an operation that gets 100,000 hits
a day, then your problems won't be with Python but with the
organizational aspects of your operation.

regards
Steve
--

Very well noted, Steve; I'd be careful (which is a very relative word)
with the organizational aspects...
I'm sure you're quite rooted in that aspect. Hey, do you need a
job??........;)
 

Tim Chase

simn_stv said:
I plan to build a network-based application that I estimate (and
seriously hope) would get as many as 100,000 hits a day (hehe... my
dad always told me to 'AIM HIGH' ;0). It's not some 'facebook' or
anything like it; it's mainly for financial transactions, which get
pretty busy...

So my question is this: does anyone have anything that would make
Python a little less of a serious candidate (because it already is
one)? The alternatives would be other languages (maybe Java, or C (oh
God))... I know a bit of PHP, and building APIs in PHP would not be
the hard part; what I am concerned about is scalability and
efficiency, as far as the 'core' is concerned.

Python is as "enterprise" as the developer who wields it.

Scalability revolves entirely around application design &
implementation. Or you could use Erlang (or Haskell, etc. ;-)

-tkc
 

D'Arcy J.M. Cain

I plan to build a network-based application that I estimate (and
seriously hope) would get as many as 100,000 hits a day

That's nothing. I ran a financial-type app on Python that sometimes
hit 100,000 transactions an hour. We kept looking for bottlenecks that
we could convert to C but never found any. Our biggest problem was in
a network-heavy element of the app, and that was low-level TCP/IP
stuff that, rather than being Python's problem, was something we used
Python to fix.

As others have pointed out, you will want some kind of enterprise
database that will do a lot of the heavy lifting. I suggest
PostgreSQL. It is the best open source database engine around. That
will take the biggest load off your app.
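
For illustration, a minimal transactional sketch with psycopg2 (one
common PostgreSQL driver - PyGreSQL, which comes up later in this
thread, works along the same lines; the table and amounts here are
invented):

    # The database, not Python, enforces the all-or-nothing transfer.
    import psycopg2

    conn = psycopg2.connect(dbname="appdb", user="app")  # hypothetical DB
    cur = conn.cursor()
    try:
        cur.execute("UPDATE accounts SET balance = balance - %s "
                    "WHERE id = %s", (100, 1))
        cur.execute("UPDATE accounts SET balance = balance + %s "
                    "WHERE id = %s", (100, 2))
        conn.commit()    # both updates land, or neither does
    except Exception:
        conn.rollback()
        raise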

There are lots of decisions to make in the days ahead, but I think that
choosing Python as your base language is a good first one.
So my question is this: does anyone have anything that would make
Python a little less of a serious candidate (because it already is
one)? The alternatives would be other languages (maybe Java, or C (oh
God))... I know a bit of PHP, and building APIs in PHP would not be
the hard part; what I am concerned about is scalability and
efficiency, as far as the 'core' is concerned.

Scalability and efficiency won't be your issues. Speed of development
and clarity of code will be. Python wins.
 

Martin P. Hellwig

Our biggest problem was in
a network-heavy element of the app, and that was low-level TCP/IP
stuff that, rather than being Python's problem, was something we used
Python to fix.
<cut>
Out of interest, could you elaborate on that?

Thanks
 

D'Arcy J.M. Cain

<cut>
Out of interest, could you elaborate on that?

Somewhat - there is an NDA, so I can't give exact details. It was
crucial to our app that we sync up databases in Canada and the US
(later Britain, Europe and Japan) in real time with those
transactions. Our problem was that even though our two server systems
were on the backbone, indeed with the same major carrier, we could not
keep them in sync. We were taking way too long to transact an update
across the network.

The problem had to do with the way TCP/IP works, especially closer to
the core. Our provider was collecting data and sending it only after
filling a buffer or after a timeout. The timeout was short so it
wouldn't normally be noticed, and in most cases (web pages, e.g.) the
connection is opened, data is pushed and the connection is closed, so
the buffer is flushed immediately. Our patterns were different, so we
were hitting the timeout on every single transaction, and there was no
way we would have been able to keep up.

Our first crack at fixing this was to simply add garbage to the packet
we were sending. Making the packets an order of magnitude bigger sped
up the processing dramatically. That wasn't a very clean solution
though, so we looked for a better way.

That better way turned out to be asynchronous update transactions. All
we did was keep feeding updates to the remote site and forget about
ACKs. We then had a second process which handled ACKs and tracked
which packets had been properly transferred. The system had IDs on
each update, and retries happened if ACKs didn't happen soon enough.
Naturally we ignored ACKs that we had already processed.

All of the above (and much more complexity not even discussed here) was
handled by Python code and database manipulation. There were a few
bumps along the way but overall it worked fine. If we had been using C
or even assembler we would not have sped up anything, and the solution
we came up with would have been horrendous to code. As it was, I and my
chief programmer locked ourselves in the boardroom and had a working
solution before the day was out.

Python wins again.
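
In outline, the scheme looked something like this (a sketch for
illustration, not the NDA-covered code): keep streaming ID-tagged
updates, track ACKs on the side, and retry anything not acknowledged
in time.

    import time

    class UpdateStream:
        def __init__(self, send, retry_after=2.0):
            self.send = send                # callable(uid, update)
            self.retry_after = retry_after  # seconds before a resend
            self.pending = {}               # uid -> (update, last_send_time)
            self.next_id = 0

        def push(self, update):
            # Fire and forget: send immediately, remember it for retry.
            uid = self.next_id
            self.next_id += 1
            self.send(uid, update)
            self.pending[uid] = (update, time.monotonic())

        def ack(self, uid):
            # ACKs for already-processed IDs are silently ignored.
            self.pending.pop(uid, None)

        def retry_unacked(self):
            # Run periodically from the separate ACK-handling process.
            now = time.monotonic()
            for uid, (update, sent_at) in list(self.pending.items()):
                if now - sent_at > self.retry_after:
                    self.send(uid, update)
                    self.pending[uid] = (update, now)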
 

Aahz

I plan to build a network-based application that I estimate (and
seriously hope) would get as many as 100,000 hits a day (hehe... my
dad always told me to 'AIM HIGH' ;0). It's not some 'facebook' or
anything like it; it's mainly for financial transactions, which get
pretty busy...

Remember that YouTube runs on Python.
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer
 

Martin P. Hellwig

On 02/25/10 16:18, D'Arcy J.M. Cain wrote:
<cut working around ISP's with braindead network configurations>
Very interesting. I had a similar kind of problem (a network balancer
that doesn't balance small TCP packets too well) and solved it by
wrapping the TCP packets in UDP. UDP was treated differently: although
in the overall switch and router management it has a lower priority
compared to TCP packets, in normal usage it was faster.

Probably because UDP has fewer things to inspect and so can be
processed faster by all the network equipment in between; but to be
honest, it worked for me, the client wasn't interested in academic
explanations, and since this was a working solution I didn't
investigate it any further.

Oh, and a big thank you for PyGreSQL! It has proven to be an extremely
useful module for me (especially since I used to hop a lot between
different Unixes and Windows).
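
For what it's worth, the bare UDP mechanics the trick relies on look
like this (an illustration, not the actual wrapper; the address is
from the documentation range):

    # Datagrams carry no handshake or stream state for the network
    # equipment to inspect on the way through.
    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"what would otherwise be a small TCP packet"
    sock.sendto(payload, ("203.0.113.5", 9999))  # example IP (RFC 5737)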
 

Roy Smith

"D'Arcy J.M. Cain said:
The problem had to do with the way TCP/IP works, especially closer to
the core. Our provider was collecting data and sending it only after
filling a buffer or after a timeout. The timeout was short so it
wouldn't normally be noticed, and in most cases (web pages, e.g.) the
connection is opened, data is pushed and the connection is closed, so
the buffer is flushed immediately. Our patterns were different, so we
were hitting the timeout on every single transaction, and there was no
way we would have been able to keep up.

Our first crack at fixing this was to simply add garbage to the packet
we were sending. Making the packets an order of magnitude bigger sped
up the processing dramatically. That wasn't a very clean solution
though, so we looked for a better way.

Interesting - the system I'm working with now has a similar problem.
We've got a request/ack protocol over TCP which often sends lots of
small packets, and it can have all sorts of performance issues because
of this.

In fact, we break completely on Solaris 10 with TCP Fusion enabled.
We've gone back and forth with Sun on this (they claim what we're doing
is broken; we claim TCP Fusion is broken). In the end, we just tell all
of our Solaris 10 customers to disable TCP Fusion.
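
(For what it's worth, the classic local knob when lots of small
packets stall is disabling Nagle's algorithm - though it wouldn't have
helped D'Arcy's case, where the buffering was upstream in the
provider's network:)

    # Disable Nagle so small writes go out immediately instead of
    # being coalesced into larger segments.
    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)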
 

Diez B. Roggisch

That better way turned out to be asynchronous update transactions. All
we did was keep feeding updates to the remote site and forget about
ACKs. We then had a second process which handled ACKs and tracked
which packets had been properly transferred. The system had IDs on
each update, and retries happened if ACKs didn't happen soon enough.
Naturally we ignored ACKs that we had already processed.

Sounds like using UDP to me, of course with a protocol on top (namely
the one you implemented).

Any reason you stuck with TCP instead?

Diez
 

D'Arcy J.M. Cain

Sounds like using UDP to me, of course with a protocol on top (namely
the one you implemented).

Any reason you stuck with TCP instead?

TCP does a great job of delivering a stream of data in order and
handling the retries. The app really was connection-oriented, and we
saw no reason to emulate that over a connectionless protocol. There
were other wheels to reinvent that were more important.
 

Diez B. Roggisch

On 26.02.10 05:01, D'Arcy J.M. Cain wrote:
TCP does a great job of delivering a stream of data in order and
handling the retries. The app really was connection-oriented, and we
saw no reason to emulate that over a connectionless protocol. There
were other wheels to reinvent that were more important.

So when you talk about ACKs, you don't mean those at the TCP level
(darn, whatever ISO layer that is...), but at some higher level?

Diez
 

mdipierro

100,000 hits a day is not a lot. I get that some days on my web server
without problems and without one request dropped.

Most frameworks (web2py, Django, Pylons) can handle that kind of load,
since Python is not the bottleneck.
You have to follow some tricks (a sketch of a couple of them follows
the list):

1) have the web server serve static pages directly, and set the cache
expiry headers to one month
2) cache all pages that do not have forms for at least a few minutes
3) avoid database joins
4) use a server with at least 512MB of RAM
5) if your pages are large, use gzip compression
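
For example, tricks 2 and 5 in Django, one of the frameworks above
(cache_page and GZipMiddleware are standard Django pieces; the view
itself is just illustrative):

    from django.http import HttpResponse
    from django.views.decorators.cache import cache_page

    @cache_page(60 * 5)   # trick 2: cache a form-free page for 5 minutes
    def prices(request):
        return HttpResponse("...result of an expensive, join-free query...")

    # trick 5, in settings.py: compress large responses
    # MIDDLEWARE = ["django.middleware.gzip.GZipMiddleware", ...]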

If you develop your app with the web2py framework, you always have the
option to deploy on the Google App Engine. If you can live with its
constraints, you should have no scalability problems.

Massimo
 

simn_stv

On 02/25/10 10:26, simn_stv wrote:
<cut>
Python isn't the most efficient language; the assembler provided by the
maker of your CPU is probably the best you can get,
<cut>

LOL... ;), yeah right - the mere thought of writing assembler
instructions is SCARY!!
 

simn_stv

Somewhat - there is an NDA, so I can't give exact details. It was
crucial to our app that we sync up databases in Canada and the US
(later Britain, Europe and Japan) in real time with those
transactions. Our problem was that even though our two server systems
were on the backbone, indeed with the same major carrier, we could not
keep them in sync. We were taking way too long to transact an update
across the network.

The problem had to do with the way TCP/IP works, especially closer to
the core. Our provider was collecting data and sending it only after
filling a buffer or after a timeout. The timeout was short so it
wouldn't normally be noticed, and in most cases (web pages, e.g.) the
connection is opened, data is pushed and the connection is closed, so
the buffer is flushed immediately. Our patterns were different, so we
were hitting the timeout on every single transaction, and there was no
way we would have been able to keep up.

Our first crack at fixing this was to simply add garbage to the packet
we were sending. Making the packets an order of magnitude bigger sped
up the processing dramatically. That wasn't a very clean solution
though, so we looked for a better way.

That better way turned out to be asynchronous update transactions. All
we did was keep feeding updates to the remote site and forget about
ACKs. We then had a second process which handled ACKs and tracked
which packets had been properly transferred. The system had IDs on
each update, and retries happened if ACKs didn't happen soon enough.
Naturally we ignored ACKs that we had already processed.

All of the above (and much more complexity not even discussed here) was
handled by Python code and database manipulation. There were a few
bumps along the way but overall it worked fine. If we had been using C
or even assembler we would not have sped up anything, and the solution
we came up with would have been horrendous to code. As it was, I and my
chief programmer locked ourselves in the boardroom and had a working
solution before the day was out.

Are you sure it wouldn't have sped things up a bit, even a little? The
development and maintenance time would probably have been a nightmare,
but it should have sped the app up somewhat...

Python wins again.

That has seriously added to the reputation of Python, from my own
perspective... kudos, Python!
 
 

simn_stv

100,000 hits a day is not a lot. I get that some days on my web server
without problems and without one request dropped.

Most frameworks (web2py, Django, Pylons) can handle that kind of load,
since Python is not the bottleneck.

Taking a look at Django right now; it doesn't look too bad from where
I'm standing. Maybe when I get into the code I'll run into some issues
that cause headaches!!

You have to follow some tricks:

1) have the web server serve static pages directly, and set the cache
expiry headers to one month
2) cache all pages that do not have forms for at least a few minutes
3) avoid database joins

But this would probably be to the detriment of my database design,
which is a no-no as far as I'm concerned. The way the tables would be
structured requires 'joins' when querying the db; or could you
elaborate a little??

4) use a server with at least 512MB of RAM

Hmmm! Still thinking about what you mean by this statement too.

5) if your pages are large, use gzip compression

If you develop your app with the web2py framework, you always have the
option to deploy on the Google App Engine. If you can live with its
constraints, you should have no scalability problems.

Massimo

Thanks for the feedback...
 
