Questions about mathematical and statistical functionality in Python

Discussion in 'Python' started by Talbot Katz, Jun 14, 2007.

  1. Talbot Katz

    Talbot Katz Guest

    Greetings Pythoners!

    I hope you'll indulge an ignorant outsider. I work at a financial software
    firm, and the tool I currently use for my research is R, a software
    environment for statistical computing and graphics. R is designed with
    matrix manipulation in mind, and it's very easy to do regression and time
    series modeling, and to plot the results and test hypotheses. The kinds of
    functionality we rely on the most are standard and robust versions of
    regression and principal component / factor analysis, Bayesian methods such
    as Gibbs sampling and shrinkage, and optimization by linear, quadratic,
    Newtonian / nonlinear, and genetic programming; frequently used graphics
    include QQ plots and histograms. In R, these procedures are all available
    as functions (some of them are in auxiliary libraries that don't come with
    the standard distribution, but are easily downloaded from a central
    repository).

    For a variety of reasons, the research group is considering adopting Python.
    Naturally, I am curious about the mathematical, statistical, and graphical
    functionality available in Python. Do any of you out there use Python in
    financial research, or other intense mathematical/statistical computation?
    Can you compare working in Python with working in a package like R or S-Plus
    or Matlab, etc.? Which of the procedures I mentioned above are available in
    Python? I appreciate any insight you can provide. Thanks!

    -- TMK --
    212-460-5430 home
    917-656-5351 cell
     
    Talbot Katz, Jun 14, 2007
    #1

  2. kyosohma

    kyosohma Guest

    I'd look at the following modules:

    matplotlib - http://matplotlib.sourceforge.net/
    numpy - http://numpy.scipy.org/

    Finally, this website lists other resources: http://www.astro.cornell.edu/staff/loredo/statpy/

    Mike
     
    kyosohma, Jun 14, 2007
    #2
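
For readers coming from R, here is a minimal sketch of what a basic regression might look like with the numpy module mentioned above. The data are synthetic and made up for illustration, and the variable names are assumptions, not anything from the thread. It uses `numpy.linalg.lstsq`, which solves the ordinary least-squares problem; it does not produce the standard errors or summary tables that R's `lm` reports.

```python
import numpy as np

# Synthetic data, invented for this example: y = 2 + 3x plus small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=x.size)

# Build the design matrix explicitly, with a column of ones for the intercept.
# (R's lm(y ~ x) adds the intercept column implicitly.)
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: find coef minimizing ||X @ coef - y||^2.
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

The fitted values should land close to the true intercept 2 and slope 3. Plotting the fit or a histogram of the residuals would be a job for matplotlib.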

  3. I use both R and Python for my work. I think R is probably better for
    most of the stuff you are mentioning. I do any sort of heavy
    lifting--database queries/tabulation/aggregation in Python and load the
    resulting data frames into R for analysis and graphics.
     
    Michael Hoffman, Jun 14, 2007
    #3
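
A minimal sketch of the division of labour described in the post above, assuming the "heavy lifting" is a simple group-and-sum. The records, field names, and the `to_csv` helper are hypothetical and invented for illustration. Only the Python standard library is used, and the resulting CSV text is the kind of thing R could load with `read.csv`.

```python
import csv
import io
from collections import defaultdict

# Made-up records standing in for the result of a database query.
rows = [
    {"ticker": "AAA", "volume": 100},
    {"ticker": "BBB", "volume": 50},
    {"ticker": "AAA", "volume": 250},
]

# Aggregate in Python: total volume per ticker.
totals = defaultdict(int)
for row in rows:
    totals[row["ticker"]] += row["volume"]


def to_csv(totals_by_ticker):
    """Hypothetical helper: render the totals as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ticker", "total_volume"])
    for ticker in sorted(totals_by_ticker):
        writer.writerow([ticker, totals_by_ticker[ticker]])
    return buffer.getvalue()


csv_text = to_csv(totals)
print(csv_text)
```

Writing `csv_text` to a file would give R a small, already-aggregated table to analyse and plot.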
  4. Tim Churches

    Tim Churches Guest

    I would second that. It is not either/or. Use Python, including Numpy
    and matplotlib and packages from SciPy, for some things, and R for
    others. And you can even embed R in Python using RPy - see
    http://rpy.sourceforge.net/

    We use the combination of Python, Numpy (actually, the older Numeric
    Python package, but soon to be converted to Numpy), RPy and R in our
    NetEpi Analysis project - exploratory epidemiological analysis of large
    data sets - see http://sourceforge.net/projects/netepi - and it is a
    good combination - Python for the Web interface, data manipulation and
    data heavy-lifting, and for some of the more elementary statistics, and
    R for more involved statistical analysis and graphics (with the option
    of using matplotlib or other Python-based graphics packages for some
    tasks if we wish). The main thing to remember, though, is that indexing
    is zero-based in Python and 1-based in R...

    Tim C
     
    Tim Churches, Jun 14, 2007
    #4
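
The indexing difference mentioned at the end of the post above can be sketched as follows. The helper names `r_index_to_python` and `r_slice` are hypothetical, invented for this illustration, and the values are made up. Note also that slice end points differ: R's `values[2:3]` includes both ends, while Python's slices exclude the stop index.

```python
values = [10.5, 11.2, 9.8, 12.1]

# Python: the first element lives at index 0 (in R it would be values[1]).
first = values[0]


def r_index_to_python(i):
    """Hypothetical helper: convert a positive 1-based R index to a 0-based Python index."""
    if i < 1:
        # Negative indices mean different things in the two languages
        # (R drops elements, Python counts from the end), so reject them here.
        raise ValueError("expected a positive 1-based index")
    return i - 1


def r_slice(seq, start, stop):
    """Hypothetical helper: emulate R's inclusive seq[start:stop] on a Python sequence."""
    # Shift the start down by one; the stop stays the same because
    # Python's stop is exclusive while R's is inclusive.
    return seq[start - 1:stop]


print(values[r_index_to_python(1)])
print(r_slice(values, 2, 3))
```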
  5. Josh Gilbert

    Josh Gilbert Guest

    Thirded. I use R, Python, Matlab along with other languages (I hate Pipeline
    Pilot) in my work and from what I've seen nothing can compare with R when it
    comes to stats. I love R, from its brilliant CRAN system (PyPI needs serious
    work to be considered in the same class as CPAN et al) to its delicious Emacs
    integration.

    I just wish there was a way to distribute R packages without requiring the
    user to separately install R.

    In a similar vein, I wish there was a reasonable Free Software equivalent to
    Spotfire. The closest I've found (and they're nowhere near as good) are
    Orange (http://www.ailab.si/orange) and WEKA
    (http://www.cs.waikato.ac.nz/ml/weka/). Orange is written in Python, but it's
    tied to Qt 2.x as the 3.x series was not available on Windows under the GPL.


    Josh Gilbert
     
    Josh Gilbert, Jun 15, 2007
    #5