very high-level IO functions?

Discussion in 'Python' started by York, Sep 19, 2005.

  1. York

    York Guest

    Hi,

    The R language has very high-level IO functions: its read.table can read an
    entire .csv file and recognize the type of each column, and write.table can
    do the reverse.

    R's MySQL interface has high-level functions, too, e.g. dbWriteTable can
    automatically build a MySQL table and write a table of R data
    into it.

    Are there any Python packages that do similar things?


    -York
     
    York, Sep 19, 2005
    #1

  2. York

    Short answer: yes

    We use python and R at work, and in general you will find python syntax a
    little cleaner for functionality they have in common. R is better for
    some of the more hard-wired stats stuff, though.

     
    Caleb Hattingh, Sep 19, 2005
    #2

  3. York

    Larry Bates Guest

    While it may "attempt" to recognize the types, it in fact cannot
    be more correct than the programmer. Example:

    data="""0X1E04 111"""

    That "looks" like a hex and an int. But wait. What if it is
    instead two strings?

    In Python you can easily write a class with an iterator that can
    read the data from the file/table and return the PROPER data types
    as lists, tuples, or dictionaries that are easy to manipulate.
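
    A minimal sketch of such a class, assuming the programmer supplies one
    converter per column. The names TypedReader and parse_hex are hypothetical
    and made up for illustration; only the csv and io modules come from the
    standard library. The same input line is read once as numbers and once as
    plain strings:

```python
import csv
import io


class TypedReader:
    """Iterate over delimited rows, converting each field with a
    caller-supplied converter (hypothetical helper, not a library class)."""

    def __init__(self, fileobj, converters, delimiter=","):
        self._reader = csv.reader(fileobj, delimiter=delimiter)
        self._converters = converters

    def __iter__(self):
        for row in self._reader:
            if not row:  # csv.reader yields [] for blank lines
                continue
            # Pair each field with the converter declared for its column.
            yield tuple(conv(field) for conv, field in zip(self._converters, row))


def parse_hex(text):
    """Interpret text as a hexadecimal integer (accepts a 0x/0X prefix)."""
    return int(text, 16)


data = "0X1E04 111\n"

# The programmer decides: a hex number and an int ...
as_numbers = list(TypedReader(io.StringIO(data), [parse_hex, int], delimiter=" "))
# ... or two strings. The file itself cannot tell us which is right.
as_strings = list(TypedReader(io.StringIO(data), [str, str], delimiter=" "))

print(as_numbers)
print(as_strings)
```

    The point of the design is that the column types are stated explicitly
    rather than guessed from the data.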

    -Larry Bates

     
    Larry Bates, Sep 19, 2005
    #3
  4. York

    York Guest

    Caleb Hattingh wrote:
    > York
    >
    > Short answer: yes
    >


    Brilliant! And what are they?

    > We use python and R at work, and in general you will find python syntax
    > a little cleaner for functionality they have in common. R is better
    > for some of the more hard-wired stats stuff, though.


    I love Python. However, as a biologist, I like some of the high-level
    functions in R. I don't want to spend my time parsing a data file, so
    in my Python script I call R to read the data file and write it into a
    MySQL table. If Python can do this easily, I don't need R at all.

    Cheers,

    -York


     
    York, Sep 19, 2005
    #4
  5. York

    York Guest

    You are right, a program cannot be smarter than its programmer. However,
    I need a program to parse any table-format data file supplied by a user. R
    offers such a function; I hope Python has such a function too.

    -York


     
    York, Sep 19, 2005
    #5
  6. York

    Larry Bates Guest

    It's so easy (using the csv module) that there is no need to build it in.
    You can wrap it in a class if you want to make it even easier.
    The same can be done for tables from an SQL database.

    import csv

    # test.txt contains:
    #
    # "record","value1","value2"
    # "1","2","3"
    # "2","4","5"
    # "3","6","7"
    with open(r'C:\test.txt', 'r', newline='') as fp:
        table = csv.DictReader(fp)
        for record in table:
            # Each record is a dictionary keyed by field name; the
            # values are the (string) data in that row.
            print("record #=%s, value1=%s, value2=%s" %
                  (record['record'], record['value1'], record['value2']))
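
    For the database half of the question (something like dbWriteTable), here
    is a rough sketch that builds a table from the header row of the same
    sample data and inserts every record. It uses the standard-library sqlite3
    module as a stand-in, since no MySQL driver is assumed to be installed;
    with a MySQL DB-API driver the placeholder style and identifier quoting
    would differ. The helper name write_table is hypothetical.

```python
import csv
import io
import sqlite3


def write_table(conn, table_name, fileobj):
    """Create table_name from the CSV header row and insert every record.

    Hypothetical helper. Every column is declared TEXT and identifiers are
    quoted naively, so only use it with trusted table and column names.
    """
    reader = csv.reader(fileobj)
    header = next(reader)
    columns = ", ".join('"%s" TEXT' % name for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE "%s" (%s)' % (table_name, columns))
    conn.executemany(
        'INSERT INTO "%s" VALUES (%s)' % (table_name, placeholders),
        (row for row in reader if row),  # skip blank lines
    )
    conn.commit()


# Same contents as the test.txt example, held in memory for the demo.
sample = io.StringIO(
    '"record","value1","value2"\n'
    '"1","2","3"\n'
    '"2","4","5"\n'
    '"3","6","7"\n'
)
conn = sqlite3.connect(":memory:")
write_table(conn, "test", sample)
rows = conn.execute(
    'SELECT record, value1, value2 FROM "test" ORDER BY rowid'
).fetchall()
print(rows)
```

    Values are stored as text here; combining this with a type-guessing step
    would be needed to get anything closer to what R does.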

    -Larry Bates


     
    Larry Bates, Sep 19, 2005
    #6
  7. York wrote:
    (snip)

    > I love python. However, as a biologist, I like some high-levels
    > functions in R. I don't want to spend my time on parse a data file.

    http://www.python.org/doc/current/lib/module-csv.html

    > Then
    > in my python script, I call R to read data file and write them into an
    > MySQL table. If python can do this easily, I don't need R at all.


    So you don't need R at all.
     
    Bruno Desthuilliers, Sep 19, 2005
    #7
  8. York

    Tom Anderson Guest

    On Mon, 19 Sep 2005, Bruno Desthuilliers wrote:

    > York wrote:
    > (snip)
    >
    >> I love python. However, as a biologist, I like some high-levels
    >> functions in R. I don't want to spend my time on parse a data file.

    >
    > http://www.python.org/doc/current/lib/module-csv.html
    >
    >> Then in my python script, I call R to read data file and write them
    >> into an MySQL table. If python can do this easily, I don't need R at
    >> all.

    >
    > So you don't need R at all.


    Did you even read the OP's post? Specifically, this bit:

    R language has very high-level IO functions, its read.table can read a
    total .csv file and recognize the types of each column.
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Python's csv module gives you lists of strings; it makes no effort to
    recognise the types of the data. AFAIK, Python doesn't have any IO
    facilities like this.

    Larry's point that automagical type detection is risky because it can make
    mistakes is a good one, but that doesn't mean that magic is useless - on
    the contrary, for the majority of cases, it works fine, and is extremely
    convenient.

    The good news is that it's reasonably easy to write such a function: you
    just need a function 'type_convert' which takes a string and returns an
    object of the right type; then you can do:

    import csv

    def read_table(f):
        for row in csv.reader(f):
            yield [type_convert(cell) for cell in row]

    This is a very, very rough cut - it doesn't do comment stripping, skipping
    blank lines, dealing with the presence of a header line or the use of
    different separators, etc, but all that's pretty easy to add. Also, note
    that this returns an iterator rather than a list; use list(read_table(f))
    if you want an actual list, or change the implementation of the function.
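
    The missing pieces listed above (comment stripping, blank lines, a header
    line, other separators) could be added roughly like this. It is only a
    sketch: the parameter names delimiter, header and comment are made up for
    illustration, and the small type_convert here is a simplified stand-in
    (int, then float, else str) so that the example runs on its own.

```python
import csv
import io


def type_convert(s):
    """Simplified stand-in converter: try int, then float, else keep str."""
    for convert in (int, float):
        try:
            return convert(s)
        except ValueError:
            pass
    return s


def read_table(f, delimiter=",", header=False, comment="#"):
    """Yield rows of converted values; with header=True, yield dicts.

    Blank lines and lines starting with the comment prefix are dropped
    before parsing, so quoted fields containing newlines are not supported.
    """
    lines = (
        line for line in f
        if line.strip() and not line.lstrip().startswith(comment)
    )
    reader = csv.reader(lines, delimiter=delimiter)
    names = next(reader, None) if header else None
    if header and names is None:
        return  # nothing but comments and blank lines
    for row in reader:
        values = [type_convert(cell) for cell in row]
        yield dict(zip(names, values)) if header else values


text = "# sample data\nname\tcount\tratio\n\nalpha\t3\t0.5\nbeta\t10\t2\n"
rows = list(read_table(io.StringIO(text), delimiter="\t", header=True))
print(rows)
```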

    type_convert is itself fairly simple:

    def _bool(s):  # helper function for booleans
        s = s.lower()
        if s == "true":
            return True
        elif s == "false":
            return False
        else:
            raise ValueError(s)

    # Tried in order; str accepts anything, so it is the fallback.
    types = (int, float, complex, _bool, str)

    def type_convert(s):
        for convert in types:
            try:
                return convert(s)
            except ValueError:
                pass
        raise ValueError(s)

    This whole thing isn't quite as sophisticated as R's type.convert; R
    reads the whole table in, then tries to find a type for each column which
    will fit all the values in that column, whereas I do each cell
    individually. Again, it wouldn't be too hard to do this the other way
    round.
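
    Doing it the other way round, one type per column, might look something
    like the sketch below. It is not a reimplementation of what R does, just
    the same idea: read everything, then pick the first converter that accepts
    every value in a column. The names parse_bool, column_converter and
    read_table_by_column are hypothetical, and the code assumes every row has
    the same number of fields.

```python
import csv
import io


def parse_bool(s):
    """Convert 'true'/'false' (any case) to a bool, else raise ValueError."""
    lowered = s.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(s)


# Tried in order, most specific first; str is the fallback.
CANDIDATES = (int, float, complex, parse_bool)


def column_converter(values):
    """Return the first candidate that accepts every value in the column."""
    for convert in CANDIDATES:
        try:
            for value in values:
                convert(value)
        except ValueError:
            continue
        return convert
    return str


def read_table_by_column(f):
    """Read the whole table, then convert each column with a single type."""
    rows = [row for row in csv.reader(f) if row]  # drop blank lines
    if not rows:
        return []
    converters = [column_converter(column) for column in zip(*rows)]
    return [
        [convert(cell) for convert, cell in zip(converters, row)]
        for row in rows
    ]


table = read_table_by_column(io.StringIO("1,2.5,true,x\n2,3,false,7\n"))
print(table)
```

    Note how the "3" in the second column becomes a float and the "7" in the
    last column stays a string, because the type is chosen for the column as
    a whole rather than for each cell.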

    Anyway, hope this helps. Bear in mind that there are python bindings for
    the R engine, so you could just use R's version of read.table in python.

    tom

    --
    Don't trust the laws of men. Trust the laws of mathematics.
     
    Tom Anderson, Sep 20, 2005
    #8
  9. York

    York Guest

    Thank you, Tom.


    -York


     
    York, Sep 21, 2005
    #9
