Re: best way to handle this in Python

Discussion in 'Python' started by Dennis Lee Bieber, Jul 20, 2012.

  1. {NOTE: preferences for comp.lang.python are to follow the RFC on
    "netiquette" -- that is, post comments /under/ quoted material, trimming
    what is not relevant... I've restructured this reply to match}

    On Thu, 19 Jul 2012 21:28:12 -0400, Rita
    declaimed the following in gmane.comp.python.general:

    >
    >
    > On Thu, Jul 19, 2012 at 8:52 PM, Dave Angel wrote:
    >
    > > On 07/19/2012 07:51 PM, Rita wrote:
    > > > Hello,
    > > >
    > > > I have data in many files (/data/year/month/day/) which are named like
    > > > YearMonthDayHourMinute.gz.
    > > >
    > > > I would like to build a data structure which can easily handle
    > > > querying the data. So for example, if I want to query data from
    > > > 3 weeks ago till today, I can do it rather quickly.
    > > >
    > > > Each YearMonthDayHourMinute.gz file looks like this, and they are
    > > > about 4 to 6 kB:
    > > > red 34
    > > > green 44
    > > > blue 88
    > > > orange 4
    > > > black 3
    > > > while 153
    > > >
    > > > I would like to query them so I can generate a plot rather quickly,
    > > > but I am not sure what the best way to do this is.
    > > >
    > > >
    > > >

    > >
    > > What part of your code is giving you difficulty? You didn't post any
    > > code. You don't specify the OS, nor version of your Python, nor what
    > > other programs you expect to use along with Python.
    > >

    > Using linux 2.6.31; Python 2.7.3.
    > I am not necessarily looking for code, just a pythonic way of doing it.
    > Eventually, I would like to graph the data using matplotlib.
    >
    >

    Which doesn't really answer the question. After all, since the
    source data is already in date/time-stamped files, a simple, sorted,
    "glob" of files within a desired span would answer the requirement.

    But -- it would mean that you reparse the files for each processing
    run.
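
    A minimal sketch of that sorted "glob", assuming the file names are
    exactly twelve digits (e.g. 201207191530.gz) under a
    root/year/month/day/ layout. The function name files_in_span and the
    strptime pattern are my own guesses at the naming scheme, not
    something given in the original question:

```python
import glob
import os
from datetime import datetime


def files_in_span(root, start, end):
    """Return (timestamp, path) pairs with start <= timestamp <= end,
    sorted by timestamp."""
    selected = []
    # Assumed layout: root/year/month/day/YYYYMMDDHHMM.gz
    pattern = os.path.join(root, "*", "*", "*", "*.gz")
    for path in glob.glob(pattern):
        stem = os.path.basename(path)[:-len(".gz")]
        try:
            stamp = datetime.strptime(stem, "%Y%m%d%H%M")
        except ValueError:
            continue  # name does not match the assumed pattern; skip it
        if start <= stamp <= end:
            selected.append((stamp, path))
    selected.sort()
    return selected
```

    For "3 weeks ago till today" one would pass something like
    datetime.now() - timedelta(weeks=3) and datetime.now() as the span.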

    An alternative would be to run a pre-processor that parses the files
    into, say, an SQLite3 database (and which can determine, from the
    highest datetime entry in the database, which /new/ files need to be
    parsed on subsequent runs). Then do the query/plotting from a second
    program which retrieves data from the database.
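
    One way the pre-processor could look, as a rough sketch only. The
    table name samples, its columns, and the helpers load_new and
    query_span are all invented for illustration; the parse argument
    stands for whatever function turns one file into a {color: value}
    dict. It also assumes files arrive in time order, so anything not
    newer than the highest stored timestamp has already been loaded:

```python
import sqlite3

# Timestamps are stored as "YYYY-MM-DD HH:MM" text, so plain string
# comparison in SQL matches chronological order.
STAMP_FORMAT = "%Y-%m-%d %H:%M"

SCHEMA = """
CREATE TABLE IF NOT EXISTS samples (
    stamp TEXT NOT NULL,
    color TEXT NOT NULL,
    value INTEGER NOT NULL,
    PRIMARY KEY (stamp, color)
)
"""


def load_new(conn, files, parse):
    """Insert files newer than the latest stored timestamp.

    files: iterable of (datetime, path) pairs
    parse: callable taking a path and returning a {color: value} dict
    Returns the number of files that were loaded.
    """
    conn.execute(SCHEMA)
    newest = conn.execute("SELECT MAX(stamp) FROM samples").fetchone()[0]
    loaded = 0
    for stamp, path in sorted(files):
        key = stamp.strftime(STAMP_FORMAT)
        if newest is not None and key <= newest:
            continue  # already in the database from an earlier run
        rows = [(key, color, value) for color, value in parse(path).items()]
        conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)
        loaded += 1
    conn.commit()
    return loaded


def query_span(conn, start, end):
    """Return (stamp, color, value) rows between two datetimes, inclusive."""
    cursor = conn.execute(
        "SELECT stamp, color, value FROM samples "
        "WHERE stamp BETWEEN ? AND ? ORDER BY stamp, color",
        (start.strftime(STAMP_FORMAT), end.strftime(STAMP_FORMAT)),
    )
    return cursor.fetchall()
```

    The plotting program then only needs query_span (or its own SELECT)
    and never touches the .gz files.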

    But if this is a process that only needs to be run once, or at rare
    intervals, maybe you only need to parse the files into an in-memory data
    structure... Say a list of tuples of the form:

    [ (datetime, {color: value, color2: value2, ...}),
      (datetime2, ....) ]
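
    Building that list might look roughly like the following. The names
    parse_counts, build_series and column are made up for this sketch,
    and it assumes each line of a file is a name, whitespace, then an
    integer (as in the sample shown), silently skipping anything else:

```python
import gzip


def parse_counts(path):
    """Parse one gzipped file of 'name value' lines into a dict."""
    counts = {}
    with gzip.open(path, "rb") as handle:
        for raw in handle:
            parts = raw.decode("ascii").split()
            if len(parts) == 2:  # ignore blank or malformed lines
                counts[parts[0]] = int(parts[1])
    return counts


def build_series(files):
    """Turn (datetime, path) pairs into a time-sorted list of
    (datetime, {color: value}) tuples."""
    return [(stamp, parse_counts(path)) for stamp, path in sorted(files)]


def column(series, color):
    """Pull one color out as parallel lists of times and values."""
    times = [stamp for stamp, counts in series if color in counts]
    values = [counts[color] for stamp, counts in series if color in counts]
    return times, values
```

    The two lists returned by column() can be handed straight to
    matplotlib's plot(), which accepts datetime objects on the x axis.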

    --
    Wulfraed Dennis Lee Bieber AF6VN
    HTTP://wlfraed.home.netcom.com/
     
