Anyone know pandas? Don't understand error: NotImplementedError...

Discussion in 'Python' started by someone, Apr 17, 2013.

  1. someone

    someone Guest

    Hi,

    Here's my script (from
    http://brenda.moon.net.au/category/data-visualisation/):

    ============================
    #!/usr/bin/python

    import pandas
    import datetime
    import numpy

    datesList = [datetime.date(2011,12,1), \
    datetime.date(2011,12,2), \
    datetime.date(2011,12,3), \
    datetime.date(2011,12,10)]

    countsList = numpy.random.randn(len(datesList))
    startData = datetime.datetime(2011,12,3)
    endData = datetime.datetime(2011,12,8)

    def convertListPairToTimeSeries(dList, cList):
        # my dateList had date objects, so convert back to datetime objects
        dListDT = [datetime.datetime.combine(x, datetime.time()) for x in
                   dList]
        # found that NaN didn't work if the cList contained int data
        cListL = [float(x) for x in cList]
        # create the index from the datetimes list
        indx = pandas.Index(dListDT)
        # create the timeseries
        ts = pandas.Series(cListL, index=indx)
        # fill in missing days
        ts = ts.asfreq(pandas.datetools.DateOffset())
        return ts

    print "\nOriginal datesList list:\n", datesList
    tSeries = convertListPairToTimeSeries(datesList, countsList)
    print "\nPandas timeseries:\n", tSeries

    # use slicing to change length of data
    tSeriesSlice = tSeries.ix[startData:endData]
    print "\nPandas timeseries sliced between", startData.date(), \
    "and", endData.date(), ":\n", tSeriesSlice

    # use truncate instead of slicing to change length of data
    tSeriesTruncate = tSeries.truncate(before=startData, after=endData)
    print "\nPandas timeseries truncated between", startData.date(), \
    "and", endData.date(), ":\n", tSeriesTruncate

    # my data had lots of gaps that were actually 0 values, not missing data
    # So I used this to fix the NaN outside the known outage
    startOutage = datetime.datetime(2011,12,7)
    endOutage = datetime.datetime(2011,12,8)
    tsFilled = tSeries.fillna(0)
    # set the known outage values back to NAN
    tsFilled.ix[startOutage:endOutage] = numpy.NAN
    print "\nPandas timeseries NaN reset to 0 outside known outage between", \
    startOutage.date(), "and", endOutage.date(), ":\n", tsFilled

    print "\nPandas series.tail(1) and series.head(1) are handy for " +\
    "checking ends of list:\n", tsFilled.head(1), tsFilled.tail(1)
    print

    tsFilled.plot(); # <====== NotImplementedError...!!!
    ============================


    If I run it, I get:
    --------------------
    ....
    ....
    ...
    2011-12-09 0.000000
    2011-12-10 1.431665
    Freq: <1 DateOffset>

    Pandas series.tail(1) and series.head(1) are handy for checking ends of
    list:
    2011-12-01 -0.969533
    Freq: <1 DateOffset> 2011-12-10 1.431665
    Freq: <1 DateOffset>

    Traceback (most recent call last):
    File "./pandas_example.py", line 57, in <module>
    tsFilled.plot();
    File "/usr/lib/pymodules/python2.7/pandas/tools/plotting.py", line
    985, in plot_series
    plot_obj.generate()
    File "/usr/lib/pymodules/python2.7/pandas/tools/plotting.py", line
    376, in generate
    self._make_plot()
    File "/usr/lib/pymodules/python2.7/pandas/tools/plotting.py", line
    623, in _make_plot
    if self.use_index and self._use_dynamic_x():
    File "/usr/lib/pymodules/python2.7/pandas/tools/plotting.py", line
    619, in _use_dynamic_x
    return (freq is not None) and self._is_dynamic_freq(freq)
    File "/usr/lib/pymodules/python2.7/pandas/tools/plotting.py", line
    602, in _is_dynamic_freq
    freq = freq.rule_code
    File "/usr/lib/pymodules/python2.7/pandas/tseries/offsets.py", line
    214, in rule_code
    raise NotImplementedError
    NotImplementedError
     
    someone, Apr 17, 2013
    #1

  2. someone

    Wayne Werner Guest

    I don't know anything about pandas, but my recommendation?

    $ vim /usr/lib/pymodules/python2.7/pandas/tseries/offsets.py

    (or nano or emacs - whatever editor you're comfortable with).

    Go to line 214, and take a look-see at what you find. My guess is it will
    be something like:

    def rule_code():
        raise NotImplementedError()



    Which is terribly unhelpful.

    HTH,
    Wayne
     
    Wayne Werner, Apr 18, 2013
    #2

  3. someone

    Neil Cerutti Guest

    It most likely means that the program is instantiating an
    abstract base class when it should be using one of its subclasses
    instead, e.g., BusinessDay, MonthEnd, MonthBegin,
    BusinessMonthEnd, etc.

    http://pandas.pydata.org/pandas-docs/dev/timeseries.html
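
To illustrate the pattern described above, here is a minimal sketch in plain Python. The class names `BaseOffset` and `DayOffset` and the helper `describe` are hypothetical stand-ins, not the actual pandas classes; they only show how a base-class property that raises NotImplementedError behaves until a subclass overrides it.

```python
class BaseOffset(object):
    """Hypothetical stand-in for a generic offset base class."""

    def __init__(self, n=1):
        self.n = n

    @property
    def rule_code(self):
        # The generic base class has no frequency code of its own.
        raise NotImplementedError


class DayOffset(BaseOffset):
    """Hypothetical concrete subclass that supplies a frequency code."""

    @property
    def rule_code(self):
        return "D"


def describe(offset):
    """Return the offset's rule code, or None if the class does not define one."""
    try:
        return offset.rule_code
    except NotImplementedError:
        return None


print(describe(DayOffset()))   # the subclass overrides rule_code
print(describe(BaseOffset()))  # the base class raises, so describe() returns None
```

Code that reads `rule_code` without guarding against the exception, as the plotting code in the traceback appears to do, would fail on the base-class instance.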
     
    Neil Cerutti, Apr 18, 2013
    #3
  4. someone

    someone Guest

    Oh, yes - you're completely right:

    # line 211 (empty line)
    @property # line 212
    def rule_code(self): # line 213
        raise NotImplementedError # line 214
    # line 215 (empty line)

    Below and above this "rule_code" is code belonging to some other
    functions... hmmm... I also tried to look in:

    /usr/lib/pymodules/python2.7/pandas/tools/plotting.py

    But I'm very unfamiliar with pandas, so everything looks "correct" to me
    - because I don't understand the data structure, I think I cannot see
    what is wrong...
     
    someone, Apr 18, 2013
    #4
  5. someone

    someone Guest

    Hi Neil and Wayne,

    Thank you very much for your suggestions... I now found out something:
    In the function:

    def convertListPairToTimeSeries(dList, cList):
        ...
        ...
        # create the timeseries
        ts = pandas.Series(cListL, index=indx)
        # fill in missing days
        #ts = ts.asfreq(pandas.datetools.DateOffset())
        return ts

    I had to comment out the last line before the return statement (not sure
    what that line is supposed to do in the first place)...
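
Judging by its comment ("fill in missing days"), the commented-out line appears intended to put the series on a regular daily grid, with NaN on days that have no data. The traceback suggests the generic `DateOffset()` object is what triggers the error. Below is a minimal sketch of the same step with an explicitly daily offset, assuming a recent pandas release where `pandas.offsets.Day` is available (that API choice is my assumption, not something from the original script). The values are made up for illustration.

```python
import datetime

import pandas as pd

dates = [
    datetime.date(2011, 12, 1),
    datetime.date(2011, 12, 2),
    datetime.date(2011, 12, 3),
    datetime.date(2011, 12, 10),
]
counts = [1.0, 2.0, 3.0, 4.0]  # illustrative values only

# Build a datetime-indexed series from the date objects.
ts = pd.Series(counts, index=pd.to_datetime(dates))

# Reindex onto a daily grid; days without data become NaN.
daily = ts.asfreq(pd.offsets.Day())

print(daily)
```

With a concrete daily offset the index carries a well-defined frequency, so the gap between December 3 and December 10 shows up as explicit NaN rows rather than being skipped.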

    Now the program runs, but no plot is seen. Then I found out that I had
    to add:

    import matplotlib.pyplot as plt

    at the top of the program and add the following at the bottom of the
    program:

    plt.show()


    Final program:
    ==================
    #!/usr/bin/python

    import pandas
    import datetime
    import numpy
    import ipdb
    import matplotlib.pyplot as plt

    datesList = [datetime.date(2011,12,1), \
    datetime.date(2011,12,2), \
    datetime.date(2011,12,3), \
    datetime.date(2011,12,10)]

    countsList = numpy.random.randn(len(datesList))
    startData = datetime.datetime(2011,12,3)
    endData = datetime.datetime(2011,12,8)

    def convertListPairToTimeSeries(dList, cList):
        # my dateList had date objects, so convert back to datetime objects
        dListDT = [datetime.datetime.combine(x, datetime.time()) for x in
                   dList]
        # found that NaN didn't work if the cList contained int data
        cListL = [float(x) for x in cList]
        # create the index from the datetimes list
        indx = pandas.Index(dListDT)
        # create the timeseries
        ts = pandas.Series(cListL, index=indx)
        # fill in missing days
        #ts = ts.asfreq(pandas.datetools.DateOffset())
        return ts

    print "\nOriginal datesList list:\n", datesList
    tSeries = convertListPairToTimeSeries(datesList, countsList)
    print "\nPandas timeseries:\n", tSeries

    # use slicing to change length of data
    tSeriesSlice = tSeries.ix[startData:endData]
    print "\nPandas timeseries sliced between", startData.date(), \
    "and", endData.date(), ":\n", tSeriesSlice

    # use truncate instead of slicing to change length of data
    tSeriesTruncate = tSeries.truncate(before=startData, after=endData)
    print "\nPandas timeseries truncated between", startData.date(), \
    "and", endData.date(), ":\n", tSeriesTruncate

    # my data had lots of gaps that were actually 0 values, not missing data
    # So I used this to fix the NaN outside the known outage
    startOutage = datetime.datetime(2011,12,7)
    endOutage = datetime.datetime(2011,12,8)
    tsFilled = tSeries.fillna(0)
    # set the known outage values back to NAN
    tsFilled.ix[startOutage:endOutage] = numpy.NAN
    print "\nPandas timeseries NaN reset to 0 outside known outage between", \
    startOutage.date(), "and", endOutage.date(), ":\n", tsFilled

    print "\nPandas series.tail(1) and series.head(1) are handy for " +\
    "checking ends of list:\n", tsFilled.head(1), tsFilled.tail(1)
    print
    tsFilled.plot()
    plt.show()
    ==================

    This seems to work, although I don't fully understand it, as I'm pretty
    new to pandas...
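
For anyone running this on a newer setup: the program above uses Python 2 print statements and the `.ix` indexer, which was removed in pandas 1.0. A minimal sketch of the slicing, truncating and outage-masking steps using `.loc` instead, assuming Python 3 and a recent pandas. The snake_case variable names and the sample values are mine, chosen for illustration.

```python
import datetime

import numpy as np
import pandas as pd

# Daily series with gaps, similar in shape to the example data.
index = pd.date_range("2011-12-01", "2011-12-10", freq="D")
values = [1.0, 2.0, 3.0] + [np.nan] * 6 + [4.0]
ts = pd.Series(values, index=index)

start_data = datetime.datetime(2011, 12, 3)
end_data = datetime.datetime(2011, 12, 8)

# Label-based slicing on a DatetimeIndex includes both endpoints.
ts_slice = ts.loc[start_data:end_data]
ts_truncate = ts.truncate(before=start_data, after=end_data)

# Treat gaps as zeros, then mark the known outage as missing again.
start_outage = datetime.datetime(2011, 12, 7)
end_outage = datetime.datetime(2011, 12, 8)
ts_filled = ts.fillna(0)
ts_filled.loc[start_outage:end_outage] = np.nan

print(ts_slice)
print(ts_filled)
```

Plotting works the same way as in the final program: call `ts_filled.plot()` and then `plt.show()`.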
     
    someone, Apr 18, 2013
    #5
