Anyone know pandas? Don't understand error: NotImplementedError...

someone

Hi,

Here's my script (from
http://brenda.moon.net.au/category/data-visualisation/):

============================
#!/usr/bin/python

import pandas
import datetime
import numpy

datesList = [datetime.date(2011,12,1), \
datetime.date(2011,12,2), \
datetime.date(2011,12,3), \
datetime.date(2011,12,10)]

countsList = numpy.random.randn(len(datesList))
startData = datetime.datetime(2011,12,3)
endData = datetime.datetime(2011,12,8)

def convertListPairToTimeSeries(dList, cList):
    # my dateList had date objects, so convert back to datetime objects
    dListDT = [datetime.datetime.combine(x, datetime.time()) for x in dList]
    # found that NaN didn't work if the cList contained int data
    cListL = [float(x) for x in cList]
    # create the index from the datestimes list
    indx = pandas.Index(dListDT)
    # create the timeseries
    ts = pandas.Series(cListL, index=indx)
    # fill in missing days
    ts = ts.asfreq(pandas.datetools.DateOffset())
    return ts

print "\nOriginal datesList list:\n", datesList
tSeries = convertListPairToTimeSeries(datesList, countsList)
print "\nPandas timeseries:\n", tSeries

# use slicing to change length of data
tSeriesSlice = tSeries.ix[startData:endData]
print "\nPandas timeseries sliced between", startData.date(), \
"and", endData.date(), ":\n", tSeriesSlice

# use truncate instead of slicing to change length of data
tSeriesTruncate = tSeries.truncate(before=startData, after=endData)
print "\nPandas timeseries truncated between", startData.date(), \
"and", endData.date(), ":\n", tSeriesTruncate

# my data had lots of gaps that were actually 0 values, not missing data
# So I used this to fix the NaN outside the known outage
startOutage = datetime.datetime(2011,12,7)
endOutage = datetime.datetime(2011,12,8)
tsFilled = tSeries.fillna(0)
# set the known outage values back to NAN
tsFilled.ix[startOutage:endOutage] = numpy.NAN
print "\nPandas timeseries NaN reset to 0 outside known outage between", \
startOutage.date(), "and", endOutage.date(), ":\n", tsFilled

print "\nPandas series.tail(1) and series.head(1) are handy for " +\
"checking ends of list:\n", tsFilled.head(1), tsFilled.tail(1)
print

tsFilled.plot(); # <====== NotImplementedError...!!!
============================


If I run it, I get:
--------------------
....
....
...
2011-12-09 0.000000
2011-12-10 1.431665
Freq: <1 DateOffset>

Pandas series.tail(1) and series.head(1) are handy for checking ends of
list:
2011-12-01 -0.969533
Freq: <1 DateOffset> 2011-12-10 1.431665
Freq: <1 DateOffset>

Traceback (most recent call last):
  File "./pandas_example.py", line 57, in <module>
    tsFilled.plot();
  File "/usr/lib/pymodules/python2.7/pandas/tools/plotting.py", line 985, in plot_series
    plot_obj.generate()
  File "/usr/lib/pymodules/python2.7/pandas/tools/plotting.py", line 376, in generate
    self._make_plot()
  File "/usr/lib/pymodules/python2.7/pandas/tools/plotting.py", line 623, in _make_plot
    if self.use_index and self._use_dynamic_x():
  File "/usr/lib/pymodules/python2.7/pandas/tools/plotting.py", line 619, in _use_dynamic_x
    return (freq is not None) and self._is_dynamic_freq(freq)
  File "/usr/lib/pymodules/python2.7/pandas/tools/plotting.py", line 602, in _is_dynamic_freq
    freq = freq.rule_code
  File "/usr/lib/pymodules/python2.7/pandas/tseries/offsets.py", line 214, in rule_code
    raise NotImplementedError
NotImplementedError
 
Wayne Werner

  File "/usr/lib/pymodules/python2.7/pandas/tseries/offsets.py", line 214, in rule_code
    raise NotImplementedError
NotImplementedError

I don't know anything about pandas, but my recommendation?

$ vim /usr/lib/pymodules/python2.7/pandas/tseries/offsets.py

(or nano or emacs - whatever editor you're comfortable with).

Go to line 214, and take a look-see at what you find. My guess is it will
be something like:

def rule_code():
    raise NotImplementedError()



Which is terribly unhelpful.

HTH,
Wayne
 
Neil Cerutti

It most likely means that the program is instantiating an
abstract base class when it should be using one of its subclasses
instead, e.g., BusinessDay, MonthEnd, MonthBegin,
BusinessMonthEnd, etc.

http://pandas.pydata.org/pandas-docs/dev/timeseries.html
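
To illustrate the pattern I mean, here is a minimal sketch. The class names `Offset` and `Day` and the helper `describe` are made up for the example and are not the actual pandas classes; I'm only assuming the base class behaves the way the traceback suggests.

```python
class Offset(object):
    """Hypothetical stand-in for a generic offset base class."""

    @property
    def rule_code(self):
        # The generic base has no meaningful code, so it refuses to answer.
        raise NotImplementedError


class Day(Offset):
    """Hypothetical concrete subclass that overrides the property."""

    @property
    def rule_code(self):
        return "D"


def describe(offset):
    """Return the offset's rule code, or a marker if it has none."""
    try:
        return offset.rule_code
    except NotImplementedError:
        return "no rule code"


generic_result = describe(Offset())  # base class: property raises
daily_result = describe(Day())       # subclass: property returns a code
```

Code that reads `rule_code` without the try/except, as the plotting code in the traceback appears to, fails as soon as it is handed the generic base object.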
 
someone

Oh, yes, Wayne - you're completely right about line 214:

# line 211 (empty line)
@property # line 212
def rule_code(self): # line 213
    raise NotImplementedError # line 214
# line 215 (empty line)

Below and above this "rule_code" is code belonging to some other
functions... hmmm... I also tried to look in:

/usr/lib/pymodules/python2.7/pandas/tools/plotting.py

But I'm very unfamiliar with pandas, so everything looks "correct" to me
- because I don't understand the data structure, I think I cannot see
what is wrong...
 
someone

Hi Neil and Wayne,

Thank you very much for your suggestions... I now found out something:
In the function:

def convertListPairToTimeSeries(dList, cList):
    ...
    ...
    # create the timeseries
    ts = pandas.Series(cListL, index=indx)
    # fill in missing days
    #ts = ts.asfreq(pandas.datetools.DateOffset())
    return ts

I had to comment out the last line before the return statement (not sure
what that line is supposed to do in the first place)...
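
Judging by its comment ("fill in missing days"), that line is meant to put the series on a regular daily index. Here is a minimal sketch of that idea, assuming a recent pandas release (not the version used above) where a daily frequency can be given as the string "D". The values are fixed numbers chosen for the example, not my real data.

```python
import datetime

import pandas as pd

# Same four dates as in the script, with made-up values.
dates = [
    datetime.datetime(2011, 12, 1),
    datetime.datetime(2011, 12, 2),
    datetime.datetime(2011, 12, 3),
    datetime.datetime(2011, 12, 10),
]
ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=pd.DatetimeIndex(dates))

# Reindex to one row per calendar day; days without data become NaN.
daily = ts.asfreq("D")

n_days = len(daily)
n_missing = int(daily.isna().sum())
```

With a concrete daily frequency instead of the generic DateOffset(), the series carries a frequency that has a rule code, which is presumably what the plotting code wanted.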

Now the program runs, but no plot is shown. Then I found out that I had
to add:

import matplotlib.pyplot as plt

at the top of the program and add the following at the bottom of the
program:

plt.show()


Final program:
==================
#!/usr/bin/python

import pandas
import datetime
import numpy
import matplotlib.pyplot as plt

datesList = [datetime.date(2011,12,1), \
datetime.date(2011,12,2), \
datetime.date(2011,12,3), \
datetime.date(2011,12,10)]

countsList = numpy.random.randn(len(datesList))
startData = datetime.datetime(2011,12,3)
endData = datetime.datetime(2011,12,8)

def convertListPairToTimeSeries(dList, cList):
    # my dateList had date objects, so convert back to datetime objects
    dListDT = [datetime.datetime.combine(x, datetime.time()) for x in dList]
    # found that NaN didn't work if the cList contained int data
    cListL = [float(x) for x in cList]
    # create the index from the datestimes list
    indx = pandas.Index(dListDT)
    # create the timeseries
    ts = pandas.Series(cListL, index=indx)
    # fill in missing days
    #ts = ts.asfreq(pandas.datetools.DateOffset())
    return ts

print "\nOriginal datesList list:\n", datesList
tSeries = convertListPairToTimeSeries(datesList, countsList)
print "\nPandas timeseries:\n", tSeries

# use slicing to change length of data
tSeriesSlice = tSeries.ix[startData:endData]
print "\nPandas timeseries sliced between", startData.date(), \
"and", endData.date(), ":\n", tSeriesSlice

# use truncate instead of slicing to change length of data
tSeriesTruncate = tSeries.truncate(before=startData, after=endData)
print "\nPandas timeseries truncated between", startData.date(), \
"and", endData.date(), ":\n", tSeriesTruncate

# my data had lots of gaps that were actually 0 values, not missing data
# So I used this to fix the NaN outside the known outage
startOutage = datetime.datetime(2011,12,7)
endOutage = datetime.datetime(2011,12,8)
tsFilled = tSeries.fillna(0)
# set the known outage values back to NAN
tsFilled.ix[startOutage:endOutage] = numpy.NAN
print "\nPandas timeseries NaN reset to 0 outside known outage between", \
startOutage.date(), "and", endOutage.date(), ":\n", tsFilled

print "\nPandas series.tail(1) and series.head(1) are handy for " +\
"checking ends of list:\n", tsFilled.head(1), tsFilled.tail(1)
print
tsFilled.plot()
plt.show()
==================

This seems to work, although I don't fully understand it, as I'm pretty
new to pandas...
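
For anyone on a newer setup, here is a rough sketch of the same slicing and fill steps, assuming Python 3 and a recent pandas where `.ix` is gone and `.loc` is used for label slicing. It uses fixed values instead of random ones and leaves out the plot; the short variable names are mine.

```python
import datetime

import numpy as np
import pandas as pd

dates = [datetime.date(2011, 12, d) for d in (1, 2, 3, 10)]
counts = [1.0, 2.0, 3.0, 4.0]  # made-up values for the example

# Build the series and fill in the missing days with NaN.
ts = pd.Series(counts, index=pd.to_datetime(dates)).asfreq("D")

start = datetime.datetime(2011, 12, 3)
end = datetime.datetime(2011, 12, 8)

# Label-based slicing includes both end points, like truncate().
sliced = ts.loc[start:end]
truncated = ts.truncate(before=start, after=end)

# Treat gaps as zeros, then mark the known outage as missing again.
outage_start = datetime.datetime(2011, 12, 7)
outage_end = datetime.datetime(2011, 12, 8)
filled = ts.fillna(0)
filled.loc[outage_start:outage_end] = np.nan
```

Calling `filled.plot()` and then `plt.show()` should then work the same way as in the final program above.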