package extension problem

Fabrizio Pollastri

Hello,

I wish to extend the functionality of an existing Python package by creating
a new package that redefines the relevant classes of the old package. Each
new class inherits from the equivalent old class and adds new methods.

In the new package there is something like the following.

import old_package as op

class A(op.A):
    ...  # add new methods

class B(op.B):
    ...  # add new methods

Some classes of the old package work as dictionaries of other classes of the
same old package. Example: if class A and class B are classes of the old
package, B[some_hash] returns an instance of A.

When a program imports the new package and creates instances of the new
class B, B[some_hash] still returns an instance of the old class A, while I
want an instance of the new class A.
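
The situation can be reproduced with a minimal sketch. The classes below are hypothetical stand-ins (OldA and OldB play the role of op.A and op.B and are not taken from any real package); the assumption is that the old container hard-codes the old class name in its __getitem__.

```python
# Hypothetical stand-ins for the old package's classes.
class OldA:
    def __init__(self, value):
        self.value = value


class OldB:
    """Behaves like a dictionary whose values are OldA instances."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        # The old class name is hard-coded here.
        return OldA(self._data[key])


# The new package subclasses both classes and adds methods.
class A(OldA):
    def doubled(self):
        return 2 * self.value


class B(OldB):
    pass


b = B({"x": 21})
item = b["x"]
print(type(item).__name__)        # OldA, not A
print(hasattr(item, "doubled"))   # False: the added method is missing
```

Subclassing B changes nothing about what the inherited __getitem__ constructs, which is why the old class comes back.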

Is there a way to solve this problem without redefining in the new
package all the methods of the old package that return old classes?


Thanks in advance for any suggestion,
Fabrizio
 
Miki Tebeka

Fabrizio said:
B[some_hash] still returns an instance of the old class A, while I want
an instance of the new class A.

I don't understand this sentence. How is B[some_hash] related to A?

I tried the code below and it seems to work. Can you paste some code to help us understand more?

-- old.py --
class A:
    pass

-- new.py --
import old

class A(old.A):
    pass

-- main.py --
import new

a = new.A()
a.__class__  # Shows new.A
 
Fabrizio Pollastri

OK. To be clearer, consider the real Python package pandas.

This package defines a Series class and a DataFrame class.
The DataFrame is a matrix whose columns can have different types.

If I write

import pandas as pd
df = pd.DataFrame({'A':[1,2,3],'B':[4,5,6]})

a data frame with two cols named A and B is created.

If I write

col_A = df['A']

the returned col_A is an instance of Series.

Now, let's suppose that I want to extend the functionality of pandas
by adding new methods to both the Series and DataFrame classes.

One way to do this is to redefine these classes in a new package
(new_pandas) as follows

import pandas as pd

class Series(pd.Series):
    ...  # add new methods

class DataFrame(pd.DataFrame):
    ...  # add new methods

When I use the new package as a pandas substitute and write

import new_pandas as np
df = np.DataFrame({'A':[1,2,3],'B':[4,5,6]})
col_A = df['A']

col_A is an instance of the original pandas Series and not of the new
Series, so all the added functionality is lost.


Fabrizio
 
Miki Tebeka

Fabrizio said:
import new_pandas as np
df = np.DataFrame({'A':[1,2,3],'B':[4,5,6]})
col_A = df['A']

I'm not familiar with pandas, but my *guess* is that you'll need to override __getitem__ in the new DataFrame.
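
For the generic A/B case from the first post, overriding __getitem__ could look roughly like the sketch below. The classes are hypothetical stand-ins, and the sketch assumes the old item can be rebuilt from its public state (here a single value attribute); that will not hold for every class.

```python
# Hypothetical stand-ins for the old package's classes.
class OldA:
    def __init__(self, value):
        self.value = value


class OldB:
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return OldA(self._data[key])


class A(OldA):
    def doubled(self):
        return 2 * self.value


class B(OldB):
    def __getitem__(self, key):
        # Let the old class do the lookup, then rebuild the result
        # as an instance of the new class.
        old_item = super().__getitem__(key)
        return A(old_item.value)


item = B({"x": 21})["x"]
print(type(item).__name__, item.doubled())  # A 42
```

This only covers item access; any other inherited method that returns an OldA would need the same treatment.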
 
Peter Otten

Fabrizio said:
When I use the new package as a pandas substitute and write

import new_pandas as np
df = np.DataFrame({'A':[1,2,3],'B':[4,5,6]})
col_A = df['A']

col_A is an instance of the original pandas and not of the new pandas,
losing all the added functionality.

A quick look into the pandas source reveals that the following might work:

# untested
class DataFrame(pd.DataFrame):
    @property
    def _constructor(self):
        return DataFrame  # your class
    # your new methods
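
The idea behind this kind of hook can be shown without pandas. In the sketch below, OldFrame and OldSeries are hypothetical stand-ins, not pandas code: the base class asks an overridable property which class to build instead of hard-coding it. The pandas subclassing guide documents _constructor for results of the same dimension and _constructor_sliced for results with one dimension fewer, so for df['A'] the second property is probably the one that matters; I have not tested this against a particular pandas version.

```python
# Hypothetical stand-ins that mimic the constructor-hook pattern.
class OldSeries:
    def __init__(self, values):
        self.values = list(values)


class OldFrame:
    def __init__(self, columns):
        self._columns = {name: list(vals) for name, vals in columns.items()}

    @property
    def _constructor_sliced(self):
        # Hook: the class used when a single column is returned.
        return OldSeries

    def __getitem__(self, name):
        # The base class never names OldSeries directly here.
        return self._constructor_sliced(self._columns[name])


class Series(OldSeries):
    def total(self):
        return sum(self.values)


class DataFrame(OldFrame):
    @property
    def _constructor_sliced(self):
        return Series  # the subclass redirects the hook


df = DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
col_A = df["A"]
print(type(col_A).__name__, col_A.total())  # Series 6
```

Because the lookup goes through the property, the inherited __getitem__ returns the new class without being overridden.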
 
Terry Reedy

Miki Tebeka said:
import new_pandas as np
df = np.DataFrame({'A':[1,2,3],'B':[4,5,6]})
col_A = df['A']

I'm not familiar with pandas, but my *guess* will be that you'll need
to override __getitem__ in the new DataFrame.

This is essentially the same problem that if you, for instance, subclass
int as myint, you need to override (wrap) *every* method to get them to
return myints instead of ints.

class myint(int):
    ...
    def __add__(self, other):
        # int(self) avoids calling myint.__add__ again (infinite recursion)
        return myint(int(self) + other)
    ...
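
Writing each wrapper by hand gets tedious, so here is a rough sketch of generating them in a loop. The helper name _wrap and the short list of method names are my own choices for illustration; a real subclass would need a much longer list.

```python
class myint(int):
    pass


def _wrap(name):
    """Return a method that calls int's version and re-wraps int results."""
    int_method = getattr(int, name)

    def method(self, *args):
        result = int_method(self, *args)
        # Only re-wrap plain ints; NotImplemented must pass through
        # untouched so that Python can try the other operand.
        if type(result) is int:
            return myint(result)
        return result

    method.__name__ = name
    return method


# Illustrative subset of the methods that would need wrapping.
for _name in ("__add__", "__radd__", "__sub__", "__mul__", "__neg__"):
    setattr(myint, _name, _wrap(_name))

x = myint(2) + 3
print(type(x).__name__, x)  # myint 5
```

Any method left out of the list still returns a plain int, which is exactly the problem described above.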

In the OP's case, if the original class is written in Python, one might be
able to just change the __class__ attribute of the returned instance. But I
would make sure to have a good set of tests in any case.
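
A minimal sketch of the __class__ approach, using hypothetical stand-in classes. It assumes both classes are ordinary Python classes with compatible instance layouts; note that the new class's __init__ is never run on the converted object.

```python
# Hypothetical stand-in for a class from the old package.
class OldA:
    def __init__(self, value):
        self.value = value


class A(OldA):
    def doubled(self):
        return 2 * self.value


obj = OldA(21)
# Re-label the existing instance as the subclass; no copy is made.
obj.__class__ = A
print(isinstance(obj, A), obj.doubled())  # True 42
```

For classes implemented in C, or classes with differing __slots__, this assignment can raise TypeError, hence the advice about tests.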
 
