Data mapper - need to map a dictionary of values to a model

Luke · Jan 15, 2008

I am writing an order management console. I need to create an import
system that is easy to extend. For now, I want to accept a dictionary
of values and map them to my data model. The thing is, I need to do
things to certain columns:

- I need to filter some of the values (data comes in as
YYYY-MM-DDTHH:MM:SS-(TIMEZONE-OFFSET) and it needs to map to Order.date
as a YYYY-MM-DD field)
- I need to map parts of an input column to more than one model param
(for instance if I get a full name for input--like "John Smith"--I
need a function to break it apart and map it to
Order.shipping_first_name and Order.shipping_last_name)
- Sometimes I need to do it the other way too... I need to map
multiple input columns to one model param (If I get a shipping fee, a
shipping tax, and a shipping discount, I need them added together and
mapped to Order.shipping_fee)

I have begun this process, but I'm finding it difficult to come up
with a good system that is extensible and easy to understand. I won't
always be the one writing the importers, so I'd like it to be pretty
straightforward. Any ideas?

Oh, I should also mention that many times the data will map to several
different models. For instance, the importer I'm writing first would
map to 3 different models (Order, OrderItem, and OrderCharge)

I am not looking for anybody to write any code for me. I'm simply
asking for inspiration. What design patterns would you use here? Why?

bearophileHUGS · Jan 15, 2008

Luke:

What design patterns would you use here?

What about "generator (scanner) with parameters"?

Bye,
bearophile

Luke · Jan 15, 2008

I'm not familiar with this pattern. I will search around, but if you
have any links or you would like to elaborate, that would be
wonderful.

bearophileHUGS · Jan 15, 2008

It's not a pattern, it's a little thing:

def line_filter(filein, params):
    for line in filein:
        if good(line, params):
            yield extract(line, params)

That is equivalent to this:

def line_filter(filein, params):
    return (extract(line, params) for line in filein
            if good(line, params))
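As a minimal runnable sketch of the same idea: the bodies of good() and extract() below are hypothetical stand-ins (they are not defined anywhere in this thread), and the 'sep' and 'column' keys in params are assumptions made only for illustration.

```python
def line_filter(filein, params):
    """Yield the extracted value of every line that passes the filter."""
    for line in filein:
        if good(line, params):
            yield extract(line, params)

def good(line, params):
    # Hypothetical filter: keep non-blank lines that have enough columns.
    fields = line.rstrip("\n").split(params["sep"])
    return bool(line.strip()) and len(fields) > params["column"]

def extract(line, params):
    # Hypothetical extraction: return one column with whitespace stripped.
    return line.rstrip("\n").split(params["sep"])[params["column"]].strip()

# Any iterable of lines works here, including an open file object.
lines = ["1001,John Smith,30.00\n", "\n", "1002,Jane Doe,12.50\n", "broken\n"]
names = list(line_filter(lines, {"sep": ",", "column": 1}))
print(names)  # ['John Smith', 'Jane Doe']
```

Because line_filter is a generator, nothing is read until the result is iterated, so it can be chained with other filters without loading the whole input.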

But probably that's not enough to solve your problem, so other people
can give you a better answer.

Bye,
bearophile

George Sakkis · Jan 15, 2008

The specific transformations you describe are simple to code
directly, but unless you constrain the set of transformations that
can take place, I don't see how this can be generalized in any
useful way. It just seems too open-ended.

The only pattern I can see here is breaking down the overall
transformation into independent steps, just like the three you
described. Given some way to specify each separate transformation,
their combination can be factored out. To illustrate, here's a trivial
example (with dicts for both input and output):

class MultiTransformer(object):
    def __init__(self, *tranformers):
        self._tranformers = tranformers

    def __call__(self, input):
        output = {}
        for t in self._tranformers:
            output.update(t(input))
        return output

date_tranformer = lambda input: {'date' : input['date'][:10]}
name_tranformer = lambda input: dict(
    zip(('first_name', 'last_name'),
        input['name']))
fee_tranformer = lambda input: {'fee' : sum([input['fee'],
                                             input['tax'],
                                             input['discount']])}
tranformer = MultiTransformer(date_tranformer,
                              name_tranformer,
                              fee_tranformer)
print tranformer(dict(date='2007-12-22 03:18:99-EST',
                      name='John Smith',
                      fee=30450.99,
                      tax=459.15,
                      discount=985))
# output
# {'date': '2007-12-22', 'fee': 31895.140000000003,
#  'first_name': 'J', 'last_name': 'o'}

You can see that the MultiTransformer doesn't buy you much by itself;
it just allows dividing the overall task into smaller bits that can be
documented, tested and reused separately. For anything more
sophisticated, you have to constrain which transformations are
possible. I did something similar for transforming CSV input rows
(http://pypi.python.org/pypi/csvutils/), where it's easy to specify
1-to-{0,1} transformations but not 1-to-many or many-to-1.
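One way to constrain the transformations, sketched here only as an illustration: describe each one as a rule that names its input keys, its "Model.field" targets, and a function. The Rule tuple, apply_rules() and the sample keys (including 'sku') are all hypothetical names invented for this sketch; it is not how csvutils works.

```python
from collections import namedtuple

# Hypothetical rule: read `sources` from the input dict, pass them to `func`,
# and store the result(s) under `targets`, written as "Model.field".
Rule = namedtuple("Rule", "sources targets func")

def apply_rules(rules, row):
    """Return {model_name: {field: value}} for one input row."""
    models = {}
    for rule in rules:
        values = rule.func(*[row[name] for name in rule.sources])
        if len(rule.targets) == 1:
            values = (values,)  # single-target functions return a bare value
        for target, value in zip(rule.targets, values):
            model, field = target.split(".")
            models.setdefault(model, {})[field] = value
    return models

ORDER_RULES = [
    # 1-to-1: keep only the date part of the timestamp
    Rule(("date",), ("Order.date",), lambda s: s[:10]),
    # 1-to-many: split a full name at the first run of whitespace
    Rule(("name",),
         ("Order.shipping_first_name", "Order.shipping_last_name"),
         lambda s: tuple(s.split(None, 1))),
    # many-to-1: add up the shipping charges
    # (assumes the discount arrives as a negative number)
    Rule(("fee", "tax", "discount"), ("Order.shipping_fee",),
         lambda fee, tax, discount: fee + tax + discount),
    # a second model fed from the same row
    Rule(("sku",), ("OrderItem.sku",), lambda s: s),
]

row = {"date": "2007-12-22T03:18:59-05:00", "name": "John Smith",
       "fee": 30, "tax": 4, "discount": -5, "sku": "ABC-1"}
result = apply_rules(ORDER_RULES, row)
print(result["Order"]["shipping_fee"])  # 29
```

Each importer then becomes a list of rules that can be read and tested one at a time. One known gap in this sketch: a single-word name produces only one value, so zip() silently leaves the last name unset.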

HTH,
George

George Sakkis · Jan 16, 2008

name_tranformer = lambda input: dict(
    zip(('first_name', 'last_name'),
        input['name']))

Of course that should read:

name_tranformer = lambda input: dict(
    zip(('first_name', 'last_name'),
        input['name'].split()))
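For reference, a sketch of the whole example with that fix applied, written so it also runs on Python 3. The spelling of the names and the use of plain functions instead of lambdas are editorial choices for this sketch, not part of the original post.

```python
class MultiTransformer:
    """Combine several dict-to-dict transformers into one callable."""

    def __init__(self, *transformers):
        self._transformers = transformers

    def __call__(self, row):
        output = {}
        for t in self._transformers:
            output.update(t(row))
        return output

def date_transformer(row):
    # Keep only the leading YYYY-MM-DD part of the timestamp.
    return {'date': row['date'][:10]}

def name_transformer(row):
    # split() with no arguments splits on any run of whitespace.
    return dict(zip(('first_name', 'last_name'), row['name'].split()))

def fee_transformer(row):
    return {'fee': row['fee'] + row['tax'] + row['discount']}

transformer = MultiTransformer(date_transformer,
                               name_transformer,
                               fee_transformer)
result = transformer(dict(date='2007-12-22 03:18:99-EST',
                          name='John Smith',
                          fee=30450.99,
                          tax=459.15,
                          discount=985))
print(result['first_name'], result['last_name'])  # John Smith
```

Note that a name with more than two words still loses information: zip() stops after two parts, so "Mary Ann Smith" would give a last name of "Ann".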

George

Luke · Jan 16, 2008

Thank you, that is very helpful. I will ponder that for a while.
