Re-thinking my if-thens - a software engineering question

M

metaperl

Ok, I have a module called textgen.py. The point of this module is to
generate a csv file from an array of dictionaries. As I iterate through
each dictionary, I "massage" the dictionary values before writing them
out to csv. Now, for one dictionary entry, I have the following code:

if dict_key == 'PCN':
fields = dict_val.split("/")
mo = re.match( '(\w{2})(\d{2})(\d{2})' , fields[1] )
if mo:
dict_val = "%s/%s%s/%s" % (fields[0], mo.group(1), mo.group(3),
fields[2][1:])
else:
dict_val = dict_val

Ok, so now here is what has happened. This code was based on the
assumption that dict_val would have 2 forward slashes in it. It turns
out that I need to follow a different process of massaging when no
slashes are present. A naive solution would be something like:

if dict_key == 'PCN':
fields = dict_val.split("/")
if fields == 3:
dict_val = pcn_three(fields) # where pcn_three
is the code above
else:
# new logic

But I am wondering if I should abstract the flow of control into a
class or something.

Ideas welcome.
 
N

Neil Cerutti

if dict_key == 'PCN':
fields = dict_val.split("/")
mo = re.match( '(\w{2})(\d{2})(\d{2})' , fields[1] )
if mo:
dict_val = "%s/%s%s/%s" % (fields[0], mo.group(1), mo.group(3),
fields[2][1:])
else:
dict_val = dict_val

Ok, so now here is what has happened. This code was based on
the assumption that dict_val would have 2 forward slashes in
it. It turns out that I need to follow a different process of
massaging when no slashes are present. A naive solution would
be something like:

if dict_key == 'PCN':
fields = dict_val.split("/")
if fields == 3:
dict_val = pcn_three(fields) # where pcn_three
is the code above
else:
# new logic

But I am wondering if I should abstract the flow of control
into a class or something.

This is what I do in Python when a new requirement pops up:

1. Write the simplest/first thing that comes to mind to fix it.
1. a) Am I done? Probably. But maybe not.
2. Now I examine what I've written to see the lay of the code.
Only after writing something new once do I usually have
enough information to write it better. In other words,
writing the code organizes my thoughts. I usually have to
fully understand something, even to get a kludgey solution to
work. The Kludgey solution informs the design of something
better.
2. a) Got to 1. a)

In the case above, I've tried to figure out what you're
specifically doing, and failed. So I don't have more specific
advice.
 
B

Bruno Desthuilliers

metaperl a écrit :
Ok, I have a module called textgen.py. The point of this module is to
generate a csv file from an array of dictionaries.

Err... You know there's a csv module in the stdlib, don't you ?
As I iterate through
each dictionary, I "massage" the dictionary values before writing them
out to csv. Now, for one dictionary entry, I have the following code:

if dict_key == 'PCN':
fields = dict_val.split("/")
mo = re.match( '(\w{2})(\d{2})(\d{2})' , fields[1] )
if mo:
dict_val = "%s/%s%s/%s" % (fields[0], mo.group(1), mo.group(3),
fields[2][1:])
else:
dict_val = dict_val

FWIW, you could as well skip this else clause
Ok, so now here is what has happened. This code was based on the
assumption that dict_val would have 2 forward slashes in it. It turns
out that I need to follow a different process of massaging when no
slashes are present. A naive solution would be something like:

if dict_key == 'PCN':
fields = dict_val.split("/")
if fields == 3: if len(fields) == 3:
dict_val = pcn_three(fields) # where pcn_three
is the code above
else:
# new logic

But I am wondering if I should abstract the flow of control into a
class or something.

Depends... If your code is starting to smell, then yes, it's probably
time to refactor a bit. Note that you don't necessarily needs OO to
clean up a mess. Python functions are objects on their own, so you can
use them like any other variable. A common idiom is to use a dict as a
dispatch table:

def test1(arg):
print "test1", arg

def test2(arg):
print "test2", arg

def test3(arg):
print "test3", arg

dispatch = {
'TEST_ONE': test1,
'TEST_TWO': test2,
'TEST_THREE': test3,
}
default = lambda arg: arg

#...

func = dispatch(dict_key)
dict_val = func(fields)

And using lambdas you can even come close to Perl !-)

dispatch = {
'PCN' : lambda arg: \
(new_logic, pcn_three)[arg.count('/') == 2](arg.split('/')),
'XXX' : code_here,
...
}

Or you could write some "dispatcher helper" functions, like:

def dispatch_on_arg_count(default, *funcs):
def _dispatcher(*args):
try:
func = funcs[len(args)]
except IndexError:
func =default
return func(*args)
return _dispatcher

dispatch = {
'PCN': dispatch_on_arg_count(
default,
default,
pcn_one,
pcn_two,
pcn_three
),
'XXX' : code_here,
}


Or you could check Philip Eby's dispatch package...

And of course, since Python classes are themselves callables (no 'new'
operator, you just call the class to instanciate it), and instances can
be callable too (just define a __call__(self, ...) method), you can
throw OO in the mix too !-)

Now weither it will make your code more readable is another question...
Ideas welcome.

HTH
 
S

Steven D'Aprano

Ok, I have a module called textgen.py. The point of this module is to
generate a csv file from an array of dictionaries.

You probably should use the csv module to do the grunt work, leaving your
module just to massage the dicts into a form that csv can handle.

As I iterate through
each dictionary, I "massage" the dictionary values before writing them
out to csv. Now, for one dictionary entry, I have the following code:

if dict_key == 'PCN':
fields = dict_val.split("/")
mo = re.match( '(\w{2})(\d{2})(\d{2})' , fields[1] )
if mo:
dict_val = "%s/%s%s/%s" % (fields[0], mo.group(1), mo.group(3),
fields[2][1:])
else:
dict_val = dict_val

Your indentation seems very inconsistent -- the lines starting with
"fields = ..." and "mo = ..." should have the same indentation.

The line "dict_val = dict_val" is a no-op. You could just as easily write
that as "pass", or even better, leave out the entire else clause.

Reading between the lines, it sounds like you have a mass of if...elif
clauses, like this:

if dict_key == 'PCN':
# some processing
elif dict_key == 'NCP':
# some different processing
elif ...
# blah blah blah
else:
# default processing


This is an excellent candidate for dictionary-based dispatching. First,
write a function for each processing block:

def process_PCN(value):
# do stuff to value
return result

and then create a dispatch table:

dispatcher = {"PCN": process_PCN, "NCP": process_NCP, ...}

Now your huge if...elif block becomes one line:

# no default processing
dispatcher[dict_key](dict_value)

# with default processing
dispatcher.get(dict_key, process_default)(dict_value)

Ok, so now here is what has happened. This code was based on the
assumption that dict_val would have 2 forward slashes in it. It turns
out that I need to follow a different process of massaging when no
slashes are present. A naive solution would be something like:

if dict_key == 'PCN':
fields = dict_val.split("/")
if fields == 3:

That can't possibly work. fields is a list, not an integer.

I think you want if len(fields) == 3.
dict_val = pcn_three(fields) # where pcn_three
is the code above
else:
# new logic

But I am wondering if I should abstract the flow of control into a
class or something.

Nope. Abstract it into functions.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,055
Latest member
SlimSparkKetoACVReview

Latest Threads

Top