Factory pattern implementation in Python

Discussion in 'Python' started by googlegroups@romulo.e4ward.com, Dec 4, 2006.

  1. 4ward.com

    4ward.com Guest

    Hi,

    I need to parse a binary file produced by an embedded system, whose
    content consists of a set of events laid out like this:

    <event 1> <data 1> <event 2> <data 2> ... <event n> <data n>

    Every "event" is a single byte in size, and it indicates how long the
    associated "data" is. Thus, to parse all events in the file, I need to
    treat it as a stream and read one event at a time, consuming bytes
    according to the event value, and jumping to the next event, until
    EOF is reached.

    Since there are dozens of almost completely heterogeneous events and
    each one of them may imply different actions on the program parsing the
    file, I thought it would be convenient to have one class encapsulating
    the logic for every event. The parser would then sit in a loop,
    creating objects of different classes and calling a method (say
    "execute"). That method (different in every class) is responsible for
    consuming the bytes associated with the event.

    Hence, as the class the parser needs to instantiate in each iteration
    is not known in advance, a factory should be implemented. Somehow the
    factory should know how to map an event to a class. I don't know the
    best way to do that in Python. I made an attempt along the following
    lines:

    1. Create a base class for the events;
    2. For every descendant class declare (in the class body) a public
    attribute "eventNum" and assign it the value of the event it will be
    responsible for;
    3. At runtime, the factory constructor scans the event class hierarchy
    and builds a dictionary mapping "eventNum"'s to classes.

    A draft of the implementation follows:

    #################################

    ##### <events.py module> #####

    class EvtBase:
        def __init__(self, file):
            self.file = file

        def execute(self):
            pass

    class Evt1(EvtBase):
        eventNum = 1
        def execute(self):
            ...

    class Evt2(EvtBase):
        eventNum = 2
        def execute(self):
            ...

    ...

    class EvtN(EvtBase):
        eventNum = N
        def execute(self):
            ...


    ##### <factory.py module> #####

    import inspect
    import events

    class Factory:
        def __isValidEventClass(self, obj):
            if inspect.isclass(obj) and obj != events.EvtBase and \
                    events.EvtBase in inspect.getmro(obj):
                for m in inspect.getmembers(obj):
                    if m[0] == 'eventNum':
                        return True
            return False

        def __init__(self):
            self.__eventDict = {}
            for m in inspect.getmembers(events, self.__isValidEventClass):
                cls = m[1]
                self.__eventDict.update({cls.eventNum: cls})

        def parseEvents(self, file):
            while not file.eof():
                ev = file.read(1)
                self.__eventDict[ev](file).execute()

    #################################

    I'm using the inspect module to find the event classes. One drawback of
    this approach is the need to keep the event classes in a module
    different from that of the factory, because the getmembers method
    expects an already parsed object or module. (The advantage is keeping
    the event number near the class declaration.) I've already had to make
    the solution generic and I found it was not straightforward to separate
    the common logic while avoiding the need to keep the factory and the
    events in two distinct modules.
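
    One possible way around the two-module constraint, sketched under the
    assumption that every event class derives directly from EvtBase: a
    class records its direct subclasses and exposes them through
    __subclasses__(), so the map can be built in the same module and
    without inspect. The class names PingEvent and TextEvent, the helper
    build_event_map, and the payload sizes are hypothetical, and the
    end-of-file test assumes a Python 3 binary stream whose read() returns
    an empty bytes object at EOF.

```python
import io


class EvtBase:
    """Base class; concrete events set eventNum and override execute()."""

    def __init__(self, stream):
        self.stream = stream

    def execute(self):
        raise NotImplementedError


class PingEvent(EvtBase):  # hypothetical event with a 2-byte payload
    eventNum = 1

    def execute(self):
        return ("ping", self.stream.read(2))


class TextEvent(EvtBase):  # hypothetical event with a 3-byte payload
    eventNum = 2

    def execute(self):
        return ("text", self.stream.read(3))


def build_event_map(base=EvtBase):
    """Map eventNum -> class for every direct subclass that declares one."""
    return {cls.eventNum: cls
            for cls in base.__subclasses__()
            if hasattr(cls, "eventNum")}


def parse_events(stream, event_map):
    """Read one type byte at a time; stop when read() returns b'' (EOF)."""
    results = []
    while True:
        code = stream.read(1)
        if not code:
            break
        # Indexing a bytes object yields the byte value as an int.
        results.append(event_map[code[0]](stream).execute())
    return results


event_map = build_event_map()
parsed = parse_events(io.BytesIO(b"\x01ab\x02xyz"), event_map)
print(parsed)
```

    Note that __subclasses__() only lists direct subclasses, so a deeper
    hierarchy would need a recursive walk.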

    Is there anything better I can do? I don't have enough experience with
    Python, so I don't know whether it offers a more obvious way to
    address my problem.

    Thanks in advance.

    --
    Romulo A. Ceccon
    'romulo%s\x40yahoo.com.br' % 'ceccon'
     
    4ward.com, Dec 4, 2006
    #1

  2. 4ward.com wrote:

    > Is there anything better I can do? I don't have enough experience with
    > Python, so I don't know whether it offers a more obvious way to
    > address my problem.
    >
    > Thanks in advance.


    If you actually intend to
    1) name your Event subclasses Evt1, Evt2, ... EvtN and not give more
    descriptive (but unrelated to the magic event number) names, and
    2) put them all in one module (events.py),
    you can avoid the code duplication of putting the event number both in
    the class name and as a class attribute. Your dispatcher could then be
    as simple as:

    import events

    def parseEvents(file):
        while not file.eof():
            ev = int(file.read(1))
            cls = getattr(events, 'Evt%d' % ev)
            cls(file).execute()

    By the way, it is not clear from your description if the event number
    equals to the size of the associated data. If it is, you can factor out
    the data extraction part in the factory function and pass just the
    extracted data in the Event constructor instead of the file:

    def parseEvents(file):
        while not file.eof():
            ev = int(file.read(1))
            cls = getattr(events, 'Evt%d' % ev)
            cls(file.read(ev)).execute()
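
    A runnable sketch of the name-based dispatch, under a few assumptions
    that are not in the snippets above: the stream is a Python 3 binary
    stream (so end of file shows up as an empty read rather than through an
    eof() method), the event number is the raw byte value, and a
    SimpleNamespace stands in for the events module. The payload sizes are
    made up for the demonstration.

```python
import io
import types


class Evt1:
    def __init__(self, stream):
        self.stream = stream

    def execute(self):
        return ("Evt1", self.stream.read(1))  # assumed 1-byte payload


class Evt2:
    def __init__(self, stream):
        self.stream = stream

    def execute(self):
        return ("Evt2", self.stream.read(2))  # assumed 2-byte payload


# Stand-in for "import events": any object with Evt<N> attributes works.
events = types.SimpleNamespace(Evt1=Evt1, Evt2=Evt2)


def parse_events(stream):
    results = []
    while True:
        raw = stream.read(1)
        if not raw:          # empty read means end of stream
            break
        ev = raw[0]          # byte value as an int
        cls = getattr(events, "Evt%d" % ev)
        results.append(cls(stream).execute())
    return results


results = parse_events(io.BytesIO(b"\x02hi\x01!"))
print(results)
```

    An unknown event number surfaces here as an AttributeError from
    getattr, which the caller may want to catch and report.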


    George
     
    George Sakkis, Dec 4, 2006
    #2

  3. 4ward.com

    Chris Mellon Guest

    On 4 Dec 2006 08:39:17 -0800, 4ward.com
    <4ward.com> wrote:
    > Is there anything better I can do? I don't have enough experience with
    > Python, so I don't know whether it offers a more obvious way to
    > address my problem.
    >


    I'd have the classes register themselves rather than trying to find
    them. This removes the need to have a common base class (preserves
    duck typing) and lets everything communicate via the factory module.

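    #in module Factory.py

    EventMap = {}

    #in module events.py

    import Factory

    class EventHandler:
        def handleEvent(self, data):
            pass  # handle the event's data here

    # register after the class body, once the name EventHandler exists
    Factory.EventMap[1] = EventHandler

    #in module parser.py

    import Factory
    import events  # importing events runs the registration above

    handler = Factory.EventMap[event]()
    handler.handleEvent(data)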


    There's probably some way to wrap the registration up in a metaclass
    so it's handled implicitly, but I prefer the explicit approach.
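
    For comparison, a minimal sketch of what the implicit, metaclass-based
    registration might look like in Python 3 syntax. The names
    RegisteringMeta, event_code, StartHandler and StopHandler are
    hypothetical; only EventMap and handleEvent come from the example
    above.

```python
EventMap = {}  # event code -> handler class


class RegisteringMeta(type):
    """Metaclass that records every class declaring an event_code."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        code = namespace.get("event_code")
        if code is not None:
            EventMap[code] = cls


class StartHandler(metaclass=RegisteringMeta):
    event_code = 1

    def handleEvent(self, data):
        return "start:%r" % (data,)


class StopHandler(metaclass=RegisteringMeta):
    event_code = 2

    def handleEvent(self, data):
        return "stop:%r" % (data,)


handler = EventMap[1]()
print(handler.handleEvent(b"\x00"))
```

    The handlers still share no base class, only a metaclass, so duck
    typing is preserved; the cost is that the registration is no longer
    visible at the point where it happens.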


     
    Chris Mellon, Dec 4, 2006
    #3
  4. George Sakkis wrote:

    > If you actually intend to
    > 1) name your Event subclasses Evt1, Evt2, ... EvtN and not give more
    > descriptive (but unrelated to the magic event number) names


    No, those names are just an example. The actual classes have
    descriptive names.

    > By the way, it is not clear from your description if the event number
    > equals to the size of the associated data.


    I'm sorry, George. The event number has nothing to do with the size of
    the associated data. I meant the program has a way to discover the size
    from the event number.

    --
    Romulo A. Ceccon
    'romulo%s\x40yahoo.com.br' % 'ceccon'
     
    Romulo A. Ceccon, Dec 4, 2006
    #4
  5. Romulo A. Ceccon wrote:
    > George Sakkis wrote:
    >
    > > If you actually intend to
    > > 1) name your Event subclasses Evt1, Evt2, ... EvtN and not give more
    > > descriptive (but unrelated to the magic event number) names

    >
    > No, those names are just an example. The actual classes have
    > descriptive names.


    Even then, I'd prefer a naming convention plus a global Event registry
    over relying on inspect, both for implementation and (mostly)
    documentation reasons. It's good if a human can browse through a list
    of a few dozen names and immediately know that
    CamelCasedNameEndingWithEvent is an Event subclass. It's also good to
    be able to find in one place the explicit mapping of magic numbers to
    classes rather than searching in the whole file (or worse, multiple
    files) for it. YMMV.

    George
     
    George Sakkis, Dec 4, 2006
    #5
  6. On 4 Dec 2006 08:39:17 -0800, 4ward.com declaimed
    the following in comp.lang.python:

    > Hi,
    >
    > I need to parse a binary file produced by an embedded system, whose
    > content consists of a set of events laid out like this:
    >
    > <event 1> <data 1> <event 2> <data 2> ... <event n> <data n>
    >
    > Every "event" is a single byte in size, and it indicates how long the
    > associated "data" is. Thus, to parse all events in the file, I need to


    Unclear: is <event x> /just/ a length marker, and no two /types/ of
    events have the same data length, or is it an event type code, and the
    length of the data is implicit?
    > Is there anything better I can do? I don't have enough experience with
    > Python, so I don't know whether it offers a more obvious way to
    > address my problem.
    >

    I've been playing with Python for a decade now, and never used
    inspect or any of the complexities you seem to be starting with.

    Presuming the <event x> is a type code I'd just set up a list of
    functions:

    def process_1(stream):
        data = stream.read(...)  # number of bytes specific to type 1
        # do stuff with data

    def process_2(stream):
        ...  # repeat until all are defined

    Then create a dictionary of them, keyed by the <event x> code

    processors = { "1" : process_1,
                   "2" : process_2,
                   ....
                   "x" : process_x }


    and a main processing loop of

    while not stream.eof():
        code = stream.read(1)
        processors[code](stream)
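
    A self-contained version of the same idea, as a sketch only. It assumes
    a Python 3 binary stream (so the keys are bytes and end of file is
    detected by an empty read), and the payload sizes of 2 and 4 bytes are
    invented for the demonstration.

```python
import io


def process_1(stream):
    data = stream.read(2)   # assumed: type 1 carries 2 bytes
    return ("type 1", data)


def process_2(stream):
    data = stream.read(4)   # assumed: type 2 carries 4 bytes
    return ("type 2", data)


# Dispatch table keyed by the one-byte event code as read from the stream.
processors = {b"1": process_1, b"2": process_2}


def run(stream):
    results = []
    while True:
        code = stream.read(1)
        if not code:        # empty read means end of stream
            break
        results.append(processors[code](stream))
    return results


results = run(io.BytesIO(b"1ab2wxyz"))
print(results)
```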
    
    -- 
    	Wulfraed	Dennis Lee Bieber		KD6MOG
    			
    		HTTP://wlfraed.home.netcom.com/
    	(Bestiaria Support Staff:		)
    		HTTP://www.bestiaria.com/
     
    Dennis Lee Bieber, Dec 4, 2006
    #6
  7. 4ward.com

    Guest

    Dennis Lee Bieber:
    > Presuming the <event x> is a type code I'd just set up a list of functions:
    > Then create a dictionary of them, keyed by the <event x> code
    > processors = { "1" : process_1,
    > "2" : process_2,
    > ....
    > "x" : process_x }


    Just a dict of functions was my solution too, I think avoiding more
    complex solutions is positive.

    Bye,
    bearophile
     
    bearophile, Dec 4, 2006
    #7
  8. 4ward.com

    Terry Reedy Guest

    bearophile wrote:
    > Dennis Lee Bieber:
    >> Presuming the <event x> is a type code I'd just set up a list of
    >> functions:
    >> Then create a dictionary of them, keyed by the <event x> code
    >> processors = { "1" : process_1,
    >> "2" : process_2,
    >> ....
    >> "x" : process_x }

    >
    > Just a dict of functions was my solution too, I think avoiding more
    > complex solutions is positive.


    If the event codes start at 0 and run sequentially, a tuple or list would
    be even easier.
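
    A short sketch of the list variant, assuming the codes are raw byte
    values 0, 1, 2, ... with no gaps and a Python 3 binary stream; the
    function names and payload sizes are hypothetical.

```python
import io


def process_0(stream):
    return ("zero", stream.read(1))   # assumed 1-byte payload


def process_1(stream):
    return ("one", stream.read(2))    # assumed 2-byte payload


# Index in the list == event code, so the codes must be contiguous from 0.
processors = [process_0, process_1]


def run(stream):
    results = []
    while True:
        raw = stream.read(1)
        if not raw:                   # empty read means end of stream
            break
        results.append(processors[raw[0]](stream))
    return results


results = run(io.BytesIO(b"\x01ab\x00c"))
print(results)
```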
     
    Terry Reedy, Dec 4, 2006
    #8
  9. 4ward.com

    Paul McGuire Guest

    bearophile wrote:
    > Dennis Lee Bieber:
    >> Presuming the <event x> is a type code I'd just set up a list of
    >> functions:
    >> Then create a dictionary of them, keyed by the <event x> code
    >> processors = { "1" : process_1,
    >> "2" : process_2,
    >> ....
    >> "x" : process_x }

    >
    > Just a dict of functions was my solution too, I think avoiding more
    > complex solutions is positive.
    >
    > Bye,
    > bearophile
    >

    I think I'd go one step up the OO ladder and match each event code to a
    class. Have every class implement a staticmethod something like
    "load(stream)" (using pickle terminology), and then use a dict to dispatch.

    eventTypes = { "1" : BlahEvent, "2" : BlehEvent, "3" : BelchEvent,
                   "4" : BlechEvent }

    eventObj = eventTypes[ stream.read(1) ].load( stream )

    Now transcending from plain-old-OO to Pythonic idiom, make this into a
    generator:

    eventTypes = { "1" : BlahEvent, "2" : BlehEvent, "3" : BelchEvent,
                   "4" : BlechEvent }
    def eventsFromStream(stream):
        while not stream.EOF:
            evtTyp = stream.read(1)
            yield eventTypes[evtTyp].load(stream)

    and then get them all in a list using

    list( eventsFromStream( stream ) )
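
    Filled in so that it runs, the generator approach might look like the
    sketch below. The class bodies and payload sizes are assumptions made
    for the demonstration, and the loop tests for an empty read instead of
    a stream.EOF attribute, since Python 3 binary streams signal end of
    file that way.

```python
import io


class BlahEvent:
    def __init__(self, payload):
        self.payload = payload

    @staticmethod
    def load(stream):
        return BlahEvent(stream.read(2))   # assumed 2-byte payload


class BlehEvent:
    def __init__(self, payload):
        self.payload = payload

    @staticmethod
    def load(stream):
        return BlehEvent(stream.read(3))   # assumed 3-byte payload


eventTypes = {b"1": BlahEvent, b"2": BlehEvent}


def eventsFromStream(stream):
    """Yield one event object per record until the stream is exhausted."""
    while True:
        evtTyp = stream.read(1)
        if not evtTyp:
            return
        yield eventTypes[evtTyp].load(stream)


loaded = list(eventsFromStream(io.BytesIO(b"1ab2xyz")))
print([(type(e).__name__, e.payload) for e in loaded])
```

    Because the events are produced lazily, a caller can also iterate over
    eventsFromStream(stream) directly and stop early without reading the
    rest of the file.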

    -- Paul
     
    Paul McGuire, Dec 4, 2006
    #9
  10. At Monday 4/12/2006 13:39, 4ward.com wrote:

    >class Factory:
    >    def __isValidEventClass(self, obj):
    >        if inspect.isclass(obj) and obj != events.EvtBase and \
    >                events.EvtBase in inspect.getmro(obj):
    >            for m in inspect.getmembers(obj):
    >                if m[0] == 'eventNum':
    >                    return True
    >        return False
    >
    >    def __init__(self):
    >        self.__eventDict = {}
    >        for m in inspect.getmembers(events, self.__isValidEventClass):
    >            cls = m[1]
    >            self.__eventDict.update({cls.eventNum: cls})
    >
    >    def parseEvents(self, file):
    >        while not file.eof():
    >            ev = file.read(1)
    >            self.__eventDict[ev](file).execute()


    You already got other ways to go.
    But if you want to use several classes (maybe spanning several
    modules), you can code the check a lot more easily:

    if issubclass(cls, EvtBase) and hasattr(cls, 'eventNum'): ...

    I'd register each class itself, so no need to iterate over the
    members, but anyway you could use vars(module).
    In short, inspect may be good for debugging or documenting tools, but
    hardly needed for usual code.
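
    As a rough sketch of that simpler check combined with vars(module):
    the class names, the helper collect_events, and the in-memory stand-in
    for the events module are all hypothetical. The isinstance test is
    there because issubclass raises TypeError when its first argument is
    not a class, and a module namespace also holds strings and other
    objects.

```python
import types


class EvtBase:
    def __init__(self, stream):
        self.stream = stream


class OpenEvent(EvtBase):   # hypothetical event class
    eventNum = 1


class CloseEvent(EvtBase):  # hypothetical event class
    eventNum = 2


class Helper:               # not an event subclass; must be ignored
    eventNum = 99


# Stand-in for a real module; vars() works the same on an imported module.
events = types.ModuleType("events")
for _cls in (EvtBase, OpenEvent, CloseEvent, Helper):
    setattr(events, _cls.__name__, _cls)


def collect_events(module, base=EvtBase):
    """Return {eventNum: class} for subclasses of base found in module."""
    found = {}
    for obj in vars(module).values():
        if (isinstance(obj, type) and issubclass(obj, base)
                and hasattr(obj, "eventNum")):
            found[obj.eventNum] = obj
    return found


event_map = collect_events(events)
print(event_map)
```

    EvtBase itself is skipped because it declares no eventNum, so no
    explicit comparison against the base class is needed.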

    BTW, that file format is horrible. Users have to know *every* possible
    event (at least its size), even if they're not interested in them. And
    if you get out of sync, you can't recover. Add a new event, and all
    programs using the file don't work anymore. Ugh!


    --
    Gabriel Genellina
    Softlab SRL

     
    Gabriel Genellina, Dec 4, 2006
    #10
