Importing a class without knowing the module

Discussion in 'Python' started by Franck PEREZ, Nov 17, 2005.

  1. Franck PEREZ

    Franck PEREZ Guest

    Hello,

    I'm developing a small XML marshaller and I'm facing an annoying
    issue. Here's some sample code:

    ########### My test application ############
    class Foo(object):
        # The class I'd like to serialize
        pass

    import myMarshaller
    foo = Foo()
    # works fine, spits out something like <object class="Foo"...>
    s = myMarshaller.dumps(foo)
    another_foo = myMarshaller.loads(s)  # fails, see below

    ########### My marshaller (in its own module) ############
    def loads(s):
        # First, get the class name from s (here "Foo")
        # Fails because "Foo" is not in the marshaller's namespace!
        klass = eval(className)

    How can I tell the marshaller to locate the Foo class, and any other
    class I try to deserialize? I've tried passing my test application's
    globals() to the marshaller; it works, but it's dirty IMHO... I've
    also tried to locate the class (here "Foo") somewhere in sys.modules
    from the "loads" function, but that was heavy and unsuccessful.

    Thanks a lot for your help!
    Franck
     
    Franck PEREZ, Nov 17, 2005
    #1

  2. Mike Meyer

    Mike Meyer Guest

    How about adding Foo.__file__ to the serialized data?

    <mike
     
    Mike Meyer, Nov 18, 2005
    #2

  3. Franck PEREZ

    Franck PEREZ Guest

    I thought about it, but it would make the XML file depend on the
    machine... no more portability...
     
    Franck PEREZ, Nov 18, 2005
    #3
  4. Your top-posting makes this discourse weird (why put your comments
    BEFORE the text you're commenting on?!), but anyway I think that using
    the class's __module__ rather than __file__ should be what you want.
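
    A minimal sketch of this suggestion, assuming the marshaller stores the
    pair (__module__, __name__) at dump time and re-imports the module at
    load time. The helper names class_reference and locate_class are
    hypothetical, and importlib postdates this thread; it stands in here
    for __import__.

```python
import importlib
from fractions import Fraction


def class_reference(obj):
    """Return the (module name, class name) pair to store in the XML."""
    cls = type(obj)
    return cls.__module__, cls.__name__


def locate_class(module_name, class_name):
    """Import the named module and look the class up in it."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Round trip, with a standard-library class standing in for Foo.
ref = class_reference(Fraction(1, 2))
cls = locate_class(*ref)
```

    The stored pair is independent of where the module's file lives, so
    the XML stays portable as long as the module is importable under the
    same name on the loading side.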


    Alex
     
    Alex Martelli, Nov 18, 2005
    #4
  5. Mike Meyer

    Mike Meyer Guest

    [Format recovered from top posting.]

    They already depend on the machine. You can't take them to an arbitrary
    machine and reconstruct them: it has to have the classes the XML file
    depends on somewhere on it. You can use the module name if you have it
    available. If not, deriving the module name from the file name is
    about the best you can do.

    <mike
     
    Mike Meyer, Nov 18, 2005
    #5
  6. I disagree with the last sentence. From a filepath of
    '/zip/zap/zop/zup.py', it's VERY hard to say whether the module name is
    to be zup, zop.zup, or zap.zop.zup -- it depends on which directories
    are on sys.path and which have __init__.py, which is impossible to tell
    at unmarshaling time. If you use Foo.__module__, you should get the
    *module* name correctly, independently from sys.path or __init__.py's,
    or .pth files for that matter -- so, I do NOT agree that using __file__
    is "about the best you can do" for this use case.
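
    To illustrate the ambiguity, here is a small sketch that lists the
    module names one file path could correspond to, depending on which
    directory happens to be on sys.path. The function
    candidate_module_names and the directory list are hypothetical, and
    the sketch ignores __init__.py and .pth files, which complicate
    matters further.

```python
from pathlib import PurePosixPath


def candidate_module_names(file_path, search_roots):
    """Return one dotted name per root that contains file_path."""
    stem = PurePosixPath(file_path).with_suffix("")
    names = []
    for root in search_roots:
        try:
            relative = stem.relative_to(PurePosixPath(root))
        except ValueError:
            continue  # file_path is not under this root
        names.append(".".join(relative.parts))
    return names


names = candidate_module_names(
    "/zip/zap/zop/zup.py",
    ["/zip/zap/zop", "/zip/zap", "/zip"],
)
```

    Each root yields a different name for the same file, and nothing in
    the path itself says which one the original program used.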


    Alex
     
    Alex Martelli, Nov 18, 2005
    #6
  7. Mike Meyer

    Mike Meyer Guest

    You should read the next-to-last sentence, which says to use the
    module name if you have it. The last sentence starts "If not" -
    meaning you don't have the module name. *That's* the case for which
    the file name is about the best you can do.

    <mike
     
    Mike Meyer, Nov 18, 2005
    #7
  8. I see! Thanks for clarifying. Could you please give me an example of a
    Foo class which has a Foo.__file__ attribute but not a Foo.__module__
    attribute? Sorry, must be some moment of weakness on my mind's part
    (quite possible, since I am recovering from recent surgery), but I
    cannot think of a situation where that would be the case (while classes
    with __module__ and w/o __file__ are the normal situation). Were there
    no case in which, given a class, you can learn about its file (by a
    __file__ attribute) but not about its module (by a __module__
    attribute), I would of course hope that my inability to parse that
    sentence of yours, which would under such hypothetical circumstances be
    an absurd hypothesis, might be more forgivable.


    Alex
     
    Alex Martelli, Nov 18, 2005
    #8
  9. Mike Meyer

    Mike Meyer Guest

    A class's __module__ attribute doesn't always tell you the name of the
    module - or at least, not a name that would be useful for the OP's
    use case. That's the case where you don't have the module name. The
    reference to a class's __file__ attribute was meant to be to the
    module's __file__ attribute - I'm surprised that no one picked up on
    that. Again, assuming that the module has a __file__ attribute at
    all. Getting the __file__ attribute of a module you don't know the
    name of is a bit tricky, but possible.

    <mike
     
    Mike Meyer, Nov 18, 2005
    #9
  10. Franck PEREZ

    Franck PEREZ Guest

    Thanks for your answers. And btw, sorry for top-posting, I'll be more
    careful next time.
     
    Franck PEREZ, Nov 18, 2005
    #10
  11. How do you arrange a module so that its classes' __module__ attributes
    don't tell you the name of the module "that would be useful", yet the
    module's __file__ DOES give you information that you can usefully
    process, heuristically I assume, to infer a module name that IS useful?

    I just don't see it, which of course doesn't mean it can't be done, with
    sufficient evil ingenuity. But since you're the one arguing that this
    case is important enough to be worth dwelling on, I'd rather see some
    specific examples from you, to understand whether my gut reaction "if
    possible at all, this has gotta be such a totally CORNER, ARTIFICIAL
    case, that it's absurd to bend over backwards to cover it" is justified,
    or whether I'm misjudging instead.

    If you have the module object, getting its __file__ isn't hard -- but
    then, neither is getting its module name... m.__file__ and m.__name__
    are just about as accessible as each other. Of the two, the one more
    likely to be useless would be the __file__ -- import hooks (zipimport or
    more deviously sophisticated ones) might mean that string is not a file
    path at all; __name__ is supposed to be the string key of that module
    object within sys.modules, and it is, I think, far less likely that any
    tricks will have been played wrt that -- plus, you can easily
    doublecheck by scouring sys.modules.
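
    A short sketch of that double-check, assuming the class's module has
    already been imported. The helper name module_of is hypothetical.

```python
import json
import sys


def module_of(cls):
    """Return the module that sys.modules holds under cls.__module__."""
    name = cls.__module__
    module = sys.modules.get(name)
    # Confirm that the key in sys.modules and the module's own __name__ agree.
    if module is None or module.__name__ != name:
        raise LookupError("cannot confirm module %r for %r" % (name, cls))
    return module


m = module_of(json.JSONDecoder)
module_name = m.__name__
# __file__ may be missing, or may not be a real file path at all.
module_file = getattr(m, "__file__", None)
```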

    I just cannot see ANY case where I would want to try heuristics on some
    __file__ attribute (hopefully but not necessarily a filename) to try and
    recover a __name__ that appears to be missing, mangled, or incorrect; my
    instinct would be to raise exceptions informing the user that what
    they're trying to marshal is too weird and strange for their own good,
    and let them deal with the situation. But as I said, that may depend on
    a failure of the imagination -- if you can show me compelling use cases
    in which heuristics on __file__ prove perfectly satisfactory where just
    dealing with __name__ wouldn't, I'm quite ready to learn!


    Alex
     
    Alex Martelli, Nov 18, 2005
    #11
  12. Mike Meyer

    Mike Meyer Guest

    So what module name do you import if C.__module__ is __main__?

    I'm not dwelling on it, you are. All I did was recommend using the
    module name if you had one, and if not then the file name was your
    best bet. You chose to ignore part of my statement to focus on the
    part you disagreed with, because you thought what I had suggested in
    the first place was a better idea. I pointed out this oversight on
    your part, and you've been dwelling on it ever since.

    Frankly, I thought you were brighter than that, have no idea why
    you're acting like this, and frankly don't think you need a tutor. I
    figured this out by fooling around with the interpreter before posting
    in the first place, you certainly know enough to do the same.

    <mike
     
    Mike Meyer, Nov 18, 2005
    #12
  13. In this case, I would strongly suggest that raising an exception upon
    the attempt to marshal is better than trying to guess that the file
    holding the main script could later be innocuously imported from a
    different script, or, alternatively, that the same main script would be
    running when a future unmarshaling attempt is done. "In the face of
    ambiguity, refuse the temptation to guess" is a sound principle.
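
    A minimal sketch of refusing at marshaling time rather than guessing.
    The function name reference_or_refuse and the choice of ValueError
    are assumptions made for illustration.

```python
def reference_or_refuse(cls):
    """Return (module name, class name), refusing classes from __main__."""
    if cls.__module__ == "__main__":
        raise ValueError(
            "refusing to marshal %s: it is defined in __main__, which may "
            "be a different script when the data is loaded" % cls.__name__
        )
    return cls.__module__, cls.__name__
```

    The error surfaces when the data is written, while the user can still
    move the class into an importable module, instead of at some later
    unmarshaling attempt.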

    I don't see the history of this thread the same way as you do,
    apparently. I first posted to this thread in a post identified as:
    """
    Date: Thu, 17 Nov 2005 17:28:56 -0800
    Message-ID: <1h66lub.295q3b19n84anN%>
    """
    and later replied to a post of yours identified as:
    """
    Date: Thu, 17 Nov 2005 23:27:45 -0500
    Message-ID: <>
    """
    i.e., your post, to which I was replying, was about 3 hours later than
    my original one. My original post said:

        "I think that using the class's __module__ rather than __file__
        should be what you want"

    so it seems incorrect to say that you had "suggested in the first place"
    the idea of using __module__; on the contrary, in a previous post of
    yours, the first one of yours on this thread, specifically:
    """
    Date: Thu, 17 Nov 2005 19:10:39 -0500
    Message-ID: <>
    """
    your ENTIRE suggestion -- what you "suggested in the first place" -- was
    the entirely wrong:

        "How about adding Foo.__file__ to the serialized data?"

    So, it seems to me that you're the one dwelling on the issue, because
    you made the original wrong suggestion, and after I corrected it,
    repeated and are now defending the gist of it (though not the obviously
    wrong idea of using a non-existent __file__ attribute of the Foo class).

    "Before posting in the first place" (your post at 19:10 EST) you clearly
    had not tested your suggestion (nor had you claimed to, until now). In
    any case, if your specific idea is that mucking around with the __file__
    attribute of the __main__ module is a reasonable way to solve the
    conundrum of marshaling instances of classes defined in __main__ (which,
    as per the first paragraph of this post, I dispute) it would have been
    vastly preferable to say so in the first place, rather than expressing
    this particular case as just an "If not" (wrt "You can use the module
    name if you have it available") -- particularly since you DO have the
    module name available for classes defined in __main__, it's just
    suspicious and risky to use it in marshaling. Incidentally, pickle DOES
    just use the module name in marshaling even in this case -- the opposite
    tack to what you suggest, and, in its own special way, almost as wrong,
    IMHO (just guessing the other way, in the presence of obvious
    ambiguity). Still, pickle does take ONE precaution worth noticing: it
    checks that the class it's marshaling is, in fact, the same object
    that's available under that class's name from the module it belongs to
    -- otherwise (e.g., for classes dynamically defined inside a function),
    it raises a PicklingError.
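
    The precaution can be sketched independently of pickle. This
    illustrates the idea and is not pickle's actual code; the name
    check_importable and the use of ValueError are assumptions.

```python
import sys
from fractions import Fraction


def check_importable(cls):
    """Raise ValueError unless cls is the object found at module.<name>."""
    module = sys.modules.get(cls.__module__)
    found = getattr(module, cls.__name__, None)
    if found is not cls:
        raise ValueError(
            "%s is not reachable as %s.%s"
            % (cls.__name__, cls.__module__, cls.__name__)
        )
    return cls.__module__, cls.__name__


def make_local_class():
    # A class defined inside a function is not a module-level name.
    class Local(object):
        pass
    return Local


ok = check_importable(Fraction)
```

    Calling check_importable(make_local_class()) raises ValueError,
    because no module attribute named Local refers to that class.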

    As a general point, refusing to marshal data that strongly smells like
    it will be hard or unlikely to unmarshal later is commendable prudence,
    and I would recommend this general, prudent approach strongly.


    Alex
     
    Alex Martelli, Nov 18, 2005
    #13
  14. Mike Meyer

    Mike Meyer Guest

    Once you move the data to another system, pretty much anything you do
    to try and figure out how to load the code that defines a class is
    ambiguous. After all, it may not be present at all. The only way to
    avoid ambiguity is to serialize the code.

    Given that this is netnews, that's not at *all* surprising.

    Ok, so you suggested it first. Congratulations. I didn't see them in
    that order. I didn't reply to that post of yours. I replied to the OP
    pointing out the problem with __file__ being system-dependent. He
    never suggested using module.

    I never did anything to defend it. As soon as the OP pointed out that
    he didn't like __file__ because of the loss of system independence,
    I suggested using the module, and mentioned that he might not have a
    better choice than to use __file__ anyway. You overlooked the mention
    of __module__ to complain that __file__ wasn't his best choice,
    because he could use __module__. All I did was point out what you
    overlooked, and have been answering your questions about it ever
    since. If you don't think __file__ is such a good idea, you should
    have said so in the first place. Instead, you asked questions about
    it. Trying to help you out, I answered them. If trying to help someone
    who seems to be having trouble understanding an issue is "dwelling on
    the issue", well, I'm glad we have a newsgroup full of people who
    dwell on issues.

    <mike
     
    Mike Meyer, Nov 18, 2005
    #14
  15. Sorry to have given the impression that I overlooked something -- I
    didn't KNOW whether I had, which is why I checked back with you about
    it, but apparently I hadn't.

    And I did! My exact words in my first post to this thread were "I think
    that using the class's __module__ rather than __file__ should be what
    you want" -- a pretty clear indication that I don't think using __file__
    is a hot idea, don't you think?

    About what would be the cases where the module's name was not available,
    as you had claimed it could be; turns out that you haven't given a
    single example yet, just one (__main__) where the module name is
    perfectly available, though it may not be *useful* (and there's no
    special reason to guess that the *file*'s name would be any use either).

    You did help me to better understand some of the roots of your many
    mistaken assertions in this thread, from your first "How about adding
    Foo.__file__" proposal onwards, yes -- thank you. However, I am now
    reasonably convinced that your attempt to *defend*, at some level, those
    assertions, appears to have no sound technical basis, just a
    defensiveness that (in my mind) justifies the use of the word
    "dwelling". But we do fully agree that trying to be helpful, per se, is
    never inherently blameworthy, whether one succeeds or fails in the
    attempt -- so, thanks for your many (and mostly better based than this
    one) attempts to be of help to c.l.py posters.


    Alex
     
    Alex Martelli, Nov 18, 2005
    #15
  16. Mike Meyer

    Mike Meyer Guest

    I claim I haven't made a single mistaken assertion in this
    thread. Referencing class.__file__ instead of module.__file__ was a
    mistake, but not an assertion. In fact, when I did that, I asked the
    OP what was wrong with it.

    You have read into some of the things I've written assertions which
    simply weren't there. For instance, I said "you may not have the
    module name", which led you to ask for an example of a class without
    an __module__ attribute. I never asserted that you could get a class
    without a __module__ attribute, merely that you could get classes
    where the module name wasn't useful. If you had spent your time
    reading the text a bit more carefully, you might have saved us both
    some time.

    <mike
     
    Mike Meyer, Nov 19, 2005
    #16
