Re: Does hashlib support a file mode?

Discussion in 'Python' started by Carl Banks, Jul 6, 2011.

  1. Carl Banks

    Carl Banks Guest

    On Wednesday, July 6, 2011 12:07:56 PM UTC-7, Phlip wrote:
    > If I call m = md5() twice, I expect two objects.
    >
    > I am now aware that Python bends the definition of "call" based on
    > where the line occurs. Principle of least surprise.


    Phlip:

    We already know about this violation of the least surprise principle; most of us acknowledge it as a small blip in an otherwise straightforward and clean language. (Incidentally, fixing it would create different surprises, but probably much less common ones.)

    We've helped you with your problem, but you risk alienating those who helped you when you badmouth the whole language on account of this one thing, and you might not get such prompt help next time. So try to be nice.

    You are wrong about Python bending the definition of "call", though. Surprising though it be, the Python language is very explicit that the default arguments are executed only once, when creating the function, *not* when calling it.
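
    To make the point concrete, here is a minimal sketch of the behaviour described above. The function names digest_shared and digest_fresh are made up for illustration and do not come from the earlier thread; only hashlib.md5 is a real API.

```python
import hashlib


def digest_shared(data, m=hashlib.md5()):
    # The default hashlib.md5() object is created once, when the def
    # statement runs, and is reused by every call that omits m.
    m.update(data)
    return m.hexdigest()


def digest_fresh(data, m=None):
    # A None default defers creation to call time: a new object per call.
    if m is None:
        m = hashlib.md5()
    m.update(data)
    return m.hexdigest()


first = digest_shared(b"abc")
second = digest_shared(b"abc")  # state carries over: this hashes b"abcabc"
print(first == second)          # False

print(digest_fresh(b"abc") == digest_fresh(b"abc"))  # True
```

    The second call to digest_shared keeps feeding the same md5 object, which is why the two digests differ even though the arguments are identical.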


    Carl Banks
    Carl Banks, Jul 6, 2011
    #1

  2. Carl Banks

    Phlip Guest

    On Jul 6, 1:25 pm, Carl Banks <> wrote:

    > We already know about this violation of the least surprise principle; most of us acknowledge it as a small blip in an otherwise straightforward and clean language.


    Here's the production code we're going with - thanks again all:


    def file_to_hash(path, hash_type=hashlib.md5):
        """
        Per: http://groups.google.com/group/comp.lang.python/browse_thread/thread/ea1c46f77ac1738c
        """

        hash = hash_type()

        with open(path, 'rb') as f:

            while True:
                s = f.read(8192)  # CONSIDER: io.DEFAULT_BUFFER_SIZE
                if not s: break
                hash.update(s)

        return hash.hexdigest()

    Note the fix also avoids comparing to None, which, as usual, is also
    icky and less typesafe!

    (And don't get me started about the extra lines needed to avoid THIS
    atrocity!

        while s = f.read(8192):
            hash.update(s)

    ;)
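
    For what it's worth, the loop can be shortened without assignment-in-condition syntax: the two-argument form of the built-in iter() calls a function repeatedly until it returns a sentinel value. A minimal sketch, reusing the file_to_hash name from above; the chunk_size parameter is an addition for illustration only.

```python
import hashlib


def file_to_hash(path, hash_type=hashlib.md5, chunk_size=8192):
    """Return the hex digest of the file at path, read in chunks."""
    h = hash_type()
    with open(path, 'rb') as f:
        # iter(callable, sentinel) keeps calling f.read(chunk_size) and
        # stops as soon as it returns b'' (end of file in binary mode).
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()
```

    Python 3.8 and later also accept an assignment expression here, "while s := f.read(8192):", which is close to the form wished for above.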
    Phlip, Jul 6, 2011
    #2

  3. Phlip wrote:

    > Note the fix also avoids comparing to None, which, as usual, is also
    > icky and less typesafe!


    "Typesafe"? Are you trying to make a joke?



    --
    Steven
    Steven D'Aprano, Jul 7, 2011
    #3
  4. Carl Banks

    Andrew Berg Guest

    On 2011.07.06 06:16 PM, Steven D'Aprano wrote:
    > Phlip wrote:
    >
    > > Note the fix also avoids comparing to None, which, as usual, is also
    > > icky and less typesafe!

    >
    > "Typesafe"? Are you trying to make a joke?

    Maybe he has a duck phobia. Maybe he denies the existence of ducks.
    Maybe he doesn't like the sound of ducks. Maybe he just weighs the same
    as a duck. In any case, duck tolerance is necessary to use Python
    effectively.


    On a side note, it turns out there's no word for the fear of ducks. The
    closest phobia is anatidaephobia, which is the fear of being /watched/
    by a duck.
    Andrew Berg, Jul 7, 2011
    #4
  5. Carl Banks

    Phlip Guest

    > On 2011.07.06 06:16 PM, Steven D'Aprano wrote:
    > > Phlip wrote:
    > >
    > > > Note the fix also avoids comparing to None, which, as usual, is also
    > > > icky and less typesafe!
    > >
    > > "Typesafe"? Are you trying to make a joke?


    No, I was pointing out that passing a type is more ... typesafe.
    Phlip, Jul 7, 2011
    #5
  6. Carl Banks

    Andrew Berg Guest

    On 2011.07.07 08:11 AM, Phlip wrote:
    > No, I was pointing out that passing a type is more ... typesafe.

    None is a type.

    >>> None.__class__
    <class 'NoneType'>
    Andrew Berg, Jul 7, 2011
    #6
  7. Carl Banks

    Phlip Guest

    On Jul 7, 6:24 am, Andrew Berg <> wrote:
    > On 2011.07.07 08:11 AM, Phlip wrote:
    > > No, I was pointing out that passing a type is more ... typesafe.
    >
    > None is a type.


    I never said it wasn't.
    Phlip, Jul 7, 2011
    #7
  8. Carl Banks

    Andrew Berg Guest

    On 2011.07.07 08:39 AM, Phlip wrote:
    > On Jul 7, 6:24 am, Andrew Berg <> wrote:
    > > On 2011.07.07 08:11 AM, Phlip wrote:
    > > > No, I was pointing out that passing a type is more ... typesafe.
    > >
    > > None is a type.

    >
    > I never said it wasn't.

    You are talking about this code, right?

    def file_to_hash(path, m=None):
        if m is None:
            m = hashlib.md5()

    What's not a type? The is operator compares types (m's value isn't the
    only thing compared here; even a separate instance of the exact same
    type would make it return False), and m can't be undefined.
    Andrew Berg, Jul 7, 2011
    #8
  9. Andrew Berg wrote:

    > On 2011.07.07 08:39 AM, Phlip wrote:
    >> On Jul 7, 6:24 am, Andrew Berg <> wrote:
    >> > On 2011.07.07 08:11 AM, Phlip wrote:
    >> > > No, I was pointing out that passing a type is more ... typesafe.
    >> >
    >> > None is a type.

    >>
    >> I never said it wasn't.


    Unfortunately, it isn't.

    None is not a type, it is an instance.

    >>> isinstance(None, type) # is None a type?
    False
    >>> isinstance(None, type(None)) # is None an instance of None's type?
    True

    So None is not itself a type, although it *has* a type:

    >>> type(None)
    <type 'NoneType'>
    >>> isinstance(type(None), type) # is NoneType itself a type?
    True


    > You are talking about this code, right?
    >
    > def file_to_hash(path, m=None):
    >     if m is None:
    >         m = hashlib.md5()
    >
    > What's not a type? The is operator compares types (m's value isn't the
    > only thing compared here; even a separate instance of the exact same
    > type would make it return False), and m can't be undefined.


    The is operator does not compare types, it compares instances for identity.
    There is no need for is to ever care about the type of the arguments --
    that's just a waste of time, since a fast identity (memory location) test
    is sufficient.

    This is why I initially thought that Phlip was joking when he suggested
    that "m is None" could be type-unsafe. It doesn't matter what type m
    has, "m is <anything>" will always be perfectly safe.
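
    A short sketch of the difference between identity and equality, using throwaway names (a, b, c, results) chosen purely for illustration:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate object
c = a           # a second name bound to the same object

print(a == b)   # True: the values are equal
print(a is b)   # False: two distinct objects
print(a is c)   # True: one object, two names

# "is" never raises, whatever the types of its operands are.
results = [m is None for m in (None, 0, "", [], object())]
print(results)  # [True, False, False, False, False]
```

    Only the None object itself passes the "is None" test; falsy values such as 0 or an empty string do not.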




    --
    Steven
    Steven D'Aprano, Jul 8, 2011
    #9
  10. Carl Banks

    Andrew Berg Guest

    On 2011.07.07 08:46 PM, Steven D'Aprano wrote:
    > None is not a type, it is an instance.
    >
    > >>> isinstance(None, type) # is None a type?
    > False
    > >>> isinstance(None, type(None)) # is None an instance of None's type?
    > True
    >
    > So None is not itself a type, although it *has* a type:
    >
    > >>> type(None)
    > <type 'NoneType'>
    > >>> isinstance(type(None), type) # is NoneType itself a type?
    > True

    I worded that poorly. None is (AFAIK) the only instance of NoneType, but
    I should've clarified the difference.
    > The is operator does not compare types, it compares instances for identity.
    > There is no need for is to ever care about the type of the arguments --
    > that's just a waste of time, since a fast identity (memory location) test
    > is sufficient.

    "Compare" was the wrong word. I figured the interpreter doesn't
    explicitly compare types, but obviously identical instances are going to
    be of the same type.
    Andrew Berg, Jul 8, 2011
    #10
  11. Carl Banks

    Phlip Guest

    > I worded that poorly. None is (AFAIK) the only instance of NoneType, but
    > I should've clarified the difference.
    >
    > > The is operator does not compare types, it compares instances for identity.


    None is typesafe, because it's strongly typed.

    However, what's even MORE X-safe (for various values of X) is a method
    that takes LESS for its arguments. That's why I switched from passing
    an object to passing a type, because the more restrictive argument
    type is more typesafe.

    However, the MOST X-safe version so far simply passes a string, and
    uses hashlib the way it was designed to be used:

    def file_to_hash(path, hash_type):

        hash = hashlib.new(hash_type)

        with open(path, 'rb') as f:

            while True:
                s = f.read(8192)
                if not s: break
                hash.update(s)

        return hash.hexdigest()
    Phlip, Jul 8, 2011
    #11
  12. Phlip wrote:

    >> I worded that poorly. None is (AFAIK) the only instance of NoneType, but
    >> I should've clarified the difference.
    >>
    >> > The is operator does not compare types, it compares instances for identity.

    >
    > None is typesafe, because it's strongly typed.


    Everything in Python is strongly typed. Why single out None?

    Python has strongly-typed objects, dynamically typed variables, and a
    philosophy of preferring duck-typing over explicit type checks when
    possible.
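
    A minimal sketch of those three properties; the Duck and Robot classes are hypothetical and exist only for this example.

```python
x = 1        # the name x is bound to an int
x = "one"    # rebinding it to a str is fine: names carry no type

try:
    1 + "1"  # objects do carry a type; int and str are not mixed silently
except TypeError as exc:
    print("TypeError:", exc)


class Duck:
    def quack(self):
        return "Quack"


class Robot:
    def quack(self):
        return "Beep"


# Duck typing: anything with a quack() method is accepted, no type check.
sounds = [thing.quack() for thing in (Duck(), Robot())]
print(sounds)  # ['Quack', 'Beep']
```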


    > However, what's even MORE X-safe (for various values of X) is a method
    > that takes LESS for its arguments. That's why I switched from passing
    > an object to passing a type, because the more restrictive argument
    > type is more typesafe.


    It seems to me that you are defeating duck-typing, and needlessly
    restricting what the user can pass, for dubious or no benefit. I still
    don't understand what problems you think you are avoiding with this tactic.



    > However, the MOST X-safe version so far simply passes a string, and
    > uses hashlib the way it was designed to be used:
    >
    > def file_to_hash(path, hash_type):
    >     hash = hashlib.new(hash_type)
    >     with open(path, 'rb') as f:
    >         while True:
    >             s = f.read(8192)
    >             if not s: break
    >             hash.update(s)
    >     return hash.hexdigest()



    There is no advantage to this that I can see. It limits the caller to using
    only hashes in hashlib. If the caller wants to provide her own hashing
    algorithm, your function will not support it.

    A more reasonable polymorphic version might be:

    def file_to_hash(path, hash='md5', blocksize=8192):
        # FIXME is md5 a sensible default hash?
        if isinstance(hash, str):
            # Allow the user to specify the hash by name.
            hash = hashlib.new(hash)
        else:
            # Otherwise hash must be an object that implements the
            # hashlib interface, i.e. a callable that returns an
            # object with appropriate update and hexdigest methods.
            hash = hash()
        with open(path, 'rb') as f:
            while True:
                s = f.read(blocksize)
                if not s: break
                hash.update(s)
        return hash.hexdigest()
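
    As a rough usage sketch, the three kinds of argument might be exercised like this. The function is restated so the example runs on its own, and the Crc32 class is a hypothetical caller-supplied checksum built on zlib.crc32; it is not part of hashlib.

```python
import hashlib
import os
import tempfile
import zlib


def file_to_hash(path, hash='md5', blocksize=8192):
    # Same logic as the polymorphic version above.
    if isinstance(hash, str):
        hash = hashlib.new(hash)
    else:
        hash = hash()
    with open(path, 'rb') as f:
        while True:
            s = f.read(blocksize)
            if not s:
                break
            hash.update(s)
    return hash.hexdigest()


class Crc32:
    """Hypothetical checksum exposing update() and hexdigest()."""

    def __init__(self):
        self._crc = 0

    def update(self, data):
        # zlib.crc32 accepts a running value as its second argument.
        self._crc = zlib.crc32(data, self._crc)

    def hexdigest(self):
        return '%08x' % (self._crc & 0xffffffff)


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello world')
try:
    by_name = file_to_hash(tmp.name, 'sha1')            # hash given by name
    by_callable = file_to_hash(tmp.name, hashlib.sha1)  # hashlib constructor
    by_custom = file_to_hash(tmp.name, Crc32)           # caller's own class
finally:
    os.unlink(tmp.name)

print(by_name == by_callable)  # True
print(by_custom)
```

    The string-only version quoted earlier could not accept Crc32, which is the limitation being pointed out.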





    --
    Steven
    Steven D'Aprano, Jul 8, 2011
    #12
  13. Carl Banks

    Phlip Guest

    On Jul 8, 12:42 am, Steven D'Aprano <> wrote:
    > Phlip wrote:
    > >> I worded that poorly. None is (AFAIK) the only instance of NoneType, but
    > >> I should've clarified the difference.
    > >>
    > >> > The is operator does not compare types, it compares instances for identity.

    >
    > > None is typesafe, because it's strongly typed.

    >
    > Everything in Python is strongly typed. Why single out None?


    You do understand these cheap shots are bad for conversations, right?

    I didn't single out None. When did you stop raping your mother?
    Phlip, Jul 8, 2011
    #13
  14. Phlip wrote:

    > On Jul 8, 12:42 am, Steven D'Aprano <> wrote:
    >> Phlip wrote:
    >> >> I worded that poorly. None is (AFAIK) the only instance of NoneType,
    >> >> but I should've clarified the difference.
    >> >>
    >> >> > The is operator does not compare types, it compares instances
    >> >> > for identity.

    >>
    >> > None is typesafe, because it's strongly typed.

    >>
    >> Everything in Python is strongly typed. Why single out None?

    >
    > You do understand these cheap shots are bad for conversations, right?
    >
    > I didn't single out None.


    Phlip, I'm not an idiot, please don't pee on my leg and tell me it's
    raining. In the very sentence you quote above, you clearly and obviously
    single out None:

    "None is typesafe, because it's strongly typed."

    Yes, None is strongly typed -- like everything else in Python. I don't
    understand what point you are trying to make. Earlier you claimed that
    identity testing for None is type-unsafe (or at least *less* type-safe,
    whatever that means):

    "Note the fix also avoids comparing to None, which, as usual, is also icky
    and less typesafe!"

    then you say None is type-safe -- if there is a coherent message in your
    posts, it is too cryptic for me.


    > When did you stop raping your mother?


    What makes you think I've stopped?



    --
    Steven
    Steven D'Aprano, Jul 9, 2011
    #14
