Is it considered Harmful?

Discussion in 'Ruby' started by rolo, Jun 25, 2004.

  1. My point was that DL was already a part of the Ruby standard library,
    and as such, had opened the door to implement exactly what you're
    asking for without changes to the core Ruby interpreter. The cost of
    that flexibility, however, is that you must trust the programmer not
    to go and cause a segfault.

    I think that it's fine to have that access available for clever hacks,
    but only if you trust the author of that hack to really, really know
    what they're doing. It also could tie your code to a certain version
    of Ruby, as the internal data structures the interpreter uses could
    easily change in representation between versions.

    Making #class= or #become a part of the core library seems unnecessary,
    exactly because it is potentially so dangerous. For those rare cases
    where people need that functionality, DL and evil.rb are available;
    for us mere mortals, the current (already wide-open) object semantics
    seem appropriate.
     
    Lennon Day-Reynolds, Jun 25, 2004
    #41

  2. rolo

    Sean O'Dell Guest

    I have to admit when I'm wrong.

    At least doing it brute-force, changing an object's class DOES crash Ruby.
    But I've only done a rudimentary test where I simply replaced the klass VALUE
    (in C) with something else. I can't understand why Ruby would crash, except
    that internally there must be a whole lot of code that doesn't do any kind of
    checking of internal values.

    I think I can see why Guy says this can't be done. I haven't seen much of the
    internal Ruby implementations, but my initial guess is that Ruby makes an
    assumption about the presence, and state, of internal data kept with each
    object/class.

    If that's the case, it would probably be a mountain of work to go back
    through it all and place checks on that internal data.

    Sean O'Dell
     
    Sean O'Dell, Jun 25, 2004
    #42

  3. rolo

    Sean O'Dell Guest

    I know what Ruby does, I just ran my own test.

    Sean O'Dell
     
    Sean O'Dell, Jun 25, 2004
    #43
  4. [snip]
    I agree with you about it being sloppy programming, but then again,
    part of the reason that I write code in Ruby (and Python, Java, C#,
    Perl, etc.) is so that I can be a little sloppy, without getting
    segfaults or buffer overruns. Programming in C seems to consist mostly
    of wrapping unsafe calls in error checks and handlers; I would hate to
    see Ruby become similarly burdened due to low-level hacks like #class=
    or #become entering the language core.

    Lennon
     
    Lennon Day-Reynolds, Jun 25, 2004
    #44
  5. evil.rb's Object#become is restricted in several ways, to try to prevent
    crashes. Some of the checks include:

    * in general, refuse operations that would break implicit assumptions
    in Ruby (memory layout & internal structs):
    * refuse to swap "core classes"
    * refuse to swap modules with non-modules (cause they might be in a
    klass chain)
    * refuse to swap objects whose classes use incompatible "internal types"
    * refuse to operate on immediate values (Fixnum, symbols, true, false, nil...)
    * prevent cycles in the klass chains
    * check EXIVARs
    ...

    Additionally, the method cache is invalidated after two objects are
    swapped.
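
    As a rough illustration of the kind of guard listed above, here is a
    minimal pure-Ruby sketch. It is not evil.rb's code: the names
    CORE_CLASSES, immediate? and swap_refusal are hypothetical, the list of
    core classes is an assumption, and only three of the checks are modelled.

```ruby
# Hypothetical sketch only; evil.rb's real checks work on interpreter internals.
# Assumed sample of "core classes" that must never be swapped.
CORE_CLASSES = [Object, Module, Class, String, Array, Hash].freeze

# Immediate values have no heap object behind them, so there is nothing to swap.
# (Large Integers are not immediates in practice; this sketch refuses them all.)
def immediate?(obj)
  case obj
  when nil, true, false, Integer, Symbol then true
  else false
  end
end

# Returns a reason string when the swap should be refused, nil when it may proceed.
def swap_refusal(a, b)
  return "immediate value" if immediate?(a) || immediate?(b)
  return "core class" if CORE_CLASSES.include?(a) || CORE_CLASSES.include?(b)
  return "module swapped with non-module" if a.is_a?(Module) != b.is_a?(Module)
  nil
end

p swap_refusal(42, "x")
p swap_refusal(Comparable, Object.new)
p swap_refusal(Object.new, Object.new)
```

    Returning a reason instead of raising keeps the sketch easy to probe; a
    real implementation would more likely raise a TypeError.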

    --
    Running Debian GNU/Linux Sid (unstable)
    batsman dot geo at yahoo dot com

    i dont even know if it makes sense at all :) This is an experimental patch
    for an experimental kernel :))
    -- Ingo Molnar on linux-kernel
     
    Mauricio Fernández, Jun 25, 2004
    #45
  6. rolo

    Sean O'Dell Guest

    My guess is here:

    #define GetOpenFile(obj,fp) rb_io_check_closed((fp) = RFILE(rb_io_taint_check(obj))->fptr)

    ...because io_read, the C implementation of the File.read method, passes the
    "self" VALUE to this macro, which assumes the object is an RFILE, and
    immediately uses the fptr member, without checking that the "self" object
    contains a valid fptr value. If you change the class of an object to
    something else, it still does this, even when fptr is not a valid member
    (which, when used, causes the crash).

    Now, if Ruby does this in a lot of places, then yeah, changing an object's
    class is not going to be possible.

    What I think would settle this simply, is if R_CAST was changed to actually
    perform a type check and raise an exception when an attempt is made to cast a
    VALUE to something it's not.

    Sean O'Dell
     
    Sean O'Dell, Jun 25, 2004
    #46
  7. rolo

    Sean O'Dell Guest

    I was actually speaking of Ruby C code, not so much Ruby script code. Ruby
    should be stable even in exceptional circumstances, although you might not
    get the results you're after. It still shouldn't crash. C code shouldn't
    use internal data without performing some checks, especially if the data
    makes the rounds to other libraries/extensions and such.

    Sean O'Dell
     
    Sean O'Dell, Jun 25, 2004
    #47
  8. The point of this whole thread, though, is that adding #become and
    #class= to the language would effectively make Ruby code as
    potentially unsafe as C. Look at the .NET CLR -- they have "unsafe"
    blocks, in which you can do raw pointer-based operations, but which
    mark the entire assembly (a compiled module) as unsafe, and therefore
    platform (and perhaps even OS version) specific.

    I agree 100% that "pure Ruby" code should be safe from hard crashes
    and as platform-independent as possible. It's pretty good in that
    arena now, and I'd like to see it continue to improve, not get worse.

    Question: does the current implementation for $SAFE block imports of
    native modules at some level?

    Lennon

    P.S.: Okay, I just RTFM, and think I found the answer to my question
    above. Just in case anyone's interested:

    Answer: According to Pickaxe, at $SAFE=2, no loads are allowed from
    tainted path strings, and at $SAFE=3, all objects are created tainted.
    That should mean that any code eval'd at $SAFE=3 or higher won't be
    able to import at all, but modules that were already loaded should
    still be available via 'require' (since 'load', not 'require' is the
    checked method), right?

     
    Lennon Day-Reynolds, Jun 25, 2004
    #48
  9. rolo

    Sean O'Dell Guest

    No, changing an object's class wouldn't cause Ruby code to be as prone to
    problems as C code. Not in theory, anyway. In theory, it would be as
    problematic as redefining methods, or including modules that conflict with
    existing object instance variables and methods.

    However, after looking a bit at the actual Ruby implementation, it's clear
    that theory and practice are two different worlds. Ruby uses internal data
    on assumption in a lot of places, and changing an object's class causes it to
    crash quite easily.

    But, I think simply placing checks in certain appropriate places would
    alleviate the problem. Sometime today I think I'll try putting type checks
    in the R_CAST macro and see how that works.

    Sean O'Dell
     
    Sean O'Dell, Jun 25, 2004
    #49
  10. There's another small, but annoying problem with the proxy-approach:
    There are operations which can't be forwarded. I can only think of the
    truth value of an Object right now.

    It's a problem in this case:

    irb(main):001:0> [obj = false, ref = WeakRef.new(obj)]
    => [false, false]
    # We should now be able to treat obj and ref as
    # if they were actually the same thing
    irb(main):002:0> [obj.to_s, ref.to_s]
    => ["false", "false"] # This is okay, methods get forwarded
    irb(main):003:0> [if not obj then "foo" end, if not ref then "foo" end]
    => ["foo", nil] # truth value decisions can't be forwarded
    irb(main):004:0> exit

    I'm not sure how other languages handle this. As far as I know, Perl has
    a way of overloading how objects are converted into Boolean values.

    Python also seems to have this via __nonzero__. (And __len__ == 0)

    I have no idea, however, if such a method (let's call it #to_bool) could
    be done in a way that doesn't hurt the performance of all Ruby
    scripts, but it looks more possible every day.
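
    The same effect can be reproduced without WeakRef. The following is a
    minimal sketch, assuming a hand-rolled proxy (the ForwardingProxy class
    is hypothetical, not part of any library): method calls reach the
    wrapped object, but the truth test looks only at the proxy itself.

```ruby
# Hypothetical minimal proxy: forwards every message it does not define itself.
class ForwardingProxy < BasicObject
  def initialize(target)
    @target = target
  end

  def method_missing(name, *args, &block)
    @target.__send__(name, *args, &block)
  end
end

obj = false
ref = ForwardingProxy.new(obj)

puts ref.to_s                    # forwarded to the wrapped false
puts(obj ? "truthy" : "falsy")   # the real false fails the truth test
puts(ref ? "truthy" : "falsy")   # the proxy is neither nil nor false
```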

    Regards,
    Florian Gross
     
    Florian Gross, Jun 25, 2004
    #50
  11. rolo

    Jim Weirich Guest

    Florian Gross said:
    Comparisons are problematic because the first line of a comparison is
    often "Are you the same class as me?". Proxies aren't the right class so
    the comparisons fail.

    Actually, rather than a become or class= method, I would be interested in
    a swap_identities method. For example, to make a proxy object real ...

    def make_real
      obj = read_real_object_from_the_database
      swap_identities(self, obj)
    end

    When identities are swapped, every reference in the system to the first
    object will magically become a reference to the second object, and
    vice-versa.

    There are no semantic problems with an object suddenly changing classes
    and finding inappropriate member variables. They just switch identities.
    This would be perfect for the proxy problem.

    Although it solves semantic problems of become, I doubt it would still fly
    with Ruby as it is today. All objects would have to occupy the same
    amount of memory for this to work (among other constraints). I don't
    think that's true of many of the built in classes.
     
    Jim Weirich, Jun 25, 2004
    #51
  12. rolo

    Sean O'Dell Guest

    With some small changes to object.c and ruby.h, this code:

    a = /cat/
    a.class = File
    a.read

    ...results in this error:

    testclass.rb:5:in `read': wrong argument type File (expected File) (TypeError)
    from testclass.rb:5

    Which is pretty solid, and now I can change the class of an object. A patch
    against the Ruby CVS follows, for reference.

    Sean O'Dell



    Index: object.c
    ===================================================================
    RCS file: /src/ruby/object.c,v
    retrieving revision 1.153
    diff -r1.153 object.c
    201a202,209
    Index: ruby.h
    ===================================================================
    RCS file: /src/ruby/ruby.h,v
    retrieving revision 1.105
    diff -r1.105 ruby.h
    415,420c415,420
    < #define RMODULE(obj) RCLASS(obj)
    < #define RFLOAT(obj) (R_CAST(RFloat)(obj))
    < #define RSTRING(obj) (R_CAST(RString)(obj))
    < #define RREGEXP(obj) (R_CAST(RRegexp)(obj))
    < #define RARRAY(obj) (R_CAST(RArray)(obj))
    < #define RHASH(obj) (R_CAST(RHash)(obj))
    ---
    422,424c422,424
    < #define RSTRUCT(obj) (R_CAST(RStruct)(obj))
    < #define RBIGNUM(obj) (R_CAST(RBignum)(obj))
    < #define RFILE(obj) (R_CAST(RFile)(obj))
    ---
     
    Sean O'Dell, Jun 25, 2004
    #52
  13. All of those places that you can assume the underlying datatype hasn't
    changed result in increased performance for us because there is no need
    to double check the type. If you have #class= then you would have added
    overhead on a significant portion of the code for a small use case.
    While this isn't a fundamental reason it is a logical reason why it may
    have been avoided in the first place. Why don't you try and make a
    patch that allows this on your own copy and compare performance and the
    like?
    Charles Comstock
     
    Charles Comstock, Jun 25, 2004
    #53
  14. rolo

    Sean O'Dell Guest

    Actually, the comparison is a function call that mostly just checks a bit
    mask. That's overhead, but fairly minimal I think.

    Sean O'Dell
     
    Sean O'Dell, Jun 25, 2004
    #54
  15. It depends -- the answer to that question can be overloaded in most
    cases. (Though Object.instance_method(:class).bind(obj).call will always
    give you the real class and I think that some of the internal methods
    actually do something like that.)
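
    A minimal sketch of that point, assuming a made-up Masquerader class
    that overrides #class: the overridden method answers whatever it likes,
    while the unbound Object#class still reports the real class.

```ruby
# Hypothetical class that lies about its own class.
class Masquerader
  def class
    String
  end
end

obj = Masquerader.new
reported = obj.class                                      # the overridden answer
actual = Object.instance_method(:class).bind(obj).call    # bypasses the override

p reported
p actual
```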
    This is how Squeak implements this AFAIK -- it just iterates through all
    Object references and changes them.

    I'm not sure, but this could be a bit slow, maybe. (And I'm not sure if
    it would be a general solution for the problems with Proxy objects -- it
    would work in the lazy database example, but there might be other
    problematic use cases.)

    Regards,
    Florian Gross
     
    Florian Gross, Jun 25, 2004
    #55
  16. rolo

    Sean O'Dell Guest

    Ran some tests. Ran this Ruby code, which causes the check to occur 100,000
    times:

    f = File.new("testfile")

    time_start = Time.now
    (0..10000).each do |index|
      f.read
      f.seek(0, File::SEEK_SET)  # IO#seek takes (offset, whence)
    end
    time_end = Time.now

    p time_end - time_start


    Most of the times were around 0.753561 seconds.

    Then I ran the same tests 100,000 times without the check and got times around
    0.763882 seconds. It really takes up no extra time at all, that I can
    figure. Totally negligible.

    Sean O'Dell
     
    Sean O'Dell, Jun 25, 2004
    #56
  17. How about changing the object to a redirection pointer, and also
    updating references in the garbage collector's root set traversal?
    You'd get some overhead at first, but after one garbage collection, all
    references will be updated and the redirection object can be recycled.

    mikael
     
    Mikael Brockman, Jun 25, 2004
    #57
  18. rolo

    Bill Kelly Guest

    :)

    No ambulator should "expect" the rug to remain steady
    under their feet when it might be pulled out from under them.

    And yet - we live by many such expectations just to make
    it through every day life. One such expectation, for me,
    is that when my constructor or initializer (whether it
    be C or C++ or Ruby or Java or Perl or Python or.....)
    sets up some private member variables in my object - it
    is reasonable for me to expect that they won't be tampered
    with by unexpected external tomfoolery.
    I'm just getting started writing ruby extensions, so I have a
    lot to learn.


    Regards,

    Bill
     
    Bill Kelly, Jun 25, 2004
    #58
  19. rolo

    Sean O'Dell Guest

    You can if it's nailed down. You don't if it's a throw rug with a rope tied
    to one end which runs out the front door and into the street.
    This is my practice as well in Ruby. Ruby handles such cases very well, and
    gives you good warnings when something your code needs isn't there. It's
    very simple to handle the exceptional cases by adding some checks later on if
    you find there's a real problem. Usually there isn't.

    C is quite different, though. You make assumptions to speed development and
    reduce code bloat, but sometimes you just can't get away with it.

    Sean O'Dell
     
    Sean O'Dell, Jun 25, 2004
    #59
  20. There is a theory that says that the speed of a language
    is the speed at which it manipulates Integer. I don't
    necessarily agree with that theory regarding higher level
    languages like Ruby. Yet I feel like it is relevant in
    the case of a low level change with pervasive impacts.

    As a result I feel like a benchmark with integers or
    some other low level objects would tell about the impact
    of your change in an interesting way compared to the
    test that you ran where I suspect a lot of the time is probably spent
    in the OS rather than in Ruby.
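
    Something along these lines might do as a starting point. It is only a
    sketch: the integer_workload method and the iteration count are
    assumptions chosen for illustration, and the loop touches nothing but
    integers so that no time goes to I/O.

```ruby
# Hypothetical workload: pure integer arithmetic, no file or OS calls.
def integer_workload(iterations)
  sum = 0
  iterations.times { |i| sum += i * 2 }
  sum
end

# A monotonic clock avoids jumps from wall-clock adjustments.
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
result = integer_workload(100_000)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts format("integer loop: sum=%d, %.6f s", result, elapsed)
```

    Running it against both a patched and an unpatched interpreter, several
    times each, would give numbers that are easier to compare than a
    file-reading loop.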

    Yours,

    JeanHuguesRobert
     
    Jean-Hugues ROBERT, Jun 26, 2004
    #60
