Ruby/Java strings solved the Ruby way

Discussion in 'Ruby' started by Charles Oliver Nutter, Jul 20, 2008.

  1. Hello all!

    I'm posting this here because it seems like a topic better discussed in
    the general Ruby community.

    JRuby allows you pretty seamless access to Java libraries through its
    Java integration layer. You can pull in just about any class,
    instantiate objects, call methods, and so on. In order to make this a
    bit easier, in many cases we automatically coerce particular Ruby types
    to their equivalent Java types. For example, Fixnums become either boxed
    integral or primitive integral values, Floats become boxed or primitive
    floating-point values, and Strings are decoded from byte[] into Java
    Strings as UTF-8.

    But there's a problem with this...it adds a bit of overhead in the
    numeric cases, and a *lot* of overhead in the String case.

    Here's a comparison between calling a method that takes an int and a
    method that takes a String (best times out of five):

    with string 'hello': 1.068688
    with fixnum 1: 0.563014

    And this is a short string. The coercion cost for strings is at least O(n).
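
    To make the O(n) point concrete, here is a rough sketch in plain Ruby
    (not JRuby internals). It assumes that transcoding UTF-8 bytes to
    UTF-16 with String#encode is a fair stand-in for decoding a Ruby
    String into a Java String; simulate_coercion is a hypothetical helper
    name, and the timings it prints will vary by machine.

```ruby
require 'benchmark'

# Stand-in for the Ruby-to-Java String conversion (an assumption, not the
# real JRuby code path): transcode the UTF-8 bytes to UTF-16, which has to
# visit every byte of the input.
def simulate_coercion(str)
  str.encode('UTF-16BE')
end

short = 'hello'
long  = 'hello' * 10_000

[short, long].each do |s|
  time = Benchmark.realtime { 1_000.times { simulate_coercion(s) } }
  puts format('%6d bytes: %.6f s', s.bytesize, time)
end
```

    The longer input should take noticeably more time per call, which is
    the behaviour the comparison above hints at for short strings.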

    String coercion is what I'm writing about.

    We'll never be able to eliminate the coercion cost entirely. Ruby
    Strings are byte[], and implementing our own String and related classes
    on top of byte[] throughout has been a great move for us. So there's
    never going to be a straight-through path from a Ruby String to a Java
    String. But I think we can reduce the impact for JRuby users by doing
    things the Ruby way.

    Ruby already has a protocol for coercion, via methods like to_str,
    to_ary and so on. This allows you to pass e.g. non-Strings to methods
    that act on Strings, and frequently (usually) they'll coerce and work
    fine. Often, if you want to avoid a coercion hit, you'll create the
    String ahead of time. And that's where we can learn from Ruby for Java
    String handling.
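
    For readers less familiar with the protocol, a minimal sketch in plain
    Ruby follows. Label and Pair are hypothetical classes invented for the
    illustration; to_str and to_ary are the standard implicit-conversion
    hooks.

```ruby
# Hypothetical wrapper type that is "string-like" but not a String.
class Label
  def initialize(text)
    @text = text
  end

  # Implicit conversion hook: core String methods call this when they
  # receive a non-String argument.
  def to_str
    @text
  end
end

# Hypothetical pair type that is "array-like".
class Pair
  def initialize(left, right)
    @left = left
    @right = right
  end

  # Implicit conversion hook used by multiple assignment.
  def to_ary
    [@left, @right]
  end
end

label = Label.new('world')
greeting = 'hello ' + label   # String#+ coerces its argument via to_str

a, b = Pair.new(1, 2)         # destructuring coerces via to_ary

# Paying the coercion once up front, then reusing the real String:
text = label.to_str
lines = 3.times.map { |i| "#{i}: " + text }
```

    The last two lines show the "create the String ahead of time" habit:
    the conversion happens once rather than on every call.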

    So I propose that instead of always decoding an incoming Ruby String into
    a Java String when calling a Java method, we introduce a new type--call it
    JString for now--that represents a Java string. When you require in the
    Java integration support, it would add to Ruby String a method
    to_jstring (or to_String or hey, toString?). So for calls from Ruby to
    Java, we'd follow Ruby coercion protocols and only accept either JString
    or objects that coerce to JString.
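
    A rough model of that rule in plain Ruby, purely as a sketch: JString
    here is a hypothetical stand-in that just holds UTF-16 data,
    coerce_to_jstring is an invented name for whatever check would sit at
    the Java call boundary, and to_jstring is the method name floated
    above. None of this is actual JRuby API.

```ruby
# Hypothetical stand-in for a Java string: it simply holds UTF-16 data.
class JString
  attr_reader :utf16

  def initialize(utf16)
    @utf16 = utf16
  end

  # A JString is already in the target form.
  def to_jstring
    self
  end
end

class String
  # Explicit, caller-visible conversion (name taken from the proposal).
  def to_jstring
    JString.new(encode('UTF-16BE'))
  end
end

# Stand-in for the Java call boundary: accept a JString, or anything that
# coerces to one, and reject everything else.
def coerce_to_jstring(arg)
  return arg if arg.is_a?(JString)
  unless arg.respond_to?(:to_jstring)
    raise TypeError, "can't convert #{arg.class} into JString"
  end
  arg.to_jstring
end
```

    Under this model a plain Ruby String still works as an argument, since
    it responds to to_jstring; the difference is that the conversion is a
    visible step the caller can hoist or skip.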

    Likewise, coming from Java to Ruby, we wouldn't automatically coerce;
    we'd return a JString object that implements to_str. You can then
    usually pass that to String APIs, or just coerce it immediately and go
    on with your business. Since this latter change would break some apps
    that expect Java strings to always be coerced, it would be saved for the
    next major release of JRuby and thoroughly discussed.
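
    The return direction could look roughly like this in plain Ruby. As
    before, JString is a hypothetical stand-in holding UTF-16 data, and
    the cached decode is an assumption of the sketch rather than part of
    the proposal.

```ruby
# Hypothetical stand-in for a string handed back from Java.
class JString
  def initialize(utf16)
    @utf16 = utf16
  end

  # Implicit conversion back to a Ruby String; decoded on first use and
  # cached so repeated use does not pay the cost again.
  def to_str
    @ruby_string ||= @utf16.encode('UTF-8')
  end
  alias to_s to_str
end

# Stand-in for a value returned from a Java method.
from_java = JString.new('report'.encode('UTF-16BE'))

joined = 'file: ' + from_java                 # String#+ calls to_str
found  = 'my report.txt'.include?(from_java)  # so does String#include?

name = from_java.to_str                       # or coerce once and move on
```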

    I think this model provides the best possible experience when calling
    Java from Ruby but also allows JRuby users to take control of the
    coercion process, either by defining their own to_jstring methods on
    other types, or by pre-coercing strings they intend to use a lot.
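
    Both forms of control can be sketched in plain Ruby. Everything here
    is hypothetical: JString and its conversion counter, the Customer
    class, and java_call, which stands in for a Java method that takes a
    String.

```ruby
# Hypothetical stand-in for a Java string, with a counter so the number
# of conversions can be observed.
class JString
  @conversions = 0
  class << self
    attr_accessor :conversions
  end

  attr_reader :utf16

  def initialize(utf16)
    @utf16 = utf16
  end

  def to_jstring
    self
  end
end

class String
  def to_jstring
    JString.conversions += 1
    JString.new(encode('UTF-16BE'))
  end
end

# A user-defined type opting in to the protocol.
class Customer
  def initialize(name)
    @name = name
  end

  def to_jstring
    @name.to_jstring
  end
end

# Stand-in for a Java method that takes a String argument.
def java_call(arg)
  arg.to_jstring.utf16.bytesize
end

key = 'some frequently used key'

JString.conversions = 0
1_000.times { java_call(key) }    # converts on every call
naive = JString.conversions

JString.conversions = 0
jkey = key.to_jstring             # pre-coerce once
1_000.times { java_call(jkey) }   # no further conversions
precoerced = JString.conversions

customer_size = java_call(Customer.new('Ann'))
```

    In this model the naive loop performs one conversion per call, while
    the pre-coerced loop performs a single conversion in total.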

    Thoughts?

    - Charlie
    Charles Oliver Nutter, Jul 20, 2008
    #1

  2. Jim Menard

    On Sun, Jul 20, 2008 at 1:36 PM, Charles Oliver Nutter wrote:
    [snip]
    > So I propose that instead of always decoding incoming Ruby String into a
    > Java String when calling a Java method, we introduce a new type--call it
    > JString for now--that represents a Java string. When you require in the Java
    > integration support, it would add to Ruby String a method to_jstring (or
    > to_String or hey, toString?). So for calls from Ruby to Java, we'd follow
    > Ruby coercion protocols and only accept either JString or objects that
    > coerce to JString.
    >
    > Likewise, coming from Java to Ruby, we wouldn't automatically coerce; we'd
    > return a JString object that implements to_str. You can then usually pass
    > that to String APIs, or just coerce it immediately and go on with your
    > business. Since this latter change would break some apps that expect Java
    > strings to always be coerced, it would be saved for the next major release
    > of JRuby and thoroughly discussed.


    This sounds like an excellent compromise. I vote for to_jstring
    because it looks most Ruby-esque.

    Jim
    --
    Jim Menard
    http://www.io.com/~jimm/
    Jim Menard, Jul 21, 2008
    #2

  3. Jim Menard wrote:
    > This sounds like an excellent compromise. I vote for to_jstring
    > because it looks most Ruby-esque.


    Also up for debate is whether boxed primitives from Java should behave
    the same way, with a JInteger, JFloat, and so on that can coerce to
    Fixnum or Float. But boxed primitives are considerably cheaper to coerce
    than Strings, so it may not be worth it.
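
    For comparison, the integer analogue would lean on Ruby's to_int hook.
    A minimal plain-Ruby sketch, with JInteger as a hypothetical wrapper
    rather than anything JRuby provides:

```ruby
# Hypothetical stand-in for a boxed java.lang.Integer.
class JInteger
  def initialize(value)
    @value = value
  end

  # Implicit conversion hook used wherever core methods need an integer.
  def to_int
    @value
  end
end

boxed = JInteger.new(1)

element  = [10, 20, 30][boxed]      # Array#[] coerces the index via to_int
repeated = 'ab' * JInteger.new(3)   # String#* coerces the count via to_int
```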

    - Charlie
    Charles Oliver Nutter, Jul 21, 2008
    #3