The future of the character-encodings library

Discussion in 'Ruby' started by Nikolai Weibull, Mar 16, 2011.

  1. Hi!

    As some of you know the character-encodings library is a bit stale.
    It currently can’t be used from Ruby 1.9 (you may ask yourself why you
    would, I suppose) because the Encoding namespace is taken; there
    have been some compilation problems where gcc on Cygwin/MinGW doesn’t
    support the visibility attribute; and the tests depend on an ancient
    version of RSpec. I am in the process of fixing these wrongs, but I
    need your help.

    The big problem for me is figuring out how to namespace it. But
    before anyone tries to come up with a solution, let me describe my
    vision of this library’s future.

    Character-encodings will be a library that allows you to deal with
    UTF-8-encoded Strings in Ruby 1.8 and with collation, normalization,
    Unicode-table lookup and other Unicode-specific tasks in Ruby 1.[89].
    My original vision was that this library would support many more
    encodings, but the internet has spoken and UTF-8 is the future. (I
    also had a hope that Ruby programmers were going to begin namespacing
    their projects a bit better, but Ruby programmers prefer libraries
    called “Hpricot” over libraries called “Parsers::HTML”.) Ruby 1.9 adds
    support for a range of encodings that I’m not at all interested in and
    I think that this library needs to be more focused to have any sort of
    future.

    Therefore, I would like to rename the library and its namespaces to
    reflect this change. The apt name “Unicode” is, sadly, already taken.
    I was thinking of “Runicode”, but that’s perhaps a bit lame.

    A second question is one of API design. How should you, from Ruby
    1.8, be able to create a UTF-8-aware String? Currently you write
    either u"äbc" or +"äbc". I don’t like this style anymore. I don’t
    want to pollute Kernel or String unnecessarily. I would like to be
    able to provide an API that would allow you to run the same .rb file
    in both 1.8 and 1.9 and get the same results. This is, perhaps, not
    possible, given that 1.9 uses a dizzying array of methods to determine
    the encoding of a String. One could, of course, make Kernel#u a no-op
    for 1.9. Could any of the users of this library please provide me
    with some input on this point.
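
A minimal sketch of the "no-op on 1.9" idea, to make the question concrete. The module name `Utf8Compat` and its `u` method are hypothetical and are not the library's API; the 1.8 branch is only a placeholder, since the library's own UTF-8-aware wrapper is not shown in this post.

```ruby
# -*- coding: utf-8 -*-
# Hypothetical module and method names; nothing here is the library's real API.
module Utf8Compat
  # True when String carries its own encoding (Ruby 1.9 and later).
  NATIVE_ENCODINGS = "".respond_to?(:force_encoding)

  # Returns a UTF-8 view of +string+ without touching Kernel or String.
  def self.u(string)
    if NATIVE_ENCODINGS
      # 1.9+: effectively a no-op when the string is already tagged UTF-8.
      return string if string.encoding == Encoding::UTF_8
      # Otherwise re-tag a copy so the caller's string is left untouched.
      string.dup.force_encoding(Encoding::UTF_8)
    else
      # 1.8: this is where the library's UTF-8-aware wrapper would be built.
      # It is left out here, so this placeholder just returns the bytes.
      string
    end
  end
end

# 3 on 1.9 and later; the 1.8 placeholder above would still count bytes.
puts Utf8Compat.u("äbc").length
```

Keeping the helper in its own module avoids adding anything to Kernel or String, at the cost of a longer call than u"äbc".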

    I’m looking forward to receiving your input!
    Nikolai Weibull, Mar 16, 2011
    #1

  2. Eric Hodel Guest

    On Mar 16, 2011, at 6:26 AM, Nikolai Weibull wrote:
    > Could any of the users of this library please provide me with some
    > input on this point.
    >
    > I’m looking forward to receiving your input!


    There don't appear to be many users of character-encodings:

    https://rubygems.org/gems/character-encodings
    Eric Hodel, Mar 16, 2011
    #2

  3. Eric, could you please reply to all in the future? I have “skip” set
    for this mailing list as, as you point out below, it’s rather high in
    noise. It makes it rather hard to stitch things together when I can’t
    easily reply to your reply.

    Eric Hodel wrote:
    > On Mar 16, 2011, at 6:26 AM, Nikolai Weibull wrote:
    > > Could any of the users of this library please provide me with some
    > > input on this point.
    > >
    > > I’m looking forward to receiving your input!


    > There don't appear to be many users of character-encodings:
    >
    > https://rubygems.org/gems/character-encodings


    I don’t see how this is relevant, but thank you for pointing out my
    failure in selling and maintaining my library.
    Nikolai Weibull, Mar 16, 2011
    #3
  4. Eric Hodel Guest

    On Mar 16, 2011, at 3:15 PM, Nikolai Weibull wrote:
    > Eric, could you please reply to all in the future?


    No. I don't know two of the email addresses in your To header so I
    can't judge if my response is topical for them.

    The third appears to be a mailing list to which I am not subscribed. I
    don't wish to fend off possible "you must subscribe" bounces.

    > I have “skip” set for this mailing list


    I don't know what this means.

    I think it means that you don't want to see messages from this mailing
    list. If this is true why did you post to it?

    > as, as you point out below, it’s rather high in noise.


    I don't see where I made this assertion.

    > It makes it rather hard to stitch things together when I can’t
    > easily reply to your reply.

    I don't see why I should be inconvenienced to make it easier for you to
    see responses you do not want to see.

    > Eric Hodel wrote:
    >> On Mar 16, 2011, at 6:26 AM, Nikolai Weibull wrote:
    >>> Could any of the users of this library please provide me with some
    >>> input on this point.
    >>>
    >>> I’m looking forward to receiving your input!
    >
    >> There don't appear to be many users of character-encodings:
    >>
    >> https://rubygems.org/gems/character-encodings
    >
    > I don’t see how this is relevant, but thank you for pointing out my
    > failure in selling and maintaining my library.


    I was attempting to suggest that since there aren't many downloads for
    your gem maybe there's no need for you to continue to maintain it in its
    current form (if at all).

    Some of the functionality of your gem has been taken up by ruby 1.9.
    Anyone seriously considering handling encodings other than US ASCII
    should move to 1.9. I would rebuild character-encodings atop 1.9 if I
    were the maintainer and had such a need.

    Due to the low number of downloads you have an excellent opportunity to
    throw out your existing API and rebuild your library to integrate well
    with the encoding features of ruby 1.9.

    I don't see why you would consider a low number of downloads to be any
    failure on your part. I simply made a statement of fact. I have many,
    many gems that nobody uses and I no longer maintain. It would be
    ridiculous for me to attempt to attach any judgements to such a fact.
    Eric Hodel, Mar 16, 2011
    #4
  5. On Thu, Mar 17, 2011 at 00:56, Eric Hodel wrote:
    > On Mar 16, 2011, at 3:15 PM, Nikolai Weibull wrote:
    >> Eric, could you please reply to all in the future?


    > No. I don't know two of the email addresses in your To header so I
    > can't judge if my response is topical for them.

    But I do and I made the judgment call for you.

    > The third appears to be a mailing list to which I am not subscribed.
    > I don't wish to fend off possible "you must subscribe" bounces.

    That is a valid point. I should have cross-posted my request for help
    instead.

    >> I have “skip” set for this mailing list


    > I think it means that you don't want to see messages from this mailing
    > list.

    Correct.

    > If this is true why did you post to it?


    Because I wanted this to reach as many (interested) people as
    possible. If I’m going to make a big change here I want as many to
    know about it as possible.

    I know that people have used the library in the past, especially in
    back-ends, which makes it a lot harder to know how many users I
    actually have. I have, believe it or not, even been paid (minute
    amounts) to work on this library. I figured that perhaps there were
    some hidden users that I didn’t know about that were still using it
    and I therefore posted to the most public Ruby forum that I know of.

    >> as, as you point out below, it’s rather high in noise.


    > I don't see where I made this assertion.


    You implicitly made (I thought at the time, see below) it by saying
    that the library in question doesn’t have that many users and, as
    such, my posting wasn’t relevant to the majority of the readers of
    this list. This low level of relevancy is something that I have
    judged to be the case for many topics on this list.

    >> It makes it rather hard to stitch things together when I can’t
    >> easily reply to your reply.

    > I don't see why I should be inconvenienced to make it easier for you to
    > see responses you do not want to see.

    The inconvenience that you would have to endure by pressing Reply to
    all and removing the char-encodings list from the Cc list must surely
    not be as great as that which you have put me through by not including
    me in the Cc list so that I would receive your response to the posting
    that I made (that I, of course, do want).

    Either way, this is a moot point, as I’ve now set noskip. (I was
    hoping that either those that replied would include me or that the
    mailing list software would be intelligent enough to not skip replies
    to my postings. I was wrong.)

    >> Eric Hodel wrote:
    >>> On Mar 16, 2011, at 6:26 AM, Nikolai Weibull wrote:
    >>>> Could any of the users of this library please provide me with some
    >>>> input on this point.
    >>>>
    >>>> I’m looking forward to receiving your input!

    >>
    >>> There don't appear to be many users of character-encodings:
    >>>
    >>> https://rubygems.org/gems/character-encodings

    >>
    >> I don’t see how this is relevant, but thank you for pointing out my
    >> failure in selling and maintaining my library.


    > I was attempting to suggest that since there aren't many downloads for
    > your gem maybe there's no need for you to continue to maintain it in its
    > current form (if at all).

    Then, for my sake, please say so. A short – easily interpreted as
    snide – remark like that can easily be misinterpreted.

    > Some of the functionality of your gem has been taken up by ruby 1.9.
    > Anyone seriously considering handling encodings other than US ASCII
    > should move to 1.9. I would rebuild character-encodings atop 1.9 if I
    > were the maintainer and had such a need.

    To what need are you referring?

    The whole point of the library was to provide UTF-8 support for 1.8.

    I now want to shift focus to both providing support for UTF-8 for
    those of us stuck with 1.8 (due to 1.9’s horrendous I/O and require
    performance on Windows) and as an extension to 1.9’s built-in Unicode
    support.

    Looking at 1.9 it is now (because it sure wasn’t in 2006 when I began
    developing this library) clear that Ruby won’t be supporting a lot of
    features that would be desirable. You can, for example, not easily
    perform collation, normalization, or character-class lookup. Even
    such a thing as String#upcase doesn’t seem to be able to do the right
    thing. I might be doing something wrong, but

    # -*- coding: utf-8 -*-

    puts "äbc".upcase

    prints “äBC”, not “ÄBC”.
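
For illustration only, a minimal sketch of how a table-driven upcase could fill that gap. The constant `EXTRA_UPCASE` and the method `upcase_with_table` are hypothetical names and are not taken from the library; a real table would presumably be generated from the Unicode data files rather than typed by hand.

```ruby
# -*- coding: utf-8 -*-
# Hypothetical fallback table covering just a few letters for the example.
EXTRA_UPCASE = { "ä" => "Ä", "ö" => "Ö", "å" => "Å" }

# Upcases +string+ one character at a time: the table is consulted first,
# and String#upcase handles every character the table does not list.
def upcase_with_table(string)
  string.gsub(/./m) { |char| EXTRA_UPCASE.fetch(char) { char.upcase } }
end

puts upcase_with_table("äbc") # prints ÄBC
```

Going character by character is slow for long strings, but it keeps the sketch independent of whatever String#upcase does with non-ASCII letters.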

    > Due to the low number of downloads you have an excellent opportunity to
    > throw out your existing API and rebuild your library to integrate well
    > with the encoding features of ruby 1.9.

    I know that this type of behavior is popular in the Ruby community,
    but I wanted to give my, albeit few, users a chance to have their say
    on this matter.

    > I don't see why you would consider a low number of downloads to be any
    > failure on your part. I simply made a statement of fact.

    There are many statements of fact that one can make that are often
    best not made.

    As I already noted above, you need to contextualize such a statement
    so that it’s not open for interpretation. Had you written something
    along the lines of “Judging from the statistics at rubygems.org,
    perhaps you can get away with your proposed changes without too many
    users becoming upset?”, I would have known what you were trying to get
    at. As you wrote it, it only stands as a pointless remark.

    > I have many, many gems that nobody uses and I no longer maintain.


    I actually wish to continue maintaining this library and I actually do
    have active users.

    > It would be ridiculous for me to attempt to attach any judgements to
    > such a fact.

    Am I ridiculous for not sharing your level of detachment from your work?

    I don’t know if you actually looked at the source code, but it’s
    actually quite a few lines of (sometimes rather complex) code, and for
    me to throw it away without at least considering its future utility is
    not something that I could easily do.
    Nikolai Weibull, Mar 17, 2011
    #5