ANN: home_run 0.9.1 Released

Jeremy Evans

= home_run

home_run is an implementation of ruby's Date/DateTime classes in C,
with much better performance (20-200x) than the version in the
standard library, while being almost completely compatible.

== Performance increase (microbenchmarks)

The speedup you'll get depends mostly on your version of ruby, but
also on your operating system, platform, and compiler. Here are
some comparative results for common methods:

#                 | i386  | i386  | i386  | i386  | amd64 |
#                 |Windows| Linux | Linux | Linux |OpenBSD|
#                 | 1.8.6 | 1.8.7 | 1.9.1 | 1.9.2 | 1.9.2 |
#                 |-------+-------+-------+-------+-------|
Date.civil        |   82x |   66x |   27x |   21x |   14x |
Date.parse        |   56x |   56x |   33x |   30x |   25x |
Date.today        |   17x |    6x |    2x |    2x |    2x |
Date.strptime     |   43x |   62x |   63x |   37x |   23x |
DateTime.civil    |  252x |  146x |   52x |   41x |   17x |
DateTime.parse    |   52x |   54x |   32x |   27x |   20x |
DateTime.now      |   78x |   35x |   11x |    8x |    4x |
DateTime.strptime |   63x |   71x |   58x |   35x |   23x |
Date#strftime     |  156x |  104x |  110x |   70x |   62x |
Date#+            |   34x |   32x |    5x |    5x |    4x |
Date#<<           |  177x |  220x |   86x |   72x |   40x |
Date#to_s         |   15x |    6x |    5x |    4x |    2x |
DateTime#strftime |  146x |  107x |  114x |   71x |   60x |
DateTime#+        |   34x |   37x |    8x |    6x |    3x |
DateTime#<<       |   88x |  106x |   40x |   33x |   16x |
DateTime#to_s     |  144x |   47x |   54x |   29x |   24x |

== Real world difference

The standard library Date class is slow enough to be the
bottleneck in much (if not most) of the code that uses it.
Here's a real world benchmark showing the retrieval of
data from a database (using Sequel), first without home_run,
and then with home_run.

$ script/console production
Loading production environment (Rails 2.3.5)
0.270000 0.020000 0.290000 ( 0.460604)
=> nil
2.510000 0.050000 2.560000 ( 2.967896)
=> nil

$ home_run script/console production
Loading production environment (Rails 2.3.5)
0.100000 0.000000 0.100000 ( 0.114747)
=> nil
0.860000 0.010000 0.870000 ( 0.939594)

Without changing any application code, there's a 4x
speedup when retrieving all employees (the first timing in
each session), and a 3x speedup when retrieving all
notifications (the second timing). The main reason for the
difference in speedup between these two models is that
Employee has 5 date columns, while Notification only has 3.
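
A measurement of this kind can be sketched with the standard Benchmark
module. The snippet below is only an illustration and is not the
application benchmarked above: the ROWS data and the load_rows helper
are hypothetical stand-ins for a model with five date columns. The
four numbers that Benchmark.measure prints are user CPU time, system
CPU time, their total, and elapsed real time in parentheses, which is
the layout of the timing lines in the console output.

```ruby
require 'benchmark'
require 'date'

# Hypothetical stand-in for database rows: 1,000 rows, each holding
# five date strings that must be turned into Date objects.
ROWS = Array.new(1_000) { |i| ["2010-08-%02d" % (i % 28 + 1)] * 5 }

# Parse every date column of every row, as a database library would
# when typecasting date columns.
def load_rows(rows)
  rows.map { |cols| cols.map { |s| Date.parse(s) } }
end

# Prints: user, system, total, and (real) times in seconds.
puts Benchmark.measure { load_rows(ROWS) }
```

Running the same script once with plain ruby and once through the
home_run wrapper would give the two timings to compare.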

== Installing the gem

gem install home_run

The standard gem requires compiling from source, so you need a working
compiler toolchain. Since few Windows users have a working compiler
toolchain, a Windows binary gem is available that works on both 1.8
and 1.9.

== Installing into site_ruby

This is only necessary on ruby 1.8, as on ruby 1.9, gem directories
come before the standard library directories in the load path.

After installing the gem:

home_run --install

Installing into site_ruby means that ruby will always use home_run's
Date/DateTime classes instead of the ones in the standard library.

If you ever want to uninstall from site_ruby:

home_run --uninstall

== Running without installing into site_ruby

Just like installing into site_ruby, this should only be necessary
on ruby 1.8.

If you don't want to install into site_ruby, you can use home_run's
Date/DateTime classes for specific programs by running your script
using home_run:

home_run ruby ...
home_run irb ...
home_run unicorn ...
home_run rake ...

This manipulates the RUBYLIB and RUBYOPT environment variables so
that home_run's Date/DateTime classes will be used.
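
As a rough sketch of what such a wrapper could do (this is an
assumption for illustration, not home_run's actual code), the
environment can be built by putting an extension directory at the
front of RUBYLIB and asking ruby to require date at startup through
RUBYOPT. The wrapped_env helper and the directory path are
hypothetical.

```ruby
# Hypothetical helper: returns a copy of env in which ext_dir is
# searched first and `date` is required when ruby starts.
def wrapped_env(ext_dir, env = ENV.to_hash)
  rubylib = [ext_dir, env['RUBYLIB']].compact.join(File::PATH_SEPARATOR)
  rubyopt = ['-rdate', env['RUBYOPT']].compact.join(' ')
  env.merge('RUBYLIB' => rubylib, 'RUBYOPT' => rubyopt)
end

# The path below is made up; a real wrapper would use the directory
# that contains the compiled extension.
env = wrapped_env('/path/to/home_run/ext', {})
puts env['RUBYLIB']
puts env['RUBYOPT']

# A wrapper would then run the requested command with this
# environment, for example with exec(env, *ARGV) on ruby 1.9+.
```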

You can also just require the library:

require 'home_run'

This should only be used as a last resort. Because rubygems requires
date, you can end up with situations where the Date instances created
before the require use the standard library version of Date, while the
Date instances created after the require use this library's version.
However, in some cases (such as on Heroku), this is the only way to
easily use this library.

== Running the specs

You can run the rubyspec based specs after installing the gem, if
you have MSpec installed (gem install mspec):

home_run --spec

If there are any failures, please report them as a bug.

== Running comparative benchmarks

You can run the benchmarks after installing the gem:

home_run --bench

The benchmarks compare home_run's Date/DateTime classes to the
standard library ones, showing you the amount of time an average
call to each method takes for both the standard library and
home_run, and the number of times home_run is faster or slower.
Output is in CSV, so an entry like this:

Date._parse,362562,10235,35.42

means that:

* The standard library's Date._parse averaged 362,562 nanoseconds
per call.
* home_run's Date._parse averaged 10,235 nanoseconds per call.
* Therefore, home_run's Date._parse method is 35.42 times faster.

The bench task tries to be fair by ensuring that it runs the
benchmark for at least two seconds for both the standard
library and home_run's versions.
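
The arithmetic behind the CSV entry above can be checked with a few
lines of ruby. This is just a sketch that splits the sample line and
recomputes the ratio; the variable names are chosen for illustration.

```ruby
# Sample line from the benchmark output described above:
# method name, stdlib ns per call, home_run ns per call, ratio.
line = "Date._parse,362562,10235,35.42"

name, stdlib_ns, home_run_ns, reported = line.split(",")
stdlib_ns   = Integer(stdlib_ns)
home_run_ns = Integer(home_run_ns)

# How many times faster home_run is, rounded to two decimals.
speedup = (stdlib_ns.to_f / home_run_ns).round(2)

puts "#{name}: stdlib #{stdlib_ns} ns, home_run #{home_run_ns} ns"
puts "computed speedup: #{speedup}x (reported: #{reported}x)"
```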

== Usage

home_run aims to be compatible with the standard library, except
for differences mentioned below. So you can use it the same way
you use the standard library.

== Differences from standard library

* Written in C (mostly) instead of ruby. Stores information in a
C structure, and therefore has a range limitation. home_run
cannot handle dates after 5874773-08-15 or before -5877752-05-08
on 32-bit platforms (with larger limits for 64-bit platforms).
* The Date class does not store fractional days (e.g. hours, minutes),
or offsets. The DateTime class does handle fractional days and
offsets.
* The DateTime class stores fractional days as the number of
nanoseconds since midnight, so it cannot deal with differences
less than a nanosecond.
* Neither Date nor DateTime uses rational. Where the standard
library returns rationals, home_run returns integers or floats.
* Because rational is not used, it is not required. This can break
other libraries that use rational without directly requiring it.
* There is no support for modifying the date of calendar reform; the
sg arguments are ignored and the Gregorian calendar is always used.
This means that julian day 0 is -4713-11-24, instead of -4712-01-01.
* The undocumented Date#strftime format modifiers are not supported.
* The DateTime offset is checked for reasonableness. home_run
does not support offsets with an absolute difference of more than
14 hours from UTC.
* DateTime offsets are stored in minutes, so home_run rounds offsets
with fractional minutes to the nearest minute.
* All public class and instance methods for both Date and DateTime
are implemented, except that the allocate class method is not
available and on 1.9, _dump and _load are used instead of
marshal_dump and marshal_load.
* Only the public API is compatible; the private methods in the
standard library are not implemented.
* The marshalling format differs from the one used by the standard
library. Note that the 1.8 and 1.9 standard library date
marshalling formats differ from each other.
* Date#step treats the step value as an integer, so it cannot handle
steps of fractional days. DateTime#step can handle fractional
day steps, though.
* When parsing the %Q modifier in _strptime, the hash returned
includes an Integer :seconds value and a Float :sec_fraction
value instead of a single rational :seconds value.
* The string returned by #inspect has a different format, since it
doesn't use rational.
* The conversion of 2-digit years to 4-digit years in Date._parse
is set to true by default. On ruby 1.8, the standard library
has it set to false by default.
* You can use the Date::Format::STYLE hash to change how to parse
DD/DD/DD and DD.DD.DD date formats, allowing you to get ruby 1.9
behavior on 1.8 or vice-versa. This is probably the only new
feature in home_run that isn't in the standard library.

Any other differences will either be documented here or considered
bugs, so please report any other differences you find.
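
Two of the differences above are easy to observe by running the
following against the standard library alone. This is a small sketch
that only shows the standard library side; per the list above,
home_run would return an integer or float for the subtraction and
would always use the Gregorian calendar for julian day 0.

```ruby
require 'date'

# Subtracting two dates: the standard library returns a Rational
# number of days.
diff = Date.civil(2010, 8, 30) - Date.civil(2010, 8, 23)
puts diff.class
puts diff.to_i

# Julian day 0 with the standard library's default calendar reform
# date, and with the Gregorian calendar forced for all dates.
puts Date.jd(0).to_s
puts Date.jd(0, Date::GREGORIAN).to_s
```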

== Known incompatibilities

Some other libraries are known to be incompatible with this
extension due to the above differences:

* Date::performance - Date#<=> assumes @ajd instance variable
(unnecessary anyway, as home_run is faster)
* ruby-ole - Depends on DateTime.allocate/#initialize

== Reporting issues/bugs

home_run uses GitHub Issues for tracking issues/bugs:

http://github.com/jeremyevans/home_run/issues

== Contributing

The source code is on GitHub:

http://github.com/jeremyevans/home_run

To get a copy:

git clone git://github.com/jeremyevans/home_run.git

There are a few requirements:

* rake
* rake-compiler
* MSpec (not RSpec) for running the specs. The specs are based on
the rubyspec specs, which is why they use MSpec.
* RDoc 2.5.10+ if you want to build the documentation.
* Ragel 6.5+ if you want to modify the ragel parser.

== Compiling

To compile the library from a git checkout, after installing the
requirements:

rake compile

== Testing

The default rake task runs the specs, so just run:

rake

You need to compile the library and install MSpec before running the
specs.

== Benchmarking

To see the speedup that home_run gives you over the standard library:

rake bench

To see how much less memory home_run uses compared to the standard
library:

rake mem_bench

To see how much less garbage is created when instantiating objects
with home_run compared to the standard library:

rake garbage_bench

If you want to run all three benchmarks at once:

rake bench_all

== Platforms Supported

home_run has been tested on the following:

=== Operating Systems/Platforms

* Linux (x86_64, i386)
* Mac OS X 10.6 (x86_64, i386), 10.5 (i386)
* OpenBSD (amd64, i386)
* Solaris 10 (sparc)
* Windows XP (i386)
* Windows 7 (x64)

=== Compiler Versions

* gcc (3.3.5, 4.0.1, 4.2.1, 4.4.3, 4.5.0)
* Sun Studio Compiler (5.9)

=== Ruby Versions

* jruby cext branch (as of commit 1969c504229bfd6f2de1, 2010-08-23,
compiles and runs specs correctly, segfaults on benchmarks)
* rbx head (as of commit 0e265b92727cf3536053, 2010-08-16)
* ruby 1.8.6 (p0, p110, p398, p399)
* ruby 1.8.7 (p174, p248, p299, p302)
* ruby 1.9.1 (p243, p378, p429, p430)
* ruby 1.9.2 (p0)
* ruby head

If your platform, compiler version, or ruby version is not listed
above, please test and send me a report including:

* Your operating system and platform (e.g. i386, x86_64/amd64)
* Your compiler
* Your ruby version
* The output of home_run --spec
* The output of home_run --bench

== Author

Jeremy Evans <[email protected]>
 
Charles Oliver Nutter

We are very appreciative of the extra effort to make sure home_run
works on JRuby's cext branch. We plan to ship JRuby 1.6 with support
for C extensions, at least at a beta/experimental level.

We also hope someone will step up to write a Java-based version of
this library, probably using JodaTime (which we ship with JRuby) as
the base. C extensions are part of JRuby's long-term plan, but they
have many limitations (such as inability to support multiple JRuby
runtimes or concurrent execution), and eventually apps will want a
Java version that does not have the same issues.

Thanks for a great library and for the JRuby support, Jeremy :)

- Charlie

Jeremy Evans

Roger said:

It's faster than both.

third_base was written by me in pure ruby. It's not as compatible or as
fast as home_run, and only provides the 1.8 API. It usually gives a
2-10x speedup over the 1.8 standard library version.

home_run is faster than Date::performance, even in the cases where
Date::performance is used, and Date::performance only handles a subset
of Date's functionality (and none of DateTime's). In general,
Date::performance is faster than third_base for the cases it handles,
and slower for the cases it does not.

There are a few reasons why home_run is faster, but it's mostly due to
the use of a different data structure and faster algorithms.

Jeremy
 
Jeremy Evans

Roger said:
> Sounds good. Now if we can just figure out a way to make it fully
> backward compatible and replace the one in stdlib...

I don't think it's possible to make it fully backward compatible without
making it significantly slower. To make it fully backward compatible
would require using rational, which is the main reason for the slow
performance of the stdlib version.

I do plan on bringing up the possibility of replacing the stdlib version
with home_run on the ruby-core mailing list (either as is or with changes
to increase compatibility). Currently I'm waiting to be added to the list.
It's been a few days and no response from either the controller or admin
address. I didn't have a problem signing up for the ruby-cvs list.

If anyone here admins the ruby-core list (or knows who does), could you
get them to add me?

Jeremy
 
Roger Pack

> I do plan on bringing up the possibility of replacing the stdlib version
> with home_run on the ruby-core mailing list (either as is or with changes
> to increase compatibility). Currently I'm waiting to be added to the list.
> It's been a few days and no response from either the controller or admin
> address. I didn't have a problem signing up for the ruby-cvs list.

So there's no way of "falling back" to slow operation only when somebody
performs certain operations, I guess?

> If anyone here admins the ruby-core list (or knows who does), could you
> get them to add me?

Doesn't it automatically subscribe you? It did for me...
 
Jeremy Evans

Roger said:
> So there's no way of "falling back" to slow operation only when somebody
> performs certain operations, I guess?

I suppose it's possible, but it would greatly increase the complexity.
There's quite a few cases you'd have to handle, off the top of my head,
at least:

1) Differences of less than a nanosecond
2) Ranges beyond those supported (more than 5 million years in the
future or past)
3) Offsets with fractional minutes

I wrote home_run so that it would cover all cases for 99% of rubyists.
For the 1% of rubyists that need features that the current stdlib has
and that home_run has not, they could still use the old stdlib version
(which could be moved elsewhere in the standard lib, or made a gem), or
Tadayoshi Funaba's date2 or date4 libraries
(http://www.funaba.org/en/ruby.html).

Jeremy
 
Intransition

If moving this to core is going to be considered, I would like to
suggest that it might also go beyond the current capabilities, not
just be a close subset. In particular, if you look at ActiveSupport,
there is a lot of code related to beefing up Date/Time/DateTime. With
Rails needing so much bolted on, it makes me think the core libs could
sorely use some of this functionality.
 
Yossef Mendelssohn

> If moving this to core is going to be considered, I would like to
> suggest that it might also go beyond the current capabilities, not
> just be a close subset. In particular, if you look at ActiveSupport,
> there is a lot of code related to beefing up Date/Time/DateTime. With
> Rails needing so much bolted on, it makes me think the core libs could
> sorely use some of this functionality.

I'm definitely not one to say that some core classes couldn't use a
bit of extra niceness (see http://github.com/flogic/timely for a
related example). But at the same time, I don't think that Rails and
ActiveSupport are the right places to look for what's necessary for
Ruby in general.
 
Intransition

> I'm definitely not one to say that some core classes couldn't use a
> bit of extra niceness (see http://github.com/flogic/timely for a
> related example). But at the same time, I don't think that Rails and
> ActiveSupport are the right places to look for what's necessary for
> Ruby in general.

Wouldn't it be as good a resource as any? I'm sure the timely lib is
too.

I agree these have to be looked at with a discerning eye. The
question is which extensions are designed merely for the
convenience of the framework at large, and which are very general
extensions that have been added to compensate for a clear lack of
capabilities in the language itself.
 
jonty

Thanks for this!

However, on my Windows XP 1.8.7 installation, if I do home_run install,
rubygems is broken and complains of a thread joining itself, and using
the home_run command results in the same error. I had to manually
remove the site-ruby files and folders.

Is there a workaround?
 
Jeremy Evans

jonty said:
> Thanks for this!
>
> However, on my Windows XP 1.8.7 installation, if I do home_run install,
> rubygems is broken and complains of a thread joining itself, and using
> the home_run command results in the same error. I had to manually
> remove the site-ruby files and folders.
>
> Is there a workaround?

It's a bug in rubygems:

http://github.com/jeremyevans/home_run/issues/closed#issue/13
http://rubyforge.org/tracker/index.php?func=detail&aid=28561&group_id=126&atid=575

I suppose if you want a workaround, you could copy
site_packages/1.8/date_ext.so to site_packages/date_ext.so.

Jeremy
 
