Streaming Spreadsheet in Ruby

ben

(This posting can be ignored unless you live-or-die by a *nix command
line.)

http://rubyforge.org/projects/sss/

On the other hand, have you ever wanted to do some quick math on a CSV
without waiting to launch Excel or Gnumeric? (Or you're logged in
remotely.) Maybe you'd like to print the average timestamp across all
lines in a log file? Or you might wish "cut" could split columns with a
regexp, not just a delimiter?

I just GPL'ed a project of mine for doing spreadsheet-style
calculations on the command line. It's probably easiest to explain with
an example. Say you have a sample file called "data.csv", which looks
like this:

Year,Change,TOTAL
2001,34.5,100.1
2002,36.6,101.13
2003,-11,90.5
2004,0,95

And then you call the Streaming Spreadsheet like so:

$ cat data.csv | sss 'b=sum(b)' 'c=sd(c)' 'c1="full total"'

You'll get this on standard out:

Year Change full total
2001 34.5 100.1
2002 36.6 101.13
2003 -11 90.5
2004 0 95
60.1
4.91642400531117

Note that "cell" C1 has been changed, and the last two lines have been
added: the sum of the B column and the standard deviation of the C
column. Since it's really just a tarted-up "eval," more complicated
stuff also works:

$ cat data.csv | sss 'd=(123**2.3).to_i'

On standard out:

Year Change TOTAL
2001 34.5 100.1
2002 36.6 101.13
2003 -11 90.5
2004 0 95
64088
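
For anyone who wants to check the two aggregate lines from the first example by hand, here is a minimal plain-Ruby sketch. It is not taken from the sss source: the helper names (column_values, total, sample_sd) are made up for illustration, and it assumes "sd" means the sample standard deviation (n - 1 divisor), which is the variant that reproduces the 4.916... figure above.

```ruby
# Illustrative only: these helpers are hypothetical, not part of sss.
DATA_CSV = <<~CSV
  Year,Change,TOTAL
  2001,34.5,100.1
  2002,36.6,101.13
  2003,-11,90.5
  2004,0,95
CSV

# Naive CSV split; good enough for this sample, not for quoted fields.
rows = DATA_CSV.lines.map { |line| line.chomp.split(",") }

# Return the numeric values of a column ("a", "b", ...), skipping the header row.
def column_values(rows, letter)
  idx = letter.downcase.ord - "a".ord
  rows.drop(1).map { |row| Float(row[idx]) }
end

# Plain sum of a list of floats.
def total(values)
  values.inject(0.0) { |acc, v| acc + v }
end

# Sample standard deviation (divides by n - 1); an assumption about what sd() does.
def sample_sd(values)
  mean = total(values) / values.size
  squares = values.map { |v| (v - mean)**2 }
  Math.sqrt(total(squares) / (values.size - 1))
end

b_sum = total(column_values(rows, "b"))
c_sd  = sample_sd(column_values(rows, "c"))

puts b_sum.round(2)  # 60.1
puts c_sd.round(4)   # 4.9164
```

The rounding is only there to keep floating-point noise out of the printed sum.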

The script is a moderately-clever 350 lines of Ruby (IMHO). For
instance, it takes advantage of a quirk in parsing this sort of thing:

eval("b32:c35")

This string of code ends up trying to call a method named "b32" with
one argument, the symbol ":c35". Perfect for returning a range of cells
via "def method_missing()"!
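
To make the method_missing trick concrete, here is a small sketch of how a cell-range lookup could be built on it. This is not the actual sss code: the Sheet class, the CELL pattern, and the single-letter column limit are all assumptions for illustration. It uses the spaced form "b2 :c3", which parses as a method call with a symbol argument; whether the unspaced "b2:c3" parses the same way on newer Ruby versions is not verified here.

```ruby
# Hypothetical sketch, not the sss implementation.
class Sheet
  # Cell names like "b32": one column letter, then a 1-based row number.
  CELL = /\A([a-z])(\d+)\z/

  def initialize(rows)
    @rows = rows
  end

  # "b2" alone returns one cell; "b2 :c3" returns the rectangular range.
  def method_missing(name, *args)
    from = parse_cell(name)
    return super unless from
    return cell(from) if args.empty?

    to = parse_cell(args.first) if args.size == 1 && args.first.is_a?(Symbol)
    return super unless to

    range(from, to)
  end

  def respond_to_missing?(name, include_private = false)
    !parse_cell(name).nil? || super
  end

  private

  # Turn :b2 into [row_index, col_index], or nil if it is not a cell name.
  def parse_cell(name)
    m = CELL.match(name.to_s)
    m && [m[2].to_i - 1, m[1].ord - "a".ord]
  end

  def cell(pos)
    row, col = pos
    (@rows[row] || [])[col]
  end

  # Flat list of cells in the rectangle spanned by the two corners.
  def range(from, to)
    r1, r2 = [from[0], to[0]].minmax
    c1, c2 = [from[1], to[1]].minmax
    (r1..r2).flat_map { |r| (c1..c2).map { |c| cell([r, c]) } }
  end
end

sheet = Sheet.new([
  ["Year", "Change", "TOTAL"],
  ["2001", "34.5", "100.1"],
  ["2002", "36.6", "101.13"]
])

result = sheet.instance_eval("b2 :c3")
p result  # ["34.5", "100.1", "36.6", "101.13"]
```

Anything that doesn't look like a cell name falls through to super, so ordinary typos still raise NoMethodError instead of being swallowed.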

I hope it's useful for someone else right now, but the project is very
much still in beta.

http://rubyforge.org/projects/sss/
 
David Kastrup

Looks like a case for awk to me...
 
ben

Do you really want to learn Awk's esoterica, in 2006? It doesn't do
aggregation well (SUM, MEAN), and "we know Ruby already."
 
ara.t.howard

very cool! i've done similar things many times, i'll give it a whirl today!

-a
 
Gregory Brown

Ben, this is neat. You might get a lot of implementation details for
free by building this atop Ruport.

But the simplicity is pretty awesome. Cool stuff!
 
Gregory Seidman

On Fri, Nov 24, 2006 at 04:05:38AM +0900, (e-mail address removed) wrote:
} Do you really want to learn Awk's esoterica, in 2006? It doesn't do
} aggregation well (SUM, MEAN), and "we know Ruby already."
}
} David Kastrup wrote:
[...]
} > Looks like a case for awk to me...
} >
} > --
} > David Kastrup, Kriemhildstr. 15, 44793 Bochum

I've known awk for much longer than Ruby has existed. It is excellent for
line-based, field-based data manipulation. It feels a lot like C, with a
few other niceties (like associative arrays).

The point here is that you use the right tool for the job. For this job,
awk is decidedly the right tool.

--Greg
 
M. Edward (Ed) Borasky

I used "awk" when it was the only "scripting language" available to me.
As soon as I got my hands on Perl, however, I relished the thought of
getting rid of a mess composed of a bunch of "the right tools for the
right jobs": ksh, awk, sed, cat, pipes, etc. That whole programming
style was a great one when it was new, and when there wasn't anything
better. But once there was *one* tool, Perl, that did all the jobs,
looked like a *real* Algol-like programming language, and had arrays and
hashes, there was no way I was going back.

And now that there's Ruby, I don't really even want to go back to Perl.
 
Marc Heiler

> Looks like a case for awk to me...

> The point here is that you use the right tool for the job. For
> this job, awk is decidedly the right job.

Ruby is beauty.
If you prefer awk, stick to it.
But if others want to use Ruby,
let them use Ruby.
 
