String#de_inspect (and Kernel#suspicious)

Erik Veenstra

If you do an inspect on a collection of Ruby objects, like a
hash, you end up with a string. It's possible to store this
string in a file, read it again somewhere in the future,
evaluate it and end up with the same collection of Ruby objects
in core.

So I've written this String#de_inspect, which uses
Kernel#suspicious (slow!) to avoid any malicious code from
being evaluated.

It's a kind of human-readable marshaling. Being human-readable is
important to me in this situation.

(You can only dump objects which inspect to Ruby code, e.g.
Strings, Numerics, Symbols, Arrays, Hashes, nil, true and
false.)
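
A minimal sketch of the round trip described above, assuming the data is trusted and made only of the literal types listed. The helper names `dump_line` and `load_line` are hypothetical, and a plain `eval` like this gives no protection against malicious input.

```ruby
# Hypothetical helpers: serialize with #inspect, restore with eval.
# Only appropriate for trusted data made of plain literals.
def dump_line(obj)
  obj.inspect
end

def load_line(str)
  # Same trick as the original code: self is an anonymous module during eval.
  eval(str, Module.new.module_eval { binding })
end

data = { "name" => "journal", :ids => [1, 2.5, nil, true, false] }
line = dump_line(data)        # a single line of Ruby source text
restored = load_line(line)
puts restored == data
```

The exact text of `line` depends on the Ruby version's `inspect` format, but it should remain valid Ruby source for these types.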

I've attached the code and an example, though the example isn't
important.

Thoughts? Comments?

gegroet,
Erik V. - http://www.erikveen.dds.nl/

----------------------------------------------------------------

module Kernel
  def suspicious(*parms, &block)   # Just forget about the parms...
    Thread.new(*parms) do |*parms|
      $SAFE = 5

      block.call(*parms)
    end.value
  end
end

class String
  def de_inspect
    suspicious do
      eval(self, Module.new.module_eval{binding})
    end
  end
end

def journal(file)
  File.open(file) do |f|
    while (line = f.gets)
      yield(line.de_inspect)
    end
  end
end

journal("journal") do |x|
  p x
end

----------------------------------------------------------------
 
Robert Klemme

Erik said:
If you do an inspect on a collection of Ruby objects, like a
hash, you end up with a string. It's possible to store this
string in a file, read it again somewhere in the future,
evaluate it and end up with the same collection of Ruby objects
in core.

So I've written this String#de_inspect, which uses
Kernel#suspicious (slow!) to avoid any malicious code from
being evaluated.

It's a kind of human-readable marshaling. Being human-readable is
important to me in this situation.

(You can only dump objects which inspect to Ruby code, e.g.
Strings, Numerics, Symbols, Arrays, Hashes, nil, true and
false.)

I've attached the code and an example, though the example isn't
important.

Thoughts? Comments?

A question: what is the advantage of this over YAML?

Kind regards

robert
 
Erik Veenstra

Robert said:
A question: what is the advantage of this over YAML?

1) It's faster (see below). Probably because it uses the highly
optimized parser/lexer/whatever of the Ruby interpreter
itself. (You can turn off the suspicious mode if the data
can be trusted, which makes it faster than YAML. If
suspicious mode is enabled, it's about as fast as YAML.)

2) It uses less memory with suspicious mode turned off (see below).

3) It's small, whereas YAML is relatively huge. (Is being small
an advantage? Not necessarily, but I mention it anyway...)

4) You can store not only raw data, but code as well. (I know,
this is really DANGEROUS, like macros in Word. That's why I
introduced Kernel#suspicious.)

5) In my real situation, I raise an exception if the line read
from the journal doesn't end with \r, \n or both. A missing
line ending indicates a corrupted journal: half a line in the
journal could be valid Ruby code and, as such, appear to be
valid data. That's why I check for the "commit". (Maybe YAML
does this too. I don't know.)
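
The "commit" check in point 5 could look roughly like the sketch below. The names `CorruptJournal` and `each_committed_line` are hypothetical, not taken from the real code; the idea is only that a line without a terminator is treated as an interrupted write.

```ruby
require "stringio"

# Hypothetical error class for a journal whose last line was cut short.
class CorruptJournal < StandardError; end

# Yields each complete line (without its terminator) and raises
# if a line does not end with \n or \r.
def each_committed_line(io)
  io.each_line do |line|
    unless line.end_with?("\n", "\r")
      raise CorruptJournal, "incomplete line: #{line.inspect}"
    end
    yield line.chomp
  end
end

good = StringIO.new("[1, 2]\n12345\n")
each_committed_line(good) { |l| puts l }

# "123" is valid Ruby, but here it is a truncated write of "12345\n".
bad = StringIO.new("[1, 2]\n123")
begin
  each_committed_line(bad) { |l| puts l }
rescue CorruptJournal => e
  puts "rejected: #{e.message}"
end
```

Without the check, the truncated `123` would be evaluated and accepted as if it were the intended value.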

In my case, where the data is only accessible via a dedicated
daemon on a server, I can turn off the suspicious mode. That's
the big win.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

----------------------------------------------------------------

$ wc test.rbo test.yaml           # SAME DATA!
   3077   26739  681698 test.rbo
  29816   62709  697071 test.yaml

$ ruby test.rb 10                 # 10 times
     CPU   ELAPSED  COUNT  CPU/INSTANCE  LABEL
3.770000  4.016498      1      3.770000  :yaml
3.630000  3.862287      1      3.630000  :rbo
1.140000  1.140604      1      1.140000  :rbo_fast

$ ruby testmem.rb rbo             # Disable GC, load testset once.
VmSize: 21988 kB

$ ruby testmem.rb rbo_fast        # Disable GC, load testset once.
VmSize: 10904 kB

$ ruby testmem.rb yaml            # Disable GC, load testset once.
VmSize: 18004 kB
----------------------------------------------------------------
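
The test.rb script itself isn't shown, so the following is only a guess at how such a comparison might be set up, using the standard `benchmark` and `yaml` libraries and made-up sample data. Only the unprotected ("rbo_fast") variant is timed, because current Ruby versions no longer provide `$SAFE` levels. Timings will differ from the figures above.

```ruby
require "benchmark"
require "yaml"

# Made-up sample data: strings and integers only, so YAML.safe_load accepts it.
data = (1..200).map do |i|
  { "id" => i, "name" => "item#{i}", "tags" => ["a#{i}", "b#{i}"] }
end

rbo  = data.inspect     # the "rbo" form: the inspected string
yaml = YAML.dump(data)  # the same data as YAML

from_rbo  = nil
from_yaml = nil

Benchmark.bm(10) do |bm|
  bm.report("yaml")     { 10.times { from_yaml = YAML.safe_load(yaml) } }
  bm.report("rbo_fast") { 10.times { from_rbo  = eval(rbo) } }
end

# Both loaders should reproduce the original data.
puts from_rbo == data && from_yaml == data
```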
 
Mauricio Fernandez

Erik said:
So I've written this String#de_inspect, which uses
Kernel#suspicious (slow!) to avoid any malicious code from
being evaluated.
[...]


### code by Mr. Evil

File.open("journal", "w") do |f|
  f.puts <<-EOF.gsub("\n", ";")
    def (o=Object.new).inspect
    puts "gotcha! I'm running in $SAFE=\#{$SAFE}"
    puts "Fear my rm -rf"
    '"Just an innocent little string"'
    end
    o
  EOF
end

# back to your code
module Kernel
  def suspicious(*parms, &block)   # Just forget about the parms...
    Thread.new(*parms) do |*parms|
      $SAFE = 5

      block.call(*parms)
    end.value
  end
end

class String
  def de_inspect
    suspicious do
      eval(self, Module.new.module_eval{binding})
    end
  end
end

def journal(file)
  File.open(file) do |f|
    while (line = f.gets)
      yield(line.de_inspect)
    end
  end
end

journal("journal") do |x|
  p x
end
# >> gotcha! I'm running in $SAFE=0
# >> Fear my rm -rf
# >> "Just an innocent little string"
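
The trick in that journal line can be reduced to a few lines. This sketch assumes no sandbox at all and uses a harmless payload (the global `$payload_ran` is made up for the illustration). It only shows that the evaluated string returns an object whose `inspect` runs later, when the caller prints it.

```ruby
# One journal line that defines a singleton #inspect on a fresh object
# and then returns that object.
line = 'def (o = Object.new).inspect; $payload_ran = true; "\"harmless\""; end; o'

$payload_ran = false
obj = eval(line)     # evaluating the line only defines the method
puts $payload_ran    # the payload has not run at this point

p obj                # Kernel#p calls obj.inspect in the caller's context
puts $payload_ran    # now the payload has run
```

So even if the `eval` itself were confined, anything that later calls a method on the returned object runs attacker-supplied code outside that confinement.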
 
Robert Klemme

Erik said:
1) It's faster (see below). Probably because it uses the highly
optimized parser/lexer/whatever of the Ruby interpreter
itself. (You can turn off the suspicious mode if the data
can be trusted, which makes it faster than YAML. If
suspicious mode is enabled, it's about as fast as YAML.)

2) Memory (suspicious mode turned off) (see below).

3) It's small, whereas YAML is relatively huge. (Is being small
an advantage? Not necessarily, but I mention it anyway...)

4) You can store not only raw data, but code as well. (I know,
this is really DANGEROUS, like macros in Word. That's why I
introduced Kernel#suspicious.)

5) In my real situation, I raise an exception if the line, read
from the journal, doesn't end with \r, \n or both. This is
an indication of a corrupted journal. Half a line in the
journal could be valid Ruby code and, as such, appear to be
valid data. That's why I check for the "commit". (Maybe YAML
does this too. I don't know.)

That's quite an impressive list. I'm glad I asked.

Kind regards

robert
 
Erik Veenstra

Okay, the block was defined in SAFE mode 0... :)

In the first version, I didn't introduce Kernel#suspicious (see
below). That one worked fine.

Then I naively abstracted the thread thing, which didn't do
what I expected. Oops...

Back to version 1...

Thanks.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

----------------------------------------------------------------

File.open("journal", "w") do |f|
  f.puts "[:SAFE => $SAFE]"
end

# back to your code

class String
  def de_inspect
    Thread.new do
      $SAFE = 5

      eval(self, Module.new.module_eval{binding})
    end.value
  end
end

def journal(file)
  File.open(file) do |f|
    while (line = f.gets)
      yield(line.de_inspect)
    end
  end
end

journal("journal") do |x|
  p x
end

----------------------------------------------------------------
 
Joel VanderWerf

Erik said:
If you do an inspect on a collection of Ruby objects, like a
hash, you end up with a string. It's possible to store this
string in a file, read it again somewhere in the future,
evaluate it and end up with the same collection of Ruby objects
in core.

So I've written this String#de_inspect, which uses
Kernel#suspicious (slow!) to avoid any malicious code from
being evaluated.

It's a kind of human-readable marshaling. Being human-readable is
important to me in this situation.

(You can only dump objects which inspect to Ruby code, e.g.
Strings, Numerics, Symbols, Arrays, Hashes, nil, true and
false.)

For the object->string direction, it may be useful to use amarshal,
rather than inspect:

http://cvs.m17n.org/~akr/amarshal/

One advantage is with cyclic references. Using #inspect will not
preserve enough information to reconstruct the reference.
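
A small illustration of the cyclic-reference problem, in plain Ruby. AMarshal may not be installed, so the built-in (binary, not human-readable) Marshal is swapped in here purely to show a dump format that does keep the cycle.

```ruby
a = [1]
a << a                    # the array now contains itself

puts a.inspect            # the inner reference is printed as a placeholder

# Marshal is used only for comparison: it records the self-reference.
b = Marshal.load(Marshal.dump(a))
puts b[1].equal?(b)       # the restored array refers to itself again
```

The placeholder in the `inspect` output says that a cycle exists but not which object it points back to, so evaluating that text cannot rebuild the structure.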
 
Erik Veenstra

Joel said:
For the object->string direction, it may be useful to use
amarshal, rather than inspect:

http://cvs.m17n.org/~akr/amarshal/

One advantage is with cyclic references. Using #inspect will
not preserve enough information to reconstruct the reference.

% ruby -ramarshal -e 'AMarshal.dump([1,2,3], STDOUT)'
v = []
v[0] = Array.allocate()
v[0] << 1
v[0] << 2
v[0] << 3
v[0]

This idea is really nice, indeed.

gegroet,
Erik V. - http://www.erikveen.dds.nl/
 
