Serious YAML bug

Frantisek Fuka

Hello,

I am using YAML to store around 100KB of strings (just simple arrays
and hashes) and I found that the "unpacking" of YAML data stops after
about 4096 characters of the YAML file, without raising any errors: the
parsed data just ends in the middle of a string. It has nothing
to do with disk operations; it happens even when I read the whole YAML
file into a string and then call "YAML.parse(longstring)". Maybe it
has something to do with the fact that my strings contain many
non-ASCII characters?

Here is the file (zipped) that gives me this problem:
http://fuxoft.cz/test.yaml.zip

I am using the latest Ruby included in the Dapper Drake Linux distro and
the YAML::Syck version is "0.60".
 
James H.

What version of Ruby comes with Dapper? I installed it on a system the
other day and found that I needed to upgrade -- I believe Dapper comes
with 1.8.2 which appears to be somewhat problematic. The process for
apt-getting Ruby 1.8.4 is a little tricky, and long enough that I can't
remember it off of the top of my head.

Can you post your code?

James H
 
Frantisek Fuka

Version is: ruby 1.8.4 (2005-12-24) [i486-linux]

The code is rather simple:

str=File.open(fname).read
puts str #prints 100KB of data
data=YAML.parse(str)
puts data.emit #prints 4KB of data

However, using "data = YAML.load(File.open(fname))" (which I used
originally) produces the same error.
 
Robert Klemme

Frantisek said:
F> Version is: ruby 1.8.4 (2005-12-24) [i486-linux]
F>
F> The code is rather simple:
F>
F> str=File.open(fname).read

Note that you do not close the file descriptor properly which may cause
problems if you access that file later on in the same process. Better
do

str = File.read fname

F> puts str #prints 100KB of data
F> data=YAML.parse(str)

You probably used the wrong method:

irb(main):030:0> YAML.parse( "foo".to_yaml )
=> #<YAML::Syck::Scalar:0x488b388>
irb(main):031:0> YAML.load( "foo".to_yaml )
=> "foo"

F> puts data.emit #prints 4KB of data
F>
F> However, using "data = YAML.load(File.open(fname))" (which I used
F> originally) produces the same error.

Again, here you do not close the file handle properly. When reading
from a file you can also use

data = YAML.load_file fname

Cheers

robert
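
For anyone comparing the three calls discussed in this post, here is a minimal sketch. It assumes a current Ruby where the YAML module is backed by Psych rather than Syck, so YAML.parse returns a Psych node tree instead of the YAML::Syck::Scalar shown in the irb session. The helper name read_three_ways and the sample data are made up for illustration and stand in for the real test.yaml.

```ruby
require 'yaml'
require 'tempfile'

# Hypothetical helper: read one YAML file three ways and return the results.
def read_three_ways(path)
  # File.read opens, reads, and closes the file in one call.
  text = File.read(path, encoding: "UTF-8")
  {
    "load"      => YAML.load(text),       # plain Ruby objects
    "load_file" => YAML.load_file(path),  # same data, straight from the path
    "parse"     => YAML.parse(text)       # a node tree, not the data itself
  }
end

# Made-up sample data standing in for the real test.yaml.
sample = { "subtitles" => ["WARNER BROS. uvádí", "", "Udělala jsem chybu."] }

file = Tempfile.new(["sample", ".yaml"], encoding: "UTF-8")
file.write(sample.to_yaml)
file.close

results = read_three_ways(file.path)
puts results["load"] == sample       # data survives the round trip
puts results["load_file"] == sample  # same check via load_file
puts results["parse"].class          # a node class, not Hash
file.unlink
```

If either comparison ever prints false for your own data, that would point at the kind of silent truncation described at the top of the thread.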
 
ts

F> Version is: ruby 1.8.4 (2005-12-24) [i486-linux]

It's fixed in cvs

svg% ./ruby -v ~/b.rb
ruby 1.8.4 (2005-12-24) [i686-linux]
:subtitles
"WARNER BROS. uv\303\241d\303\255"
"Ud\304\233lala jsem chybu. V\303\255m, \305\276"
svg%

svg% ./ruby -v ~/b.rb
ruby 1.8.4 (2006-06-02) [i686-linux]
:subtitles
"WARNER BROS. uv\303\241d\303\255"
""
:saved_at
Fri May 26 19:25:06 CEST 2006
:custom_splits
{"Po \304\215lov\304\233ku, kter\303\275 mi p\305\231ipomn\304\233l p\303\241t\303\275 listopad."=>"Po \304\215lov\304\233ku, kter\303\275 mi\np\305\231ipomn\304\233l p\303\241t\303\275 listopad.", "A mn\304\233 se nest\303\275sk\303\241 po ideji, ale po \304\215lov\304\233ku."=>"A mn\304\233 se nest\303\275sk\303\241\npo ideji, ale po \304\215lov\304\233ku."}
svg%
 
Ross Bamford

Frantisek said:
F> Version is: ruby 1.8.4 (2005-12-24) [i486-linux]
F> The code is rather simple:
F> str=File.open(fname).read

Robert said:
R> Again, here you do not close the file handle properly. When reading
R> from a file you can also use
R>
R> data = YAML.load_file fname

Hmm, the behaviour I'm seeing here would suggest this to be a bug. With
load_file, sometimes it works, sometimes it doesn't. Loading the file in
Ruby and passing it in always fails.

$ ruby -v
ruby 1.8.4 (2005-12-24) [i686-linux]

$ irb -ryaml
YAML::Syck::VERSION
# => "0.60"

YAML::load(File.read('test.yaml'))
ArgumentError: syntax error on line 50, col 57: `C3\xBD, Denisi."
- ""
- "Z\xC3\xA1\xC5\x99n\xC3\xB'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from (irb):3

YAML::load_file('test.yaml')
ArgumentError: syntax error on line 50, col 57: `C3\xBD, Denisi."
- ""
- "Z\xC3\xA1\xC5\x99n\xC3\xB'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from /usr/local/lib/ruby/1.8/yaml.rb:144:in `load_file'
from /usr/local/lib/ruby/1.8/yaml.rb:143:in `load_file'
from (irb):5

YAML::load_file('test.yaml')
# => {:subtitles=>[ ... 49 elements ... ]}

YAML::load_file('test.yaml')
ArgumentError: syntax error on line 50, col 57: `C3\xBD, Denisi."
- ""
- "Z\xC3\xA1\xC5\x99n\xC3\xB'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from /usr/local/lib/ruby/1.8/yaml.rb:144:in `load_file'
from /usr/local/lib/ruby/1.8/yaml.rb:143:in `load_file'
from (irb):7

YAML::load_file('test.yaml')
# => {:subtitles=>[ ... 49 elements ... ]}

YAML::load_file('test.yaml')
# => {:subtitles=>[ ... 49 elements ... ]}

YAML::load_file('test.yaml')
# => {:subtitles=>[ ... 49 elements ... ]}
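
To put numbers on the "sometimes it works, sometimes it doesn't" behaviour, the same file can be loaded in a loop and the outcomes counted. This is only a sketch: tally_loads is a hypothetical helper, the sample file is made up, and on a current Ruby (Psych instead of Syck) I would expect every attempt to succeed, so the interesting run is the one on the affected 1.8.4 build with the real test.yaml.

```ruby
require 'yaml'
require 'tempfile'

# Hypothetical helper: load the same file `attempts` times and count
# how often it succeeds and which exception classes show up otherwise.
def tally_loads(path, attempts)
  tally = Hash.new(0)
  attempts.times do
    begin
      YAML.load_file(path)
      tally["ok"] += 1
    rescue StandardError => e
      # Syck raised ArgumentError in the session above; other parsers
      # use their own error classes, so record whatever class we get.
      tally[e.class.name] += 1
    end
  end
  tally
end

# Made-up stand-in for test.yaml.
file = Tempfile.new(["probe", ".yaml"], encoding: "UTF-8")
file.write({ "subtitles" => ["Září", ""] }.to_yaml)
file.close

p tally_loads(file.path, 20)
file.unlink
```

A tally that mixes "ok" with an error class for an unchanged file would back up the suspicion that this is a parser bug rather than a problem with the data.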
 
Frantisek Fuka

ts said:
It's fixed in cvs

Glad to hear that. How exactly does the updating/packaging work? When
can I expect this to be fixed in Ubuntu? Or can I use some sort of
workaround meanwhile? I plan to use this code on several Ubuntu
machines.
 
ts

F> Glad to hear that. How exactly does the updating/packaging work? When
F> can I expect this to be fixed in Ubuntu? Or can I use some sort of
F> workaround meanwhile? I plan to use this code on several Ubuntu
F> machines.

Well, you can install it from cvs:

cvs -d :pserver:[email protected]:/src login

Just press RETURN when it asks for a password, then

cvs -z4 -d :pserver:[email protected]:/src co -r ruby_1_8 ruby

(without -r ruby_1_8, it will retrieve ruby 1.9)

cd ruby
autoconf
./configure
make
sudo make install

it will be installed in /usr/local/bin, or you can run

./configure --prefix=some_dir

if you want to put it in some_dir
 
Frantisek Fuka

Thanks. The thing is, the code has to be used on several machines that
don't have a development environment on them (i.e. cannot compile) by
people who don't have admin rights there. I guess I'll have to wait
until it's automatically updated by Ubuntu. In the meantime, is there
some way I can get around this bug in my Ruby code? Maybe by redefining
some method in YAML?
 
ts

F> Thanks. The thing is, the code has to be used on several machines that
F> don't have development environment on them (e.g. cannot compile) by
F> people who don't have admin rights there. I guess I'll have to wait
F> until it's automatically updated by Ubuntu. In the meantime - is there
F> some way I can get around this bug in my Ruby code? Maybe redefining
F> some method in Yaml??

Well, YAML::load_file calls a C method. Perhaps the problem is in syck (the
C extension); I've not looked at it.
 
Francis Hwang

YAML in Ruby relies on Syck, which is pretty remarkable but not yet, in
my experience, stable enough to hold large amounts of data, or data
that changes frequently. And _why's page on Syck (
http://whytheluckystiff.net/syck/ ) points out that Unicode support is
also not very good yet.

If you need to stick with YAML, I'd recommend getting Ruby from CVS
head; it's supposed to contain a newer version of Syck. If you can get
away with it, though, I'd suggest using Marshal instead. It's less cool
than YAML, but pretty solid otherwise.

Francis Hwang
http://fhwang.net/
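
A rough sketch of the Marshal route suggested above, for comparison. The helper names save_marshal and load_marshal are made up, and the sample strings are borrowed from the output posted earlier in the thread; only Marshal.dump and Marshal.load are the actual library calls.

```ruby
require 'tempfile'

# Hypothetical helpers wrapping Marshal. The file is opened in binary
# mode because Marshal output is a byte stream, not text.
def save_marshal(path, data)
  File.open(path, "wb") { |f| f.write(Marshal.dump(data)) }
end

def load_marshal(path)
  File.open(path, "rb") { |f| Marshal.load(f.read) }
end

# Sample data modelled on the subtitle file discussed in this thread.
data = {
  "subtitles" => ["WARNER BROS. uvádí", ""],
  "custom_splits" => {
    "Po člověku, který mi připomněl pátý listopad." =>
      "Po člověku, který mi\npřipomněl pátý listopad."
  }
}

file = Tempfile.new("subtitles")
file.close

save_marshal(file.path, data)
restored = load_marshal(file.path)
puts restored == data  # round-trip check
file.unlink
```

The trade-off is that the file is no longer human-readable or editable by hand, the format is specific to Ruby, and Marshal.load should only be used on files you wrote yourself.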
 
wolfram

Ubuntu Dapper comes with Ruby 1.8.4. If you've installed Dapper
already, then you should simply be able to upgrade to the latest
packages with the Synaptic package manager.

I installed the Dapper Ruby packages on Breezy, because Breezy's Ruby
is 1.8.3 which was not compatible with Rails :-( To do that, add the
Dapper repository channels, update the package list and install only
the Ruby packages, then de-activate the Dapper channels again.

Wolfram
 
Frantisek Fuka

wolfram said:
Ubuntu Dapper comes with Ruby 1.8.4. If you've installed Dapper
already, then you should simply be able to upgrade to the latest
packages with the Synaptic package manager.

I'm not sure I understand you. I have Dapper Drake and I update
daily. I was asking when I can expect this fix to appear in the Dapper
repositories and reach all the "dumb users". I have no idea how often
the packages are updated, but I don't remember my ruby packages
updating very often.
 
