"[BUG] Segmentation fault" on import script.

sean.swolfe

Hi gang. Sorry I haven't been able to respond to my last post about
Stop Word dictionaries. Been all too busy, but the posted info was very
useful.

Anyways, I have a bit of a situation. I have a collection of files,
around 10,000, that I need to parse and then suck that data into a
database, along with their linked images. I've had a script that has
been working pretty much for over 99.5% of the articles. Both the
article data, and the images were getting imported fine.

The images also have to go through a few processing steps before being
put into the database. They are resized to meet the size constraint of
a new document format, and then resized a few more times to create 3
sizes of thumbnails. I've been using the RMagick library to do the
image resizing.

I then had to make a change, because I realized that I wasn't
accommodating animated GIFs: the resulting images that were saved
only contained the first frame. So I added a routine around the
resizing to iterate through an ImageList and resize each frame.

Now with this change I get a "[BUG] Segmentation fault" error. I
thought maybe I was starving the memory resources and the GC couldn't
keep up (I'm uncomfortably unfamiliar with how the Ruby GC works
compared to the Java or .NET GC), so I explicitly call garbage_collect
after about every 100 imports. I still get the same error.
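
For reference, the periodic collection described above could look roughly like the sketch below. The names import_article and import_all are hypothetical stand-ins, not part of the actual importer, and the batch size of 100 is taken from the description. GC.start only reclaims objects that are no longer referenced, so this is an illustration of the approach rather than a fix.

```ruby
# Hypothetical stand-in for the real per-article import step.
def import_article(path)
  path.upcase
end

# Import every path, forcing a full GC pass after each batch of
# gc_every imports. Returns the number of forced collections.
def import_all(paths, gc_every = 100)
  forced = 0
  paths.each_with_index do |path, index|
    import_article(path)
    # index is zero-based, so index + 1 is the count imported so far.
    if (index + 1) % gc_every == 0
      GC.start
      forced += 1
    end
  end
  forced
end

# 250 made-up file names: collections are forced after imports 100 and 200.
puts import_all((1..250).map { |i| "article_#{i}.html" })
```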

This happens in both Ruby 1.8.2 and 1.8.4 on Mac OS X as well as Linux.
Also, if it possibly means anything, this script is run in a Rails
environment using the runner script. It will run successfully for about
4000 articles before it bombs out.

Here is an excerpt of the possibly offending code:

# image_object is an ActiveRecord object created
# a little before this code.
# Get the ImageList from the image file.
image_file_list = Magick::ImageList.new(@old_site_path + image_path)
# Create ImageLists for the thumbnails, copied from the original ImageList.
smaller_list = image_file_list.copy
smallest_list = image_file_list.copy
tiniest_list = image_file_list.copy

# Loop through the images (frames) in the list.
for image_index in 0...image_file_list.length
  image_file = image_file_list[image_index]
  # Resize the loaded image in place to the main constraints.
  image_file.change_geometry!('150x150') do |cols, rows, img|
    img.resize!(cols, rows)
    image_object.original_x = cols
    image_object.original_y = rows
  end

  # Each thumbnail block returns a resized copy of the frame.
  smaller_list[image_index] = image_file.change_geometry('110x110') do |cols, rows, img|
    image_object.thumb_x = cols
    image_object.thumb_y = rows
    img.resize(cols, rows)
  end
  smallest_list[image_index] = image_file.change_geometry('91x91') do |cols, rows, img|
    image_object.small_thumb_x = cols
    image_object.small_thumb_y = rows
    img.resize(cols, rows)
  end
  tiniest_list[image_index] = image_file.change_geometry('50x50') do |cols, rows, img|
    image_object.tiny_thumb_x = cols
    image_object.tiny_thumb_y = rows
    img.resize(cols, rows)
  end
end

image_object.original_filename = File.basename(image_path)
image_object.title = "#{ review_data[:artist] }: #{ review_data[:title] }"

image_object.image_data = image_file_list.to_blob
image_object.tiny_thumb = tiniest_list.to_blob # <-- segmentation fault usually happens here.
image_object.big_thumb = smaller_list.to_blob
image_object.small_thumb = smallest_list.to_blob


Thanks in advance....
 
sean.swolfe

Hmm... Strange... I did try to run a GC.start after each pass, and I
now get the following errors:
ruby(13904) malloc: *** vm_allocate(size=1048576) failed (error code=3)
ruby(13904) malloc: *** error: can't allocate region
ruby(13904) malloc: *** set a breakpoint in szone_error to debug
../db/importer.rb:158: [BUG] Bus Error

I monitored the memory size and it would fluctuate between 15MB and 90MB
of physical memory. I see the physical memory grow and shrink mostly in
the 60-70MB range, but towards the end it grew to 90MB. The virtual
memory, on the other hand, starts off at about 50MB, quickly grows to
200MB, and then slowly grows to 3.51GB before the program crashes. So I
can see that there is some sort of out-of-memory issue. Do you believe
that there might be a memory leak for certain operations in RMagick,
particularly ImageLists created with the copy method?

I have built my copy of ImageMagick to use an 8-bit quantum. Most of the
pictures that are imported are GIFs, and about 15-20% are animated with
probably only 2 or 3 frames. Each GIF is roughly 8-30KB in size. And we
are talking about importing nearly 10,000 GIFs. With the animated
frames that's roughly 13,000 frames. Each of those also gets converted
to 3 smaller images, one at 110x110, one at 91x91, and one at 50x50.

I also modified things a little, and had the change_geometry method
work on the frame inside each copied ImageList (since each list is a
deep copy), rather than resizing each frame off of the original and
then placing the new frames over the existing ones. This seemed to have
the same results.

I'm wondering if there is a way to copy the ImageList object
properties, such as animation settings and such, without copying over
the actual Images? I looked in the docs, but all the copy methods
seemed to be deep copies.

I was thinking of possibly changing my application so that it would
create the smaller images from the main image the first time it is
called for, and then cache them on the filesystem. But seeing the
issues with RMagick running out of memory even when calling GC.start, I
don't think I can reliably use this method for long running web
applications. Plus I'd really like to offload all the processing at the
beginning, so the webserver will do less work when serving users.

Thanks for your help,

Sean
 
sean.swolfe

I have a temporary fix for now. I just made a bash script that runs
the import script on groups of subdirectories, rather than running it
on the whole tree. So the Ruby script will exit before the memory ever
gets too large. It's a hack, but it'll work.

Sean
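
The bash script itself was not posted. As a rough illustration of the same idea in Ruby, the sketch below runs a command once per immediate subdirectory, each time in a fresh process, so any memory that process accumulated is returned to the operating system when it exits. The method name run_per_subdirectory and the example command in the comment are assumptions, not the actual workaround.

```ruby
# Run `command` (an array of program + arguments) once for each immediate
# subdirectory of `root`, appending the subdirectory path as the last
# argument. Each run is a separate process, so its memory is released
# when it exits. Returns the subdirectories whose run did not succeed.
def run_per_subdirectory(root, command)
  failed = []
  Dir.children(root).sort.each do |name|
    dir = File.join(root, name)
    next unless File.directory?(dir)
    # system returns true on exit status 0, false on a non-zero status,
    # and nil if the command could not be executed at all.
    ok = system(*command, dir)
    failed << dir unless ok
  end
  failed
end

# Hypothetical usage; the importer path and invocation are placeholders:
#   failed = run_per_subdirectory('old_site/articles', ['ruby', 'import_one_dir.rb'])
#   puts "Failed: #{failed.join(', ')}" unless failed.empty?
```

Collecting the failed directories makes it possible to re-run only the batches that crashed instead of the whole tree.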
 
