"[BUG] Segmentation fault" on import script.

Discussion in 'Ruby' started by sean.swolfe@gmail.com, Jan 31, 2006.

  1. Guest

    Hi gang. Sorry I haven't been able to respond to my last post about
    Stop Word dictionaries. Been all too busy, but the posted info was very
    useful.

    Anyway, I have a bit of a situation. I have a collection of around
    10,000 files that I need to parse and load into a database, along
    with their linked images. I have a script that has been working for
    over 99.5% of the articles: both the article data and the images
    were getting imported fine.

    The images also have to go through a few processing steps before being
    put into the database. They are resized to fit the size constraint of
    a new document format, and then resized a few more times to create
    three sizes of thumbnails. I've been using the RMagick library to do
    the image resizing.
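
    For reference, the size that a plain 'WxH' geometry such as '150x150'
    yields can be approximated in pure Ruby. This is only a sketch, not
    RMagick's implementation: fit_within and thumbnail_sizes are
    hypothetical helpers, and the rounding rule is an assumption that may
    differ from ImageMagick's by a pixel. Like a plain geometry string, it
    scales in either direction to fit the box.

```ruby
# Hypothetical helper: scale (width, height) to fit inside a
# max_width x max_height box while keeping the aspect ratio.
# Assumption: results are rounded to the nearest pixel, minimum 1.
def fit_within(width, height, max_width, max_height)
  unless [width, height, max_width, max_height].all? { |n| n > 0 }
    raise ArgumentError, "dimensions must be positive"
  end

  scale = [max_width.to_f / width, max_height.to_f / height].min
  [[(width * scale).round, 1].max, [(height * scale).round, 1].max]
end

# Sizes for the three thumbnail boxes used by the import script.
def thumbnail_sizes(width, height,
                    boxes = { thumb: [110, 110], small_thumb: [91, 91], tiny_thumb: [50, 50] })
  boxes.map { |name, (w, h)| [name, fit_within(width, height, w, h)] }.to_h
end
```

    Computing the sizes separately like this makes it possible to check
    the stored dimensions without loading any image data.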

    I then had to make a change, because I realized that I wasn't
    accommodating animated GIFs: the resulting images that were saved
    only contained the first frame. So I added a routine around the
    resizing that iterates through an ImageList and does all the resizing
    for each frame.

    Now with this change I get a "[BUG] Segmentation fault" error. I
    thought maybe I was starving the memory resources and the GC couldn't
    keep up. (I'm uncomfortably unfamiliar with how the Ruby GC works as
    opposed to Java or .NET GC), so I explicitly call garbage_collect after
    about 100 imports. Still get the same error.
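
    The periodic collection described above might look like the sketch
    below. each_with_periodic_gc and its every: and collector: parameters
    are hypothetical names; GC.start is the standard Ruby call. A forced
    collection only reclaims objects that are no longer referenced, so it
    cannot help if something still holds on to the images.

```ruby
# Hypothetical helper: yield each item, forcing a GC run after every
# `every` items. `collector` is injectable so the behaviour can be tested.
def each_with_periodic_gc(items, every: 100, collector: GC.method(:start))
  raise ArgumentError, "every must be positive" unless every > 0

  items.each_with_index do |item, index|
    yield item
    collector.call if ((index + 1) % every).zero?
  end
end
```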

    This happens in both Ruby 1.8.2 and 1.8.4, on Mac OS X as well as
    Linux. Also, in case it matters, this script is run in a Rails
    environment using the runner script. It runs successfully for about
    4000 articles before it bombs out.

    Here is an excerpt of the possibly offending code:
    # image_object is an ActiveRecord object created
    # a little before this code.
    # Get the ImageList from the image file.
    image_file_list = Magick::ImageList.new(@old_site_path + image_path)
    # Create ImageLists for the thumbnails, copied from the original ImageList.
    smaller_list = image_file_list.copy
    smallest_list = image_file_list.copy
    tiniest_list = image_file_list.copy

    # Loop through the images (frames) in the list.
    for image_index in 0...image_file_list.length
      image_file = image_file_list[image_index]
      # Resize the loaded image in place to the main constraints.
      image_file.change_geometry!('150x150') do |cols, rows, img|
        img.resize!(cols, rows)
        image_object.original_x = cols
        image_object.original_y = rows
      end

      # Each block below returns a resized copy, which replaces the
      # corresponding frame in the thumbnail list.
      smaller_list[image_index] = image_file.change_geometry('110x110') do |cols, rows, img|
        image_object.thumb_x = cols
        image_object.thumb_y = rows
        img.resize(cols, rows)
      end
      smallest_list[image_index] = image_file.change_geometry('91x91') do |cols, rows, img|
        image_object.small_thumb_x = cols
        image_object.small_thumb_y = rows
        img.resize(cols, rows)
      end
      tiniest_list[image_index] = image_file.change_geometry('50x50') do |cols, rows, img|
        image_object.tiny_thumb_x = cols
        image_object.tiny_thumb_y = rows
        img.resize(cols, rows)
      end
    end

    image_object.original_filename = File.basename(image_path)
    image_object.title = "#{ review_data[:artist] }: #{ review_data[:title] }"

    image_object.image_data = image_file_list.to_blob
    image_object.tiny_thumb = tiniest_list.to_blob # <-- segmentation fault usually happens here
    image_object.big_thumb = smaller_list.to_blob
    image_object.small_thumb = smallest_list.to_blob


    Thanks in advance....
    Jan 31, 2006
    #1

  2. Guest

    Hmm... Strange... I did try to run a GC.start after each pass, and I
    now get the following errors:
    ruby(13904) malloc: *** vm_allocate(size=1048576) failed (error code=3)
    ruby(13904) malloc: *** error: can't allocate region
    ruby(13904) malloc: *** set a breakpoint in szone_error to debug
    ../db/importer.rb:158: [BUG] Bus Error

    I monitored the memory size, and physical memory fluctuated between
    15MB and 90MB. It grew and shrank mostly in the 60-70MB range, but
    towards the end grew to 90MB. The virtual memory, on the other hand,
    starts off at about 50MB, quickly grows to 200MB, and then slowly
    grows to 3.51GB before the program crashes. So I can see that there
    is some sort of out-of-memory issue. Do you believe that there might
    be a memory leak in certain RMagick operations, particularly with
    ImageLists created with the copy method?
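
    One way to rule out delayed cleanup is to release image memory
    explicitly rather than waiting for the GC. The sketch below assumes an
    RMagick version whose Image and ImageList objects respond to destroy!
    (later RMagick releases document such a method; the versions used in
    this thread may not have it). with_released is a hypothetical helper,
    and its respond_to? guard makes it a no-op where destroy! is missing.

```ruby
# Hypothetical helper: run the block, then explicitly release every
# resource that offers destroy!, even if the block raised.
def with_released(*resources)
  yield(*resources)
ensure
  resources.each do |resource|
    resource.destroy! if resource.respond_to?(:destroy!)
  end
end

# Usage sketch (requires RMagick; `path` is a placeholder):
#   with_released(Magick::ImageList.new(path)) do |frames|
#     image_object.image_data = frames.to_blob
#   end
```

    Blobs returned by to_blob are ordinary Ruby strings, so they remain
    usable after the lists that produced them have been released.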

    I have built my copy of ImageMagick to use an 8-bit quantum. Most of
    the pictures that are imported are GIFs, and about 15-20% are animated,
    probably with only 2 or 3 frames. Each GIF is roughly 8-30KB in size,
    and we are talking about importing nearly 10,000 GIFs. With the
    animated frames, that's roughly 13,000 frames. Each of those also
    gets converted to 3 smaller images: one at 110x110, one at 91x91,
    and one at 50x50.

    I also modified things a little and had the change_geometry method
    work on the frames inside each copied ImageList (since each list is a
    deep copy), rather than resizing each frame from the original and then
    placing the new frames over the existing ones. This seemed to give
    the same results.

    I'm wondering if there is a way to copy the ImageList object's
    properties, such as the animation settings, without copying over the
    actual Images? I looked in the docs, but all the copy methods seemed
    to be deep copies.

    I was thinking of possibly changing my application so that it would
    create the smaller images from the main image the first time they are
    requested, and then cache them on the filesystem. But seeing the
    issues with RMagick running out of memory even when calling GC.start, I
    don't think I can reliably use this approach in long-running web
    applications. Plus I'd really like to do all the processing up front,
    so the web server does less work when serving users.

    Thanks for your help,

    Sean
    Feb 2, 2006
    #2

  3. Guest

    I have a temporary fix for now. I made a bash script that runs the
    Ruby script on groups of subdirectories, rather than on the whole
    tree, so each run exits before the memory ever gets too large. It's
    a hack, but it'll work.
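
    The same workaround can also be sketched in Ruby instead of bash:
    split the subdirectories into groups and hand each group to a fresh
    process, so its memory goes back to the OS when that process exits.
    in_batches, import_in_batches, the batch size, and the command passed
    in are all hypothetical; the original bash script was not posted.

```ruby
# Hypothetical helper: group directory names into batches of `size`.
def in_batches(dirs, size)
  raise ArgumentError, "size must be positive" unless size > 0

  dirs.sort.each_slice(size).to_a
end

# Run one child process per batch. `command` is the argv prefix of the
# importer (an assumption); the batch's directories are appended to it.
# `runner` defaults to Kernel#system and is injectable for testing.
def import_in_batches(dirs, size, command, runner: ->(argv) { system(*argv) })
  in_batches(dirs, size).each do |batch|
    ok = runner.call(command + batch)
    raise "import failed for: #{batch.join(', ')}" unless ok
  end
end
```

    A call might look like import_in_batches(dirs, 20, ["ruby",
    "import_group.rb"]), where the script name is a placeholder. Stopping
    at the first failed batch makes it easy to see which group of
    directories triggers the crash.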

    Sean
    Feb 2, 2006
    #3
