Extract a bordered, skewed rectangle from an image

Discussion in 'Python' started by Paul Hemans, May 7, 2010.

  1. Paul Hemans

    Paul Hemans Guest

    We have a scanned document on which a label has been attached. The label has
    been designed to have a border that makes it easy to determine the correct
    orientation and area of the label. The label portion of the scanned image
    needs to be extracted and deskewed as an image. The contents of the label
    will change, but the border won't.
    I originally posted this onto RentAcoder as a project, but I am not getting
    a lot of responses. It might be that I requested it be done in Python, it's
    too hard, or I am too stingy. You can see the project here:

    It may not be feasible to do this project without the use of an image
    processing engine such as openCV. There is a routine in openCV called
    cvMinAreaRect2() that may do the job of returning a matching rectangle that
    is inclined. There is a Python to openCV interface available. So I think all
    the pieces are there, but this is out of my league as I have had very little
    experience with image processing.

    I am wondering whether there are any people here that have experience with
    openCV and Python. If so, could you either give me some pointers on how to
    approach this, or if you feel so inclined, bid on the project. There are 2
    problems:
    How do I get openCV to ignore the contents of the label and just focus on
    the border?
    How to do this through Python into openCV? I am a newbie to Python, not
    strong in Maths and ignorant of the usage of openCV.

    Paul Hemans, May 7, 2010

  2. David Bolen

    David Bolen Guest

    "Paul Hemans" <> writes:

    > I am wondering whether there are any people here that have experience with
    > openCV and Python. If so, could you either give me some pointers on how to
    > approach this, or if you feel so inclined, bid on the project. There are 2
    > problems:

    Can't offer actual services, but I've done image tracking and object
    identification in Python with OpenCV so can suggest some approaches.

    You might also try the OpenCV mailing list, though it sometimes
    varies wildly in terms of S/N ratio.

    And for OpenCV specifically, I definitely recommend the book "Learning
    OpenCV" from O'Reilly. It's really hard to grasp the concepts and
    applications of the raw OpenCV calls from the API documentation, and I
    found the book (albeit not cheap) helped me out tremendously and was
    well worth it.

    I'll flip the two questions since the second is quicker to answer.

    > How to do this through Python into openCV? I am a newbie to Python, not
    > strong in Maths and ignorant of the usage of openCV.

    After trying a few wrappers, the bulk of my experience is with the
    ctypes-opencv wrapper and OpenCV 1.x (either 1.0 or 1.1pre). Things
    change a lot with the recent 2.x (which needs C++ wrappers), and I'm
    not sure the various wrappers are as stable yet. So if you don't have
    a hard requirement for 2.x, I might suggest at least starting with 1.x
    and ctypes-opencv, which is very robust, though I'm a little biased as
    I've contributed code to the wrapper.

    > How do I get openCV to ignore the contents of the label and just focus on
    > the border?

    There's likely no single answer, since multiple mechanisms for
    identifying features in an image exist, and you can also derive
    additional heuristics based on your own knowledge of the domain space
    (your own images). Without knowing exactly how the border is designed
    to make it easy to detect, it's hard to say anything definitive.

    But in broad strokes, you'll often:

    1. Normalize the image in some way. This can be to adjust for
    brightness from various scans to make later processing more
    consistent, or to switch spaces (to make color matching more
    effective) or even to remove color altogether if it just
    complicates matters. You may also mask off entire portions of the
    image if you have information that says they can't possibly be
    part of what you are looking for.
    2. Attempt to remove noise. Even when portions of an image look
    like a solid color, at the pixel level there can be many different
    variations in pixel values. Operations such as blurring or
    smoothing help to average out those values and simplify matching
    entire regions.
    3. Attempt to identify the regions or features of interest. Here's
    where a ton of algorithms may apply depending on your needs, but the
    simplest form to start with is basic color matching. For edge
    detection (like of your label) convolutions (such as gradient
    detection) might also be ideal.
    4. Process identified regions to attempt to clean them up, if
    possible weakening regions likely to be extraneous, and
    strengthening those more likely to be correct. Morphology
    operations are one class of processing likely to help here.
    5. Select among features (if more than one) to identify the best
    match, using any knowledge you may have that can be used to
    rank them (e.g., size, position in image, etc...)

    My own processing is ball tracking in motion video, so I have some
    additional data in terms of adjacent frames that helps me remove
    static background information and minimize the regions under
    consideration for step 3, but a single image probably won't have
    that. But given that you have scanned documents, there may be other
    simplifying rules you can use, like eliminating anything too white or
    too black (depending on label color).

    My own flow works like:

    1. Normalize each frame

    1. Blur the frame (cvSmooth with CV_BLUR, 5x5 matrix). This
    smooths out the pixel values, improving the color conversion.
    2. Balance brightness (in RGB space). I ended up just offsetting
    the image a fixed (x,x,x) value to maximize the RGB values.
    Found it worked better doing it in RGB before Lab conversion.
    3. Convert the image to the "Lab" color space. I used Lab because
    the conversion process was fastest, but when frame rate isn't
    critical, HLS is likely better since hue/saturation are
    completely separate from lightness, which makes for easier color
    matching.

    2. Identify uninteresting regions in the current frame

    This may not apply to you, but here is where I mask out static
    information from prior background frames, based on difference
    calculations with the current frame, or very dark areas that I
    knew couldn't include what I was interested in.

    In your case, for example, if you know the label is going to show
    up fairly saturated (say it's a solid red or something), you could
    probably eliminate everything that is below a certain saturation
    level. Or if they are black and white documents, but the label has
    a color, it might be very easy to filter out everything but the
    label.

    If you're lucky, some simple heuristics applied here might have the
    net effect of masking the majority of your document image away,
    leaving primarily the label.

    3. Color matching

    1. Mask off regions of the image not falling within a specific Lab
    pixel range, sufficient to encompass my object under a variety of
    lighting/camera conditions. I typically use cvInRangeS to set
    the mask bits for pixels within the range.
    2. Perform a morphology pass - cvMorphologyEx against the mask as
    CV_MOP_CLOSE. A close applies a dilation followed by an erosion.
    The dilation combines nearby features with each other and fills
    small gaps, and the erosion then shrinks the result back toward
    its original size. (The reverse order, an erosion followed by a
    dilation, is CV_MOP_OPEN, which is the operation that removes
    very small, likely unnecessary matches.) The net effect of the
    close is to strengthen larger matched areas and help them become
    contiguous.

    Note in my case I was looking for a relatively solid color ball (it
    had gaps since it was a whiffle ball), so if, for example, your
    label is alternating colors, or dashed lines or something like that
    it might not work as well. There are more complicated algorithms
    that can match more elaborate patterns, sometimes with initial
    training on target images.

    4. Object selection

    1. Locate all top level contours of any remaining solid areas
    in the mask (cvFindContours). This will identify connected
    areas in the mask, so in your case, ideally one of the located
    contours would be the label edge. This does assume that your
    feature identification in the prior step is likely to create
    contiguous areas. Even just a few pixels of gaps will net a
    non-closed contour which is harder to work with, though the
    morphology operation will sometimes close those gaps.
    2. Evaluate "best" contour when multiple choices exist. Very small
    areas are eliminated, and remaining areas are evaluated for
    average Lab value distance from a target point (somewhat
    arbitrarily chosen at this point to represent the "ideal" ball).
    The nearest (in color distance) contour is picked, except in the
    case of two "close" contours where the further contour can win
    if it is at least 4x (arbitrarily chosen) as large. In your
    case, for example, any contours located within the label itself
    would necessarily be smaller than the label, so you could
    probably just pick largest. Also, when calling cvFindContours
    you can prevent it from finding "interior" contours.
    3. Compute and return a minimum bounding circle (center, radius)
    for the selected contour. In your case, you'd likely just use
    the contour itself - you can use the contour (with 'n' line
    segments) as is, or convert into an approximate polygon.

    The nice thing about Python with OpenCV is the interactive
    experimentation you can do right in the interpreter. Open a highgui
    window, load in your image and then experiment. After performing
    various processes, just quickly show the new image in the existing or
    a new window. You can keep several windows up to date as you run
    an image through several transforms to see the results.

    Hope this at least gives you some thoughts as to how to proceed.

    -- David
    David Bolen, May 7, 2010

  3. Paul Hemans

    Paul Hemans Guest

    Thanks David, that is a 'tonne' of information. I am going to have a play
    with it. Masking out the contents of the label and finding the label
    border within the scanned document is probably the place to start.
    Looks like there is going to be a learning curve here.

    Thanks again for your help; you really put a lot of effort into this.
    Paul Hemans, May 9, 2010