Monday, June 24, 2013

Taking the first hurdles

Introduction


After the first forays into OpenCV and its Java/Groovy bindings, my enthusiasm cooled considerably when I discovered that the GPU-enabled OpenCV classes do not have Java bindings yet. This is a bit of a bummer, as Groovy-bound, GPU-accelerated OpenCV routines are the holy grail we are after. Even more worrying is that Java bindings for the GPU routines did not make it into the feature list for the OpenCV 3.0 release (Q3/Q4 of 2013). The mid-term idea is to write those bindings for the most interesting examples myself and contribute them to the code base. Short term, I need to pick an interesting example to work with (this post).

A short detour


As a first reaction, I made a quick review of JCuda and JavaCV, hoping to find further insight into the Java binding issue and to check what is available there. JCuda is, more or less, a direct Java binding to the CUDA API, which means you either program the lower-level routines yourself or use the standard cuFFT or cuBLAS libraries directly. The JavaCV examples mostly target the earlier OpenCV implementation and do not include the GPU routines either, although JavaCV does seem to be able to work with OpenCL.

Other interesting developments are projects that work on direct use of the GPU from inside the JDK, such as Project Sumatra, the Rootbeer compiler and Aparapi. These projects still seem to be at a very early stage, though; I'll have a closer look at them later on.

Most of the Java binding work involves the Java Native Interface (JNI), which is more or less a translation layer between Java objects and the native methods that use those objects (in both directions). This suggests that the Java bindings for the GPU-enabled OpenCV routines should not be that different from their non-GPU versions, as the calling conventions tend to be the same. That is one for the mid-term, though.
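To make the JNI pattern concrete, here is a minimal plain-Java sketch of what such a binding declaration looks like. The method name and signature are illustrative only (not OpenCV's actual generated code): the idea is that the Java side declares a `native` method taking the native `Mat` pointers as `long`s, and a C++ glue function implements it. Declaring a native method compiles and loads fine; only invoking it requires the native library to be present.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch of the JNI pattern used by Java bindings over C++ libraries.
// The method below is hypothetical: it mimics the style of OpenCV's
// generated bindings, where Mat objects are passed as native pointers.
public class JniSketch {

    // A C++ implementation would be registered under the mangled name
    // Java_JniSketch_nativeMatchTemplate; calling this method without
    // loading that library would throw UnsatisfiedLinkError.
    private static native void nativeMatchTemplate(
            long imgAddr, long templAddr, long resultAddr, int method);

    public static void main(String[] args) throws Exception {
        // We can inspect the declaration via reflection without invoking it.
        Method m = JniSketch.class.getDeclaredMethod("nativeMatchTemplate",
                long.class, long.class, long.class, int.class);
        System.out.println("native? " + Modifier.isNative(m.getModifiers()));
        // prints: native? true
    }
}
```

The point is that the GPU-enabled routines would need exactly this kind of declaration plus a C++ glue function, which is why the non-GPU bindings should be a usable starting point.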


An interesting example


One of our more interesting CUDA applications has been template matching. In satellite image processing, template matching is typically used in the geo-referencing workflow: a newly acquired image is matched against a template taken from a known source with an exact geographical reference, in order to determine a tie point (or ground control point). OpenCV has a pretty good template matching routine, which is described in one of the examples.

The example code is converted to a Groovy script below. Note that the window management has been stripped out, as Java wrappers for the relevant Highgui window routines are not yet available either (the suggestion is, again, to work this out on your own). Make sure to check the OpenCV Java API docs to understand some subtle differences, especially in some of the method calls.
/**
 * @file MatchTemplate_Demo.groovy
 * @brief Sample code to use the function MatchTemplate, rewritten in Groovy
 * @author Guido Lemoine
 */

import org.opencv.core.*
import org.opencv.imgproc.*
import org.opencv.highgui.Highgui

// Always start with this one
LibLoader.load()

/// Load image and template
def img = Highgui.imread(args[0])
def templ = Highgui.imread(args[1])
def match_method = args[2] as int

/// Create the result matrix
int result_cols =  img.cols() - templ.cols() + 1
int result_rows = img.rows() - templ.rows() + 1

def result = new Mat(result_rows, result_cols, CvType.CV_32FC1) // note: rows first, then columns

/// Do the Matching and Normalize
Imgproc.matchTemplate( img, templ, result, match_method )
Core.normalize( result, result, 0, 1, Core.NORM_MINMAX, -1)

/// Localizing the best match with minMaxLoc
def match = Core.minMaxLoc( result )

/// For SQDIFF and SQDIFF_NORMED, the best matches are lower values. For all the other methods, the higher the better
if ( match_method  == Imgproc.TM_SQDIFF || match_method == Imgproc.TM_SQDIFF_NORMED )
  { printf "Low value match %.2f at %s%n", match.minVal, match.minLoc }
else
  { printf "High value match %.2f at %s%n", match.maxVal, match.maxLoc  }


Compile and run as before. We're trying to locate the left eye (on the right in the picture) in the face image shown below.

[image: NotSoAverageMaleFace.jpg] + [image: lefteye.jpg]

groovyc -cp ../build/bin/opencv-245.jar:. MatchTemplate_Demo.groovy

java -cp ../build/bin/opencv-245.jar:/usr/local/groovy/embeddable/groovy-all-2.1.4.jar:. -Djava.library.path=../build/lib MatchTemplate_Demo resources/NotSoAverageMaleFace.jpg resources/lefteye.jpg 0
Low value match 0.00 at {262.0, 182.0}
java -cp ../build/bin/opencv-245.jar:/usr/local/groovy/embeddable/groovy-all-2.1.4.jar:. -Djava.library.path=../build/lib MatchTemplate_Demo resources/NotSoAverageMaleFace.jpg resources/lefteye.jpg 1
Low value match 0.00 at {262.0, 182.0}
java -cp ../build/bin/opencv-245.jar:/usr/local/groovy/embeddable/groovy-all-2.1.4.jar:. -Djava.library.path=../build/lib MatchTemplate_Demo resources/NotSoAverageMaleFace.jpg resources/lefteye.jpg 2
High value match 1.00 at {394.0, 24.0}
java -cp ../build/bin/opencv-245.jar:/usr/local/groovy/embeddable/groovy-all-2.1.4.jar:. -Djava.library.path=../build/lib MatchTemplate_Demo resources/NotSoAverageMaleFace.jpg resources/lefteye.jpg 3
High value match 1.00 at {262.0, 182.0}
java -cp ../build/bin/opencv-245.jar:/usr/local/groovy/embeddable/groovy-all-2.1.4.jar:. -Djava.library.path=../build/lib MatchTemplate_Demo resources/NotSoAverageMaleFace.jpg resources/lefteye.jpg 4
High value match 1.00 at {353.0, 305.0}
java -cp ../build/bin/opencv-245.jar:/usr/local/groovy/embeddable/groovy-all-2.1.4.jar:. -Djava.library.path=../build/lib MatchTemplate_Demo resources/NotSoAverageMaleFace.jpg resources/lefteye.jpg 5
High value match 1.00 at {262.0, 182.0} 

 
The match method, given by the last parameter (range 0-5), is as in the original example. Methods 2 (CV_TM_CCORR) and 4 (CV_TM_CCOEFF) are the non-normalised versions, which fail to find the correct match location, exactly as in the example case (which uses different imagery).
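Why does the non-normalised cross-correlation miss the match? Because it simply sums products of pixel values, so a bright region can outscore the true match regardless of shape. Here is a plain-Java toy (a 1-D sketch, not OpenCV) that reproduces the effect on a tiny signal containing the true match and a brighter patch:

```java
// Toy 1-D illustration of why non-normalised cross-correlation (cf.
// TM_CCORR) can miss the true match: bright regions dominate the score,
// while squared differences (cf. TM_SQDIFF) are not fooled.
public class MatchMethodToy {

    public static void main(String[] args) {
        double[] img   = {3, 1, 2, 9, 9, 8}; // true match at offset 1, bright patch at the end
        double[] templ = {1, 2};             // the "template" we search for

        int n = img.length - templ.length + 1;
        double[] sqdiff = new double[n];     // sum of squared differences (lower = better)
        double[] ccorr  = new double[n];     // cross-correlation (higher = better)
        for (int p = 0; p < n; p++) {
            for (int i = 0; i < templ.length; i++) {
                double d = img[p + i] - templ[i];
                sqdiff[p] += d * d;
                ccorr[p]  += img[p + i] * templ[i];
            }
        }

        int bestSq = 0, bestCc = 0;
        for (int p = 1; p < n; p++) {
            if (sqdiff[p] < sqdiff[bestSq]) bestSq = p;
            if (ccorr[p]  > ccorr[bestCc])  bestCc = p;
        }
        System.out.println("SQDIFF best offset: " + bestSq); // 1 (the true match)
        System.out.println("CCORR  best offset: " + bestCc); // 3 (the bright patch)
    }
}
```

The normalised variants divide out the local image energy, which is why methods 3 and 5 recover the correct location in the runs above.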

Each run takes approximately 0.55 s on my HP EliteBook 8760w, which has an Intel® Core™ i7-2820QM CPU @ 2.30GHz × 8 and 16 GB of RAM, running 64-bit Ubuntu 12.04 LTS.

This is not bad, given the 451 by 452 pixel size of the image and the 56 by 56 pixels of the template. However, we're normally interested in finding somewhat larger templates (e.g. 256 by 256) in a large image. We'll do some scaling in the next step.

(to be continued)



