[In this Intel-sponsored feature, part of the Gamasutra Visual Computing microsite, Lightspeed Publishing's Lee Purcell lays out deferred mode image processing, a new addition to the Intel Integrated Performance Primitives Library, which speeds up complex image-processing tasks with up to 3X performance increases.]
Wherever
you look, the graphical resolution of commonly used digital image
formats is steadily increasing, resulting in larger file sizes
and more intensive processing requirements. In several
fields of image processing (digital photography, high-definition digital
moviemaking, medical diagnostics, surveillance imaging, and others), frame
sizes are increasing substantially.
In the case of digital video
formats, such as Cinema 2K and 4K, the color space is also being expanded,
further increasing the file sizes. File sizes for Cinema 4K content can
be as much as one terabyte per hour of video. On the other end of
the scale, even mobile handheld devices routinely capture images that
can be several megapixels in size. With image sizes of this magnitude,
fresh approaches are needed to maintain performance when
manipulating and processing image data.
In response
to a requirement from a strategic Intel customer working with
large-scale computed tomography images, Intel software engineers
began conceptualizing a framework for more efficiently using the
extensive library of image-processing algorithms available in the Intel
Integrated Performance Primitives (Intel IPP)
library.
The resulting
solution, which is featured in
the Intel IPP version 6.0 release, is called deferred mode image
processing (DMIP). DMIP effectively handles large image data arrays that
don't fit entirely within the processor L2 cache.
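To make the cache constraint concrete, the following is a minimal sketch of how a portion size might be chosen so that the working set stays within a cache budget. It is not taken from DMIP; the function name `rows_per_slice` and its parameters are hypothetical, and the cache budget is whatever figure the developer assumes for the target processor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Intel IPP or DMIP): how many image rows
// can be processed at once if `live_buffers` row-aligned buffers of that
// height must all fit inside `cache_bytes`.
std::size_t rows_per_slice(std::size_t cache_bytes,
                           std::size_t width_px,
                           std::size_t bytes_per_px,
                           std::size_t live_buffers) {
    assert(width_px > 0 && bytes_per_px > 0 && live_buffers > 0);
    // Bytes needed to hold one row in every live buffer.
    const std::size_t bytes_per_row = width_px * bytes_per_px * live_buffers;
    // Always process at least one row, even if a single row exceeds the budget.
    return std::max<std::size_t>(1, cache_bytes / bytes_per_row);
}
```

For a 4,096-pixel-wide image of 4-byte samples with three live buffers and an assumed 2 MB budget, this yields 42 rows per portion, whereas a full 4,096-row image of the same format would occupy 64 MB per buffer.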
DMIP, now an
integral part of the Intel IPP package, performs pipelined
sequences of fast functions to process image data in manageable
portions, whether organized by tile, block, slice, or another
element. This approach effectively combines the benefits of pipelined
processing with the manually optimized code of the Intel IPP library.
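The difference between whole-image passes and pipelined portions can be sketched in a few lines. The example below is an illustration only and does not use the DMIP or Intel IPP APIs: `scale_stage` and `clamp_stage` are toy stand-ins for library primitives, and `process_whole` and `process_sliced` are hypothetical names. Both paths produce the same output, but the sliced version needs an intermediate buffer only as large as one slice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy per-pixel stages standing in for optimized library primitives.
inline void scale_stage(const std::uint8_t* src, float* dst, std::size_t n, float gain) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = src[i] * gain;
}
inline void clamp_stage(const float* src, std::uint8_t* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, src[i])));
}

// Non-pipelined: every stage sweeps the whole image, so the intermediate
// buffer is as large as the image itself.
std::vector<std::uint8_t> process_whole(const std::vector<std::uint8_t>& img, float gain) {
    std::vector<float> tmp(img.size());
    std::vector<std::uint8_t> out(img.size());
    scale_stage(img.data(), tmp.data(), img.size(), gain);
    clamp_stage(tmp.data(), out.data(), img.size());
    return out;
}

// Pipelined by slice: all stages run on a few rows before moving on, so the
// intermediate buffer holds only `slice_rows` rows and can stay cache-resident.
std::vector<std::uint8_t> process_sliced(const std::vector<std::uint8_t>& img,
                                         std::size_t width, std::size_t height,
                                         std::size_t slice_rows, float gain) {
    assert(img.size() == width * height && slice_rows > 0);
    std::vector<float> tmp(width * slice_rows);
    std::vector<std::uint8_t> out(img.size());
    for (std::size_t y = 0; y < height; y += slice_rows) {
        const std::size_t rows = std::min(slice_rows, height - y);  // last slice may be short
        const std::size_t n = rows * width;
        const std::size_t offset = y * width;
        scale_stage(img.data() + offset, tmp.data(), n, gain);
        clamp_stage(tmp.data(), out.data() + offset, n);
    }
    return out;
}
```

Per-pixel stages like these slice trivially. Neighborhood operations such as filters would also need border rows from adjacent slices, which this sketch does not attempt to handle.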
A directed acyclic graph (DAG) defines inputs (from image data
sources), outputs (destination images or data destined for memory),
and operations, and represents each as a node on the graph. These
nodes correspond to image-processing functions and their
inputs and outputs. For operations that can be handled concurrently,
parallel threads are generated to enhance performance. Using DMIP can
accelerate image-processing tasks by a factor of 1.5 to 3
compared to a non-pipelined approach.
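The graph idea can be illustrated with a small, generic sketch. Nothing here reflects the actual DMIP classes: `Node`, `evaluate`, `source`, `scale`, and `add` are hypothetical names, and the evaluator simply walks the graph in dependency order (Kahn's algorithm) on a single thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

using Image = std::vector<float>;

// One graph node: the indices of the nodes it reads from, plus the operation.
struct Node {
    std::vector<std::size_t> inputs;
    std::function<Image(const std::vector<const Image*>&)> op;
};

// Evaluate every node once all of its inputs are available (Kahn's algorithm).
std::vector<Image> evaluate(const std::vector<Node>& graph) {
    const std::size_t n = graph.size();
    std::vector<std::size_t> pending(n);            // inputs not yet computed
    std::vector<std::vector<std::size_t>> users(n); // nodes that consume each node
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < n; ++i) {
        pending[i] = graph[i].inputs.size();
        for (std::size_t in : graph[i].inputs) {
            assert(in < n);
            users[in].push_back(i);
        }
        if (pending[i] == 0) ready.push(i);         // sources have no inputs
    }
    std::vector<Image> results(n);
    std::size_t done = 0;
    while (!ready.empty()) {
        const std::size_t i = ready.front();
        ready.pop();
        std::vector<const Image*> args;
        for (std::size_t in : graph[i].inputs) args.push_back(&results[in]);
        results[i] = graph[i].op(args);
        ++done;
        for (std::size_t u : users[i])
            if (--pending[u] == 0) ready.push(u);
    }
    assert(done == n);  // a cycle would leave some nodes unevaluated
    return results;
}

// Hypothetical node builders for the example.
Node source(Image data) {
    return {{}, [data](const std::vector<const Image*>&) { return data; }};
}
Node scale(std::size_t in, float k) {
    return {{in}, [k](const std::vector<const Image*>& a) {
        Image r(*a[0]);
        for (float& v : r) v *= k;
        return r;
    }};
}
Node add(std::size_t lhs, std::size_t rhs) {
    return {{lhs, rhs}, [](const std::vector<const Image*>& a) {
        assert(a[0]->size() == a[1]->size());
        Image r(*a[0]);
        for (std::size_t i = 0; i < r.size(); ++i) r[i] += (*a[1])[i];
        return r;
    }};
}
```

In a graph with one source feeding two `scale` nodes whose results meet in an `add` node, the two `scale` nodes do not depend on each other. Those are the kind of operations a scheduler could hand to separate threads, although this sketch runs them one after another.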
One key benefit: DMIP provides formula-level access to the vast
library of Intel IPP functions. Within the Intel IPP version 6.0
release, developers can choose from among thousands of
C functions that encompass a large span of data operations.
Developers unfamiliar with the full range of options in the Intel IPP library can
sometimes be discouraged from employing the algorithms in their
applications. By removing the need to focus on the details of low-level
programming, DMIP simplifies access to library functions, letting
developers integrate advanced, proven routines into their code and take
advantage of data alignment performance gains, particularly on Intel
processors. These gains typically result in significantly faster instruction
processing times for aligned data, commonly achieving speed
increases of two to three times.
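For readers who have not dealt with alignment by hand, the sketch below shows roughly what it involves: padding each image row to a fixed boundary so every row starts at an aligned address. It uses only standard C++ and is not the Intel IPP allocator; the `AlignedImage` class, the `padded_stride` helper, and the 64-byte boundary are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

constexpr std::size_t kAlign = 64;  // assumed alignment boundary in bytes

// Round a row length up to the next multiple of kAlign.
inline std::size_t padded_stride(std::size_t row_bytes) {
    return (row_bytes + kAlign - 1) / kAlign * kAlign;
}

// Hypothetical image buffer whose rows all begin on a kAlign-byte boundary.
class AlignedImage {
public:
    AlignedImage(std::size_t width_bytes, std::size_t height)
        : stride_(padded_stride(width_bytes)),
          storage_(stride_ * height + kAlign) {  // slack to shift the start
        void* p = storage_.data();
        std::size_t space = storage_.size();
        // std::align advances p to the first suitably aligned address.
        data_ = static_cast<std::uint8_t*>(std::align(kAlign, stride_ * height, p, space));
        assert(data_ != nullptr);
    }
    // Copying would leave data_ pointing into the other object's storage.
    AlignedImage(const AlignedImage&) = delete;
    AlignedImage& operator=(const AlignedImage&) = delete;

    std::uint8_t* row(std::size_t y) { return data_ + y * stride_; }
    std::size_t stride() const { return stride_; }

private:
    std::size_t stride_;                 // bytes between consecutive row starts
    std::vector<std::uint8_t> storage_;  // owns the memory
    std::uint8_t* data_ = nullptr;       // aligned start inside storage_
};
```

A 1,000-byte-wide image ends up with a 1,024-byte stride here, so a small amount of memory per row is traded for aligned row starts.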