Java Imaging
Basics
I’ve been trying out the imaging libraries available for Java for the past couple of days. I’m happy to say that I’ve liked what I’ve seen thus far. I’m sticking to this platform to develop the server-side technology for my project.
The first task was to install the latest and greatest from Sun. This
included Netbeans 5.5 and JDK 6 available as a single
package. I also had to get the media libraries,
which aren’t part of the standard Java kit for some reason. I had
absolutely no problems installing all of this on a stock Debian Linux
system. Now that Java is Free, in the future all of this should
just be an apt-get away.
Netbeans is a kick-ass environment to develop with. I really like how unit testing is an integral part of the development process. Unlike Eclipse, Netbeans is not sluggish at all on a machine with 512 MB of RAM.
Experiment
I didn’t want to solve “toy problems” to test this environment. I decided to write a general image magnification function. Once again, instead of basing this on a simple super-sampling interpolation algorithm, I decided to try something more sophisticated.
In signal processing, a simple way to “zoom” into signals is by
padding the frequency spectra of the signal with zeros. This method of
zooming reduces the computational complexity of the transform from
O(N^2) to O(N log N) for N samples,
due to the use of the Fast Fourier Transform.
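The saving comes from splitting each transform into two half-size transforms and recombining them. Here is a minimal recursive radix-2 sketch in plain Java to show the idea. It is not the routine JAI uses, and the class and method names are made up for illustration.

```java
final class Radix2Fft {

    /**
     * Forward FFT of a complex signal given as separate real and imaginary
     * arrays. The length must be a power of two. Returns {re, im}, unscaled.
     */
    static double[][] fft(double[] re, double[] im) {
        int n = re.length;
        if (n == 1) {
            return new double[][] {{re[0]}, {im[0]}};
        }
        if (n % 2 != 0) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        int h = n / 2;
        double[] evenRe = new double[h], evenIm = new double[h];
        double[] oddRe = new double[h], oddIm = new double[h];
        for (int i = 0; i < h; i++) {
            evenRe[i] = re[2 * i];
            evenIm[i] = im[2 * i];
            oddRe[i] = re[2 * i + 1];
            oddIm[i] = im[2 * i + 1];
        }
        // Two half-size transforms: this recursion is where N log N comes from.
        double[][] e = fft(evenRe, evenIm);
        double[][] o = fft(oddRe, oddIm);

        double[] outRe = new double[n], outIm = new double[n];
        for (int k = 0; k < h; k++) {
            // Twiddle factor exp(-2*pi*i*k/n) applied to the odd half.
            double a = -2.0 * Math.PI * k / n;
            double wr = Math.cos(a), wi = Math.sin(a);
            double tr = wr * o[0][k] - wi * o[1][k];
            double ti = wr * o[1][k] + wi * o[0][k];
            outRe[k] = e[0][k] + tr;
            outIm[k] = e[1][k] + ti;
            outRe[k + h] = e[0][k] - tr;
            outIm[k + h] = e[1][k] - ti;
        }
        return new double[][] {outRe, outIm};
    }
}
```

Calling `Radix2Fft.fft(new double[] {1, 1, 1, 1}, new double[4])` puts all the energy in the DC bin, which is a quick sanity check.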
Outline
We’ll assume that the image dimensions are always a power of two. Without loss of generality, we’ll also assume that the width and height are the same. Here’s how you’d use the FFT to do the zooming. I’ve also included the Matlab/Octave command equivalents in brackets to make it easier to understand:
- Take the 2D FFT of the image. (fft2)
- Make sure the DC component is in the center. If not, shift-cycle it. (fftshift)
- Add a border of zeros around the image. The final dimensions of the image are still a power of two.
- Take the 2D inverse FFT of the image. (ifft2) Again, you might have to shift-cycle the image as in step 2.
- This is the zoomed-in image.
In programs like Matlab, finite floating-point precision leaves tiny non-zero imaginary parts in the inverse transform. These can be safely ignored.
Step three is the key step. It is equivalent to inserting a block of zeros in the middle of the unshifted frequency spectrum, where the highest frequencies live. You could insert the pad there directly, but as we’ll see in the next section, it’s simpler to shift the spectrum and add a border of zeros around it.
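The steps of the outline can be sketched in one dimension with plain Java. To keep it short, this sketch swaps in a naive O(N^2) DFT for the FFT and works on a 1D signal rather than an image. ZeroPadZoom and its methods are hypothetical names, not part of JAI or Matlab.

```java
final class ZeroPadZoom {

    /** Naive unscaled DFT. sign = -1 is forward, +1 is inverse. Returns {re, im}. */
    static double[][] dft(double[] re, double[] im, int sign) {
        int n = re.length;
        double[] outRe = new double[n];
        double[] outIm = new double[n];
        for (int k = 0; k < n; k++) {
            for (int t = 0; t < n; t++) {
                double a = sign * 2.0 * Math.PI * k * t / n;
                outRe[k] += re[t] * Math.cos(a) - im[t] * Math.sin(a);
                outIm[k] += re[t] * Math.sin(a) + im[t] * Math.cos(a);
            }
        }
        return new double[][] {outRe, outIm};
    }

    /** Cyclic shift: element i moves to position (i + s) mod n. */
    static double[] shift(double[] x, int s) {
        int n = x.length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[(i + s) % n] = x[i];
        }
        return out;
    }

    /** Adds p zeros on each side of x. */
    static double[] pad(double[] x, int p) {
        double[] out = new double[x.length + 2 * p];
        System.arraycopy(x, 0, out, p, x.length);
        return out;
    }

    /** Doubles the sampling rate of a real signal of even length. */
    static double[] zoom2x(double[] signal) {
        int n = signal.length;
        int m = 2 * n;
        double[][] spec = dft(signal, new double[n], -1);      // step 1: forward transform
        double[] re = pad(shift(spec[0], n / 2), n / 2);        // steps 2 and 3: center DC, add zeros
        double[] im = pad(shift(spec[1], n / 2), n / 2);
        double[][] out = dft(shift(re, m / 2), shift(im, m / 2), +1); // step 4: shift back, invert
        double[] result = new double[m];
        for (int i = 0; i < m; i++) {
            // The usual 1/m inverse scaling times the zoom factor m/n gives 1/n.
            // The imaginary part is only rounding noise and is dropped.
            result[i] = out[0][i] / n;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] s = new double[8];
        for (int t = 0; t < 8; t++) {
            s[t] = Math.cos(2 * Math.PI * t / 8);
        }
        System.out.println(java.util.Arrays.toString(zoom2x(s)));
    }
}
```

For a signal with no energy at the Nyquist frequency, every second output sample reproduces an input sample, and the samples in between are the interpolated values.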
Methodology
The Java Advanced Imaging (JAI) APIs define something known as an operator. This is the core of your signal processing. You have point operators, area operators, color quantization operators, edge extraction operators, statistical operators…you get the idea. All of these operators are constructed in a general way using a ParameterBlock. Here’s an example of using the AddConst operator for adding a constant “2.0” to every pixel of an image:
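// lang Java
import java.awt.image.renderable.ParameterBlock;
import javax.media.jai.JAI;
import javax.media.jai.PlanarImage;

// src is an image that has already been loaded
ParameterBlock pb = new ParameterBlock();
pb.addSource(src);
double[] val = {2.0};  // the constant to add
pb.add(val);
PlanarImage dst = JAI.create("AddConst", pb);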
PlanarImage is the most general class for representing a 2-dimensional image. The sweet thing about JAI is that it builds a directed acyclic graph of operations, much like GEGL. This means that if you have an image reader, the image isn’t actually read until its pixel values are needed, for example when the result is written to disk much later in the pipeline of operations. This is very powerful.
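To see what this deferred evaluation looks like, here is a toy sketch in plain Java. It is not how JAI is implemented: LazyImage, load, then and addConst are hypothetical names made up for this illustration, and a double[] stands in for an image.

```java
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

final class LazyImage {
    /** Counts how often the pretend file was actually read. */
    static int loads = 0;

    private final Supplier<double[]> compute;
    private double[] cached;

    private LazyImage(Supplier<double[]> compute) {
        this.compute = compute;
    }

    /** A source node: nothing is read until pixels() is called downstream. */
    static LazyImage load(double[] fakeFile) {
        return new LazyImage(() -> {
            loads++;
            return fakeFile.clone();
        });
    }

    /** Chains an operator onto this node without running anything yet. */
    LazyImage then(UnaryOperator<double[]> op) {
        return new LazyImage(() -> op.apply(pixels()));
    }

    /** An operator that adds a constant to every pixel. */
    static UnaryOperator<double[]> addConst(double c) {
        return px -> {
            double[] out = px.clone();
            for (int i = 0; i < out.length; i++) {
                out[i] += c;
            }
            return out;
        };
    }

    /** Pulls pixel data through the chain, computing each node at most once. */
    double[] pixels() {
        if (cached == null) {
            cached = compute.get();
        }
        return cached;
    }
}
```

Building `LazyImage.load(data).then(LazyImage.addConst(2.0))` touches no pixel data. The pretend file is read only once `pixels()` is called on the last node.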
Using the outline, we build a general pipeline using the operators
fileload, dft, PeriodicShift, border, PeriodicShift, idft,
MultiplyConst, filestore. Done. We need to multiply
the image by a constant because of the scaling factors in the forward
and inverse transforms. Below are two 256×256 images and
their zoomed-in 512×512 versions. This means I zero-padded the
frequency spectra by 128 pixels on each side.
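The border width and the constant are simple arithmetic, sketched here in plain Java. The gain assumes an unscaled forward transform and an inverse scaled by 1/(width × height), as in Matlab. The right constant depends on which scaling convention the transforms use, so treat that part as an assumption. ZoomParams is a hypothetical helper, not a JAI class.

```java
final class ZoomParams {

    /** Zeros to add on each side so a source-wide spectrum becomes target wide. */
    static int padPerSide(int source, int target) {
        if (target < source || (target - source) % 2 != 0) {
            throw new IllegalArgumentException("target must exceed source by an even amount");
        }
        return (target - source) / 2;
    }

    /**
     * Constant to multiply a square image by after the inverse transform,
     * ASSUMING an unscaled forward transform and a 1/(width*height) inverse.
     */
    static double gain(int source, int target) {
        double ratio = (double) target / source;
        return ratio * ratio;
    }
}
```

For the 256 to 512 case, `padPerSide(256, 512)` gives 128 and, under the stated assumption, `gain(256, 512)` gives 4.0.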
The performance was not bad at all. During installation, I had to drop
in a .so file, so I’m pretty sure the math-intensive parts of JAI
run as optimized native code rather than Java.
This is pretty exciting because it emphasizes the UNIX approach to
solving problems. Just like how you can combine sort, wc, grep
and maybe other small tools to create a powerful end-product, various
operators can be combined to build really cool image processing
pipelines.