Java Imaging

Basics

I’ve been trying out the imaging libraries available for Java for the past couple of days. I’m happy to say that I’ve liked what I’ve seen thus far. I’m sticking to this platform to develop the server-side technology for my project.

The first task was to install the latest and greatest from Sun. This included Netbeans 5.5 and JDK 6, available as a single package. I also had to get the media libraries, which aren’t part of the standard Java kit for some reason. I had absolutely no problems installing all of this on a stock Debian Linux system. Now that Java is Free, in the future all of this should just be an apt-get away.

Netbeans is a kick-ass environment to develop with. I really like how unit testing is an integral part of the development process. Unlike Eclipse, Netbeans is not sluggish at all on a machine with 512 MB of RAM.

Experiment

I didn’t want to solve “toy problems” to test this environment. I decided to write a general image magnification function. Once again, instead of basing this on a simple super-sampling interpolation algorithm, I decided to try something more sophisticated.

In signal processing, a simple way to “zoom” into a signal is to pad its frequency spectrum with zeros. Thanks to the Fast Fourier Transform, each transform costs O(n*log(n)) instead of the O(n^2) of a direct Fourier transform.

Outline

We’ll assume that the image dimensions are always powers of two. Without loss of generality, we’ll also assume that the width and height are the same. Here’s how you’d use the FFT to do the zooming. I’ve also included the Matlab/Octave command equivalents in parentheses to make it easier to understand:

  1. Take the 2D FFT of the image. (fft2)
  2. Make sure the DC component is in the center. If not, shift it cyclically. (fftshift)
  3. Add a border of zeros around the shifted spectrum. The final dimensions are still a power of two.
  4. If you shifted in step 2, undo the shift (ifftshift), then take the 2D inverse FFT. (ifft2)
  5. This is the zoomed in image.
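To make the outline concrete, here is a minimal one-dimensional sketch in plain Java. Nothing in it comes from JAI: the class name `FftZoom` and its methods are my own illustrative choices, and the small recursive FFT is only there to keep the example self-contained. It assumes both lengths are powers of two with m > n >= 2, and it divides by the original length n so that the amplitude of the signal is preserved.

```java
// 1-D sketch of the zoom outline. All names are illustrative; nothing here is JAI.
final class FftZoom {

    /** In-place recursive radix-2 FFT (unscaled). Length must be a power of two. */
    static void fft(double[] re, double[] im, boolean inverse) {
        int n = re.length;
        if (n == 1) return;
        int h = n / 2;
        double[] evenRe = new double[h], evenIm = new double[h];
        double[] oddRe = new double[h], oddIm = new double[h];
        for (int k = 0; k < h; k++) {
            evenRe[k] = re[2 * k];     evenIm[k] = im[2 * k];
            oddRe[k] = re[2 * k + 1];  oddIm[k] = im[2 * k + 1];
        }
        fft(evenRe, evenIm, inverse);
        fft(oddRe, oddIm, inverse);
        double sign = inverse ? 1.0 : -1.0;
        for (int k = 0; k < h; k++) {
            double angle = sign * 2.0 * Math.PI * k / n;
            double wr = Math.cos(angle), wi = Math.sin(angle);
            double tr = wr * oddRe[k] - wi * oddIm[k];
            double ti = wr * oddIm[k] + wi * oddRe[k];
            re[k] = evenRe[k] + tr;      im[k] = evenIm[k] + ti;
            re[k + h] = evenRe[k] - tr;  im[k + h] = evenIm[k] - ti;
        }
    }

    /** Cyclic shift by half the length (fftshift; its own inverse for even lengths). */
    static double[] shift(double[] a) {
        int n = a.length;
        double[] out = new double[n];
        for (int k = 0; k < n; k++) out[(k + n / 2) % n] = a[k];
        return out;
    }

    /** Adds a zero border of (m - n) / 2 entries on each side. */
    static double[] border(double[] a, int m) {
        double[] out = new double[m];
        System.arraycopy(a, 0, out, (m - a.length) / 2, a.length);
        return out;
    }

    /** Zooms a signal of length n to length m (powers of two, m > n >= 2). */
    static double[] zoom(double[] signal, int m) {
        int n = signal.length;
        double[] re = signal.clone(), im = new double[n];
        fft(re, im, false);                      // step 1: forward transform
        re = shift(border(shift(re), m));        // steps 2-4: center, pad, un-center
        im = shift(border(shift(im), m));
        fft(re, im, true);                       // step 4: inverse transform
        for (int k = 0; k < m; k++) re[k] /= n;  // scale by the ORIGINAL length
        return re;                               // imaginary part is dropped
    }

    public static void main(String[] args) {
        double[] zoomed = zoom(new double[] {1, 3, 2, 5, 4, 0, -1, 2}, 16);
        System.out.println(java.util.Arrays.toString(zoomed));
    }
}
```

When zooming by a factor of two, every second output sample should match an input sample, with the new in-between samples interpolated from the spectrum. The 2D case applies the same idea along rows and columns.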

In programs like Matlab, finite floating-point precision leaves small non-zero imaginary parts in the inverse transform. These can be safely ignored.

Step three is the key step. It adds a block of zeros at the high-frequency end of the spectrum. In the unshifted layout that fft2 returns, the high frequencies sit in the middle of the array, so you could have inserted the pad right there. But as we’ll see in the next section, it’s simpler to center the spectrum and add a border of zeros around it.
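A small sketch shows that the two ways of padding give the same result. It works on a plain array standing in for one component of a 1D spectrum; `SpectrumPad`, `padBorder` and `padMiddle` are hypothetical names of my own, and the lengths are assumed to be even with m > n.

```java
// Two equivalent ways to zero-pad a spectrum; names are illustrative only.
final class SpectrumPad {

    /** Cyclic shift by half the length (moves DC between index 0 and the center). */
    static double[] shift(double[] a) {
        int n = a.length;
        double[] out = new double[n];
        for (int k = 0; k < n; k++) out[(k + n / 2) % n] = a[k];
        return out;
    }

    /** Center the spectrum, add a zero border, then undo the centering. */
    static double[] padBorder(double[] unshifted, int m) {
        double[] centered = shift(unshifted);
        double[] padded = new double[m];
        System.arraycopy(centered, 0, padded, (m - centered.length) / 2, centered.length);
        return shift(padded);
    }

    /** Insert the zeros directly in the middle of the unshifted spectrum. */
    static double[] padMiddle(double[] unshifted, int m) {
        int n = unshifted.length;
        double[] out = new double[m];
        System.arraycopy(unshifted, 0, out, 0, n / 2);              // non-negative frequencies
        System.arraycopy(unshifted, n / 2, out, m - n / 2, n / 2);  // negative frequencies
        return out;
    }

    public static void main(String[] args) {
        double[] spectrum = {1, 2, 3, 4, 5, 6, 7, 8};
        // Both print [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 6.0, 7.0, 8.0]
        System.out.println(java.util.Arrays.toString(padBorder(spectrum, 16)));
        System.out.println(java.util.Arrays.toString(padMiddle(spectrum, 16)));
    }
}
```

The border version is the one that maps onto ready-made shift and border operators, which is why the outline uses it.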

Methodology

The Java Advanced Imaging (JAI) APIs define something known as an operator. This is the core of your signal processing. You have point operators, area operators, color quantization operators, edge extraction operators, statistical operators…you get the idea. All of these operators are constructed in a general way using a ParameterBlock. Here’s an example of using the AddConst operator for adding a constant “2.0” to every pixel of an image:

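// lang Java
import java.awt.image.renderable.ParameterBlock;
import javax.media.jai.JAI;
import javax.media.jai.PlanarImage;

// src is assumed to be an image that has already been loaded
ParameterBlock pb = new ParameterBlock ();
pb.addSource (src);
double [] val = {2.0};  // the constant to add
pb.add (val);
PlanarImage dst = JAI.create ("AddConst", pb);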

PlanarImage is the most general class for representing a 2-dimensional image. The sweet thing about JAI is that it builds a directed acyclic graph of operations, much like GEGL. This means that if you have an image reader, the image isn’t actually read until its pixel values are requested, for example when the result is written to disk much later in the pipeline of operations. This is very powerful.
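The deferred evaluation idea can be illustrated with a toy in plain Java. This is not how JAI is implemented and it uses no JAI classes: `LazyImage`, `Node`, `load` and `addConst` are hypothetical names, and a double array stands in for an image file. Building the chain does no work; the "file" is only read when pixels are requested.

```java
// Toy illustration of deferred evaluation; unrelated to JAI's real classes.
final class LazyImage {

    /** A node in the operation graph; pixels are computed only on request. */
    interface Node {
        double[] pixels();
    }

    /** Counts how many times the fake file has actually been read. */
    static int loads = 0;

    /** A source node: "reads" the fake file only when pixels() is called. */
    static Node load(double[] fakeFile) {
        return () -> {
            loads++;
            return fakeFile.clone();
        };
    }

    /** An operator node: pulls pixels from its source, then adds a constant. */
    static Node addConst(Node src, double c) {
        return () -> {
            double[] p = src.pixels();
            for (int i = 0; i < p.length; i++) p[i] += c;
            return p;
        };
    }

    public static void main(String[] args) {
        Node chain = addConst(load(new double[] {1.0, 2.0}), 2.0);
        System.out.println("loads after building the chain: " + loads);  // 0
        double[] result = chain.pixels();
        System.out.println("loads after requesting pixels: " + loads);   // 1
        System.out.println(java.util.Arrays.toString(result));           // [3.0, 4.0]
    }
}
```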

Using the outline, we build a general pipeline using the operators fileload, dft, PeriodicShift, border, PeriodicShift, idft, MultiplyConst, filestore. Done. We need to multiply the image by a constant because of the scaling factors in the forward and inverse transforms. Below are the 256×256 source image and the zoomed-in 512×512 result. This means I zero-padded the frequency spectrum by 128 pixels on each side.
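Here is a rough sketch of where that constant comes from. It assumes the Matlab-style convention where the forward transform is unscaled and the inverse divides by the number of output samples; if the transforms are configured with a different scaling, the constant changes accordingly. Take a constant N×N image with value c, padded to M×M in the frequency domain:

```latex
\begin{aligned}
X[0,0] &= \sum_{u=0}^{N-1}\sum_{v=0}^{N-1} c = N^2 c, \\
y[u,v] &= \frac{1}{M^2}\,X[0,0] = \frac{N^2}{M^2}\,c, \\
\text{gain needed} &= \frac{M^2}{N^2} = \left(\frac{512}{256}\right)^2 = 4.
\end{aligned}
```

So under this assumption, the 256×256 to 512×512 zoom comes out four times too dark, and multiplying by 4 restores the original brightness.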

[Figures: source image (256×256) and zoomed-in image (512×512)]

The performance was not bad at all. During installation, I had to drop in a .so file, so I’m pretty sure the math intensive parts of JAI are optimized in a native language, and not Java.

This is pretty exciting because it emphasizes the UNIX approach to solving problems. Just like how you can combine sort, wc, grep and maybe other small tools to create a powerful end-product, various operators can be combined to build really cool image processing pipelines.
