Entries in Compressed Sensing (2)

Wednesday
22Jul2009

Optimized Dictionaries

Some months ago, I was assigned to start looking at Compressed Sensing (CS), specifically, CS reconstruction. At the time, Sungkwang and Dr. Fowler were working on some IHT reconstruction techniques, and I was given a paper to look at that took a whole different approach: a generalized, non-causal form of the Total Variation (TV) metric. With TV, the sparse domain is assumed to be the gradient domain, and the gradients are calculated along the vertical and horizontal casual directions. The paper I was looking at took the approach that we should look at all of a pixel's neighbors and use some weight assigned to each one to calculate the gradient in the following manner:

gradient(P) = P - (w1*P_up + w2*P_left + w3*P_right + w4*P_down)

So, that’s great, you say, but where do these weights come from? The authors had some methods for calculating the weights and proposed that during the reconstruction process, if there weren't some oracle weights, then you should use an EM approach to optimize the reconstruction result. This sounds good on paper, but in implementation, it was a whole different story.

I want to talk about the infeasibility of using an approach like this. For convex optimization reconstruction algorithms, performing a single reconstruction takes enough time, even for a small image. If we are taking the time to reconstruct any interestingly sized image, doing it once takes long enough, doing it 5 or so times waiting for the metrics to converge gets pretty rediculous. Waiting on this algorithm to run further reinforced to me the need for A) Blocking strategies, and B) faster reconstruction algorithms, such as IHT methods.

Anyway, to avoid this problem of retraining, the authors suggested that the weights be found once for an entire class of images, and then applied broadly to future images that fall within that class. Perhaps there might have been adequate results in applying this method, but the reconstruction results for this approach were so poor that I discontinued my research into the method.

But, at the time, it got me thinking about the entire "learning" approach to CS, which brought me to the idea of optimizing the random matrices to get better results, and in the same vein, to optimize them to a class of images. After thinking about it for a little longer, I shelved the thought. Why? I find the implied assumptions of oracle information about a problem unpalatable. There are already many papers within the body of CS work that rely on assumptions of a certain level of sparsity and, in the case of optimized weightings and dictionaries, of the actual source signal.

In my opinion, there is still too much work to be done on creating and improving reconstruction algorithms to begin degenerating into these sorts of focused optimizations. So, like the previous thought about a weighted gradient space, I shelved the idea of optimizing dictionaries. I figured that it was just the musings of an early graduate student that didn't know much about the field, yet.

You can image my surprise, then, when I opened this month's SPS Content Gazette and found the following work:

  • J.M. Duarte-Carvajalino, G. Sapiro. "Learning to Sense Sparse Signals: Simultaneous Sensing Matrix and Sparsifying Dictionary Optimization."

And within that work is a reference to an earlier one that I was not aware of:

  • M. Elad. "Optimized Projections for Compressed Sensing."

I'm not calling into question the quality or robustness of these works, but I find myself torn. I find the concept of optimizing not only the random projections, but also the transformation into the sparse basis, very, very interesting, but I am wondering about practicality of these methods. As an engineer, I like optimizations, I like the idea of taking a general idea and applying it to a specific situation, but as a researcher, I like general solutions that are guaranteed to work for a wide range of problems.

What do you all think?

 

Monday
13Jul2009

The Starting Point

I figure that the best place to start this discussion is with the work that our group already done. One of the major problems with Compressed Sensing is the time required to reconstruct the final signal from its dimensionality reduced form. Let me pause here to say that for a more basic review what exactly Compressed Sensing is and what it is trying to accomplish, there are many researchers that have done an excellent job explaining the fundamentals. I suggest referring to the following resources:

The reconstruction problem is formulated as either an equality or inequality constrained Linear Program, usually solved via a convex optimization process. The problem is this: to find the best value and arrangement of sparse basis coefficients so as to minimize the ell_1 magnitude of the sparse basis coefficients while maintaining an adequate 'closeness', in the case of inequality constrained optimizations, of the resulting image signal in the measurement basis. In other words, we want to find the sparsest signal that could have allowed for the observations we received from the encoder.

The difficulty with this decoding is that we are attempting to optimize an N-dimensional'vector', a time consuming, exponentially complex process that quickly becomes intractable for interestingly sized signals, say, images of size greater than 128x128. For one involved in the development of the theory behind Compressed Sensing, the pure and applied mathematicians, the proof that one can find a solution is enough to consider the solution found, but for engineers seeking to apply the theory to practical implementations, the computational barrier is one that destroys most of Compressed Sensing's potential usefulness.

As engineers, our group is interested in Compressed Sensing for its applications in Remote Sensing. It has already been shown that the theory can be physically implemented in such a way that a single sensor can capture an entire image of arbitrary resolution. This is excellent for applications, such as Remote Sensing within the lower frequency bands of the EM spectrum, where sensors are expensive to build and maintain, where high-resolution banks of them would be monetarily infeasible. However, applying current Compressed Sensing reconstruction techniques to modern Remote Sensing datacube sizes would be an exercise in futility, as the decoding procedure's time to run would be on the order of years.

In order to rein in time complexity of the decoding process, Dr. Fowler and Sungkwang, starting with Gan's work, have been progressing the area of Block Compressed Sensing. This method modifies the standard Compressed Sensing encoding process by enforcing a blocking rule, splitting the image domain into separate subproblems which can be stitched together at the decoder side. The blocking rule has two effects, firstly it restricts the size of the measurement matrix, therefore requiring less memory space on the encoder, and secondly, reducing the decoding time complexity by a few orders of magnitude. In physical implementation of the encoder, this would require a little bit more work, a rotating mirror and some additional lensing, but it would make Compressed Sensing viable for real data set sizes.

So far, we have recorded some significant results in this area and submitted a paper that we hope will soon be accepted and shared with you all. Suffice to say, we are seeing some very high quality reconstructions using this method, and all occurring at very high speed thanks to the low time complexity of the Projected Landweber and Thresholding algorithms.

Next time, I hope to update you all with some of the areas that I am specifically investigating and how it applies to the work we've already done.