Entries in ICASSP2010 (6)

Thursday
Mar182010

Day 4 -- A Veritable Cornucopia of Compressed Sensing

If you were going to come to ICASSP for one day, this would probably have been the day you chose to come (if you read these blogs, anyway). The morning consisted of a CS poster session and a talk by Michael Wakin. This afternoon consisted of an entire lecture session: Compressive Sensing: Theory and Methods. Yes, it seems IEEE has decided on the -ive rather than the -ed. Also, I think today will consist of fewer pictures, because most of the ones I got tended towards blurry and the back of folk's heads. Those poster crowds can get rough.

The first poster I saw was "Empirical Quantization for Sparse Sampling Systems"  by Michael Lexa out of the University of Edinburgh, formerly out of Rice. (Rice really pumps out the DSP don't they?) He's a Nuit Blanche-er, so Igor can go ahead and give a little fist pump :P As for the work, I find it pretty cool that folks are beginning to look at the quantization problems. This method of quantization, Michael said, is best understood in the context of "quantization for classification." It seems that the target for this technique is something akin to  Mishali & Eldar's Sub-Nyquist wideband sampler. I'm pretty far from throughly understanding the work, so I'm afraid that my further explanation won't do much justice to it. Be sure to check it out. If you're handy with the Kullback-Leibler, you should feel pretty comfortable with this paper. Great work, Michael, hopefully that "spouse's pay check" grant will let us see more of the same :)

After talking for a bit about what we were working on, Michael pointed me towards Jason Luska, from Rice, and Marco Duarte, formerly of Rice now at Princeton, who both worked on the single-pixel camera. I'm eager to see one of these things in action, and maybe I won't have to wait very long. Jason told me that a recent startup has formed for the production of these devices, and that recently they have put together a portable version, so the Single-Pixel Camera is finally out of the lab and capturing IR in the natural world. I'm glad to hear that work is progressing on it! The new versions are apparently running at high resolution (1024x768 I believe) and thanks to some optimization folks, Jason says the reconstructions are "under a minute". I believe they're still using the l1 BP approach, but I can't be sure. He also hinted at some future work on "in the loop" reconstruction feedback to the encoder for distilled image enhancement, which sounds complicatedly wonderful.

Marco also reminisced about the first six months of building the camera, saying they spent they spent the first six months reconfiguring the reconstructions, inserting proposals from new research all along the way trying to correct the erroneous results they were getting only to find out that the problem all along was faulty optics. Ouch! Thank you both for the conversation.

Wakin's talk, "Concentration of Measure for Block Diagonal Measurement Matrices", I found to particularly interesting, since the blocked-approach we've been using amounts to this same approach. I imagine that we'll be making use of this theoretical framework in the future, and I'd love to see it expanded into a journal article. I've been looking at some similar phenomena and wonder if they can be fit into this same framework. The only difference is that I've been phrasing what I've been doing as a correlation between block energies rather than a concentration. I'll have to fiddle around some more.

Marco also gave a talk at the afternoon session, "Kronecker Product Matrices for Compressed Sensing", which was also very interesting because of his kind of "fusion" of basis functions to act as a sparse 3D basis for the joint reconstruction of hyperspectral datacubes. I'm wondering if this would be useful at all with Dr. Fowler's Compressive Projection PCA. I also really liked Silvia Gandy & Isao Yamada's "Alternating Minimization Techniques for the Efficient Recovery of a Sparsely Corrupted Low-Rank Matrix". Its a little bit of a different application of an l1 minimization, but the end result is very interesting, and I would like to see more examples of it. The one given was separating shadowing and specular effects from faces given a database of a face under different lighting conditions. I had seen something based on this last year and I'm wondering if this was the same work.

I also took a look at the "Bayesian Compressed Sensing Imaging Using a Gaussian Scale Mixture" poster shown by George Tazagkarakis. It seemed pretty solid, though I didn't get many details on the reconstruction time, and the quality measures were a little bit hard to directly compare to some other methods. However, this was one of a few papers that I saw today that make use of the GSM as a sparsifying prior, rather than a laplace or gaussian (which does a terrible job) prior. There was another interesting prior for the Bayesian crowd that was shown in the CS lecture session: the Inverse Gamma prior, used in "Efficient Sparse Bayesian Learning via Gibbs Sampling" by Xing Tan, Jian Li, and Peter Stoica. I think there was a Jefferys prior thrown into the mix sometime today, as well. All I can say is, all you BCS folks are insane. 

Okay, thats all for now, check back later this evening or tomorrow :)

 

Wednesday
Mar172010

Day 3 -- A Break

There was one more talk at the Dictionary Learning session that I wasn't able to post about thanks to some really wonky internet at the conference. Right at the end it gave out and I lost all the notes that I made...guess I learned my lesson!

"Ultrasound Tomography with Learned Dictionaries" Ivana Tosic, Ivana Jovanovic, Pascal Frossard, Martin Vetterli, Neb Duric

It was on the reconstruction of sound speed through ultrasound tomography samples that used l1 regularization as an added constraint to offer better performance than a least squares approach.

I have some apologies to make:

Firstly, I missed the session on Video and Motion Analysis last night, which is unfortunate because there were two papers presented during that session of particular interest to us (though Sungkwang did attend, and I'm pressing him to make a write up :P ). The two papers were

"Motion Estimation from Compressed Linear Measurements" Vijayarghavan Thirumalai, Pascal Frossard

"Compressive Sensing and Differential Image Motion Estimation" Nathan Jacobs, Stephen Schuh, Robert Pless

Also, this morning I had intended to attend a talk in the Target Detection & Localization session by the folks over at Duke:

"Hyperspectral Target Detection from Incoherent Projections" Kalyani Krishnamurthy, Maxim Raginsky, Rebecca Willett

Unfortunately, I got caught up in the Dictionary Learning session and didn't make it in time for this talk, but Zach Harmany, one of their colleagues at Duke, told me it was a pretty good presentation, and I'm sorry I missed it.

Last night, Sungkwang and I were having pizza out at this great little Italian bar down the road and talking about CS and the nature of research. At a big event like this, its easy to see how "far" research has come. Yes, I did use quotations there. Maybe my brain is wired a bit differently, but I'm a big fan of quality over quantity, which is a topic that I think many researchers have been moaning about for many decades. The problem, of course, is how to quantify the effectiveness of research dollars spent, and ultimately quantity measure has been adopted because of its objectiveness. I'm not sure its something that can be easily changed, and solution is not exactly well defined.

Also, after looking at many posters the past couple of days and mulling over everything, I'm beginning to wonder about the nature of the transition in research from finding answers to challenging problems to having challenging and complex solutions and finding problems to put them on. There seems to be this major disconnect with what the "point" of all this is. Perhaps I'm talking much more like a PhD student and much less like a future professor, and thats okay, I suppose. But I have an investment in all this, too, its my literal bread and butter as well. 

What I'm getting at is this: I can see myself getting very tired, very fast with some of the self-serving nature of research. In the end, it cannot be about the complexity of your contribution but its meaningfulness. Of course, many significant contributions to the community and humanity as a whole are inherently complex. The theory underlying wavelets can be thought of as "complex", but the final solution is, in fact, an elegant one. I suppose I would just like to see the pursuit of elegance in engineering research as it is in many scientific pursuits. 

I want research to consist of wooded vales, tobacoo pipes, benches, good outside conversation and collaboration (without the nagging consistent fear of theft), conferences of under 200, and a constant focus on solving the present problems definitively. A pipe dream, I suppose. I'm probably about 70 or 80 years too late for that.

This kind of got me thinking about a conference for strictly CS related topics. I hadn't heard of one yet, maybe its already happened and I've missed the boat. Many things in the field are still in what I'd call a "high-entropy" state and haven't yet coalesced into some standard models of which problems we're solving and what context we are solving them in. It seems like every paper has a different results metric and its not clear what to take seriously and what not. I can't tell you how many times I've started reading a CS paper with a very intriguing abstract only to find it fairly light and completely impractical. There seems to be a lot of people walking around with their hammers giving a one-go at the CS "nail."

I'm still very much in favor of seeing more thought going into the hardware behind CS. Until there are some very robust sampling methodologies for a particular signal type, CS will remain on the sidelines of that field. The medical imaging community has done an excellent job making a case for CS, which I think is great. The imaging community, I think, needs a bit more work. Sure, the single pixel camera has been made, but thats far from "solving" the image and video signal sampling problem. There are so many other things to look at and validate, especially in the field of video. I'm actually really surprised that there hasn't been more hardware work to come out of Rice after the single pixel camera. I am becoming more and more convinced that we've got to be tying CS in with some kind of hardware framework. In other words, can we stop sampling in the wavelet domain, please? How does that make any physical sense? Where are these magical coefficient capturing cameras? 

CS is now beginning to be more well understood analytically, which is great, and provides us with insight into future implementations and what CS is capable of, but our empirical & physical results need a lot of catching up. Perhaps over the next year we'll see more of this :)

Okay, thats all for now. I hope you can bear with my idealism from time to time!

Wednesday
Mar172010

Day 3 -- In the Mix

Posting live from the Dictionary Learning for Sparse Representation special session:

 

 “An l1 Criterion for Dictionary Learning by Subspace Identification”  Florent Jaillet, Remi Gribonval, Mark D. Plumbley, Hadi Zayyani

  • First on the topic of sparse representation.
  • Usefulness of sprase representations “Such as Compressed sensing, which is a very active area
  •  Challenge is finding sparse representations for the new kinds of data we encounter: “What is a sparse representation for social network data?”
  •  How do we find sparse representations for data sets without any kind of a priori knowledge.
  • The idea for dictionary learning (of course) is that we've got some set of data, we assume that there is some underlying "dictionary" which represents the data sparsely that we can recover from some subset/training data.
  • In subspace models, we desire to find a set of orthogonal vectors which can be used as a basis for transforming the data into a sparse domain.
  • Proposed is the identification of normal vectors, vectors which are orthogonal to many training samples. I.e. finding a vector that is orthogonal to a large enough subset of the training samples will probably be orthogonal to the set as a whole.
  • The propose approach is find the best set of orthogonal vectors which represent the training data with the smallest l1 magnitude
  • We're looking at some 2D empirical examples now, which consist of something like a gaussian blob of random points with some "lines" of random points drawn through the center of the 2D gaussian. A vector is placed at the mean of this set and then rotated around (like you do in polar coordinate system) and calculate the l1 magnitude at each angle and make a plot and find the minimum
  • This is cool, he is showing the same thing in 3D and you get this crazy looking blob plot. 
  • The point is being made with all this is that the problem is non-convex, which of course makes the minimization problematic, so the method must rely on steepest descent, which is implemented as a quadratic program in the paper. I.e. the method is hampered by local minima.
  • Some theoretical garuntees: If the number of training samples is greater than a function of the dimensionality, then the local minima exist at the correct solution with high probability.
  • The authors are now progressing on to higher dimension problems and to develop a full algorithmic framework.
  • What happens when the subspaces are close how robust is the method? "The greater the coherence, the less robust the system is to outliers."
  • Why try to identify orthogonal vectors rather than the atoms of the dictionary? "The algorithm needs to be derived to perform the optimization tends to be too complex in that situation, and looking at the orthogonal vectors provides an easier analysis."
  • What about the number of atoms or subspaces, is it assumed to be known? "We are assuming here that we know it, it is a given input of the problem. What I believe is that based on our theoretical guarantee so far...provided the subspaces are not too coorelated, and given a sufficent number of training samples, we can identify the local minima. So if we do not know the dimensionality, given enough training samples, you should be able to identify the number of dimensions from the number of local minimia [but not in a robust fashion]. This is an open problem."

 

"Structured and Incoherent Parametric Disctionary Design" Mehrdad Yaghooobi, Laurent Daudet, Michael Davies

  • Defining the problem as trying to find the dictionary, D, which gives us the best sparsity.
  • Some previous methods: concatenation of orthonormal bases, dictionary learning using a set of training samples, but for this paper they focus on dictionary design subject to a certain property such as the RIP, minimum coherence, etc.
  • One important goal for this work is to find a dictionary whose atoms have very small coherence.
  • One kind of incoherent basis is the Equiangular Tight Frame (ETF), and is given as an "ideal" dictionary.
  • Parametric dictionaries: a mapping from a set of parameters to a set of atoms (i.e. a functionally produced dictionary)
  • Structured dictionaries: a dictionary is structured if the atoms are geometrically related.
  • And the, of course, they put these two together with the structured property of being shift-invariant.
  • The formulation is to minimize the difference between a  couple of Gramm matrices in the l-inf norm.
  • Okay a lot of crazy math just happened when it came to relaxing the formulation but the usm of that is the fact "that this can be solved very efficiently." I'll have to take his word on it
  • How close to being an ETF has this methodology gotten? "We are still quite far".
  • But there is still gain the performance of signal recovery using the parametric dictionaries.
  • Future work: investigating structures other than shift-invariance, and applying the parametric dictionary model to the dictionary learning model.

 

 "A Union of Incoherent Spaces Model for Classification" K. Schnass, Pierre Vandergheynst

  • Some times, classification of large dimensionality problems can be problematic due to training time and storage space, so some kind of dimensionality reduction usually needs to be employed.
  • In general, when we do this, we apply the same matrix over the entire training set for dimensionality reduction
  • What is proposed is a more general model, where each class is embedded into the lower dimension using class-specific features.
  • So how do we choose which features are the best features to check for each class? This is the core of the proposed method, and the given formulation is relaxed using the Grassmanian space theory for low rank feature matrices.
  • The optimal feature matrix is found via alternate projections between two sets, which we want to find the intersection of. I'm not really quite clear on what these two sets are...ahh one is the set of orthonormal matrices, and the other is the set of matrices that give stability to the classification.
  • They tested this method on the Yale B and AR databased on identifying subjects. Out of 2414 images, the best success rate of classification was 98.87%...slightly better result than some state of the art method run on the Yale set.
  • There is no convergence proof (the objective is non-convex in nature).

 

 "Adaptive Compressed Sensing: New Class of Self-Organizing Models for Neuroscience" Will Coulter, Christopher Hillar, Guy Isely, Friedrich Sommer

  • Problem: do agents (populations of neurons) communicate structured information across regions of space in the brain.
  • Some population of neurons is teasing out the sparse structure information of some visual signal in some way. 
  • Can this be applied to CS? I.e. is reasonable for CS to be run in the environment of the brain.
  • Here, we aren't really interested in reconstructing the whole image, just kind of getting the impression of it.
  • The big idea in classification in past few years isn't carving out some complex decision boundry, but rather to rotate the information into a higher dimensional space where the classes might be linearly seperable...which is something that we can see biologically.
  • The idea is that encoding data in pixels is "unnecessary", we just need sparse elements, and very few of those if that.
  • This neurological compression destroys the original signal, but how can we recover the structure from the original signal.
  • In the brain, neurons respond to different "structures", just as in sparse coding. 
  • In computation, we can alternate between estimating the sparse basis and the minimization/reconstruction and come up with something that looks like the neuron perception fields found biologically.
  • There is a big discussion over whether or not information is sparsely represented in the brain or not. This model makes the statement that it in fact, does both. 
  • Use receptive fields to measure the quality of their work, i.e. the same way the biological tests are employed. I.e. characterized with a 2D stimulus response.
  • They used this also to train up  Psi which sparsifies the best in an unsupervised procedure.
  • Main point, even though the original signal isn't being preserved, the structure of the signal is somehow being preserved through multiple stages of neuro-processing.
  • The content of images changes to learned sparse dictionaries, i.e. sending scenes of buildings will give the first stage a "building" basis, and this will propagate to all other bases in the pipeline.

 

"Improved Sparse Recovery Thresholds with Two Step Reweighted l1 Minimization" Weiyu Xu, Amin Khajehnejad, Amir Salman Avestimehr, Babak Hassibi

  • This is a much more CS oriented talk..going through the standard background procedure now.
  • For a given subrate (delta), if you use l1 minimization to discover the original signal, there is some kind of critical threshold of sparsity, mu(delta), so that almost all k-sparse x are recoverable via l1 minimization.
  • So the question is is there a linear time algorithm that has a weak threshold strictly better than l1 minimization?
  • For an arbitrary distribution, that is for a universal improvement, it is an open problem. But for certain classes of distributions, we can find a threshold improvement.
  • The procedure is a two step one, first, the signal is recovered using standard l1 minimization, and then an approximation of the support set of x corresponding to the significant coefficients, then the reweighted method is employed using this set of weights, penalizing values outside the set of significant coefficients.
  • What is shown by this algorithm is that for a fixed subrate, there exists some value, eps, which is greater than zero such that the threshold is moved to (1 + eps)*mu(delta). 
  • Candes, Wakin, & Boyd proposed a different reweighted method, but the one used in this method is a little bit different because of theirs doesn't exactly lend itself to mathematical analysis.
  • There is a weak sense of robustness in this case, so there is an analytical boundary on the reocverablility of a signal given that it isn't exactly sparse.
  • This is pretty cool, the phase transition diagram for this method looks more like a nike-swoosh than diagonal (i.e. much greater threshold performance at low subrate/sparsity pairs)
  • The quality of the phase transition boundary is relative to the distribution of the coefficient magnitudes, specifically the distribution near the origin. If there are a large number of near-zero coefficients (but not quite) this will be problematic for the reconstruction (as these will be lost along the way).

 

 

 

Tuesday
Mar162010

Day 2 -- Part 1

Today began with a plenary talk by Dr. Ronald W. Schafer of HP Labs, "A Celebration of the Science and Technology of DSP" which I found to be pretty interesting. I noticed a few nodding off here and there, and Dr. Schafer isn't exactly the most engaging speaker, but the history of DSP was very informative. For example, I had no idea that the first VLSI implementation of a DSP process was in fact the Speak & Spell. Sort of the foreshadowing of DSP's influence in the multimedia/entertainment domain. In speaking of the future of DSP, Dr. Schafer mentioned "who knows, maybe Compressed Sensing will be the new FFT." We can hope, right? I sort of wanted to give a little shout, but I supressed the urge. I'm a total fanboy, its true.

 

Afterwards, I talked to Ahmad Akl, who was presenting a poster by his colleagues Veria Havary-Nassab and Sadiq Hassan under Dr. Valaee at the University of Toronto, "Compressive Detection for Wide Band Spectrum Sensing". This paper was pretty interesting, more for the novelty rather than complexity of its application. The basic idea is that you have some set of transmitters that want to communicate on a shared medium, in this case at some wireless frequency. Basically, the total medium bandwidth, W, is split into a set of discrete communication channels. Traditionally, to know on which of the channels to transmit, each device would have to scan through the entirety of W, checking the current signal energy at each communication band. Using CS, the approach is altered by sampling the entire band and sending it through a bank of K filters, where K < N. This acts as the columns of a measurement matrix, \Phi. Ahmad wasn't sure about which method was used to reconstruct the signal, but assumed it to be something like the standard \l_1 BP. 

The implication here is that the channel usage is in fact sparse on the whole. I asked about the situation in which the communication medium is in fact full, i.e. in the case of network congestion and Ahmad said "it wouldn't work." I can see this as being a potential negative to this process, but otherwise interesting. Ahmad will be presenting his work "Accelerometer-Based Gesture Recognition via Dynamic-Time Warping, Affinity Propagation, & Compressive Sensing" on Friday, so I'll try and stop by for his poster.

Also talked to Markus Flierl about his paper "A l1-Norm Preserving Motion Compensated Transform for Sparse Approximation of Image Sequences" which was a pretty heady work and I would be dishonest if I said that I followed all of it, but I'll try to convey the gist. He suggested to start with his previous work "A Motion-Compensated Orthogonal Transform with Energy Concentration Constraint". He said that the idea was "not practical" but more of an exercise in what is possible. I only started to begin to understand this while eating lunch at Chen's Kitchen, just down the street from the hotel. Sungkwang and were trying to think about what you could actually do with an l1 preserving transform, but we came up empty. I'm not really sure what this would be useful for, but I think that was Markus meant by "not practical". I asked him if he thought there was any confluence between these ideas and Compressive Sensing, but he was of the opinion that they would be unsuited due to CS's emphasis on linear projections into the measurement basis. I'm sure someone will find use for this, however!


Next, I spoke with Dimitri Milioris about his group's work "Multiple-Measurement Bayesian Compressed Sensing Using GSM Priors for DOA Estimation". Before today, I hadn't really ever heard of Arrival (DOA), but it seems to be pretty popular with the beamforming crowd. Also, Bayesian CS seems to have a really good showing at ICASSP this year. There are quite a few papers being presented that make use of this methodology. To be honest, I've kind of stayed far away from BCS because of some its complexity, and the simplicity of iterated methods just make sense to me, but the Bayesian approach seems very powerful in the noisy setting. I'll have to sit down and really get into it a bit more. Anyway, back to the paper at hand. Basically, in this work, they are trying to determine an angle in single degree precision (though I'm sure it could be upped fairly easily) that is the correct angle between the receiver and the source. There are 180 angles to choose from in this example. This is represented by a 180 element vector, consisting of 0's at the incorrect angles and a 1 at the correct angle. This is the target of the reconstruction. Also, their method makes use of multiple receivers to add a bit more informational power to the system. Dimitri said they found optimal performance using Gaussian Scale Mixtures due to their closeness to Dirac's, but with heavy tails. 

Dimitri also tipped me off to a paper by his colleague, George Tazagkarakis "Bayesian Compressed Sensing Imaging Using a Gaussian Scale Mixture" which will be presented as a poster on Thursday. I talked to George a little bit about the complications of having all your techniques being restricted to the decoder side and the problematic nature of CS in that it becomes very non-intuitive to apply the power of modern DSP methods when you simply don't have the original data to work with (modelling, motion compensation, etc). He also told me that these papers were presented a little out of order, as the DOA paper is an extension of the imaging technique. I look forward to asking some questions about BCS with GSM on Thursday.

Unfortunately, I was not able to attend the lecture on "Using Reed-Muller Sequences as Deterministic Compressed Sensing Matrices for Image Reconstruction" or "Non-Convex Group Sparsity: Application to Color Imaging" as both of these took place during the Biomedical Imaging session which took place concurrent to the Scalable and Multiview Coding session I attended. However, it seems I would have been better served going to the Biomed sessions since 4/5 of the authors did not show up to the session I attended. Dr. Fowler was chairing the session, and to quote him it was "an embarrassment." He told us that perhaps one in ten authors might not attend, but four in five was kind of a new low. It seems that it is becoming more frequent that authors are submitting their papers to ICASSP/ICIP and paying the attendance fee, but never intending on actually showing up for their presentations in the hopes that the paper will slide under the door and still make it into the Proceedings. This gets kind of stale after a while and makes me wonder if the IEEE will begin actually enforcing its policies and become stricter on blacklisting. But, perhaps its just a condition of the economic times and it'll be better in the future. We'll have to see.

Thats all I have for now! I'll be making an update later tonight after attending the Video Motion Analysis session this evening which will feature two quite interesting papers...that is, if they come ;)

 

 


Tuesday
Mar162010

Day 2 -- It Begins

If you are at the conference and don't know where to go for dinner, go to Sol Irlandés. Its has a weird name, but its delicious.

 

I was able to take a look at the schedule last night, and I've put together a preliminary schedule for today.

Tuesday: 

  • 9:00 - 10:00: Plenary Talk
  • 11:30 - noon: "Using Reed-Muller Sequences as Deterministic Compressed Sensing Matrices for Image Reconstruction"
  • noon - 12:30: "Multiple-Measurement Bayesian Compressed Sensing Using GSM Priors for DOA Estimation"
  • 2:00 - 2:30: "Compressed Sensing for Bandwidth Constrained Systems"
  • 4:00 - 6:00: Video Motion Analysis & Object Tracking Session, including the talk "Motion Estimation from Compressed Linear Measurements"

Looks like it will be a fairly full day! I'll update as I can throughout the day with more information about these and other talks.