Sunday, January 31, 2010

Going Hollywood with Compressive Sensing

As some of you know, Justin Romberg has been an advisor to the Numb3rs television series. Well it looks like the screenwriters of the new Battlestar Galactica TV series could have used a scientific advisor of that caliber (by the way, I did not mean to say that screenwriters for Numb3rs actually used all of Justin's inputs). Chris Bilder points out how performing Group Testing to detect Cylons could have helped the script and change the storyboard (I mentioned this study before). First, let us give some context about what the Cylons are:

Then let us introduce Chris Bilder's presentation (first 10 minutes of the presentation) on the subject:

Chris Bilder's page on his Chance paper is here. The full video of the talk is here. Let us recall that group testing is connected to compressive sensing in that it solves the same problem, finding a sparse element in a large batch of elements with the fewest number of tests (measurements). As it turns out, my webcrawler found Mark Iwen's latest presentation on the subject entitled Interpolation with Sparsity Assumptions: From Syphilis Testing to Sparse Fourier Transforms.

But coming back to the title of this entry, wouldn't it be neat if one could use a random lens imager in one of those CSI plots ?

Saturday, January 30, 2010

CS: Q&A with Esther Rodriguez-Villegas on a Compressive Sensing EEG

[I have updated this entry after some thoughts]

In Thursday's entry, I mentioned the paper by Amir Abdulghani, Alexander Casson and Esther Rodriguez-Villegas entitled Quantifying the Feasibility of Compressive Sensing in Portable Electroencephalography Systems where the authors look at an EEG compressive sensing system. Then, I stated the following:

....The result for 20 channels springs from applying a compressed sensing design such as the one shown above. The question I have is really how can we change this design entirely: For instance, instead of applying a simple \phi measurement to the signals after they have been amplified and digitized, what about providing some randomness in the reference voltage ? what about removing entirely both the Op-amps ?...

I had a small conversation with Esther Rodriguez-Villegas and here is her response to the question above republished with her permission (-as usual-)

Dear Igor,

Thanks for your email. We've heard of your blog before. It is very interesting.


On a different note we found the EEG discussion on your blog interesting. I personally don't think that the randomness in the reference voltage would work though. The latter would somehow be equivalent to "downsampling the signal with dither", or in different words, the randomness in the reference voltage would create a similar effect to the input noise already introduced by the amplifiers themselves and then I presume one would be downsampling. However I don't think this would be possible because the minimum sampling rate was determined to be what it is for different applications considering the existing amplifiers noise, so I doubt it would be viable to further reduce this sampling rate after adding a signal which, from the circuit point of view, would play a similar role to this noise. The other alternative of switching on and off the reference would not be a practical option either because of slew-rate limitations of the amplifiers, which could only be overcome by pumping otherwise unnecessary biasing currents.

At the same time, eliminating the amplifiers is not an option because they are needed to "adjust" the already very weak signal levels to the ones required by the A/D converters, and in some cases also act as filters for, for example, the offset drift introduced by the electrodes, as well as high frequency noise. One could argue that having a better converter to cope with an increased range may do the job, but the problem with that would be that converters themselves require considerable amount of power, and as the number of bits of resolution they provide increases they become much more power hungry (apart from considerably more difficult to design). Also, if one thinks of a system that ideally should be operated on small batteries which provide a very limited voltage, this on its own represents a limitation for the converter range. So in the end, in the context of a low power portable system, for the front end, having an amplifier followed by a lower resolution converter is going to be the best option as long as the blocks are intelligently designed and customised, which from the circuit designer perspective is not a simple issue at all.

Now, your overall suggestion is interesting because even if those two options are not possible, there may be alternative ones that are. Fundamentally, I would say that thinking of a system in which part of the processing which is now thought of at a digital level, is "transferred back" to the front end and carried out in the analog domain may potentially significantly reduce both, the "requirements" of the converter and also the overall power consumption of the approach. This is because analog can offer a massively better tradeoff in terms of computational complexity versus power than digital, as long as speed of the operations is not a major issue and we are not trying to deal with resolutions of more than about 10 bits. So as a conclusion, I would say that I definitely agree with you in the fact that we should "think out of the box" and make an effort to redefine the system. I would go further than that to say that most probably that "redefinition" should follow an approach in which part of the compression algorithm is implemented earlier on in the "system chain" possibly following an analog based approach.

Anyway, to this end, we have a paper under review at the moment showing the performance of a straightforward CS application to EEG, tailored to establish a baseline performance of the approach to aid future EEG designers to make system level decisions. From that paper I would say that in general better performance is needed, although it is left to the application specific designer to decide. Certainly however there is a lot of potential here for EEG-tailored system level oriented compressive sensing research.

Best regards,


I did not realize that the op-amp was also acting as a filter, this is good to know. From this description, it seems obvious that at some point one should investigate an electrode based solution. While roaming through the openEEG list, I was made aware of the fact that there are two types of electrodes : active and passive. The passive ones require that they be wetted with eye drops in order to make a good contact with the scalp. In case, they are not wetted, there seems to be an extreme sensitivity of the electrode to the environmental EMI coming from your everyday appliances. In that case, one could conceive the use of shielded electrodes. The active electrodes remove the need for this eye drop liquid agent but there is the danger of having a "live" electrode. With this in mind, one could probably think of the' early in the chain' comment that Esther suggests, whereby a very low voltage active electrode could provide some type of random modulation.

Since we are thinking outside the box and we are dealing with amplifiers (or their potential removal) there is also the possibility of extreme quantization of the signal. Work along those lines include that of Petros Boufounos and Richard Baraniuk in 1-Bit Compressive Sensing and that of Laurent Jacques, David Hammond and Jalal Fadili, as featured here.

Finally, with regards to Esther's other comment on the reference voltage, it seems that the concept of Garbage in, Garbage out does not really apply in compressive sensing when done right. I am making a particular reference to an early paper by Joel Tropp, Michael Wakin, Marco Duarte, Dror Baron, and Richard Baraniuk entitled Random Filters For Compressive Sampling And Reconstruction where one can read the following sentence:
At first glance, one might think this method would convert a signal into garbage.
As I said before, you don't often see that type of assessment in publications. I tried these random filters, and it worked for the limited case I computed.

Finally, to give some perspective with regards to low cost EEGs used in BCI-like systems, a current low cost commercially available Emotiv EEG system, with 14 electrodes, is limited by its battery capability of about 12 hours. It also uses passive electrodes and therefore liquid drops as the liquid needed to make good contact with the scalp. One can expect this liquid to evaporate over a 12 hour time period.

Thanks Esther for sharing your views!

Credit Photo: Emotiv, Emotiv Headset.

Friday, January 29, 2010

CS: On Low Rank Matrix Approximations with Applications to Synthesis Problem in CS, Learning Sparse Systems, ADM for NMF, Proximal Splitting Methods,

To finish this week, here are several papers directly or not so directly related to compressive sensing. Enjoy!

On Low Rank Matrix Approximations with Applications to Synthesis Problem in Compressed Sensing by Anatoli Juditsky, Fatma Kilinc Karzan, Arkadii S. Nemirovski. The abstract reads:
We consider the synthesis problem}of Compressed Sensing - given s and an MXn matrix A, extract from it an mXn submatrix A', certified to be s-good, with m as small as possible. Starting from the verifiable sufficient conditions of s-goodness, we express the synthesis problem as the problem of approximating a given matrix by a matrix of specified low rank in the uniform norm. We propose randomized algorithms for efficient construction of rank k approximation of matrices of size mXn achieving accuracy bounds O(1)sqrt({ln(mn)/k) which hold in expectation or with high probability. We also supply derandomized versions of the approximation algorithms which does not require random sampling of matrices and attains the same accuracy bounds. We further demonstrate that our algorithms are optimal up to the logarithmic in m and n factor. We provide preliminary numerical results on the performance of our algorithms for the synthesis problem.

We propose a novel algorithm for sparse system identification in the frequency domain. Key to our result is the observation that the Fourier transform of the sparse impulse response is a simple sum of complex exponentials, whose parameters can be efficiently determined fromonly a narrowfrequency band. From this perspective, we present a sub-Nyquist sampling scheme, and show that the original continuous-time system can be learned by considering an equivalent low-rate discrete system. The impulse response of that discrete system can then be adaptively obtained by a novel frequency-domain LMS filter, which exploits the parametric structure of the model. Numerical experiments confirm the effectiveness of the proposed scheme for sparse system identification tasks.

An Alternating Direction Algorithm for Nonnegative Matrix Factorization by Yin Zhang, The abstract reads:
We extend the classic alternating direction method for convex optimization to solving the non-convex, nonnegative matrix factorization problem and conduct several carefully designed numerical experiments to compare the proposed algorithms with the most widely used two algorithms for solving this problem. In addition, the proposed algorithm is also briefly compared with two other more recent algorithms. Numerical evidence shows that the alternating direction algorithm tends to deliver higher-quality solutions with faster computing times on the tested problems. A convergence result is given showing that the algorithm converges to a Karush-Kuhn-Tucker point whenever it converges.

Proximal Splitting Methods in Signal Processing by Patrick L. Combettes, and Jean-Christophe Pesquet. The abstract reads:
The proximity operator of a convex function is a natural extension of the notion of a projection operator onto a convex set. This tool, which plays a central role in the analysis and the numerical solution of convex optimization problems, has recently been introduced in the arena of signal processing, where it has become increasingly important. In this paper, we review the basic properties of proximity operators which are relevant to signal processing and present optimization methods based on these operators. These proximal splitting methods are shown to capture and extend several well-known algorithms in a unifying framework. Applications of proximal methods in signal recovery and synthesis are discussed.

Piotr Indyk has a talk on Sparse Recovery Using Sparse Matrices, given in various forms in various places.

Holger Rauhut has some lecture notes entitled Compressive sensing and structured random matrices. These lecture Notes are still under construction. If you have comments of any sort, he would be happy to receive them.

Thursday, January 28, 2010

CS: CS for Missing Data Imputation and Digit Recognition, CS EEG, CS 60 GHz UWB, Training Over Sparse Multipath Channels, CFP

Jort Gemmeke just sent me the following:
Hi Igor,

I finally found time to update my website ( There are two new papers which are probably of interest for this community. The first is "Compressive Sensing for Missing Data Imputation in Noise Robust Speech Recognition", which is a more detailed description of my work on compressive sensing for imputing missing data in speech.

In "Noise robust exemplar-based connected digit recognition" we use the same concept of representing speech as a sparse linear combination of exemplars, but in this work we model the noisy speech as a linear combination of both speech and noise exemplars. Rather than use the activations to do source seperation, we do classification directly on the clean speech exemplar activations. Interestingly, although not discussed in detail in the paper, we obtained better (much) results using a sparse solver which minimizes the Kulback-Leibler divergence than the customary sparse solvers which minimize the Euclidean distance.


The emphasis was mine, I am sure that at some point, we will see more of these KL based solvers. The two papers are: Compressive Sensing for Missing Data Imputation in Noise Robust Speech Recognition by Jort Florent Gemmeke, Hugo Van hamme, Bert Cranen, Lou Boves.The abstract reads:
An effective way to increase the noise robustness of automatic speech recognition is to label noisy speech features as either reliable or unreliable (missing), and to replace (impute) the missing ones by clean speech estimates. Conventional imputation techniques employ parametric models and impute the missing features on a frame-by-frame basis. At low SNR’s these techniques fail, because too many time frames may contain few, if any, reliable features. In this paper we introduce a novel non-parametric, exemplar based method for reconstructing clean speech from noisy observations, based on techniques from the field of Compressive Sensing. The method, dubbed sparse imputation, can impute missing features using larger time windows such as entire words. Using an overcomplete dictionary of clean speech exemplars, the method finds the sparsest combination of exemplars that jointly approximate the reliable features of a noisy utterance. That linear combination of clean speech exemplars is used to replace the missing features. Recognition experiments on noisy isolated digits show that sparse imputation outperforms conventional imputation techniques at SNR = −5 dB when using an ideal ‘oracle’ mask. With error-prone estimated masks sparse imputation performs slightly worse than the best conventional technique.
and Noise robust exemplar-based connected digit recognition by Jort Gemmeke and Tuomas Virtanen. The abstract reads:
This paper proposes a noise robust exemplar-based speech recognition system where noisy speech is modeled as a linear combination of a set of speech and noise exemplars. The method works by finding a small number of labeled exemplars in a very large collection of speech and noise exemplars that jointly approximate the observed speech signal. We represent the exemplars using melenergies, which allows modeling the summation of speech and noise, and estimate the activations of the exemplars by minimizing the generalized Kullback-Leibler divergence between the observations and the model. The activations of the speech exemplars are directly being used for recognition. This approach proves to be promising, achieving up to 55.8% accuracy at signal-to-noise ratio −5 dB on the AURORA-2 connected digit recognition task.

You probably recall this paper entitled Sparse Event Detection in Wireless Sensor Networks using Compressive Sensing, now there is a video presentation by one of the author and it is in Chinese.

The EEG for use in augmented cognition produces large amounts of compressible data from multiple electrodes mounted on the scalp. This huge amount of data needs to be processed, stored and transmitted and consumes large amounts of power. In turn this leads to physically large EEG units with limited lifetimes which limit the ease of use, and robustness and reliability of the recording. This work investigates the suitability of compressive sensing, a recent development in compression theory, for providing online data reduction to decrease the amount of system power required. System modeling which incorporates a review of state-of-the-art EEG suitable integrated circuits shows that compressive sensing offers no benefits when using an EEG system with only a few channels. It can, however, lead to significant power savings in situations where more than approximately 20 channels are required. This result shows that the further investigation and optimization of compressive sensing algorithms for EEG data is justified.
The result for 20 channels springs from applying a compressed sensing design such as the one shown above. The question I have is really how can we change this design entirely: For instance, instead of applying a simple \phi measurement to the signals after they have been amplified and digitized, what about providing some randomness in the reference voltage ? what about removing entirely both the Op-amps ?

In other news, the googlebot found the following paper:

60 GHz ultra wide-band (UWB) communication is an emerging technology for high speed short range communications. However, the requirement of high-speed sampling increases the cost of receiver circuitry such as analog-to-digital converter (ADC). In this paper, we propose to use a compressive sensing framework to achieve a significant reduction of sampling rate. The basic idea is based on the observation that the received signals are sparse in the time domain due to the limited multipath effects at 60 GHz wireless transmission. According to the theory of compressive sensing, by carefully designing the sensing scheme, sub-Nyquist rate sampling of the sparse signal still enables exact recovery with very high probability. We discuss an implementation for a low-speed A/D converter for 60 GHz UWB received signal. Moreover, we analyze the bit error rate (BER) performance for BPSK modulation under RAKE reception. Simulation results show that in the single antenna pair system model, sampling rate can be reduced to 2.2% with 0.3dB loss of BER performance if the input sparsity is less than 1%. Consequently, the implementation cost of ADC is significantly reduced.

I also found the following preprints on Arxiv: Training Over Sparse Multipath Channels in the Low SNR Regime by Elchanan Zwecher, Dana Porrat. The abstract reads:
Training over sparse multipath channels is explored. The energy allocation and the optimal shape of training signals that enable error free communications over unknown channels are characterized as a function of the channels' statistics. The performance of training is evaluated by the reduction of the mean square error of the channel estimate and by the decrease in the uncertainty of the channel. A connection between the entropy of the wideband channel and the required energy for training is shown. In addition, there is a linkage between the sparsity and the entropy of the channel to the number of required channel measurements when the training is based on compressed sensing. The ability to learn the channel from few measurements is connected to the low entropy of sparse channels that enables training in the low SNR regime.

We present a computationally-efficient method for recovering sparse signals from a series of noisy observations, known as the problem of compressed sensing (CS). CS theory requires solving a convex constrained minimization problem. We propose to transform this optimization problem into a convex feasibility problem (CFP), and solve it using subgradient projection methods, which are iterative, fast, robust and convergent schemes for solving CFPs. As opposed to some of the recently-introduced CS algorithms, such as Bayesian CS and gradient projections for sparse reconstruction, which become prohibitively inefficient as the problem dimension and sparseness degree increase, the newly-proposed methods exhibit a marked robustness with respect to these factors. This renders the subgradient projection methods highly viable for large-scale compressible scenarios.
The code implementing the CFP is here. Thanks Jort.

Credit: NASA, Spirit, Navigation Camera, Sol 2156. As you all know the big news tuesday was that Spirit is likely not moving any time soon.

Wednesday, January 27, 2010

CS: A note, "Compressed" CS, Statistical Mechanical Analysis of CS

Found on Math Overflow a question and answer on an inequality related to compressed sensing. In the comment section of yesterday's entry, Laurent Jacques made the following statement:

...In connection with the last paper mentioned in your post [On the Systematic Measurement Matrix for Compressed Sensing in the Presence of Gross Errors by Zhi Li, Feng Wu and John Wright], if I'm not wrong, an RIP analysis of matrix [Phi I] (and even [Phi Omega] for any MxM orthonormal matrix Omega) has been realized before in:

Mark Davenport, Jason Laska, Petros Boufounos, and Richard Baraniuk, A simple proof that random matrices are democratic. (Rice University ECE Department Technical Report TREE-0906, November 2009)

and used here:

Jason Laska, Mark Davenport, and Richard Baraniuk, Exact signal recovery from sparsely corrupted measurements through the pursuit of justice. (Asilomar Conf. on Signals, Systems, and Computers, Pacific Grove, California, November 2009) (see Theorem 2 inside)

It seems that Zhi Li, Feng Wu and John Wright. do not know these (previous/simultaneous?) works, and they reproved RIP for [Phi I] in their Section V. It seems however that their proof is slightly different from that of Davenport et al., and the bound on the minimal number of measurement not exactly the same. Notice also that Davenport et al. proved this result for any RIP matrix enlarged with the identity, and not only enlarged Gaussian matrices....

Thanks Laurent ! I hope this has nothing to do with the fact some of our readers in China cannot read this blog directly ?

Dick Gordon, one of the discoverer of the ART algorithm, also added something to the Radiative Transfer papers in the comment section:

re: Inverse radiative transfer and melanoma

Here’s some more prior literature on the inverse radiative transfer problem:

Chandrasekhar, S. (1960). Radiative Transfer. New York, Dover Publications.

Nevoscopy, the 3D imaging of nevi (beauty marks), which are all potential skin melanomas, has been addressed from the point of view of forward and inverse radiative transfer:

Patwardhan, S.V., A.P. Dhawan & P.A. Relue (2003). Monte Carlo simulation of light-tissue interaction: three-dimensional simulation for trans-illumination-based imaging of skin lesions. IEEE Transactions on Biomedical Engineering 52(7), 1227-1236.

Kini, P. & A.P. Dhawan (1992). Reconstruction of embedded absorbers in random media with applications in non-invasive 3D imaging of skin lesions. Proc. SPIE 1767, 360-371.

The initial approach was standard 3D computed tomography, albeit from only 3 views:

Dhawan, A.P., R. Gordon & R.M. Rangayyan (1984). Nevoscopy: three-dimensional computed tomography for nevi and melanomas in situ by transillumination. IEEE Trans. Med. Imaging MI-3(2), 54-61.

The intent is to obtain the nevus thickness, which is the most prognostic factor for metastasis: once the melanoma penetrates the dermis, cure rate on excision drops from 95% to 5%. Differences in thickness of 0.1mm matter, so this is not something your partner can detect. Actual shape doesn’t matter much, making this an ideal CS application. I’d like to see nevoscopy developed into a whole body robotic skin scanner for detecting premetastasis melanoma. Contact me at if interested. Thanks.
Yours, -Dick Gordon

Thanks Dick for the insight! The 2003 paper still seem to be looking at the forward problem which really says to me that the 1992 paper did not do well in other cases than the ones looked in the paper. Thanks to yesterday's series of papers we can hope to have better reconstruction capabilities in the future.

Found on arxiv, the following three preprints: "Compressed" Compressed Sensing by Galen Reeves and Michael Gastpar. The abstract reads:
The field of compressed sensing has shown that a sparse but otherwise arbitrary vector can be recovered exactly from a small number of randomly constructed linear projections (or samples). The question addressed in this paper is whether an even smaller number of samples is sufficient when there exists prior knowledge about the distribution of the unknown vector, or when only partial recovery is needed. An information-theoretic lower bound with connections to free probability theory and an upper bound corresponding to a computationally simple thresholding estimator are derived. It is shown that in certain cases (e.g. discrete valued vectors or large distortions) the number of samples can be decreased. Interestingly though, it is also shown that in many cases no reduction is possible.

We use the replica method of statistical mechanics to examine a typical performance of correctly reconstructing $N$-dimensional sparse vector $\bx=(x_i)$ from its linear transformation $\by=\bF \bx$ of dimension $P$ on the basis of minimization of the $L_p$-norm $||\bx||_p=\lim_{\epsilon \to +0} \sum_{i=1}^N |x_i|^{p+\epsilon}$. We characterize the reconstruction performance by the critical relation of the successful reconstruction between the ratio $\alpha=P/N$ and the density $\rho$ of non-zero elements in $\bx$ in the limit $P, N \to \infty$ while keeping $\alpha \sim O(1)$ and allowing asymptotically negligible reconstruction errors. When $\bF$ is composed of independently and identically distributed random variables of zero mean and a fixed variance, the critical relation $\alpha_c(\rho)$ for $p=1$ accords with that obtained by Donoho ({\em Discrete and Computational Geometry}, vol. 35, pp. 617--652, 2006) who utilized a technique of combinatorial geometry and allowed no reconstruction errors. We also show that the critical relation does not change as long as $\bF^{\rm T}\bF$ can be characterized by a rotationally invariant random matrix ensemble and $\bF \bF^{\rm T}$ is typically of full rank.

We investigate a reconstruction limit of compressed sensing for a reconstruction scheme based on the L1-norm minimization utilizing a correlated compression matrix with a statistical mechanics method. We focus on the compression matrix modeled as the Kronecker-type random matrix studied in research on multi-input multi-output wireless communication systems. We found that strong one-dimensional correlations between expansion bases of original information slightly degrade reconstruction performance.

Image Credit: NASA/JPL/Space Science Institute. Saturn as seen by Cassini on Jan 25th, 2010.

Tuesday, January 26, 2010

CS: Mutilevel Bioluminescence tomography based on Radiative Transfer Equation, Measurement Matrix for CS in the presence of Gross Errors

As some of you know, I have a continuing interest in the use of compressive sensing for the linear transport equation or the radiative transfer equation as it is sometimes called in a different field. Here are some of the first few papers taking a stab at it. Let us first note that these papers aim at taking care of difficult geometries. As it turns out this generally means that some slack is given to the need for accuracy. For papers that deal with the RTE with very high accuracy (but in 1-D) see Chuck Siewert's papers especially this one. Anyway, here is the l_1 version of RTE problems, let us hope these can guide us into building relevant sensing hardware:
Multilevel bioluminescence tomography based on radiative transfer equation Part 1: l1 regularization by Hao Gao and Hongkai Zhao. The abstract reads:
In this paper we study an l1-regularized multilevel approach for bioluminescence tomography based on radiative transfer equation with the emphasis on improving imaging resolution and reducing computational time. Simulations are performed to validate that our algorithms are potential for efficient high-resolution imaging. Besides, we study and compare reconstructions with boundary angular-averaged data, boundary angular resolved data and internal angular-averaged data respectively.

Multilevel bioluminescence tomography based on radiative transfer equation Part 2: total variation and l1 data fidelity by Hao Gao and Hongkai Zhao. The abstract:
In this paper we study the regularization with both l1 and total variation norm for bioluminescence tomography based on radiative transfer equation, compare l1 data fidelity with l2 data fidelity for different type of noise, and propose novel interior-point methods for solving related optimization problems. Simulations are performed to show that our approach is not only capable of preserving shapes, details and intensities of bioluminescent sources in the presence of sparse or non-sparse sources with angular-resolved or angular-averaged data, but also robust to noise, and thus is potential for efficient high-resolution imaging with only boundary data.

from that paper:

....As a remedy for poor reconstruction due to forward modeling by DA, we use radiative transfer equation (RTE) [13-16] as the forward model. Physically, RTE is an accurate model for photon migration in tissues. Moreover, with RTE one can utilize much richer data, such as boundary angular-resolved measurements, whose dimension can match that in the interior region. Mathematically, it was shown [17, 18] that the RTE-based inverse source problem with angular-resolved boundary measurements almost always has a unique solution with certain stability. Numerically it means that a better conditioned and less underdetermined system resulted from RTE-based BLT than from DA-based BLT. So in those applications in non-diffusive regime, it is more appropriate to use RTE model and better image
reconstruction is expected....

I need to write down the idea I mentioned earlier on the subject. The solver was featured in A Fast Forward Solver of Radiative Transfer Equation by Hao Gao and Hongkai Zhao. The abstract reads:
In this paper we develop an efficient forward solver for steady-state or frequency-domain radiative transfer equation (RTE) on 2D and 3D structured and unstructured meshes with vacuum boundary condition or reflection boundary condition. In our algorithm we use a direct angular discretization and upwind type of spatial discretization that preserves properties of both scattering and differential operators of RTE. To solve the large linear system after discretization, we construct an efficient iterative scheme based on Gauss-Seidel and proper angular dependent ordering. With this iterative scheme as the relaxation we implement various multigrid methods in both angular and physical space. Our algorithm can deal with different scattering regimes efficiently. Although we emphasize applications in optical imaging, our method may be applied to more general RTEs, e.g., other types of scattering function. Efficiency and accuracy of our algorithm is demonstrated by comparison with both analytical solutions and Monte Carlo solutions, and various numerical tests in optical imaging.

The code can be found here (or here if you are in a country that does not support Google)

Finally, here is another very nice paper: On the Systematic Measurement Matrix for Compressed Sensing in the Presence of Gross Errors by Zhi Li, Feng Wu and John Wright. The abstract reads:
Inspired by syndrome source coding using linear error-correcting codes, we explore a new form of measurement matrix for compressed sensing. The proposed matrix is constructed in the systematic form [A I], where A is a randomly generated submatrix with elements distributed according to i.i.d. Gaussian, and I is the identity matrix. In the noiseless setting, this systematic construction retains similar property as the conventional Gaussian ensemble achieves. However, in the noisy setting with gross errors of arbitrary magnitude, where Gaussian ensemble fails catastrophically, systematic construction displays strong stability. In this paper, we prove its stable reconstruction property. We further show its l_1-norm sparsity recovery property by proving its restricted isometry property (RIP). We also demonstrate how the systematic matrix can be used to design a family of lossy-to-lossless compressed sensing schemes where the number of measurements trades off the reconstruction distortions.
The proof in the paper are here.

Friday, January 22, 2010

CS: Message passing on dense graphs with applications to compressed sensing , NIPS'09 videos

The dynamics of message passing on dense graphs, with applications to compressed sensing by Mohsen Bayati and Andrea Montanari. The abstract reads:
Approximate message passing' algorithms proved to be extremely effective in reconstructing sparse signals from a small number of incoherent linear measurements. Extensive numerical experiments further showed that their dynamics is accurately tracked by a simple one-dimensional iteration termed `state evolution'. In this paper we provide the first rigorous foundation to state evolution. We prove that indeed it holds asymptotically in the large system limit for sensing matrices with iid gaussian entries.
While our focus is on message passing algorithms for compressed sensing, the analysis extends beyond this setting, to a general class of algorithms on dense graphs. In this context, state evolution plays the role that density evolution has for sparse graphs.
also on the same subject, the 15th Annual LIDS student conference at MIT will feature Message-passing For Compressed Sensing by Venkat Chandar. The abstract of the talk is:

We propose a message-passing algorithm to recover a non-negative vector x from given linear measurements y=Ax, where A is an m-by-n matrix. The algorithm is very similar to the belief propagation algorithm(s) utilized in the context of decoding low density parity check codes. We establish that when A corresponds to the adjacency matrix of a bipartite graph with sufficient expansion, the algorithm produces a reconstruction r(x) of x satisfying

Formula,where x(k) is the best k-sparse approximation of x. The algorithm performs Formula computation in total, and the number of measurements required is Formula. In the special case when x is k-sparse, the algorithm recovers x exactly in time Formula. Conceptually, this work provides a rigorous connection between the theory of message-passing algorithms and compressed sensing that has been alluded to in many of the recent prior works.

Finally, the NIPS'09 videos are out. The ones that I mentioned before can be found below:
Of interest:
Thanks David Miraut Andres for the pointer.

Image Credit: NASA/JPL/Space Science Institute, Titan as seen by Cassini the day before yesterday on January 20th, 2010.

Thursday, January 21, 2010

CS: Recovering Lost Sensor Data through CS, A Question to Anders Hansen

You nay recall Zainul Charbiwala's question on computing the Restricted Isometry Constant (part 1, part 2, part 3, part 4). Well he just gave a talk entitled Recovering Lost Sensor Data through Compressed Sensing on January 15th. The abstract read:

Data loss in wireless sensing applications is inevitable, both due to communication impairments as well as faulty sensors. We introduce an idea using Compressed Sensing (CS) that exploits knowledge of signal model for recovering lost sensor data. In particular, we show that if the signal to be acquired is compressible, it is possible to use CS not only to reduce the acquisition rate but to also improve robustness to losses.This becomes possible because CS employs randomness within the sampling process and to the receiver, lost data is virtually indistinguishable from randomly sampled data.To ensure performance, all that is required is that the sensor over-sample the phenomena, by a rate proportional to the expected loss.

In this talk, we will cover a brief introduction to Compressed Sensing and then illustrate the recovery mechanism we call CS Erasure Coding(CSEC). We show that CSEC is efficient for handling missing data in erasure (lossy) channels, that it parallels the performance of competitive coding schemes and that it is also computationally cheaper. We support our proposal through extensive performance studies on real world wireless channels.

The slides are here. I note the use of the trademarked words "big picture - compressed sensing" :-)

Back when I read about by Anders Hansen's article on Compressive Sampling in Infinite Dimensions several days ago, Because he has done some work on pseudospectra and I broached on the subject here before, I went out and asked him the following dumb question:

... I just noticed your work on CS in infinite dimensional spaces. I also noticed from your publication your interest in non-normality. I have a really dumb question for you. As you know, at some engineering level, people want to know if their underdetermined systems found through the discretization of the physics of their problem could fall into something accessible to CS. There are current attempts undertaken by different researchers to look for a way to computationally check if a sufficient condition based on the RIP or the nullspace condition is checkable in a relatively fast fashion. However these condition checking exercises fail when the matrices become large.

Here is my really dumb question, do you know if there is some type of connection between the pseudo-spectrum of a linear operator and any of the conditions (RIP, KGG, NullSpace) required for finding sparse solutions using l1 minimization (the typical CS problems.)
Anders Hansen very kindly responded with the following:

...Thanks so much for your interest in my work. I must apologize for the late replay, but I have been traveling. The question you ask is certainly not dumb, in fact the answer is not obvious at all. However, before I continue with the question let me mention that the RIP condition is a very strong requirement. In fact in most of the reconstruction problems that I encounter, that is never satisfied, but reconstruction is still possible. It is really the randomness of the measurements that allow for this. If you check out my paper on this you'll see that there is no mentioning of the RIP and there are other conditions (much weaker) that one has to check in order to get recovery (these can also be checked numerically). Anyway, there are tons of work to be done on this.

Now for the pseudospectrum question. This must be investigated, but my gut feeling is that there is no obvious connection. Consider the following problem. Let A be a self-adjoint and U the discrete fourier tranform (DFT). Now U is unitary and by randomly choosing row from U (how many one must choose depends on the sparsity and the dimension of the space) it is likely that one can recover a sparse signal (or even get the RIP), but it is the incoherence of the DFT that is crucial. Now U is unitary, so normal, hence the epsilon-pseudospectrum is just the epsilon neighborhood around the spectrum. Also, A is self-adjoint thus the pseudospectral qualities are the same as for U, however A may very well be the identity, and that will not be useful for any recovery purpose. Thus, one can have two operators with very different recovery properties and same pseudospectral qualities. Now this example does not say that there may not be any information from the pseudospectrum. Consider two operators with the same incoherence and different pseudospectrum. Can one deduce something from that observation. I don't think so, but as I say, this is not obvious. The reason why I am slightly sceptical is that incoherence depends on the choice of basis. Pseudospectra do not change by changing the basis.

So it looks like this was a dumb question, except maybe if one put some constraint on what to look for in the pseudospectrum.... Thanks Anders !

As it turns out both items in today's entry are about the RIP....

Bon Voyage Papa

Wednesday, January 20, 2010

CS: Advances in Multirate Filter Bank Structures and Multiscale Representations, Concentration of Measure

Laurent Duval sent me the following announcement:

As we celebrate the centenary of the Haar wavelet, we wish you a happy 2010 and kindly invite you to contribute to the following special issue of Elsevier Signal Processing (deadline: 15/02/2010):

Advances in Multirate Filter Bank Structures and Multiscale Representations

The purpose of this special issue is to report on recent progresses performed and emerging trends in the domain of multirate filter banks and multiscale representations of signals and images. Answers to the challenge of handling an increasing demand of information extraction and processing from large data sets will be explored.

Sampling theory, compressive sensing . Sparse representations . Multiscale models . Multiscale processing: interpolation, inpainting, restoration . Wavelet shrinkage and denoising . Oversampled filter banks, discrete frames . Rational and non-uniform multirate systems . Directional, steerable filter banks and wavelets . Nonlinear filter banks . (Multidimensional) filter bank design and optimization . Hybrid analog/digital filter banks . Fast and low-power schemes (lifting, integer design) . Multiscale and multirate applications to source and channel coding, equalization, adaptive filtering,...

Guest editors:
Prof. Thierry Blu
Dr. Laurent Duval
Prof. Truong Q. Nguyen
Prof. Jean-Christophe Pesquet

Additional information:

In other news, Terry Tao has a series of post on the concentration of measure and the central limit theorem and there are two blog posts, one about Emmanuel Candes at SODA and the other one about David Gross at QIP.

Image Credit: NASA/JPL/Space Science Institute

Tuesday, January 19, 2010

CS: Dick Gordon's Op-Ed, Lianlin Li's Open Question on Compressible Priors

Dick Gordon, one of the first to develop ART for CT tomography just sent me the following (a follow up to a previous entry). I am reprinting it as is (except for some edit on the form and hyperlinks)

How to get CT out of its high dose rut: CS for CT x-ray dose reduction is a political issue

The article:

Pan, X., E.Y. Sidky & M. Vannier (2009). Why do commercial CT scanners still employ traditional, filtered back-projection for image reconstruction? Inverse Problems 25, 36p. #123009.

places the blame for what Igor Carron calls “the inability of ART to topple FBP“ on the theoreticians and the manufacturers, somehow expecting the latter’s engineers, stuck in the middle, to overcome a number of sociological problems. These are scattered through the
paper, so I pull them together here: makes little sense to spend endless time and effort refining an inversion formula for an idealized imaging model when there are some compromising factors earlier in the data-flow chain....

Most of the research devoted to image reconstruction aims at developing new solutions to solving imaging model problems. The idea being that the imaging model problems evolve toward realistic situations and that this theory eventually finds its way into application. We submit that this style of research is not effective for translation based simply on the empirical evidence that not much of the work published in Inverse Problems over the past couple of decades is actually used in a CT scanner.

The bulk of the applied-mathematics research effort seems to go toward developing advanced techniques that will likely never be used in practice or address problems that have little application.

And presently users of CT are used to some of the streak artifacts in FBP-based reconstructions, and there is not really a huge motivation of exchanging FBP streaks for some other type of artifact.

Parameter explosion is one of the challenges that face optimization-based algorithms for image reconstruction in commercial products.

The real progress in moving past FBP-based reconstruction will occur when engineers have real experience with advanced image-reconstruction algorithms and can use this knowledge to design more efficient and effective CT scanners. This development will likely occur first in dedicated CT systems such as head/neck CT, dental CT and breast CT.

Among the main reasons for slow progress on introduction of algorithmic innovations into clinical CT scanners may also be the proprietary nature of CT technologies and the subjective evaluation of performance by its users.

The door to innovation in CT algorithms requires an efficient and practical route to develop and test new algorithms. It is unlikely that any investigator, mathematician or engineer could effectively correct for the raw detector measurements from the CT scanner into a form suitable for algorithm testing without the manufacturer’s blessing and assistance. However, manufacturers of CT scanners, in general, have not made the raw projection data available to anyone outside their organizations, even to customers that use their CT scanners.

There is no Digital Image COMmunication (DICOM) standard for CT raw data [86, 87]. In fact, there are no standards at all. Corrected raw projection-data sets are almost impossible to find in the public domain. Without this data, clinically realistic examples of images cannot be reconstructed. As a consequence, CT-algorithm developers almost universally use highly idealized numerical phantoms for their work and seldom show results obtained with ‘real’ experimental or clinical data. This is surely a major impediment to progress.

If the system was ‘open’ and could be refined by algorithm developers in the field, such algorithms could be tested and clinically validated. The benefits of new technology, especially in image-reconstruction algorithms, could reach patients without waiting for their possible inclusion in future generations of CT scanners (because this has not happened for the past 25 years and may not in the near future).

Perhaps the need for an open-source-community-developed CTreconstruction toolkit (CTK) with a database of corrected raw projection data from real CT scanners will be recognized.

I have been asking clinicians on and off for 40 years to put “open access” into their purchasing contracts with CT companies. But for most of them the role of the CT algorithm is invisible, and as is generally true in medicine and biology, the role of the theoretician is simply not recognized, let alone paid a salary or listened to. So how to break this impasse?

I have a simple proposal: CS (Compressive Sampling) people need to demonstrate to politicians concerned with x-ray dose that they can get the dose down by an order of magnitude or more. This, I think, is the answer to Igor Carron’s analysis:”Therefore a CS system would have to provide something else“. The politicians will then, in effect, legislate a role for CS people in CT design. The companies won’t do it themselves. Nor will their customers, the radiologists. It won’t happen in breast CT, because everyone will just use the dose ceiling already established in projection mammography.

The only radiologist that I know of who vigorously lectures other radiologists on the need for dose reduction in CT is David A. Leswick of the University of Saskatchewan. His papers are a little less fiery:

Leswick, D.A., N.S. Syed, C.S. Dumaine, H.J. Lim & D.A. Fladeland (2009). Radiation dose from diagnostic computed tomography in Saskatchewan. Can Assoc Radiol J 60(2), 71-78.

Leswick, D.A., C.S. Dumaine, N.S. Syed & D.A. Fladeland (2009). Computed tomography radiation dose: a primer for administrators. Healthcare Quarterly 12(Special Issue), 15-22.

Nevertheless, the latter starts:

“The use of computed tomography (CT) is growing, and, consequently, the associated radiation dose to patients is increasing as well. There is increasing evidence linking the radiation dose within the range of diagnostic CT with a significantly increased risk of malignancy. These two factors combine to make radiation dose from diagnostic CT a public health concern.”

Yours, -Dick Gordon

Thanks Dick, I agree with you that in the context of CT, maybe, the political whip is the only constraint that can force a dose reduction and opens the door to more efficient algorithm like ART or other nonlinear reconstruction techniques introduced with compressive sensing.

Lianlin Li mentioned this to me last week and he just wrote about this on his blog. Lianlin is particularly confused with three papers that seem to contradict each other on the subject of compressible priors for sparse bayesian estimation. The papers are:
[1]M.E. Tipping. Sparse Bayesian learning and the relevance vector machine. The Journal of Machine Learning Research, 1, 211–244, 2001.
[2] David Wipf, Jason Palmer, and Bhaskar Rao. Perspectives on sparse Bayesian learning, Advances in Neural Information Processing Systems, 16, 2003
[3] Volkan Cevher. Learning with compressible Priors, Preprint, 2009.
The slides in the blog entry are in English. The introduction is in Chinese but it is not relevant to the confusion at hand. If you want to clear up the misunderstanding between the papers as presented by Lianlin and feel you cannot comment in the blog with instructions in Chinese then you are welcome to put a comment here and I could make an entry of all y'alls responses.

Thanks Lianlin for following up on this.

Credit: NASA, Spirit sol 2147

Monday, January 18, 2010

CS: This week's big entry.

This week we have several papers looking at what can be detected in the measurement world before performing any type of reconstruction, enjoy:

Compressed sensing (CS) has the potential to reduce MR data acquisition time. There are two fundamental tenets to CS theory: (1) the signal of interest is sparse or compressible in a known representation, and (2) the measurement scheme has good mathematical properties (e.g., restricted isometry or incoherence properties) with respect to this representation. While MR images are often compressible, the second requirement is often only weakly satisfied with respect to commonly used Fourier encoding schemes. This paper investigates the possibility of improving CS-MRI performance using random encoding, in an effort to emulate the "universal'' encoding schemes suggested by the theoretical CS literature. This random encoding is achieved experimentally with tailored spatially-selective RF pulses. Simulation and experimental results show that the proposed encoding scheme has different characteristics than more conventional Fourier-based schemes, and could prove useful in certain contexts.
Hyperspectral target detection from incoherent projections by Kalyani Krishnamurthy, Maxim Raginsky and Rebecca Willett. The abstract reads:
This paper studies the detection of spectral targets corrupted by a colored Gaussian background from noisy, incoherent projection measurements. Unlike many detection methods designed for incoherent projections, the proposed approach a) is computationally efficient, b) allows for spectral backgrounds behind potential targets, and c) yields theoretical guarantees on detector performance. In particular, the theoretical performance bounds highlight fundamental tradeoffs among the number of measurements collected, the spectral resolution of targets, the amount of background signal present, signal-tonoise ratio, and the similarity between potential targets in a dictionary.
Compressive-sensing cameras are an important new class of sensors that have different design constraints than standard cameras. Surprisingly, little work has explored the relationship between compressive-sensing measurements and differential image motion. We show that, given modest constraints on the measurements and image motions, we can omit the computationally expensive compressive-sensing reconstruction step and obtain more accurate motion estimates with significantly less computation time. We also formulate a compressive-sensing reconstruction problem that incorporates known image motion and show that this method outperforms the state-of-the-art in compressive-sensing video reconstruction.
On unusual pixel shapes and image motion by Nathan Jacobs, Stephen Schuh, and Robert Pless.The abstract reads:
We introduce the integral-pixel camera model, where measurements integrate over large and potentially overlapping parts of the visual field. This models a wide variety of novel camera designs, including omnidirectional cameras, compressive sensing cameras, and novel programmable-pixel imaging chips. We explore the relationship of integral-pixel measurements with image motion and find (a) that direct motion estimation using integral-pixels is possible and in some cases quite good, (b) standard compressive-sensing reconstructions are not good for estimating motion, and (c) when we design image reconstruction algorithms that explicitly reason about image motion, they outperform standard compressive-sensing video reconstruction. We show experimental results for a variety of simulated cases, and have preliminary results showing a prototype camera with integral-pixels whose design makes direct motion estimation possible.
We consider the problem of sampling piecewise sinusoidal signals. Classical sampling theory does not enable perfect reconstruction of such signals since they are not bandlimited. However, they can be characterized by a finite number of parameters namely the frequency, amplitude and phase of the sinusoids and the location of the discontinuities. In this paper, we show that under certain hypotheses on the sampling kernel, it is possible to perfectly recover the parameters that define the piecewise sinusoidal signal from its sampled version. In particular, we show that, at least theoretically, it is possible to recover piecewise sine waves with arbitrarily high frequencies and arbitrarily close switching points. Extensions of the method are also presented such as the recovery of combinations of piecewise sine waves and polynomials. Finally, we study the effect of noise and present a robust reconstruction algorithm that is stable down to SNR levels of 7 [dB].
This paper studies the low-rank matrix completion problem from an information theoretic perspective. The completion problem is rephrased as a communication problem of an (uncoded) low-rank matrix source over an erasure channel. The paper then uses achievability and converse arguments to present order-wise optimal bounds for the completion problem.

Friday, January 15, 2010

CS: Collaborative Spectrum Sensing for CR Networks, Estimation with Random Linear Mixing, Asymptotic Analysis of Node-Based Recovery

I think it is important to always remember that the end goal of some of what we are doing, however theoretical sometimes, has clear and present implication on our, sometimes immediate, surroundings. To remind yourself of that, you may want to watch this whole 47 minutes video or simply the amazing excerpt that starts at 7 minutes and 42 seconds where Barbara Schulz recounts that "the worst word of it all was the word permanent...". Finally, I note that most of the sensor technology featured in this video is already included in most iPhones.

In other news, Lianlin Li has decided to copy most of the posts featured here on his blog. Thanks Lianlin. One can always come to the blog and use the "Subscribe by E-MAIL to Nuit Blanche" form on the right hand side of this page to receive new posts directly by E-mail. But if one could do that, then it would mean that one could access the blog in the first place. It's called a Catch-22.

In cognitive radio, spectrum sensing is a key component to detect spectrum holes (i.e., channels not used by any primary users). Collaborative spectrum sensing among the cognitive radio nodes is expected to improve the ability of checking complete spectrum usage states. Unfortunately, due to power limitation and channel fading, available channel sensing information is far from being sufficient to tell the unoccupied channels directly. Aiming at breaking this bottleneck, we apply recent matrix completion techniques to greatly reduce the sensing information needed. We formulate the collaborative sensing problem as a matrix completion subproblem and a joint-sparsity reconstruction subproblem. Results of numerical simulations that validated the effectiveness and robustness of the proposed approach are presented. In particular, in noiseless cases, when number of primary user is small, exact detection was obtained with no more than 8% of the complete sensing information, whilst as number of primary user increases, to achieve a detection rate of 95.55%, the required information percentage was merely 16.8%.

We consider the general problem of estimating a random vector from measurements generated by a linear transform followed by a componentwise probabilistic measurement channel. We call this the linear mixing estimation problem, since the linear transform couples the components of the input and output vectors. The main contribution of this paper is a general algorithm called linearized belief propagation (LBP) that iteratively "decouples" the vector minimum mean-squared error (MMSE) estimation problem into a sequence of componentwise scalar problems. The algorithm is an extension of Guo and Wang's relaxed BP method to the case of general (non-AWGN) output channels. Although standard BP is generally computationally tractable only for sparse mixing matrices, LBP (like relaxed BP) uses Gaussian approximations and can be applied to arbitrary mixing matrices.
We extend the density evolution (DE) analysis of relaxed BP to obtain a simple scalar characterization of the componentwise behavior of LBP in the limit of large, random transforms. The scalar equivalent model provides simple and asymptotically exact formulae for various performance metrics of LBP. We also strengthen earlier results on the convergence of the DE updates. As with relaxed BP, the convergence result in turn shows that LBP is provably asymptotically mean-squared optimal for AWGN channels when a certain fixed point of the DE equations is unique. Applications are presented to compressed sensing and estimation with bounded noise.
In this paper, we propose a general framework for the asymptotic analysis of node-based verification-based algorithms. In our analysis we tend the signal length $n$ to infinity. We also let the number of non-zero elements of the signal $k$ scale linearly with $n$. Using the proposed framework, we study the asymptotic behavior of the recovery algorithms over random sparse matrices (graphs) in the context of compressive sensing. Our analysis shows that there exists a success threshold on the density ratio $k/n$, before which the recovery algorithms are successful, and beyond which they fail. This threshold is a function of both the graph and the recovery algorithm. We also demonstrate that there is a good agreement between the asymptotic behavior of recovery algorithms and finite length simulations for moderately large values of $n$.

Wednesday, January 13, 2010

CS; CS in infinite dimensional space, Ditributed Bearing Estimation via matrix Completion, Optimal incorporation of sparsity information

If you thought some of the techniques used to reconstruct signals took long, you've never tried "infinite dimensional convex optimization" :-). This is the subject of today's first paper: Compressive Sampling in Infinite Dimensions by Anders C. Hansen. The abstract reads;
We generalize the theory of Compressive Sampling in Cn to infinite dimensional Hilbert spaces. The typical O(log(n)) estimates (where n is the dimension of the space) are manipulated to fit an infinite dimensional framework.

Looking at the other interest of the author, I wonder aloud if there is a connection between the computation pseudospectra and the RIP/NullSpace conditons ? For more information on computing pseudospectra of rectangular matrices, one can check Eigenvalues and Pseudospectra of rectangular matrices by Thomas Wright and L. Nick Trefethen. You'd think there is a connection and that theorem 2 would help (since multiplying a matrix with a class of sparse vector is really about reducing the number of columns of that matrix). There is also a connection between DOA and pseudospectra and DOA has been the subject of several papers mentioned here before.

Volkan Cevher let me know of a new paper Ditributed Bearing Estimation via matrix Completion by Andrew Waters and Volkan Cevher. The abstract reads:
We consider bearing estimation of multiple narrow-band plane waves impinging on an array of sensors. For this problem, bearing estimation algorithms such as minimum variance distortionless response (MVDR), multiple signal classification, and maximum likelihood generally require the array covariance matrix as sufficient statistics. Interestingly, the rank of the array covariance matrix is approximately equal to the number of the sources, which is typically much smaller than the number of sensors in many practical scenarios. In these scenarios, the covariance matrix is low-rank and can be estimated via matrix completion from only a small subset of its entries. We propose a distributed matrix completion framework to drastically reduce the inter-sensor communication in a network while still achieving near-optimal bearing estimation accuracy. Using recent results in noisy matrix completion, we provide sampling bounds and show how the additive noise at the sensor observations affects the reconstruction performance. We demonstrate via simulations that our approach sports desirable tradeoffs between communication costs and bearing estimation accuracy.
Finally, today's arxiv new addtion: Optimal incorporation of sparsity information by weighted $L_1$ optimization by Toshiyuki Tanaka and Jack Raymond. The abstract reads:
Compressed sensing of sparse sources can be improved by incorporating prior knowledge of the source. In this paper we demonstrate a method for optimal selection of weights in weighted $L_1$ norm minimization for a noiseless reconstruction model, and show the improvements in compression that can be achieved.