Tuesday, July 07, 2009

CS: Stuff, Boolean CS, CS Limits in L_p, A Plurality of Sparse Representations and RandOMP, Sparse Null Space, Improving Compressed Counting


The papers of the conference on Abstraction in Reinforcement Learning are out. The same goes for the FOCS abstracts. Terry Tao has a new entry on Benford’s law, Zipf’s law, and the Pareto distribution. As I explained before in the Sparsity in Everything series of entries, as soon as you have a power law and a thresholding factor of convenience, you have sparsity and you can begin to think of ways one can apply compressed sensing techniques when sampling in these areas. It looks like the plenary presentation by Rich Baraniuk went well at ISIT. By the way, who has a good idea on how to do a CS IR camera by improving the sampling technique used in this Poor Man's Thermographic Camera?
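
To make that point concrete, here is a small, hypothetical sketch (the distribution, threshold level and numbers are mine): draw coefficients from a heavy-tailed Pareto law, threshold at a convenient fraction of the largest entry, and only a handful of entries survive while carrying most of the energy.

```python
# Minimal sketch: power-law (Pareto) magnitudes plus a convenience threshold
# yield an effectively sparse vector -- few entries survive, most energy kept.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.pareto(a=1.5, size=n) + 1.0        # heavy-tailed magnitudes
x_sorted = np.sort(x)[::-1]

threshold = 0.01 * x_sorted[0]             # keep entries above 1% of the peak
k = np.count_nonzero(x > threshold)
energy_kept = np.sum(x_sorted[:k]**2) / np.sum(x**2)

print(f"{k} of {n} entries ({100*k/n:.1f}%) survive the threshold")
print(f"they carry {100*energy_kept:.1f}% of the signal energy")
```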

arXiv has two new papers on CS. The first is Boolean Compressed Sensing and Noisy Group Testing by George Atia and Venkatesh Saligrama. The abstract reads:
The fundamental task of group testing is to recover a small distinguished subset of items from a large group while efficiently reducing the total number of tests (measurements). The key contribution of this paper is in adopting a new information-theoretic perspective on group testing problems. Establishing its connection to Shannon-coding theory, we formulate the group testing problem as a channel coding/decoding problem and derive a unifying result that reduces many of the interesting questions to computation of a mutual information expression. First, we derive an achievable bound on the total number of tests using random constructions and typical set decoding. This result is fairly general; it allows us to verify existing bounds for some of the known scenarios and extend the analysis to many new interesting setups including noisy versions of group testing. We obtain tradeoffs between the number of tests, sparsity and the group size for a wide range of models. Among the models we consider are the deterministic noise-free case, approximate reconstruction with bounded distortion, additive measurement noise, and dilution models. Then, we show that the achievable bound provides fairly tight asymptotic scaling by deriving an information theoretic lower-bound using Fano's inequality.
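
For readers unfamiliar with the group testing measurement model, here is a minimal noise-free sketch (the Bernoulli pooling design and the naive COMP decoding rule below are standard illustrations, not the paper's decoder): each test reports the OR of a random pool of items, and any item that appears in a negative test can be ruled out, leaving a small superset of the defectives.

```python
# Minimal sketch of noise-free Boolean group testing (not the paper's decoder):
# each test ORs a random pool of items; the naive COMP rule clears every item
# that appears in at least one negative test.
import numpy as np

rng = np.random.default_rng(1)
n, k, m = 200, 5, 60                        # items, defectives, tests
defective = np.zeros(n, dtype=bool)
defective[rng.choice(n, size=k, replace=False)] = True

A = rng.random((m, n)) < 1.0 / k            # Bernoulli pooling design
y = (A.astype(int) @ defective.astype(int)) > 0   # Boolean OR measurements

possible = np.ones(n, dtype=bool)
for i in range(m):
    if not y[i]:                            # a negative test clears its whole pool
        possible[A[i]] = False

print("true defectives:", np.flatnonzero(defective))
print("COMP candidates:", np.flatnonzero(possible))
```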

Of related interest is the recent paper entitled Noise-Resilient Group Testing: Limitations and Constructions by Mahdi Cheraghchi.

The second paper is Typical reconstruction limit of compressed sensing based on Lp-norm minimization by Y. Kabashima, T. Wadayama and T. Tanaka. The abstract reads:
We consider the problem of reconstructing an $N$-dimensional continuous vector $\mathbf{x}$ from $P$ constraints which are generated by its linear transformation under the assumption that the number of non-zero elements of $\mathbf{x}$ is typically limited to $\rho N$ ($0 \le \rho \le 1$). Problems of this type can be solved by minimizing a cost function with respect to the $L_p$-norm $\|\mathbf{x}\|_p = \lim_{\epsilon \to +0} \sum_{i=1}^N |x_i|^{p+\epsilon}$, subject to the constraints under an appropriate condition. For several $p$, we assess a typical case limit $\alpha_c(\rho)$, which represents a critical relation between $\alpha = P/N$ and $\rho$ for successfully reconstructing the original vector by minimization for typical situations in the limit $N, P \to \infty$ while keeping $\alpha$ finite, utilizing the replica method. For $p=1$, $\alpha_c(\rho)$ is considerably smaller than its worst case counterpart, which has been rigorously derived in the existing information theory literature.
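
To get a feel for the $\alpha = P/N$ versus $\rho$ trade-off the paper studies, here is a small empirical sketch for $p = 1$ only (the Gaussian measurement ensemble, problem sizes and the linear-programming formulation of basis pursuit are my choices, not the paper's replica calculation):

```python
# Empirical sketch: for a given (alpha, rho), draw a rho*N-sparse vector, take
# P = alpha*N Gaussian measurements, and solve basis pursuit
#   min ||x||_1  s.t.  A x = b
# as a linear program with the split x = u - v, u, v >= 0.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
N, alpha, rho = 100, 0.5, 0.1
P, k = int(alpha * N), int(rho * N)

x_true = np.zeros(N)
x_true[rng.choice(N, size=k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((P, N)) / np.sqrt(P)
b = A @ x_true

c = np.ones(2 * N)                          # objective: sum(u) + sum(v) = ||x||_1
A_eq = np.hstack([A, -A])                   # A u - A v = b
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
x_hat = res.x[:N] - res.x[N:]

print("recovered:", np.allclose(x_hat, x_true, atol=1e-4))
```

Sweeping $\rho$ for a fixed $\alpha$ (or vice versa) over many random draws traces out an empirical version of the phase boundary $\alpha_c(\rho)$.
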
The following paper makes the case that insisting on the sparsest representation alone may be too stringent a requirement for engineering applications: A Plurality of Sparse Representations is Better than the Sparsest One Alone by Michael Elad and Irad Yavneh. The abstract reads:
Cleaning of noise from signals is a classical and long-studied problem in signal processing. Algorithms for this task necessarily rely on an a-priori knowledge about the signal characteristics, along with information about the noise properties. For signals that admit sparse representations over a known dictionary, a commonly used denoising technique is to seek the sparsest representation that synthesizes a signal close enough to the corrupted one. As this problem is too complex in general, approximation methods, such as greedy pursuit algorithms, are often employed.
In this line of reasoning, we are led to believe that detection of the sparsest representation is key in the success of the denoising goal. Does this mean that other competitive and slightly inferior sparse representations are meaningless? Suppose we are served with a group of competing sparse representations, each claiming to explain the signal differently. Can those be fused somehow to lead to a better result? Surprisingly, the answer to this question is positive; merging these representations can form a more accurate (in the mean-squared-error sense), yet dense, estimate of the original signal even when the latter is known to be sparse. In this paper we demonstrate this behavior, propose a practical way to generate such a collection of representations by randomizing the Orthogonal Matching Pursuit (OMP) algorithm, and produce a clear analytical justification for the superiority of the associated Randomized OMP (RandOMP) algorithm. We show that while the Maximum a-posteriori Probability (MAP) estimator aims to find and use the sparsest representation, the Minimum Mean-Squared-Error (MMSE) estimator leads to a fusion of representations to form its result. Thus, working with an appropriate mixture of candidate representations, we are surpassing the MAP and tending towards the MMSE estimate, and thereby getting a far more accurate estimation in terms of the expected $\ell_2$-norm error (and thus, SNR as well).
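
Here is a hedged sketch of the RandOMP idea (the selection weights below are simplified stand-ins, not the exact weights derived in the paper): run OMP several times, drawing the next atom at random with probability increasing in its correlation with the residual, then average the resulting sparse representations. The average is dense, but its signal-domain error is typically lower than that of a single pass.

```python
# Simplified randomized-OMP sketch: random atom selection weighted by the
# correlation with the residual, averaged over many passes.
import numpy as np

def rand_omp(D, y, k, rng, temperature=1.0):
    """One randomized-OMP pass: returns a k-sparse coefficient vector."""
    n_atoms = D.shape[1]
    support, residual = [], y.copy()
    for _ in range(k):
        corr = np.abs(D.T @ residual)
        w = np.exp(corr**2 / temperature)   # simplified selection weights
        w[support] = 0.0                    # never re-pick an atom
        idx = rng.choice(n_atoms, p=w / w.sum())
        support.append(idx)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(n_atoms)
    x[support] = coef
    return x

rng = np.random.default_rng(3)
m, n, k, sigma = 30, 60, 4, 0.1
D = rng.standard_normal((m, n))
D /= np.linalg.norm(D, axis=0)              # unit-norm dictionary atoms
x0 = np.zeros(n)
x0[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = D @ x0 + sigma * rng.standard_normal(m)

x_single = rand_omp(D, y, k, rng)
x_avg = np.mean([rand_omp(D, y, k, rng) for _ in range(50)], axis=0)
print("single-pass signal error:", np.linalg.norm(D @ x_single - D @ x0))
print("averaged    signal error:", np.linalg.norm(D @ x_avg - D @ x0))
```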

The next papers are not directly related to CS but there is some overlap:

Matrix sparsification and the sparse null space problem by Lee-Ad Gottlieb, Tyler Neylon. The abstract reads:
We revisit the matrix problems sparse null space and matrix sparsification, and show that they are equivalent. We then proceed to seek algorithms for these problems: We prove the hardness of approximation of these problems, and also give a powerful tool to extend algorithms and heuristics for sparse approximation theory to these problems.
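
As a quick illustration of what the sparse null space problem asks for (a toy example of mine, not the authors' construction): two bases can span the same null space while one is far sparser than the other, and the problem is to find the sparsest such basis.

```python
# Toy illustration of the sparse null space problem: the orthonormal basis from
# the SVD is dense, while a basis of difference vectors spans the same null
# space with only two nonzeros per column.
import numpy as np
from scipy.linalg import null_space

A = np.ones((1, 4))                          # null space has dimension 3
N_svd = null_space(A)                        # dense orthonormal basis from SVD
N_sparse = np.array([[ 1,  0,  0],
                     [-1,  1,  0],
                     [ 0, -1,  1],
                     [ 0,  0, -1]], dtype=float)

print("A @ N_sparse == 0:", np.allclose(A @ N_sparse, 0))
print("nonzeros (SVD basis):   ", np.count_nonzero(np.abs(N_svd) > 1e-12))
print("nonzeros (sparse basis):", np.count_nonzero(N_sparse))
```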

Ping Li has a new paper out on Improving Compressed Counting. The abstract reads:
Compressed Counting (CC) [22] was recently proposed for estimating the $\alpha$-th frequency moments of data streams, where $0 < \alpha \le 2$. CC can be used for estimating Shannon entropy, which can be approximated by certain functions of the $\alpha$-th frequency moments as $\alpha$ goes to 1. Monitoring Shannon entropy for anomaly detection (e.g., DDoS attacks) in large networks is an important task. This paper presents a new algorithm for improving CC. The improvement is most substantial when $\alpha$ goes to 1. For example, when $\alpha = 0.99$, the new algorithm reduces the estimation variance roughly by 100-fold. This new algorithm would make CC considerably more practical for estimating Shannon entropy. Furthermore, the new algorithm is statistically optimal when $\alpha = 0.5$.
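
To see why the $\alpha \to 1$ regime matters for entropy estimation, here is a small sketch using the standard Rényi-entropy identity $H_\alpha = \frac{1}{1-\alpha}\log(F_\alpha / F_1^\alpha)$, where $F_\alpha = \sum_i c_i^\alpha$ is the $\alpha$-th frequency moment; $H_\alpha$ tends to the Shannon entropy as $\alpha \to 1$. The moments are computed exactly below for illustration; Compressed Counting would supply their estimates from the stream.

```python
# Sketch: Shannon entropy approached through alpha-th frequency moments via the
# Renyi entropy  H_alpha = log(F_alpha / F_1**alpha) / (1 - alpha).
import numpy as np

rng = np.random.default_rng(4)
counts = rng.zipf(a=1.8, size=5_000).astype(float)   # heavy-tailed stream counts
p = counts / counts.sum()
shannon = -np.sum(p * np.log(p))

for alpha in (0.90, 0.99, 0.999):
    F_alpha, F_1 = np.sum(counts**alpha), counts.sum()
    renyi = np.log(F_alpha / F_1**alpha) / (1.0 - alpha)
    print(f"alpha={alpha:<6} Renyi={renyi:.4f}  Shannon={shannon:.4f}")
```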


Credit: NASA/GSFC/Arizona State University. The subdued rim of Anaxagoras A, dramatically lit by the lunar sunrise, as taken by the LRO Camera yesterday.

1 comment:

Anonymous said...

The paper by Michael Elad and Irad Yavneh has an odd aspect. It sounds like they've re-discovered the basis of ensemble methods, which themselves are quite in vogue today. It's odd that they haven't made the connection :-/
