Nuit Blanche: 08/01/2015

Monday, August 31, 2015

Nuit Blanche in Review ( August 2015 )

Since the last Nuit Blanche in Review ( July 2015 ) we saw the birth of GitXiv and a few implementations.

We also had some in-depth studies, with in particular some on-going clues of The Great Convergence in action as well as some compressive sensing hardware:

Unsupervised Feature Selection on Data Streams / Streaming Anomaly Detection Using Randomized Matrix Sketching

Jobs:

Theses:

Thesis: Message Passing and Combinatorial Optimization, Siamak Ravanbakhsh
Thesis: Shallow and Deep Learning for Audio and Natural Language Processing, Po-Sen Huang

Videos/Slides/proceedings

Credit: NASA/Johns Hopkins University Applied Physics Laboratory/Southwest Research Institute

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Unsupervised Feature Selection on Data Streams / Streaming Anomaly Detection Using Randomized Matrix Sketching

Today, we see the use of streaming algorithms to figure out anomaly detection or unsupervised feature selection.

Streaming Anomaly Detection Using Randomized Matrix Sketching by Hao Huang, Shiva Kasiviswanathan.

Data is continuously being generated from sources such as machines, network traffic, application logs, etc. Timely and accurate detection of anomalies in massive data streams have important applications such as in preventing machine failures, intrusion detection, and dynamic load balancing. In this paper, we introduce a novel anomaly detection algorithm, which can detect anomalies in a streaming fashion by making only one pass over the data while utilizing limited storage. The algorithm uses ideas from matrix sketching and randomized low-rank matrix approximations to maintain, in a streaming model, a set of few orthogonal vectors that form a good approximate basis for the data. Using this constructed orthogonal basis, anomalies in new incoming data are detected based on a simple reconstruction error test. We theoretically prove that our algorithm compares favorably with an offline approach based on global singular value decomposition updates. The experimental results show the effectiveness and efficiency of our approach over other popular fast anomaly detection methods.

Unsupervised Feature Selection on Data Streams by Hao Huang, Shinjae Yoo, Shiva Kasiviswanathan

Massive data streams are continuously being generated from sources such as social media, broadcast news, etc., and typically these datapoints lie in high-dimensional spaces (such as the vocabulary space of a language). Timely and accurate feature subset selection in these massive data streams has important applications in model interpretation, computational/storage cost reduction, and generalization enhancement. In this paper, we introduce a novel unsupervised feature selection approach on data streams that selects important features by making only one pass over the data while utilizing limited storage. The proposed algorithm uses ideas from matrix sketching to e ciently maintain a low-rank approximation of the observed data and applies regularized regression on this approximation to identify the important features. We theoretically prove that our algorithm is close to an expensive offline approach based on global singular value decompositions. The experimental results on a variety of text and image datasets demonstrate the excellent ability of our approach to identify important features even in presence of concept drifts and also its efficiency over other popular scalable feature selection algorithms.

Previously:

Streaming Anomaly Detection Using Online Matrix Sketching

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Friday, August 28, 2015

Machine Learning and Homomorphic Encryption - implementation -

We have mentioned homomorphic encryption here on Nuit Blanche mostly because of Andrew McGregor et al's work on the subject (see references below). Today, we have a Machine Learning approach using this encoding strategy, which in effect is not really that far from the idea of homomorphic sketches or random projections for low dimensional manifolds. Without further ado:

A review of homomorphic encryption and software tools for encrypted statistical machine learning by Louis J. M. Aslett, Pedro M. Esperança, Chris C. Holmes

Recent advances in cryptography promise to enable secure statistical computation on encrypted data, whereby a limited set of operations can be carried out without the need to first decrypt. We review these homomorphic encryption schemes in a manner accessible to statisticians and machine learners, focusing on pertinent limitations inherent in the current state of the art. These limitations restrict the kind of statistics and machine learning algorithms which can be implemented and we review those which have been successfully applied in the literature. Finally, we document a high performance R package implementing a recent homomorphic scheme in a general framework.

Encrypted statistical machine learning: new privacy preserving methods by Louis J. M. Aslett, Pedro M. Esperança, Chris C. Holmes

We present two new statistical machine learning methods designed to learn on fully homomorphic encrypted (FHE) data. The introduction of FHE schemes following Gentry (2009) opens up the prospect of privacy preserving statistical machine learning analysis and modelling of encrypted data without compromising security constraints. We propose tailored algorithms for applying extremely random forests, involving a new cryptographic stochastic fraction estimator, and na\"{i}ve Bayes, involving a semi-parametric model for the class decision boundary, and show how they can be used to learn and predict from encrypted data. We demonstrate that these techniques perform competitively on a variety of classification data sets and provide detailed information about the computational practicalities of these and other FHE methods.

An implementation in R is available here:

References:

h/t Albert Swart

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Mash: Fast genome distance estimation using the MinHash algorithm - implementation -

In this Sunday Morning Insight: What Happens When You Cross into P Territory ?, I mentioned this article on using LSH for genome alignment from long read technology (PacBio RS II or Oxford Nanopore MiNion).

Assembling Large Genomes with Single-Molecule Sequencing and Locality Sensitive Hashing by Konstantin Berlin, Sergey Koren, Chen-Shan Chin, James Drake, Jane M Landolin, Adam M Phillippy

while assembling the genome is important, with cheap and fast long reads, the goalpost is now slowly moving to the unsupervised learning of groups of genomes. That type of unsupervised learning can only be enabled with the right dimensionality reduction technique, today it is MinHash

Many exciting applications for Mash (ANI, metagenome clustering, nanopore triage, ...) Preprint coming soon https://t.co/xXYtsjEuJb
— Adam Phillippy (@aphillippy) August 25, 2015

Here is Mash: Fast genome distance estimation using the MinHash algorithm from Adam Phillippy's group.

Wonder how MinHash works, check this write-up by Matthew Casperson on MinHash for dummies.

Related:

Videos: Summer School on Hashing: Theory and Applications, 2014
Slides: Summer School on Hashing: Theory and Applications
see also all the blog entries with the hash or sketching tag.
And so it begins ... Compressive Genomics

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Thursday, August 27, 2015

Toward a unified theory of sparse dimensionality reduction in Euclidean space

So we are now converging toward a theory for using sparse random projections for a variety of problems.

Toward a unified theory of sparse dimensionality reduction in Euclidean space by Jean Bourgain, Sjoerd Dirksen, Jelani Nelson

Let Φ∈Rm×n be a sparse Johnson-Lindenstrauss transform [KN14] with s non-zeroes per column. For a subset T of the unit sphere, ε∈(0,1/2) given, we study settings for m,s required to ensure $E Φ sup x \in T ∣ ∣ ∥ Φ x ∥ 22 - 1 ∣ ∣ < ε,$ i.e. so that Φ preserves the norm of every x∈T simultaneously and multiplicatively up to 1+ε. We introduce a new complexity parameter, which depends on the geometry of T, and show that it suffices to choose s and m such that this parameter is small. Our result is a sparse analog of Gordon's theorem, which was concerned with a dense Φ having i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson-Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also implies new results in using the sparse Johnson-Lindenstrauss transform in numerical linear algebra, classical and model-based compressed sensing, manifold learning, and constrained least squares problems such as the Lasso.

h/t Laurent Jacques
Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Comparative Analysis of Scattering and Random Features in Hyperspectral Image Classification

The great convergence is happening right under our eyes. Here is a comparison between Random Features and Scattering Transform features for classification of hyperspectral datasets:

Comparative Analysis of Scattering and Random Features in Hyperspectral Image Classification, Nikhila Haridas, V. Sowmya, K.P. Soman

Hyperspectral images (HSI) contains extremely rich spectral and spatial information that offers great potential to discriminate between various land cover classes. The inherent high dimensionality and insufficient training samples in such images introduces Hughes phenomenon. In order to deal with this issue, several preprocessing techniques have been integrated in processing chain of HSI prior to classification. Supervised feature extraction is one such method which mitigates the curse of dimensionality induced by Hughes effect. In recent years, new strategies for feature extraction based on scattering transform and Random Kitchen Sink have been introduced, which can be used in context of hyperspectral image classification. This paper presents a comparative analysis of scattering and random features in hyperspectral image classification. The classification is performed using simple linear classifier such as Regularized Least Square (RLS) accessed through Grand Unified Regularized Least Squares (GURLS) library. The proposed approach is tested on two standard hyperspectral datasets namely, Salinas-A and Indian Pines subset scene captured by NASAs AVIRIS sensor (Airborne Visible Infrared Imaging Spectrometer). In order to show the effectiveness of proposed method, a comparative analysis is performed based on feature dimension, classification accuracy measures and computational time. From the comparative assessment, it is evident that classification using random features achieve excellent classification results with less computation time when compared with raw pixels(without feature extraction) and scattering features for both the datasets.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Wednesday, August 26, 2015

Slides: Beating the Perils of Non-Convexity: Training Neural Networks using Tensor Methods

From Anima's twitter feed: Beating the Perils of Non-Convexity: Training Neural Networks using Tensor Methods by Anima Anandkumar

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Thesis: Message Passing and Combinatorial Optimization, Siamak Ravanbakhsh

Message Passing and Combinatorial Optimization by Siamak Ravanbakhsh

Graphical models use the intuitive and well-studied methods of graph theory to implicitly represent dependencies between variables in large systems. They can model the global behaviour of a complex system by specifying only local factors. This thesis studies inference in discrete graphical models from an algebraic perspective and the ways inference can be used to express and approximate NP-hard combinatorial problems.
We investigate the complexity and reducibility of various inference problems, in part by organizing them in an inference hierarchy. We then investigate tractable approximations for a subset of these problems using distributive law in the form of message passing. The quality of the resulting message passing procedure, called Belief Propagation (BP), depends on the influence of loops in the graphical model. We contribute to three classes of approximations that improve BP for loopy graphs A) loop correction techniques; B) survey propagation, another message passing technique that surpasses BP in some settings; and C) hybrid methods that interpolate between deterministic message passing and Markov Chain Monte Carlo inference.
We then review the existing message passing solutions and provide novel graphical models and inference techniques for combinatorial problems under three broad classes: A) constraint satisfaction problems such as satisfiability, coloring, packing, set / clique-cover and dominating / independent set and their optimization counterparts; B) clustering problems such as hierarchical clustering, K-median, K-clustering, K-center and modularity optimization; C) problems over permutations including assignment, graph morphisms and alignment, finding symmetries and traveling salesman problem. In many cases we show that message passing is able to find solutions that are either near optimal or favourably compare with today's state-of-the-art approaches.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Tuesday, August 25, 2015

Distributed Compressive Sensing: A Deep Learning Approach

Much like last week's "Deep Learning Approach to Structured Signal Recovery", we are beginning to see some folks who use deep learning as a way of crafting reconstruction solvers. In the case presented below, I am not sure they are solving a true MMV problem as they seem to solve an even tougher problem (elements have similar sparsity as opposed to exactly the same sparsity pattern). Way to go !

Distributed Compressive Sensing: A Deep Learning Approach by Hamid Palangi, Rabab Ward, Li Deng

We address the problem of compressed sensing with Multiple Measurement Vectors (MMVs) when the structure of sparse vectors in different channels depend on each other. "The sparse vectors are not necessarily joint sparse". We capture this dependency by computing the conditional probability of each entry of each sparse vector to be non-zero given "residuals" of all previous sparse vectors. To compute these probabilities, we propose to use Long Short-Term Memory (LSTM) [1], a bottom up data driven model for sequence modelling. To compute model parameters we minimize a cross entropy cost function. We propose a greedy solver that uses above probabilities at the decoder. By performing extensive experiments on two real world datasets, we show that the proposed method significantly outperforms general MMV solver Simultaneous Orthogonal Matching Pursuit (SOMP) and model based Bayesian methods including Multitask Compressive Sensing [2] and Sparse Bayesian Learning for Temporally Correlated Sources [3]. Nevertheless, we emphasize that the proposed method is a data driven method where availability of training data is important. However, in many applications, train data is indeed available, e.g., recorded images or video.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Monday, August 24, 2015

Random Noise in the GoogleLeNet Architecture

Back in 1996, Sparse coding made a big splash in Science and Engineering because the elements of a dictionary looked very much like the wavelet functions that had been discovered a few years earlier. For the first time, there was a sense that an algorithm could produce some simple insight on how the machinery of the visual cortex.

Roelof Pieters investigates, in his spare time, how random noise applied to different layers of current deep neural architectures produces different types of imagery. In the past few month, this process has been called DeepDreaming and has produced a few dog pictures. It is fascinating because as Pierre Sermanet speculated yesterday, some of this imagery might constitute a good detection clue for degenerative disease.

All that is well, but yesterday, Roelof mentioned that applying random noise on a particular GoogleLeNet layer (2012) produced something else than dogs. Here are a few examples of what he calls "Limited deep dreaming" that starts with random noise activating single units from a particular layer of GoogLeNet (3a output) -all the other images are listed in this Flickr album -

They certainly look very natural to me: Sometimes they look like structures found in electron microscopy, sometimes, they look like the structures found in numerical simulation of our universe:

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Saturday, August 22, 2015

Slides: KDD2015 Tutorials

From the KDD website, here are most of the slides for the tutorials that occured during KDD 2015.

From Big Data Analytics: Optimization and Randomization by Tianbao Yang, Qihang Lin, Rong Jin

3-Hour Tutorials

R1 (Tutorial 1) VC-Dimension and Rademacher Averages: From Statistical Learning Theory to Sampling Algorithms, Matteo Riondato, Eli Upfal
R2 (Tutorial 2) Graph-Based User Behavior Modeling: From Prediction to Fraud Detection, Alex Beutel Leman Akoglu Christos Faloutsos
R3 (Tutorial 3) A New Look at the System, Algorithm and Theory Foundations of Large-Scale Distributed Machine Learning Eric Xing, Qirong Ho
R4 (Tutorial 4) Dense subgraph discovery (DSD) , Aristides Gionis; Charalampos Tsourakakis
R5 (Tutorial 5) Automatic Entity Recognition and Typing from Massive Text Corpora: A Phrase and Network Mining Approach , Xiang Ren, Ahmed El-Kishky, Chi Wang, Jiawei Han (code and more)
R6 (Tutorial 6) Big Data Analytics: Optimization and Randomization Tianbao Yang, Qihang Lin, Rong Jin
R7 (Tutorial 7) Big Data Analytics: Social Media Anomaly Detection: Challenges and Solutions, Sanjay Chawla, Yan Liu
R8 (Tutorial 8) Diffusion in Social and Information Networks: Problems, Models and Machine Learning Methods , Manuel Gomez Rodriguez, Le Song
R9 (Tutorial 9)Medical Mining (no slides) Myra Spiliopoulou, Pedro Pereira Rodrigues, Ernestina Menasalvas

1.5-Hour Tutorials

S1 (Tutorial10) Large Scale Distributed Data Science using Apache Spark, James G. Shanahan, Liang Dai (example Python notebooks ...)
S2 (Tutorial 11) Data-Driven Product Innovation, Xin Fu, Hernán Asorey
S3 (Tutorial 12) Web Personalization and Recommender Systems, Shlomo Berkovsky, Jill Freyne

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Proceedings SIGGRAPH 2015 and interesting papers

The Proceedings of SIGGRAPH 2015 are probably available for donwload for free for one more day. Here is the location to download it.

Ke-Sen Huang has listed all the SIGGRAPH paper directly available on the web. It is here:

http://kesen.realtimerendering.com/sig2015.html

Of interest to the readership of Nuit Blanche:

Learning Visual Similarity for Product Design with Convolutional Neural Networks Sean Bell, Kavita Bala
Image Based Relighting Using Neural Networks, Peiran Ren, Yue Dong, Stephen Lin Xin Tong, Baining Guo
A Machine Learning Approach for Filtering Monte Carlo Noise Nima Khademi Kalantari, Steve Bako, Pradeep Sen (University of California, Santa Barbara)
Fluid Volume Modeling from Sparse Multi-view Images by Appearance Transfer , Makoto Okabe Yoshinori Dobashi Ken Anjyo Rikio Onai

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Light Field Reconstruction Using Sparsity in the Continuous Fourier Domain - implementation -

Light Field Reconstruction Using Sparsity in the Continuous Fourier Domain by Lixin Shi Haitham Hassanieh Abe Davis Dina Katabi Fredo Durand

Sparsity in the Fourier domain is an important property that enables the dense reconstruction of signals, such as 4D light fields, from a small set of samples. The sparsity of natural spectra is often derived from continuous arguments, but reconstruction algorithms typically work in the discrete Fourier domain. These algorithms usually assume that sparsity derived from continuous principles will hold under discrete sampling. This paper makes the critical observation that sparsity is much greater in the continuous Fourier spectrum than in the discrete spectrum. This difference is caused by a windowing effect. When we sample a signal over a finite window, we convolve its spectrum by an infinite sinc, which destroys much of the sparsity that was in the continuous domain. Based on this observation, we propose an approach to reconstruction that optimizes for sparsity in the continuous Fourier spectrum. We describe the theory behind our approach and discuss how it can be used to reduce sampling requirements and improve reconstruction quality. Finally, we demonstrate the power of our approach by showing how it can be applied to the task of recovering non- Lambertian light fields from a small number of 1D viewpoint trajectories.

The project page and the implementation are here: http://groups.csail.mit.edu/netmit/LFSparseRecon/

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Friday, August 21, 2015

ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R - implementation

You know what is strange ? the URL for the C++ or the CRAN repository of this software is not listed once in this preprint while all other competing URLs are.

ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R by Marvin N. Wright, Andreas Ziegler

We introduce the C++ application and R package ranger. The software is a fast implementation of random forests for high dimensional data. Ensembles of classification, regression and survival trees are supported. We describe the implementation, provide examples, validate the package with a reference implementation, and compare runtime and memory usage with other implementations. The new software proves to scale best with the number of features, samples, trees, and features tried for splitting. Finally, we show that ranger is the fastest and most memory efficient implementation of random forests to analyze data on the scale of a genome-wide association study.

The C++ implementation is here: http://www.imbs-luebeck.de/imbs/de/node/313 , the R implementation is on CRAN at: http://CRAN.R-project.org/package=ranger

Other packages mentioned:

Ishwaran H, Kogalur U (2015). randomForestSRC: Random Forests for Survival, Regression and Classifi cation. R package version 1.6.1, URL http://CRAN.R-project.org/package=randomForestSRC.
Seligman M (2015). Rborist: Extensible, Parallelizable Implementation of the Random Forest Algorithm. R package version 0.1-0, URL http://CRAN.R-project.org/package=Rborist

.
Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

SAFFRON: A Fast, Efficient, and Robust Framework for Group Testing based on Sparse-Graph Codes

I am surprised to not see a reference to some other similar work in that area featured here back two years ago.

SAFFRON: A Fast, Efficient, and Robust Framework for Group Testing based on Sparse-Graph Codes by Kangwook Lee, Ramtin Pedarsani, Kannan Ramchandran

Group testing tackles the problem of identifying a population of $K$ defective items from a set of $n$ items by pooling groups of items efficiently in order to cut down the number of tests needed. The result of a test for a group of items is positive if any of the items in the group is defective and negative otherwise. The goal is to judiciously group subsets of items such that defective items can be reliably recovered using the minimum number of tests, while also having a low-complexity decoding procedure.
We describe SAFFRON (Sparse-grAph codes Framework For gROup testiNg), a non-adaptive group testing paradigm that recovers at least a $(1-\epsilon)$-fraction (for any arbitrarily small $\epsilon > 0$) of $K$ defective items with high probability with $m=6C(\epsilon)K\log_2{n}$ tests, where $C(\epsilon)$ is a precisely characterized constant that depends only on $\epsilon$. For instance, it can provably recover at least $(1-10^{-6})K$ defective items with $m \simeq 68 K \log_2{n}$ tests. The computational complexity of the decoding algorithm of SAFFRON is $\mathcal{O}(K\log n)$, which is order-optimal. Further, we robustify SAFFRON such that it can reliably recover the set of $K$ defective items even in the presence of erroneous or noisy test results. We also propose Singleton-Only-SAFFRON, a variant of SAFFRON, that recovers all the $K$ defective items with $m=2e(1+\alpha)K\log K \log_2 n$ tests with probability $1-\mathcal{O}{\left(\frac{1}{K^\alpha}\right)}$, where $\alpha>0$ is a constant. By leveraging powerful design and analysis tools from modern sparse-graph coding theory, SAFFRON is the first approach to reliable, large-scale probabilistic group testing that offers both precisely characterizable number of tests needed (down to the constants) together with order-optimal decoding complexity.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !