Saturday, January 31, 2015

Sunday Morning Insight: The Hardest Challenges We Should be Unwilling to Postpone

Last Wednesday, at the Paris Machine Learning meetup, we had some of the most interesting talks around Data Science, Machine Learning and Non Profits. For the purpose of providing a panorama of non profits, we included Universities, Foundations, NGOs and an Open Source project. It is no wonder that they are non profits: they tackle the hardest challenges in Data Science and Machine Learning.

We first heard Paul Duan, the founder of BayesImpact.org, a YCombinator-backed non profit. His presentation is here.


Paul presented how they are producing code and algorithms to serve the ambulance dispatch system in San Francisco (at some point he even mentioned the issue of fair algorithms, an issue we'll try to address in a future meetup). One of the challenging issues found by BayesImpact was the old equipment of some of these systems and the fact that some are disconnected from each other by law. Another issue was that it may sometimes be difficult to justify spending money on algorithms while there is an ongoing funding shortage.

Then we had Isabelle Guyon ( AutoML Challenge presentation (pdf), and site: ChaLearn Automatic Machine Learning Challenge (AutoML), Fully Automatic Machine Learning without ANY human intervention. )


Isabelle talked about the recent ML challenge that she and ChaLearn, a non profit foundation, are putting in place. The goal of the challenge is to see how a model can deal with increased complexity and to remove humans from the process of feature engineering and hyperparameter tuning, which, as we all know, is a black art on most accounts. Many people see this effort as potentially killing the business of being a data scientist, but this is erroneous in my view. First, Kaggle-type efforts give the impression that overfitting is not a crime. This type of challenge should squarely bring back some common sense to this view of the world. Second, most current challenges have no real strategy for dealing with non stationary datasets (i.e. datasets that increase in complexity with time). This type of challenge opens the door to developing strategies in that regard. Definitely another tough problem.
Here is the longer presentation: AutoML Challenge presentation (ppt), (pdf)


We then went on with  Frederic le Manach from the Bloom Association, an NGO, on the topic of Subsidizing overfishing (pdf), (ppt)




Frederic talked about a specific problem related to data-driven policy making. His NGO is focused on trying to shed some light on the tortuous paths taken by different governmental subsidies (negative or positive) to the fishing industry. As background, he mentioned some interesting information I was not particularly aware of: namely that some fishing nets can go as deep as 1800m (5900 feet). His NGO's thesis is that one of the reasons for overfishing may have to do with the various opaque mechanisms by which subsidies are handed out to various stakeholders. His organization intends to untangle this maze so that policymakers understand the real effect of the current system. Frederic is looking for data scientists who could find ways to gather and clean data from various sources.

The other information of interest to me was that, in terms of jobs, having small fishing outfits seemed to be a win-win on many accounts (fishermen get paid better, fish reserves are not depleted, yielding potentially larger fish populations, etc...).

In the beer session Franck and I had with him afterwards, we noted that while the subsidies issue could have an impact on policy making, it might not be the most effective way of bringing overfishing to the attention of the general public. One item that seemed obvious to us was that the current system did not have a good fish stock counting process. And sure enough, Frederic mentioned two studies that clearly showed an awful mismatch between certain fish population counts and predictions.

by Carl Walters and Jean-Jacques Maguire

The fascinating thing is that there are some more open datasets (as opposed to the subsidies sets):


The count does not seem to take into account the underlying structure of the signal (the fish population). Think of the problem as a little bit like a Matrix completion problem of sorts. What sort of side information do we have? According to Frederic, there are several instances of fish stocks that were depleted in a matter of a few years (putting entire industries out of business in that same time span). The underlying reason for this is that certain species only travel in flocks (flocks of 40000 individuals). Think of them as clusters. If a flock is being fished and only 20000 individuals remain, then these individuals will look for another cluster to merge with, in order to go back to about 40000+ individuals.

If you do some sampling in the sea and do not know about the social structure of the population, then whatever assumptions underlie some linear model will probably lead it to over- or undercount the actual population. At the very least, it will be easy for any stakeholder to discount the counting method as a tale based on a mere interpolation with no real value.




Yet, there is real-time information that could be used as a proxy for the count. Boats currently carry GPS, and some of that data stream is available. Through their radars, fishing boats are hunting out flocks of fish, so their positions could be seen as a good proxy for where the flocks are located.



And this is where the matrix completion problem comes in. We are looking at a problem that is very similar to a Robust PCA, where one wants to image all the flocks at once with very incomplete information, yet the spatio-temporal dataset has some definite structure that comes from what we know of the social behaviors of these animals. The problem could also fit with a group testing/compressive sensing approach.
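To make the matrix completion analogy a little more concrete, here is a minimal sketch. It assumes a hypothetical (location x time) matrix of flock densities that is approximately low rank and only sampled where boats happened to be. The solver is a plain iterative soft-thresholded SVD (in the spirit of soft-impute), picked for brevity; it is not Robust PCA, it was not discussed at the meetup, and the synthetic data and the `lam` setting are made up for illustration.

```python
import numpy as np

def soft_impute(observed, mask, lam=0.1, n_iters=200):
    """Fill missing entries of a matrix assumed to be low rank.

    observed: 2-D array, values are only trusted where mask is True.
    mask:     boolean array of the same shape (True = entry was sampled).
    lam:      soft-threshold applied to the singular values (hypothetical setting).
    """
    estimate = np.zeros_like(observed, dtype=float)
    for _ in range(n_iters):
        # Keep the sampled entries, use the current guess everywhere else.
        filled = np.where(mask, observed, estimate)
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Shrinking the singular values pushes the estimate toward low rank.
        s = np.maximum(s - lam, 0.0)
        estimate = (u * s) @ vt
    return estimate

# Synthetic stand-in for a (location x time) density map: rank 2, about 60% observed.
rng = np.random.default_rng(0)
truth = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 40))
mask = rng.random(truth.shape) < 0.6
recovered = soft_impute(truth, mask)

# Error measured only on the entries that were never sampled.
rel_err = np.linalg.norm((recovered - truth)[~mask]) / np.linalg.norm(truth[~mask])
print(f"relative error on unobserved entries: {rel_err:.3f}")
```

The social structure of the flocks would enter as extra constraints (cluster sizes, merging behavior) on top of the low-rank assumption, which this sketch does not attempt.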

In the end, a more exact count would have an effect on all stakeholders. For instance, if there were only 30 flocks of a certain species left in the Mediterranean Sea, even bankers would make different decisions when it comes to loaning money for a new ship. Other stakeholders would equally make different choices so that the fishing of that stock could last a much longer time period.


Our next speaker was Emmanuel Dupoux, of EHESS/ENS/LPS who presented The Zero Resource Speech Challenge (presentation pdf, presentation ppt )  



 



Emmanuel described the Zero Speech Challenge (www.zerospeech.com) by arguing that the current approach to language learning is mostly through peer related interactions. Babies, in particular, can do a lot of unsupervised learning that is currently not the path taken by most algorithm development in Machine Learning. He also made an argument that the current path could probably not scale for languages that did not have large corpuses from which one could train current ML algorithms.

Emmanuel also mentioned the MIT dataset (See Deb Roy's talk "the Birth of a Word" ) where issues of privacy have overwhelmed the project to the point that the data is essentially closed. Emmanuel mentioned a similar project in his lab where similar issues of privacy have to be sorted through.




Finally, we listened to Jean-Philippe Encausse who talked to us about S.A.R.A.H ( here is his presentation pdf, (ppt)). S.A.R.A.H is a system you can install at home and that enables people to communicate with their in-house connected devices. There is a potential for this system to produce large amounts of data that could eventually be used by the academic community. It was interesting to see how very rapidly there could be an obvious match between the datasets potentially generated by S.A.R.A.H and those of direct relevance to the previous talk. Jean-Philippe described how a new kind of plugin could help in this endeavor.

Jean-Philippe wrote two blog entries on this potential use of S.A.R.A.H
 Here is the video of what S.A.R.A.H can do.



To paraphrase President Kennedy, we saw some of the hardest challenges we should be unwilling to postpone.
 
 

Godspeed Ian !

Join the CompressiveSensing subreddit or the Google+ Community and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Friday, January 30, 2015

Unbiased Bayes for Big Data: Paths of Partial Posteriors / Computing Functions of Random Variables via Reproducing Kernel Hilbert Space Representations

 Today, we use Random Features:



Bayesian inference proceeds based on expectations of certain functions with respect to the posterior. Markov Chain Monte Carlo is a fundamental tool to compute these expectations. However, its feasibility is being challenged in the era of so called Big Data as all data needs to be processed in every iteration. Realising that such simulation is an unnecessarily hard problem if the goal is estimation, we construct a computationally scalable methodology that allows unbiased estimation of the required expectations -- without explicit simulation from the full posterior. The average computational complexity of our scheme is sub-linear in the size of the dataset and its variance is straightforward to control, leading to algorithms that are provably unbiased and naturally arrive at a desired error tolerance. We demonstrate the utility and generality of the methodology on a range of common statistical models applied to large scale benchmark datasets.


We describe a method to perform functional operations on probability distributions of random variables. The method uses reproducing kernel Hilbert space representations of probability distributions, and it is applicable to all operations which can be applied to points drawn from the respective distributions. We refer to our approach as kernel probabilistic programming. We illustrate it on synthetic data, and show how it can be used for nonparametric structural equation models, with an application to causal inference.
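As a rough illustration of the idea in this second abstract (and not the authors' code), the sketch below represents a random variable X by sample points, applies an operation f to those points, and treats the result as a representation of f(X). The empirical kernel mean embeddings are then compared through the squared maximum mean discrepancy. The Gaussian kernel, its bandwidth, the map f and the sample sizes are all assumptions picked for the example.

```python
import numpy as np

def gaussian_gram(a, b, bandwidth=1.0):
    """Gram matrix k(a_i, b_j) of a Gaussian kernel on 1-D samples."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2 * bandwidth**2))

def mmd_squared(a, b, bandwidth=1.0):
    """Squared RKHS distance between the empirical mean embeddings of a and b."""
    return (gaussian_gram(a, a, bandwidth).mean()
            + gaussian_gram(b, b, bandwidth).mean()
            - 2 * gaussian_gram(a, b, bandwidth).mean())

def f(v):
    """Hypothetical operation applied to the random variable."""
    return 2 * v + 1

rng = np.random.default_rng(0)
x = rng.normal(size=500)                               # sample points for X ~ N(0, 1)
fx = f(x)                                              # sample points standing in for f(X)
reference = rng.normal(loc=1.0, scale=2.0, size=500)   # direct draws from N(1, 4)

# The embedding of f(X) should sit close to that of N(1, 4) and far from that of X.
print("f(X) vs N(1, 4):", mmd_squared(fx, reference))
print("f(X) vs X      :", mmd_squared(fx, x))
```

Here f is affine so the target distribution is known in closed form, which makes the comparison easy to check; the point of the approach is that the same recipe applies to any operation that can be applied to sample points.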
 
 

Thursday, January 29, 2015

Efficient Blind Compressed Sensing and $\ell_0$ Sparsifying Transform Learning



Efficient Blind Compressed Sensing Using Sparsifying Transforms with Convergence Guarantees and Application to MRI  by Saiprasad Ravishankar, Yoram Bresler

Natural signals and images are well-known to be approximately sparse in transform domains such as Wavelets and DCT. This property has been heavily exploited in various applications in image processing and medical imaging. Compressed sensing exploits the sparsity of images or image patches in a transform domain or synthesis dictionary to reconstruct images from undersampled measurements. In this work, we focus on blind compressed sensing, where the underlying sparsifying transform is a priori unknown, and propose a framework to simultaneously reconstruct the underlying image as well as the sparsifying transform from highly undersampled measurements. The proposed block coordinate descent type algorithms involve highly efficient closed-form optimal updates. Importantly, we prove that although the proposed blind compressed sensing formulations are highly nonconvex, our algorithms converge to the set of critical points of the objectives defining the formulations. We illustrate the usefulness of the proposed framework for magnetic resonance image reconstruction from highly undersampled k-space measurements. As compared to previous methods involving the synthesis dictionary model, our approach is much faster, while also providing promising reconstruction quality.

$\ell_0$ Sparsifying Transform Learning with Efficient Optimal Updates and Convergence Guarantees by Saiprasad Ravishankar, Yoram Bresler

Many applications in signal processing benefit from the sparsity of signals in a certain transform domain or dictionary. Synthesis sparsifying dictionaries that are directly adapted to data have been popular in applications such as image denoising, inpainting, and medical image reconstruction. In this work, we focus instead on the sparsifying transform model, and study the learning of well-conditioned square sparsifying transforms. The proposed algorithms alternate between a $\ell_0$ "norm"-based sparse coding step, and a non-convex transform update step. We derive the exact analytical solution for each of these steps. The proposed solution for the transform update step achieves the global minimum in that step, and also provides speedups over iterative solutions involving conjugate gradients. We establish that our alternating algorithms are globally convergent to the set of local minimizers of the non-convex transform learning problems. In practice, the algorithms are insensitive to initialization. We present results illustrating the promising performance and significant speed-ups of transform learning over synthesis K-SVD in image denoising.  
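For readers wondering why an $\ell_0$ sparse coding step can have an exact closed form, here is a minimal sketch. It assumes the constrained variant of the step (at most s nonzeros per column of the sparse code), in which case the minimizer simply keeps the s largest-magnitude transform coefficients. The transform W below is a random orthonormal placeholder, not a learned transform, and the transform update step of the paper is not shown.

```python
import numpy as np

def sparse_code(transformed, s):
    """Keep the s largest-magnitude entries in each column, zero the rest.

    This is the exact minimizer of ||transformed - X||_F^2 subject to each
    column of X having at most s nonzeros.
    """
    x = np.zeros_like(transformed)
    # Row indices of the s largest |entries| in every column.
    top = np.argsort(-np.abs(transformed), axis=0)[:s, :]
    cols = np.arange(transformed.shape[1])
    x[top, cols] = transformed[top, cols]
    return x

# Placeholder transform W (random orthonormal matrix) applied to random signals Y.
rng = np.random.default_rng(0)
W, _ = np.linalg.qr(rng.normal(size=(8, 8)))
Y = rng.normal(size=(8, 5))
X = sparse_code(W @ Y, s=3)
print(np.count_nonzero(X, axis=0))  # nonzeros kept per column
```

The step costs one sort per column, which is part of why transform learning can be much cheaper than synthesis dictionary learning, where sparse coding is itself a hard problem.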
 

Wednesday, January 28, 2015

Ce Soir / Tonight: Paris Machine Learning Meetup "Hors Série" 2 (Season 2) Data Science for Non-Profits



Tonight, we will have our first meetup featuring a number of projects by non profits (Universities, NGOs, Foundations, Open Source Projects...) that could be helped by Machine Learning. For this round, we identified five different projects developed by these non profits. The idea of this "Hors Série" is to be exploratory in nature. Some of these projects are very machine learning oriented whereas others have not yet identified how Machine Learning could help.

Our host is TheAssets.co. The meetup starts at 7:00PM Paris time.

Our speakers: 
Paul will talk to us about one project at http://www.bayesimpact.org, a YCombinator-backed non profit, featuring ambulance dispatch services in SF. Frederic will give us information on an issue with fishing and fishing nets and will probably discuss how there can be some data-driven initiatives in that area. If you recall, Jean-Philippe presented S.A.R.A.H last year, a project for home automation (think Internet of Things); this time we asked him to come back and tell us how to design plugins so that people can (when they want to) share in-house data in the cloud for scientific investigations. Isabelle, one of the people behind the SVM 92 paper cited 6250 times, will talk to us about the ChaLearn Automatic Machine Learning Challenge (AutoML) that aims at performing Fully Automatic Machine Learning without ANY human intervention. And finally, we'll have Emmanuel, who will talk to us about The Zero Resource Speech Challenge.

The meetup will be streamed online (see below); presentations are likely to be in spoken French but the slides will be in English.

Speakers (presentations will be available before the start of the meetup and eventually will be made available on the Paris Machine Learning Archives)

Want to do a presentation at one of our next meetups, here is the form (it's in French for the moment): http://goo.gl/forms/7ogXzchTfn
Want to receive our low frequency newsletter (one per month max), register here : http://goo.gl/forms/mqFB0e3SwM
 
 

Tuesday, January 27, 2015

Smoothed Low Rank and Sparse Matrix Recovery by Iteratively Reweighted Least Squares Minimization - implementation -

 
 
Smoothed Low Rank and Sparse Matrix Recovery by Iteratively Reweighted Least Squares Minimization by Canyi Lu, Zhouchen Lin, Shuicheng Yan

This work presents a general framework for solving the low rank and/or sparse matrix minimization problems, which may involve multiple non-smooth terms. The Iteratively Reweighted Least Squares (IRLS) method is a fast solver, which smooths the objective function and minimizes it by alternately updating the variables and their weights. However, the traditional IRLS can only solve a sparse only or low rank only minimization problem with squared loss or an affine constraint. This work generalizes IRLS to solve joint/mixed low rank and sparse minimization problems, which are essential formulations for many tasks. As a concrete example, we solve the Schatten-p norm and ℓ2,q-norm regularized Low-Rank Representation (LRR) problem by IRLS, and theoretically prove that the derived solution is a stationary point (globally optimal if p,q≥1). Our convergence proof of IRLS is more general than previous one which depends on the special properties of the Schatten-p norm and ℓ2,q-norm. Extensive experiments on both synthetic and real data sets demonstrate that our IRLS is much more efficient. 
An implementation of the fast IRLS solver is on Canyi Lu's page. 
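For those who have not met IRLS before, here is a minimal sketch of the classical single-term version that the abstract says it generalizes: a smoothed l1-regularized least squares problem, solved by alternating a weight update and a linear solve. This is not the authors' solver for the joint low rank and sparse case; the problem sizes, `lam` and `eps` are assumptions chosen for the demo.

```python
import numpy as np

def objective(A, b, x, lam=0.1, eps=1e-6):
    """Smoothed objective ||A x - b||^2 + lam * sum_i sqrt(x_i^2 + eps^2)."""
    return np.sum((A @ x - b) ** 2) + lam * np.sum(np.sqrt(x**2 + eps**2))

def irls_sparse(A, b, lam=0.1, eps=1e-6, n_iters=50):
    """Minimize the smoothed objective by iteratively reweighted least squares."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]  # least-squares starting point
    for _ in range(n_iters):
        # Weight update: sqrt(x_i^2 + eps^2) is majorized by w_i * x_i^2 / 2 + const.
        w = 1.0 / np.sqrt(x**2 + eps**2)
        # Variable update: stationarity condition of the reweighted quadratic problem.
        x = np.linalg.solve(2 * A.T @ A + lam * np.diag(w), 2 * A.T @ b)
    return x

# Underdetermined system with a 3-sparse ground truth (synthetic).
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 50))
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [2.0, -1.5, 1.0]
b = A @ x_true

x_start = np.linalg.lstsq(A, b, rcond=None)[0]
x_hat = irls_sparse(A, b)
print("objective at start     :", objective(A, b, x_start))
print("objective after IRLS   :", objective(A, b, x_hat))
print("distance to ground truth:", np.linalg.norm(x_hat - x_true))
```

Each iteration minimizes a quadratic upper bound of the smoothed objective, so the objective cannot increase from one iteration to the next. The smoothing term `eps` is what keeps the weights finite when an entry of x approaches zero.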
 

LatLRR : Robust latent low rank representation for subspace clustering - implementation -

Another instance of subspace clustering (this time helped by robust PCA) and soon to be listed on the Advanced Matrix Factorization page.
 
Robust latent low rank representation for subspace clustering by Hongyang Zhang, Zhouchen Lin, Chao Zhang, Junbin Gao

Subspace clustering has found wide applications in machine learning, data mining, and computer vision. Latent Low Rank Representation (LatLRR) is one of the state-of-the-art methods for subspace clustering. However, its effectiveness is undermined by a recent discovery that the solution to the noiseless LatLRR model is non-unique. To remedy this issue, we propose choosing the sparsest solution in the solution set. When there is noise, we further propose preprocessing the data with robust PCA. Experiments on both synthetic and real data demonstrate the advantage of our robust LatLRR over state-of-the-art methods.

An implementation of LatLRR is on Zhouchen Lin's publication list (item 46).
 

Monday, January 26, 2015

Workshop on Big Data and Statistical Machine Learning, January 26 – 30, 2015

The BASP2015 meeting is taking place in Switzerland at the same time as the Workshop on Big Data and Statistical Machine Learning in Canada, which means that if there were a live video feed from Switzerland, we could have an always-on view of what's going on in both conferences (jet lag helping). The Workshop on Big Data and Statistical Machine Learning, however, is going to be streamed live from Monday through Friday. Thanks to the organizer, Ruslan Salakhutdinov, here is the program:
The aim of this workshop is to bring together researchers working on various large-scale deep learning as well as hierarchical models to discuss a number of important challenges, including the ability to perform transfer learning as well as the best strategies to learn these systems on large scale problems. These problems are "large" in terms of input dimensionality (in the order of millions), number of training samples (in the order of 100 millions or more) and number of categories (in the order of several tens of thousands).

Tentative Schedule

Monday January 26


8:30-9:15 Coffee and Registration

9:15-9:30 Ruslan Salakhutdinov: Welcome

9:30-10:30 Yoshua Bengio, Université de Montréal
Exploring alternatives to Boltzmann machine

10:30-11:00 Coffee

11:00-12:00 John Langford, Microsoft Research
Learning to explore

12:00-2:00 Lunch

2:00-3:00 Hau-tieng Wu, University of Toronto
Structure massive data by graph connection Laplacian and its application

3:00-3:30 Tea

3:30-4:30 Roger Grosse, University of Toronto
Scaling up natural gradient by factorizing Fisher information

4:30 Cash Bar Reception
Tuesday January 27


9:30-10:30 Brendan Frey, University of Toronto
The infinite genome project: Using statistical induction to understand the genome and improve human health

10:30-11:00 Coffee break

11:00-12:00 Daniel Roy, University of Toronto
Mondrian Forests: Efficient Online Random Forests

12:00-2:00 Lunch break

2:00-3:00 Raquel Urtasun, University of Toronto

3:00-3:30 Tea break
Wednesday January 28

9:30-10:30 Samy Bengio, Google Inc
The Battle Against the Long Tail


10:30-11:00 Coffee break


11:00-12:00 Richard Zemel, University of Toronto
Learning Rich But Fair Representations


12:00-1:00 Lunch break


2:00-3:00 David Blei, Princeton University
Probabilistic Topic Models and User Behavior


3:00-3:30 Tea break

3:30-4:30 Yura Burda, Fields Institute
Raising the Reliability of Estimates of Generative Performance of MRFs
Thursday January 29


9:30-10:30 Joelle Pineau, McGill University
Practical kernel-based reinforcement learning


10:30-11:00 Coffee break


11:00-12:00 Cynthia Rudin, MIT CSAIL and Sloan School of Management
Thoughts on Interpretable Machine Learning


12:00-2:00 Lunch


2:00-3:00 Radford Neal, University of Toronto
Learning to Randomize and Remember in Partially-Observed Environments


3:00-3:30 Tea break

Friday January 30
9:30-10:30 Alexander Schwing, The Fields Institute
Deep Learning meets Structured Prediction

10:30-11:00 Coffee break

11:00-12:00 Ruslan Salakhutdinov: Closing remarks.

12:00-2:00 Lunch
 

CfP: "Applications of Matrix Computational Methods in the Analysis of Modern Data" Workshop, Reykjavik, Iceland, June 1-3, 2015

Kourosh Modarresi just sent me the following

Dear Igor,

Julie (Josse) recommended me to reach out about the meeting (Deadline for all submissions are 1/31/2015). I appreciate if you publicize the meeting through your network. Also, please do submit your own works for the meeting.

Thanks,
Kourosh
Here is the announcement:
ICCS 2015, AMCMD Workshop: Applications of Matrix Computational Methods in the Analysis of “Modern Data”

Dear All,

I am the organizer of Applications of Matrix Computational Methods in the Analysis of “Modern Data” workshop at ICCS2015 (premier conference in Scientific Computing). This is the workshop site,


This is an exciting opportunity and I am looking forward to it to be a great meeting. There will be at least two great invited lectures, Stanford Professors Jure Leskovec and Trevor Hastie. The ICCS (International Conference On Computational Science) is to be held in Reykjavík, Iceland on 1-3 June, 2015.

The submission is open till 1/31/2015.

When using the ICCS2015 site (http://www.iccs-meeting.org/iccs2015/) directly, you will be directed to EasyChair for submissions. Please make sure that you choose “Applications of Matrix Computational Methods in the Analysis of Modern Data” track for your submission choice.

Some more info about the workshop:

ICCS 2015 – AMCMD Workshop

Applications of Matrix Computational Methods in the Analysis of “Modern Data”

Description: “Modern Data” has unique characteristics such as extreme sparsity, high correlation, high dimensionality and massive size. Modern data is very prevalent in all different areas of science such as Medicine, Environment, Finance, Marketing, Vision, Imaging, Text, Web, etc. A major difficulty is that many of the old methods that have been developed for analyzing data during the last decades cannot be applied on modern data. One distinct solution, to overcome this difficulty, is the application of matrix computation and factorization methods such as SVD (singular value decomposition), PCA (principal component analysis), and NMF (non-negative matrix factorization), without which the analysis of modern data is not possible. This workshop covers the application of matrix computational science techniques in dealing with Modern Data.


Sample Themes/Topics (not limited to the list):
  • Theoretical Aspects of “Modern Data”
  • Sparse Matrix Factorization
  • Recommender System
  • Dimension Reduction and Feature Learning
  • Deep Learning
  • Computational Finance
  • Singular Value Decomposition in “Modern Data”
  • Social Computing
  • Vision
  • Biostatistics and Computational Biology

****** ******

It is a great opportunity for researchers and practitioners in the related fields to present their works alongside works of some of the greatest scientists in this area.

Please feel free to pass this email to anyone may be interested and please let me know if you have any questions.



Many Thanks,

Kourosh
 
 

Sunday, January 25, 2015

Sunday Morning Insight: What Happens When You Cross into P Territory ?



Today's insight is a follow-up to previous Sunday Morning Insights. The first of those insights cast genome sequencing as a formerly NP-Hard problem (Sunday Morning Insight: Escaping Feynman's NP-Hard "Map of a Cat": Genomic Sequencing Edition), the second one focused on how Advanced Matrix Factorization could now help speed up this technology (Improving Pacific Biosciences' Single Molecule Real Time Sequencing Technology through Advanced Matrix Factorization ?). Then, we conjectured, based on our previous experience with compressive sensing (a formerly NP-hard problem), how the field of genome sequencing would grow (Sunday Morning Insight: The Stuff of Discovery and Sunday Morning Insight: Crossing into P territory). Today's insight will follow through on all those previous insights. In a recent Twitter exchange, one could read:
Lex Nederbragt responded with this preprint that I had wanted to mention much earlier, on the use of LSH to perform alignment.

Assembling Large Genomes with Single-Molecule Sequencing and Locality Sensitive Hashing by Konstantin Berlin, Sergey Koren, Chen-Shan Chin, James Drake, Jane M Landolin, Adam M Phillippy

We report reference-grade de novo assemblies of four model organisms and the human genome from single-molecule, real-time (SMRT) sequencing. Long-read SMRT sequencing is routinely used to finish microbial genomes, but the available assembly methods have not scaled well to larger genomes. Here we introduce the MinHash Alignment Process (MHAP) for efficient overlapping of noisy, long reads using probabilistic, locality-sensitive hashing. Together with Celera Assembler, MHAP was used to reconstruct the genomes of Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, Drosophila melanogaster, and human from high-coverage SMRT sequencing. The resulting assemblies include fully resolved chromosome arms and close persistent gaps in these important reference genomes, including heterochromatic and telomeric transition sequences. For D. melanogaster, MHAP achieved a 600-fold speedup relative to prior methods and a cloud computing cost of a few hundred dollars. These results demonstrate that single-molecule sequencing alone can produce near-complete eukaryotic genomes at modest cost.
Emphasis mine. From the paper, one can read:

For example, using Amazon Web Services (AWS), the estimated cost to generate the D. melanogaster PBcR-BLASR assembly is over $100,000 at current rates, an order of magnitude higher than the sequencing cost. With MHAP, this cost is drastically reduced to under $300.
MHAP is available here: http://www.cbcb.umd.edu/software/PBcR/MHAP
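To give a feel for why MinHash makes overlap detection cheap, here is a minimal sketch of the underlying estimator: two reads are reduced to small sketches of their k-mer sets, and the fraction of matching sketch entries estimates the Jaccard similarity of those sets. This is not MHAP's implementation; the k-mer length, the number of hash functions, the salted BLAKE2 hashing and the toy reads are all assumptions made for illustration.

```python
import hashlib

def kmers(seq, k=4):
    """Set of all length-k substrings of a read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def hash_value(salt, item):
    """64-bit hash of an item under the hash function identified by salt."""
    digest = hashlib.blake2b(f"{salt}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def minhash_sketch(items, num_hashes=128):
    """For each salted hash function, keep the smallest hash value over the set."""
    return [min(hash_value(salt, item) for item in items)
            for salt in range(num_hashes)]

def estimate_jaccard(sketch_a, sketch_b):
    """Fraction of hash functions whose minima agree, estimating |A & B| / |A | B|."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)

read_a = "ACGTTGCATGGACCTGAAGTCCATG"
read_b = "TGGACCTGAAGTCCATGCCGTAAGT"  # shares 17 bases with the end of read_a
read_c = "GGGGCCCCAAAATTTTGGCCAATTG"  # unrelated toy read

sketch_a = minhash_sketch(kmers(read_a))
sketch_b = minhash_sketch(kmers(read_b))
sketch_c = minhash_sketch(kmers(read_c))

print("overlapping reads:", estimate_jaccard(sketch_a, sketch_b))
print("unrelated reads  :", estimate_jaccard(sketch_a, sketch_c))
```

Comparing two sketches costs a fixed number of integer comparisons regardless of read length, which is where the savings over a full alignment of every pair of noisy long reads would come from.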

In summary, if LSH can be used for reconstruction, similar tools ought to be good enough to perform Compressive Genomics for even larger scale problems... In effect, when you cross into P territory, you look for harder problems.

 

 
 

Saturday, January 24, 2015

Proceedings: Biomedical and Astronomical Signal Processing (BASP) Frontiers Workshop 2015

 
 
 
 
The ski report rates the avalanche level as "Considerable Danger", so it might be wise to stay warm and talk real science.

Here is the program:

Sunday January 25, 2015
13.00 - 14.30    Aperitif and standing lunch
13.30 - 14.00    Lunch
14.00 - 15.45    Free time
15.45 - 16.00    Workshop opening
16.00 - 17.00    Conference introduction talk: Prof. A. Lasenby, Cavendish Laboratory, University of Cambridge
"The Search for Gravitational Waves in the Early Universe"
Abstract
Considerable excitement was caused in March 2014 by the announcement of a detection by the BICEP2 experiment of gravitational waves in the early universe via their effect on the Cosmic Microwave Background (CMB). These gravitational waves imprint themselves into a particular mode of polarisation of the CMB, and measurement of their amplitude would finally reveal the energy scale at which inflation took place, as well as providing direct evidence that it actually occurred. It would also represent the farthest back in time we could ever look, and the large amplitude as discovered provides a point of contact with string cosmology and other theories of the early universe, and has stimulated much theoretical work following the announcement of the discovery. This talk will look at the background to the experimental results, the theoretical implications of the range of possible amplitudes and also the latest information from the Planck Satellite, which as well as measuring the CMB itself, provides important information on the possible foreground contamination in the results.
17.00 - 17.30    Coffee
    
Session: "Successes and Opportunities in Medical Imaging"
17.30 - 19.45    Talks
17:30    Shreyas Vasanawala    Quantification in Cardiovascular MRI
18:05    Gregory Kicska    Digital Chest Tomosynthesis: Clinical Applications and Technical Obstacles
18:30    Charles Hennemeyer    Coming Challenges in Medical Device Development; Alternative, Counter-Current Ideas in Disease Physiology May Provide Ideas for Medical-Device-Based Solutions that will Run Countercurrent to Existing, Drug Based Therapies
18:55    Pejman Ghanouni    MR Guided High Intensity Focused Ultrasound
19:20    Martin Holler    ICTGV regularization for image sequence reconstruction of accelerated dynamic MRI
19.45 - 20.30    Deluxe posters and aperitif
Stephen Cauley    Hierarchically Semiseparable Generalized Encoding Matrix Compression for Fast Inverse Imaging
20.45 - 22.15    Dinner
Monday January 26, 2015
07.00 - 08.00    Breakfast
    
Session: "Signal Processing in Cosmology"
08.00 - 10.15    Talks
08:00    Benjamin Wandelt    3-D, physical image reconstruction in cosmology
08:35    Alan Heavens    Standard rulers, candles and clocks: measuring the BAO scale model-independently.
09:00    Boris Leistedt    Analysing the polarisation of the CMB with spin scale-discretised wavelets
09:25    Rita Tojeiro    Surveying the Universe - the past, present and future of galaxy redshift surveys.
09:50    Roberto Trotta    Bayesian hierarchical models for supernova cosmology
10.15 - 11.00    Deluxe posters and coffee
Paul Hurley    Gridding by Beamforming
Vijay Kartik    Dimension embedding for big data in radio interferometry
Ian Harrison    Challenges in Radio Weak Gravitational Lensing
11.00 - 17.00    Free time and extra activities
17.00 - 17.30    Coffee
    
Session: "Sparsity 2.0: New Trends in Sparse Approximation, Sparsity-based Signal Models and Algorithms"
17.30 - 19.45    Talks
17:30    Anna Gilbert    Sparse Approximation, List Decoding, and Uncertainty Principles
18:05    Michael Elad    SOS Boosting of Image Denoising Algorithms
18:30    Mike Davies    Compressed Quantitative MRI using BLIP
18:55    Miguel Rodrigues    Compressed Sensing with Prior Information: Theory and Practice
19:20    Pier Luigi Dragotti    A ProSparse Approach to find the Sparse Representation in Fourier and Canonical Bases
19.45 - 20.30    Deluxe posters and aperitif
Hanjie Pan    Annihilation-driven Image Edge Localization
Jon Oñativia    Sparsity According to Prony: From Structured to Unstructured Representations and Back
Enrico Magli    Fast IRLS for sparse reconstruction based on gaussian mixtures
Jonathan Masci    Sparse similarity-preserving hashing
20.45 - 22.15    Dinner
Tuesday January 27, 2015
07.00 - 08.00    Breakfast
    
Session: "Neuro and Quantitative Imaging"
08.00 - 10.15    Talks
08:00    Peter Basser    Opportunities and Challenges in Brain Mapping with Diffusion MRI
08:35    Jürgen Reichenbach    MR Susceptibility Imaging and Mapping
09:00    Kawin Setsompop    Wave-CAIPI for an order of magnitude acceleration in MRI acquisition
09:25    Sebastien Equis    4D in Bio-microscopy: Marker-free Live Cell Tomography
09:50    Kathryn Nightingale    Quantitative Elasticity Imaging With Acoustic Radiation Force: Methods and Clinical Applications
10.15 - 11.00    Deluxe posters and coffee
Chantal Tax    Towards Quantification of the Brain’s Sheet Structure in Diffusion MRI Data
Noam Shemesh    Cellular microstructures revealed by Non-Uniform Oscillating-Gradient Spin-Echo (NOGSE) MRI
Noam Ben-Eliezer    Non-Analytic Model-Based Reconstruction for Accelerated Multiparametric Mapping in MRI
11.00 - 17.00    Free time and extra activities
17.00 - 17.30    Coffee
    
Session: "Radio interferometric deconvolution and imaging techniques, from CLEAN to CS to Bayesian"
17.30 - 19.45    Talks
17:30    André Offringa    Radio interferometric imaging for the SKA and its pathfinders
18:05    Rafael Carrillo    Why CLEAN when you can PURIFY? A new approach for next-generation radio-interferometric imaging
18:30    Jean-Luc Starck    LOFAR and SKA Sparse Image Reconstruction
18:55    Henrik Junklewitz    RESOLVE: A new algorithm for aperture synthesis imaging of extended emission
19:20    Arwa Dabbech    MORESANE: a sparse deconvolution algorithm for radio interferometric imaging
19.45 - 20.30    Deluxe posters and aperitif
Daniel Muscat    The Malta-imager: A new high-performance imaging tool
Ludwig Schwardt    Fast Phase Transition Estimation
Jonathan Kenyon    PyMORESANE: Pythonic and CUDA-accelerated implementations of MORESANE
Oleg Smirnov    Accelerated facet-based widefield imaging
Malte Kuhlmann    Imaging Uncertainty in Radio Interferometry
20.30 - 20.45    Workshop picture
21.00 - 22.30    Dinner
Wednesday January 28, 2015
07.00 - 08.00    Breakfast
    
Session: "Modern Scalable Algorithms for Convex Optimization"
08.00 - 10.15    Talks
08:00    Jean-C. Pesquet    Proximal Primal-Dual Optimization Methods
08:35    Carola Schönlieb    Optimising the optimisers - image reconstruction by bilevel optimisation
09:00    Amir Beck    On the Convergence of Alternating Minimization with Applications to Iteratively Reweighted Least Squares and Decomposition Schemes
09:25    Jakub Konecny    Semi-Stochastic Gradient Descent Methods
09:50    Julien Mairal    Incremental and Stochastic Majorization-Minimization Algorithms
10.15 - 11.00    Deluxe posters and coffee
Quoc Tran-Dinh    A Primal-Dual Algorithmic Framework for Constrained Convex Optimization
Aurélie Pirayre    Discrete vs Continuous Optimization for Gene Regulatory Network Inference
Vassilis Kalofolias    Enhanced matrix completion with manifold learning
Mehrdad Yaghoobi    Non-Negative Orthogonal Matching Pursuit
11.00 - 17.00    Free time and extra activities
17.00 - 17.30    Coffee
    
Session: "Rapid and Multidimensional Imaging"
17.30 - 19.45    Talks
17:30    Zhi-Pei Liang    Multidimensional Imaging: A Path to High Resolution and High Speed through Subspaces
18:05    Ricardo Otazo    Low-rank plus sparse dynamic MRI: separation of background and dynamic components and self discovery of motion
18:30    Nicole Seiberlich    Magnetic Resonance Fingerprinting: Beyond Parameter Mapping to Clinical Application
18:55    Guang-Hong Chen    More is indeed different
19:20    Ge Wang    Imaging with X-ray Modulated Nanoparticles
19.45 - 20.30    Deluxe posters and aperitif
Jong Chul Ye    Semi-Analytic Iterative Framework for TV Penalized Cone-beam CT Reconstruction
Kelvin Layton    Spatial encoding with generalised magnetic field shapes
20.45 - 22.15    Dinner
Thursday January 29, 2015
07.00 - 08.00    Breakfast
    
Session: "Astrostatistics: Bayes and machines"
08.00 - 10.15    Talks
08:00    Michael Hobson    Neural networks and accelerated Bayesian inference
08:35    Martin Kunz    Bayesian Inference for Radio Observations
09:00    Thomas Kitching    Weak Gravitational Lensing
09:25    Aaron Robotham    Optimal deblending and stacking across multi-band surveys using LAMBDAR
09:50    Ewan Cameron    On functional regression-based emulators for faster Bayesian inference from computational simulations
10.15 - 11.00    Deluxe posters and coffee
Alan Heavens    Astrostatistics and Brain Imaging
Dovi Poznanski    Studying the Milky Way via stacks of low S/N spectra
Jean-F. Robitaille    Multiscale analysis of Galactic dust emission
Michelle Lochner    Bayesian Inference for Radio Observations: Source Separation
Raul Jimenez    Analytic PDFs for non-gaussian processes: towards a full bayesian analysis
11.00 - 17.00    Free time and extra activities
17.00 - 17.30    Coffee
    
Session: "Statistical Methods in Imaging"
17.30 - 19.45    Talks
17:30    Philip Schniter    Statistical Image Recovery: A Message-Passing Perspective
18:05    Dawn Woodard    Small-Feature Model-Based Image Segmentation
18:30    Jeffrey Morris    Functional Regression Methods for Biomedical Imaging Data
18:55    Timothy Johnson    Predicting Treatment Efficacy via Quantitative MRI: A Bayesian Joint Model
19:20    Sonja Greven    A Fully Bayesian Hierarchical Framework for Scalar-on-Image Regression
19.45 - 20.30    Deluxe posters and aperitif
Susan Wei    Asymptotic Inference for Integral Curves of Noisy Vector Fields
Valentina Masarotto    Bayesian Average SParsity
Frank Ong    Beyond Low Rank + Sparse: A Multi-scale Low Rank Decomposition
20.30 - 20.45    Best contribution awards
21.00 - 23.00    Workshop Dinner
 
 
 
Join the CompressiveSensing subreddit or the Google+ Community and post there !

Friday, January 23, 2015

In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning

How could I have missed this one among the papers currently in review for ICLR 2015? In fact, I did not miss it: I read it and then ... other things took over. So without further ado, here is the starting point of the study:

...Consider, however, the results shown in Figure 1, where we trained networks of increasing size on the MNIST and CIFAR-10 datasets. Training was done using stochastic gradient descent with momentum and diminishing step sizes, on the training error and without any explicit regularization. As expected, both training and test error initially decrease. More surprising is that if we increase the size of the network past the size required to achieve zero training error, the test error continues decreasing! This behavior is not at all predicted by, and even contrary to, viewing learning as fitting a hypothesis class controlled by network size...

And then they use advanced matrix factorization to understand the issue better. What's not to like?
 


In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning by Behnam Neyshabur, Ryota Tomioka, Nathan Srebro.

We present experiments demonstrating that some other form of capacity control, different from network size, plays a central role in learning multilayer feed-forward networks. We argue, partially through analogy to matrix factorization, that this is an inductive bias that can help shed light on deep learning.
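To make the notion of implicit regularization concrete, here is a minimal sketch on a much simpler problem than the one in the paper: plain gradient descent on an underdetermined least-squares problem. It is a textbook illustration, not the authors' experiment. The problem sizes, the step size and the names (`gradient_descent`, `w_gd`, `w_min_norm`) are assumptions chosen for the example. Started from zero, and with no explicit penalty, gradient descent ends up at the minimum-norm solution among the infinitely many that fit the data exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_params = 20, 100          # more unknowns than equations
A = rng.standard_normal((n_samples, n_params))
y = rng.standard_normal(n_samples)

def gradient_descent(A, y, n_steps=2000):
    """Minimize 0.5 * ||A w - y||^2 from w = 0, with no explicit regularizer."""
    w = np.zeros(A.shape[1])
    # Step size 1/L, where L is the squared largest singular value of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(n_steps):
        w -= step * A.T @ (A @ w - y)
    return w

w_gd = gradient_descent(A, y)
w_min_norm = np.linalg.pinv(A) @ y     # minimum-norm interpolating solution

print("training residual:", np.linalg.norm(A @ w_gd - y))
print("distance to min-norm solution:", np.linalg.norm(w_gd - w_min_norm))
```

Both printed quantities should be close to zero: the optimizer, rather than the model size, picks which of the many zero-training-error solutions is returned.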
 

New Ranks for Even-Order Tensors and Their Applications in Low-Rank Tensor Optimization

Color videos are fourth-order tensors that carry a lot of information. The following paper shows how much of it can be recovered through low-rank tensor reconstruction. Woohoo!


New Ranks for Even-Order Tensors and Their Applications in Low-Rank Tensor Optimization by Bo Jiang, Shiqian Ma, Shuzhong Zhang

In this paper, we propose three new tensor decompositions for even-order tensors corresponding respectively to the rank-one decompositions of some unfolded matrices. Consequently such new decompositions lead to three new notions of (even-order) tensor ranks, to be called the M-rank, the symmetric M-rank, and the strongly symmetric M-rank in this paper. We discuss the bounds between these new tensor ranks and the CP(CANDECOMP/PARAFAC)-rank and the symmetric CP-rank of an even-order tensor. In particular, we show: (1) these newly defined ranks actually coincide with each other if the even-order tensor in question is super-symmetric; (2) the CP-rank and symmetric CP-rank for a fourth-order tensor can be both lower and upper bounded (up to a constant factor) by the corresponding M-rank. Since the M-rank is much easier to compute than the CP-rank, we can replace the CP-rank by the M-rank in the low-CP-rank tensor recovery model. Numerical results on both synthetic data and real data from colored video completion and decomposition problems show that the M-rank is indeed an effective and easy computable approximation of the CP-rank in the context of low-rank tensor recovery.
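The abstract ties the new ranks to rank-one decompositions of unfolded matrices. As a rough sketch of that idea, the snippet below unfolds a fourth-order tensor into a matrix and computes the ordinary matrix rank of the result. The grouping of modes (first two as rows, last two as columns), the helper names `square_unfolding` and `random_cp_tensor`, and the toy sizes are all assumptions made for illustration; the paper's precise definitions may differ. What the sketch does show is a standard fact: the rank of such an unfolding never exceeds the number of rank-one CP terms, and it is cheap to compute.

```python
import numpy as np

def square_unfolding(tensor):
    """Reshape an (n1, n2, n3, n4) tensor into an (n1*n2) x (n3*n4) matrix."""
    n1, n2, n3, n4 = tensor.shape
    return tensor.reshape(n1 * n2, n3 * n4)

def random_cp_tensor(shape, cp_terms, rng):
    """Sum of `cp_terms` random rank-one fourth-order tensors (hypothetical helper)."""
    tensor = np.zeros(shape)
    for _ in range(cp_terms):
        a, b, c, d = (rng.standard_normal(n) for n in shape)
        tensor += np.einsum("i,j,k,l->ijkl", a, b, c, d)
    return tensor

rng = np.random.default_rng(0)
# Toy stand-in for a tiny color video: height x width x channels x frames.
video_like = random_cp_tensor((6, 5, 3, 4), cp_terms=2, rng=rng)

unfolded = square_unfolding(video_like)
unfolding_rank = np.linalg.matrix_rank(unfolded)
print(unfolded.shape, unfolding_rank)
```

Computing the CP-rank itself is hard in general, which is why a matrix rank of this kind is attractive as a surrogate in recovery models.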
 
 
 

Random Calibration for Accelerating MR-ARFI Guided Ultrasonic Focusing in Transcranial Therapy


 





Random Calibration for Accelerating MR-ARFI Guided Ultrasonic Focusing in Transcranial Therapy by Na Liu, Antoine Liutkus, Jean-François Aubry, Laurent Marsac, Mickael Tanter, Laurent Daudet

Abstract: Transcranial focused ultrasound is a promising therapeutic modality. It consists in placing transducers around the skull and emitting shaped ultrasound waves that propagate through the skull and then concentrate on one particular location within the brain. However, the skull bone is known to distort the ultrasound beam. In order to compensate for such distortions, a number of techniques have been proposed recently, for instance using Magnetic Resonance Imaging (MRI) feedback. In order to fully determine the focusing distortion due to the skull, such methods usually require as many calibration signals as transducers, resulting in a lengthy calibration process. In this paper, we investigate how the number of calibration sequences can be significantly reduced, based on random measurements and optimization techniques. Experimental data with six human skulls demonstrate that the number of measurements can be up to three times lower than with the standard methods, while restoring 90% of the focusing efficiency.
 
 
 

Thursday, January 22, 2015

MORESANE: MOdel REconstruction by Synthesis-ANalysis Estimators. A sparse deconvolution algorithm for radio interferometric imaging - implementation

What is really happening with SKA is fascinating: this paper and others are using the latest and greatest reconstruction algorithms in compressive sensing to figure out some of this future telescope's technical specifications and data chains.


MORESANE: MOdel REconstruction by Synthesis-ANalysis Estimators. A sparse deconvolution algorithm for radio interferometric imaging by Arwa Dabbech, Chiara Ferrari, David Mary, Eric Slezak, Oleg Smirnov, Jonathan S. Kenyon
The current years are seeing huge developments of radio telescopes and a tremendous increase of their capabilities. Such systems make mandatory the design of more sophisticated techniques not only for transporting, storing and processing this new generation of radio interferometric data, but also for restoring the astrophysical information contained in such data. In this paper we present a new radio deconvolution algorithm named MORESANE and its application to fully realistic simulated data of MeerKAT, one of the SKA precursors. This method has been designed for the difficult case of restoring diffuse astronomical sources which are faint in brightness, complex in morphology and possibly buried in the dirty beam's side lobes of bright radio sources in the field. MORESANE is a greedy algorithm which combines complementary types of sparse recovery methods in order to reconstruct the most appropriate sky model from observed radio visibilities. A synthesis approach is used for the reconstruction of images, in which the synthesis atoms representing the unknown sources are learned using analysis priors. We apply this new deconvolution method to fully realistic simulations of radio observations of a galaxy cluster and of an HII region in M31. We show that MORESANE is able to efficiently reconstruct images composed from a wide variety of sources from radio interferometric data. Comparisons with other available algorithms, which include multi-scale CLEAN and the recently proposed methods by Li et al. (2011) and Carrillo et al. (2012), indicate that MORESANE provides competitive results in terms of both total flux/surface brightness conservation and fidelity of the reconstructed model. MORESANE seems particularly well suited for the recovery of diffuse and extended sources, as well as bright and compact radio sources known to be hosted in galaxy clusters.
The implementation of MORESANE is on GitHub: https://github.com/ratt-ru/PyMORESANE
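The abstract compares MORESANE with CLEAN-type methods. For readers unfamiliar with that baseline, here is a minimal one-dimensional sketch of classic Högbom CLEAN, the greedy deconvolution that multi-scale CLEAN extends. It is neither MORESANE nor the PyMORESANE API. The function names `hogbom_clean_1d` and `circular_blur`, the toy dirty beam, the loop gain and the circular boundary handling are all assumptions made for illustration.

```python
import numpy as np

def hogbom_clean_1d(dirty, psf, gain=0.1, n_iter=1000, threshold=1e-3):
    """Toy 1-D Hogbom CLEAN with circular shifts (illustrative only)."""
    residual = np.asarray(dirty, dtype=float).copy()
    model = np.zeros_like(residual)
    center = len(psf) // 2                      # index of the PSF peak
    for _ in range(n_iter):
        peak = int(np.argmax(np.abs(residual)))
        if abs(residual[peak]) < threshold:
            break
        flux = gain * residual[peak]            # take a fraction of the peak
        model[peak] += flux                     # add a point component to the model
        residual -= flux * np.roll(psf, peak - center)  # subtract the shifted PSF
    return model, residual

def circular_blur(sky, psf):
    """Circular convolution of `sky` with a PSF centered at len(psf) // 2."""
    center = len(psf) // 2
    return sum(s * np.roll(psf, j - center) for j, s in enumerate(sky))

n = 64
x = np.arange(n) - n // 2
psf = np.sinc(x / 4.0) ** 2                     # toy dirty beam, peak 1 at the center
sky = np.zeros(n)
sky[20], sky[45] = 1.0, 0.5                     # two point sources
dirty = circular_blur(sky, psf)

model, residual = hogbom_clean_1d(dirty, psf)
print("peak residual:", np.abs(residual).max())
```

By construction, the model blurred by the PSF plus the residual always adds back up to the dirty signal. Point-component models like this suit compact sources; the abstract singles out faint, diffuse emission as the difficult case MORESANE is designed for.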
 
