Saturday, December 03, 2016

What to do in Barcelona this week, #NIPS2016 edition


So it looks like there will be more than 5700 people attending NIPS this year in Barcelona.


I should be one of them starting Thursday. In the meantime, here are the Salon des Refusés papers, followed by all the accepted papers of EWRL2016.
Universum Prescription: Regularization using Unlabeled Data
Xiang Zhang (New York University); Yann LeCun (New York University);
Infinite-Label Learning with Semantic Output Codes
Yang Zhang (University of Central Florida); Rupam Acharyya (University of Rochester); Ji Liu (University of Rochester); Boqing Gong (University of Central Florida);
Query-Efficient Imitation Learning for End-to-End Autonomous Driving
Jiakai Zhang (NYU); Kyunghyun Cho (NYU);
Leveraging Video Descriptions to Learn Video Question Answering
Kuo-Hao Zeng (Stanford University); Tseng-Hung Chen (National Tsing Hua University); Ching-Yao Chuang (National Tsing Hua University); Yuan-Hong Liao (National Tsing Hua University); Juan Carlos Niebles (Stanford University); Min Sun (National Tsing Hua University);
Joint Dimensionality Reduction for Two Feature Vectors
Yanjun Li (UIUC); Yoram Bresler (UIUC);
Reweighted Data for Robust Probabilistic Models
Yixin Wang (Columbia University); Alp Kucukelbir (Columbia University); David M. Blei (Columbia University);
Learning Sparse, Distributed Representations using the Hebbian Principle
Aseem Wadhwa (University of California Santa Barbara); Upamanyu Madhow (University of California Santa Barbara);
Diverse Beam Search: Decoding Diverse Sequences from Neural Sequence Models
Ashwin K. Vijayakumar (Virginia Tech); Michael Cogswell (Virginia Tech); Ramprasaath R. Selvaraju (Virginia Tech); Qing Sun (Virginia Tech); Stefan Lee (Virginia Tech); David Crandall (Indiana University); Dhruv Batra (Virginia Tech);
Generalizing the Convolution Operator to Extend CNNs to Irregular Domains
Jean-Charles Vialatte (Cityzen Data, Telecom Bretagne); Vincent Gripon (Telecom Bretagne); Grégoire Mercier (Telecom Bretagne);
Sifting Common Information from Many Variables
Greg Ver Steeg (USC); Shuyang Gao (USC); Kyle Reing (USC); Aram Galstyan (USC);
Reducing the error of Monte Carlo Algorithms by Learning Control Variates
Brendan Tracey (MIT, Santa Fe Institute); David Wolpert (Santa Fe Institute, ASU);
Recoverability of Joint Distribution from Missing Data
Jin Tian (Iowa State University);
Convergence rate of stochastic k-means
Cheng Tang (George Washington University); Claire Monteleoni (George Washington University);
Clustering with a Reject Option: Interactive Clustering as Bayesian Prior Elicitation
Akash Srivastava (Informatics Forum, University of Edinburgh); James Zou (Microsoft Research and Stanford University); Charles Sutton (Informatics Forum, University of Edinburgh);
Scalable and Sustainable Deep Learning via Randomized Hashing
Ryan Spring (Rice University); Anshumali Shrivastava (Rice University);
Higher Order Recurrent Neural Networks
Rohollah Soltani (York University); Hui Jiang (York University);
Differentially Private Gaussian Processes
Michael Thomas Smith (University of Sheffield); Max Zwiessele (University of Sheffield); Neil D. Lawrence (University of Sheffield);
ProjE: Embedding Projection for Knowledge Graph Completion
Baoxu Shi (University of Notre Dame); Tim Weninger (University of Notre Dame);
Exploring Semantic Correspondence in Deep Convolutional Neural Networks
Zhiqiang Shen (Fudan University); Xiangyang Xue (Fudan University);
A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
Iulian Vlad Serban (University of Montreal); Alessandro Sordoni (University of Montreal); Ryan Lowe (McGill University); Laurent Charlin (McGill University); Joelle Pineau (McGill University); Aaron Courville (University of Montreal); Yoshua Bengio (University of Montreal);
Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju (Virginia Tech); Abhishek Das (Virginia Tech); Ramakrishna Vedantam (Virginia Tech); Michael Cogswell (Virginia Tech); Devi Parikh (Georgia Tech); Dhruv Batra (Georgia Tech);
Learning activation functions from data using cubic spline interpolation
Simone Scardapane (Sapienza University of Rome); Michele Scarpiniti (Sapienza University of Rome); Danilo Comminiello (Sapienza University of Rome); Aurelio Uncini (Sapienza University of Rome);
Action Classification via Concepts and Attributes
Amir Rosenfeld (Weizmann Institute of Science); Shimon Ullman (Weizmann Institute of Science);
On numerical approximation schemes for expectation propagation
Alexis Roche (CHUV);
Differential response of the retinal neural code with respect to the sparseness of natural images
Cesar Ravello (CINV); Maria-Jose Escobar (Universidad Técnica Federico Santa María); Adrian Palacios (CINV); Laurent U. Perrinet (INT);
A Latent-Variable Lattice Model
Rajasekaran Masatran (IIT Madras);
On Enumerating Stable Configurations of Cellular Automata with the MAJORITY update rule
Predrag T. Tosic;
Holistic SparseCNN: Forging the Trident of Accuracy, Speed, and Size
Jongsoo Park (Intel Corporation); Sheng R. Li (Intel Corporation); Wei Wen (University of Pittsburgh); Hai Li (University of Pittsburgh); Yiran Chen (University of Pittsburgh); Pradeep Dubey (Intel Corporation);
DropNeuron: An Approach for Simplifying the Structure of Deep Neural Networks
Wei Pan; Hao Dong; Yike Guo;
Herding Generalizes Diverse M-Best Solutions
Ece Ozkan; Gemma Roig; Orcun Goksel; Xavier Boix;
Practical optimal experiment design with probabilistic programs
Long Ouyang (Stanford); Michael Henry Tessler (Stanford); Daniel Ly (Stanford); Noah D. Goodman (Stanford);
Word2Vec is a special case of Kernel Correspondence Analysis and Kernels for Natural Language Processing
Hirotaka Niitsuma; Minho Lee;
Neural Semantic Encoders
Tsendsuren Munkhdalai (University of Massachusetts); Hong Yu (University of Massachusetts);
Neural Sampling by Irregular Gating Inhibition of Spiking Neurons and Attractor Networks
Lorenz K. Muller (Institute of Neuroinformatics, ETH Zurich and University of Zurich); Giacomo Indiveri (Institute of Neuroinformatics, ETH Zurich and University of Zurich);
Node-Adapt, Path-Adapt and Tree-Adapt: Model-Transfer Domain Adaptation for Random Forest
Azadeh S. Mozafari (Computer Engineering Department, Sharif University of Technology); David Vazquez (Computer Vision Center, UAB University); Mansour Jamzad (Computer Engineering Department, Sharif University of Technology); Antonio M. Lopez (Computer Vision Center, UAB University);
Inductive quantum learning: Why you are doing it almost right
Alex Monràs (Universitat Autònoma de Barcelona); Gael Sentís (Universidad del País Vasco); Peter Wittek (ICFO-The Institute of Photonic Sciences);
Adversarial Training Methods for Semi-Supervised Text Classification
Takeru Miyato (Kyoto Univ., Google Brain); Andrew M. Dai (Google Brain); Ian Goodfellow (OpenAI);
The Oesomeric model: giving Space to Reinforcement Learning Temporal Models
Pierre Michaud (IPC);
Learning from Binary Labels with Instance-Dependent Corruption
Aditya Krishna Menon (Data61); Brendan van Rooyen (QUT); Nagarajan Natarajan (MSR Bangalore);
A Modular Theory of Feature Learning
Daniel McNamara (Australian National University and Data61); Cheng Soon Ong (Australian National University and Data61); Robert C. Williamson (Australian National University and Data61);
A Marginal-Based Technique for Distribution Estimation
Rajasekaran Masatran (IIT Madras);
Exploring and measuring non-linear correlations: Copulas, Lightspeed Transportation and Clustering
Gautier Marti (Hellebore Capital Ltd); Sébastien Andler (ENS de Lyon); Frank Nielsen (Ecole Polytechnique); Philippe Donnat (Hellebore Capital Ltd);
Quantifying the probable approximation error of probabilistic inference programs
Marco F. Cusumano-Towner (MIT); Vikash K. Mansinghka (MIT);
A performance-based approach to design the stimulus presentation paradigm for the P300-based BCI
Boyla Mainsah (Duke University); Galen Reeves (Duke University); Leslie Collins (Duke University); Chandra Throckmorton;
Active Search for Sparse Signals with Region Sensing
Yifei Ma (Carnegie Mellon University); Roman Garnett (Washington University in St. Louis); Jeff Schneider (Carnegie Mellon University);
On Minimal Accuracy Algorithm Selection in Computer Vision and Intelligent Systems
Martin Lukac (Nazarbayev University); Kamila Abdiyeva (Nazarbayev University); Michitaka Kameyama (Ishinomaki University);
Multiple Kernel k-means with Incomplete Kernels
Xinwang Liu (NUDT); Miaomiao Li (NUDT); Lei Wang (NUDT); Yong Dou (NUDT); Jianping Yin (NUDT); En Zhu (NUDT);
Leveraging Union of Subspace Structure to Improve Constrained Clustering
John Lipor (University of Michigan, Ann Arbor); Laura Balzano (University of Michigan, Ann Arbor);
Differential Covariance: A New Class of Methods to Estimate Sparse Connectivity from Neural Recordings
Tiger W. Lin (UCSD/Salk); Anup Das (UCSD); Giri P. Krishnan (UCSD); Maxim Bazhenov (UCSD); Terrence J. Sejnowski (UCSD/Salk);
Learning to Optimize
Ke Li (UC Berkeley); Jitendra Malik (UC Berkeley);
Generalized Min-Max Kernel and Generalized Consistent Weighted Sampling
Ping Li;
Asaga: Asynchronous Parallel SAGA
Rémi Leblond (Ecole Normale Supérieure / INRIA Sierra); Fabian Pedregosa (Ecole Normale Supérieure / INRIA Sierra); Simon Lacoste-Julien (Department of CS & OR (DIRO), Université de Montréal);
Estimating Uncertainty Online Against an Adversary
Volodymyr Kuleshov; Stefano Ermon;
Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data
Maximilian Karl (Technische Universität München); Maximilian Soelch (Technische Universität München); Justin Bayer (Data Lab, Volkswagen Group); Patrick van der Smagt (Data Lab, Volkswagen Group);
How to scale distributed deep learning?
Peter H. Jin (UC Berkeley); Qiaochu Yuan (UC Berkeley); Forrest Iandola (UC Berkeley); Kurt Keutzer (UC Berkeley);
Generating images with recurrent adversarial networks
Daniel Jiwoong Im; Chris Dongjoo Kim; Hui Jiang; Roland Memisevic;
Learning Unitary Operators with Help From u(n)
Stephanie L. Hyland (ETH Zurich); Gunnar Rätsch (ETH Zurich);
Character-Level Language Modeling with Hierarchical Recurrent Neural Networks
Kyuyeon Hwang (Seoul National University); Wonyong Sung (Seoul National University);
Training Spiking Deep Networks for Neuromorphic Hardware
Eric Hunsberger (University of Waterloo); Chris Eliasmith (University of Waterloo);
Fast Learning of Clusters and Topics via Sparse Posteriors
Michael C. Hughes (Brown University); Erik B. Sudderth (Brown University);
Unsupervised Learning of Word-Sequence Representations from Scratch via Convolutional Tensor Decomposition
Furong Huang (Microsoft Research); Animashree Anandkumar (UC Irvine);
The Shallow End: Empowering Shallower Deep-Convolutional Networks through Auxiliary Outputs
Yong Guo (South China University of Technology); Mingkui Tan (South China University of Technology); Qingyao Wu (South China University of Technology); Jian Chen (South China University of Technology); Anton Van Den Hengel (The University of Adelaide); Qinfeng Shi (The University of Adelaide);
A Robust Adaptive Stochastic Gradient Method for Deep Learning
Caglar Gulcehre; Jose Sotelo; Marcin Moczulski; Yoshua Bengio;
Faster Low-rank Approximation using Adaptive Gap-based Preconditioning
Alon Gonen (Hebrew University of Jerusalem); Shai Shalev-Shwartz;
One Class Splitting Criteria for Random Forests with Application to Anomaly Detection
Nicolas Goix (Télécom Paristech); Romain Brault (Télécom Paristech); Nicolas Drougard (ISAE); Maël Chiapino (Télécom Paristech);
Causal inference for cloud computing
Philipp Geiger (MPI for Intelligent Systems); Lucian Carata (Univeristy of Cambridge); Bernhard Schölkopf (MPI for Intelligent Systems);
The Linearization of Belief Propagation on Pairwise Markov Random Fields
Wolfgang Gatterbauer (Carnegie Mellon University);
Optimal Number of Choices in Rating Contexts
Sam Ganzfried (Florida International University);
Bayesian Opponent Exploitation in Imperfect-Information Games
Sam Ganzfried (Florida International University);
Network of Bandits
Raphaël Féraud (Orange Labs);
Cognitive Discriminative Mappings for Rapid Learning
Wen-Chieh Fang; Yi-ting Chiang;
Stochastic Patching Process
Xuhui Fan (Data61, CSIRO, Australia); Bin Li (Data61, CSIRO, Australia); Yi Wang (Data61, CSIRO, Australia); Yang Wang (Data61, CSIRO, Australia); Fang Chen (Data61, CSIRO, Australia);
Perceptual Reward Functions
Ashley Edwards (Georgia Institute of Technology); Charles Isbell (Georgia Institute of Technology); Atsuo Takanishi (Waseda University);
Collaborative Filtering with Recurrent Neural Networks
Robin Devooght (ULB, IRIDIA); Hugues Bersini (ULB, IRIDIA);
Predictive Coding for Dynamic Vision: Development of Functional Hierarchy in a Multiple Spatio-Temporal Scales RNN Model
Minkyu Choi (KAIST); Jun Tani (KAIST);
On the Optimal Sample Complexity for Best Arm Identification
Lijie Chen (Tsinghua University); Jian Li (Tsinghua University);
Stability revisited: new generalisation bounds for the Leave-one-Out
Alain Celisse (Université de Lille); Benjamin Guedj (Inria);
Dataflow matrix machines as programmable, dynamically expandable, self-referential generalized recurrent neural networks
Michael Bukatin (HERE North America LLC); Steve Matthews (University of Warwick); Andrey Radul (Project Fluid);
Crowdsourcing: Low Complexity, Minimax Optimal Algorithms
Thomas Bonald (Telecom ParisTech); Richard Combes (Centrale-Supelec);
Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention
Théodore Bluche (A2iA); Jérôme Louradour (A2iA); Ronaldo Messina (A2iA);
Convergence Rate Analysis of a Stochastic Trust Region Method for Nonconvex Optimization
Jose Blanchet (Columbia University); Coralia Cartis (University of Oxford); Matt Menickelly (Lehigh University); Katya Scheinberg (Lehigh University);
Kernel regression, minimax rates and effective dimensionality: beyond the regular case
Gilles Blanchard (Potsdam University); Nicole Mücke (Potsdam University);
Defining the Neural Code
Thomas Bangert (Queen Mary University of London); Ebroul Izquierdo (Queen Mary University of London);
The Option-Critic Architecture
Pierre-Luc Bacon (McGill University); Jean Harb (McGill University); Doina Precup (McGill University);
Towards Optimality Conditions for Non-Linear Networks
Devansh Arpit (SUNY Buffalo); Hung Q. Ngo (LogicBlox); Yingbo Zhou (SUNY Buffalo); Nils Napp (SUNY Buffalo); Venu Govindaraju (SUNY Buffalo);
Improved Multi-Class Cost-Sensitive Boosting via Estimation of the Minimum-Risk Class
Ron Appel (Caltech); Xavier Burgos-Artizzu (THX); Pietro Perona (Caltech);
Learning Bayesian Networks with Incomplete Data by Augmentation
Tameem Adel; Cassio P. de Campos;
Unbiased Sparse Subspace Clustering By Selective Pursuit
Hanno Ackermann (Hanover University); Michael Yang (Twente University); Bodo Rosenhahn (Hanover University);
Linear Thompson Sampling Revisited
Marc Abeille (Inria-Lille); Alessandro Lazaric (Inria-Lille);

EWRL13 accepted papers: 

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Friday, December 02, 2016

Slides: New Directions for Learning with Kernels and Gaussian Processes, Dagstuhl Seminar

 

Arthur Gretton, Philipp Hennig, Carl Edward Rasmussen and Bernhard Schölkopf organized the Dagstuhl seminar on New Directions for Learning with Kernels and Gaussian Processes.
Here are the slides of the presentations available on this site:
 
 
 
 
 
Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Thursday, December 01, 2016

Nuit Blanche in Review (November 2016)


Since the last Nuit Blanche in Review (October 2016), a very interesting idea came out in the form of this paper.
Identifying how certain parts of the brain do a specific computation is indeed an awesome idea !

We had one implementation and a few in-depth papers, but we also saw the release of numerous papers for NIPS and about 500 submissions for a two-year-old conference (ICLR)! Among these papers, one has drawn particular attention:
It seems to promise faster training times in Deep Learning. We'll see. In the meantime, the Paris Machine Learning meetup held two meetups (one regular and one 'Hors Série') but most importantly, we now have a website:

Enjoy the rest of the review !

Implementation
In-depth

Conferences

Theses
Paris Machine Learning
Job
Videos:
Sunday Morning Insight


Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Efficient Convolutional Auto-Encoding via Random Convexification and Frequency-Domain Minimization

Random Convexification ?
The omnipresence of deep learning architectures such as deep convolutional neural networks (CNN)s is fueled by the synergistic combination of ever-increasing labeled datasets and specialized hardware. Despite the indisputable success, the reliance on huge amounts of labeled data and specialized hardware can be a limiting factor when approaching new applications. To help alleviating these limitations, we propose an efficient learning strategy for layer-wise unsupervised training of deep CNNs on conventional hardware in acceptable time. Our proposed strategy consists of randomly convexifying the reconstruction contractive auto-encoding (RCAE) learning objective and solving the resulting large-scale convex minimization problem in the frequency domain via coordinate descent (CD). The main advantages of our proposed learning strategy are: (1) single tunable optimization parameter; (2) fast and guaranteed convergence; (3) possibilities for full parallelization. Numerical experiments show that our proposed learning strategy scales (in the worst case) linearly with image size, number of filters and filter size.
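Since the abstract does not show why the frequency domain helps, here is a minimal, hypothetical Python sketch of the general recipe rather than the paper's RCAE algorithm: fix random encoder filters, so that learning the decoder filters becomes a convex least-squares problem, and use the FFT so that the problem decouples into a small independent solve per frequency bin. The per-bin systems are solved directly here, whereas the paper uses coordinate descent; all sizes, names and parameter values below are made up for illustration.

import numpy as np

# Toy illustration (not the paper's RCAE algorithm): random fixed encoder,
# convex decoder learning, decoupled per frequency bin via the FFT.
rng = np.random.default_rng(0)
N, K, S = 20, 8, 32                           # images, filters, image size (assumed)
X = rng.normal(size=(N, S, S))

enc = rng.normal(size=(K, S, S)) / S          # random (fixed) encoder filters
Xf = np.fft.fft2(X)                           # (N, S, S)
Ef = np.fft.fft2(enc)                         # (K, S, S)

# encode: circular convolution computed in the frequency domain, then a ReLU in space
Z = np.maximum(np.fft.ifft2(Xf[:, None] * Ef[None]).real, 0.0)   # (N, K, S, S)
Zf = np.fft.fft2(Z)

# decoder filters: one small K x K least-squares solve per frequency bin
Wf = np.zeros((K, S, S), dtype=complex)
lam = 1e-3                                    # small ridge term for numerical stability
for u in range(S):
    for v in range(S):
        A = Zf[:, :, u, v]                    # (N, K) codes at this frequency
        b = Xf[:, u, v]                       # (N,) image content at this frequency
        G = A.conj().T @ A + lam * np.eye(K)
        Wf[:, u, v] = np.linalg.solve(G, A.conj().T @ b)

recon = np.fft.ifft2((Zf * Wf[None]).sum(axis=1)).real
print("relative reconstruction error:", np.linalg.norm(recon - X) / np.linalg.norm(X))

The point of the sketch is the decoupling: each frequency bin only involves a K x K system, which is what makes the convexified problem cheap to solve and easy to parallelize.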


Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Tuesday, November 29, 2016

CfP: SPARS 2017, Signal Processing with Adaptive Sparse Structured Representations

 
 
 
  Mark just sent me the following:
 
 
Dear Igor, 
A quick reminder for Nuit Blanche readers:
The deadline for submissions to SPARS 2017 is just 2 weeks away, on 12 December!
Best wishes,
 
Mark
 
Here is the announcement:
 
================================================

 
  SPARS 2017

  Signal Processing with Adaptive Sparse Structured Representations

  Lisbon, Portugal - June 5-8, 2017

  Submission deadline: December 12, 2016

  http://spars2017.lx.it.pt/

  ------------------------------------------

CALL FOR PAPERS

The Signal Processing with Adaptive Sparse Structured Representations (SPARS) workshop aims to bring together people from statistics, engineering, mathematics, and computer science, fostering the exchange and dissemination of new ideas and results, both applied and theoretical, on the general area of sparsity-related techniques and computational methods, for high dimensional data analysis, signal processing, and related applications.

Contributions (talks and demos) are solicited as one-page abstracts, which may extend to a second page in order to include figures, tables and references. Talks should present recent and novel research results. We welcome abstract submissions for technological demonstrations of the mathematical topics within our scope.

Topics of interest include (but are not limited to):

 * Sparse coding and representations, and dictionary learning
 * Sparse and low-rank approximation algorithms
 * Compressive sensing and learning
 * Dimensionality reduction and feature extraction
 * Sparsity in approximation theory, information theory, and statistics
 * Low-complexity/low-dimensional regularization
 * Statistical/Bayesian models and algorithms for sparsity
 * Sparse network theory and analysis
 * Sparsity and low-rank regularization
 * Applications including but not limited to: communications,
   geophysics, neuroscience, audio & music, imaging, denoising, genetics.

PLENARY SPEAKERS:

 * Yoram Bresler, University of Illinois, USA
 * Volkan Cevher, École Polytechnique Fédérale de Lausanne, Switzerland
 * Jalal Fadili, École Nationale Supérieure d'Ingénieurs de Caen, France
 * Anders Hansen, University of Cambridge, UK
 * Gitta Kutyniok, Technische Universität Berlin, Germany
 * Philip Schniter, Ohio State University, USA
 * Eero Simoncelli, Howard Hughes Medical Institute, NYU, USA
 * Rebecca Willett, University of Wisconsin, USA

VENUE:

SPARS 2017 will be held at Instituto Superior Técnico (IST), the
engineering school of the University of Lisbon, Portugal.

IMPORTANT DATES:

* Submission deadline: December 12, 2016
* Notification of acceptance: March 27, 2017
* Summer School: May 31-June 2, 2017 (tbc)
* Workshop: June 5-8, 2017

CHAIRS:

 Mario A. T. Figueiredo, Instituto Superior Técnico
 Mark Plumbley, University of Surrey


FURTHER INFORMATION:  http://spars2017.lx.it.pt/

============================================================

Prof Mark D Plumbley
Professor of Signal Processing
Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK
Email: m.plumbley@surrey.ac.uk
 
 
 
 
 
 
Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Thesis: Robust Large Margin Approaches for Machine Learning in Adversarial Settings

Congratulations Ali ! Maybe I missed something, but this is the first time I have seen a connection between Random Features and Dropout in non-linear SVMs. Nice !

Robust Large Margin Approaches for Machine Learning in Adversarial Settings by Mohamad Ali Torkamani.
Machine learning algorithms are invented to learn from data and to use data to perform predictions and analyses. Many agencies are now using machine learning algorithms to present services and to perform tasks that used to be done by humans. These services and tasks include making high-stake decisions. Determining the right decision strongly relies on the correctness of the input data. This fact provides a tempting incentive for criminals to try to deceive machine learning algorithms by manipulating the data that is fed to the algorithms. And yet, traditional machine learning algorithms are not designed to be safe when confronting unexpected inputs. In this dissertation, we address the problem of adversarial machine learning; i.e., our goal is to build safe machine learning algorithms that are robust in the presence of noisy or adversarially manipulated data. Many complex questions -- to which a machine learning system must respond -- have complex answers. Such outputs of the machine learning algorithm can have some internal structure, with exponentially many possible values. Adversarial machine learning will be more challenging when the output that we want to predict has a complex structure itself. In this dissertation, a significant focus is on adversarial machine learning for predicting structured outputs.

In this thesis, first, we develop a new algorithm that reliably performs collective classification: It jointly assigns labels to the nodes of graphed data. It is robust to malicious changes that an adversary can make in the properties of the different nodes of the graph. The learning method is highly efficient and is formulated as a convex quadratic program. Empirical evaluations confirm that this technique not only secures the prediction algorithm in the presence of an adversary, but it also generalizes to future inputs better, even if there is no adversary.

While our robust collective classification method is efficient, it is not applicable to generic structured prediction problems. Next, we investigate the problem of parameter learning for robust, structured prediction models. This method constructs regularization functions based on the limitations of the adversary in altering the feature space of the structured prediction algorithm. The proposed regularization techniques secure the algorithm against adversarial data changes, with little additional computational cost. In this dissertation, we prove that robustness to adversarial manipulation of data is equivalent to some regularization for large-margin structured prediction, and vice versa. This confirms some of the previous results for simpler problems.

As a matter of fact, an ordinary adversary regularly either does not have enough computational power to design the ultimate optimal attack, or it does not have sufficient information about the learner's model to do so. Therefore, it often tries to apply many random changes to the input in a hope of making a breakthrough. This fact implies that if we minimize the expected loss function under adversarial noise, we will obtain robustness against mediocre adversaries. Dropout training resembles such a noise injection scenario. Dropout training was initially proposed as a regularization technique for neural networks. The procedure is simple: At each iteration of training, randomly selected features are set to zero. We derive a regularization method for large-margin parameter learning based on dropout. Our method calculates the expected loss function under all possible dropout values. This method results in a simple objective function that is efficient to optimize. We extend dropout regularization to non-linear kernels in several different directions. We define the concept of dropout for input space, feature space, and input dimensions, and we introduce methods for approximate marginalization over feature space, even if the feature space is infinite-dimensional. Empirical evaluations show that our techniques consistently outperform the baselines on different datasets.
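To make the dropout-as-regularization idea concrete, here is a minimal, hypothetical Python sketch: a linear SVM whose hinge loss is averaged over randomly sampled dropout masks and minimized by subgradient descent. This is only a Monte Carlo stand-in for the closed-form marginalization derived in the thesis, and it covers the linear case only, not the non-linear kernel extensions; all names and parameter values are made up for illustration.

import numpy as np

def dropout_hinge_loss(w, X, y, drop_p, n_samples=20, rng=None):
    # Monte Carlo estimate of the expected hinge loss when input features
    # are dropped (set to zero) with probability drop_p (toy stand-in for
    # the thesis's closed-form marginalization).
    rng = np.random.default_rng(0) if rng is None else rng
    total = 0.0
    for _ in range(n_samples):
        mask = rng.random(X.shape) >= drop_p          # keep each feature w.p. 1 - drop_p
        Xd = (X * mask) / (1.0 - drop_p)              # inverted-dropout rescaling
        margins = y * (Xd @ w)
        total += np.maximum(0.0, 1.0 - margins).mean()
    return total / n_samples

def train_dropout_svm(X, y, drop_p=0.3, lr=0.1, epochs=200, n_samples=5, rng=None):
    # Subgradient descent on the Monte Carlo dropout objective for a linear SVM.
    rng = np.random.default_rng(0) if rng is None else rng
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = np.zeros_like(w)
        for _ in range(n_samples):
            mask = rng.random(X.shape) >= drop_p
            Xd = (X * mask) / (1.0 - drop_p)
            margins = y * (Xd @ w)
            viol = margins < 1.0                      # examples contributing a hinge subgradient
            grad += -(Xd[viol] * y[viol][:, None]).sum(axis=0) / (n_samples * len(y))
        w -= lr * grad
    return w

# tiny synthetic check: two Gaussian blobs with labels in {-1, +1}
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 10)), rng.normal(+1, 1, (50, 10))])
y = np.concatenate([-np.ones(50), np.ones(50)])
w = train_dropout_svm(X, y, rng=rng)
print("expected dropout hinge loss:", dropout_hinge_loss(w, X, y, 0.3, rng=rng))

Averaging the loss over many corrupted copies of the data is what ties the noise-injection view to regularization: in expectation, the dropout term penalizes weights that rely too heavily on any single feature.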






Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.
