Seminar: “Information Theory of the Learnable,” Prof. Meir Feder (Tel-Aviv University), EE-314, 10:40 AM, October 11

SEMINAR: Information Theory of the Learnable
By Professor Meir Feder
School of Electrical Engineering, Tel-Aviv University, Israel
The seminar will be held on Thursday, October 11, 2018, at 10:40 in room EE-314.

Abstract:
Learning problems are considered from an information-theoretic point of view, following results on universal prediction developed in the 1990s. The main focus will be on supervised learning, where a training sequence is given and the goal is to predict (or generalize to) the label of a new data sample. The information-theoretic approach naturally uses the self-information or log-loss. The problem can be defined in a stochastic setting, where it is assumed that the true relation between the data samples and labels is given by one of the distributions in an assumed class, or in an individual setting, where the data and labels (both in training and test) are specific individual sequences. The individual setting implies and supersedes the common PAC setting in learning theory. New results, combined with older ones, provide learning schemes in these various settings and, more importantly, yield information-theoretic expressions that characterize the ability to learn in the situation given by the available data. In particular, the proposed learning scheme for the individual batch problem, termed PNML – Predictive Normalized Maximum Likelihood – provides a robust learning solution that outperforms the common approach based on Empirical Risk Minimization. Further, the PNML regret serves as a “pointwise” learnability measure for a given test challenge and specific training examples. The PNML, its advantages, and the resulting learnability measure are demonstrated for simple “perceptron” models and for deep neural networks, as measured on the CIFAR-10 database. Towards the end of the talk, several challenges will be presented and discussed. For example: How should the model class be chosen? Can we build a hierarchy to avoid over-parametrization? How can the training examples be chosen efficiently and actively? How can the results be extended to unsupervised learning? For all of these questions, an information-theoretic approach can be defined and should be pursued.
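As a sketch of the idea behind PNML (the notation below is my own illustration, not taken from the talk): given a training set $z^N = (x_1, y_1), \dots, (x_N, y_N)$, a test feature $x$, and a hypothesis class $\{p_\theta\}$, each candidate label is scored by the model fit as if that label were true, and the scores are then normalized:

```latex
% pNML: for each candidate label y, fit the model on the training set
% augmented with (x, y), then normalize over all candidate labels.
\[
  q_{\mathrm{pNML}}(y \mid x)
  = \frac{p_{\hat\theta(z^N,\, x,\, y)}(y \mid x)}
         {\sum_{y'} p_{\hat\theta(z^N,\, x,\, y')}(y' \mid x)},
  \qquad
  \hat\theta(z^N, x, y)
  = \arg\max_{\theta}\; p_\theta(y \mid x)\prod_{i=1}^{N} p_\theta(y_i \mid x_i).
\]
% The log-normalizer is the pointwise regret, the learnability measure
% mentioned in the abstract:
\[
  \Gamma(x; z^N) = \log \sum_{y'} p_{\hat\theta(z^N,\, x,\, y')}(y' \mid x).
\]
```

Intuitively, a small regret $\Gamma$ means no candidate label can be made much more likely than the others by refitting, so the prediction at $x$ is well determined by the training data; a large regret flags a test point that is hard to learn from the given examples.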

Biography: Meir Feder is a Professor at the School of Electrical Engineering, Tel-Aviv University, and the incumbent of the Information Theory Chair. An internationally recognized authority in signal processing, communication, and information theory, Professor Feder holds an Sc.D. degree from the Massachusetts Institute of Technology (MIT), was a visiting professor at MIT, and held visiting positions at Bell Laboratories and the Scripps Institution of Oceanography. He is an IEEE Fellow and has received several academic awards, including the IEEE Information Theory Society Best Paper Award. Throughout his academic career, Prof. Feder has been closely involved with the high-tech industry and numerous companies, including working with Intel on the MMX architecture and designing efficient multimedia algorithms for it. In 1998 he co-founded Peach Networks, a provider of server-based interactive TV systems over cable networks, acquired in 2000 by Microsoft. He then co-founded Bandwiz, a provider of massive content delivery systems for enterprise networks. In 2004 he co-founded Amimon, a fabless ASIC company and the developer of the “Wireless Home Digital Interface” (WHDI™) technology, an emerging leading provider of ASICs for wireless high-definition A/V connectivity in the home.