Article Code: 13970230173915112283

Article Title: Information Bottleneck and its Applications in Deep Learning

Keywords:

Journal Number: 23, Summer 2018



List of Authors

  #  Full Name            Email                Grade      Degree            Corresponding Author
  1  Hassan Hafez-Kolahi  hafez@ce.sharif.edu  Graduate   Graduate Student
  2  Shohreh Kasaei       kasaei@sharif.edu    Professor  PhD

Abstract

Information Theory (IT) has been used in Machine Learning (ML) since the early days of the field. In the last decade, advances in Deep Neural Networks (DNNs) have led to surprising improvements in many applications of ML. The result has been a paradigm shift in the community toward revisiting previous ideas and applications in this new framework. Ideas from IT are no exception. The fast rate of new publications, the diversity of seemingly unrelated applications, and the long time-span of the previous ideas being revisited make it challenging for a researcher to see the whole picture. This survey addresses these problems by giving an organized review of the vast body of recent publications at the intersection of IT and ML, while noting their connections to previous ideas. The focus is on the Information Bottleneck (IB), an IT-based formulation of information extraction that has recently been found to be a good candidate for studying DNNs. Two important concepts are highlighted in this narrative: (i) the concise and universal view that IT provides of seemingly unrelated ML methods, demonstrated by explaining how IB relates to minimal sufficient statistics, stochastic gradient descent, and variational auto-encoders; and (ii) the common technical mistakes and problems that arise when applying ideas from IT, discussed through a careful study of some recent methods that suffer from them.
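For readers unfamiliar with IB, a brief sketch may help; the following equation is standard background (due to Tishby, Pereira, and Bialek) and is not part of the original abstract. Given an input variable X and a target variable Y, IB seeks a compressed representation T of X that retains as much information about Y as possible, which is usually written as the Lagrangian

    % Standard IB objective: T is a stochastic representation of X;
    % beta >= 0 trades compression of X against prediction of Y.
    \min_{p(t \mid x)} \; \mathcal{L}_{\mathrm{IB}} = I(X;T) - \beta\, I(T;Y)

where I(\cdot\,;\cdot) denotes mutual information and \beta \ge 0 controls the trade-off between compressing X and preserving the information relevant to Y.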