Title: Compositional Representations in Restricted Boltzmann Machines: theory and applications.
Restricted Boltzmann Machines (RBMs) form a family of probability distributions that are simple yet powerful for modeling high-dimensional, complex data sets. Besides learning a generative model, they also extract features, producing a graded and distributed representation of the data. However, not all variants of RBMs perform equally well, and few theoretical arguments exist to explain these empirical observations. By analyzing an ensemble of RBMs with random weights using statistical physics tools, we characterize the structural conditions (statistics of the weights, choice of non-linearity, etc.) that allow such efficient representations to emerge.
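To make the random-weight setting concrete, here is a minimal sketch of a single RBM drawn from such an ensemble, together with one step of alternating Gibbs sampling. Everything specific in it is an illustrative assumption rather than the construction used in the work: binary 0/1 units, no bias terms, sparse Gaussian weights, and the hypothetical names `make_random_rbm`, `energy` and `gibbs_step`.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used for the conditional activation probabilities."""
    return 1.0 / (1.0 + np.exp(-x))


def make_random_rbm(n_visible, n_hidden, sparsity=0.1, seed=0):
    """Draw a sparse random weight matrix (assumed ensemble, for illustration).

    Each weight is nonzero with probability `sparsity`; nonzero weights are
    Gaussian, scaled so that the input to a hidden unit stays of order one.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random((n_visible, n_hidden)) < sparsity
    scale = 1.0 / np.sqrt(sparsity * n_visible)
    return rng.normal(0.0, scale, size=(n_visible, n_hidden)) * mask


def energy(v, h, weights):
    """Energy E(v, h) = -sum_ij v_i W_ij h_j (bias terms omitted here)."""
    return -(v @ weights @ h)


def gibbs_step(v, weights, rng):
    """One alternating Gibbs update: sample h given v, then v given h."""
    p_h = sigmoid(v @ weights)            # P(h_j = 1 | v)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(weights @ h)            # P(v_i = 1 | h)
    v_new = (rng.random(p_v.shape) < p_v).astype(float)
    return v_new, h


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = make_random_rbm(n_visible=50, n_hidden=20)
    v = (rng.random(50) < 0.5).astype(float)
    for _ in range(10):
        v, h = gibbs_step(v, W, rng)
    print("active hidden units:", int(h.sum()), "energy:", energy(v, h, W))
```

Varying `sparsity` or swapping the hidden-unit non-linearity in a sketch like this is one simple way to explore numerically how the weight statistics affect the number of strongly activated hidden units.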
Lastly, we present a new application of RBMs: the analysis of protein sequence alignments. We show that RBMs extract high-order patterns of coevolution that arise from the structural and functional constraints of the protein family. These patterns can be recombined to generate artificial protein sequences with prescribed chemical properties.