Sparse and Low-Dimensional Structures in Deep Networks
26 Feb 2025, 13:00 — Room 326, UniGe DIBRIS/DIMA, Via Dodecaneso 35
Speaker:
Akshay Rangamani — New Jersey Institute of Technology
Abstract:
In this talk I will describe recent progress in characterizing the low-dimensional structure that emerges in deep networks. First, we explore the phenomenon of Neural Collapse, in which the top-layer feature embeddings of samples from the same class concentrate around their class means and the top layer's weights align with those features. We show how this emerges when training deep networks with weight decay and normalization, then investigate the same properties in intermediate layers and explore their implications for generalization in deep learning. Next, we show how the basic tool of decomposing covariances extends to uncover low-dimensional structures in deep regression models. Finally, we investigate a toy model of modular addition, using Fourier representations to show how language models might learn low-dimensional structures.
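For readers unfamiliar with how within-class concentration is quantified, below is a minimal NumPy sketch of one common metric for the first Neural Collapse property (NC1): the within-class covariance measured relative to the between-class covariance, which tends toward zero as features collapse to their class means. The function name and the balanced-class normalization are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def nc1_metric(features, labels):
    """Illustrative NC1 metric: trace(Sigma_W @ pinv(Sigma_B)) / C.

    features: (N, d) array of last-layer embeddings.
    labels:   (N,) array of class labels.
    Values near zero indicate within-class features have
    collapsed to their class means (Neural Collapse).
    Assumes roughly class-balanced data for the normalization.
    """
    classes = np.unique(labels)
    n, d = features.shape
    global_mean = features.mean(axis=0)
    sigma_w = np.zeros((d, d))  # within-class covariance
    sigma_b = np.zeros((d, d))  # between-class covariance
    for c in classes:
        cls_feats = features[labels == c]
        mu_c = cls_feats.mean(axis=0)
        centered = cls_feats - mu_c
        sigma_w += centered.T @ centered / n
        diff = (mu_c - global_mean)[:, None]
        sigma_b += diff @ diff.T / len(classes)
    return np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / len(classes)
```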
Bio:
Akshay Rangamani is an Assistant Professor of Data Science at the New Jersey Institute of Technology. Prior to this, he was a postdoctoral associate at the Center for Brains, Minds, and Machines at MIT, where he held a fellowship from the K. Lisa Yang Integrative Computational Neuroscience Center. He obtained his PhD in Electrical and Computer Engineering from Johns Hopkins University. His research interests lie at the intersection of the science of deep learning, signal processing, and optimization.