Learning zero, and what SLT gets wrong about it
This is the first in a pair of posts I'm hoping to write about Singular Learning Theory (SLT) and singularities as a model of data degeneracy. If I get to it, the second post is going to be more general-audience; this one is more technical.
Introduction
To me, SLT is an important source of toy models which point at an interesting class of new statistical phenomena in learning. It is also a valuable correction to an older and (at this point) largely defunct story of learning being fully controlled by Hessian eigenvalues and "nonsingular basins". Practitioners of SLT have been instrumental in developing and refining the practice of applying Bayesian sampling (used by physicists in papers like this one) to empirical models. And the theory's founder, Sumio Watanabe, is a once-in-a-generation genius who saw and mathematically justified crucial statistical and information-theoretic concepts in learning long before they appeared in "mainstream" ML theory.
However, there is a frequently repeated statement in SLT papers, one that doesn't affect empirical results, which I think is wrong in a load-bearing way. This is the statement that models that appear in machine learning are singular in the infinite-data limit, and that a measurement associated with this singularity, called the RLCT, controls generalization and free energy in cases of interest. This isn't a fixable detail, but rather an unavoidable structural issue I'll spell out below. I think it's unfortunate that an elegant and useful theory is linked with an incorrect statement: it creates potential for future disappointment and for research getting stuck in a less useful direction. Many theoretical and empirical results associated with SLT are important and, I think, interpretability-relevant, independently of the question of whether singularity "explains degeneracy" in practice. As I'll explain below, I think that singular models correctly occupy the same role as symmetries in condensed-matter physics.
Many key phenomena, in particula