Introduction
Statistical Mechanics (SM) provides a probabilistic formulation of the macroscopic behaviour of systems made of many microscopic entities, possibly interacting with each other. This discipline is one of the pillars of Theoretical Physics and arose in the past century to characterise emergent collective phenomena, of which phase transitions constitute a paradigmatic example. Remarkably, typical features of biological neural networks (such as memory, computation, and other emergent skills) can be framed in the rationale of SM once the mathematical modelling of their elemental constituents (i.e. neurons equipped with their axons, synapses, etc.) is available. In fact, no single neuron can recognise an information pattern: this ability genuinely emerges from the mutual interaction among neurons. It is thus not surprising that, since the pioneering paper by Amit, Gutfreund and Sompolinsky [1] in the post-winter era of AI, the SM of Disordered and Complex Systems has played a pivotal role in understanding information processing in Artificial Neural Networks (ANNs). Indeed, it is expected to play a crucial role en route toward Explainable Artificial Intelligence (XAI) even in the modern formalisation of the new generation of (possibly “deep”) neural networks and learning machines [2,3]. The present workshop will retain an SM perspective, mixing mathematical and theoretical physics with machine learning.
Explaining the science
In the zoo of task-specific ANNs, Restricted Boltzmann Machines (RBMs) are one of the cornerstones of modern learning architectures. An RBM is a two-layer (shallow) neural network whose visible layer is fed with empirical data and whose hidden layer infers correlations in the dataset, searching for patterns to store; furthermore, since deep belief networks (e.g. deep Boltzmann machines) are architectures in which several RBMs are stacked one onto the other, understanding shallow networks is a mandatory prerequisite for reaching a mature vision of deep learning itself.
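For concreteness, the standard energy-based formalisation of this architecture reads as follows (a minimal sketch with binary units and the usual notation, recalled only to fix ideas):

\[
E(v,h) = -\sum_i a_i v_i - \sum_{\mu} b_{\mu} h_{\mu} - \sum_{i,\mu} v_i W_{i\mu} h_{\mu},
\qquad
P(v,h) = \frac{e^{-E(v,h)}}{Z},
\]

and, thanks to the bipartite (restricted) structure, the units within each layer are conditionally independent given the other layer, e.g. \(P(h_{\mu}=1\mid v) = \big(1+e^{-b_{\mu}-\sum_i W_{i\mu}v_i}\big)^{-1}\), which is what makes alternating Gibbs sampling (and hence contrastive-divergence training) practical.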
On the biological side, the Hebbian learning rule is the backbone of the Hopfield model, which is therefore considered the “harmonic oscillator” of associative memory and pattern recognition. In SM jargon, the Hopfield model is a “spin glass” and, in fact, scientists have been able to investigate its features with analytical tools borrowed from the SM of disordered and complex systems (e.g., the replica trick, cavity fields, stochastic stability).
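For reference, with \(N\) binary neurons \(\sigma_i=\pm 1\) storing \(P\) patterns \(\xi^{\mu}\in\{-1,+1\}^{N}\), the Hebbian couplings and the Hopfield energy read (standard conventions, recalled only to set notation):

\[
J_{ij} = \frac{1}{N}\sum_{\mu=1}^{P}\xi_i^{\mu}\xi_j^{\mu},
\qquad
H(\sigma) = -\frac{1}{2}\sum_{i\neq j} J_{ij}\,\sigma_i\sigma_j,
\]

and the noiseless retrieval dynamics \(\sigma_i \to \mathrm{sign}\big(\sum_{j} J_{ij}\sigma_j\big)\) relaxes the network towards the stored pattern closest to its initial state.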
In recent times, it has become clear that Boltzmann machines and Hopfield models are intimately related [4,5]: roughly speaking, once an RBM has been trained, in its subsequent usage it behaves as a Hopfield model performing pattern recognition. This allows an arsenal of techniques originally developed within the SM of disordered systems [6] (hence designed to address the Hopfield model) to be imported into machine learning, where they can be used to inspect and quantify the emergent properties of RBMs and their generalisations. This bridge between the two poles of biological versus artificial information processing will guide the present workshop.
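The mapping can be sketched in the simplest setting (our illustration, in the spirit of [4], assuming binary visible units \(\sigma_i\), Gaussian hidden units \(z_{\mu}\), and weights \(\xi_i^{\mu}\) playing the role of patterns): integrating out the hidden layer gives

\[
P(\sigma,z) \propto \exp\Big(-\frac{\beta}{2}\sum_{\mu} z_{\mu}^{2} + \frac{\beta}{\sqrt{N}}\sum_{i,\mu}\xi_i^{\mu}\sigma_i z_{\mu}\Big)
\;\;\xrightarrow{\;\int dz\;}\;\;
P(\sigma) \propto \exp\Big(\frac{\beta}{2N}\sum_{\mu}\Big(\sum_i \xi_i^{\mu}\sigma_i\Big)^{2}\Big),
\]

i.e. exactly the Boltzmann weight of a Hopfield model whose stored patterns are the learned weights.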
Challenge aims
Understanding the skills that deep artificial networks spontaneously show as their control parameters are made to vary is the main focus of this workshop. At present, it is not known what exactly these skills are, nor how many emergent properties we should expect from these networks. However, there are crystal clear suggestions about “where to sharpen the investigation”: in particular, it is known that, unlike shallow networks, deep networks may have variable signal-to-noise ratios, namely these nets can autonomously decide to sacrifice storage capacity in order to lower their threshold for signal detection [7]. This is a very recent result that deserves further investigation, as applications of this feature, once it is made well controllable, can be valuable and have the potential to revolutionise disciplines where early signal detection is a priority.
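To make the notion of signal-to-noise concrete, in the shallow (pairwise) Hebbian case the local field on neuron \(i\), when the network is aligned with pattern \(\xi^{1}\), splits as (a textbook decomposition, reported purely as an illustration):

\[
h_i \;=\; \underbrace{\Big(1-\tfrac{1}{N}\Big)\,\xi_i^{1}}_{\text{signal}}
\;+\;
\underbrace{\frac{1}{N}\sum_{\mu>1}\sum_{j\neq i}\xi_i^{\mu}\xi_j^{\mu}\xi_j^{1}}_{\text{noise: zero mean, variance}\;\approx\;P/N},
\]

so retrieval requires the load \(\alpha = P/N\) to stay below a finite threshold; the finding of [7] is that dense (deep) Hebbian architectures can rebalance the two terms, giving up part of the storage capacity in exchange for a lower detection threshold.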
Another aspect we aim to investigate is the spectral properties that these networks exhibit and their relation to the ability to avoid over-fitting even when over-parametrised [8,9]: on random, structureless datasets the mechanism by which these networks destroy the spin-glass state, obtaining extensive free storage as a reward, is well understood (the critical storage load shifts from roughly 0.14 patterns per neuron and saturates at 1 pattern per neuron once these spectral mechanisms are at work), but such control is presently lacking on structured datasets.
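A minimal numerical sketch of the spectral mechanism (our toy example, not the workshop's code, assuming random binary patterns) compares the pure Hebbian coupling with the projection, or “pseudo-inverse”, coupling, loosely the long-dreaming-time limit of the kernel studied in [9], at a load well above the Hopfield threshold:

    # Toy check (our sketch): Hebbian vs projection ("pseudo-inverse") coupling
    # at a storage load alpha = P/N well above the Hopfield threshold ~ 0.14.
    import numpy as np

    rng = np.random.default_rng(0)
    N, alpha = 500, 0.3                       # neurons, storage load P/N (above 0.14)
    P = int(alpha * N)
    xi = rng.choice([-1, 1], size=(P, N))     # P random binary patterns

    J_hebb = (xi.T @ xi) / N                  # Hebbian: J_ij = (1/N) sum_mu xi_i^mu xi_j^mu
    np.fill_diagonal(J_hebb, 0.0)

    C = (xi @ xi.T) / N                       # P x P pattern-overlap matrix
    J_proj = (xi.T @ np.linalg.inv(C) @ xi) / N   # projection rule: spectrum flattened on the pattern subspace

    def stable_fraction(J, xi):
        """Fraction of pattern bits aligned with their local field (fixed-point check)."""
        return np.mean(np.sign(xi @ J) == xi)

    print("Hebbian   :", stable_fraction(J_hebb, xi))  # < 1: crosstalk noise misaligns some bits
    print("Projection:", stable_fraction(J_proj, xi))  # = 1: every stored pattern is an exact fixed point

In this toy run the projection coupling keeps every pattern an exact fixed point at a load of 0.3, well above 0.14, illustrating the capacity gain that flattening the spectrum on the pattern subspace provides; how (and whether) the same mechanism operates on structured, real-world datasets is precisely the open question above.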
Finally, with the current standard toolbox of disordered SM, we can rigorously address only networks whose underlying neural and synaptic stochastic dynamics obeys detailed balance; hence feed-forward networks (with their chain rule for learning) are ruled out. These could possibly be included in an SM treatment by relaxing the requirement of detailed balance, but this implies a shift from glassy equilibrium statistical mechanics to fully off-equilibrium statistical mechanics. Although there is a long way to go, in principle it should be possible to pave such a route toward a complete SM perspective on Theoretical Artificial Intelligence [10,11,12].
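To state the requirement explicitly (a standard Glauber-dynamics sketch, added only for context), detailed balance asks that the single-spin-flip rates and the Gibbs measure satisfy

\[
w(\sigma \to F_i\sigma)\, e^{-\beta H(\sigma)} = w(F_i\sigma \to \sigma)\, e^{-\beta H(F_i\sigma)},
\qquad
w(\sigma \to F_i\sigma) = \tfrac{1}{2}\Big[1-\sigma_i \tanh\Big(\beta \sum_j J_{ij}\sigma_j\Big)\Big],
\]

where \(F_i\) flips the \(i\)-th neuron; with these rates the condition holds only for symmetric couplings \(J_{ij}=J_{ji}\), whereas the directed (asymmetric) couplings of feed-forward architectures break it, leaving no Gibbs equilibrium measure to expand around.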
Potential for impact
XAI is a central theme for many machine learning research teams worldwide. The present workshop aims at improving our understanding of AI decision processes by framing their intimate mechanisms in a scientific perspective. This will help the transition from black-box to clear-box machine learning algorithms.
Recent updates
Physics-informed machine learning workshop - 9 October 2023
A workshop showcasing the new methodologies developed during the TMCF two-week event took place on Monday 9 October 2023. One of its primary objectives was to comprehend the emergent skills deep artificial networks display when their control parameters are varied. It is presently unclear what these skills are and how many emergent properties these networks possess, but deep networks' capability to variably adjust their signal-to-noise ratios offers a clear avenue for investigation. A recording of the presentations is available.
Publications
- About the de Almeida-Thouless line in neural networks - https://arxiv.org/pdf/2303.06375.pdf
- Thermodynamics of bidirectional associative memories - https://arxiv.org/pdf/2211.09694.pdf
- Ultrametric identities in glassy models of Natural Evolution - https://arxiv.org/pdf/2306.13430.pdf
- Dense Hebbian neural networks: a replica symmetric picture of unsupervised learning - https://arxiv.org/pdf/2211.14067.pdf
- Dense Hebbian neural networks: a replica symmetric picture of supervised learning - https://arxiv.org/pdf/2212.00606.pdf
- Parallel learning by multitasking neural networks - https://arxiv.org/abs/2308.04106
Organisers
Elena Agliari (“Sapienza” University of Rome)
Adriano Barra (University of Salento)
Andrea Pizzoferrato (University of Bath, The Alan Turing Institute)
References
[1] Amit, Daniel J., Hanoch Gutfreund, and Haim Sompolinsky. "Storing infinite numbers of patterns in a spin-glass model of neural networks." Physical Review Letters 55.14 (1985): 1530.
[2] Coolen, Anthony CC, Reimer Kühn, and Peter Sollich. Theory of neural information processing systems. OUP Oxford, 2005.
[3] Agliari, Elena, Adriano Barra, Peter Sollich, and Lenka Zdeborová. "Machine learning and statistical physics: theory, inspiration, application." Special issue, Journal of Physics A: Mathematical and Theoretical (2020).
[4] Barra, Adriano, Giuseppe Genovese, Peter Sollich, and Daniele Tantari. "Phase diagram of restricted Boltzmann machines and generalized Hopfield networks with arbitrary priors." Physical Review E 97.2 (2018): 022310.
[5] Tubiana, Jérôme, and Rémi Monasson. "Emergence of compositional representations in restricted Boltzmann machines." Physical review letters 118.13 (2017): 138301.
[6] Decelle, Aurelien, et al. "Inference and phase transitions in the detection of modules in sparse networks." Physical Review Letters 107.6 (2011): 065701.
[7] Agliari, E., Alemanno, F., Barra, A., Centonze, M., & Fachechi, A. (2020). Neural networks with a redundant representation: detecting the undetectable. Physical review letters, 124(2), 028301.
[8] Decelle, Aurélien, Giancarlo Fissore, and Cyril Furtlehner. "Spectral dynamics of learning in restricted Boltzmann machines." EPL (Europhysics Letters) 119.6 (2017): 60001.
[9] Fachechi, Alberto, Elena Agliari, and Adriano Barra. "Dreaming neural networks: forgetting spurious memories and reinforcing pure ones." Neural Networks 112 (2019): 24-40.
[10] Del Prete, Valeria, and Anthony CC Coolen. "Non-equilibrium statistical mechanics of recurrent networks with realistic neurons." Neurocomputing 58 (2004): 239-244.
[11] Coolen, Ton, and David Sherrington. "Dynamics of attractor neural networks." Mathematical Approaches to Neural Networks, Elsevier 51 (1993): 293-306.
[12] Mozeika, Alexander, Bo Li, and David Saad. "Space of Functions Computed by Deep-Layered Machines." Physical Review Letters 125.16 (2020): 168301.