What can machine learning do for theoretical science?

Abhishek Mukherjee
September 4, 2018

Scientific theories are what make the world comprehensible, at least for most of us. But then we heard a rumor that there was a new game in town: machine learning. Along with its sibling, big data, it threatened to drive scientific theories out of town. Machine learning, and especially deep learning, has already become a magic box for building ever more accurate predictive models. Using it, one can make predictions based on patterns found in previous observations. Traditionally, making predictions was a complicated business, involving, amongst other things, developing underlying theories of how things work. But now you can throw enough data at a large enough neural network and predictions will come out the other side. So why bother with theories at all?

The rumor dissipated soon enough, because it was based on the false premise that the goal of science is to churn out predictions. It is not. The goal of science is to provide understanding. Understanding comes from explanations, and explanations are provided by theories. The whole edifice of modern science stands on the shoulders of a web of interconnected theories.

The rumor might have died, but its ghost continues to haunt us. Old-school theorists tend to regard this new wave of empiricism as an attack on their profession by the plebs. And many freshly minted data experts, coming from the less analytical lands of our newly democratized landscape, often seem to conflate theory with preconceived bias.

For me, personally, this rather sorry state of affairs is … somewhat awkward. I started out as a theoretical physicist. Theories are what help me make sense of the world. Yet I now make my living by tinkering with machine learning algorithms. I can appreciate, first hand, the power of these algorithms. Yes, machine learning is a tool, but it is a tool like no other. It fundamentally alters our relationship with information. One way or the other, our conception of what constitutes an understanding of reality will be shaped by the role that machine learning plays in science.

If rationalism is to survive this deluge of empiricism, then theorists need to find a way to incorporate machine learning meaningfully into their world. Not as a foreign clerk dealing with the mindless drudgery of mining through data, but as a full citizen and guide to the art of building scientific theories.

It is not such a strange wish. After all, most of the important advancements in how we store, process or convey information, be they new mathematical techniques or electronic computers, have found their use in the development of scientific theories. There is no reason why machine learning should remain the surly exception. The question is, how?

The template that we use for building theories is derived largely from physics. A theory is essentially a set of rules that can be used to derive predictive models of different aspects of phenomena. The explanatory power of theories comes from their ability to provide holistic pictures of aspects of reality, i.e. in being able to show that disparate phenomena emerge from a small set of simple rules. For example, the same rules of statistical mechanics can be used to calculate the thermodynamic properties (such as temperature, pressure, density) of any substance in equilibrium.
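To make this concrete, here is a minimal sketch (my own illustration, not part of the original article) of how a small set of statistical-mechanical rules generates a predictive model: starting from nothing but the Boltzmann distribution, we compute the internal energy and heat capacity of a hypothetical two-level system at a few temperatures.

```python
# Minimal illustrative sketch: one rule of statistical mechanics (the
# Boltzmann distribution) is enough to predict thermodynamic properties
# of a toy two-level system. Units chosen so that k_B = 1.
import numpy as np

k_B = 1.0
energies = np.array([0.0, 1.0])   # hypothetical two-level system: E0 = 0, E1 = 1

def thermodynamics(T):
    """Return mean energy and heat capacity at temperature T."""
    weights = np.exp(-energies / (k_B * T))    # Boltzmann weights
    Z = weights.sum()                          # partition function
    p = weights / Z                            # occupation probabilities
    E_mean = (p * energies).sum()              # internal energy
    E2_mean = (p * energies**2).sum()
    C = (E2_mean - E_mean**2) / (k_B * T**2)   # heat capacity from energy fluctuations
    return E_mean, C

for T in [0.2, 0.5, 1.0, 2.0, 5.0]:
    E, C = thermodynamics(T)
    print(f"T={T:4.1f}  <E>={E:.3f}  C={C:.3f}")
```

The same few lines, fed a different list of energy levels, would yield predictions for a completely different substance; that is the sense in which a small set of rules covers many disparate phenomena.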

Historically, our belief in being able to explain the universe on the basis of such theoretical frameworks has been motivated largely by the spectacular successes of physics. However, thanks to insights provided by the seminal work by Kenneth Wilson and others in the last quarter of the previous century, this belief now stands on a healthy foundation of understanding.

Consider a hierarchy of rulesets, with the initial (bottom level) ruleset representing the mathematical structure of a theory and the final (top level) one representing the mathematical structure of the observed stable correlations in data. One can now think of a transformation such that the ruleset at each level is obtained by applying this transformation to the ruleset at the previous level. This process of deriving higher-level rulesets from lower-level ones is called the renormalization group flow (I am using the term very loosely).

For certain kinds of transformations and rulesets, something quite remarkable and unexpected happens: starting from very different initial rulesets, you end up with the same final ruleset. The final ruleset in this case is called a fixed point, and the set of initial rulesets that lead to the same fixed point is said to constitute a universality class. The hypothesis of universality (or simply universality, for brevity) states that the rulesets and transformations actually found in nature are of this kind. (See here for an introduction to universality and the renormalization group.)
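To see what a fixed point looks like in the simplest possible setting, here is a small sketch (my own illustration, using the textbook real-space decimation of the one-dimensional Ising chain, which the article does not discuss): wildly different initial couplings, i.e. different "initial rulesets", all flow under repeated application of the same transformation to the same fixed point.

```python
# Illustrative sketch (not from the article): real-space renormalization
# of the 1D Ising chain by decimation. Summing out every other spin maps
# the coupling K to K' = 0.5 * ln(cosh(2K)). Whatever coupling we start
# from, iterating drives us to the same fixed point K* = 0 -- many
# initial rulesets, one final ruleset.
import numpy as np

def decimate(K):
    """One renormalization step for the 1D Ising coupling K."""
    return 0.5 * np.log(np.cosh(2.0 * K))

for K0 in [0.3, 1.0, 3.0]:            # very different starting "theories"
    K = K0
    trajectory = [K]
    for _ in range(12):                # apply the same transformation repeatedly
        K = decimate(K)
        trajectory.append(K)
    print(f"K0={K0:3.1f} ->", " ".join(f"{k:.3f}" for k in trajectory[-4:]))
```

In this toy case the shared fixed point is the trivial one at K = 0; the interesting physics of universality involves non-trivial fixed points, but the mechanism of many initial rulesets flowing to one final ruleset is the same.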

If universality is true, then the observed stable correlations in complex systems will be independent of the details of the underlying theory, i.e. simple theories may be good enough. In addition, we should see correlations with the same mathematical structure across various unrelated domains.

Universality was first observed and studied in the behavior of the thermodynamic variables of disparate systems near continuous phase transitions. Since then it has been observed in a variety of unrelated places, such as the dynamics of complex networks, multi-agent systems, the occurrence of pink noise and the bus system of a town in Mexico, to name a few (see here for some interesting examples). There is enough empirical evidence to believe that nature (including many man-made entities) really does favor universality.

Although theories belonging to a universality class may have very different origins (with respect to the aspect of reality they are trying to explain) and mathematical details, they share some important mathematical properties which put tight constraints on their mathematical structure. For the universality classes found in physics these properties are usually symmetries, dimensionality and locality. But, in general, they will depend on the specific universality class, and can be determined by carrying out the renormalization group flow of a member of the class.

Universality, by itself, can only partially explain why the theoretical frameworks in physics are so successful. The second part comes from the observation that the hierarchy of rulesets in physical systems corresponds very nicely with our intuition. In physics the hierarchy of rules is the hierarchy of scales, or resolution. Intuitively, we expect that big things (macroscopic objects) follow rules, and so must small things (microscopic entities). We also know that big things are composed of small things, so macroscopic patterns should follow from microscopic theory. And this is exactly what happens in reality. This is why (almost naive) reductionism works so well in most areas of physics.

The final piece of this puzzle has to do with the timeline of technological development. We started off by observing phenomena at the human scale, and only then developed the technology, microscopes and telescopes, to observe phenomena at progressively smaller and larger scales. This timeline corresponds very nicely with the hierarchy of rulesets in physical systems. As a result, we could develop a very fruitful feedback between theory and experiment. But, even more importantly, the starting point was crucial: for many physical systems the human scale is the one where universality kicks in. What this meant was that the stable correlations were manifest even with small amounts of data and manual inspection.

To appreciate why the above points are so important, consider the situation where instead of measurements of thermodynamic properties we started off with pictures containing the snapshots of all the atoms in a box of gas at different times. How easy would it be to derive thermodynamics or statistical mechanics from this data?

The situation that we currently encounter in fields such as biology, economics or the social sciences is not very different from the above. Unlike physics, in these fields we do not have the luxury of knowing what the hierarchy of rulesets corresponds to in reality. Nor do we know at which stage universality should kick in and where we should expect to see stable correlations.

But what we did not have before, and do have now, is a lot more data and a tool, machine learning, for distilling that data and finding these stable correlations. There is good reason to believe that deep neural networks essentially perform a version of the renormalization group flow, and that one of the reasons they are so effective is that in many situations the generative processes (rulesets) behind the data are hierarchical. Viewed through the prism of universality, this means that deep neural networks give us access to a renormalization group flow in the universality class containing the correct underlying theory, which can then be used to constrain the mathematical structure of that theory.

Consider a thought experiment in which a deep neural network is given snapshots of gas atoms along with the value of some complicated function of the thermodynamic variables, and we train the network on the task of predicting that value from the snapshots. Do we expect thermodynamics to emerge in the final layers of the network? Should we be able to constrain the mathematical structure of statistical mechanics from the weights of the network? There is no reason, in principle, to believe otherwise.
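For what it is worth, a crude toy version of this thought experiment can be run today. The sketch below is my own construction, not something specified in the article: the "snapshots" are just sorted particle speeds drawn from an ideal gas at a hidden temperature, the target is an arbitrary made-up function of that temperature, and a small scikit-learn network has to recover the relevant thermodynamic variable implicitly from the microscopic data.

```python
# Toy sketch of the thought experiment (illustrative assumptions: ideal gas,
# Maxwell-Boltzmann speeds, an arbitrary target function of temperature).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_snapshots, n_particles = 2000, 64

def make_snapshot(T):
    """Sorted speeds of an ideal gas at temperature T (3D, k_B = m = 1)."""
    v = rng.normal(0.0, np.sqrt(T), size=(n_particles, 3))
    return np.sort(np.linalg.norm(v, axis=1))

temperatures = rng.uniform(0.5, 5.0, size=n_snapshots)   # hidden from the network
X = np.array([make_snapshot(T) for T in temperatures])    # microscopic snapshots
y = np.log(temperatures) + 0.5 * temperatures             # a "complicated" target f(T)

split = int(0.8 * n_snapshots)
model = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
model.fit(X[:split], y[:split])
print("R^2 on held-out snapshots:", model.score(X[split:], y[split:]))
```

That the fit succeeds is unsurprising; the interesting and open question raised in the text is whether the learned weights could then be interrogated to constrain the structure of statistical mechanics itself.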

To come back to the question raised earlier: how can machine learning help theoretical science? Machine learning can provide the mathematical scaffolding for scientific theories, to which theorists will then add meaning and the bridge to reality. However, before we can get there, we will need to develop a much better understanding of machine learning itself. We will need to understand machine learning algorithms from general principles. In other words, what are the analogs of symmetry, dimensionality and locality in machine learning? Perhaps it is time to start developing a real theory of machine learning.
