Symbolic Regression was initially developed in the 70's and reshaped in the early '90s by John Koza. Its innovation is that instead of fitting the parameters of a predefined equation (model) to the data, the algorithm creates the equation itself, from scratch.
To do so, symbolic regression in its most recent variants uses genetic algorithms to traverse a vast search space of candidate equations. This results in models that can be very complex yet interpretable, potentially uncovering complex relationships that may contribute to new scientific discoveries.
Despite the inherent limitations of efficiently searching such vast spaces, experiments have yielded very promising results in rediscovering complex, previously known mathematical equations.
GPU acceleration and more sophisticated search strategies have further expanded the capabilities of this method.
Link to the Python PySR package
https://pypi.org/project/pysr/
Wikipedia article:
https://en.wikipedia.org/wiki/Symbolic_regression
#datascience #ml #ai #python #rstats #regression #symbolicregression