This blog post about creating Gaussian processes from scratch in Python helped me make sense of it. https://peterroelants.github.io/posts/gaussian-process-tutorial/

#gaussianProcess

2/n

Gaussian processes (1/3) - From scratch

This post explores some of the concepts behind Gaussian processes, such as stochastic processes and the kernel function. We will build up a deeper understanding of Gaussian process regression by implementing it from scratch using Python and NumPy.

Peter’s Notes

A Gaussian process is a random function, which for me was hard to understand. How do you get these smooth functions if the value of the function at each x position is random?

Now it's making more sense to me. You can pull the whole function out of a hat! The distribution has correlations that encode relationships between the different dimensions (x values). So even though the whole function is random, adjacent points on the function could be closely related.
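A minimal from-scratch sketch of that intuition (my own toy code in pure Python, not taken from the linked post): build an RBF covariance matrix over a grid of x values, then "pull a whole function out of a hat" by drawing one sample from the corresponding multivariate normal via a Cholesky factor. Because nearby x values have covariance close to 1, the random draw comes out smooth.

```python
import math
import random

def rbf_kernel(x1, x2, length_scale=1.0):
    # Squared-exponential covariance: nearby x values are strongly correlated,
    # distant ones nearly independent.
    return math.exp(-((x1 - x2) ** 2) / (2.0 * length_scale ** 2))

def cholesky(A):
    # Lower-triangular L with L @ L.T == A (A must be positive definite).
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def sample_gp_prior(xs, length_scale=1.0, jitter=1e-9):
    # One draw from N(0, K): a whole random function evaluated at xs.
    n = len(xs)
    K = [[rbf_kernel(xs[i], xs[j], length_scale) + (jitter if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    z = [random.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(L[i][k] * z[k] for k in range(n)) for i in range(n)]

xs = [i * 0.1 for i in range(50)]
f = sample_gp_prior(xs, length_scale=1.0)  # a smooth random function
```

Every value in `f` is random, yet adjacent entries barely differ, because the kernel correlates them.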

#gaussianProcess
#TIL
#randomFunction

I’m excited to announce our #SIGGRAPHAsia2025 paper, which makes #rendering of #GaussianProcess implicit surfaces (GPISes) practical:

Project page: https://cs.dartmouth.edu/~wjarosz/publications/xu25practical.html

We achieve this with a novel #procedural noise formulation and by enabling next-event estimation for specular BRDFs. [1/7]

We propose a new family of probability densities that have closed-form normalising constants. Our densities use two-layer neural networks as parameters and strictly generalise exponential families. We show that the squared norm can be integrated in closed form, yielding the normalising constant. We call these densities Squared Neural Families (#SNEFY); they are closed under conditioning.

Accepted at #NeurIPS2023. #MachineLearning #Bayesian #GaussianProcess

https://arxiv.org/abs/2305.13552

Squared Neural Families: A New Class of Tractable Density Models

Flexible models for probability distributions are an essential ingredient in many machine learning tasks. We develop and investigate a new class of probability distributions, which we call a Squared Neural Family (SNEFY), formed by squaring the 2-norm of a neural network and normalising it with respect to a base measure. Following reasoning similar to the well-established connections between infinitely wide neural networks and Gaussian processes, we show that SNEFYs admit closed-form normalising constants in many cases of interest, thereby resulting in flexible yet fully tractable density models. SNEFYs strictly generalise classical exponential families, are closed under conditioning, and have tractable marginal distributions. Their utility is illustrated on a variety of density estimation, conditional density estimation, and density estimation with missing data tasks.

arXiv.org
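From the abstract, the construction appears to be the following (my notation, not necessarily the paper's): take a two-layer network $f_\theta : \mathcal{X} \to \mathbb{R}^m$ and square its 2-norm to get an unnormalised density over a base measure $\mu$:

```latex
p_\theta(x) = \frac{\lVert f_\theta(x) \rVert_2^2}{Z(\theta)},
\qquad
Z(\theta) = \int_{\mathcal{X}} \lVert f_\theta(x) \rVert_2^2 \, d\mu(x).
```

The paper's key claim is that $Z(\theta)$ can be evaluated in closed form for many choices of activation function and base measure, which is what makes the family tractable.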

#arxivfeed :

"Neural network with optimal neuron activation functions based on additive Gaussian process regression"
https://arxiv.org/abs/2301.05567

#MachineLearning #DeepLearning #GaussianProcess

Neural network with optimal neuron activation functions based on additive Gaussian process regression

Feed-forward neural networks (NN) are a staple machine learning method widely used in many areas of science and technology. While even a single-hidden-layer NN is a universal approximator, its expressive power is limited by the use of simple neuron activation functions (such as sigmoid functions) that are typically the same for all neurons. More flexible neuron activation functions would allow using fewer neurons and layers, thereby saving computational cost and improving expressive power. We show that additive Gaussian process regression (GPR) can be used to construct optimal neuron activation functions that are individual to each neuron. An approach is also introduced that avoids non-linear fitting of neural network parameters. The resulting method combines the robustness of a linear regression with the higher expressive power of a NN. We demonstrate the approach by fitting the potential energy surface of the water molecule. Without requiring any non-linear optimization, the additive GPR based approach outperforms a conventional NN in the high accuracy regime, where a conventional NN suffers more from overfitting.

arXiv.org

#arxivfeed :

"Dynamic Bayesian Learning and Calibration of Spatiotemporal Mechanistic Systems"
https://arxiv.org/abs/2208.06528

#DynamicalSystems #ModelCalibration #Bayesian #ParameterEstimation #GaussianProcess

Dynamic Bayesian Learning for Spatiotemporal Mechanistic Models

We develop an approach for Bayesian learning of spatiotemporal dynamical mechanistic models. Such learning consists of statistical emulation of the mechanistic system that can efficiently interpolate the output of the system from arbitrary inputs. The emulated learner can then be used to train the system from noisy data by melding information from observed data with the emulated mechanistic system. This joint melding of mechanistic systems employs hierarchical state-space models with Gaussian process regression. Assuming the dynamical system is controlled by a finite collection of inputs, Gaussian process regression learns the effect of these parameters through a number of training runs, driving the stochastic innovations of the spatiotemporal state-space component. This enables efficient modeling of the dynamics over space and time. This article details exact inference with analytically accessible posterior distributions in hierarchical matrix-variate Normal and Wishart models in designing the emulator, a step that obviates expensive iterative algorithms such as Markov chain Monte Carlo or variational approximations. We also show how the framework extends to large-scale emulation by designing a dynamic Bayesian transfer learning framework. Inference on the calibration parameters $\eta$ proceeds using Markov chain Monte Carlo as a post-emulation step, using the emulator as a regression component. We demonstrate this framework by solving inverse problems arising in the analysis of ordinary and partial nonlinear differential equations and, in addition, applying it to a black-box computer model generating spatiotemporal dynamics across a graphical model.

arXiv.org
Take the idea of random Fourier features, as applied to #GaussianProcess regression in #MachineLearning. There is a method in the probabilistic numerics textbook, Gauss-Legendre quadrature (same Gauss, different method), which gives good convergence with respect to the spectrum of a function. This paper shows that a high-quality low-rank #kernel approximation can be computed efficiently (sublinear in the number of training points).
https://www.jmlr.org/papers/v23/21-0030.html
Gauss-Legendre Features for Gaussian Process Regression
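For context, here is the baseline idea the paper improves on, sketched in pure Python (my own illustration, not the paper's Gauss-Legendre construction): classic random Fourier features for a 1-D RBF kernel. Frequencies are drawn from the kernel's spectral density (a Gaussian), and the inner product of the feature maps approximates the exact kernel value.

```python
import math
import random

def make_rff(D, length_scale=1.0, seed=0):
    # Sample D frequencies from the RBF kernel's spectral density
    # (Gaussian with std 1/length_scale) plus uniform phase offsets.
    rng = random.Random(seed)
    ws = [rng.gauss(0.0, 1.0 / length_scale) for _ in range(D)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(D)]
    return ws, bs

def rff_features(x, ws, bs):
    # Feature map z(x) such that z(x) . z(y) ~= exp(-(x - y)^2 / (2 ls^2)).
    D = len(ws)
    return [math.sqrt(2.0 / D) * math.cos(w * x + b) for w, b in zip(ws, bs)]

ws, bs = make_rff(D=2000, length_scale=1.0)
zx = rff_features(0.3, ws, bs)
zy = rff_features(0.8, ws, bs)
approx = sum(a * b for a, b in zip(zx, zy))         # Monte Carlo estimate
exact = math.exp(-(0.3 - 0.8) ** 2 / 2.0)           # true RBF kernel value
```

The Monte Carlo error here shrinks only like 1/sqrt(D); the Gauss-Legendre features in the linked paper replace the random frequencies with quadrature nodes to converge much faster for smooth spectra.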

Been trying out a lot of #GaussianProcess libraries in #python lately. For what my opinion is worth, I'm really enjoying using GPFlow.

It seems to strike a good balance: you can customize the things you want without having to over-worry about the rest. The documentation ranges from simple examples to very advanced ones, comprehensive enough for a guy like me to get a model up and running in a couple of hours on real data.

@FCAI Update on Zheyang's status: The opponent got there at the last minute, gave an enlightening view of the position of the thesis in #BayesianModeling and #GaussianProcess es, and asked a set of broad and challenging questions. Zheyang did outstandingly well, with still some unsolved questions which he will be eager to pursue when he is on the job market at some point. Big congrats Zheyang Shen, and many thanks to opponent Chris Oates! #PhD #AaltoUniversity @FCAI
Towards Improved Learning in Gaussian Processes: The Best of Two Worlds

Gaussian process training decomposes into inference of the (approximate) posterior and learning of the hyperparameters. For non-Gaussian (non-conjugate) likelihoods, two common choices for approximate inference are Expectation Propagation (EP) and Variational Inference (VI), which have complementary strengths and weaknesses. While VI's lower bound to the marginal likelihood is a suitable objective for inferring the approximate posterior, it does not automatically imply it is a good learning objective for hyperparameter optimization. We design a hybrid training procedure where the inference leverages conjugate-computation VI and the learning uses an EP-like marginal likelihood approximation. We empirically demonstrate on binary classification that this provides a good learning objective and generalizes better.

arXiv.org