Gaussian Processes (GPs) have been widely used in robotics as models, and more recently as key structures in active learning algorithms such as Bayesian optimization. GPs consist of two main components: the mean function and the kernel. Specifying a prior mean function has been a common way to incorporate prior knowledge. When a prior mean function cannot be constructed manually, the usual fallback has been to incorporate prior (simulated) observations into a GP as 'fake' data; this GP is then used to learn further from true data in the target (real) domain. We argue that embedding prior knowledge into GP kernels instead provides a more flexible way to capture simulation-based information. We give examples of recent works that demonstrate the wide applicability of such a kernel-centric treatment when using GPs as part of Bayesian optimization. We also provide a discussion that helps build intuition for why this 'kernels as priors' view is beneficial.
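To make the contrast concrete, the following is a minimal sketch of the two transfer strategies, written with scikit-learn's GaussianProcessRegressor purely for illustration; the specific library, the toy sine objective, and the choice to freeze hyperparameters are our assumptions, not the setup of the works discussed. Option A folds simulated observations into the GP as 'fake' data; option B fits the kernel hyperparameters on simulation alone and reuses the resulting kernel as a prior for the real-domain GP.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Toy stand-ins for simulated (source) and real (target) data.
rng = np.random.default_rng(0)
X_sim = rng.uniform(-3, 3, size=(200, 1))
y_sim = np.sin(X_sim).ravel() + 0.1 * rng.standard_normal(200)
X_real = rng.uniform(-3, 3, size=(10, 1))
y_real = np.sin(X_real).ravel() + 0.1 * rng.standard_normal(10)

# Option A: 'fake data' -- stack simulated observations together
# with the real observations and fit a single GP on the union.
gp_fake = GaussianProcessRegressor(kernel=ConstantKernel() * RBF())
gp_fake.fit(np.vstack([X_sim, X_real]), np.concatenate([y_sim, y_real]))

# Option B: 'kernel as prior' -- fit kernel hyperparameters on
# simulation only, then reuse the learned kernel for the real domain.
gp_sim = GaussianProcessRegressor(kernel=ConstantKernel() * RBF())
gp_sim.fit(X_sim, y_sim)
gp_kernel_prior = GaussianProcessRegressor(
    kernel=gp_sim.kernel_,  # kernel with simulation-informed hyperparameters
    optimizer=None,         # freeze them; real data enters only via the posterior
)
gp_kernel_prior.fit(X_real, y_real)

In option B the simulation-informed hyperparameters are frozen, so the handful of real observations directly shapes the posterior instead of being swamped by a much larger simulated dataset, which is one intuition for why the kernel-centric treatment is more flexible.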