Overview of gradient and curvature based optimisation techniques for macromolecular crystallography
Garib N. MurshudovYork Structural Biology Laboratory, University of York, UK E-mail: garib@ysbl.york.ac.uk
Optimisation of a function (e.g. likelihood, posterior distribution) is an essential ingredient of any statistical estimation. Functions encountered in practice are highly nonlinear and there is no closed
form solution for most optimisation problems. Usually optimisations are carried out either using one of the stochastic techniques with their various variants or by iterative techniques that utilise first (and
second) derivatives of the functions. Although stochastic techniques can give good approximations to global optima they are very slow for large-scale problems. To speed up an estimation procedure iterative
techniques are used: 1) initial values for the parameters are postulated; 2) then the function is optimised using derivative based local optimisers; 3) then the values of the parameters are examined
either automatically or manually and the procedure is repeated starting from step 2.
Derivative based techniques can loosely be divided into two main classes: methods based on first derivatives (e.g. steepest descent, conjugate gradient, variable metric and their various variants)
and those utilising information about curvature of the function in a form of second derivative or information matrix. Newton-Raphson, Levenberg-Marquard, Fisher's scoring methods belong to the
second class of techniques. Although the radius of convergence of Newton-Raphson techniques is smaller than that of gradient based techniques, they are the fastest available methods for function
optimisation when parameters are in the vicinity of local optima. The rate of convergence of Levenberg-Marquard and Fisher's scoring techniques are slower than that of the Newton-Raphson method,
however their radius of convergence is similar to that of gradient based techniques and with proper implementation they can coincide with the radius of convergence of gradient based techniques. One of
the challenges of derivative based methods is the fast calculation of derivatives or their approximations.
In this presentation an overview of various optimisation techniques and their use in macromolecular crystal structure analysis will be given. The main focus of the lecture will be on the fast and efficient
calculation of first derivatives and an efficient approximation to the second derivative and information matrix.
These pages are maintained by the Commission Last updated: 15 Oct 2021